com.google.cloud.dataflow.sdk.transforms.Sum Java Examples

The following examples show how to use com.google.cloud.dataflow.sdk.transforms.Sum.
Example #1
Source File: ExactDollarRides.java    From cloud-dataflow-nyc-taxi-tycoon with Apache License 2.0
public static void main(String[] args) {
  CustomPipelineOptions options =
      PipelineOptionsFactory.fromArgs(args).withValidation().as(CustomPipelineOptions.class);
  Pipeline p = Pipeline.create(options);

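  // Read messages from Pub/Sub as TableRows; the "ts" attribute supplies the event timestamp.
  p.apply(PubsubIO.Read.named("read from PubSub")
      .topic(String.format("projects/%s/topics/%s", options.getSourceProject(), options.getSourceTopic()))
      .timestampLabel("ts")
      .withCoder(TableRowJsonCoder.of()))

   // Extract the meter_increment field of each message as a Double.
   .apply("extract dollars",
      MapElements.via((TableRow x) -> Double.parseDouble(x.get("meter_increment").toString()))
        .withOutputType(TypeDescriptor.of(Double.class)))

   // Assign elements to one-minute fixed windows.
   .apply("fixed window", Window.into(FixedWindows.of(Duration.standardMinutes(1))))
   // Fire early (one second after the first element in a pane), on time at the
   // watermark, and again for each late element for up to five minutes.
   // Each pane accumulates everything seen so far in its window.
   .apply("trigger",
      Window.<Double>triggering(
        AfterWatermark.pastEndOfWindow()
          .withEarlyFirings(AfterProcessingTime.pastFirstElementInPane().plusDelayOf(Duration.standardSeconds(1)))
          .withLateFirings(AfterPane.elementCountAtLeast(1)))
        .accumulatingFiredPanes()
        .withAllowedLateness(Duration.standardMinutes(5)))

   // Sum the values in each pane; withoutDefaults() is needed because the
   // input is not in the global window.
   .apply("sum whole window", Sum.doublesGlobally().withoutDefaults())
   .apply("format rides", ParDo.of(new TransformRides()))

   // Publish the formatted results to the sink topic.
   .apply(PubsubIO.Write.named("WriteToPubsub")
      .topic(String.format("projects/%s/topics/%s", options.getSinkProject(), options.getSinkTopic()))
      .withCoder(TableRowJsonCoder.of()));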
  p.run();
}
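
Because Example #1 combines accumulatingFiredPanes() with Sum, each firing of a window should carry the running total of that window so far, not just the newly arrived amounts. The following is a minimal plain-Java sketch of that arithmetic only. It does not use the Dataflow SDK, and the class name, method name, and sample values are hypothetical; in a real pipeline the runner decides when panes fire.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AccumulatingPaneSumSketch {

  // Each inner list holds the values that arrived in one window since the
  // previous firing. In accumulating mode every firing re-emits the sum of
  // ALL values seen so far in the window, so the results are running totals.
  static List<Double> accumulatingPaneSums(List<List<Double>> arrivalsPerFiring) {
    List<Double> paneSums = new ArrayList<>();
    double runningSum = 0.0;
    for (List<Double> arrivals : arrivalsPerFiring) {
      for (double value : arrivals) {
        runningSum += value;
      }
      paneSums.add(runningSum);
    }
    return paneSums;
  }

  public static void main(String[] args) {
    // Hypothetical arrivals: two early values, one more early value, one late value.
    List<List<Double>> firings = Arrays.asList(
        Arrays.asList(1.0, 2.0),
        Arrays.asList(0.5),
        Arrays.asList(4.0));
    System.out.println(accumulatingPaneSums(firings)); // prints [3.0, 3.5, 7.5]
  }
}
```

A downstream consumer of such output would typically keep only the latest value per window, since later panes supersede earlier ones.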
 
Example #2
Source File: FXTimeSeriesPipelineSRGTests.java    From data-timeseries-java with Apache License 2.0
@org.junit.Test
public void testDataInput() {

  Pipeline pipeline = setup();

  PCollection<KV<String, TSProto>> tsData =
      setupDataInput(pipeline, GenerateSampleData.getTestData());

  LOG.info("Check that we have 42 elements in the Input PCollection");

  // Count the elements by mapping each one to 1 and summing the ones globally.
  DataflowAssert.that(
      tsData.apply("TestInputElementCount", ParDo.of(new DoFn<KV<String, TSProto>, Integer>() {

        @Override
        public void processElement(DoFn<KV<String, TSProto>, Integer>.ProcessContext c)
            throws Exception {

          c.output(1);
        }

      })).apply(Sum.integersGlobally())).containsInAnyOrder(42);

  pipeline.run();

}
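
Example #2 counts elements by emitting 1 for each input and applying Sum.integersGlobally(). The same idea can be sketched in plain Java without the Dataflow SDK. The class and method names here are hypothetical and serve only to illustrate the pattern.

```java
import java.util.Arrays;
import java.util.List;

public class CountBySummingOnes {

  // Map every element to the constant 1, then add the ones up.
  // The total equals the number of elements in the list.
  static <T> int countBySummingOnes(List<T> elements) {
    return elements.stream().mapToInt(e -> 1).sum();
  }

  public static void main(String[] args) {
    List<String> sample = Arrays.asList("a", "b", "c");
    System.out.println(countBySummingOnes(sample)); // prints 3
  }
}
```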