org.apache.flink.streaming.api.functions.KeyedProcessFunction Java Examples

The following examples show how to use org.apache.flink.streaming.api.functions.KeyedProcessFunction.
Example #1
Source File: ProcessFunctionTestHarnesses.java    From flink with Apache License 2.0
/**
 * Returns an initialized test harness for {@link KeyedProcessFunction}.
 *
 * @param function instance of a {@link KeyedProcessFunction} under test
 * @param <K> key type
 * @param <IN> type of input stream elements
 * @param <OUT> type of output stream elements
 * @return {@link KeyedOneInputStreamOperatorTestHarness} wrapped around {@code function}
 */
public static <K, IN, OUT>
KeyedOneInputStreamOperatorTestHarness<K, IN, OUT> forKeyedProcessFunction(
	final KeyedProcessFunction<K, IN, OUT> function,
	final KeySelector<IN, K> keySelector,
	final TypeInformation<K> keyType) throws Exception {

	KeyedOneInputStreamOperatorTestHarness<K, IN, OUT> testHarness =
		new KeyedOneInputStreamOperatorTestHarness<>(
			new KeyedProcessOperator<>(
				Preconditions.checkNotNull(function)),
			keySelector,
			keyType,
			1,
			1,
			0);
	testHarness.open();
	return testHarness;
}
 
Example #2
Source File: ProcTimeRangeBoundedPrecedingFunction.java    From flink with Apache License 2.0
@Override
public void processElement(
		RowData input,
		KeyedProcessFunction<K, RowData, RowData>.Context ctx,
		Collector<RowData> out) throws Exception {
	long currentTime = ctx.timerService().currentProcessingTime();
	// buffer the incoming event

	// add current element to the window list of elements with corresponding timestamp
	List<RowData> rowList = inputState.get(currentTime);
	// null value means that this is the first event received for this timestamp
	if (rowList == null) {
		rowList = new ArrayList<RowData>();
		// register timer to process event once the current millisecond passed
		ctx.timerService().registerProcessingTimeTimer(currentTime + 1);
		registerCleanupTimer(ctx, currentTime);
	}
	rowList.add(input);
	inputState.put(currentTime, rowList);
}
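
The buffering step in the example above can be tried out without a Flink cluster. The following is a minimal plain-Java sketch, not Flink code: a HashMap stands in for the MapState called inputState, a list of longs stands in for the timer service, and the class name TimestampBucketSketch and the helper rowsAt are hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java sketch of per-timestamp buffering; no Flink classes are involved. */
final class TimestampBucketSketch {
    // Assumption: a HashMap stands in for the keyed MapState used in the original.
    private final Map<Long, List<String>> inputState = new HashMap<>();
    // Assumption: this list records timer registrations in place of a timer service.
    final List<Long> timers = new ArrayList<>();

    void buffer(long currentTime, String row) {
        List<String> rowList = inputState.get(currentTime);
        if (rowList == null) {
            // First row for this millisecond: create the bucket and schedule one timer.
            rowList = new ArrayList<>();
            timers.add(currentTime + 1);
        }
        rowList.add(row);
        // Write the list back, mirroring the explicit put in the original.
        inputState.put(currentTime, rowList);
    }

    List<String> rowsAt(long timestamp) {
        return inputState.getOrDefault(timestamp, Collections.emptyList());
    }

    public static void main(String[] args) {
        TimestampBucketSketch sketch = new TimestampBucketSketch();
        sketch.buffer(100L, "a");
        sketch.buffer(100L, "b"); // same millisecond: no second timer
        sketch.buffer(101L, "c");
        System.out.println(sketch.timers);       // [101, 102]
        System.out.println(sketch.rowsAt(100L)); // [a, b]
    }
}
```

Only the first row of a given millisecond registers a timer, so the number of timers grows with the number of distinct timestamps rather than with the number of rows.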
 
Example #3
Source File: KeyedStream.java    From Flink-CEPplus with Apache License 2.0
/**
 * Applies the given {@link KeyedProcessFunction} on the input stream, thereby creating a transformed output stream.
 *
 * <p>The function will be called for every element in the input streams and can produce zero
 * or more output elements. Contrary to the {@link DataStream#flatMap(FlatMapFunction)}
 * function, this function can also query the time and set timers. When reacting to the firing
 * of set timers the function can directly emit elements and/or register yet more timers.
 *
 * @param keyedProcessFunction The {@link KeyedProcessFunction} that is called for each element in the stream.
 *
 * @param <R> The type of elements emitted by the {@code KeyedProcessFunction}.
 *
 * @return The transformed {@link DataStream}.
 */
@PublicEvolving
public <R> SingleOutputStreamOperator<R> process(KeyedProcessFunction<KEY, T, R> keyedProcessFunction) {

	TypeInformation<R> outType = TypeExtractor.getUnaryOperatorReturnType(
			keyedProcessFunction,
			KeyedProcessFunction.class,
			1,
			2,
			TypeExtractor.NO_INDEX,
			getType(),
			Utils.getCallLocationName(),
			true);

	return process(keyedProcessFunction, outType);
}
 
Example #4
Source File: ProcTimeRangeBoundedPrecedingFunction.java    From flink with Apache License 2.0
@Override
public void processElement(
		BaseRow input,
		KeyedProcessFunction<K, BaseRow, BaseRow>.Context ctx,
		Collector<BaseRow> out) throws Exception {
	long currentTime = ctx.timerService().currentProcessingTime();
	// register state-cleanup timer
	registerProcessingCleanupTimer(ctx, currentTime);

	// buffer the incoming event

	// add current element to the window list of elements with corresponding timestamp
	List<BaseRow> rowList = inputState.get(currentTime);
	// null value means that this is the first event received for this timestamp
	if (rowList == null) {
		rowList = new ArrayList<BaseRow>();
		// register timer to process event once the current millisecond passed
		ctx.timerService().registerProcessingTimeTimer(currentTime + 1);
	}
	rowList.add(input);
	inputState.put(currentTime, rowList);
}
 
Example #5
Source File: KeyedStream.java    From flink with Apache License 2.0
/**
 * Applies the given {@link KeyedProcessFunction} on the input stream, thereby creating a transformed output stream.
 *
 * <p>The function will be called for every element in the input streams and can produce zero
 * or more output elements. Contrary to the {@link DataStream#flatMap(FlatMapFunction)}
 * function, this function can also query the time and set timers. When reacting to the firing
 * of set timers the function can directly emit elements and/or register yet more timers.
 *
 * @param keyedProcessFunction The {@link KeyedProcessFunction} that is called for each element in the stream.
 *
 * @param <R> The type of elements emitted by the {@code KeyedProcessFunction}.
 *
 * @return The transformed {@link DataStream}.
 */
@PublicEvolving
public <R> SingleOutputStreamOperator<R> process(KeyedProcessFunction<KEY, T, R> keyedProcessFunction) {

	TypeInformation<R> outType = TypeExtractor.getUnaryOperatorReturnType(
			keyedProcessFunction,
			KeyedProcessFunction.class,
			1,
			2,
			TypeExtractor.NO_INDEX,
			getType(),
			Utils.getCallLocationName(),
			true);

	return process(keyedProcessFunction, outType);
}
 
Example #7
Source File: RowTimeRowsBoundedPrecedingFunction.java    From flink with Apache License 2.0
@Override
public void processElement(
		RowData input,
		KeyedProcessFunction<K, RowData, RowData>.Context ctx,
		Collector<RowData> out) throws Exception {
	// register state-cleanup timer
	registerProcessingCleanupTimer(ctx, ctx.timerService().currentProcessingTime());

	// triggering timestamp for trigger calculation
	long triggeringTs = input.getLong(rowTimeIdx);

	Long lastTriggeringTs = lastTriggeringTsState.value();
	if (lastTriggeringTs == null) {
		lastTriggeringTs = 0L;
	}

	// check if the data is expired, if not, save the data and register event time timer
	if (triggeringTs > lastTriggeringTs) {
		List<RowData> data = inputState.get(triggeringTs);
		if (null != data) {
			data.add(input);
			inputState.put(triggeringTs, data);
		} else {
			data = new ArrayList<RowData>();
			data.add(input);
			inputState.put(triggeringTs, data);
			// register event time timer
			ctx.timerService().registerEventTimeTimer(triggeringTs);
		}
	}
}
 
Example #8
Source File: ProcTimeRowsBoundedPrecedingFunction.java    From flink with Apache License 2.0
@Override
public void onTimer(
		long timestamp,
		KeyedProcessFunction<K, RowData, RowData>.OnTimerContext ctx,
		Collector<RowData> out) throws Exception {
	if (stateCleaningEnabled) {
		cleanupState(inputState, accState, counterState, smallestTsState);
		function.cleanup();
	}
}
 
Example #9
Source File: ProcTimeUnboundedPrecedingFunction.java    From flink with Apache License 2.0
@Override
public void onTimer(
		long timestamp,
		KeyedProcessFunction<K, RowData, RowData>.OnTimerContext ctx,
		Collector<RowData> out) throws Exception {
	if (stateCleaningEnabled) {
		cleanupState(accState);
		function.cleanup();
	}
}
 
Example #10
Source File: ProcTimeUnboundedPrecedingFunction.java    From flink with Apache License 2.0
@Override
public void processElement(
		RowData input,
		KeyedProcessFunction<K, RowData, RowData>.Context ctx,
		Collector<RowData> out) throws Exception {
	// register state-cleanup timer
	registerProcessingCleanupTimer(ctx, ctx.timerService().currentProcessingTime());

	RowData accumulators = accState.value();
	if (null == accumulators) {
		accumulators = function.createAccumulators();
	}
	// set accumulators in context first
	function.setAccumulators(accumulators);

	// accumulate input row
	function.accumulate(input);

	// update the value of accumulators for future incremental computation
	accumulators = function.getAccumulators();
	accState.update(accumulators);

	// prepare output row
	RowData aggValue = function.getValue();
	output.replace(input, aggValue);
	out.collect(output);
}
 
Example #11
Source File: ProcTimeRangeBoundedPrecedingFunction.java    From flink with Apache License 2.0
private void registerCleanupTimer(
		KeyedProcessFunction<K, RowData, RowData>.Context ctx,
		long timestamp) throws Exception {
	// calculate safe timestamp to cleanup states
	long minCleanupTimestamp = timestamp + precedingTimeBoundary + 1;
	long maxCleanupTimestamp = timestamp + (long) (precedingTimeBoundary * 1.5) + 1;
	// update timestamp and register timer if needed
	Long curCleanupTimestamp = cleanupTsState.value();
	if (curCleanupTimestamp == null || curCleanupTimestamp < minCleanupTimestamp) {
		// we don't delete existing timer since it may delete timer for data processing
		// TODO Use timer with namespace to distinguish timers
		ctx.timerService().registerProcessingTimeTimer(maxCleanupTimestamp);
		cleanupTsState.update(maxCleanupTimestamp);
	}
}
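
The cleanup-timer arithmetic above is easier to follow with concrete numbers. Below is a plain-Java sketch of the same condition, assuming a nullable field in place of cleanupTsState and a list in place of the timer service; the class name CleanupTimerSketch and the field registeredTimers are hypothetical and not part of Flink.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java sketch of the cleanup-timer throttling; no Flink classes are involved. */
final class CleanupTimerSketch {
    private final long precedingTimeBoundary;
    // Stand-in for cleanupTsState; null means no cleanup timer has been registered yet.
    private Long cleanupTs;
    // Stand-in for the timer service: every timestamp a timer was registered for.
    final List<Long> registeredTimers = new ArrayList<>();

    CleanupTimerSketch(long precedingTimeBoundary) {
        this.precedingTimeBoundary = precedingTimeBoundary;
    }

    void registerCleanupTimer(long timestamp) {
        // Earliest point at which state for this timestamp can be dropped.
        long minCleanupTimestamp = timestamp + precedingTimeBoundary + 1;
        // Timers are scheduled later than strictly necessary, at 1.5x the boundary.
        long maxCleanupTimestamp = timestamp + (long) (precedingTimeBoundary * 1.5) + 1;
        // Register a new timer only if the pending one would fire too early.
        if (cleanupTs == null || cleanupTs < minCleanupTimestamp) {
            registeredTimers.add(maxCleanupTimestamp);
            cleanupTs = maxCleanupTimestamp;
        }
    }

    public static void main(String[] args) {
        CleanupTimerSketch sketch = new CleanupTimerSketch(1000L);
        sketch.registerCleanupTimer(0L);   // registers 1501
        sketch.registerCleanupTimer(400L); // min is 1401, pending 1501 still covers it
        sketch.registerCleanupTimer(600L); // min is 1601, so a new timer at 2101 is needed
        System.out.println(sketch.registeredTimers); // [1501, 2101]
    }
}
```

Because each timer is scheduled at 1.5 times the boundary rather than at the minimum, records that arrive shortly after one another reuse the pending timer instead of registering a new one each time.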
 
Example #12
Source File: RowTimeRangeBoundedPrecedingFunction.java    From flink with Apache License 2.0
@Override
public void processElement(
		RowData input,
		KeyedProcessFunction<K, RowData, RowData>.Context ctx,
		Collector<RowData> out) throws Exception {
	// triggering timestamp for trigger calculation
	long triggeringTs = input.getLong(rowTimeIdx);

	Long lastTriggeringTs = lastTriggeringTsState.value();
	if (lastTriggeringTs == null) {
		lastTriggeringTs = 0L;
	}

	// check if the data is expired, if not, save the data and register event time timer
	if (triggeringTs > lastTriggeringTs) {
		List<RowData> data = inputState.get(triggeringTs);
		if (null != data) {
			data.add(input);
			inputState.put(triggeringTs, data);
		} else {
			data = new ArrayList<RowData>();
			data.add(input);
			inputState.put(triggeringTs, data);
			// register event time timer
			ctx.timerService().registerEventTimeTimer(triggeringTs);
		}
		registerCleanupTimer(ctx, triggeringTs);
	}
}
 
Example #13
Source File: RowTimeRangeBoundedPrecedingFunction.java    From flink with Apache License 2.0
private void registerCleanupTimer(
		KeyedProcessFunction<K, RowData, RowData>.Context ctx,
		long timestamp) throws Exception {
	// calculate safe timestamp to cleanup states
	long minCleanupTimestamp = timestamp + precedingOffset + 1;
	long maxCleanupTimestamp = timestamp + (long) (precedingOffset * 1.5) + 1;
	// update timestamp and register timer if needed
	Long curCleanupTimestamp = cleanupTsState.value();
	if (curCleanupTimestamp == null || curCleanupTimestamp < minCleanupTimestamp) {
		// we don't delete existing timer since it may delete timer for data processing
		// TODO Use timer with namespace to distinguish timers
		ctx.timerService().registerEventTimeTimer(maxCleanupTimestamp);
		cleanupTsState.update(maxCleanupTimestamp);
	}
}
 
Example #14
Source File: AbstractRowTimeUnboundedPrecedingOver.java    From flink with Apache License 2.0
/**
 * Puts an element from the input stream into state if it is not late.
 * Registers a timer for the next watermark.
 *
 * @param input The input value.
 * @param ctx   A {@link Context} that allows querying the timestamp of the element and getting
 *              TimerService for registering timers and querying the time. The
 *              context is only valid during the invocation of this method, do not store it.
 * @param out   The collector for returning result values.
 * @throws Exception
 */
@Override
public void processElement(
		RowData input,
		KeyedProcessFunction<K, RowData, RowData>.Context ctx,
		Collector<RowData> out) throws Exception {
	// register state-cleanup timer
	registerProcessingCleanupTimer(ctx, ctx.timerService().currentProcessingTime());

	long timestamp = input.getLong(rowTimeIdx);
	long curWatermark = ctx.timerService().currentWatermark();

	// discard late record
	if (timestamp > curWatermark) {
		// ensure every key just registers one timer
		// the default watermark is Long.MIN_VALUE; to avoid overflow, use zero when the watermark is < 0
		long triggerTs = curWatermark < 0 ? 0 : curWatermark + 1;
		ctx.timerService().registerEventTimeTimer(triggerTs);

		// put row into state
		List<RowData> rowList = inputState.get(timestamp);
		if (rowList == null) {
			rowList = new ArrayList<RowData>();
		}
		rowList.add(input);
		inputState.put(timestamp, rowList);
	}
}
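
The late-record check and the trigger-timestamp expression above can be exercised in isolation. This is a plain-Java sketch under stated assumptions: the watermark is passed in as a parameter instead of being read from a timer service, a TreeSet stands in for timer registration (registering the same timestamp twice leaves one entry), and the class name LateRecordSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Plain-Java sketch of late-record filtering and timer selection; no Flink classes. */
final class LateRecordSketch {
    // Assumption: a set models timers, so a repeated timestamp is stored once.
    final Set<Long> timers = new TreeSet<>();
    // Timestamps of the records that were accepted and buffered.
    final List<Long> buffered = new ArrayList<>();

    void onRecord(long timestamp, long currentWatermark) {
        // A record at or behind the watermark is late and is silently dropped.
        if (timestamp > currentWatermark) {
            // Same expression as the original: zero while the watermark is negative.
            long triggerTs = currentWatermark < 0 ? 0 : currentWatermark + 1;
            timers.add(triggerTs);
            buffered.add(timestamp);
        }
    }

    public static void main(String[] args) {
        LateRecordSketch sketch = new LateRecordSketch();
        sketch.onRecord(5L, Long.MIN_VALUE); // no watermark yet: timer at 0
        sketch.onRecord(7L, Long.MIN_VALUE); // same timer again
        sketch.onRecord(12L, 10L);           // timer at 11
        sketch.onRecord(9L, 10L);            // late: dropped
        System.out.println(sketch.timers);   // [0, 11]
        System.out.println(sketch.buffered); // [5, 7, 12]
    }
}
```

All records that arrive between two watermark advances compute the same trigger timestamp, which is how the original keeps each key down to a single pending timer.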
 
Example #15
Source File: AbstractRowTimeUnboundedPrecedingOver.java    From flink with Apache License 2.0
/**
 * Puts an element from the input stream into state if it is not late.
 * Registers a timer for the next watermark.
 *
 * @param input The input value.
 * @param ctx   A {@link Context} that allows querying the timestamp of the element and getting
 *              TimerService for registering timers and querying the time. The
 *              context is only valid during the invocation of this method, do not store it.
 * @param out   The collector for returning result values.
 * @throws Exception
 */
@Override
public void processElement(
		BaseRow input,
		KeyedProcessFunction<K, BaseRow, BaseRow>.Context ctx,
		Collector<BaseRow> out) throws Exception {
	// register state-cleanup timer
	registerProcessingCleanupTimer(ctx, ctx.timerService().currentProcessingTime());

	long timestamp = input.getLong(rowTimeIdx);
	long curWatermark = ctx.timerService().currentWatermark();

	// discard late record
	if (timestamp > curWatermark) {
		// ensure every key just registers one timer
		// the default watermark is Long.MIN_VALUE; to avoid overflow, use zero when the watermark is < 0
		long triggerTs = curWatermark < 0 ? 0 : curWatermark + 1;
		ctx.timerService().registerEventTimeTimer(triggerTs);

		// put row into state
		List<BaseRow> rowList = inputState.get(timestamp);
		if (rowList == null) {
			rowList = new ArrayList<BaseRow>();
		}
		rowList.add(input);
		inputState.put(timestamp, rowList);
	}
}
 
Example #16
Source File: RowTimeRangeBoundedPrecedingFunction.java    From flink with Apache License 2.0
@Override
public void processElement(
		BaseRow input,
		KeyedProcessFunction<K, BaseRow, BaseRow>.Context ctx,
		Collector<BaseRow> out) throws Exception {
	// register state-cleanup timer
	registerProcessingCleanupTimer(ctx, ctx.timerService().currentProcessingTime());

	// triggering timestamp for trigger calculation
	long triggeringTs = input.getLong(rowTimeIdx);

	Long lastTriggeringTs = lastTriggeringTsState.value();
	if (lastTriggeringTs == null) {
		lastTriggeringTs = 0L;
	}

	// check if the data is expired, if not, save the data and register event time timer
	if (triggeringTs > lastTriggeringTs) {
		List<BaseRow> data = inputState.get(triggeringTs);
		if (null != data) {
			data.add(input);
			inputState.put(triggeringTs, data);
		} else {
			data = new ArrayList<BaseRow>();
			data.add(input);
			inputState.put(triggeringTs, data);
			// register event time timer
			ctx.timerService().registerEventTimeTimer(triggeringTs);
		}
	}
}
 
Example #17
Source File: RowTimeRowsBoundedPrecedingFunction.java    From flink with Apache License 2.0
@Override
public void processElement(
		BaseRow input,
		KeyedProcessFunction<K, BaseRow, BaseRow>.Context ctx,
		Collector<BaseRow> out) throws Exception {
	// register state-cleanup timer
	registerProcessingCleanupTimer(ctx, ctx.timerService().currentProcessingTime());

	// triggering timestamp for trigger calculation
	long triggeringTs = input.getLong(rowTimeIdx);

	Long lastTriggeringTs = lastTriggeringTsState.value();
	if (lastTriggeringTs == null) {
		lastTriggeringTs = 0L;
	}

	// check if the data is expired, if not, save the data and register event time timer
	if (triggeringTs > lastTriggeringTs) {
		List<BaseRow> data = inputState.get(triggeringTs);
		if (null != data) {
			data.add(input);
			inputState.put(triggeringTs, data);
		} else {
			data = new ArrayList<BaseRow>();
			data.add(input);
			inputState.put(triggeringTs, data);
			// register event time timer
			ctx.timerService().registerEventTimeTimer(triggeringTs);
		}
	}
}
 
Example #18
Source File: ProcTimeRowsBoundedPrecedingFunction.java    From flink with Apache License 2.0
@Override
public void onTimer(
		long timestamp,
		KeyedProcessFunction<K, BaseRow, BaseRow>.OnTimerContext ctx,
		Collector<BaseRow> out) throws Exception {
	if (stateCleaningEnabled) {
		cleanupState(inputState, accState, counterState, smallestTsState);
		function.cleanup();
	}
}
 
Example #19
Source File: ProcTimeUnboundedPrecedingFunction.java    From flink with Apache License 2.0
@Override
public void onTimer(
		long timestamp,
		KeyedProcessFunction<K, BaseRow, BaseRow>.OnTimerContext ctx,
		Collector<BaseRow> out) throws Exception {
	if (stateCleaningEnabled) {
		cleanupState(accState);
		function.cleanup();
	}
}
 
Example #20
Source File: ProcTimeUnboundedPrecedingFunction.java    From flink with Apache License 2.0
@Override
public void processElement(
		BaseRow input,
		KeyedProcessFunction<K, BaseRow, BaseRow>.Context ctx,
		Collector<BaseRow> out) throws Exception {
	// register state-cleanup timer
	registerProcessingCleanupTimer(ctx, ctx.timerService().currentProcessingTime());

	BaseRow accumulators = accState.value();
	if (null == accumulators) {
		accumulators = function.createAccumulators();
	}
	// set accumulators in context first
	function.setAccumulators(accumulators);

	// accumulate input row
	function.accumulate(input);

	// update the value of accumulators for future incremental computation
	accumulators = function.getAccumulators();
	accState.update(accumulators);

	// prepare output row
	BaseRow aggValue = function.getValue();
	output.replace(input, aggValue);
	out.collect(output);
}
 
Example #21
Source File: ProcessFunctionTestHarnessesTest.java    From flink with Apache License 2.0
@Test
public void testHarnessForKeyedProcessFunction() throws Exception {
	KeyedProcessFunction<Integer, Integer, Integer> function = new KeyedProcessFunction<Integer, Integer, Integer>() {
		@Override
		public void processElement(Integer value, Context ctx, Collector<Integer> out) throws Exception {
			out.collect(value);
		}
	};
	OneInputStreamOperatorTestHarness<Integer, Integer> harness = ProcessFunctionTestHarnesses
		.forKeyedProcessFunction(function, x -> x, BasicTypeInfo.INT_TYPE_INFO);

	harness.processElement(1, 10);

	// JUnit's assertEquals takes the expected value first, then the actual value
	assertEquals(Collections.singletonList(1), harness.extractOutputValues());
}
 
Example #22
Source File: AbstractRowTimeUnboundedPrecedingOver.java    From flink with Apache License 2.0
@Override
public void onTimer(
		long timestamp,
		KeyedProcessFunction<K, RowData, RowData>.OnTimerContext ctx,
		Collector<RowData> out) throws Exception {
	if (isProcessingTimeTimer(ctx)) {
		if (stateCleaningEnabled) {

			// we check whether there are still records which have not been processed yet
			if (inputState.isEmpty()) {
				// we clean the state
				cleanupState(inputState, accState);
				function.cleanup();
			} else {
				// There are records left to process because a watermark has not been received yet.
				// This would only happen if the input stream has stopped. So we don't need to clean up.
				// We leave the state as it is and schedule a new cleanup timer
				registerProcessingCleanupTimer(ctx, ctx.timerService().currentProcessingTime());
			}
		}
		return;
	}

	Iterator<Long> keyIterator = inputState.keys().iterator();
	if (keyIterator.hasNext()) {
		Long curWatermark = ctx.timerService().currentWatermark();
		boolean existEarlyRecord = false;

		// sort the record timestamps
		do {
			Long recordTime = keyIterator.next();
			// only take timestamps smaller/equal to the watermark
			if (recordTime <= curWatermark) {
				insertToSortedList(recordTime);
			} else {
				existEarlyRecord = true;
			}
		} while (keyIterator.hasNext());

		// get last accumulator
		RowData lastAccumulator = accState.value();
		if (lastAccumulator == null) {
			// initialize accumulator
			lastAccumulator = function.createAccumulators();
		}
		// set accumulator in function context first
		function.setAccumulators(lastAccumulator);

		// emit the rows in order
		while (!sortedTimestamps.isEmpty()) {
			Long curTimestamp = sortedTimestamps.removeFirst();
			List<RowData> curRowList = inputState.get(curTimestamp);
			if (curRowList != null) {
				// process the rows that share this timestamp; the mechanism differs between ROWS and RANGE
				processElementsWithSameTimestamp(curRowList, out);
			} else {
				// Ignore the rows for this timestamp if the state has already been cleared.
				LOG.warn("The state is cleared because of state ttl. " +
					"This will result in incorrect result. " +
					"You can increase the state ttl to avoid this.");
			}
			inputState.remove(curTimestamp);
		}

		// update acc state
		lastAccumulator = function.getAccumulators();
		accState.update(lastAccumulator);

		// if there are rows with timestamp > watermark, register a timer for the next watermark
		if (existEarlyRecord) {
			ctx.timerService().registerEventTimeTimer(curWatermark + 1);
		}
	}

	// update cleanup timer
	registerProcessingCleanupTimer(ctx, ctx.timerService().currentProcessingTime());
}
 
Example #23
Source File: KeyedProcessOperator.java    From flink with Apache License 2.0
public KeyedProcessOperator(KeyedProcessFunction<K, IN, OUT> function) {
	super(function);

	chainingStrategy = ChainingStrategy.ALWAYS;
}
 
Example #24
Source File: KeyedProcessOperator.java    From flink with Apache License 2.0
ContextImpl(KeyedProcessFunction<K, IN, OUT> function, TimerService timerService) {
	function.super();
	this.timerService = checkNotNull(timerService);
}
 
Example #25
Source File: KeyedProcessOperator.java    From flink with Apache License 2.0
OnTimerContextImpl(KeyedProcessFunction<K, IN, OUT> function, TimerService timerService) {
	function.super();
	this.timerService = checkNotNull(timerService);
}
 
Example #26
Source File: KeyedProcessOperatorTest.java    From flink with Apache License 2.0
@Test
public void testKeyQuerying() throws Exception {

	class KeyQueryingProcessFunction extends KeyedProcessFunction<Integer, Tuple2<Integer, String>, String> {

		@Override
		public void processElement(
			Tuple2<Integer, String> value,
			Context ctx,
			Collector<String> out) throws Exception {

			assertTrue("Did not get expected key.", ctx.getCurrentKey().equals(value.f0));

			// we check that we receive this output, to ensure that the assert was actually checked
			out.collect(value.f1);
		}
	}

	KeyedProcessOperator<Integer, Tuple2<Integer, String>, String> operator =
		new KeyedProcessOperator<>(new KeyQueryingProcessFunction());

	try (
		OneInputStreamOperatorTestHarness<Tuple2<Integer, String>, String> testHarness =
			new KeyedOneInputStreamOperatorTestHarness<>(operator, (in) -> in.f0 , BasicTypeInfo.INT_TYPE_INFO)) {

		testHarness.setup();
		testHarness.open();

		testHarness.processElement(new StreamRecord<>(Tuple2.of(5, "5"), 12L));
		testHarness.processElement(new StreamRecord<>(Tuple2.of(42, "42"), 13L));

		ConcurrentLinkedQueue<Object> expectedOutput = new ConcurrentLinkedQueue<>();
		expectedOutput.add(new StreamRecord<>("5", 12L));
		expectedOutput.add(new StreamRecord<>("42", 13L));

		TestHarnessUtil.assertOutputEquals(
			"Output was not correct.",
			expectedOutput,
			testHarness.getOutput());
	}
}
 
Example #28
Source File: ProcTimeRowsBoundedPrecedingFunction.java    From flink with Apache License 2.0
@Override
public void processElement(
		RowData input,
		KeyedProcessFunction<K, RowData, RowData>.Context ctx,
		Collector<RowData> out) throws Exception {
	long currentTime = ctx.timerService().currentProcessingTime();
	// register state-cleanup timer
	registerProcessingCleanupTimer(ctx, currentTime);

	// initialize state for the processed element
	RowData accumulators = accState.value();
	if (accumulators == null) {
		accumulators = function.createAccumulators();
	}
	// set accumulators in context first
	function.setAccumulators(accumulators);

	// get smallest timestamp
	Long smallestTs = smallestTsState.value();
	if (smallestTs == null) {
		smallestTs = currentTime;
		smallestTsState.update(smallestTs);
	}
	// get previous counter value
	Long counter = counterState.value();
	if (counter == null) {
		counter = 0L;
	}

	if (counter == precedingOffset) {
		List<RowData> retractList = inputState.get(smallestTs);
		if (retractList != null) {
			// get the oldest element beyond the buffer size
			// and, if it exists, retract its value
			RowData retractRow = retractList.get(0);
			function.retract(retractRow);
			retractList.remove(0);
		} else {
			// Does not retract values which are outside of window if the state is cleared already.
			LOG.warn("The state is cleared because of state ttl. " +
				"This will result in incorrect result. " +
				"You can increase the state ttl to avoid this.");
		}
		// if reference timestamp list not empty, keep the list
		if (retractList != null && !retractList.isEmpty()) {
			inputState.put(smallestTs, retractList);
		} // if smallest timestamp list is empty, remove and find new smallest
		else {
			inputState.remove(smallestTs);
			Iterator<Long> iter = inputState.keys().iterator();
			long currentTs = 0L;
			long newSmallestTs = Long.MAX_VALUE;
			while (iter.hasNext()) {
				currentTs = iter.next();
				if (currentTs < newSmallestTs) {
					newSmallestTs = currentTs;
				}
			}
			smallestTsState.update(newSmallestTs);
		}
	} // we update the counter only while buffer is getting filled
	else {
		counter += 1;
		counterState.update(counter);
	}

	// update map state, counter and timestamp
	List<RowData> currentTimeState = inputState.get(currentTime);
	if (currentTimeState != null) {
		currentTimeState.add(input);
		inputState.put(currentTime, currentTimeState);
	} else { // add new input
		List<RowData> newList = new ArrayList<RowData>();
		newList.add(input);
		inputState.put(currentTime, newList);
	}

	// accumulate current row
	function.accumulate(input);
	// update the value of accumulators for future incremental computation
	accumulators = function.getAccumulators();
	accState.update(accumulators);

	// prepare output row
	RowData aggValue = function.getValue();
	output.replace(input, aggValue);
	out.collect(output);
}
 
Example #29
Source File: StreamBookmarker.java    From pravega-samples with Apache License 2.0
public static void main(String[] args) throws Exception {
    // Initialize the parameter utility tool in order to retrieve input parameters.
    ParameterTool params = ParameterTool.fromArgs(args);

    // Clients contact the Pravega controller to get information about Streams.
    URI pravegaControllerURI = URI.create(params.get(Constants.CONTROLLER_ADDRESS_PARAM, Constants.CONTROLLER_ADDRESS));
    PravegaConfig pravegaConfig = PravegaConfig
            .fromParams(params)
            .withControllerURI(pravegaControllerURI)
            .withDefaultScope(Constants.DEFAULT_SCOPE);

    // Create the scope if it is not present.
    StreamManager streamManager = StreamManager.create(pravegaControllerURI);
    streamManager.createScope(Constants.DEFAULT_SCOPE);

    // Create the Pravega source to read from data produced by DataProducer.
    Stream sensorEvents = Utils.createStream(pravegaConfig, Constants.PRODUCER_STREAM);
    SourceFunction<Tuple2<Integer, Double>> reader = FlinkPravegaReader.<Tuple2<Integer, Double>>builder()
            .withPravegaConfig(pravegaConfig)
            .forStream(sensorEvents)
            .withReaderGroupName(READER_GROUP_NAME)
            .withDeserializationSchema(new Tuple2DeserializationSchema())
            .build();

    // Create the Pravega sink to output the stream cuts representing slices to analyze.
    Stream streamCutsStream = Utils.createStream(pravegaConfig, Constants.STREAMCUTS_STREAM);
    SinkFunction<SensorStreamSlice> writer = FlinkPravegaWriter.<SensorStreamSlice>builder()
            .withPravegaConfig(pravegaConfig)
            .forStream(streamCutsStream)
            .withSerializationSchema(PravegaSerialization.serializationFor(SensorStreamSlice.class))
            .withEventRouter(new EventRouter())
            .build();

    // Initialize the Flink execution environment.
    final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment()
                                                                     .enableCheckpointing(CHECKPOINT_INTERVAL);
    env.getCheckpointConfig().setCheckpointTimeout(CHECKPOINT_INTERVAL);
    GuavaImmutableMapSerializer.registerSerializers(env.getConfig());

    // Bookmark those sections of the stream with values < 0 and write the output (StreamCuts).
    DataStreamSink<SensorStreamSlice> dataStreamSink = env.addSource(reader)
                                                          .setParallelism(Constants.PARALLELISM)
                                                          .keyBy(0)
                                                          .process((KeyedProcessFunction) new Bookmarker(pravegaControllerURI))
                                                          .addSink(writer);

    // Execute within the Flink environment.
    env.execute("StreamBookmarker");
    LOG.info("Ending StreamBookmarker...");
}
 