Java Code Examples for org.apache.flink.api.java.Utils#getCallLocationName()

The following examples show how to use org.apache.flink.api.java.Utils#getCallLocationName().
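Before the examples, it may help to see how a call-location helper of this kind can work. The following is a minimal sketch, not Flink's source: it assumes the helper inspects the current thread's stack trace and formats one frame as `method(File.java:line)`. The class `CallLocationSketch` and the methods `callLocationName` and `apiMethod` are hypothetical names used only for illustration. The `depth` parameter mirrors the explicit depth argument seen in `Utils.getCallLocationName(4)` in Example 3.

```java
class CallLocationSketch {

    /**
     * Hypothetical helper: formats the stack frame at the given depth as
     * "method(File.java:line)", or returns "<unknown>" if no such frame exists.
     */
    static String callLocationName(int depth) {
        StackTraceElement[] stackTrace = Thread.currentThread().getStackTrace();
        if (depth < 0 || stackTrace.length <= depth) {
            return "<unknown>";
        }
        StackTraceElement elem = stackTrace[depth];
        return String.format("%s(%s:%d)",
                elem.getMethodName(), elem.getFileName(), elem.getLineNumber());
    }

    /** Stands in for an API method (such as an operator factory) that records who called it. */
    static String apiMethod() {
        // On HotSpot the frames are typically:
        // 0 = Thread.getStackTrace, 1 = callLocationName, 2 = apiMethod, 3 = caller of apiMethod.
        // Other JVMs may omit frames, so the depth is an assumption, not a guarantee.
        return callLocationName(3);
    }

    public static void main(String[] args) {
        // Prints the location of this call site in the form method(File.java:line).
        System.out.println(apiMethod());
    }
}
```

The depth has to match the number of frames between the helper and the user's call site, which is why a method that adds an extra layer of indirection would need a different depth than one that is called directly.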
Example 1
Source File: CsvReader.java    From Flink-CEPplus with Apache License 2.0
/**
 * Configures the reader to read the CSV data and parse it to the given type. The type must be a subclass of
 * {@link Tuple}. The type information for the fields is obtained from the type class. The type
 * consequently needs to specify all generic field types of the tuple.
 *
 * @param targetType The class of the target type, needs to be a subclass of Tuple.
 * @return The DataSet representing the parsed CSV data.
 */
public <T extends Tuple> DataSource<T> tupleType(Class<T> targetType) {
	Preconditions.checkNotNull(targetType, "The target type class must not be null.");
	if (!Tuple.class.isAssignableFrom(targetType)) {
		throw new IllegalArgumentException("The target type must be a subclass of " + Tuple.class.getName());
	}

	@SuppressWarnings("unchecked")
	TupleTypeInfo<T> typeInfo = (TupleTypeInfo<T>) TypeExtractor.createTypeInfo(targetType);
	CsvInputFormat<T> inputFormat = new TupleCsvInputFormat<T>(path, this.lineDelimiter, this.fieldDelimiter, typeInfo, this.includedMask);

	Class<?>[] classes = new Class<?>[typeInfo.getArity()];
	for (int i = 0; i < typeInfo.getArity(); i++) {
		classes[i] = typeInfo.getTypeAt(i).getTypeClass();
	}

	configureInputFormat(inputFormat);
	return new DataSource<T>(executionContext, inputFormat, typeInfo, Utils.getCallLocationName());
}
 
Example 2
Source File: CsvReader.java    From flink with Apache License 2.0
/**
 * Configures the reader to read the CSV data and parse it to the given type. The type must be a subclass of
 * {@link Tuple}. The type information for the fields is obtained from the type class. The type
 * consequently needs to specify all generic field types of the tuple.
 *
 * @param targetType The class of the target type, needs to be a subclass of Tuple.
 * @return The DataSet representing the parsed CSV data.
 */
public <T extends Tuple> DataSource<T> tupleType(Class<T> targetType) {
	Preconditions.checkNotNull(targetType, "The target type class must not be null.");
	if (!Tuple.class.isAssignableFrom(targetType)) {
		throw new IllegalArgumentException("The target type must be a subclass of " + Tuple.class.getName());
	}

	@SuppressWarnings("unchecked")
	TupleTypeInfo<T> typeInfo = (TupleTypeInfo<T>) TypeExtractor.createTypeInfo(targetType);
	CsvInputFormat<T> inputFormat = new TupleCsvInputFormat<T>(path, this.lineDelimiter, this.fieldDelimiter, typeInfo, this.includedMask);

	Class<?>[] classes = new Class<?>[typeInfo.getArity()];
	for (int i = 0; i < typeInfo.getArity(); i++) {
		classes[i] = typeInfo.getTypeAt(i).getTypeClass();
	}

	configureInputFormat(inputFormat);
	return new DataSource<T>(executionContext, inputFormat, typeInfo, Utils.getCallLocationName());
}
 
Example 3
Source File: JoinOperatorSetsBase.java    From flink with Apache License 2.0
protected DefaultJoin<I1, I2> createDefaultJoin(Keys<I2> keys2) {
	if (keys2 == null) {
		throw new NullPointerException("The join keys may not be null.");
	}

	if (keys2.isEmpty()) {
		throw new InvalidProgramException("The join keys may not be empty.");
	}

	try {
		keys1.areCompatible(keys2);
	} catch (Keys.IncompatibleKeysException e) {
		throw new InvalidProgramException("The pair of join keys are not compatible with each other.", e);
	}
	return new DefaultJoin<>(input1, input2, keys1, keys2, joinHint, Utils.getCallLocationName(4), joinType);
}
 
Example 4
Source File: JoinOperator.java    From Flink-CEPplus with Apache License 2.0
public <R> EquiJoin<I1, I2, R> with(JoinFunction<I1, I2, R> function) {
	if (function == null) {
		throw new NullPointerException("Join function must not be null.");
	}
	FlatJoinFunction<I1, I2, R> generatedFunction = new WrappingFlatJoinFunction<>(clean(function));
	TypeInformation<R> returnType = TypeExtractor.getJoinReturnTypes(function, getInput1Type(), getInput2Type(), Utils.getCallLocationName(), true);
	return new EquiJoin<>(getInput1(), getInput2(), getKeys1(), getKeys2(), generatedFunction, function, returnType, getJoinHint(), Utils.getCallLocationName(), joinType);
}
 
Example 5
Source File: CoGroupOperator.java    From Flink-CEPplus with Apache License 2.0
/**
 * Finalizes a CoGroup transformation by applying a {@link org.apache.flink.api.common.functions.RichCoGroupFunction} to groups of elements with identical keys.
 *
 * <p>Each CoGroupFunction call can emit an arbitrary number of elements.
 *
 * @param function The CoGroupFunction that is called for all groups of elements with identical keys.
 * @return A CoGroupOperator that represents the co-grouped result DataSet.
 *
 * @see org.apache.flink.api.common.functions.RichCoGroupFunction
 * @see DataSet
 */
public <R> CoGroupOperator<I1, I2, R> with(CoGroupFunction<I1, I2, R> function) {
	if (function == null) {
		throw new NullPointerException("CoGroup function must not be null.");
	}
	TypeInformation<R> returnType = TypeExtractor.getCoGroupReturnTypes(function, input1.getType(), input2.getType(),
			Utils.getCallLocationName(), true);

	return new CoGroupOperator<>(input1, input2, keys1, keys2, input1.clean(function), returnType,
			groupSortKeyOrderFirst, groupSortKeyOrderSecond,
			customPartitioner, Utils.getCallLocationName());
}
 
Example 6
Source File: DataSetUtils.java    From flink with Apache License 2.0
/**
 * Generates a fixed-size sample of the elements in a DataSet.
 *
 * <p><strong>NOTE:</strong> Sampling with a fixed size is less efficient than sampling with a fraction;
 * use sampling with a fraction unless you need an exact sample size.
 *
 * @param input           The DataSet to sample from.
 * @param withReplacement Whether an element can be selected more than once.
 * @param numSamples      The expected sample size.
 * @param seed            Random number generator seed.
 * @return The sampled DataSet.
 */
public static <T> DataSet<T> sampleWithSize(
	DataSet<T> input,
	final boolean withReplacement,
	final int numSamples,
	final long seed) {

	SampleInPartition<T> sampleInPartition = new SampleInPartition<>(withReplacement, numSamples, seed);
	MapPartitionOperator mapPartitionOperator = input.mapPartition(sampleInPartition);

	// There is no previous group, so the parallelism of GroupReduceOperator is always 1.
	String callLocation = Utils.getCallLocationName();
	SampleInCoordinator<T> sampleInCoordinator = new SampleInCoordinator<>(withReplacement, numSamples, seed);
	return new GroupReduceOperator<>(mapPartitionOperator, input.getType(), sampleInCoordinator, callLocation);
}
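The two-phase structure above (a per-partition step followed by a single coordinator step) can be illustrated in plain Java. The following is a minimal sketch under stated assumptions, not Flink's implementation: it assumes sampling without replacement, gives every element a random weight, keeps the k highest weights in each partition, and then keeps the k highest weights overall. The names `TwoPhaseSampleSketch`, `Weighted`, `topK`, `sampleInPartition`, and `sampleInCoordinator` are hypothetical and only echo the class names used in the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Random;

class TwoPhaseSampleSketch {

    /** An element together with the random weight that ranks it. */
    static final class Weighted<T> {
        final double weight;
        final T element;

        Weighted(double weight, T element) {
            this.weight = weight;
            this.element = element;
        }
    }

    /** Keeps the k highest-weighted candidates using a min-heap of size at most k. */
    static <T> List<Weighted<T>> topK(Iterable<Weighted<T>> candidates, int k) {
        PriorityQueue<Weighted<T>> heap =
                new PriorityQueue<>(Comparator.comparingDouble((Weighted<T> w) -> w.weight));
        for (Weighted<T> candidate : candidates) {
            heap.add(candidate);
            if (heap.size() > k) {
                heap.poll(); // drop the candidate with the lowest weight
            }
        }
        return new ArrayList<>(heap);
    }

    /** Phase 1: runs once per partition and keeps at most k local candidates. */
    static <T> List<Weighted<T>> sampleInPartition(List<T> partition, int k, Random rnd) {
        List<Weighted<T>> weighted = new ArrayList<>();
        for (T element : partition) {
            weighted.add(new Weighted<>(rnd.nextDouble(), element));
        }
        return topK(weighted, k);
    }

    /** Phase 2: runs once (parallelism 1) over the candidates of all partitions. */
    static <T> List<T> sampleInCoordinator(List<List<Weighted<T>>> perPartition, int k) {
        List<Weighted<T>> all = new ArrayList<>();
        for (List<Weighted<T>> candidates : perPartition) {
            all.addAll(candidates);
        }
        List<T> result = new ArrayList<>();
        for (Weighted<T> w : topK(all, k)) {
            result.add(w.element);
        }
        return result;
    }

    public static void main(String[] args) {
        Random rnd = new Random(42L);
        List<List<Weighted<Integer>>> candidates = new ArrayList<>();
        candidates.add(sampleInPartition(Arrays.asList(1, 2, 3, 4, 5), 3, rnd));
        candidates.add(sampleInPartition(Arrays.asList(6, 7, 8, 9, 10), 3, rnd));
        // Prints three distinct elements drawn from the ten inputs.
        System.out.println(sampleInCoordinator(candidates, 3));
    }
}
```

The design point is that the k globally highest weights are always among the per-partition top-k lists, so each partition only has to forward k candidates to the single coordinator.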
 
Example 7
Source File: JoinOperator.java    From flink with Apache License 2.0
public <R> EquiJoin<I1, I2, R> with(JoinFunction<I1, I2, R> function) {
	if (function == null) {
		throw new NullPointerException("Join function must not be null.");
	}
	FlatJoinFunction<I1, I2, R> generatedFunction = new WrappingFlatJoinFunction<>(clean(function));
	TypeInformation<R> returnType = TypeExtractor.getJoinReturnTypes(function, getInput1Type(), getInput2Type(), Utils.getCallLocationName(), true);
	return new EquiJoin<>(getInput1(), getInput2(), getKeys1(), getKeys2(), generatedFunction, function, returnType, getJoinHint(), Utils.getCallLocationName(), joinType);
}
 
Example 8
Source File: DataSetUtils.java    From Flink-CEPplus with Apache License 2.0
/**
 * Range-partitions a DataSet on the specified fields.
 */
public static <T> PartitionOperator<T> partitionByRange(DataSet<T> input, DataDistribution distribution, String... fields) {
	return new PartitionOperator<>(input, PartitionOperatorBase.PartitionMethod.RANGE, new Keys.ExpressionKeys<>(fields, input.getType()), distribution, Utils.getCallLocationName());
}
 
Example 9
Source File: DataSetUtils.java    From flink with Apache License 2.0
/**
 * Range-partitions a DataSet on the specified fields.
 */
public static <T> PartitionOperator<T> partitionByRange(DataSet<T> input, DataDistribution distribution, String... fields) {
	return new PartitionOperator<>(input, PartitionOperatorBase.PartitionMethod.RANGE, new Keys.ExpressionKeys<>(fields, input.getType()), distribution, Utils.getCallLocationName());
}
 
Example 10
Source File: DataSetUtils.java    From flink with Apache License 2.0
/**
 * Range-partitions a DataSet using the specified key selector function.
 */
public static <T, K extends Comparable<K>> PartitionOperator<T> partitionByRange(DataSet<T> input, DataDistribution distribution, KeySelector<T, K> keyExtractor) {
	final TypeInformation<K> keyType = TypeExtractor.getKeySelectorTypes(keyExtractor, input.getType());
	return new PartitionOperator<>(input, PartitionOperatorBase.PartitionMethod.RANGE, new Keys.SelectorFunctionKeys<>(input.clean(keyExtractor), input.getType(), keyType), distribution, Utils.getCallLocationName());
}
 
Example 11
Source File: UnsortedGrouping.java    From Flink-CEPplus with Apache License 2.0
/**
 * Applies a Reduce transformation on a grouped {@link DataSet}.
 *
 * <p>For each group, the transformation consecutively calls a {@link org.apache.flink.api.common.functions.RichReduceFunction}
 *   until only a single element for each group remains.
 * A ReduceFunction combines two elements into one new element of the same type.
 *
 * @param reducer The ReduceFunction that is applied on each group of the DataSet.
 * @return A ReduceOperator that represents the reduced DataSet.
 *
 * @see org.apache.flink.api.common.functions.RichReduceFunction
 * @see ReduceOperator
 * @see DataSet
 */
public ReduceOperator<T> reduce(ReduceFunction<T> reducer) {
	if (reducer == null) {
		throw new NullPointerException("Reduce function must not be null.");
	}
	return new ReduceOperator<T>(this, inputDataSet.clean(reducer), Utils.getCallLocationName());
}
 
Example 12
Source File: SortedGrouping.java    From flink with Apache License 2.0
/**
 * Applies a GroupReduce transformation on a grouped and sorted {@link DataSet}.
 *
 * <p>The transformation calls a {@link org.apache.flink.api.common.functions.RichGroupReduceFunction} for each group of the DataSet.
 * A GroupReduceFunction can iterate over all elements of a group and emit any
 *   number of output elements including none.
 *
 * @param reducer The GroupReduceFunction that is applied on each group of the DataSet.
 * @return A GroupReduceOperator that represents the reduced DataSet.
 *
 * @see org.apache.flink.api.common.functions.RichGroupReduceFunction
 * @see GroupReduceOperator
 * @see DataSet
 */
public <R> GroupReduceOperator<T, R> reduceGroup(GroupReduceFunction<T, R> reducer) {
	if (reducer == null) {
		throw new NullPointerException("GroupReduce function must not be null.");
	}
	TypeInformation<R> resultType = TypeExtractor.getGroupReduceReturnTypes(reducer,
			inputDataSet.getType(), Utils.getCallLocationName(), true);
	return new GroupReduceOperator<>(this, resultType, inputDataSet.clean(reducer), Utils.getCallLocationName());
}
 
Example 13
Source File: AllWindowedStream.java    From flink with Apache License 2.0
/**
 * Applies the given window function to each window. The window function is called for each
 * evaluation of the window. The output of the window function is
 * interpreted as a regular non-windowed stream.
 *
 * <p>Note that this function requires that all data in the windows is buffered until the window
 * is evaluated, as the function provides no means of incremental aggregation.
 *
 * @param function The process window function.
 * @return The data stream that is the result of applying the window function to the window.
 */
@PublicEvolving
public <R> SingleOutputStreamOperator<R> process(ProcessAllWindowFunction<T, R, W> function) {
	String callLocation = Utils.getCallLocationName();
	function = input.getExecutionEnvironment().clean(function);
	TypeInformation<R> resultType = getProcessAllWindowFunctionReturnType(function, getInputType());
	return apply(new InternalIterableProcessAllWindowFunction<>(function), resultType, callLocation);
}
 
Example 14
Source File: UnsortedGrouping.java    From flink with Apache License 2.0
/**
 * Applies a GroupCombineFunction on a grouped {@link DataSet}.
 * A GroupCombineFunction is similar to a GroupReduceFunction but does not perform a full data exchange. Instead, the
 * CombineFunction calls the combine method once per partition to combine a group of results. This
 * operator is suitable for combining values into an intermediate format before doing a proper groupReduce, where
 * the data is shuffled across the nodes for further reduction. The GroupReduce operator can also be supplied with
 * a combiner by implementing the RichGroupReduce function. The combine method of the RichGroupReduce function
 * requires the input and output types to be the same. The CombineFunction, on the other hand, can have an arbitrary
 * output type.
 * @param combiner The GroupCombineFunction that is applied on the DataSet.
 * @return A GroupCombineOperator which represents the combined DataSet.
 */
public <R> GroupCombineOperator<T, R> combineGroup(GroupCombineFunction<T, R> combiner) {
	if (combiner == null) {
		throw new NullPointerException("GroupCombine function must not be null.");
	}
	TypeInformation<R> resultType = TypeExtractor.getGroupCombineReturnTypes(combiner,
			this.getInputDataSet().getType(), Utils.getCallLocationName(), true);

	return new GroupCombineOperator<T, R>(this, resultType, inputDataSet.clean(combiner), Utils.getCallLocationName());
}
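The combine-then-reduce pattern described in the Javadoc above can be illustrated without Flink. The following is a minimal plain-Java sketch, assuming a word-count workload: `combine` runs on each partition without any data exchange and produces an intermediate format whose type differs from the input type, and `reduce` merges the intermediate results after they have been brought together. The names `CombineThenReduceSketch`, `combine`, and `reduce` are hypothetical and are not part of the Flink API.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class CombineThenReduceSketch {

    /**
     * Combine step: runs once per partition, with no data exchange.
     * The output type (word -> partial count) differs from the input type (word).
     */
    static Map<String, Integer> combine(List<String> partition) {
        Map<String, Integer> partial = new HashMap<>();
        for (String word : partition) {
            partial.merge(word, 1, Integer::sum);
        }
        return partial;
    }

    /** Reduce step: merges the partial counts of all partitions into final counts. */
    static Map<String, Integer> reduce(List<Map<String, Integer>> partials) {
        Map<String, Integer> total = new TreeMap<>(); // sorted keys for stable printing
        for (Map<String, Integer> partial : partials) {
            for (Map.Entry<String, Integer> entry : partial.entrySet()) {
                total.merge(entry.getKey(), entry.getValue(), Integer::sum);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Integer> first = combine(Arrays.asList("a", "b", "a"));
        Map<String, Integer> second = combine(Arrays.asList("b", "c"));
        System.out.println(reduce(Arrays.asList(first, second))); // prints {a=2, b=2, c=1}
    }
}
```

Only the compact partial maps would need to cross partition boundaries here, which is the reason for combining before the full reduction.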
 
Example 15
Source File: SortedGrouping.java    From Flink-CEPplus with Apache License 2.0
/**
 * Applies a GroupReduce transformation on a grouped and sorted {@link DataSet}.
 *
 * <p>The transformation calls a {@link org.apache.flink.api.common.functions.RichGroupReduceFunction} for each group of the DataSet.
 * A GroupReduceFunction can iterate over all elements of a group and emit any
 *   number of output elements including none.
 *
 * @param reducer The GroupReduceFunction that is applied on each group of the DataSet.
 * @return A GroupReduceOperator that represents the reduced DataSet.
 *
 * @see org.apache.flink.api.common.functions.RichGroupReduceFunction
 * @see GroupReduceOperator
 * @see DataSet
 */
public <R> GroupReduceOperator<T, R> reduceGroup(GroupReduceFunction<T, R> reducer) {
	if (reducer == null) {
		throw new NullPointerException("GroupReduce function must not be null.");
	}
	TypeInformation<R> resultType = TypeExtractor.getGroupReduceReturnTypes(reducer,
			inputDataSet.getType(), Utils.getCallLocationName(), true);
	return new GroupReduceOperator<>(this, resultType, inputDataSet.clean(reducer), Utils.getCallLocationName());
}
 
Example 16
Source File: AllWindowedStream.java    From Flink-CEPplus with Apache License 2.0
/**
 * Applies the given window function to each window. The window function is called for each
 * evaluation of the window. The output of the window function is
 * interpreted as a regular non-windowed stream.
 *
 * <p>Note that this function requires that all data in the windows is buffered until the window
 * is evaluated, as the function provides no means of incremental aggregation.
 *
 * @param function The process window function.
 * @return The data stream that is the result of applying the window function to the window.
 */
@PublicEvolving
public <R> SingleOutputStreamOperator<R> process(ProcessAllWindowFunction<T, R, W> function) {
	String callLocation = Utils.getCallLocationName();
	function = input.getExecutionEnvironment().clean(function);
	TypeInformation<R> resultType = getProcessAllWindowFunctionReturnType(function, getInputType());
	return apply(new InternalIterableProcessAllWindowFunction<>(function), resultType, callLocation);
}
 
Example 17
Source File: JoinOperator.java    From flink with Apache License 2.0
/**
 * Finalizes a Join transformation by applying a {@link org.apache.flink.api.common.functions.RichFlatJoinFunction} to each pair of joined elements.
 *
 * <p>Each FlatJoinFunction call can emit an arbitrary number of elements.
 *
 * @param function The FlatJoinFunction that is called for each pair of joined elements.
 * @return An EquiJoin that represents the joined result DataSet.
 *
 * @see org.apache.flink.api.common.functions.RichFlatJoinFunction
 * @see org.apache.flink.api.java.operators.JoinOperator.EquiJoin
 * @see DataSet
 */
public <R> EquiJoin<I1, I2, R> with(FlatJoinFunction<I1, I2, R> function) {
	if (function == null) {
		throw new NullPointerException("Join function must not be null.");
	}
	TypeInformation<R> returnType = TypeExtractor.getFlatJoinReturnTypes(function, getInput1Type(), getInput2Type(), Utils.getCallLocationName(), true);
	return new EquiJoin<>(getInput1(), getInput2(), getKeys1(), getKeys2(), clean(function), returnType, getJoinHint(), Utils.getCallLocationName(), joinType);
}
 
Example 18
Source File: UnsortedGrouping.java    From flink with Apache License 2.0
/**
 * Applies a Reduce transformation on a grouped {@link DataSet}.
 *
 * <p>For each group, the transformation consecutively calls a {@link org.apache.flink.api.common.functions.RichReduceFunction}
 *   until only a single element for each group remains.
 * A ReduceFunction combines two elements into one new element of the same type.
 *
 * @param reducer The ReduceFunction that is applied on each group of the DataSet.
 * @return A ReduceOperator that represents the reduced DataSet.
 *
 * @see org.apache.flink.api.common.functions.RichReduceFunction
 * @see ReduceOperator
 * @see DataSet
 */
public ReduceOperator<T> reduce(ReduceFunction<T> reducer) {
	if (reducer == null) {
		throw new NullPointerException("Reduce function must not be null.");
	}
	return new ReduceOperator<T>(this, inputDataSet.clean(reducer), Utils.getCallLocationName());
}
 
Example 19
Source File: AllWindowedStream.java    From flink with Apache License 2.0
/**
 * Applies the given window function to each window. The window function is called for each
 * evaluation of the window. The output of the window function is
 * interpreted as a regular non-windowed stream.
 *
 * <p>Note that this function requires that all data in the windows is buffered until the window
 * is evaluated, as the function provides no means of incremental aggregation.
 *
 * @param function The window function.
 * @return The data stream that is the result of applying the window function to the window.
 */
public <R> SingleOutputStreamOperator<R> apply(AllWindowFunction<T, R, W> function, TypeInformation<R> resultType) {
	String callLocation = Utils.getCallLocationName();
	function = input.getExecutionEnvironment().clean(function);
	return apply(new InternalIterableAllWindowFunction<>(function), resultType, callLocation);
}