Java Code Examples for org.apache.flink.api.java.io.TextInputFormat#setCharsetName()

The following examples show how to use org.apache.flink.api.java.io.TextInputFormat#setCharsetName() . You can vote up the ones you like or vote down the ones you don't like, and go to the original project or source file by following the links above each example. You may check out the related API usage on the sidebar.
Example 1
Source File: ExecutionEnvironment.java    From Flink-CEPplus with Apache License 2.0 3 votes vote down vote up
/**
 * Creates a {@link DataSet} that represents the Strings produced by reading the given file line wise.
 * The {@link java.nio.charset.Charset} with the given name will be used to read the files.
 *
 * @param filePath The path of the file, as a URI (e.g., "file:///some/local/file" or "hdfs://host:port/file/path").
 * @param charsetName The name of the character set used to read the file.
 * @return A {@link DataSet} that represents the data read from the given file as text lines.
 */
public DataSource<String> readTextFile(String filePath, String charsetName) {
	Preconditions.checkNotNull(filePath, "The file path may not be null.");

	TextInputFormat format = new TextInputFormat(new Path(filePath));
	format.setCharsetName(charsetName);
	return new DataSource<>(this, format, BasicTypeInfo.STRING_TYPE_INFO, Utils.getCallLocationName());
}
 
Example 2
Source File: StreamExecutionEnvironment.java    From Flink-CEPplus with Apache License 2.0 3 votes vote down vote up
/**
 * Reads the given file line-by-line and creates a data stream that contains a string with the
 * contents of each such line. The {@link java.nio.charset.Charset} with the given name will be
 * used to read the files.
 *
 * <p><b>NOTES ON CHECKPOINTING: </b> The source monitors the path, creates the
 * {@link org.apache.flink.core.fs.FileInputSplit FileInputSplits} to be processed,
 * forwards them to the downstream {@link ContinuousFileReaderOperator readers} to read the actual data,
 * and exits, without waiting for the readers to finish reading. This implies that no more checkpoint
 * barriers are going to be forwarded after the source exits, thus having no checkpoints after that point.
 *
 * @param filePath
 * 		The path of the file, as a URI (e.g., "file:///some/local/file" or "hdfs://host:port/file/path")
 * @param charsetName
 * 		The name of the character set used to read the file
 * @return The data stream that represents the data read from the given file as text lines
 */
public DataStreamSource<String> readTextFile(String filePath, String charsetName) {
	Preconditions.checkArgument(!StringUtils.isNullOrWhitespaceOnly(filePath), "The file path must not be null or blank.");

	TextInputFormat format = new TextInputFormat(new Path(filePath));
	format.setFilesFilter(FilePathFilter.createDefaultFilter());
	TypeInformation<String> typeInfo = BasicTypeInfo.STRING_TYPE_INFO;
	format.setCharsetName(charsetName);

	return readFile(format, filePath, FileProcessingMode.PROCESS_ONCE, -1, typeInfo);
}
 
Example 3
Source File: ExecutionEnvironment.java    From flink with Apache License 2.0 3 votes vote down vote up
/**
 * Creates a {@link DataSet} that represents the Strings produced by reading the given file line wise.
 * The {@link java.nio.charset.Charset} with the given name will be used to read the files.
 *
 * @param filePath The path of the file, as a URI (e.g., "file:///some/local/file" or "hdfs://host:port/file/path").
 * @param charsetName The name of the character set used to read the file.
 * @return A {@link DataSet} that represents the data read from the given file as text lines.
 */
public DataSource<String> readTextFile(String filePath, String charsetName) {
	Preconditions.checkNotNull(filePath, "The file path may not be null.");

	TextInputFormat format = new TextInputFormat(new Path(filePath));
	format.setCharsetName(charsetName);
	return new DataSource<>(this, format, BasicTypeInfo.STRING_TYPE_INFO, Utils.getCallLocationName());
}
 
Example 4
Source File: StreamExecutionEnvironment.java    From flink with Apache License 2.0 3 votes vote down vote up
/**
 * Reads the given file line-by-line and creates a data stream that contains a string with the
 * contents of each such line. The {@link java.nio.charset.Charset} with the given name will be
 * used to read the files.
 *
 * <p><b>NOTES ON CHECKPOINTING: </b> The source monitors the path, creates the
 * {@link org.apache.flink.core.fs.FileInputSplit FileInputSplits} to be processed,
 * forwards them to the downstream {@link ContinuousFileReaderOperator readers} to read the actual data,
 * and exits, without waiting for the readers to finish reading. This implies that no more checkpoint
 * barriers are going to be forwarded after the source exits, thus having no checkpoints after that point.
 *
 * @param filePath
 * 		The path of the file, as a URI (e.g., "file:///some/local/file" or "hdfs://host:port/file/path")
 * @param charsetName
 * 		The name of the character set used to read the file
 * @return The data stream that represents the data read from the given file as text lines
 */
public DataStreamSource<String> readTextFile(String filePath, String charsetName) {
	Preconditions.checkArgument(!StringUtils.isNullOrWhitespaceOnly(filePath), "The file path must not be null or blank.");

	TextInputFormat format = new TextInputFormat(new Path(filePath));
	format.setFilesFilter(FilePathFilter.createDefaultFilter());
	TypeInformation<String> typeInfo = BasicTypeInfo.STRING_TYPE_INFO;
	format.setCharsetName(charsetName);

	return readFile(format, filePath, FileProcessingMode.PROCESS_ONCE, -1, typeInfo);
}
 
Example 5
Source File: ExecutionEnvironment.java    From flink with Apache License 2.0 3 votes vote down vote up
/**
 * Creates a {@link DataSet} that represents the Strings produced by reading the given file line wise.
 * The {@link java.nio.charset.Charset} with the given name will be used to read the files.
 *
 * @param filePath The path of the file, as a URI (e.g., "file:///some/local/file" or "hdfs://host:port/file/path").
 * @param charsetName The name of the character set used to read the file.
 * @return A {@link DataSet} that represents the data read from the given file as text lines.
 */
public DataSource<String> readTextFile(String filePath, String charsetName) {
	Preconditions.checkNotNull(filePath, "The file path may not be null.");

	TextInputFormat format = new TextInputFormat(new Path(filePath));
	format.setCharsetName(charsetName);
	return new DataSource<>(this, format, BasicTypeInfo.STRING_TYPE_INFO, Utils.getCallLocationName());
}
 
Example 6
Source File: StreamExecutionEnvironment.java    From flink with Apache License 2.0 3 votes vote down vote up
/**
 * Reads the given file line-by-line and creates a data stream that contains a string with the
 * contents of each such line. The {@link java.nio.charset.Charset} with the given name will be
 * used to read the files.
 *
 * <p><b>NOTES ON CHECKPOINTING: </b> The source monitors the path, creates the
 * {@link org.apache.flink.core.fs.FileInputSplit FileInputSplits} to be processed,
 * forwards them to the downstream readers to read the actual data,
 * and exits, without waiting for the readers to finish reading. This implies that no more checkpoint
 * barriers are going to be forwarded after the source exits, thus having no checkpoints after that point.
 *
 * @param filePath
 * 		The path of the file, as a URI (e.g., "file:///some/local/file" or "hdfs://host:port/file/path")
 * @param charsetName
 * 		The name of the character set used to read the file
 * @return The data stream that represents the data read from the given file as text lines
 */
public DataStreamSource<String> readTextFile(String filePath, String charsetName) {
	Preconditions.checkArgument(!StringUtils.isNullOrWhitespaceOnly(filePath), "The file path must not be null or blank.");

	TextInputFormat format = new TextInputFormat(new Path(filePath));
	format.setFilesFilter(FilePathFilter.createDefaultFilter());
	TypeInformation<String> typeInfo = BasicTypeInfo.STRING_TYPE_INFO;
	format.setCharsetName(charsetName);

	return readFile(format, filePath, FileProcessingMode.PROCESS_ONCE, -1, typeInfo);
}