org.apache.flink.streaming.api.functions.sink.filesystem.bucketassigners.DateTimeBucketAssigner Java Examples

The following examples show how to use org.apache.flink.streaming.api.functions.sink.filesystem.bucketassigners.DateTimeBucketAssigner.
Example #1
Source File: HadoopPathBasedPartFileWriterTest.java    From flink with Apache License 2.0
@Test
public void testWriteFile() throws Exception {
	File file = TEMPORARY_FOLDER.newFolder();
	Path basePath = new Path(file.toURI());

	List<String> data = Arrays.asList(
		"first line",
		"second line",
		"third line");

	StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
	env.setParallelism(1);
	env.enableCheckpointing(100);

	DataStream<String> stream = env.addSource(
		new FiniteTestSource<>(data), TypeInformation.of(String.class));
	Configuration configuration = new Configuration();

	HadoopPathBasedBulkFormatBuilder<String, String, ?> builder =
		new HadoopPathBasedBulkFormatBuilder<>(
			basePath,
			new TestHadoopPathBasedBulkWriterFactory(),
			configuration,
			new DateTimeBucketAssigner<>());
	TestStreamingFileSinkFactory<String> streamingFileSinkFactory = new TestStreamingFileSinkFactory<>();
	stream.addSink(streamingFileSinkFactory.createSink(builder, 1000));

	env.execute();
	validateResult(data, configuration, basePath);
}
 
Example #2
Source File: HdfsSink2.java    From sylph with Apache License 2.0
@Override
public void run(DataStream<Row> stream)
{
    final RichSinkFunction<byte[]> sink = StreamingFileSink.forBulkFormat(
            new Path(writerDir),
            (BulkWriter.Factory<byte[]>) fsDataOutputStream -> new BulkWriter<byte[]>()
            {
                private final CompressionCodec codec = ReflectionUtils.newInstance(codecClass, new Configuration());
                private final CompressionOutputStream outputStream = codec.createOutputStream(fsDataOutputStream);
                private long bufferSize;

                @Override
                public void addElement(byte[] element)
                        throws IOException
                {
                    outputStream.write(element);
                    outputStream.write(10); //write \n
                    bufferSize += element.length;
                    if (bufferSize >= batchSize) {
                        outputStream.flush();
                        this.bufferSize = 0;
                    }
                }

                @Override
                public void flush()
                        throws IOException
                {
                    outputStream.flush();
                }

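                @Override
                public void finish()
                        throws IOException
                {
                    // Finish the compressed data only. BulkWriter#finish() must not close
                    // the underlying stream; the caller closes it afterwards.
                    outputStream.finish();
                }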
            })
            .withBucketAssigner(new DateTimeBucketAssigner<>("yyyy-MM-dd--HH"))
            .build();
    stream.map(row -> {
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < row.getArity(); i++) {
            builder.append("\u0001").append(row.getField(i));
        }
        return builder.substring(1).getBytes(UTF_8);
    })
            .addSink(sink)
            .name(this.getClass().getSimpleName());
}
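
Example #2 passes the pattern "yyyy-MM-dd--HH" to DateTimeBucketAssigner. As a rough sketch of how such a pattern can turn a timestamp into a bucket directory name, the standalone snippet below formats an epoch-millisecond timestamp with java.time. It assumes the assigner formats the current processing time with a DateTimeFormatter in a configured time zone. The class BucketIdSketch and the method bucketId are hypothetical names introduced for illustration and are not part of Flink.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

public class BucketIdSketch {

    // Hypothetical helper (not a Flink API): formats an epoch-millisecond
    // timestamp with the given pattern and zone to produce a bucket id string.
    public static String bucketId(long epochMillis, String pattern, ZoneId zone) {
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern(pattern).withZone(zone);
        return formatter.format(Instant.ofEpochMilli(epochMillis));
    }

    public static void main(String[] args) {
        long timestamp = 1_600_000_000_000L; // 2020-09-13T12:26:40Z

        // Same instant, different zones: the hour part of the bucket id shifts.
        System.out.println(bucketId(timestamp, "yyyy-MM-dd--HH", ZoneId.of("UTC")));    // 2020-09-13--12
        System.out.println(bucketId(timestamp, "yyyy-MM-dd--HH", ZoneId.of("+08:00"))); // 2020-09-13--20

        // A coarser pattern yields one bucket per day instead of one per hour.
        System.out.println(bucketId(timestamp, "yyyy-MM-dd", ZoneId.of("UTC")));        // 2020-09-13
    }
}
```

Under these assumptions, every record processed within the same hour in the chosen zone maps to the same bucket id, so it would be written under the same sub-directory of the base path.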
 
Example #3
Source File: StreamingFileSink.java    From Flink-CEPplus with Apache License 2.0
/**
 * Creates the builder for a {@code StreamingFileSink} with row-encoding format.
 * @param basePath the base path where all the buckets are going to be created as sub-directories.
 * @param encoder the {@link Encoder} to be used when writing elements in the buckets.
 * @param <IN> the type of incoming elements
 * @return The builder on which the remaining configuration parameters for the sink can be set.
 * To instantiate the sink, call {@link RowFormatBuilder#build()} after specifying the desired parameters.
 */
public static <IN> StreamingFileSink.RowFormatBuilder<IN, String> forRowFormat(
		final Path basePath, final Encoder<IN> encoder) {
	return new StreamingFileSink.RowFormatBuilder<>(basePath, encoder, new DateTimeBucketAssigner<>());
}
 
Example #4
Source File: StreamingFileSink.java    From Flink-CEPplus with Apache License 2.0
/**
 * Creates the builder for a {@link StreamingFileSink} with bulk-encoding format.
 * @param basePath the base path where all the buckets are going to be created as sub-directories.
 * @param writerFactory the {@link BulkWriter.Factory} to be used when writing elements in the buckets.
 * @param <IN> the type of incoming elements
 * @return The builder on which the remaining configuration parameters for the sink can be set.
 * To instantiate the sink, call {@link BulkFormatBuilder#build()} after specifying the desired parameters.
 */
public static <IN> StreamingFileSink.BulkFormatBuilder<IN, String> forBulkFormat(
		final Path basePath, final BulkWriter.Factory<IN> writerFactory) {
	return new StreamingFileSink.BulkFormatBuilder<>(basePath, writerFactory, new DateTimeBucketAssigner<>());
}
 
Example #5
Source File: StreamingFileSink.java    From flink with Apache License 2.0
/**
 * Creates the builder for a {@code StreamingFileSink} with row-encoding format.
 * @param basePath the base path where all the buckets are going to be created as sub-directories.
 * @param encoder the {@link Encoder} to be used when writing elements in the buckets.
 * @param <IN> the type of incoming elements
 * @return The builder on which the remaining configuration parameters for the sink can be set.
 * To instantiate the sink, call {@link RowFormatBuilder#build()} after specifying the desired parameters.
 */
public static <IN> StreamingFileSink.RowFormatBuilder<IN, String> forRowFormat(
		final Path basePath, final Encoder<IN> encoder) {
	return new StreamingFileSink.RowFormatBuilder<>(basePath, encoder, new DateTimeBucketAssigner<>());
}
 
Example #6
Source File: StreamingFileSink.java    From flink with Apache License 2.0
/**
 * Creates the builder for a {@link StreamingFileSink} with bulk-encoding format.
 * @param basePath the base path where all the buckets are going to be created as sub-directories.
 * @param writerFactory the {@link BulkWriter.Factory} to be used when writing elements in the buckets.
 * @param <IN> the type of incoming elements
 * @return The builder on which the remaining configuration parameters for the sink can be set.
 * To instantiate the sink, call {@link BulkFormatBuilder#build()} after specifying the desired parameters.
 */
public static <IN> StreamingFileSink.BulkFormatBuilder<IN, String> forBulkFormat(
		final Path basePath, final BulkWriter.Factory<IN> writerFactory) {
	return new StreamingFileSink.BulkFormatBuilder<>(basePath, writerFactory, new DateTimeBucketAssigner<>());
}
 
Example #7
Source File: StreamingFileSink.java    From flink with Apache License 2.0
/**
 * Creates the builder for a {@code StreamingFileSink} with row-encoding format.
 * @param basePath the base path where all the buckets are going to be created as sub-directories.
 * @param encoder the {@link Encoder} to be used when writing elements in the buckets.
 * @param <IN> the type of incoming elements
 * @return The builder on which the remaining configuration parameters for the sink can be set.
 * To instantiate the sink, call {@link RowFormatBuilder#build()} after specifying the desired parameters.
 */
public static <IN> StreamingFileSink.DefaultRowFormatBuilder<IN> forRowFormat(
		final Path basePath, final Encoder<IN> encoder) {
	return new DefaultRowFormatBuilder<>(basePath, encoder, new DateTimeBucketAssigner<>());
}
 
Example #8
Source File: StreamingFileSink.java    From flink with Apache License 2.0
/**
 * Creates the builder for a {@link StreamingFileSink} with bulk-encoding format.
 * @param basePath the base path where all the buckets are going to be created as sub-directories.
 * @param writerFactory the {@link BulkWriter.Factory} to be used when writing elements in the buckets.
 * @param <IN> the type of incoming elements
 * @return The builder on which the remaining configuration parameters for the sink can be set.
 * To instantiate the sink, call {@link BulkFormatBuilder#build()} after specifying the desired parameters.
 */
public static <IN> StreamingFileSink.DefaultBulkFormatBuilder<IN> forBulkFormat(
		final Path basePath, final BulkWriter.Factory<IN> writerFactory) {
	return new StreamingFileSink.DefaultBulkFormatBuilder<>(basePath, writerFactory, new DateTimeBucketAssigner<>());
}