Java Code Examples for org.apache.beam.sdk.annotations.Experimental.Kind#FILESYSTEM

The following examples show how to use org.apache.beam.sdk.annotations.Experimental.Kind#FILESYSTEM.
Example 1
Source File: FileBasedSink.java    From beam with Apache License 2.0
@Experimental(Kind.FILESYSTEM)
public ResourceId getDestinationFile(
    boolean windowedWrites,
    DynamicDestinations<?, DestinationT, ?> dynamicDestinations,
    int numShards,
    OutputFileHints outputFileHints) {
  checkArgument(getShard() != UNKNOWN_SHARDNUM);
  checkArgument(numShards > 0);
  FilenamePolicy policy = dynamicDestinations.getFilenamePolicy(destination);
  if (windowedWrites) {
    return policy.windowedFilename(
        getShard(), numShards, getWindow(), getPaneInfo(), outputFileHints);
  } else {
    return policy.unwindowedFilename(getShard(), numShards, outputFileHints);
  }
}
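
The method above checks its preconditions and then delegates to one of two naming modes. A minimal sketch of that dispatch in plain Java follows; `NamingPolicy`, `destinationFor`, and the sentinel value `-1` are hypothetical stand-ins for illustration, not Beam's actual types or constants.

```java
final class DestinationDispatchSketch {
  // Hypothetical sentinel meaning "no shard assigned yet".
  static final int UNKNOWN_SHARDNUM = -1;

  /** Hypothetical stand-in for a filename policy with two naming modes. */
  interface NamingPolicy {
    String windowed(int shard, int numShards, String windowLabel);

    String unwindowed(int shard, int numShards);
  }

  static String destinationFor(
      boolean windowedWrites, NamingPolicy policy, int shard, int numShards, String windowLabel) {
    // Mirror the two precondition checks before any name is generated.
    if (shard == UNKNOWN_SHARDNUM) {
      throw new IllegalArgumentException("shard has not been assigned");
    }
    if (numShards <= 0) {
      throw new IllegalArgumentException("numShards must be positive");
    }
    return windowedWrites
        ? policy.windowed(shard, numShards, windowLabel)
        : policy.unwindowed(shard, numShards);
  }
}
```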
 
Example 2
Source File: TextIO.java    From beam with Apache License 2.0
/**
 * See {@link TypedWrite#to(DynamicDestinations)}.
 *
 * @deprecated Use {@link FileIO#write()} or {@link FileIO#writeDynamic()} with {@link
 *     #sink()} instead.
 */
@Experimental(Kind.FILESYSTEM)
@Deprecated
public Write to(DynamicDestinations<String, ?, String> dynamicDestinations) {
  return new Write(
      inner.to((DynamicDestinations) dynamicDestinations).withFormatFunction(null));
}
 
Example 3
Source File: AvroIO.java    From beam with Apache License 2.0
/**
 * Use a {@link DynamicAvroDestinations} object to vend {@link FilenamePolicy} objects. These
 * objects can examine the input record when creating a {@link FilenamePolicy}. A directory for
 * temporary files must be specified using {@link #withTempDirectory}.
 *
 * @deprecated Use {@link FileIO#write()} or {@link FileIO#writeDynamic()} instead.
 */
@Experimental(Kind.FILESYSTEM)
@Deprecated
public <NewDestinationT> TypedWrite<UserT, NewDestinationT, OutputT> to(
    DynamicAvroDestinations<UserT, NewDestinationT, OutputT> dynamicDestinations) {
  return toBuilder()
      .setDynamicDestinations((DynamicAvroDestinations) dynamicDestinations)
      .build();
}
 
Example 4
Source File: FileBasedSink.java    From beam with Apache License 2.0
/**
 * Construct a {@link FileBasedSink} with the given temp directory, producing uncompressed files.
 */
@Experimental(Kind.FILESYSTEM)
public FileBasedSink(
    ValueProvider<ResourceId> tempDirectoryProvider,
    DynamicDestinations<?, DestinationT, OutputT> dynamicDestinations) {
  this(tempDirectoryProvider, dynamicDestinations, Compression.UNCOMPRESSED);
}
 
Example 5
Source File: TextIO.java    From beam with Apache License 2.0
/** See {@link TypedWrite#to(FilenamePolicy)}. */
@Experimental(Kind.FILESYSTEM)
public Write to(FilenamePolicy filenamePolicy) {
  return new Write(
      inner.to(filenamePolicy).withFormatFunction(SerializableFunctions.identity()));
}
 
Example 6
Source File: FileBasedSink.java    From beam with Apache License 2.0
/**
 * Returns the directory inside which temporary files will be written according to the configured
 * {@link FilenamePolicy}.
 */
@Experimental(Kind.FILESYSTEM)
public ValueProvider<ResourceId> getTempDirectoryProvider() {
  return tempDirectoryProvider;
}
 
Example 7
Source File: GcsOptions.java    From beam with Apache License 2.0
/** If true, reports metrics of certain operations, such as batch copies. */
@Description("Experimental. Whether to report performance metrics of certain GCS operations.")
@Default.Boolean(false)
@Experimental(Kind.FILESYSTEM)
Boolean getGcsPerformanceMetrics();
 
Example 8
Source File: AvroIO.java    From beam with Apache License 2.0
/** Set the base directory used to generate temporary files. */
@Experimental(Kind.FILESYSTEM)
public TypedWrite<UserT, DestinationT, OutputT> withTempDirectory(ResourceId tempDirectory) {
  return withTempDirectory(StaticValueProvider.of(tempDirectory));
}
 
Example 9
Source File: TextIO.java    From beam with Apache License 2.0
/** See {@link TypedWrite#toResource(ValueProvider)}. */
@Experimental(Kind.FILESYSTEM)
public Write toResource(ValueProvider<ResourceId> filenamePrefix) {
  return new Write(
      inner.toResource(filenamePrefix).withFormatFunction(SerializableFunctions.identity()));
}
 
Example 10
Source File: AvroIO.java    From beam with Apache License 2.0
/** See {@link TypedWrite#withTempDirectory(ValueProvider)}. */
@Experimental(Kind.FILESYSTEM)
public Write<T> withTempDirectory(ValueProvider<ResourceId> tempDirectory) {
  return new Write<>(inner.withTempDirectory(tempDirectory));
}
 
Example 11
Source File: FileBasedSink.java    From beam with Apache License 2.0
@Experimental(Kind.FILESYSTEM)
public ResourceId getTempFilename() {
  return tempFilename;
}
 
Example 12
Source File: TextIO.java    From beam with Apache License 2.0
/** See {@link TypedWrite#withTempDirectory(ResourceId)}. */
@Experimental(Kind.FILESYSTEM)
public Write withTempDirectory(ResourceId tempDirectory) {
  return new Write(inner.withTempDirectory(tempDirectory));
}
 
Example 13
Source File: AvroIO.java    From beam with Apache License 2.0
/** Like {@link #to(ResourceId)}. */
@Experimental(Kind.FILESYSTEM)
public TypedWrite<UserT, DestinationT, OutputT> toResource(
    ValueProvider<ResourceId> outputPrefix) {
  return toBuilder().setFilenamePrefix(outputPrefix).build();
}
 
Example 14
Source File: AvroIO.java    From beam with Apache License 2.0
/**
 * Writes to files named according to the given {@link FileBasedSink.FilenamePolicy}. A
 * directory for temporary files must be specified using {@link #withTempDirectory}.
 */
@Experimental(Kind.FILESYSTEM)
public TypedWrite<UserT, DestinationT, OutputT> to(FilenamePolicy filenamePolicy) {
  return toBuilder().setFilenamePolicy(filenamePolicy).build();
}
 
Example 15
Source File: TextIO.java    From beam with Apache License 2.0
/** Like {@link #to(String)}. */
@Experimental(Kind.FILESYSTEM)
public TypedWrite<UserT, DestinationT> to(ResourceId filenamePrefix) {
  return toResource(StaticValueProvider.of(filenamePrefix));
}
 
Example 16
Source File: FileBasedSink.java    From beam with Apache License 2.0
/**
 * When a sink has requested windowed or triggered output, this method will be invoked to return
 * the file {@link ResourceId resource} to be created given the base output directory and a
 * {@link OutputFileHints} containing information about the file, including a suggested
 * extension (e.g. coming from {@link Compression}).
 *
 * <p>The policy must return unique and consistent filenames for different windows and panes.
 */
@Experimental(Kind.FILESYSTEM)
public abstract ResourceId windowedFilename(
    int shardNumber,
    int numShards,
    BoundedWindow window,
    PaneInfo paneInfo,
    OutputFileHints outputFileHints);
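
The requirement that a policy return unique and consistent filenames for different windows and panes can be illustrated with a plain-Java sketch. It is only one possible naming scheme, assuming the window is identified by its start and end timestamps and the pane by an index; the class, method, and parameter names are hypothetical and do not come from Beam.

```java
import java.time.Instant;

final class WindowedNameSketch {
  /**
   * Builds a name from every input that distinguishes one output file from another: window
   * bounds, pane index, and shard. The same inputs always yield the same name (consistent), and
   * changing any one of them changes the name (unique).
   */
  static String name(
      String prefix,
      Instant windowStart,
      Instant windowEnd,
      long paneIndex,
      int shard,
      int numShards,
      String suffix) {
    return prefix
        + "-" + windowStart.toEpochMilli()
        + "-" + windowEnd.toEpochMilli()
        + "-pane-" + paneIndex
        + "-" + shard + "-of-" + numShards
        + suffix;
  }
}
```

For example, `name("out", Instant.ofEpochMilli(0), Instant.ofEpochMilli(60000), 0, 1, 3, ".txt")` returns `out-0-60000-pane-0-1-of-3.txt`.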
 
Example 17
Source File: FileBasedSink.java    From beam with Apache License 2.0
/**
 * This is a helper function for converting a user-provided output filename prefix into a
 * {@link ResourceId} for writing output files. See {@link TextIO.Write#to(String)} for an
 * example use case.
 *
 * <p>Typically, the input prefix will be something like {@code /tmp/foo/bar}, and the user would
 * like output files to be named as {@code /tmp/foo/bar-0-of-3.txt}. Thus, this function tries to
 * interpret the provided string as a file {@link ResourceId} path.
 *
 * <p>However, this may fail, for example if the user gives a prefix that is a directory. E.g.,
 * {@code /}, {@code gs://my-bucket}, or {@code c://}. In that case, interpreting the string as a
 * file will fail and this function will return a directory {@link ResourceId} instead.
 */
@Experimental(Kind.FILESYSTEM)
public static ResourceId convertToFileResourceIfPossible(String outputPrefix) {
  try {
    return FileSystems.matchNewResource(outputPrefix, false /* isDirectory */);
  } catch (Exception e) {
    return FileSystems.matchNewResource(outputPrefix, true /* isDirectory */);
  }
}
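
The try-as-file, fall-back-to-directory pattern above can be sketched without Beam on the classpath. In this sketch `resolve` is a hypothetical stand-in for the real resolver, and its single rule (a trailing slash cannot name a file) is an assumption made for illustration; real file systems apply their own rules.

```java
final class PrefixFallbackSketch {
  /** Hypothetical resolver: rejects a trailing-slash spec when asked for a file. */
  static String resolve(String spec, boolean isDirectory) {
    if (!isDirectory && spec.endsWith("/")) {
      throw new IllegalArgumentException("Not a file: " + spec);
    }
    return (isDirectory ? "dir:" : "file:") + spec;
  }

  /** Prefer a file interpretation; fall back to a directory if that fails. */
  static String convertToFileIfPossible(String outputPrefix) {
    try {
      return resolve(outputPrefix, false);
    } catch (IllegalArgumentException e) {
      return resolve(outputPrefix, true);
    }
  }
}
```

With these assumptions, `convertToFileIfPossible("/tmp/foo/bar")` returns `file:/tmp/foo/bar`, while `convertToFileIfPossible("/")` returns `dir:/`.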
 
Example 18
Source File: FileBasedSink.java    From beam with Apache License 2.0
/**
 * When a sink has not requested windowed or triggered output, this method will be invoked to
 * return the file {@link ResourceId resource} to be created given the base output directory and
 * a {@link OutputFileHints} containing information about the file, including a suggested
 * extension (e.g. coming from {@link Compression}).
 *
 * <p>The shardNumber and numShards parameters should be used by the policy to generate unique
 * and consistent filenames.
 */
@Experimental(Kind.FILESYSTEM)
@Nullable
public abstract ResourceId unwindowedFilename(
    int shardNumber, int numShards, OutputFileHints outputFileHints);
 
Example 19
Source File: MatchResult.java    From beam with Apache License 2.0
/**
 * Last modification timestamp in milliseconds since Unix epoch.
 *
 * <p>Note that this field is not encoded with the default {@link MetadataCoder} due to a need
 * for compatibility with previous versions of the Beam SDK. If you want to rely on {@code
 * lastModifiedMillis} values, be sure to explicitly set the coder to {@link MetadataCoderV2}.
 * Otherwise, all instances will have the default value of 0, consistent with the behavior of
 * {@link File#lastModified()}.
 *
 * <p>The following example sets the coder explicitly and accesses {@code lastModifiedMillis} to
 * set record timestamps:
 *
 * <pre>{@code
 * PCollection<Metadata> metadataWithTimestamp = p
 *     .apply(FileIO.match().filepattern("hdfs://path/to/*.gz"))
 *     .setCoder(MetadataCoderV2.of())
 *     .apply(WithTimestamps.of(metadata -> new Instant(metadata.lastModifiedMillis())));
 * }</pre>
 */
@Experimental(Kind.FILESYSTEM)
public abstract long lastModifiedMillis();
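
The Javadoc above notes that the default value of 0 is consistent with `File#lastModified()`. A small JDK-only sketch of that convention follows; it uses `java.time.Instant` rather than the `Instant` type in the Javadoc example, and the class and method names are hypothetical.

```java
import java.io.File;
import java.time.Instant;

final class LastModifiedSketch {
  /** Converts a milliseconds-since-epoch value into a timestamp; 0 maps to the Unix epoch. */
  static Instant timestampOf(long lastModifiedMillis) {
    return Instant.ofEpochMilli(lastModifiedMillis);
  }

  /** File#lastModified() returns 0 when the file does not exist or cannot be read. */
  static long lastModifiedOrZero(File file) {
    return file.lastModified();
  }
}
```

A caller that sees the epoch timestamp therefore cannot tell "unknown" apart from a genuine modification time of 0, which is why the Javadoc recommends setting the coder explicitly when the value matters.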
 
Example 20
Source File: AvroIO.java    From beam with Apache License 2.0
/**
 * Writes to file(s) with the given output prefix. See {@link FileSystems} for information on
 * supported file systems. This prefix is used by the {@link DefaultFilenamePolicy} to generate
 * filenames.
 *
 * <p>By default, a {@link DefaultFilenamePolicy} will build output filenames using the
 * specified prefix, a shard name template (see {@link #withShardNameTemplate(String)}), and a
 * common suffix (if supplied using {@link #withSuffix(String)}).
 *
 * <p>This default policy can be overridden using {@link #to(FilenamePolicy)}, in which case
 * {@link #withShardNameTemplate(String)} and {@link #withSuffix(String)} should not be set.
 * Custom filename policies do not automatically see this prefix - you should explicitly pass
 * the prefix into your {@link FilenamePolicy} object if you need this.
 *
 * <p>If {@link #withTempDirectory} has not been called, this filename prefix will be used to
 * infer a directory for temporary files.
 */
@Experimental(Kind.FILESYSTEM)
public TypedWrite<UserT, DestinationT, OutputT> to(ResourceId outputPrefix) {
  return toResource(StaticValueProvider.of(outputPrefix));
}
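
To make the "prefix, shard name template, suffix" composition concrete, here is a rough plain-Java sketch. It assumes a template convention in which a run of `S` is replaced by the zero-padded shard number and a run of `N` by the zero-padded shard count; the class and method are hypothetical and are not the `DefaultFilenamePolicy` implementation.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ShardTemplateSketch {
  // A run of 'S' stands for the shard number, a run of 'N' for the shard count.
  private static final Pattern RUN = Pattern.compile("S+|N+");

  static String expand(String prefix, String template, String suffix, int shard, int numShards) {
    Matcher m = RUN.matcher(template);
    StringBuffer expanded = new StringBuffer();
    while (m.find()) {
      String run = m.group();
      int value = run.charAt(0) == 'S' ? shard : numShards;
      // Pad with zeros to the length of the run, e.g. "SSSSS" and 7 gives "00007".
      String padded = String.format(Locale.ROOT, "%0" + run.length() + "d", value);
      m.appendReplacement(expanded, padded);
    }
    m.appendTail(expanded);
    return prefix + expanded + suffix;
  }
}
```

Under these assumptions, `expand("/tmp/foo/bar", "-SSSSS-of-NNNNN", ".txt", 0, 3)` returns `/tmp/foo/bar-00000-of-00003.txt`.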