Java Code Examples for org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader

The following examples show how to use org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader. These examples are extracted from open source projects; where known, the source project and file are given above each example.
Example 1
Source Project: beam   Source File: HadoopFormatIOSequenceFileTest.java    License: Apache License 2.0
private Stream<KV<Text, LongWritable>> extractResultsFromFile(String fileName) {
  try (SequenceFileRecordReader<Text, LongWritable> reader = new SequenceFileRecordReader<>()) {
    Path path = new Path(fileName);
    TaskAttemptContext taskContext =
        HadoopFormats.createTaskAttemptContext(new Configuration(), new JobID("readJob", 0), 0);
    reader.initialize(
        new FileSplit(path, 0L, Long.MAX_VALUE, new String[] {"localhost"}), taskContext);
    List<KV<Text, LongWritable>> result = new ArrayList<>();

    while (reader.nextKeyValue()) {
      result.add(
          KV.of(
              new Text(reader.getCurrentKey().toString()),
              new LongWritable(reader.getCurrentValue().get())));
    }

    return result.stream();
  } catch (Exception e) {
    throw new RuntimeException(e);
  }
}
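Example 1 copies each key and value into a fresh Text and LongWritable before adding it to the result list. A likely reason is that Hadoop record readers, including SequenceFileRecordReader, generally reuse the same key and value objects across calls to nextKeyValue(), so storing the returned references directly would leave every list element pointing at the last record. The sketch below illustrates that hazard without any Hadoop dependency. MutableText and ReusingReader are hypothetical stand-ins written for this illustration, not Hadoop classes; they only mimic the reuse behaviour.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins (not Hadoop classes) that mimic object reuse in a record reader.
final class ReaderReuseSketch {

  /** Mutable holder playing the role of a Writable such as Text. */
  static final class MutableText {
    private String value = "";
    void set(String v) { value = v; }
    @Override public String toString() { return value; }
  }

  /** Minimal reader that hands back the same key object on every call. */
  static final class ReusingReader {
    private final String[] data;
    private final MutableText current = new MutableText();
    private int next = 0;

    ReusingReader(String... data) { this.data = data.clone(); }

    /** Advances to the next record, overwriting the shared key object. */
    boolean nextKeyValue() {
      if (next >= data.length) {
        return false;
      }
      current.set(data[next++]);
      return true;
    }

    MutableText getCurrentKey() { return current; }
  }

  /** Stores references only: every element ends up aliasing the last record. */
  static List<String> collectWithoutCopy(ReusingReader reader) {
    List<MutableText> refs = new ArrayList<>();
    while (reader.nextKeyValue()) {
      refs.add(reader.getCurrentKey());
    }
    List<String> out = new ArrayList<>();
    for (MutableText t : refs) {
      out.add(t.toString());
    }
    return out;
  }

  /** Copies the contents before advancing, in the spirit of Example 1. */
  static List<String> collectWithCopy(ReusingReader reader) {
    List<String> out = new ArrayList<>();
    while (reader.nextKeyValue()) {
      out.add(reader.getCurrentKey().toString());
    }
    return out;
  }

  public static void main(String[] args) {
    System.out.println(collectWithoutCopy(new ReusingReader("a", "b", "c"))); // [c, c, c]
    System.out.println(collectWithCopy(new ReusingReader("a", "b", "c")));    // [a, b, c]
  }
}
```

With a real SequenceFileRecordReader the same precaution applies whenever records are kept beyond the current loop iteration: copy the key and value contents, as Example 1 does, rather than retaining the objects returned by getCurrentKey() and getCurrentValue().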
 
Example 2
/**
 * Actually instantiate the user's chosen RecordReader implementation.
 */
@SuppressWarnings("unchecked")
private void createChildReader() throws IOException, InterruptedException {
  LOG.debug("ChildSplit operates on: " + split.getPath(index));

  Configuration conf = context.getConfiguration();

  // Determine the file format we're reading.
  Class rrClass;
  if (ExportJobBase.isSequenceFiles(conf, split.getPath(index))) {
    rrClass = SequenceFileRecordReader.class;
  } else {
    rrClass = LineRecordReader.class;
  }

  // Create the appropriate record reader.
  this.rr = (RecordReader<LongWritable, Object>)
      ReflectionUtils.newInstance(rrClass, conf);
}
 
Example 3
Source Project: hadoop   Source File: DynamicInputChunk.java    License: Apache License 2.0
private void openForRead(TaskAttemptContext taskAttemptContext)
        throws IOException, InterruptedException {
  reader = new SequenceFileRecordReader<K, V>();
  reader.initialize(new FileSplit(chunkFilePath, 0,
          DistCpUtils.getFileSize(chunkFilePath, configuration), null),
          taskAttemptContext);
}
 
Example 4
Source Project: hadoop   Source File: GenerateDistCacheData.java    License: Apache License 2.0
/**
 * Returns a reader for this split of the distributed cache file list.
 */
@Override
public RecordReader<LongWritable, BytesWritable> createRecordReader(
    InputSplit split, final TaskAttemptContext taskContext)
    throws IOException, InterruptedException {
  return new SequenceFileRecordReader<LongWritable, BytesWritable>();
}
 
Example 5
Source Project: big-c   Source File: DynamicInputChunk.java    License: Apache License 2.0
private void openForRead(TaskAttemptContext taskAttemptContext)
        throws IOException, InterruptedException {
  reader = new SequenceFileRecordReader<K, V>();
  reader.initialize(new FileSplit(chunkFilePath, 0,
          DistCpUtils.getFileSize(chunkFilePath, configuration), null),
          taskAttemptContext);
}
 
Example 6
Source Project: big-c   Source File: GenerateDistCacheData.java    License: Apache License 2.0
/**
 * Returns a reader for this split of the distributed cache file list.
 */
@Override
public RecordReader<LongWritable, BytesWritable> createRecordReader(
    InputSplit split, final TaskAttemptContext taskContext)
    throws IOException, InterruptedException {
  return new SequenceFileRecordReader<LongWritable, BytesWritable>();
}
 
Example 7
Source Project: kangaroo   Source File: WritableValueInputFormat.java    License: Apache License 2.0
@Override
public RecordReader<NullWritable, V> createRecordReader(final InputSplit split, final TaskAttemptContext context)
        throws IOException, InterruptedException {
    final SequenceFileRecordReader<NullWritable, V> reader = new SequenceFileRecordReader<NullWritable, V>();
    reader.initialize(split, context);
    return reader;
}
 
Example 8
Source Project: circus-train   Source File: DynamicInputChunk.java    License: Apache License 2.0
private void openForRead(TaskAttemptContext taskAttemptContext) throws IOException, InterruptedException {
  reader = new SequenceFileRecordReader<>();
  reader
      .initialize(new FileSplit(chunkFilePath, 0, getFileSize(chunkFilePath, configuration), null),
          taskAttemptContext);
}
 
Example 9
Source Project: hadoop   Source File: DynamicInputChunk.java    License: Apache License 2.0
/**
 * Getter for the record-reader, opened to the chunk-file.
 * @return Opened Sequence-file reader.
 */
public SequenceFileRecordReader<K,V> getReader() {
  assert reader != null : "Reader un-initialized!";
  return reader;
}
 
Example 10
Source Project: big-c   Source File: DynamicInputChunk.java    License: Apache License 2.0
/**
 * Getter for the record-reader, opened to the chunk-file.
 * @return Opened Sequence-file reader.
 */
public SequenceFileRecordReader<K,V> getReader() {
  assert reader != null : "Reader un-initialized!";
  return reader;
}
 
Example 11
Source Project: hiped2   Source File: SequenceFileStockLoader.java    License: Apache License 2.0
@SuppressWarnings("unchecked")
@Override
public void prepareToRead(RecordReader reader, PigSplit split)
    throws IOException {
  this.reader = (SequenceFileRecordReader) reader;
}
 
Example 12
Source Project: spork   Source File: SequenceFileLoader.java    License: Apache License 2.0
@SuppressWarnings("unchecked")
@Override
public void prepareToRead(RecordReader reader, PigSplit split)
      throws IOException {
  this.reader = (SequenceFileRecordReader) reader;
}
 
Example 13
Source Project: kangaroo   Source File: S3SequenceFileInputFormat.java    License: Apache License 2.0
@Override
public RecordReader<K, V> createRecordReader(InputSplit split, TaskAttemptContext context) throws IOException {
    return new SequenceFileRecordReader<K, V>();
}
 
Example 14
Source Project: hadoop   Source File: UniformSizeInputFormat.java    License: Apache License 2.0
/**
 * Implementation of InputFormat::createRecordReader().
 * @param split The split for which the RecordReader is sought.
 * @param context The context of the current task-attempt.
 * @return A SequenceFileRecordReader instance, (since the copy-listing is a
 * simple sequence-file.)
 * @throws IOException
 * @throws InterruptedException
 */
@Override
public RecordReader<Text, CopyListingFileStatus> createRecordReader(
    InputSplit split, TaskAttemptContext context)
    throws IOException, InterruptedException {
  return new SequenceFileRecordReader<Text, CopyListingFileStatus>();
}
 
Example 15
Source Project: big-c   Source File: UniformSizeInputFormat.java    License: Apache License 2.0
/**
 * Implementation of InputFormat::createRecordReader().
 * @param split The split for which the RecordReader is sought.
 * @param context The context of the current task-attempt.
 * @return A SequenceFileRecordReader instance, (since the copy-listing is a
 * simple sequence-file.)
 * @throws IOException
 * @throws InterruptedException
 */
@Override
public RecordReader<Text, CopyListingFileStatus> createRecordReader(
    InputSplit split, TaskAttemptContext context)
    throws IOException, InterruptedException {
  return new SequenceFileRecordReader<Text, CopyListingFileStatus>();
}
 
Example 16
Source Project: circus-train   Source File: UniformSizeInputFormat.java    License: Apache License 2.0
/**
 * Implementation of InputFormat::createRecordReader().
 *
 * @param split The split for which the RecordReader is sought.
 * @param context The context of the current task-attempt.
 * @return A SequenceFileRecordReader instance, (since the copy-listing is a simple sequence-file.)
 * @throws IOException
 * @throws InterruptedException
 */
@Override
public RecordReader<Text, CopyListingFileStatus> createRecordReader(InputSplit split, TaskAttemptContext context)
  throws IOException, InterruptedException {
  return new SequenceFileRecordReader<>();
}
 
Example 17
Source Project: circus-train   Source File: DynamicInputChunk.java    License: Apache License 2.0
/**
 * Getter for the record-reader, opened to the chunk-file.
 *
 * @return Opened Sequence-file reader.
 */
public SequenceFileRecordReader<K, V> getReader() {
  assert reader != null : "Reader un-initialized!";
  return reader;
}