Java Code Examples for org.apache.hadoop.mapred.FileOutputFormat#getTaskOutputPath()

The following examples show how to use org.apache.hadoop.mapred.FileOutputFormat#getTaskOutputPath() . You can vote up the ones you like or vote down the ones you don't like, and go to the original project or source file by following the links above each example. You may check out the related API usage on the sidebar.
Example 1
Source File: ExportManifestOutputFormat.java    From emr-dynamodb-connector with Apache License 2.0 5 votes vote down vote up
@Override
public RecordWriter<K, Text> getRecordWriter(FileSystem ignored, JobConf job, String name,
    Progressable progress) throws IOException {
  String extension = "";
  Path file = FileOutputFormat.getTaskOutputPath(job, MANIFEST_FILENAME);
  FileSystem fs = file.getFileSystem(job);
  FSDataOutputStream fileOut = fs.create(file, progress);
  if (getCompressOutput(job)) {
    Class<? extends CompressionCodec> codecClass = getOutputCompressorClass(job, GzipCodec.class);
    CompressionCodec codec = ReflectionUtils.newInstance(codecClass, job);
    extension = codec.getDefaultExtension();
  }
  return new ExportManifestRecordWriter<>(fileOut, FileOutputFormat.getOutputPath(job),
      extension);
}
 
Example 2
Source File: WARCOutputFormat.java    From warc-hadoop with MIT License 4 votes vote down vote up
public WARCWriter(JobConf job, String filename, Progressable progress) throws IOException {
    CompressionCodec codec = getCompressOutput(job) ? WARCFileWriter.getGzipCodec(job) : null;
    Path workFile = FileOutputFormat.getTaskOutputPath(job, filename);
    this.writer = new WARCFileWriter(job, codec, workFile, progress);
}
 
Example 3
Source File: AvroAsJsonOutputFormat.java    From iow-hadoop-streaming with Apache License 2.0 4 votes vote down vote up
@Override
public RecordWriter<Text, NullWritable>
    getRecordWriter(FileSystem ignore, JobConf job, String name, Progressable prog)
    throws IOException {

    Schema schema;
    Schema.Parser p = new Schema.Parser();
    String strSchema = job.get("iow.streaming.output.schema");
    if (strSchema == null) {

        String schemaFile = job.get("iow.streaming.output.schema.file", "streaming_output_schema");

        if (job.getBoolean("iow.streaming.schema.use.prefix", false)) {
            // guess schema from file name
            // format is: schema:filename
            // with special keyword default - 'default:filename'

            String str[] = name.split(":");
            if (!str[0].equals("default"))
                schemaFile = str[0];

            name = str[1];
        }

        LOG.info(this.getClass().getSimpleName() + ": Using schema from file: " + schemaFile);
        File f = new File(schemaFile);
        schema = p.parse(f);
    }
    else {
        LOG.info(this.getClass().getSimpleName() + ": Using schema from jobconf.");
        schema = p.parse(strSchema);
    }

    if (schema == null) {
        throw new IOException("Can't find proper output schema");
    }

    DataFileWriter<GenericRecord> writer = new DataFileWriter<GenericRecord>(new GenericDatumWriter<GenericRecord>());

    configureDataFileWriter(writer, job);

    Path path = FileOutputFormat.getTaskOutputPath(job, name + org.apache.avro.mapred.AvroOutputFormat.EXT);
    writer.create(schema, path.getFileSystem(job).create(path));

    return createRecordWriter(writer, schema);
}