S3 Parquet output plugin for Embulk

Release CI Status Badge Test CI Status Badge

Embulk output plugin to dump records as Apache Parquet files on S3.

Overview

Configuration

Example

out:
  type: s3_parquet
  bucket: my-bucket
  path_prefix: path/to/my-obj.
  file_ext: snappy.parquet
  compression_codec: snappy
  default_timezone: Asia/Tokyo
  canned_acl: bucket-owner-full-control

Note

Development

Run example:

$ ./run_s3_local.sh
$ ./example/prepare_s3_bucket.sh
$ ./gradlew gem
$ embulk run example/config.yml -Ibuild/gemContents/lib

Run test:

$ ./run_s3_local.sh
$ ./gradlew scalatest

Build

$ ./gradlew gem --write-locks  # -t to watch change of files and rebuild continuously

Release gem:

Fix build.gradle, then

$ ./gradlew gemPush

ChangeLog

CHANGELOG.md