Beast


Kafka to BigQuery Sink

Architecture

Building & Running

Prerequisite

Run locally:

git clone https://github.com/gojekfarm/beast
export $(cat ./env/sample.properties | xargs -L1) && ./gradlew clean runConsumer
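
The export line loads every KEY=VALUE pair from the properties file into the shell environment before the consumer starts. As a rough sketch, the file looks something like this (the keys below are illustrative assumptions, not the actual variable names; env/sample.properties in the repo is the authoritative list):

# env/sample.properties (illustrative keys only; see the repo for the real ones)
KAFKA_CONSUMER_BOOTSTRAP_SERVERS=localhost:9092
KAFKA_CONSUMER_TOPIC=test-messages
BQ_PROJECT_NAME=<project_name>
BQ_DATASET_NAME=dataset_name
BQ_TABLE_NAME=test_messages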

Run with Docker

The image is available on Docker Hub under the gojektech organization.

export TAG=80076c77dc8504e7c758865602aca1b05259e5d3
docker run --env-file beast.env -v $(pwd)/local_dir/project-secret.json:/var/bq-secret.json -it gojektech/beast:$TAG
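
The -v flag mounts a BigQuery service-account key into the container at /var/bq-secret.json; beast.env should then point the credentials variable at that path. A minimal sketch, assuming the standard GOOGLE_APPLICATION_CREDENTIALS convention (the exact variable names Beast reads may differ):

# beast.env (sketch; variable names are assumptions)
GOOGLE_APPLICATION_CREDENTIALS=/var/bq-secret.json
KAFKA_CONSUMER_BOOTSTRAP_SERVERS=kafka:9092
KAFKA_CONSUMER_TOPIC=test-messages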

Running on Kubernetes

Create a Beast deployment for a Kafka topic that needs to be pushed to BigQuery.
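
No manifest ships with this walkthrough, so the Deployment below is an illustration only: the image tag comes from the Docker section above, while the Secret names and env wiring are assumptions to adapt to your cluster.

cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: beast-test-messages
spec:
  replicas: 1
  selector:
    matchLabels: { app: beast-test-messages }
  template:
    metadata:
      labels: { app: beast-test-messages }
    spec:
      containers:
      - name: beast
        image: gojektech/beast:$TAG          # tag from the Docker section
        envFrom:
        - secretRef: { name: beast-env }     # assumption: Beast env vars kept in a Secret
        volumeMounts:
        - name: bq-secret
          mountPath: /var/bq-secret.json
          subPath: bq-secret.json
      volumes:
      - name: bq-secret
        secret: { secretName: bq-secret }    # assumption: service-account key in a Secret
EOF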

BQ Setup:

Given a TestMessage proto file, you can create a BigQuery table with the schema:

# create new table from schema
bq mk --table <project_name>:dataset_name.test_messages ./docs/test_messages.schema.json

# query total records
bq query --nouse_legacy_sql 'SELECT count(*) FROM `<project_name>.dataset_name.test_messages`'

# update bq schema from local schema json file
bq update <project_name>:dataset_name.test_messages ./docs/test_messages.schema.json

# dump the schema of table to file
bq show --schema --format=prettyjson <project_name>:dataset_name.test_messages > test_messages.schema.json
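
The schema file consumed by bq mk and produced by bq show is a JSON array of field definitions. A sketch of what docs/test_messages.schema.json might contain (the field names are assumptions, not necessarily the real TestMessage fields):

[
  {"name": "order_number", "type": "STRING",    "mode": "NULLABLE"},
  {"name": "order_url",    "type": "STRING",    "mode": "NULLABLE"},
  {"name": "created_at",   "type": "TIMESTAMP", "mode": "NULLABLE"}
]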

Produce messages to Kafka

You can generate messages from TestMessage.proto with sample-kafka-producer, which pushes N sample messages to the given Kafka topic.
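
After producing, you can sanity-check that the messages landed with Kafka's stock console consumer (the bootstrap server and topic name below are assumptions; match them to your setup). The payload is protobuf, so the output is binary, but a non-empty stream confirms delivery:

kafka-console-consumer.sh --bootstrap-server localhost:9092 \
  --topic test-messages --from-beginning --max-messages 5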

Running Stencil Server

Contribution

To run and test locally:

git clone https://github.com/gojekfarm/beast
export $(cat ./env/sample.properties | xargs -L1) && ./gradlew test