The Spark-Riak connector enables you to connect Spark applications to Riak KV and Riak TS using the Spark RDD and Spark DataFrames APIs. You can write your app in Scala, Python, or Java. The connector makes it easy to partition the data you get from Riak so that multiple Spark workers can process it in parallel, and it supports failover if a Riak node goes down while your Spark job is running.
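As a rough illustration of the RDD API described above, a minimal Scala sketch might look like the following. The bucket names `test-bucket` and `output-bucket` are placeholders, and a configured `SparkContext` (`sc`) pointing at a reachable Riak cluster is assumed; consult the connector's own documentation for the exact API in your version.

```scala
import com.basho.riak.spark._

// Read every object in the "test-bucket" Riak KV bucket into an RDD.
// The connector partitions the query so multiple Spark workers can
// pull data from Riak in parallel.
val rdd = sc.riakBucket[String]("test-bucket").queryAll()

// Transform the data with ordinary RDD operations, then write the
// results back to another bucket.
rdd.map(_.toUpperCase).saveToRiak("output-bucket")
```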
Using the Spark-Riak Connector

In order to use the Spark-Riak connector, you must have the following installed:
The Riak Users Mailing List is highly trafficked and a great resource for technical discussions, Riak issues and questions, and community events and announcements.
We pride ourselves on answering every email that comes over the Riak Users mailing list, so sign up and send away. If you prefer earning reputation points for your questions, you can always tag them with riak on StackOverflow.
The #riak IRC room on irc.freenode.net is a great place for real-time help with your Riak issues and questions.
To report a bug or issue, please open a new issue against this repository.
You can read the full guidelines for bug reporting on the Riak Docs.
Basho encourages contributions to the Spark-Riak Connector from the community. Here’s how to get started.
Copyright © 2016 Basho Technologies
Licensed under the Apache License, Version 2.0