The Kite Examples project provides examples of how to use the Kite SDK.
Each example is a standalone Maven module with associated documentation.
- dataset is the closest to a HelloWorld example of Kite. It shows how to create datasets and perform streaming writes and reads over them.
- dataset-hbase shows how to store entities in HBase using the RandomAccessDataset API.
- dataset-staging shows how to use two datasets to manage Parquet-formatted data.
- logging is an example of logging events from a command-line program to Hadoop via Flume, using log4j as the logging API.
- logging-webapp is like logging, but the logging source is a webapp.
- demo is a full end-to-end example of a webapp that logs events using Flume and performs session analysis using Crunch and Hive.
The easiest way to run the examples is on the Cloudera QuickStart VM, which has all the necessary Hadoop services pre-installed, configured, and running locally. See the notes below for any initial setup steps you should take.
The current examples run on version 5.1.0 of the QuickStart VM.
Check out the latest branch of this repository in the VM:
git clone git://github.com/kite-sdk/kite-examples.git
cd kite-examples
Then choose the example you want to try and refer to the README in the relevant subdirectory.
There are two ways to run the examples with the QuickStart VM:
1. Logged in to the VM guest (the username and password are both cloudera).
2. From your host computer.
The advantage of the first approach is that you don't need to install anything extra on your host computer, such as Java or Maven, so there are fewer setup steps.
For either approach, you need to make the following changes while logged into the VM:
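- Sync the system clock by running sudo ntpdate pool.ntp.org.
- Configure the NameNode to listen on all interfaces by setting the dfs.namenode.rpc-bind-host property in /etc/hadoop/conf/hdfs-site.xml: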
<property>
  <name>dfs.namenode.rpc-bind-host</name>
  <value>0.0.0.0</value>
</property>
- Configure the MapReduce JobHistory server to listen on all interfaces by setting the mapreduce.jobhistory.address property in /etc/hadoop/conf/mapred-site.xml:
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>0.0.0.0:10020</value>
</property>
- Configure HBase to listen on all interfaces. To access the cluster from the host computer, HBase must be configured to listen on all network interfaces. This is done by setting the hbase.master.ipc.address and hbase.regionserver.ipc.address properties in /etc/hbase/conf/hbase-site.xml:
<property>
  <name>hbase.master.ipc.address</name>
  <value>0.0.0.0</value>
</property>
<property>
  <name>hbase.regionserver.ipc.address</name>
  <value>0.0.0.0</value>
</property>
- Restart the VM so the changes take effect: sudo shutdown -r now
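After the restart, it can be worth confirming that the edits are actually in place. The following is a minimal sketch and not part of the Kite examples: get_hadoop_property is a hypothetical helper name, and it assumes each <name> and <value> element sits on its own line, as in the snippets above.

```shell
# get_hadoop_property FILE NAME
# Hypothetical helper: print the <value> that follows <name>NAME</name>
# in a Hadoop-style XML config file. Assumes <name> and <value> are on
# separate lines; prints nothing if the property is not found.
get_hadoop_property() {
  awk -v name="$2" '
    index($0, "<name>" name "</name>") { found = 1; next }
    found && /<value>/ {
      sub(/.*<value>/, "")
      sub(/<\/value>.*/, "")
      print
      exit
    }
  ' "$1"
}

# Example calls (run inside the VM):
#   get_hadoop_property /etc/hadoop/conf/hdfs-site.xml dfs.namenode.rpc-bind-host
#   get_hadoop_property /etc/hadoop/conf/mapred-site.xml mapreduce.jobhistory.address
#   get_hadoop_property /etc/hbase/conf/hbase-site.xml hbase.master.ipc.address
```

If a call prints nothing, the property is missing from that file or is laid out differently from what this sketch assumes.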
The second approach is preferable when you want to use tools from your own development environment (browser, IDE, command line). However, there are a few extra steps you need to take to configure the QuickStart VM, listed below:
- Add a host entry for quickstart.cloudera. Add or edit a line like the following in /etc/hosts on the host machine:
127.0.0.1 localhost.localdomain localhost quickstart.cloudera
- Enable port forwarding for ports 8032 and 10020, mapping each host port to the same guest port. If you have VBoxManage installed on your host machine, you can do this from the command line. In bash, this would look something like:
# Set VM_NAME to the name of your VM as it appears in VirtualBox
VM_NAME="QuickStart VM"
PORTS="8032 10020"
for port in $PORTS; do
  VBoxManage modifyvm "$VM_NAME" --natpf1 "Rule $port,tcp,,$port,,$port"
done
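To confirm the /etc/hosts step on the host, a small check like the following may help. It is a sketch rather than part of the Kite examples: has_host_entry is a hypothetical helper name, and it assumes the usual hosts-file layout (address first, names after, # starting a comment).

```shell
# has_host_entry FILE NAME
# Hypothetical helper: succeeds if FILE maps NAME on a non-comment line.
has_host_entry() {
  awk -v name="$2" '
    /^[ \t]*#/ { next }                  # skip whole-line comments
    {
      for (i = 2; i <= NF; i++) {        # field 1 is the address
        if ($i ~ /^#/) break             # stop at a trailing comment
        if ($i == name) found = 1
      }
    }
    END { exit(found ? 0 : 1) }
  ' "$1"
}

# Example (run on the host machine):
#   has_host_entry /etc/hosts quickstart.cloudera && echo "host entry found"
```

The check only inspects the file. It does not verify that the forwarded ports are reachable.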
Some of the examples include integration tests. You can run them all with the following command:
for module in */; do
  (cd "$module" && mvn clean verify) || break
done
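The loop above stops at the first failing module. If you would rather run every module and see all failures at once, a variant along these lines is one option. This is a sketch under assumptions: run_in_modules is a hypothetical helper name, and it treats any subdirectory containing a pom.xml as a module.

```shell
# run_in_modules COMMAND [ARGS...]
# Hypothetical helper: run COMMAND in every subdirectory that contains a
# pom.xml, keep going after failures, and list the failed modules at the end.
run_in_modules() {
  failed=""
  for module in */; do
    [ -f "${module}pom.xml" ] || continue   # skip non-Maven directories
    if ! (cd "$module" && "$@"); then
      failed="$failed ${module%/}"
    fi
  done
  if [ -n "$failed" ]; then
    echo "Failed modules:$failed"
    return 1
  fi
  echo "All modules passed"
}

# Example (run from the kite-examples directory):
#   run_in_modules mvn clean verify
```

Running each module in a subshell keeps the caller's working directory unchanged, as in the original loop.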
What are the usernames/passwords for the VM?
I can't find the file in VirtualBox (or VMWare)!
- You probably need to unpack the .7z file first. On Linux or Mac, cd to where you copied the file and run 7zr e cloudera-quickstart-vm-4.3.0-kite-vbox-4.4.0.7z
How do I open a .ovf file?
- Navigate to the .ovf file and select it.
What is a .vmdk file?
- The .vmdk file is the virtual machine disk image that accompanies a .ovf file, which is a portable VM description.
How do I open a .vbox file?
- Navigate to the .vbox file and select it.
How do I fix "VTx" errors?
How do I get my mouse back?
- If the VM has captured your mouse, you can usually release it by pressing the right CTRL key. If you don't have one (or that didn't work), the release key is shown in the lower-right corner of the VirtualBox window.
Other problems