This test program uses structured streaming of spark, receive message that is json type from kafka, then get counts of DNS for every people In one minute. (I think one source Ip addr is one people). Finally, the result will insert to mysql database.
scala version is 2.11.8, if you not, you will change the build.sbt, modify
scalaVersion:= "your version"
spark-2.1.1-bin-hadoop2.6
KAFAK-2.1.1-1.2.1.1.po.18
mysql version:14.14
linux:centos 6.5(Final)
git clone https://github.com/cynthia1wang/jdbcsink.git
$cd jdbcsink/
$sbt clean package
$spark-submit --deploy-mode cluster --master yarn --supervise --driver-class-path mysql-connector-java-5.1.42-bin.jar --class DNSstatJob --jars spark-sql-kafka-0-10_2.11-2.1.1.jar,spark-streaming-kafka-0-10-assembly_2.11-2.1.1.jar,mysql-connector-java-5.1.42-bin.jar target/scala-2.11/dnsjob-project_2.11-1.0.jar