Spark Google AdWords Library

Join the chat at https://gitter.im/spark-google-adwords/Lobby

A library for querying Google AdWords data with Apache Spark, for Spark SQL and DataFrames.

Requirements

This library is tested with Spark 2.1+. It might work with older versions, but we don't provide any support for them.

Linking

You can link against this library in your program at the following coordinates:

Scala 2.11

groupId: com.crealytics
artifactId: spark-google-adwords_2.11
version: 0.9.2
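
For an sbt build, the coordinates above can be written as a single dependency line. This is a minimal sketch that spells out the Scala 2.11 artifact name explicitly rather than relying on the project's `scalaVersion` setting:

```scala
// build.sbt
// Coordinates copied from the list above; the _2.11 suffix is written out explicitly.
libraryDependencies += "com.crealytics" % "spark-google-adwords_2.11" % "0.9.2"
```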

Using with Spark shell

This package can be added to Spark using the --packages command-line option. For example, to include it when starting the Spark shell:

Spark compiled with Scala 2.11

$SPARK_HOME/bin/spark-shell --packages com.crealytics:spark-google-adwords_2.11:0.9.2

Features

This package allows querying Google AdWords reports as Spark DataFrames. The API accepts several options, such as the report type, the client customer ID and the date range used in the example below (see the Google AdWords developer docs for details).

Scala API

Spark 1.4+:

Generate a refresh token (if you don't have one yet):

import com.crealytics.google.adwords._
val clientId = "123456789123-yourclientid.apps.googleusercontent.com"
val clientSecret = "yourclientsecret-1"
val authHelper = new AdWordsAuthHelper(clientId, clientSecret)

// The next line prints a URL. Open it in a browser and copy the authentication code that is displayed.
println(authHelper.authorizationUrl)

// Paste the authentication code from the browser window here to get the refresh token
println(authHelper.getRefreshToken("TheAuthenticationTokenFromTheBrowser"))

Create a DataFrame from an AdWords report:

import org.apache.spark.sql.SQLContext

val sqlContext = new SQLContext(sc)
val df = sqlContext.read
    .format("com.crealytics.google.adwords")
    .option("clientId", clientId)
    .option("clientSecret", clientSecret)
    .option("developerToken", "YourDeveloperToken")
    .option("refreshToken", "1/YourRefreshToken")
    .option("reportType", "SHOPPING_PERFORMANCE_REPORT")
    .option("clientCustomerId", "1234567890")
    .option("userAgent", "Spark")
    .option("during", "LAST_30_DAYS")
    .load()
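
On Spark 2.x the same read can also go through a SparkSession instead of a SQLContext. The following is a sketch, not the library's documented usage: it reuses the option keys from the example above, collects them in a Map, and passes them with `DataFrameReader.options`. All credential values are placeholders, and the names `adWordsOptions`, `report` and the view name `shopping_report` are made up for illustration.

```scala
import org.apache.spark.sql.SparkSession

// In spark-shell a session already exists; getOrCreate() returns it.
val spark = SparkSession.builder().appName("adwords-report").getOrCreate()

// Same option keys as in the SQLContext example; every value is a placeholder.
val adWordsOptions = Map(
  "clientId"         -> "123456789123-yourclientid.apps.googleusercontent.com",
  "clientSecret"     -> "yourclientsecret-1",
  "developerToken"   -> "YourDeveloperToken",
  "refreshToken"     -> "1/YourRefreshToken",
  "reportType"       -> "SHOPPING_PERFORMANCE_REPORT",
  "clientCustomerId" -> "1234567890",
  "userAgent"        -> "Spark",
  "during"           -> "LAST_30_DAYS"
)

val report = spark.read
  .format("com.crealytics.google.adwords")
  .options(adWordsOptions)
  .load()

// Inspect the columns delivered by the report before querying them.
report.printSchema()

// Register the report as a temporary view (hypothetical name) and query it with Spark SQL.
report.createOrReplaceTempView("shopping_report")
spark.sql("SELECT COUNT(*) AS row_count FROM shopping_report").show()
```

Keeping the options in a Map makes it easy to swap the report type or date range without repeating the credential options for every report.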

Building From Source

This library is built with SBT. To build a JAR file, run sbt assembly from the project root. The build configuration includes support for Scala 2.11.