-
under Apache License 2.0 license
-
Eclipse Deeplearning4j, ND4J, DataVec and more - deep learning & linear algebra for Java/Scala with GPUs + Spark
-
under MIT License license
-
Spark-based movie recommendation system, including a crawler project, a website, an admin back end, and the Spark recommendation system
-
under Apache License 2.0 license
-
Dr. Elephant is a job and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark
-
under Apache License 2.0 license
-
Oryx 2: Lambda architecture on Apache Spark, Apache Kafka for real-time large scale machine learning
-
under Apache License 2.0 license
-
Cross-platform real-time collaboration client optimized for business and organizations.
-
under Apache License 2.0 license
-
A Java wrapper to run Spring, Jersey, Spark, and other apps inside AWS Lambda.
-
under Apache License 2.0 license
-
High-performance Kafka connector for Spark Streaming. Supports multi-topic fetch and Kafka security. Reliable offset management in ZooKeeper. No data loss. No dependency on HDFS and WAL. Built-in PID rate controller. Supports a message handler. Offset lag checker.
-
under Apache License 2.0 license
-
Benchmarks for Low Latency (Streaming) solutions including Apache Storm, Apache Spark, Apache Flink, ...
-
under Apache License 2.0 license
-
Stock inference engine using Spring XD, Apache Geode / GemFire and Spark MLlib.
-
under Apache License 2.0 license
-
DC/OS SDK is a collection of tools, libraries, and documentation for easy integration of technologies such as Kafka, Cassandra, HDFS, Spark, and TensorFlow with DC/OS.
-
under GNU Affero General Public License v3.0 license
-
REST web service for the true real-time scoring (<1 ms) of R, Scikit-Learn and Apache Spark models
-
under Apache License 2.0 license
-
A simple Android sparkline chart view.
-
under Apache License 2.0 license
-
Example of one possible way of structuring a Spark application
-
under Apache License 2.0 license
-
Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.
-
under MIT License license
-
Based on Flask's MiniTwit example and written in Java with the Spark web framework, Spring, and HSQLDB
-
under Apache License 2.0 license
-
:sparkles: Excel operation component based on poi & CSV :sparkles:
-
under Apache License 2.0 license
-
Mazerunner extends a Neo4j graph database to run scheduled big data graph compute algorithms at scale with HDFS and Apache Spark.
-
under Apache License 2.0 license
-
The Deep Learning training framework on Spark
-
under MIT License license
-
A cool and elegant Submit Button
-
under MIT License license
-
A pinview library for android. :sparkles:
-
under Apache License 2.0 license
-
Repository for different Template engine implementations.
-
under Apache License 2.0 license
-
Web UI for Presto, Hive, Elasticsearch, SparkSQL
-
under MIT License license
-
A Carousel picker library for android which supports both text and icons . :sparkles:
-
under Apache License 2.0 license
-
Simple Spark Application
-
under Apache License 2.0 license
-
Big data practice projects: Hadoop, Spark, Kafka, HBase, Flink, ...
-
under Apache License 2.0 license
-
Powered by Spark Streaming & Siddhi
-
under Apache License 2.0 license
-
Build configuration-driven ETL pipelines on Apache Spark
-
under Apache License 2.0 license
-
Spark TS Examples
-
under Apache License 2.0 license
-
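If you work in big data BI and want to compare the performance of different solutions such as MySQL, GreenPlum, Elasticsearch, Hive, Spark SQL, Presto, Impala, Drill, HAWQ, Druid, Pinot, Kylin, ClickHouse, and Kudu, you need a standard dataset to test with. This open-source project generates that standard data.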
-
under Apache License 2.0 license
-
Posts one curated picture of a girl every day, hence the name "Daily Girl" (with a gentlemanly face (๑•̀ㅂ•́) ✧)
-
under GNU Affero General Public License v3.0 license
-
Java library and command-line application for converting Apache Spark ML pipelines to PMML
-
under Apache License 2.0 license
-
Spark Terasort
-
under Apache License 2.0 license
-
Apache Spark applications
-
under Apache License 2.0 license
-
Spark job that aggregates zipkin spans for use in the UI
-
under Apache License 2.0 license
-
TodoMVC using intercooler.js and Spark
-
under MIT License license
-
Java library for consuming RESTful APIs for Cisco Spark
-
under Apache License 2.0 license
-
MrGeo is a geospatial toolkit designed to provide raster-based geospatial capabilities that can be performed at scale. MrGeo is built upon Apache Spark and the Hadoop ecosystem to leverage the storage and processing of hundreds of commodity computers. See the wiki for more details.
-
under Apache License 2.0 license
-
Chat application tutorial using Spark and WebSockets
-
under Apache License 2.0 license
-
Applying the Lambda Architecture with Spark, Kafka, and Cassandra.
-
under Apache License 2.0 license
-
Stocator is a high-performance connector to object storage for Apache Spark, achieving performance by leveraging object storage semantics.
-
under Apache License 2.0 license
-
RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
-
under Apache License 2.0 license
-
BigQuery data source for Apache Spark: Read data from BigQuery into DataFrames, write DataFrames into BigQuery tables.
-
under Apache License 2.0 license
-
Fluent client for interacting with Spark Standalone Mode's Rest API for submitting, killing and monitoring the state of jobs.
-
under Apache License 2.0 license
-
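[Graduation project] Spark-based analysis of NetEase Cloud Music data [1. graph computation 2. machine-learning prediction of song categories 3. comment word cloud 4. comment time periods 5. top comments chart 6. top hot songs chart 7. user gender ratio 8. user zodiac sign ratio 9. user age ratio 10. nationwide geographic distribution of users 11. hot comment search, and so on]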
-
under Apache License 2.0 license
-
-
under MIT License license
-
A ViewPager animator that animates Views within pages as well as views across pages.
-
under Apache License 2.0 license
-
Connecting Apache Spark with different data stores [DEPRECATED]
-
under Apache License 2.0 license
-
-
under Apache License 2.0 license
-
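A system in which continuous structured or unstructured big data sets enter a distributed computing platform as streams, are persisted on large-scale distributed storage, and can be queried with near-real-time SQL is something many people long for.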
-
under Apache License 2.0 license
-
Hadoop Crypto Ledger - Analyzing CryptoLedgers, such as Bitcoin Blockchain, on Big Data platforms, such as Hadoop/Spark/Flink/Hive
-
under Apache License 2.0 license
-
Sqoop on Apache Spark Engine
-
under Apache License 2.0 license
-
Spark job for dependency links
-
under Apache License 2.0 license
-
KodeBeagle - Large scale code analytics and search using Apache Spark.
-
under GNU Affero General Public License v3.0 license
-
PMML evaluator library for the Apache Spark cluster computing system (http://spark.apache.org/)
-
under GNU General Public License v3.0 license
-
This is the weather app I built as part of an assignment for the Android Internship Workshop I attended at Amrita in the summer of 2016
-
under MIT License license
-
Self-contained examples using Apache Spark with the functional features of Java 8
-
under MIT License license
-
Apache Spark 2.x for Java Developers, published by Packt
-
under Apache License 2.0 license
-
Operator for managing the Spark clusters on Kubernetes and OpenShift.
-
under MIT License license
-
Data Stream Development with Apache Spark, Kafka and Spring Boot by Packt Publishing
-
under Apache License 2.0 license
-
Former home of the Official Particle Cloud SDK for Android
-
under MIT License license
-
A semicircular seekbar view for selecting angle from 0° to 180° :sparkles:
-
under MIT License license
-
Recommendation algorithm implementations based on spark-ml, spark-mllib, and spark-streaming
-
under MIT License license
-
A HorizontalPicker view for android, which supports both text and icon. :sparkles:
-
under Apache License 2.0 license
-
Methods for the parallel and distributed analysis and mining of the Protein Data Bank using MMTF and Apache Spark.
-
under MIT License license
-
Comprehensive Spark example code (Java, Scala)
-
under Apache License 2.0 license
-
Kafka delivery semantics in the case of failure depend on how and when offsets are stored. Spark output operations are at-least-once. So if you want the equivalent of exactly-once semantics, you must either store offsets after an idempotent output, or store offsets in an atomic transaction alongside output. This project shows how Spark Streaming can store Kafka topic offsets in HBase.
-
under Apache License 2.0 license
-
Spring-Shiro-Spark is an attempt at integrating Spring-Boot, Hibernate, Spark, Spark-SQL, Shiro, iView, VueJs, ...
-
under Apache License 2.0 license
-
The AWS Glue Data Catalog is a fully managed, Apache Hive Metastore compatible, metadata repository. Customers can use the Data Catalog as a central repository to store structural and operational metadata for their data. AWS Glue provides out-of-box integration with Amazon EMR that enables customers to use the AWS Glue Data Catalog as an external Hive Metastore. This is an open-source implementation of the Apache Hive Metastore client on Amazon EMR clusters that uses the AWS Glue Data Catalog as an external Hive Metastore. It serves as a reference implementation for building a Hive Metastore-compatible client that connects to the AWS Glue Data Catalog. It may be ported to other Hive Metastore-compatible platforms such as other Hadoop and Apache Spark distributions
-
under Apache License 2.0 license
-
Java Client of the Spark Job Server implementing the arranged Rest APIs
-
under Apache License 2.0 license
-
File compaction tool that runs on top of the Spark framework.
-
under Apache License 2.0 license
-
Former home of the Particle Device Setup library for Android
-
under Apache License 2.0 license
-
Explore, transform, and analyze FHIR data with Apache Spark
-
under Eclipse Public License 1.0 license
-
-
under Apache License 2.0 license
-
Spark-Transformers: Library for exporting Apache Spark MLLIB models to use them in any Java application with no other dependencies.
-
under Apache License 2.0 license
-
A Spark-based data comparison tool at scale that helps software development engineers compare a plethora of pair combinations of possible data sources. Multiple execution modes in multiple environments enable the user to generate a diff report as a Java/Scala-friendly DataFrame or as a file for future use. Comes with out-of-the-box SparkFactory and SparkCompare tools.
-
under GNU General Public License v2.0 license
-
A top down 2D Shooter game
-
under Apache License 2.0 license
-
Pig on Apache Spark
-
under Apache License 2.0 license
-
High Performance Spark Streaming with Direct Kafka in Java
-
under Apache License 2.0 license
-
Apache Spark examples exclusively in Java
-
under GNU General Public License v3.0 license
-
spark is a performance profiling plugin based on sk89q's WarmRoast profiler.
-
under Apache License 2.0 license
-
Spark app that demonstrates reading and writing data to and from MongoDB and BSON files
-
under MIT License license
-
A simple application described in a post on the Spark framework
-
under Eclipse Public License 1.0 license
-
Multiple cursor support for Eclipse IDE
-
under Apache License 2.0 license
-
Spark on Cassandra QuickStart Project
-
under Apache License 2.0 license
-
Examples of Integrating Spark Streaming, Flume, and HBase to solve Streaming problems
-
under Apache License 2.0 license
-
OpenRate Event Rating Engine
-
under BSD 2-Clause "Simplified" License license
-
Spark MLlib wrapper for the Snowball framework
-
under Apache License 2.0 license
-
Java implementation of the Sparkey key value store
-
under BSD 2-Clause "Simplified" License license
-
A framework for writing performant user-defined functions (UDFs) that are portable across a variety of engines including Apache Spark, Apache Hive, and Presto.
-
under Apache License 2.0 license
-
Spark blog examples; blog address: https://blog.csdn.net/youbitch1
-
under Apache License 2.0 license
-
HadoopOffice - Analyze Office documents using the Hadoop ecosystem (Spark/Flink/Hive)
-
under Apache License 2.0 license
-
Example Spark applications that run on Kubernetes and access GCP products, e.g., GCS, BigQuery, and Cloud PubSub
-
under MIT License license
-
Code for tutorial on my blog http://taywils.me/2013/11/05/javasparkframeworktutorial/
-
under Apache License 2.0 license
-
Study demos for Hadoop, HBase, Hive, Spark, Storm, ...
-
under Apache License 2.0 license
-
Makes a dex patch based on the structure of the dex file
-
under Apache License 2.0 license
-
Profiler for large-scale distributed java applications (Spark, Scalding, MapReduce, Hive,...) on YARN.
-
under MIT License license
-
Example of running a Genetic Algorithm (Travelling Salesman) on Apache Spark
-
under GNU Affero General Public License v3.0 license
-
Pluggable server to Stream messages / events to queues like Kafka and other systems
-
under Apache License 2.0 license
-
A distributed implementation of AdaBoost.MH and MP-Boost using Apache Spark
-
under Apache License 2.0 license
-
Using JRecord to build a mapred and mapreduce inputformat for HDFS, MAPREDUCE, PIG, HIVE, Spark, ...
-
under MIT License license
-
Apache Spark Web Monitor Tool, varOne
-
under BSD 3-Clause "New" or "Revised" License license
-
Spark 3.0.0 Structured Streaming Kafka Avro Demo
-
under Apache License 2.0 license
-
[bigdata] Spring Boot + Spark scaffold with related examples
-
under GNU General Public License v3.0 license
-
-
under MIT License license
-
Uncharted Ensemble Clustering is a flexible multi-threaded clustering library for rapidly constructing tailored clustering solutions that leverage the different semantic aspects of heterogeneous data. The library can be used on a single machine using multi-threading or distributed computing using Spark.
-
under Eclipse Public License 2.0 license
-
SPARQL to SQL translation engine for multiple backends, such as DB2, PostgreSQL and Apache Spark
-
under Apache License 2.0 license
-
Time series analysis with Apache Spark based on Chronix
-
under Apache License 2.0 license
-
Running Spark applications leveraging the power of the Spring framework
-
under Apache License 2.0 license
-
Example project to show how to use Kafka from Spark Streaming with the Confluent schema registry
-
under Apache License 2.0 license
-
A solution describing a data-processing design pattern for streaming data through Kinesis and Spark Streaming in real time.
-
under Apache License 2.0 license
-
Bullet is a streaming query engine that can be plugged into any singular data stream using a Stream Processing framework like Apache Storm, Spark or Flink.
-
under Apache License 2.0 license
-
A streaming alternative to Zipkin's collector
-
under MIT License license
-
ViraPipe is an Apache Spark-based scalable pipeline for metagenome analysis from NGS read data
-
under Apache License 2.0 license
-
Tablasco is a JUnit rule for comparing tables and a Spark module for comparing large data sets
-
under Apache License 2.0 license
-
HadoopCV: read video with Hadoop and Spark!
-
under Apache License 2.0 license
-
Demonstrates creating a healthcare data warehouse using the MIMIC-III dataset on Redshift, Spark on EMR and Lambda
-
under GNU Affero General Public License v3.0 license
-
JPMML-SparkML plugin for converting XGBoost4J-Spark models to PMML
-
under Apache License 2.0 license
-
:sparkling_heart: Pure Java binding for dear-imgui
-
under Apache License 2.0 license
-
Spark (http://sparkjava.com/) support for Swagger (https://swagger.io/)
-
under Apache License 2.0 license
-
Spark mainframe connector
-
under Apache License 2.0 license
-
NetFlow data source for Spark SQL and DataFrames
-
under Apache License 2.0 license
-
Super-fast Spark RDD for Titan Graph Database on HBase
-
under Apache License 2.0 license
-
Examples demonstrating how to use Amazon S3 Inventory to analyze your S3 storage using Spark and EMR.
-
under Apache License 2.0 license
-
This code is used to build & run a Docker container for performing predictions against a Spark ML Pipeline.
-
under MIT License license
-
A Datalog API for Spark
-
under Apache License 2.0 license
-
-
under Apache License 2.0 license
-
Allows testing Spark Web Framework based applications through HTTP
-
under Apache License 2.0 license
-
Data analysis using Hadoop/Spark/Storm/ElasticSearch/ML etc. These are my daily notes, code, and demos. Don't fork, just star!
-
under Apache License 2.0 license
-
-
under MIT License license
-
Eclipse plugin to generate builders
-
under MIT License license
-
Developing Spark External Data Sources using the V2 API
-
under Apache License 2.0 license
-
Collaborative filtering with MLLib on Spark based on data in Cassandra