databricks/reference-apps

Scala


Description: Spark reference applications

Language: Scala

License: NOASSERTION

Stars: 649

Forks: 335

Open issues: 33

Created: 2014-08-19T01:09:11Z

Pushed: 2024-10-03T17:34:54Z

Default branch: master

Fork: no

Archived: no

README:

Databricks Reference Apps

At Databricks, we are developing a set of reference applications that demonstrate how to use Apache Spark. This book/repo contains the reference applications.

The reference applications are for readers who want to learn Spark and who learn best by example. Browse the applications, find features similar to the ones you want to build, and adapt the code samples to your needs. This is also meant to be a practical guide to using Spark in your systems, so the applications mention other technologies that are compatible with Spark, such as file systems suited to storing massive data sets.

  • Log Analysis Application - The log analysis reference application contains a series of tutorials for learning Spark by example, as well as a final application that can be used to monitor Apache access logs. The examples use Spark in batch mode and also cover Spark SQL and Spark Streaming.
  • Weather TimeSeries Data Application with Cassandra - This reference application works with weather data recorded for a given weather station at a given point in time. The app demonstrates several strategies for using Spark Streaming with Apache Cassandra and Apache Kafka for fast, fault-tolerant streaming computations on time series data.
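
To give a feel for the kind of processing the log analysis application is built around, here is a minimal sketch in plain Scala that parses Apache access log lines and counts response codes. It is not code from this repository: the names `AccessLogEntry`, `LogSummary`, and `responseCodeCounts` are hypothetical, the sample lines are made up, and the regular expression assumes logs in the Apache Common Log Format.

```scala
// Hypothetical sketch: none of these names come from the repository.
final case class AccessLogEntry(
    ip: String,
    timestamp: String,
    method: String,
    endpoint: String,
    responseCode: Int,
    contentSize: Long
)

object AccessLogEntry {
  // Assumed input: Apache Common Log Format, i.e.
  // host ident authuser [timestamp] "method endpoint protocol" status bytes
  private val Pattern =
    """^(\S+) (\S+) (\S+) \[([^\]]+)\] "(\S+) (\S+) (\S+)" (\d{3}) (\d+|-)$""".r

  // Returns None for lines that do not match the assumed format.
  def parse(line: String): Option[AccessLogEntry] = line match {
    case Pattern(ip, _, _, ts, method, endpoint, _, code, size) =>
      // A size of "-" means no body was sent; treat it as zero bytes.
      val bytes = if (size == "-") 0L else size.toLong
      Some(AccessLogEntry(ip, ts, method, endpoint, code.toInt, bytes))
    case _ => None
  }
}

object LogSummary {
  // Counts parsed entries per HTTP response code, skipping malformed lines.
  def responseCodeCounts(lines: Seq[String]): Map[Int, Int] =
    lines
      .flatMap(line => AccessLogEntry.parse(line))
      .groupBy(_.responseCode)
      .map { case (code, entries) => code -> entries.size }

  def main(args: Array[String]): Unit = {
    // Made-up sample data for illustration only.
    val sample = Seq(
      """127.0.0.1 - - [19/Aug/2014:01:09:11 +0000] "GET /index.html HTTP/1.1" 200 1024""",
      """127.0.0.1 - - [19/Aug/2014:01:09:12 +0000] "GET /missing HTTP/1.1" 404 -""",
      "not a log line"
    )
    responseCodeCounts(sample).toSeq.sortBy(_._1).foreach {
      case (code, count) => println(s"$code: $count")
    }
  }
}
```

The sketch uses ordinary Scala collections so that it runs without a cluster. In a Spark job, a parsing function of this shape could plausibly be applied to each line of a distributed dataset with the same `flatMap`, with the counting step expressed through Spark's own aggregation operations.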

These reference apps are subject to the license terms covered here.