Sensor Data Analysis for HackZurich

Based on SHMACK, deploy the components that perform sensor data ingestion and analysis in a DC/OS cluster in the AWS cloud. The data is received from the corresponding smartphone apps for iOS or Android.

What kind of sensor data?

So far, we expect and are prepared to receive the kinds of data described here: https://github.com/Zuehlke/hackzurich-sensordata-ios/blob/master/README.md

Getting started

To run the sensor data analysis in the cloud, you need a running cluster.

For HackZurich: Please form teams and share a cluster. We will provide credentials that each team can use to start its cluster. Each team should decide on a team name so that we can identify the credentials and resources assigned to it. We will also help you get your environment ready to work, so don't hesitate to ask us. You will most likely avoid plenty of problems if you run SHMACK in an Ubuntu 16.04 VM and use IntelliJ IDEA as your IDE, as it still has an edge over Eclipse with ScalaIDE when it comes to supporting Scala language features and integrating Gradle.

Setting up a DC/OS cluster for this purpose works best using SHMACK with the following settings:

- Stick to DC/OS 1.7. The newer version 1.8 was released on September 15, 2016; it is bleeding edge and may bleed a little too much for now :-) Setting TEMPLATE_URL="https://s3-us-west-1.amazonaws.com/shmack/single-master.cloudformation.json" in shmack_env selects DC/OS 1.7.
- Since Spark favors in-memory computation and Cassandra/Kafka/HDFS provide their own redundancy, consider switching SLAVE_INSTANCE_TYPE in shmack_env to a memory-optimized EC2 instance type, e.g. r3.xlarge. These instances are a bit more expensive, but significantly increase the amount of data you can handle per node compared to the default (m3.xlarge). Spot instances are a good choice if you want to save money and don't mind losing part or all of your cluster when you get outbid.
- You may add all the packages you will need anyway in create-stack.sh: INSTALL_PACKAGES="spark cassandra kafka marathon marathon-lb", so they are installed when you create your cluster using the CloudFormation template.
- For HackZurich: Set STACK_NAME in shmack_env to your team's name.
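
The settings above can be sketched as plain shell assignments. This is only an illustration under assumptions: the variable names and values come from the list above, the team name `my-team-name` is a hypothetical placeholder, and the actual files in SHMACK may declare these variables differently (for example with `export`), so adapt the existing lines rather than pasting this block in.

```shell
# Sketch of the overrides described above (not the actual file contents).

# In shmack_env:
# CloudFormation template that selects DC/OS 1.7
TEMPLATE_URL="https://s3-us-west-1.amazonaws.com/shmack/single-master.cloudformation.json"
# Memory-optimized instance type instead of the default m3.xlarge
SLAVE_INSTANCE_TYPE="r3.xlarge"
# Hypothetical placeholder: replace with your team's name
STACK_NAME="my-team-name"

# In create-stack.sh:
# Packages installed right away when the cluster is created
INSTALL_PACKAGES="spark cassandra kafka marathon marathon-lb"
```

Keeping the team name to letters, digits, and hyphens is a safe choice, since it is used as the name of the CloudFormation stack.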
Now you can go through the steps described in https://github.com/Zuehlke/SHMACK from Installation to Stack Creation.
Once the stack is up and running and you see on the DC/OS master dashboard that the services are healthy, you can deploy the Sensor Ingestion Akka REST Service.
- For HackZurich: Just to get started, you don't need to build and push the application to Docker yourself; you can use the Docker image bwedenik/sensor-ingestion as defined in sensor-ingestion-options.json. You only need to build and deploy your own Docker images after you make changes to the ingestion service.
- For HackZurich: Once you have verified that Sensor Ingestion is running properly, tell us your endpoint URL so that we can forward you the sensor readings we receive at the main ingestion cluster. You may also use your own endpoint with the iOS or Android app if you don't need other teams to see your sensor readings.
Run the KafkaToCassandra Spark streaming job. It creates the Cassandra tables that make it much nicer to query and visualize data in Zeppelin.
Now you are basically done: you have a cluster running Mesos, a reactive Akka web service that stores incoming data in Kafka, a Spark Streaming job reading from Kafka and writing to Cassandra, and Zeppelin to interactively explore the recorded data. In addition, you will find some more examples in this repository, e.g. a Simple Spark Testrun that can be a basis for your own jobs, and KafkaS3Backup, which illustrates how to access external resources, in particular on AWS.
For HackZurich: You may now focus on the challenge, good luck!

Affiliation within Zühlke

Initiated for the purpose of HackZurich, main contributions from members of:

Zühlke Focusgroup - Big Data / Cloud
Zühlke Focusgroup - Data Analytics
Zühlke Focusgroup - Reactive Computing

License Details

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

   http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

