Initial Documentation
Added Jupyter Book (Easy) structure for initial docs.
Migrated GitHub Wiki -> GitHub pages Jekyll Static site
Cyb3rWard0g committed Jan 21, 2020
# HELK [Alpha]

[![License: GPL v3](](
[![GitHub issues-closed](](
[![Open Source Love](](

The Hunting ELK or simply the HELK is one of the first open source hunt platforms with advanced analytics capabilities such as SQL declarative language, graphing, structured streaming, and even machine learning via Jupyter notebooks and Apache Spark over an ELK stack. This project was developed primarily for research, but due to its flexible design and core components, it can be deployed in larger environments with the right configurations and scalable infrastructure.

@@ -20,23 +21,7 @@

The project is currently in an alpha stage, which means that the code and the functionality are still changing. We have not yet tested the system with large data sources or across a wide range of scenarios. We invite you to try it and welcome any feedback.

# HELK Features

* **Kafka:** A distributed publish-subscribe messaging system that is designed to be fast, scalable, fault-tolerant, and durable.
* **Elasticsearch:** A highly scalable open-source full-text search and analytics engine.
* **Logstash:** A data collection engine with real-time pipelining capabilities.
* **Kibana:** An open source analytics and visualization platform designed to work with Elasticsearch.
* **ES-Hadoop:** An open-source, stand-alone, self-contained, small library that allows Hadoop jobs (whether using Map/Reduce, libraries built upon it such as Hive, Pig or Cascading, or newer libraries like Apache Spark) to interact with Elasticsearch.
* **Spark:** A fast and general-purpose cluster computing system. It provides high-level APIs in Java, Scala, Python and R, and an optimized engine that supports general execution graphs.
* **GraphFrames:** A package for Apache Spark which provides DataFrame-based Graphs.
* **Jupyter Notebook:** An open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text.
* **KSQL:** Confluent KSQL is the open source, streaming SQL engine that enables real-time data processing against Apache Kafka®. It provides an easy-to-use, yet powerful interactive SQL interface for stream processing on Kafka, without the need to write code in a programming language such as Java or Python.
* **Elastalert:** ElastAlert is a simple framework for alerting on anomalies, spikes, or other patterns of interest from data in Elasticsearch.
* **Sigma:** Sigma is a generic and open signature format that allows you to describe relevant log events in a straightforward manner.

# Getting Started

## Docs:

* [Introduction](
* [Architecture Overview](
@@ -47,31 +32,6 @@
* [Spark](
* [Installation](

## (Docker) Accessing the HELK's Containers

By default, the HELK's containers run in the background (detached mode). You can list all running Docker containers with the following command:

```
sudo docker ps
```

```
a97bd895a2b3 cyb3rward0g/helk-spark-worker:2.3.0 "./spark-worker-entr…" About an hour ago Up About an hour>8082/tcp helk-spark-worker2
cbb31f688e0a cyb3rward0g/helk-spark-worker:2.3.0 "./spark-worker-entr…" About an hour ago Up About an hour>8081/tcp helk-spark-worker
5d58068aa7e3 cyb3rward0g/helk-kafka-broker:1.1.0 "./kafka-entrypoint.…" About an hour ago Up About an hour>9092/tcp helk-kafka-broker
bdb303b09878 cyb3rward0g/helk-kafka-broker:1.1.0 "./kafka-entrypoint.…" About an hour ago Up About an hour>9093/tcp helk-kafka-broker2
7761d1e43d37 cyb3rward0g/helk-nginx:0.0.2 "./nginx-entrypoint.…" About an hour ago Up About an hour>80/tcp helk-nginx
ede2a2503030 cyb3rward0g/helk-jupyter:0.32.1 "./jupyter-entrypoin…" About an hour ago Up About an hour>4040/tcp,>8880/tcp helk-jupyter
ede19510e959 cyb3rward0g/helk-logstash:6.2.4 "/usr/local/bin/dock…" About an hour ago Up About an hour 5044/tcp, 9600/tcp helk-logstash
e92823b24b2d cyb3rward0g/helk-spark-master:2.3.0 "./spark-master-entr…" About an hour ago Up About an hour>7077/tcp,>8080/tcp helk-spark-master
6125921b310d cyb3rward0g/helk-kibana:6.2.4 "./kibana-entrypoint…" About an hour ago Up About an hour 5601/tcp helk-kibana
4321d609ae07 cyb3rward0g/helk-zookeeper:3.4.10 "./zookeeper-entrypo…" About an hour ago Up About an hour 2888/tcp,>2181/tcp, 3888/tcp helk-zookeeper
9cbca145fb3e cyb3rward0g/helk-elasticsearch:6.2.4 "/usr/local/bin/dock…" About an hour ago Up About an hour 9200/tcp, 9300/tcp helk-elasticsearch
```

Then pick the container you want to access and open a shell in it with the following command, replacing `<container-name>` with a name from the last column of the listing (for example, `helk-kibana`):

```
sudo docker exec -ti <container-name> bash
```

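
The two steps above can be combined into a small helper. This is only a sketch, not part of HELK: the function names `helk_names` and `helk_exec_cmd` are hypothetical and introduced here for illustration, and the `helk-` name filter assumes the containers keep the naming shown in the listing. `helk_exec_cmd` only prints the command so it can be reviewed before running.

```shell
# Hypothetical helper (not part of HELK): list only the names of running
# containers whose name contains "helk-". Needs a running Docker daemon.
helk_names() {
  sudo docker ps --filter 'name=helk-' --format '{{.Names}}'
}

# Hypothetical helper: print the `docker exec` command for a given
# container name instead of executing it.
helk_exec_cmd() {
  name="$1"
  printf 'sudo docker exec -ti %s bash\n' "$name"
}

# Example: show the command that would open a shell in the Kibana container.
helk_exec_cmd helk-kibana
```

Printing the command rather than running it keeps the sketch safe to try on a machine without HELK installed; pasting the printed line into a terminal on the HELK host opens the shell.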
# Resources

* [Welcome to HELK! : Enabling Advanced Analytics Capabilities](
@@ -0,0 +1,21 @@
source ''

group :jekyll_plugins do
  gem 'github-pages'
  gem 'jekyll-feed', '~> 0.6'

  # Textbook plugins
  gem 'jekyll-redirect-from'
  gem 'jekyll-scholar'

  # Windows does not include zoneinfo files, so bundle the tzinfo-data gem
  gem 'tzinfo-data', platforms: [:mingw, :mswin, :x64_mingw, :jruby]

  # Performance-booster for watching directories on Windows
  gem 'wdm', '~> 0.1.0' if Gem.win_platform?

  # Development tools
  gem 'guard', '~> 2.14.2'
  gem 'guard-jekyll-plus', '~> 2.0.2'
  gem 'guard-livereload', '~> 2.5.2'
