# HELK
[![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)
[![GitHub issues-closed](https://img.shields.io/github/issues-closed/Cyb3rward0g/HELK.svg)](https://GitHub.com/Cyb3rWard0g/HELK/issues?q=is%3Aissue+is%3Aclosed)
[![Twitter](https://img.shields.io/twitter/follow/THE_HELK.svg?style=social&label=Follow)](https://twitter.com/THE_HELK)
[![Open Source Love](https://badges.frapsoft.com/os/v1/open-source.png?v=103)](https://github.com/ellerbrock/open-source-badges/)
[![stability-alpha](https://img.shields.io/badge/stability-alpha-f4d03f.svg)](https://github.com/mkenney/software-guides/blob/master/STABILITY-BADGES.md#alpha)
The Hunting ELK, or simply the HELK, is one of the first open source hunt platforms with advanced analytics capabilities such as a declarative SQL language, graphing, structured streaming, and even machine learning via Jupyter notebooks and Apache Spark over an ELK stack. This project was developed primarily for research, but thanks to its flexible design and core components, it can be deployed in larger environments with the right configurations and scalable infrastructure.
![alt text](resources/images/HELK_Design.png "HELK Infrastructure")
# Goals
* Provide an open source hunting platform to the community and share the basics of Threat Hunting.
* Expedite the time it takes to deploy a hunt platform.
* Improve the testing and development of hunting use cases in an easier and more affordable way.
* Enable Data Science capabilities while analyzing data via Apache Spark, GraphFrames & Jupyter Notebooks.
# Current Status: Alpha
The project is currently in an alpha stage, which means that the code and the functionality are still changing. We haven't yet tested the system with large data sources or across many scenarios. We invite you to try it and welcome any feedback.
## Docs:
* [Introduction](https://github.com/Cyb3rWard0g/HELK/wiki)
* [Architecture Overview](https://github.com/Cyb3rWard0g/HELK/wiki/Architecture-Overview)
* [Kafka](https://github.com/Cyb3rWard0g/HELK/wiki/Kafka)
* [Logstash](https://github.com/Cyb3rWard0g/HELK/wiki/Logstash)
* [Elasticsearch](https://github.com/Cyb3rWard0g/HELK/wiki/Elasticsearch)
* [Kibana](https://github.com/Cyb3rWard0g/HELK/wiki/Kibana)
* [Spark](https://github.com/Cyb3rWard0g/HELK/wiki/Spark)
* [Installation](https://github.com/Cyb3rWard0g/HELK/wiki/Installation)
# Resources
* [Welcome to HELK! : Enabling Advanced Analytics Capabilities](https://cyberwardog.blogspot.com/2018/04/welcome-to-helk-enabling-advanced_9.html)
* [Spark](https://spark.apache.org/docs/latest/index.html)
* [Spark Standalone Mode](https://spark.apache.org/docs/latest/spark-standalone.html)
* [Setting up a Pentesting.. I mean, a Threat Hunting Lab - Part 5](https://cyberwardog.blogspot.com/2017/02/setting-up-pentesting-i-mean-threat_98.html)
* [An Integrated API for Mixing Graph and Relational Queries](https://cs.stanford.edu/~matei/papers/2016/grades_graphframes.pdf)
* [Graph queries in Spark SQL](https://www.slideshare.net/SparkSummit/graphframes-graph-queries-in-spark-sql)
* [Graphframes Overview](http://graphframes.github.io/index.html)
* [Elastic Products](https://www.elastic.co/products)
* [Elastic Subscriptions](https://www.elastic.co/subscriptions)
* [Elasticsearch Guide](https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html)
* [spujadas elk-docker](https://github.com/spujadas/elk-docker)
* [deviantony docker-elk](https://github.com/deviantony/docker-elk)
# Author
* Roberto Rodriguez [@Cyb3rWard0g](https://twitter.com/Cyb3rWard0g) [@THE_HELK](https://twitter.com/THE_HELK)
# Current Committers
* Nate Guagenti [@neu5ron](https://twitter.com/neu5ron)
# Contributing
There are a few things that I would like to accomplish with the HELK, as shown in the To-Do list below. I would love to make the HELK a stable build for everyone in the community. If you are interested in making this build more robust and adding some cool features to it, PLEASE feel free to submit a pull request. #SharingIsCaring
# License: GPL-3.0
[HELK's GNU General Public License](https://github.com/Cyb3rWard0g/HELK/blob/master/LICENSE)
# To-Do
- [ ] Kubernetes Cluster Migration
- [ ] OSQuery Data Ingestion
- [ ] MITRE ATT&CK mapping to logs or dashboards
- [ ] Cypher for Apache Spark Integration (Adding option for Zeppelin Notebook)
- [ ] Test and integrate Neo4j Spark connectors with build
- [ ] Add more network data sources (e.g., Bro)
- [ ] Research & integrate Spark structured direct streaming
- [ ] Packer Images
- [ ] Terraform integration (AWS, Azure, GC)
- [ ] Add more Jupyter Notebooks to teach the basics
- [ ] Auditd beat integration
More coming soon...