helk-analytics
+ Init file and Dockerfile updated with Spark version 2.3.0
+Jupyter Notebook from getting started folder updated
+ New jupyter notebook with graphframes example presented in BSColumbus 2018
helk-elk
+ Added properties to elasticsearch config file to set it as a standalone cluster. (It helps for when elasticsearch is restarted)
+ Updated Dashboards
+ Updated Kibana timeout to 60000
+ Updated Logstas - elasticsearch mapping templates after renaming fields.
+ Updated logstash filters renaming fields keeping a new flat schema. No more nested fields style.
helk-kafka
+ Updated Log retention hours to 2 hours
Resources:
- Created README to share all the blog posts, documentes and presentations that helped me to work on the HELK
Scripts
+ Deprecated most of the scripts used before to install ELK via TAR and DEB. Also deprecated scripts to updated geoip database.
HELK Design
+ moved everything to docker-compose approach for a more modular design.
+ separated the HELK in 3 services:
++helk-elk, helk-kafka, helk-analytics
+ Updated Design picture to show WEF ideas and also show Jupyter Lab integrations.
HELK Docker-Compose
+ Added ESDATA volume to keep logs after contaners get stopped
+ Services restart automatically after reboot
+ created blank env file for Kafka service. This allows the host to pass its own local IP to Kafka. This is needed for advertised listener configs on each broker.
HELK-ELK Version
- Updated to 6.2.2
ELasticsearch
- Added local docker network as part of the network.host option. This allows the HELK-ELK service to publish its docker local IP to other services/images in the docker compose environment.
Logstash
+ minimal updates to certain configs (Mainly renaming files and replacing certain strings)
Kibana
+ enableExternalUrls set to true for Vega visualization that need external libraries.
Spark - Analytics
+ Renamed service to Analytics
+ Integrated Apache Toree to allow Scala kernel in Jupyter
+ Pyspark, Scala and SQL are now available in Jupyter
Jupyter
+ Jupyter LAB has been enabled
Elasticsearch
+ Deleted Docker elasticsearch config file (Duplicate)
Logstash
+ Adjusted Batch size to 300 (Testing)
+ Renamed scripts to follow a standard naming convention
+ Added a fingerprint filter to all logs to help reduce duplicate logs
+ Removed ELK Version strings from all Logstash configs so that I dont have to update every single script every time ELK gets updated.
+ Added Document_id to every logstash output config to take the fingerprint value.
Kibana
+ Renamed Index Patterns to standard naming convention.
+ Added experimental visualization vega setting. Enabling External URLs to use D3 libraries from their repos. This is grayed out in the Kibana config so user will have to enable it.
+ Updated name of index patterns across all visualizations and dashboards.
Kafka
+ Log retention is now 24 hours and not 268 Hours
+ added auto_offset_reset => "earliest" to beats kafka input config
Spark
+ updated es-hadoop version to 6.2.0 and added new spark jar packages: org.apache.spark:spark-sql-kafka-0-10_2.11:2.2.1 & databricks:spark-sklearn:0.2.3
+ Created an init file to run spark and jupyter all together as a service. This will allow us to restart jupyter and pyspark gracefully.
Winlogbeat
+ Updated Winlogbeat config to take PowerShell and Microsoft-Windows-WMI-Activity/Operational logs.
New Features
+ Cerebro
+ Python packages:
-scipy==1.0.0
scikit-learn==0.19.1
nltk==3.2.5
matplotlib==2.1.2
seaborn==0.8.1
datasketch==1.2.5
tensorflow==1.5.0
keras==2.1.3
pyflux==0.4.15
imbalanced-learn==0.3.2
lime==0.1.1.29
Docker Hub
+ New HELK image available
+ @bsisco via Issue #19 let us know that communication between systems and kafka was not working. I forgot to expose the right ports when running the HELK Docker image after being pulled.
+ ELK 6.1.3 version (Jun 30,2018 release)
+ Kafka Integration
-- Bash, DockerFile & Docker Image
+ Replaced ELK DEB Install Packages for TAR packages (Easier deployement and more control)
+ Logstash: JVM Heap 2GB default
+ ELK (Init Files created)
-- More control over service start
+ Left Linux DEB install bash script (deprecating it in next release)
+ ELK .yml files are not available to adjust deployment in an easier way.
+ Fixed Docker Run environment parameters to be call before pointing to the HELK image.
+ Edited every single file to have the right headers:
-- ELK version 6.1.3
-- Aplha Version
-Using Official Docker install script known as convenience script
- Saved a copy of the convenience script (Edge version) locally just in case (Script needs to be modified if it is intended to use in production.
-Moved spark folder out of enrichments to root.
- Removed ipython & inotebook deb packages. Jupyter is installed via PIP only.
- Added new contributor to README
While installing the HELK from local bash script, process did not go further in "Creating Kibana index-patterns, dashboards and visualizations automatically.." step. After some debugging, the problem detected in helk_kibana_setup.sh script which uses "curl". "curl" is not installed by default in 16.04.2 Ubuntu. As a conclusion, installation of "curl" was added to this script.
- kibana index patter creation script needed an update
- install script updated to be executed without sh
- updated sysmon template name to match sysmon logstash sysmon output config
- Logstash
-- Cleaned output configurations
-- Created Sysmon teamplte
-- Added sysmon template to sysmon elasticsearch output
-- Removed sniffing = True from every elasticsearch output
- Scripts
-- Updated Install config
-- Added creation of Kibana index patterns to install script
-- Added headers to every script but posh script
-- renamed scripts to keep naming standard helk-*
- Updated PowerShell Filter and output to also parse 400 and 600
- Split winlogbeat output to show new indices
-- sysmon
-- application
-- system
-- security
-- powershell
- Removed neo4j install (replacing it with something that could scale)
- Added creation of folder /op/helk and cron job in helk_install script
- updated sysmon logstash script to grap intelligence from the new path /opt/helk/otx