5 Commits (251870c92c818eae263db6793db44ef173e29593)

Author SHA1 Message Date
Roberto Rodriguez 3baec9f79c Updated post-install info & Resources 2018-05-04 00:35:45 -04:00
Roberto Rodriguez 10de1b6b0a HELK 6.2.4-050318
## Overall
+ Removed the Init files dependencies on all containers
+ Added more resources to the resources folder (papers and presentations)
+ Updated to-do list on main README
+ Removed the static network setting to address overlapping network issues (https://github.com/Cyb3rWard0g/HELK/issues/43)
+ Updated Wiki and added new images to it
+ Started documenting potential error messages or bugs with a few quick fixes

## Helk Install Script
+ Script now collects information about available memory and disk size for LINUX hosts ONLY. It only continues if the box hosting the HELK has at least 12GB of RAM and 50GB of disk available. (This can be overridden manually by editing the helk_install script before installing the HELK)
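
A minimal sketch of the kind of pre-install check described above, written in Python for illustration only. It is not the helk_install script itself: the function names, the use of `/proc/meminfo`, and the `/` mount point are assumptions. Only the 12GB and 50GB thresholds come from the commit message.

```python
import shutil

MIN_RAM_GB = 12   # threshold stated in the commit message
MIN_DISK_GB = 50  # threshold stated in the commit message


def available_memory_gb(meminfo_path="/proc/meminfo"):
    """Return MemAvailable in GiB from a Linux meminfo file, or None if unavailable."""
    try:
        with open(meminfo_path) as fh:
            for line in fh:
                if line.startswith("MemAvailable:"):
                    kib = int(line.split()[1])  # the value is reported in KiB
                    return kib / (1024 ** 2)
    except OSError:
        return None  # not a Linux host, or the file is unreadable
    return None


def available_disk_gb(path="/"):
    """Return free disk space in GiB for the filesystem containing `path`."""
    return shutil.disk_usage(path).free / (1024 ** 3)


def meets_requirements(ram_gb, disk_gb, min_ram=MIN_RAM_GB, min_disk=MIN_DISK_GB):
    """True only when both memory and disk reach the minimums."""
    return ram_gb is not None and ram_gb >= min_ram and disk_gb >= min_disk


if __name__ == "__main__":
    ram = available_memory_gb()
    disk = available_disk_gb()
    print("RAM (GiB):", ram, "Disk (GiB):", round(disk, 1))
    print("OK to install" if meets_requirements(ram, disk) else "Below minimum requirements")
```

Lowering `MIN_RAM_GB` or `MIN_DISK_GB` in this sketch plays the same role as the manual override mentioned in the commit message.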

## ELK Stack
+ Started using Elastic Docker Images as a base
+ Updated ELK stack to 6.2.4 version
+ X-Pack Basic Free License attached to build automatically
+ Monitoring capabilities are now enabled in the build (which is why Cerebro was removed)

## Spark
+ Integrated Spark Standalone Cluster Manager
+ Spark Node running with Jupyter Notebook now points to the Helk-Spark-Master container for any execution of code
+ Added Spark Master and Worker Docker Images
+ Build now runs with 2 Workers and 1 Master by default.
+ Apache Arrow is enabled for Pandas DataFrame optimization
+ Created Spark-Base Docker Image (Applied to the Jupyter Image)
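
As an illustration of the Spark changes above, the sketch below renders two settings in `spark-defaults.conf` form. It is not taken from the HELK build. The master host name `helk-spark-master` and port `7077` (the default standalone port) are assumptions, and `render_spark_defaults` is a hypothetical helper. `spark.sql.execution.arrow.enabled` is the Spark 2.3 property that turns on Arrow for Pandas conversions.

```python
# Assumed values: the host name and port are illustrative, not confirmed by the commit.
SPARK_SETTINGS = {
    "spark.master": "spark://helk-spark-master:7077",
    "spark.sql.execution.arrow.enabled": "true",  # Arrow-backed Pandas conversion (Spark 2.3)
}


def render_spark_defaults(settings):
    """Render settings as spark-defaults.conf lines: key, whitespace, value."""
    width = max(len(key) for key in settings)
    return "\n".join(
        f"{key.ljust(width)}  {value}" for key, value in sorted(settings.items())
    )


print(render_spark_defaults(SPARK_SETTINGS))
```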

## Kafka
+ Kafka Container was split into Kafka Brokers and one Zookeeper
+ Helk runs with 2 Kafka Brokers and 1 Zookeeper by default

## Jupyter Container
+ Preparing to add Zeppelin Notebook. The Analytics container is now named Jupyter. It builds on top of the Spark-Base image and installs the necessary packages
+ New packages were added:
++ nxviz
++ hiveplot
++ pyarrow
+ Apache Arrow is now enabled on the Jupyter node to optimize the use of Pandas DataFrames
2018-05-03 15:54:12 -04:00
Roberto Rodriguez 11c8720fe4 Updated Resources Links 2018-03-03 23:58:40 -05:00
Roberto Rodriguez 92d105ce51 Updated Apache Spark Kafka jar 2.3.0 2018-03-03 23:44:09 -05:00
Roberto Rodriguez 5859ba3dab HELK 6.2.2 - 030318
helk-analytics
+ Init file and Dockerfile updated with Spark version 2.3.0
+ Jupyter Notebook from getting started folder updated
+ New Jupyter notebook with the GraphFrames example presented at BSColumbus 2018

helk-elk
+ Added properties to the Elasticsearch config file to set it as a standalone cluster. (This helps when Elasticsearch is restarted)
+ Updated Dashboards
+ Updated Kibana timeout to 60000
+ Updated Logstash - Elasticsearch mapping templates after renaming fields.
+ Updated Logstash filters to rename fields into a new flat schema. Nested fields are no longer used.
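
To illustrate what moving from nested fields to a flat schema means, here is a small Python sketch. It is not the Logstash filter configuration from this commit. The field names, the `_` separator, and the `flatten` helper are hypothetical.

```python
def flatten(event, parent="", sep="_"):
    """Collapse nested dictionaries into a single level of joined field names."""
    flat = {}
    for key, value in event.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))  # recurse into nested fields
        else:
            flat[name] = value
    return flat


# Hypothetical event using nested-style fields
nested = {
    "process": {"name": "cmd.exe", "parent": {"name": "explorer.exe"}},
    "host": "ws01",
}
flat = flatten(nested)
print(flat)
```

Flat field names mean the Elasticsearch mapping templates need one entry per field and no nested object definitions, which fits the template update listed alongside this change.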

helk-kafka
+ Updated log retention to 2 hours

Resources:
- Created README to share all the blog posts, documents and presentations that helped me to work on the HELK

Scripts
+ Deprecated most of the scripts previously used to install ELK via TAR and DEB. Also deprecated the scripts used to update the GeoIP database.
2018-03-03 21:15:35 -05:00