Commit Graph

1 Commits (c18aac2f5136cf86e906ca67f89c673e5895e54e)

Author SHA1 Message Date
Roberto Rodriguez 10de1b6b0a HELK 6.2.4-050318
## Overall
+ Removed the Init files dependencies on all containers
+ Added more resources to the resources folder (papers and presentations)
+ Updated to-do list on main README
+ Removed the static network setting to address overlapping network issues (https://github.com/Cyb3rWard0g/HELK/issues/43)
+ Updated Wiki and added new images to it
+ Started documenting potential error messages or bugs with a few quick fixes

## Helk Install Script
+ Script now collects information about available memory and disk size, on Linux hosts only. It continues only if the box hosting the HELK has at least 12GB of RAM and 50GB of disk available. (This can be overridden manually by editing the helk_install script before installing the HELK)
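
The resource check described above can be sketched in Python as follows. This is only an illustration of the 12GB / 50GB gate, not the helk_install script itself: the function names, the use of `MemAvailable` from `/proc/meminfo`, and the choice of `/` as the disk path are all assumptions made for the example.

```python
import shutil

MIN_MEMORY_GB = 12  # threshold stated in the commit message
MIN_DISK_GB = 50    # threshold stated in the commit message


def available_memory_gb(meminfo_text):
    """Return the MemAvailable value from /proc/meminfo-style text, in GB.

    Assumption: the check uses MemAvailable; the real script may read a
    different field or use another tool.
    """
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            kib = int(line.split()[1])  # /proc/meminfo reports values in kB
            return kib / (1024 * 1024)
    raise ValueError("MemAvailable not found in meminfo text")


def available_disk_gb(path="/"):
    """Return the free space on the filesystem holding `path`, in GB."""
    return shutil.disk_usage(path).free / (1024 ** 3)


def meets_requirements(memory_gb, disk_gb):
    """True only if both thresholds are met."""
    return memory_gb >= MIN_MEMORY_GB and disk_gb >= MIN_DISK_GB


def check_host(meminfo_path="/proc/meminfo", disk_path="/"):
    """Hypothetical entry point: read the Linux host's values and apply the gate."""
    with open(meminfo_path) as f:
        memory_gb = available_memory_gb(f.read())
    return meets_requirements(memory_gb, available_disk_gb(disk_path))
```

Calling `check_host()` on a Linux box returns `False` when either value falls below its threshold, which mirrors the "only continues if" behaviour described in the bullet.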

## ELK Stack
+ Started using Elastic Docker Images as a base
+ Updated ELK stack to 6.2.4 version
+ X-Pack Basic Free License attached to build automatically
+ Monitoring capabilities are now enabled in the build (this is why Cerebro was removed)

## Spark
+ Integrated Spark Standalone Cluster Manager
+ Spark Node running with Jupyter Notebook now points to the Helk-Spark-Master container for any execution of code
+ Added Spark Master and Worker Docker Images
+ Build now runs with 2 Workers and 1 Master by default
+ Apache Arrow is enabled for Pandas Dataframe optimization
+ Created Spark-Base Docker Image (Applied to the Jupyter Image)
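
As a rough sketch of what the Arrow and Spark Master items above imply for a notebook session: in Spark 2.3 and 2.4, Arrow-based conversion to Pandas is controlled by the `spark.sql.execution.arrow.enabled` property (renamed in Spark 3.0). The commit message does not state the Spark version, the master URL, or how the setting is applied, so the URL below (derived from the Helk-Spark-Master container name and Spark's default standalone port 7077), the application name, and the helper functions are assumptions for illustration only.

```python
# Assumed master URL: container name from the commit message, default standalone port.
SPARK_MASTER_URL = "spark://helk-spark-master:7077"


def arrow_conf():
    """Spark properties that turn on Arrow for Pandas conversion (Spark 2.3/2.4 key)."""
    return {"spark.sql.execution.arrow.enabled": "true"}


def build_session(app_name="helk-arrow-demo"):
    """Create a session against the standalone master with Arrow enabled.

    pyspark is imported lazily so the module can be loaded without it installed.
    """
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name).master(SPARK_MASTER_URL)
    for key, value in arrow_conf().items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


def to_pandas_sample(spark):
    """Convert a small Spark DataFrame to Pandas; this path uses Arrow when enabled."""
    return spark.range(1000).toPandas()
```

With a reachable master, `to_pandas_sample(build_session())` would return a Pandas DataFrame of 1000 rows; pyarrow must be installed on the node running the notebook for the Arrow path to be used.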

## Kafka
+ Kafka Container was split into Kafka Brokers and one Zookeeper
+ Helk runs with 2 Kafka Brokers and 1 Zookeeper by default

## Jupyter Container
+ Preparing to add Zeppelin Notebook. The Analytics container is now named Jupyter. It builds on top of the Spark-Base image and installs the necessary packages
+ New packages were added:
++ nxviz
++ hiveplot
++ pyarrow
+ Apache Arrow is now enabled on the Jupyter node to optimize the use of Pandas DataFrames
2018-05-03 15:54:12 -04:00