Apache Mahout

Apache Mahout is a project of the Apache Software Foundation to produce free implementations of distributed or otherwise scalable machine learning algorithms, focused primarily on the areas of collaborative filtering, clustering and classification. Many of the implementations use the Apache Hadoop platform. [2] [3] Mahout also provides Java libraries for common math operations and Java primitive collections. Mahout is a work in progress; the number of implemented algorithms has grown quickly, [4] but various algorithms are still missing.

Apache Beam

Apache Beam is an open source unified programming model for defining and executing data processing pipelines, including ETL, batch and stream (continuous) processing. [1] Beam pipelines are defined using one of the provided SDKs and executed on one of Beam's supported runners (distributed processing back-ends), including Apache Apex, Apache Flink, Apache Spark, and Google Cloud Dataflow. [2]
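
The separation between defining a pipeline and executing it on a runner can be illustrated with a toy sketch. This is not the Beam SDK: the names Pipeline, LocalRunner, flat_map and count_per_element are hypothetical and exist only for this illustration, and the code uses nothing beyond the Python standard library. It shows the general idea only, namely that a pipeline is a description of steps and a separate runner decides how to execute them.

```python
from collections import Counter
from typing import Any, Callable, Iterable, List

Step = Callable[[Iterable[Any]], Iterable[Any]]


class Pipeline:
    """Hypothetical toy pipeline: records steps without running them."""

    def __init__(self) -> None:
        self.steps: List[Step] = []

    def apply(self, step: Step) -> "Pipeline":
        self.steps.append(step)
        return self  # returning self allows chained calls


def flat_map(fn: Callable[[Any], Iterable[Any]]) -> Step:
    """Build a step that expands each input item into zero or more outputs."""
    return lambda items: (out for item in items for out in fn(item))


def count_per_element() -> Step:
    """Build a step that counts occurrences of each distinct item."""
    return lambda items: sorted(Counter(items).items())


class LocalRunner:
    """Hypothetical runner: executes the recorded steps in-process, in order."""

    def run(self, pipeline: Pipeline, source: Iterable[Any]) -> List[Any]:
        data: Iterable[Any] = source
        for step in pipeline.steps:
            data = step(data)
        return list(data)


# Define the pipeline once; nothing is executed at this point.
pipeline = Pipeline().apply(flat_map(str.split)).apply(count_per_element())

# Hand the definition to a runner to execute it.
result = LocalRunner().run(pipeline, ["beam flink", "beam spark"])
print(result)  # [('beam', 2), ('flink', 1), ('spark', 1)]
```

In this sketch a different runner class could execute the same pipeline object in another way (for example, in parallel) without any change to the pipeline definition, which is the property the runner model is meant to provide.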