XGBoost [1] is an open-source software library that provides a gradient boosting framework for C++, Java, Python, [2] R, [3] and Julia. [4] It works on Linux, Windows, [5] and macOS. [6] According to the project description, it aims to provide a “Scalable, Portable and Distributed Gradient Boosting (GBM, GBRT, GBDT) Library”. Besides running on a single machine, it also supports the distributed processing frameworks Apache Hadoop, Apache Spark, and Apache Flink. It has gained much popularity and attention recently as it was the algorithm of choice for…
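
As a rough illustration of the gradient boosting idea itself, the sketch below fits a sequence of one-split regression trees ("stumps") to the residuals of the current prediction. It is a minimal pure-Python sketch for squared-error regression on a single feature; the function names and the `learning_rate` and `n_rounds` parameters are hypothetical, and none of this reflects XGBoost's actual API or its optimizations.

```python
# Minimal gradient boosting sketch (squared error, one feature, decision stumps).
# Illustrative only: this is NOT XGBoost's algorithm or API.

def fit_stump(xs, residuals):
    """Return (threshold, left_value, right_value) minimizing squared error."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue  # a split must put points on both sides
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:
        # All x values are identical: fall back to a constant prediction.
        mean = sum(residuals) / len(residuals)
        return (xs[0], mean, mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def fit_boosted(xs, ys, n_rounds=50, learning_rate=0.3):
    """Fit an additive model: mean(ys) plus a sum of scaled stumps."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For squared error, the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * predict_stump(stump, x)
                       for p, x in zip(predictions, xs)]
    return base, stumps, learning_rate


def predict(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


if __name__ == "__main__":
    model = fit_boosted([1, 2, 3, 4, 5, 6], [1, 1, 1, 5, 5, 5])
    print(round(predict(model, 2), 3), round(predict(model, 5), 3))
```

Each round corrects only a fraction (`learning_rate`) of the remaining error, which is why many weak learners are combined rather than one strong one.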

Virtuoso Universal Server

Virtuoso Universal Server is a middleware and database engine hybrid that combines the functionality of a traditional relational database management system (RDBMS), object-relational database (ORDBMS), virtual database, RDF, XML, free-text, web application server, and file server in a single system. Virtuoso is a “universal server”: a single multithreaded server process implements multiple protocols. The open-source edition of Virtuoso Universal Server is also known as OpenLink Virtuoso. The software has been developed by OpenLink Software, with Kingsley Uyi Idehen and Orri Erling as the chief software architects.

Apache Spark

Apache Spark is an open-source cluster-computing framework. Originally developed at the University of California, Berkeley's AMPLab, the Spark codebase was later donated to the Apache Software Foundation, which has maintained it since. Spark provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.


SAP HANA

SAP HANA is an in-memory, column-oriented, relational database management system developed and marketed by SAP SE. [1] [2] Its primary function as a database server is to store and retrieve data as requested by the applications. In addition, it performs advanced analytics (predictive analytics, spatial data processing, text analytics, text search, streaming analytics, graph data processing) and includes ETL capabilities as well as an application server.


Qizx

Qizx is a proprietary XML database that provides native storage for XML data. Qizx was first developed by Xavier Franc of Axyana [1] and was purchased by Qualcomm in 2013. [2] Qizx was re-released by Qualcomm in late 2014 on Amazon Web Services. [3]

Oracle NoSQL Database

Oracle NoSQL Database is a NoSQL-type distributed key-value database from Oracle Corporation. [1] [2] [3] [4] It provides transactional semantics for data manipulation, horizontal scalability, and simple administration and monitoring.


NoSQLz

NoSQLz is a consistent key-value big data store (NoSQL database) for IBM z/OS systems. [1] It was developed by Thierry Falissard in 2013. Its purpose is to provide a low-cost alternative to all proprietary mainframe DBMSs (version 1 is free software).


MonetDB

MonetDB is an open-source column-oriented database management system developed at the Centrum Wiskunde & Informatica (CWI) in the Netherlands. It was designed to provide high performance on complex queries against large databases, such as combining tables with hundreds of columns and millions of rows. MonetDB has been applied in high-performance applications for online analytical processing, data mining, geographic information systems (GIS), [1] Resource Description Framework (RDF), [2] text retrieval, and sequence alignment processing. [3]
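
To make the term "column-oriented" concrete, the sketch below contrasts a row layout with a column layout for the same small table. It is a toy illustration in plain Python, assuming every row has the same fields; the table contents and function names are invented for the example and say nothing about how MonetDB stores data internally.

```python
# Row-oriented vs. column-oriented layout for the same (made-up) table.
# Assumption: every row dictionary has exactly the same keys.

ROWS = [
    {"id": 1, "region": "north", "sales": 120.0},
    {"id": 2, "region": "south", "sales": 80.5},
    {"id": 3, "region": "north", "sales": 42.0},
]


def to_columns(rows):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def total_row_store(rows, field):
    # Every full row is visited even though only one field is needed.
    return sum(row[field] for row in rows)


def total_column_store(columns, field):
    # Only the one contiguous column that the query needs is scanned.
    return sum(columns[field])


if __name__ == "__main__":
    columns = to_columns(ROWS)
    print(columns["region"])
    print(total_row_store(ROWS, "sales"), total_column_store(columns, "sales"))
```

An analytical query that aggregates one field over many rows touches a single list in the column layout, which is the access pattern column stores are built around.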

Predix (software)

Predix is General Electric’s software platform for the collection and analysis of data from industrial machines. [1] General Electric plans to support the growing industrial Internet of things with cloud servers and an app store. [2] GE is a member of the Industrial Internet Consortium, which works on the development and use of industrial internet technologies. [3]

MindSphere

MindSphere is an open cloud platform or “IoT operating system” [1] developed by Siemens for applications in the context of the Internet of Things (IoT). [2] MindSphere stores operational data and makes it accessible through digital applications (“MindApps”) to enable industrial customers to make decisions based on valuable factual information. [3] The system is used in such applications as automated production and vehicle fleet management. [2] [4]

Hibari (database)

Hibari is a highly consistent, highly available, distributed key-value big data store (NoSQL database). [1] It was developed by Cloudian, Inc., formerly Gemini Mobile Technologies, to support its mobile messaging and email services, and was released as open source on July 27, 2010.

Apache Hadoop

Apache Hadoop (/həˈduːp/) is an open-source software framework used for distributed storage and processing of big data sets using the MapReduce programming model. It runs on computer clusters built from commodity hardware. All the modules in Hadoop are designed with the fundamental assumption that hardware failures are common occurrences and should be handled automatically by the framework. [2]
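
The MapReduce programming model mentioned above can be sketched in a few lines: a map step emits key-value pairs, a shuffle step groups them by key, and a reduce step combines each group. The following is a single-process word count in plain Python, intended only to show the shape of the model; the function names are hypothetical and it does not use Hadoop's API or distribute any work.

```python
# Single-process sketch of the MapReduce model (word count).
# Illustrative only: no Hadoop API is used and nothing runs on a cluster.
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "Big clusters"]))
```

Because each map call sees only one document and each reduce call sees only one key, a real framework can run these calls on different machines and rerun any that fail.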

H2O (software)

H2O is open-source software for big-data analysis. It is produced by the company H2O.ai (formerly 0xdata), which launched in 2011 in Silicon Valley. H2O allows users to fit thousands of potential models as part of discovering patterns in data.

Apache Cassandra

Apache Cassandra is a free and open-source distributed NoSQL database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure. Cassandra offers robust support for clusters spanning multiple datacenters, [1] with asynchronous masterless replication allowing low-latency operations for all clients.

Apache SystemML

Apache SystemML is a flexible machine learning system that automatically scales to Spark and Hadoop clusters. SystemML’s distinguishing characteristics are algorithm customizability via R-like and Python-like languages; multiple execution modes, including Standalone, Spark Batch, Spark MLContext, Hadoop Batch, and JMLC; and automatic optimization based on data and cluster characteristics to ensure both efficiency and scalability.

Apache Mahout

Apache Mahout is a project of the Apache Software Foundation to produce free implementations of distributed or otherwise scalable machine learning algorithms focused primarily in the areas of collaborative filtering, clustering, and classification. Many of the implementations use the Apache Hadoop platform. [2] [3] Mahout also provides Java libraries for common math operations and primitive Java collections. Mahout is a work in progress; the number of implemented algorithms has grown quickly, [4] but various algorithms are still missing.

Apache Beam

Apache Beam is an open-source unified programming model to define and execute data processing pipelines, including ETL, batch, and stream (continuous) processing. [1] Beam pipelines are defined using one of the provided SDKs and executed in one of Beam’s supported runners (distributed processing back-ends), including Apache Apex, Apache Flink, Apache Spark, and Google Cloud Dataflow. [2]