Greenplum

Greenplum was a big data analytics company headquartered in San Mateo, California. Greenplum was acquired by EMC Corporation in July 2010. [1] Starting in 2012, its database management system software became known as the Pivotal Greenplum Database, sold through Pivotal Software, and is currently developed by the Greenplum Database open source community and Pivotal.

Company

Greenplum, the company, was founded in September 2003 by Scott Yara and Luke Lonergan. It was a merger of two smaller companies: Metapa (founded in August 2000 near Los Angeles) [2] and Didera in Fairfax, Virginia. [3] Investors included SoundView Ventures, Hudson Ventures and Royal Wulff Ventures. A total of US$20 million in funding was announced at the merger. [4] Greenplum, based in San Mateo, California, released its database management system based on PostgreSQL in April 2005, calling it Bizgres. [5] Rounds of venture capital of about US$15 million each were invested in March 2006 and February 2007. [6]

In July 2006 a partnership with Sun Microsystems was announced. [7] Sun, which had also acquired MySQL AB, participated in a $27 million investment in January 2008, led by Meritech Capital Partners. [6] The Bizgres project included a few other contributors and was supported through 2008, by which time the product itself was simply called "Greenplum". [8] [9] The Sun Fire X4500 was a reference architecture used by the majority of customers until a transition was made to Linux around that time. Greenplum was acquired by EMC Corporation in July 2010, becoming the foundation of EMC's big data software division. [1] Although EMC did not disclose the value, it was estimated at US$300 million. [10] [11] Greenplum's products at the time of acquisition were the Greenplum Database, Chorus (a management tool), and Data Science Labs. Greenplum had customers in vertical markets including eBay. [12] It became part of Pivotal Software in 2012. [13]

A variant called Hawq, which uses Apache Hadoop to store data in the Hadoop file system, was announced in 2013. [14] [15] In 2015 the GreenplumDB and Hawq open source software projects were announced. [16]

Technology

Pivotal's Greenplum database product uses massively parallel processing (MPP) techniques. Each computer cluster consists of a master node, a standby master node, and segment nodes. [17] All of the data resides on the segment nodes, and the catalog information is stored in the master nodes. Segment nodes run one or more segments, which are modified PostgreSQL database instances, each assigned a content identifier. For each table, the data is divided among the segment nodes based on the distribution key columns specified by the user in the data definition language. For each segment content identifier there is a primary segment and a mirror segment, which do not run on the same physical host. When a query enters the master node, it is parsed, planned and dispatched to all of the segments for execution. Structured Query Language, version SQL:2003, is used to present queries to the system. Transaction semantics comply with ACID constraints. [18]
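The distribution and dispatch steps described above can be illustrated with a small Python sketch. It is a toy model only, not Greenplum code: the function names, the four-segment cluster, the host names and the use of CRC32 as the hash are all assumptions made for illustration, and the real system's hash function and dispatcher are not described in this article.

```python
import zlib
from collections import defaultdict

NUM_SEGMENTS = 4  # hypothetical number of primary segments in the cluster


def segment_for(key, num_segments=NUM_SEGMENTS):
    """Map a distribution-key value to a segment content identifier."""
    # CRC32 is a stand-in hash chosen for this sketch (an assumption).
    return zlib.crc32(str(key).encode("utf-8")) % num_segments


def distribute(rows, key_column, num_segments=NUM_SEGMENTS):
    """Divide a table's rows among segments by the distribution key column."""
    segments = defaultdict(list)
    for row in rows:
        segments[segment_for(row[key_column], num_segments)].append(row)
    return segments


def dispatch_sum(segments, column):
    """Master-style scatter/gather: every segment computes a partial sum
    over its own rows, and the partial results are then combined."""
    partials = [sum(row[column] for row in rows) for rows in segments.values()]
    return sum(partials)


def mirror_host(primary_host, hosts):
    """Choose a mirror host that differs from the primary segment's host."""
    if len(hosts) < 2:
        raise ValueError("mirroring needs at least two physical hosts")
    index = hosts.index(primary_host)
    return hosts[(index + 1) % len(hosts)]


if __name__ == "__main__":
    table = [{"id": i, "amount": i * 10} for i in range(100)]
    placed = distribute(table, "id")
    for content_id in sorted(placed):
        print(content_id, len(placed[content_id]))
    print(dispatch_sum(placed, "amount"))
    print(mirror_host("sdw1", ["sdw1", "sdw2", "sdw3"]))
```

Because the same key value always hashes to the same segment, rows that share a distribution key are stored together, which is why the choice of distribution key affects how evenly data and query work are spread across the cluster.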

Greenplum competes with MPP database management systems provided by major vendors such as Teradata, Amazon Redshift, Microsoft Azure and IBM Netezza. [17] [19] Additional competition comes from smaller competitors, from column-oriented databases such as HP Vertica, and from data warehousing vendors with non-MPP architectures, such as Oracle Exadata and IBM DB2.

Greenplum Version 5

In September 2017 Greenplum Database Version 5 was released. Version 5 is the first iteration of the Greenplum project's strategy of merging later PostgreSQL versions back into Greenplum, and it is based on PostgreSQL version 8.3, up from 8.2 in the previous version. [20] Version 5 also introduces the general availability of the GPORCA optimizer [21], a cost-based SQL optimizer designed for big data.

See also

  • Pivotal Software

References

  1. "EMC to Acquire Greenplum". Press release. EMC Corporation. July 6, 2010. Retrieved March 15, 2017.
  2. "Form D: Notice of Sale of Securities" (PDF). US SEC. July 30, 2003. Retrieved March 15, 2017.
  3. Maureen O'Gara (September 26, 2003). "Metapa Buys Didera". Linux Business News. Retrieved March 15, 2017.
  4. "Metapa Acquires Didera and Closes Additional Funding; Industry Pioneers in High-Performance Computing Combines to Create Breakthrough Linux Database Clustering Solution for Decision Support". Press release. September 23, 2003.
  5. "Bizgres project launched". PostgreSQL developer's web site. April 17, 2005. Retrieved March 15, 2017.
  6. Duncan Riley (January 21, 2008). "Greenplum Takes $27 Million Series C". TechCrunch. Retrieved March 15, 2017.
  7. Colin White, Richard Hackathorn (June 26, 2007). "Sun / Greenplum". Business Intelligence Best Practices. Retrieved March 15, 2017.
  8. "History". Old Bizgres.org web site. Archived from the original on December 22, 2008. Retrieved March 15, 2017.
  9. "Open Source Greenplum Updates Database". Information Week. February 22, 2008. Retrieved March 15, 2017.
  10. Om Malik (July 6, 2010). "Big Data = Big Money: EMC Buys Greenplum". GigaOm. Retrieved March 15, 2017.
  11. Alexander Haislip (July 7, 2010). "Microsoft, Sun, And SAP Surprising Winners In Sale Greenplum". Forbes. Retrieved March 15, 2017.
  12. "ebay's two huge data warehouses". DBMS2 blog. Monash Research. April 30, 2009. Retrieved March 15, 2017.
  13. Timothy Prickett Morgan (March 20, 2012). "EMC wants to be the linux of big data: Chips tool opens, borgs agile coders Pivotal Labs". The Register. Retrieved March 15, 2017.
  14. "When should I use Greenplum Database versus HAWQ?". Pivotal Guru web site. January 31, 2014. Retrieved March 15, 2017.
  15. Timothy Prickett Morgan (February 25, 2013). "EMC Hadoop morphs elephant into SQL database Hawq". The Register. Retrieved March 15, 2017.
  16. Cade Metz (February 17, 2015). "Pivotal Doubles Down on Open Source in a Sign of Changing Software World". Wired. Retrieved March 15, 2017.
  17. Timothy Prickett Morgan (April 6, 2011). "EMC gets fat and flashy with Greenplum appliances: Take that, Teradata, Exadata, Netezza". The Register. Retrieved March 18, 2017.
  18. Sunila Gollapudi (2013). Getting Started with Greenplum for Big Data Analytics. Packt Publishing. ISBN 9781782177050.
  19. "System Properties Comparison Amazon Redshift vs. Greenplum vs. Microsoft Azure SQL Database vs. Teradata Aster". DB-engines. Retrieved March 18, 2017.
  20. "Pivotal Greenplum is alive and kicking". ZDNet. Retrieved September 14, 2017.
  21. "Orca: A Modular Query Optimizer Architecture for Big Data" (PDF). ZDNet. Retrieved April 14, 2016.