Apache Cloudberry 2.1.0 Launches as Greenplum Rival

The distributed DBMS Apache Cloudberry 2.1.0 has been released. It continues development of the open source codebase of the Greenplum DBMS, which Broadcom turned into a closed product after acquiring VMware. The project is currently in the Apache Incubator and will graduate to a top-level Apache project once its infrastructure and maintainers are ready.

The Cloudberry DBMS is a distributed edition of the open PostgreSQL DBMS, optimized for running analytical queries on large amounts of data (data warehousing). For parallel data processing it uses a massively parallel processing (MPP) architecture, which scales storage up to petabyte sizes by dividing data into segments and using a cluster of servers to store and process it.
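
The idea of dividing data into segments and processing them in parallel can be illustrated with a small sketch. This is not Cloudberry code: the segment count, the CRC32-based placement, and all function names are assumptions made for illustration. Rows are assigned to segments by hashing a distribution key, each segment computes a partial aggregate over its own rows, and the partial results are merged at the end.

```python
from collections import defaultdict
from zlib import crc32

NUM_SEGMENTS = 4  # hypothetical cluster size, chosen for the example


def segment_for(key: str, num_segments: int = NUM_SEGMENTS) -> int:
    """Map a distribution key to a segment index using a stable hash.

    CRC32 is used only because it is deterministic across runs;
    it is not the hash function a real MPP database uses.
    """
    return crc32(key.encode("utf-8")) % num_segments


def distribute(rows, num_segments: int = NUM_SEGMENTS):
    """Split (customer, amount) rows across segments by hashing the customer."""
    segments = [[] for _ in range(num_segments)]
    for customer, amount in rows:
        segments[segment_for(customer, num_segments)].append((customer, amount))
    return segments


def local_sum(segment_rows):
    """Partial aggregate computed independently on one segment."""
    partial = defaultdict(int)
    for customer, amount in segment_rows:
        partial[customer] += amount
    return dict(partial)


def merge(partials):
    """Combine the per-segment partial aggregates into the final result."""
    total = defaultdict(int)
    for partial in partials:
        for customer, amount in partial.items():
            total[customer] += amount
    return dict(total)


rows = [("alice", 10), ("bob", 5), ("alice", 7), ("carol", 3), ("bob", 1)]
segments = distribute(rows)
result = merge(local_sum(seg) for seg in segments)
print(result)
```

Because rows with the same key always land on the same segment, each segment can aggregate its own data without consulting the others. In this simplified model, that independence is what allows capacity to grow by adding servers.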

Among the improvements in Apache Cloudberry 2.1.0:

  • Implemented the UDP2 protocol for communication between nodes, which improves the efficiency of distributed query execution.
  • Added support for an MCP (Model Context Protocol) server to simplify integration with tools based on large language models.
  • Added the ability to compress table columns with the LZ4 algorithm to reduce I/O and memory consumption.
  • The ORCA query optimizer has been improved.
  • The greenplum_path.sh script, used to configure the DBMS user environment, has been replaced with cloudberry-env.sh.
  • The backup toolkit has been renamed to cloudberry-backup. The main repository includes a plugin for storage based on the S3 protocol.