Alibaba Cloud has unveiled a major update to its E-MapReduce (EMR) platform, a fully managed cloud service for big data built on an open-source ecosystem. The platform serves as one of the key computing and analytics cores of the broader OpenLake solution and is compatible with popular technologies such as Hadoop, Spark, StarRocks, Presto, Hudi, Iceberg, and Paimon. The main goal of the architecture is to give enterprise customers a unified foundation for building modern data lakes, where traditional batch and stream processing is combined with artificial intelligence capabilities.
The updated lineup is built around three deployment models, allowing customers to adapt the computing environment to the specifics of their projects. One of them, EMR Serverless, scales resources automatically within seconds in response to the current business load, is billed solely on actual usage, and does not require capacity to be reserved in advance. Combined with a multi-tiered storage strategy for hot and cold data, this approach is intended to significantly reduce the total cost of ownership of corporate IT infrastructure.
Within the serverless architecture, special attention has been given to EMR Serverless Spark, a high-performance Lakehouse-class product focused on collaborative work with data and algorithms. According to the company, the built-in Fusion 2.0 engine can deliver four times the performance of standard open-source Spark. Users do not need to manage clusters manually, and the tool natively supports GPU scheduling and the distributed Ray framework.
For real-time data analytics, the platform offers a managed EMR Serverless StarRocks service, which maintains 100 percent compatibility with the open-source community version. Internal optimizations allow the service to deliver up to ten times faster query performance on data lake tables than the open-source version. Enterprises can use it to build efficient OLAP systems and accelerate the deployment of lightweight data warehouses.
The platform's performance claims are supported by results on the industry-standard TPC benchmarks. In TPC-H, which assesses the efficiency of data analytics systems, the EMR Serverless StarRocks service, powered by the Stella 1.2.0 engine, took first place worldwide with a score of over 7.54 million QphH, 111 percent higher than the nearest competitor's result. In the TPC-DS decision support benchmark, the EMR Serverless Spark configuration, used together with DLF, also took first place, with a result of over 65.68 million QphDS and a 100 percent advantage in raw performance.
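To put the percentages in perspective, the short Python sketch below works backward from the quoted figures to the scores they imply for the runner-up. It rests on two assumptions that the article does not spell out: that "X percent higher" means the score equals the baseline multiplied by (1 + X/100), and that the 100 percent TPC-DS advantage is measured against the next-best result. The helper name `implied_baseline` is purely illustrative.

```python
def implied_baseline(score: float, percent_higher: float) -> float:
    """Return the baseline implied by 'score is percent_higher percent above baseline'."""
    return score / (1.0 + percent_higher / 100.0)


# Figures quoted in the article; both are stated as "over ...", so treat them as lower bounds.
tpch_qphh = 7.54e6      # TPC-H result, QphH
tpcds_qphds = 65.68e6   # TPC-DS result, QphDS

# Assumption: both percentages compare against the next-best published result.
tpch_runner_up = implied_baseline(tpch_qphh, 111)
tpcds_runner_up = implied_baseline(tpcds_qphds, 100)

print(f"Implied TPC-H runner-up: about {tpch_runner_up / 1e6:.2f} million QphH")
print(f"Implied TPC-DS runner-up: about {tpcds_runner_up / 1e6:.2f} million QphDS")
```

Under these assumptions, a 111 percent lead corresponds to a little more than twice the runner-up's score (roughly 3.57 million QphH), and a 100 percent lead to exactly twice (roughly 32.84 million QphDS).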