DataPelago Unveils World’s First Universal Data Processing Engine

Harness the power of accelerated computing for any engine

Break through boundaries in data processing for engines, including open source Spark and Trino, unlocking GenAI and analytics value for your enterprise.

engine components

DataApp is a pluggable component that leverages Substrait-based open-source frameworks such as Apache Gluten and related technologies to accelerate Spark and Trino engines. It parses and optimizes workload requests to generate an industry-standard execution plan. This composable, modular architecture accelerates GenAI/LLM and Lakehouse analytics workloads written in SQL, Python, and other widely used programming languages, permitting friction-free adoption.

DataOS runs analytics and GenAI transformations expressed in Substrait or an equivalent Intermediate Representation (IR) on a fabric of heterogeneous accelerated computing elements. DataOS has two novel dimensions of performance optimization: dynamic reconfiguration of the computing elements to match the requirements of IR operations, and dynamic mapping of each operator to a target computing element based on cost-performance characteristics.
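To make the idea of cost-based operator mapping concrete, here is a minimal Python sketch. It is an illustration only and does not reflect DataPelago's actual implementation: the device names, prices, runtime estimates, and the functions `dollar_cost` and `map_operators` are all hypothetical. Each operator is simply assigned to the device with the lowest estimated dollar cost (estimated runtime multiplied by device price).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    name: str
    dollars_per_second: float  # hypothetical price of running on this device


# Hypothetical runtime estimates, in seconds, for each (operator, device) pair.
ESTIMATED_SECONDS = {
    ("scan", "cpu"): 4.0,
    ("scan", "gpu"): 1.0,
    ("filter", "cpu"): 2.0,
    ("filter", "gpu"): 0.4,
    ("regex_extract", "cpu"): 3.0,
    ("regex_extract", "gpu"): 6.0,
}

DEVICES = [Device("cpu", 0.001), Device("gpu", 0.003)]


def dollar_cost(op: str, device: Device) -> float:
    """Estimated cost of running one operator on one device."""
    return ESTIMATED_SECONDS[(op, device.name)] * device.dollars_per_second


def map_operators(ops, devices=DEVICES):
    """Assign each operator to the device with the lowest estimated cost."""
    return {
        op: min(devices, key=lambda d: dollar_cost(op, d)).name
        for op in ops
    }


plan = map_operators(["scan", "filter", "regex_extract"])
print(plan)
```

With these made-up numbers, the scan and filter operators land on the GPU because the speedup outweighs the higher price, while the regex operator stays on the CPU. A real system would presumably estimate costs at run time and account for data movement between devices, which this sketch ignores.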

The industry’s first virtual machine with a domain-specific Instruction Set Architecture (ISA) for data operators and exchange that natively supports multi-modal data types for GenAI. It is architected to be realizable on any hardware, including CPUs, GPUs, FPGAs, and emerging accelerated computing elements. The platform leverages widely adopted industry frameworks such as LLVM, CUDA, and ROCm.

UNIFIED ENGINE
Core features

Built for enterprise scale

Rapid Adoption at Scale

Quickly adopt, deploy, and operate DataPelago's solution at scale with minimal IT effort and without any disruption to business users' experience. DataPelago offers flexible deployment models that can align with your requirements.

Security, Governance & Compliance
Seamless Integration
No Vendor Lock-in
Testimonials

New possibilities created with DataPelago

The exponential growth of semi-structured and unstructured data along with rapid GenAI/AI adoption is driving innovation, not only in AI, but in data management and data processing. McAfee has been proud to partner with DataPelago on the design of their technology that shows promising results, including significant performance and cost improvements on certain workloads. Congratulations on your product launch!
Steve Grobman, Executive VP and CTO, McAfee
Samsung SDS America has been working with DataPelago to evaluate their data processing platform in our AWS VPC, leveraging Accelerated Computing Infrastructure (GPUs). In preliminary testing with sample data, we’ve seen promising results in terms of performance and cost efficiency compared to traditional compute engines. DataPelago's platform shows potential in modernizing architecture and unifying data processing pipelines for GenAI and analytics, handling structured, semi-structured, and unstructured data types. This collaboration aligns with our interest in exploring innovative solutions that separate compute and storage, enhancing flexibility and reducing vendor lock-in.
Prashant Vithlani, Head of Division | Cloud Business, Samsung SDS America
Twingo is proud to partner with and serve as an official reseller for DataPelago, delivering cutting-edge Big Data solutions to the Israeli market. As an early design partner, we are excited to offer DataPelago’s unified data processing platform, accelerating engines like Spark and Trino using advanced CPU and GPU infrastructure across any data lakehouse format, including Iceberg, Hudi, and Delta Lake. The benchmarks from our collaboration are groundbreaking, reducing Total Cost of Ownership and delivering exceptional value. This partnership reinforces our commitment to innovation and next-gen solutions for data-driven organizations.
Golan Nahum, Founder & CEO
The growth in the volume of data processed by security systems is exponential as the adoption of AI and GenAI in cybersecurity continues to grow. DataPelago enables cost-effective expansion of AI/GenAI and cybersecurity systems by transforming the economics of data processing with its heterogeneous accelerated computing engine. As a security practitioner, I am excited with its modular architecture which allows for seamless plug-and-play integration with open-source components like Spark and Apache Gluten, ensuring frictionless deployment without any vendor lock-in.
Malcolm Harkins, Chief Security and Trust Officer, HiddenLayer & ex-CISO for Intel Corp
As Director of Engineering at Uber and Presto Foundation GB Chair, I have extensive experience developing and running open-source analytics software at an enterprise scale. Our workloads typically included heavy scan/filter/join operations, which are ideal for hardware acceleration. It's exciting to see how DataPelago disrupts the industry by accelerating open-source frameworks like Presto and Spark with custom hardware infrastructure. I'm particularly impressed with their dynamic mapping to heterogeneous computing elements and reconfigurable run-time techniques. By accelerating open-source frameworks, I think DataPelago will significantly transform today's performance/$ paradigm and reshape the economics of data processing.
Girish Baliga, Ex-Director of Engineering, Uber & Chair of the Presto Foundation
At Akad Seguros, innovation is woven into our DNA, fueling our unwavering commitment to exceptional customer service. Our partnership with DataPelago exemplifies this dedication, as we modernize our data architecture and unify processing pipelines for GenAI and data analysis. Leveraging DataPelago's advanced platform, we can seamlessly process structured, semi-structured, and unstructured data, reducing our costs by more than 50% and enhancing operational performance. By fully utilizing AWS’s Accelerated Computing (GPU) infrastructure, this collaboration is transforming our capacity to deliver superior results and elevating the quality of service for our customers.
Andre Fichel, CTO, Akad Seguros
I’ve been privileged to be around some of the brightest minds in technology over the last several decades and it's clear to me that Rajan Goyal, co-founder and CEO, possesses the vision, intellect, experience and passion to build a truly great and innovative company. I'm excited to participate in one of the next Silicon Valley success stories!
Paula Hurd, Advisor and Investor
Congratulations to DataPelago on their launch and announcement that their engine will extend Gluten, Substrait and Velox to deliver the benefits of accelerated computing for Spark to address the performance and cost challenges in the Apache Spark community. Apache Gluten is designed to reuse Apache Spark's whole control flow, while offloading the compute-intensive data processing part to high performance native libraries in the backend. DataPelago is taking this quantum leap forward by extending Gluten with native accelerated computing enhancements, yielding orders of magnitude performance and cost improvements for Spark workloads!
Binwei Yang, Apache Gluten Initiator
Try it now

Experience the new economics of data at scale.
