DataPelago Appoints John “JG” Chirapurath as President to Accelerate Growth

DataPelago Nucleus

The Universal Data Processing Engine. Demolish performance barriers for GenAI and analytics on any type of data, with any framework from Spark to Ray, on any hardware from CPU to GPU

McAfee
Samsung
Akad
HiddenLayer
Twingo
RevSure
ShareChat

Accelerate any Framework,
on any Hardware,
on any Data.

[ ANY FRAMEWORK ]

Spark
Trino
Flink
Presto
Your Framework

[ ANY HARDWARE ]

GPU
CPU
FPGA
NVIDIA
AMD

[ ANY DATA ]

Iceberg
Delta Lake
Hudi
Video
Image
Text
Audio

Why is DataPelago Nucleus unique?

Unified Query Engine

Seamlessly converts any query or execution plan into standards-based formats like Substrait, using mechanisms like Apache Gluten. Purpose-built for GenAI, LLMs, and Lakehouse analytics. Supports native SQL, Python, and more for lightning-fast integration

Intelligent Execution Optimizer

Dynamically orchestrates GenAI and analytics transformations by automatically building optimal execution pipelines and selecting the best-performing hardware based on real-time cost-performance analysis
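
As a rough illustration of cost-performance-based hardware selection, the sketch below picks the cheapest device for a single pipeline stage. It is not DataPelago's actual optimizer: the `Device` fields, the throughput and price figures, and the `pick_device` helper are all hypothetical and chosen only to show the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """A hypothetical compute target; all figures are illustrative assumptions."""
    name: str
    rows_per_second: float        # assumed throughput for this stage
    dollars_per_hour: float       # assumed hourly price
    startup_seconds: float = 0.0  # assumed fixed overhead, e.g. moving data to the device


def estimated_cost(device: Device, rows: int) -> float:
    """Estimated dollars to process `rows` rows on `device`."""
    seconds = device.startup_seconds + rows / device.rows_per_second
    return seconds / 3600.0 * device.dollars_per_hour


def pick_device(devices: list[Device], rows: int) -> Device:
    """Return the device with the lowest estimated dollar cost for this stage."""
    return min(devices, key=lambda d: estimated_cost(d, rows))


# Made-up numbers: the GPU is faster but pricier and has a fixed startup cost.
DEVICES = [
    Device("cpu", rows_per_second=1e6, dollars_per_hour=1.0),
    Device("gpu", rows_per_second=2e7, dollars_per_hour=4.0, startup_seconds=5.0),
]

small = pick_device(DEVICES, 1_000)
large = pick_device(DEVICES, 1_000_000_000)
print(small.name, large.name)
```

With these assumed figures, a small stage stays on the CPU because the fixed GPU overhead dominates, while a large stage moves to the GPU because its higher throughput outweighs its higher hourly price. A real optimizer would presumably rely on measured statistics rather than constants.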

Revolutionary Data VM

The industry's first VM with a domain-specific instruction set architecture that unleashes multi-modal data processing across any hardware (CPUs, GPUs, and beyond), leveraging proven frameworks like LLVM, CUDA, and ROCm for maximum compatibility

Core features

Built for enterprise scale

Enterprise-Ready Deployment

Deploy at scale with zero IT overhead and zero business disruption. Flexible deployment models fit your infrastructure requirements while maintaining full operational control within your secure environment

Security & Compliance First

Investment Protection

Customers & Partners

New possibilities created with DataPelago

At Akad Seguros, innovation is woven into our DNA, fueling our unwavering commitment to exceptional customer service. Our partnership with DataPelago exemplifies this dedication, as we modernize our data architecture and unify processing pipelines for GenAI and data analysis. Leveraging DataPelago's advanced platform, we can seamlessly process structured, semi-structured, and unstructured data, reducing our costs by more than 50% and enhancing operational performance. By fully utilizing AWS’s Accelerated Computing (GPU) infrastructure, this collaboration is transforming our capacity to deliver superior results and elevating the quality of service for our customers.
Andre Fichel CTO, Akad Seguros
DataPelago enabled us to scale our analytics and AI workloads without any re-engineering. We saw proof of value within days: meaningful performance gains and cost savings with zero application changes required. It’s truly production-ready from day one, and their customer support made the entire experience seamless. DataPelago serves as a powerful force multiplier that delivers real business impact.
Deepinder Singh Dhingra CEO, RevSure.ai
With DataPelago, we were finally able to complete our heaviest OLAP cube jobs on OSS Spark—something that had been impossible due to data skew and performance bottlenecks. This opens the door for a full migration from managed platforms without compromising speed or reliability while reducing our costs by 50%.
Arya Ketan Distinguished Engineer & VP of Data, ShareChat
The exponential growth of semi-structured and unstructured data along with rapid Gen AI/AI adoption is driving innovation, not only in AI, but in data management and data processing. McAfee has been proud to partner with DataPelago on the design of their technology that shows promising results, including significant performance and cost improvements on certain workloads.
Steve Grobman Executive VP and CTO, McAfee
Samsung SDS America has been working with DataPelago to evaluate their data processing platform in our AWS VPC, leveraging Accelerated Computing Infrastructure (GPUs). In preliminary testing with sample data, we’ve seen promising results in terms of performance and cost efficiency compared to traditional compute engines. DataPelago's platform shows potential in modernizing architecture and unifying data processing pipelines for GenAI and analytics, handling structured, semi-structured, and unstructured data types. This collaboration aligns with our interest in exploring innovative solutions that separate compute and storage, enhancing flexibility and reducing vendor lock-in.
Prashant Vithlani Head of Division | Cloud Business, Samsung SDS America
Twingo is proud to partner with and serve as an official reseller for DataPelago, delivering cutting-edge Big Data solutions to the Israeli market. As an early design partner, we are excited to offer DataPelago’s unified data processing platform, accelerating engines like Spark and Trino using advanced CPU and GPU infrastructure across any data lakehouse format, including Iceberg, Hudi, and Delta Lake. The benchmarks from our collaboration are groundbreaking, reducing Total Cost of Ownership and delivering exceptional value. This partnership reinforces our commitment to innovation and next-gen solutions for data-driven organizations.
Golan Nahum Founder & CEO, Twingo
The growth in the volume of data processed by security systems is exponential as the adoption of AI and GenAI in cybersecurity continues to grow. DataPelago enables cost-effective expansion of AI/GenAI and cybersecurity systems by transforming the economics of data processing with its heterogeneous accelerated computing engine. As a security practitioner, I am excited by its modular architecture, which allows for seamless plug-and-play integration with open-source components like Spark and Apache Gluten, ensuring frictionless deployment without any vendor lock-in.
Malcolm Harkins Chief Security and Trust Officer, HiddenLayer & ex-CISO for Intel Corp
As Director of Engineering at Uber and Presto Foundation GB Chair, I have extensive experience developing and running open-source analytics software at an enterprise scale. Our workloads typically included heavy scan/filter/join operations, which are ideal for hardware acceleration. It's exciting to see how DataPelago disrupts the industry by accelerating open-source frameworks like Presto and Spark with custom hardware infrastructure. I'm particularly impressed with their dynamic mapping to heterogeneous computing elements and reconfigurable run-time techniques. By accelerating open-source frameworks, I think DataPelago will significantly transform today's performance/$ paradigm and reshape the economics of data processing.
Girish Baliga Ex-Director of Engineering, Uber & Chair of the Presto Foundation
Congratulations to DataPelago on their launch and announcement that their engine will extend Gluten, Substrait and Velox to deliver the benefits of accelerated computing for Spark to address the performance and cost challenges in the Apache Spark community. Apache Gluten is designed to reuse Apache Spark's whole control flow, while offloading the compute-intensive data processing part to high performance native libraries in the backend. DataPelago is taking this quantum leap forward by extending Gluten with native accelerated computing enhancements, yielding orders of magnitude performance and cost improvements for Spark workloads!
Binwei Yang Apache Gluten Initiator

Get in touch

Fill out the form and a DataPelago team member will reach out.