DataPelago Appoints John “JG” Chirapurath as President to Accelerate Growth

The first mile decides everything

DataPelago doesn't just accelerate data. We transform data processing economics. Your GenAI and analytics aren't just faster; they're in a different league entirely.

McAfee
Samsung
Akad
HiddenLayer
Twingo
RevSure
ShareChat
What we do

10X faster GenAI and analytics data processing at 80% lower cost

Any data. Any format. Lightning fast.
at any scale with unbeatable economics

Structured, semi-structured, or unstructured: we supercharge it all. Whether you are training models, fine-tuning AI, powering RAG, or extracting insights, we accelerate data processing in every workload.

Discover new value 
with zero disruption or lock-in

90% of your data sits untapped because processing is too slow and expensive. We accelerate processing of massive datasets in record time, with zero changes to your applications or infrastructure and zero vendor lock-in.

How we do it

DataPelago Nucleus: The Universal Data Processing Engine

Go beyond Moore’s Law

Go beyond Moore’s Law with an unprecedented price/performance advantage and unlock new workloads. The platform refactors data processing to exploit accelerated computing, leveraging its higher degree of parallelism and tightly coupled memory model to deliver orders-of-magnitude higher performance.

Leverage all your hardware

A novel computing abstraction enables heterogeneous accelerated computing across GPUs, FPGAs, and CPUs. The platform intelligently maps operations to execution units, which are dynamically reconfigured to match the query operators.
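
As a rough illustration of what mapping operations to execution units can look like, the Python sketch below assigns each query operator to the cheapest available device using a made-up cost table. The operator names, device names, cost numbers, and the `map_operators` function are all hypothetical; this is a toy model under those assumptions, not DataPelago's algorithm or API.

```python
# Hypothetical relative costs (lower is better) of running each operator
# on each kind of execution unit. All numbers are invented for illustration.
COSTS = {
    "scan":   {"cpu": 4.0, "gpu": 1.0, "fpga": 1.5},
    "filter": {"cpu": 3.0, "gpu": 0.5, "fpga": 0.4},
    "join":   {"cpu": 8.0, "gpu": 2.0, "fpga": 5.0},
    "udf":    {"cpu": 1.0, "gpu": 6.0, "fpga": 9.0},
}


def map_operators(plan, available):
    """Assign each operator in `plan` to the cheapest device in `available`."""
    assignment = []
    for op in plan:
        # Keep only the devices that are actually present on this machine.
        candidates = {d: c for d, c in COSTS[op].items() if d in available}
        if not candidates:
            raise ValueError(f"no available device can run {op!r}")
        device = min(candidates, key=candidates.get)
        assignment.append((op, device))
    return assignment


if __name__ == "__main__":
    plan = ["scan", "filter", "join", "udf"]
    for op, device in map_operators(plan, {"cpu", "gpu", "fpga"}):
        print(f"{op:>6} -> {device}")
```

On a CPU-only machine the same plan falls back to the CPU for every operator, which is the sense in which the query itself does not need to change.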

Empower open-source frameworks

DataPelago’s engine accelerates data processing for GenAI and Lakehouse Analytics. The engine leverages Substrait-based open-source frameworks such as Gluten and related technologies. Now Spark, Trino, and other query engines can fully exploit the benefits of GPU, CPU and FPGA acceleration.

Zero Friction

Seamless integration with SQL, Python, and other programming languages; workflow automation tools such as Airflow; and query clients such as notebooks, Tableau, and Power BI. Deploy without any changes to data, tools, or processes. No vendor lock-in.

Who we serve

Data and AI practitioners

GenAI

Accelerate GenAI from Data to Deployment

DataPelago accelerates multi-modal GenAI data processing end-to-end. Extract, filter, chunk, tokenize, and embed, then deploy foundation models, fine-tune systems, or build RAG applications faster with always-fresh data.
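
For readers unfamiliar with these stages, here is a minimal Python sketch of a filter, chunk, tokenize, and embed pipeline over plain text. It assumes extraction has already produced text strings, and every function name and parameter here is hypothetical. The `embed` step is a simple hashing stand-in for a real embedding model, and nothing in the sketch reflects how DataPelago implements these stages.

```python
import hashlib
import math


def keep(doc, min_words=3):
    """Filter: drop documents that are too short to be useful."""
    return len(doc.split()) >= min_words


def chunk(doc, size=5, overlap=1):
    """Chunk: split into windows of `size` words sharing `overlap` words."""
    words = doc.split()
    if not words:
        return []
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def tokenize(text):
    """Tokenize: lowercase, whitespace-split, strip basic punctuation."""
    tokens = (w.strip(".,;:!?").lower() for w in text.split())
    return [t for t in tokens if t]


def embed(tokens, dim=8):
    """Embed: hash tokens into a fixed-size, L2-normalized count vector."""
    vec = [0.0] * dim
    for tok in tokens:
        digest = hashlib.md5(tok.encode("utf-8")).hexdigest()
        vec[int(digest, 16) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def run_pipeline(docs):
    """Run filter -> chunk -> tokenize -> embed; return (chunk, vector) pairs."""
    results = []
    for doc in docs:
        if not keep(doc):
            continue
        for piece in chunk(doc):
            results.append((piece, embed(tokenize(piece))))
    return results


if __name__ == "__main__":
    for piece, vector in run_pipeline(["", "Fresh data keeps RAG answers accurate and current."]):
        print(piece, [round(v, 2) for v in vector])
```

Each stage is an independent function over its input, so in a real system the stages could be batched and run in parallel across documents.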


Analytics
Backed By
Eclipse
Alter
Qualcomm
Taiwania
Nautilus
Customers & Partners

New possibilities created with DataPelago

At Akad Seguros, innovation is woven into our DNA, fueling our unwavering commitment to exceptional customer service. Our partnership with DataPelago exemplifies this dedication, as we modernize our data architecture and unify processing pipelines for GenAI and data analysis. Leveraging DataPelago's advanced platform, we can seamlessly process structured, semi-structured, and unstructured data, reducing our costs by more than 50% and enhancing operational performance. By fully utilizing AWS’s Accelerated Computing (GPU) infrastructure, this collaboration is transforming our capacity to deliver superior results and elevating the quality of service for our customers.
Andre Fichel CTO, Akad Seguros
DataPelago enabled us to scale our analytics and AI workloads without any re-engineering. We saw proof of value within days: meaningful performance gains and cost savings with zero application changes required. It’s truly production-ready from day one, and their customer support made the entire experience seamless. DataPelago serves as a powerful force multiplier that delivers real business impact.
Deepinder Singh Dhingra CEO, RevSure.ai
With DataPelago, we were finally able to complete our heaviest OLAP cube jobs on OSS Spark—something that had been impossible due to data skew and performance bottlenecks. This opens the door for a full migration from managed platforms without compromising speed or reliability while reducing our costs by 50%.
Arya Ketan Distinguished Engineer & VP of Data, ShareChat
The exponential growth of semi-structured and unstructured data, along with rapid GenAI/AI adoption, is driving innovation, not only in AI, but in data management and data processing. McAfee has been proud to partner with DataPelago on the design of their technology, which shows promising results, including significant performance and cost improvements on certain workloads.
Steve Grobman EVP, McAfee
Samsung SDS America has been working with DataPelago to evaluate their data processing platform in our AWS VPC, leveraging Accelerated Computing Infrastructure (GPUs). In testing with sample data, we’ve seen promising results in terms of performance and cost efficiency compared to traditional compute engines. DataPelago's platform shows potential in modernizing architecture and unifying data processing pipelines for GenAI and analytics, handling structured, semi-structured, and unstructured data types. This collaboration aligns with our interest in exploring innovative solutions that separate compute and storage, enhancing flexibility and reducing vendor lock-in.
Prashant Vithlani Head of Division | Cloud Business, Samsung SDS America
Twingo is proud to partner with and serve as an official reseller for DataPelago, delivering cutting-edge Big Data solutions to the Israeli market. As an early design partner, we are excited to offer DataPelago’s unified data processing platform, accelerating engines like Spark and Trino using advanced CPU and GPU infrastructure across any data lakehouse format, including Iceberg, Hudi, and Delta Lake. The benchmarks from our collaboration are groundbreaking, reducing Total Cost of Ownership and delivering exceptional value. This partnership reinforces our commitment to innovation and next-gen solutions for data-driven organizations.
Golan Nahum Founder & CEO, Twingo
The growth in the volume of data processed by security systems is exponential as the adoption of AI and GenAI in cybersecurity continues to grow. DataPelago enables cost-effective expansion of AI/GenAI and cybersecurity systems by transforming the economics of data processing with its heterogeneous accelerated computing engine. As a security practitioner, I am excited by its modular architecture, which allows for seamless plug-and-play integration with open-source components like Spark and Apache Gluten, ensuring frictionless deployment without any vendor lock-in.
Malcolm Harkins Chief Security and Trust Officer, HiddenLayer & ex-CISO, Intel Corp
As Director of Engineering at Uber and Presto Foundation GB Chair, I have extensive experience developing and running open-source analytics software at an enterprise scale. Our workloads typically included heavy scan/filter/join operations, which are ideal for hardware acceleration. It's exciting to see how DataPelago disrupts the industry by accelerating open-source frameworks like Presto and Spark with custom hardware infrastructure. I'm particularly impressed with their dynamic mapping to heterogeneous computing elements and reconfigurable run-time techniques. By accelerating open-source frameworks, I think DataPelago will significantly transform today's performance/$ paradigm and reshape the economics of data processing.
Girish Baliga Ex-Director of Engineering, Uber & Chair of the Presto Foundation
Congratulations to DataPelago on their launch and announcement that their engine will extend Gluten, Substrait and Velox to deliver the benefits of accelerated computing for Spark to address the performance and cost challenges in the Apache Spark community. Apache Gluten is designed to reuse Apache Spark's whole control flow, while offloading the compute-intensive data processing part to high performance native libraries in the backend. DataPelago is taking this quantum leap forward by extending Gluten with native accelerated computing enhancements, yielding orders of magnitude performance and cost improvements for Spark workloads!
Binwei Yang Apache Gluten Initiator

Get in touch

Fill out the form and a DataPelago team member will reach out.