Powered by Apache Doris ↗

REAL-TIME
ANALYTICS &
SEARCH DB

The unified engine for customer-facing analytics. Sub-second performance on petabytes of data with 100× higher concurrency.

1M+
Concurrent Queries
0.1s
Avg. Query Latency
100PB+
Data Capacity

VeloDB and Apache Doris help 10,000+ enterprises bring real-time analytics into the AI era

Enterprise customer 1
Enterprise customer 2
Enterprise customer 3
Enterprise customer 4
Enterprise customer 5
Enterprise customer 6
Enterprise customer 7
Enterprise customer 8
Enterprise customer 9
Enterprise customer 10
Enterprise customer 11
Enterprise customer 12
Enterprise customer 13
Enterprise customer 14
Enterprise customer 15
Enterprise customer 16
Enterprise customer 17
Enterprise customer 18
Enterprise customer 19
Enterprise customer 20
Enterprise customer 21
Enterprise customer 22
Enterprise customer 23
Enterprise customer 24
Enterprise customer 25
Enterprise customer 26
Enterprise customer 27
Enterprise customer 28
Enterprise customer 29
Enterprise customer 30
Enterprise customer 31
Enterprise customer 32
Enterprise customer 33
Enterprise customer 34
Enterprise customer 35
Enterprise customer 36
Pain Points

Challenges with real-time analytics
in the AI era

01
Fragmented Systems

The Multi-Database Trap

One DB for search, another for complex joins, a third for dashboards. No unified warehouse — just fragile ETL pipelines duct-taping silos together.

02
Freshness vs Speed

The Real-Time Paradox

Data freshness or query speed — pick one. Sub-second SLAs slip under heavy AI workloads. Endless tuning just to keep up.

03
Scale Bottleneck

AI Concurrency Bottleneck

Agentic workflows demand high-concurrency search across structured and unstructured data. Without it, real-time RAG becomes unsustainable.

Built to solve current and emerging real-time data challenges

Four workloads, one engine. Each powered by purpose-built technologies that deliver measurable outcomes.

ANALYTICS

Customer / Agent-Facing Analytics

Insights in milliseconds, with performant search and aggregations on fast-changing data

10,000+
QPS Concurrency
<100ms
P99 Query Latency
~1s
Data Freshness
01

Materialized Views

Incremental materialized views replace batch ETL, refreshing complex transformations in minutes instead of hours. Data is queryable the moment it lands.
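As a sketch of how this looks in practice, the statement below defines an asynchronous materialized view that refreshes on a schedule (Apache Doris 2.1+ style syntax; the `orders` table, columns, and refresh interval are hypothetical, and exact clauses vary by version):

```sql
-- Illustrative async materialized view: a pre-aggregated daily revenue
-- rollup that refreshes incrementally instead of via batch ETL.
CREATE MATERIALIZED VIEW daily_revenue_mv
BUILD IMMEDIATE
REFRESH AUTO ON SCHEDULE EVERY 10 MINUTE
DISTRIBUTED BY HASH (order_date) BUCKETS 10
AS
SELECT order_date, SUM(amount) AS revenue
FROM orders
GROUP BY order_date;
```

Queries against `daily_revenue_mv` read the latest refreshed rollup rather than rescanning raw orders.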

02

Inverted Index + Smart Indexes

Partition pruning, bucketing, pre-aggregated tables, and inverted indexes let queries touch only the data they need — keeping P99 latency under 100ms even at 10,000+ QPS.
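A minimal sketch of a table that combines these techniques, assuming a hypothetical `user_events` schema (auto-partitioning syntax follows Doris 2.1+ and may differ by version):

```sql
-- Time-range partitions enable pruning, hash bucketing spreads load,
-- and the inverted index accelerates text predicates on `message`.
CREATE TABLE user_events (
    event_time DATETIME NOT NULL,
    user_id    BIGINT,
    message    STRING,
    INDEX idx_message (message) USING INVERTED PROPERTIES ("parser" = "english")
)
DUPLICATE KEY (event_time, user_id)
AUTO PARTITION BY RANGE (date_trunc(event_time, 'day')) ()
DISTRIBUTED BY HASH (user_id) BUCKETS 16;
```

A query filtered on `event_time` and `message` then touches only the matching day partitions and index-selected rows.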

03

MCP Server for AI Agents

A built-in Model Context Protocol server lets autonomous AI agents query live operational data through a standard interface — no custom integration code required.
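For illustration, registering such a server in an MCP-capable client typically looks like the fragment below (the command, server name, and environment variables are placeholders, not VeloDB's actual configuration):

```json
{
  "mcpServers": {
    "doris": {
      "command": "uv",
      "args": ["run", "doris-mcp-server"],
      "env": {
        "DORIS_HOST": "127.0.0.1",
        "DORIS_PORT": "9030"
      }
    }
  }
}
```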

04

Vectorized MPP Engine

A Cost-Based Optimizer with runtime filtering drives the MPP execution engine. Complex multi-table JOINs that take minutes elsewhere finish in seconds.
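The kind of query this targets is an ordinary multi-table analytical JOIN, sketched here over a hypothetical schema:

```sql
-- The CBO picks join order and applies runtime filters so the fact
-- table scan skips rows that cannot match the dimension predicates.
SELECT c.region, SUM(o.amount) AS total
FROM orders o
JOIN customers c ON o.customer_id = c.id
JOIN products  p ON o.product_id  = p.id
WHERE o.order_date >= '2024-01-01'
  AND p.category = 'electronics'
GROUP BY c.region
ORDER BY total DESC;
```

Prefixing the statement with `EXPLAIN` shows the chosen distributed plan.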

OBSERVABILITY

Log, Trace & Metric Analytics

Analyze and search PB-scale log, trace, and metric data cost-effectively

10 GB/s
Write Throughput
80%
Cost Reduction vs ES
<2s
Search on 1B Logs
01

Inverted Index for Full-Text

Native inverted indexes replace a separate Elasticsearch deployment. Full-text keyword search across 1 billion log records returns in under 2 seconds, with 5× the write throughput.
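A keyword search over an inverted-indexed log table can be expressed directly in SQL, sketched here against a hypothetical `app_logs` table:

```sql
-- MATCH_ANY returns rows whose indexed `message` column contains
-- any of the given tokens; the time predicate prunes partitions.
SELECT ts, host, message
FROM app_logs
WHERE message MATCH_ANY 'timeout error'
  AND ts >= NOW() - INTERVAL 1 HOUR
ORDER BY ts DESC
LIMIT 100;
```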

02

VARIANT Data Type

Schema-on-read for JSON logs. The VARIANT type auto-extracts fields as typed sub-columns without ETL, delivering 8× faster analytics and 3× better compression than raw JSON.
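A minimal sketch of the pattern, with a hypothetical table and JSON field names: raw JSON lands in a VARIANT column, and sub-fields are queried with bracket syntax.

```sql
-- Arbitrary JSON is ingested into `payload`; Doris extracts typed
-- sub-columns automatically, so no ETL or schema migration is needed.
CREATE TABLE json_logs (
    ts      DATETIME,
    payload VARIANT
)
DUPLICATE KEY (ts)
DISTRIBUTED BY HASH (ts) BUCKETS 8;

SELECT payload['http']['status'] AS status, COUNT(*) AS hits
FROM json_logs
GROUP BY status;
```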

03

ZSTD 1:5–1:10 Compression

Columnar storage with ZSTD compression achieves 5–10× compression ratios. Tiered storage on object storage further reduces costs by 70%, making PB-scale retention affordable.
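Both behaviors are table-level settings; the sketch below shows a hypothetical metrics table with ZSTD compression and an object-storage tiering policy (the policy name is a placeholder you would create separately):

```sql
CREATE TABLE metrics (
    ts    DATETIME,
    name  VARCHAR(128),
    value DOUBLE
)
DUPLICATE KEY (ts)
DISTRIBUTED BY HASH (name) BUCKETS 8
PROPERTIES (
    "compression"    = "zstd",          -- columnar ZSTD compression
    "storage_policy" = "s3_cold_policy" -- hypothetical cold-tier policy
);
```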

04

OpenTelemetry Native

Drop-in integration with Logstash, Beats, OpenTelemetry exporters, Grafana, and Langfuse. Your existing observability stack works out of the box.

GENERATIVE AI

Hybrid Search for RAG

Power GenAI with a cost-effective knowledge store, leveraging hybrid search and progressive filtering

3-in-1
SQL + FTS + Vector
<15ms
Hybrid Query
Real-time
Vector Updates
01

Full-Text Search (BM25)

Built-in BM25 ranking with inverted and N-gram indexes. No external search engine needed — keyword relevance scoring runs natively inside the same engine as your analytics.

02

Vector ANN (HNSW)

High-performance approximate nearest neighbor search with HNSW indexes. Embed and query vectors alongside structured data in the same table, the same query, the same transaction.
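Vector index DDL varies across Doris/VeloDB versions, so treat the following as an illustrative sketch only: embeddings live in an `ARRAY<FLOAT>` column with an HNSW ANN index, alongside the structured columns of a hypothetical `documents` table.

```sql
-- Vectors sit in the same table as structured data, so similarity
-- search and SQL filters run in one engine. Index properties are
-- illustrative and version-dependent.
CREATE TABLE documents (
    id        BIGINT,
    tenant_id INT,
    chunk     STRING,
    embedding ARRAY<FLOAT> NOT NULL,
    INDEX idx_emb (embedding) USING ANN PROPERTIES (
        "index_type"  = "hnsw",
        "metric_type" = "l2_distance",
        "dim"         = "768"
    )
)
DUPLICATE KEY (id)
DISTRIBUTED BY HASH (id) BUCKETS 8;
```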

03

Columnar JSON + VARIANT

Schemaless JSON ingestion with automatic columnar sub-column extraction. Perfect for evolving knowledge base schemas — no migrations, no ETL.

04

Progressive Filtering in SQL

Combine SQL predicates, full-text ranking, and vector similarity in a single execution plan. Filter by tenant, rank by text, sort by embedding distance — one query, one round-trip.
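Assuming a hypothetical `documents` table with a tenant column, an inverted-indexed text column, and an embedding column, the pattern looks like this:

```sql
-- Progressive filtering in one statement: the tenant predicate prunes
-- first, full-text match narrows further, and the survivors are ranked
-- by vector distance. The 3-element vector stands in for a real
-- query embedding.
SELECT id, chunk
FROM documents
WHERE tenant_id = 42
  AND chunk MATCH_ANY 'refund policy'
ORDER BY l2_distance(embedding, [0.12, 0.48, 0.91])
LIMIT 10;
```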

DATA WAREHOUSING

Lakehouse Analytics

Minimize ETL and scale real-time OLAP with Lakehouse architecture

3×
Faster than Trino
48%
Compute Savings
PB-scale
Lakehouse Queries
01

Open Format Support

Query Iceberg, Hudi, Hive, and Delta Lake tables directly. Federated queries join lakehouse data with real-time tables — no data movement, no copies.
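As a sketch, attaching an external Iceberg catalog and joining it against an internal table takes two statements (catalog name, REST endpoint, and schemas are placeholders):

```sql
-- Register an Iceberg REST catalog; its tables become queryable
-- without copying data.
CREATE CATALOG iceberg_lake PROPERTIES (
    "type"                 = "iceberg",
    "iceberg.catalog.type" = "rest",
    "uri"                  = "http://rest-catalog:8181"
);

-- Federated query: lakehouse fact table joined with a real-time
-- internal table, no data movement.
SELECT o.order_id, u.segment
FROM iceberg_lake.sales.orders o
JOIN internal.app.users u ON o.user_id = u.id;
```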

02

Compute-Storage Separation

Scale compute nodes independently in seconds. Keep 100% of data on cost-effective S3/BLOB storage. Spin up clusters for peak traffic, shut them down when done.

03

CBO for Complex JOINs

A Cost-Based Optimizer with advanced statistics drives distributed multi-table JOINs. 3–8× faster than Greenplum on TPC-H, 3× faster than Trino on TPC-DS at 1TB scale.

04

Fast Metadata Caching

Local SSD caching of hot data and metadata eliminates the object storage latency penalty. Delivers sub-second interactive queries on petabyte-scale lakehouse datasets.

POWER YOUR
NEXT GENERATION

Join high-growth companies building lightning-fast search and analytics features with VeloDB.

No credit card required