New • Cited answers across every document type

Ask anything. Across every document.

RAG Engine turns PDFs, Word, Excel/CSV, PowerPoint, text, Markdown, HTML, JSON or images into a grounded knowledge surface — with cited answers, confidence scores and refusal guardrails out of the box.


No credit card • Local demo workspace • Works with OpenAI, Anthropic, Cohere

app.rag-engine.dev/chat

Sessions

Q3 fiscal report

Onboarding policies

Vendor SLA review

What were our Q3 revenue drivers?

Q3 revenue grew 22% YoY, driven primarily by the launch of Atlas Tier and a 14% lift in retention from the new onboarding flow.

94% confidence • Q3-Earnings.pdf, p.4 • Retention-Memo.pdf, p.2

Sources verified

What makes it different

Built for answers you can trust.

Three layers of accuracy stacked on top of every retrieval — so users get cited answers, not eloquent guesses.

Grounded reasoning

Hybrid retrieval blends dense vectors with sparse BM25 to surface every relevant chunk before the model speaks.

Refusal guardrails

Pre-gates and PII detection refuse politely when the answer isn't in your docs — no hallucinations sneaking through.

Provider agnostic

Drop in OpenAI, Anthropic, Cohere or local models. Embeddings stay consistent across failover.

Deep dive

Premium retrieval, grounded answers, production-grade ops

Three stages, each tunable per request. Click into any section to see how the engine keeps your answers honest, fast, and observable.

Under the hood — Retrieval

Hybrid retrieval that actually understands your corpus

Questions are optionally rewritten via HyDE and multi-query, dense embeddings and sparse BM25 results are fused with reciprocal-rank fusion, and the candidates are reranked with a cross-encoder before the model ever sees a token.

Hybrid + RRF
Dense semantic search and sparse keyword recall are merged with reciprocal-rank fusion for resilient top-K.
HyDE & multi-query
Tough questions are rewritten into hypothetical answers and three paraphrases; the results retrieved for each are then unioned and deduplicated.
Cross-encoder rerank
BAAI/bge-reranker-base re-scores the candidate set so only the most defensible passages reach the LLM.
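
As a rough illustration of the fusion step described above, here is a minimal Python sketch of reciprocal-rank fusion. It is not RAG Engine's actual code: the function name, the chunk IDs, and the constant k=60 (a commonly used default) are assumptions made for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60, top_k=5):
    """Fuse several ranked lists of chunk IDs into one ranked list.

    Each chunk scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed default, not a documented one.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by chunk ID for a stable order.
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [chunk_id for chunk_id, _ in fused[:top_k]]


dense_hits = ["c3", "c1", "c7"]   # hypothetical dense-vector ranking
sparse_hits = ["c1", "c9", "c3"]  # hypothetical BM25 ranking
fused_ids = reciprocal_rank_fusion([dense_hits, sparse_hits])
```

Chunks that rank well in both lists rise to the top, which is why fusion of this kind tends to be more resilient than either retriever alone.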

Question → HyDE / paraphrases → Hybrid (dense + BM25) → RRF fusion → Cross-encoder rerank → Top-K passages

Under the hood — Grounding

Every answer is verified, cited, and tunable

Span-level citations link each sentence to the exact passage that supports it. A groundedness verifier scores each answer and the engine refuses if support drops below your threshold.

Span-level citations
Click a marker and a source viewer slides in, highlighting the exact span — no hand-waving.
Confidence + refusal
A calibrated confidence score is shown on every reply. Answers below your floor refuse politely instead of hallucinating.
Eval harness
Ship with a golden YAML and a one-command eval runner so you can prove uplift before every release.

The 2023 climate report attributes 42% of emissions to industrial activity¹, while transportation accounts for 29%².

Confidence: 0.92 • Groundedness: 0.96 • Refusal threshold: 0.55
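
The threshold behaviour described above could be sketched roughly as follows. This is a simplified Python illustration under stated assumptions: the `ScoredAnswer` type, the refusal wording, and the rule that the weaker of the two scores decides are all hypothetical, not the engine's documented logic. The 0.55 default mirrors the example threshold shown above.

```python
from dataclasses import dataclass


@dataclass
class ScoredAnswer:
    text: str
    confidence: float    # calibrated confidence, 0.0 to 1.0
    groundedness: float  # verifier score, 0.0 to 1.0


# Hypothetical refusal message; the real wording is not specified here.
REFUSAL_TEXT = "I can't answer that from the documents provided."


def apply_refusal_gate(answer, threshold=0.55):
    """Return the answer text, or a refusal if support is below threshold."""
    # Assumption: the weaker of the two scores decides.
    support = min(answer.confidence, answer.groundedness)
    if support < threshold:
        return REFUSAL_TEXT
    return answer.text


strong = ScoredAnswer("Industrial activity accounts for 42% of emissions.", 0.92, 0.96)
weak = ScoredAnswer("Possibly around 10%.", 0.40, 0.35)
```

With the scores shown above (0.92 and 0.96 against a 0.55 floor), the answer passes through unchanged, while the weakly supported one is replaced by the refusal text.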

Under the hood — Ops & security

Production-grade operations from day one

OpenTelemetry tracing on every stage, structured logs, per-cookie config overrides, slowapi rate limiting, optional PII redaction at ingest, and webhooks for ingestion + query lifecycle events.

Observability
Latency histograms, refusal rate, mean confidence, and per-question cost — surfaced in /app/status and /metrics.
Security
PII redaction before embedding, per-IP / per-cookie rate limits, and clear audit trails on every mutation.
Automation
Webhooks fire on ingest + query events, and an admin CLI handles ingest / query / eval / export-corpus / clear-cache.
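
To make the "PII redaction before embedding" step listed under Security more concrete, here is a minimal Python sketch. The two regex patterns, the placeholder labels, and the `redact_pii` name are illustrative assumptions only; a production detector would need far broader coverage than this.

```python
import re

# Illustrative patterns only (assumed for this sketch); real PII detection
# covers many more categories and formats.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def redact_pii(text):
    """Replace each matched span with a bracketed label before embedding."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


chunk = "Contact jane.doe@example.com or call +1 415 555 0100 for the SLA."
redacted = redact_pii(chunk)
```

Redacting at ingest, rather than at query time, means the sensitive values never reach the embedding model or the vector store.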

p50: 412 ms • p95: 1.18 s • refusal rate: 2.1%

Plays nicely with the model garden

Swap providers without re-indexing

OpenAI • Anthropic • Cohere • Voyage • Qdrant • Postgres • Mistral • Hugging Face • Groq • Together

“We replaced three different internal search tools with RAG Engine. Confidence scores ended an entire category of support tickets — people now trust the answer.”

Jayesh Koli

AI Engineer