New • Cited answers across every document type

Ask anything. Across every document.

RAG Engine turns PDFs, Word, Excel/CSV, PowerPoint, text, Markdown, HTML, JSON or images into a grounded knowledge surface — with cited answers, confidence scores and refusal guardrails out of the box.

No credit card • Local demo workspace • Works with OpenAI, Anthropic, Cohere

app.rag-engine.dev/chat

JD: What were our Q3 revenue drivers?

RAG Engine: Q3 revenue grew 22% YoY, driven primarily by the launch of Atlas Tier and a 14% lift in retention from the new onboarding flow.

94% confidence • Q3-Earnings.pdf, p.4 • Retention-Memo.pdf, p.2
Sources verified

What makes it different

Built for answers you can trust.

Three layers of accuracy stacked on top of every retrieval — so users get cited answers, not eloquent guesses.

Grounded reasoning

Hybrid retrieval blends dense vectors with sparse BM25 to surface every relevant chunk before the model speaks.

Refusal guardrails

Pre-gates and PII detection refuse politely when the answer isn't in your docs — no hallucinations sneaking through.

Provider agnostic

Drop in OpenAI, Anthropic, Cohere or local models. Embeddings stay consistent across failover.

Deep dive

Premium retrieval, grounded answers, production-grade ops

Three stages, each tunable per request. Each section below shows how the engine keeps your answers honest, fast, and observable.

Under the hood — Retrieval

Hybrid retrieval that actually understands your corpus

Questions are optionally rewritten via HyDE and multi-query paraphrasing. Dense-embedding and sparse BM25 results are then fused with reciprocal-rank fusion and reranked with a cross-encoder before the model ever sees a token.

  • Hybrid + RRF

    Dense semantic search and sparse keyword recall are merged with reciprocal-rank fusion for resilient top-K.

  • HyDE & multi-query

    Tough questions get rewritten into hypothetical answers and 3-way paraphrases, then unioned and deduped.

  • Cross-encoder rerank

    BAAI/bge-reranker-base re-scores the candidate set so only the most defensible passages reach the LLM.
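
The fusion step can be sketched in a few lines of Python. This is a minimal illustration of reciprocal-rank fusion, not the engine's actual code: the function name, the chunk IDs, and the constant k=60 (a commonly used default) are all assumptions made for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60, top_k=5):
    """Fuse several ranked lists of chunk IDs into a single ranking.

    Each chunk scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed default, not a documented one.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties break on chunk ID so output is deterministic.
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [chunk_id for chunk_id, _ in fused[:top_k]]


dense_hits = ["c3", "c1", "c7"]  # hypothetical dense-vector ranking
bm25_hits = ["c1", "c9", "c3"]   # hypothetical BM25 ranking
fused_hits = reciprocal_rank_fusion([dense_hits, bm25_hits], top_k=3)
print(fused_hits)  # ['c1', 'c3', 'c9']
```

Because a chunk that appears in both lists accumulates score from each, the same function also unions and dedupes results from several query paraphrases if their rankings are passed in together.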

1. Question
2. HyDE / paraphrases
3. Hybrid (dense + BM25)
4. RRF fusion
5. Cross-encoder rerank
6. Top-K passages

Under the hood — Grounding

Every answer is verified, cited, and tunable

Span-level citations link each sentence to the exact passage that supports it. A groundedness verifier scores each answer and the engine refuses if support drops below your threshold.

  • Span-level citations

    Click a marker and a source viewer slides in, highlighting the exact span — no hand-waving.

  • Confidence + refusal

    A calibrated confidence score is shown on every reply. Answers below your floor refuse politely instead of hallucinating.

  • Eval harness

    Ships with a golden YAML set and a one-command eval runner, so you can prove uplift before every release.

The 2023 climate report attributes 42% of emissions to industrial activity [1], while transportation accounts for 29% [2].

Confidence: 0.92
Groundedness: 0.96
Refusal threshold: 0.55
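
A refusal gate over these scores might look like the sketch below. The class, the function, and the refusal wording are hypothetical, and gating on the weaker of the two scores is an assumption; the page only says the engine refuses when support drops below the threshold.

```python
from dataclasses import dataclass

# Hypothetical wording; the real refusal message is not documented here.
REFUSAL_MESSAGE = "I can't answer that from the provided documents."


@dataclass
class ScoredAnswer:
    text: str
    confidence: float    # calibrated confidence, 0.0 to 1.0
    groundedness: float  # verifier score, 0.0 to 1.0


def apply_refusal_gate(answer, threshold=0.55):
    """Return the answer text, or a refusal when support is below the threshold."""
    # Assumption: the weaker of the two scores decides.
    support = min(answer.confidence, answer.groundedness)
    if support < threshold:
        return REFUSAL_MESSAGE
    return answer.text


strong = ScoredAnswer("Industrial activity accounts for 42% of emissions.", 0.92, 0.96)
weak = ScoredAnswer("Emissions probably doubled.", 0.81, 0.40)
print(apply_refusal_gate(strong))  # passes the 0.55 floor
print(apply_refusal_gate(weak))    # refused: groundedness 0.40 is below 0.55
```
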
Under the hood — Ops & security

Production-grade operations from day one

OpenTelemetry tracing on every stage, structured logs, per-cookie config overrides, slowapi rate limiting, optional PII redaction at ingest, and webhooks for ingestion + query lifecycle events.

  • Observability

    Latency histograms, refusal rate, mean confidence, and per-question cost — surfaced in /app/status and /metrics.

  • Security

    PII redaction before embedding, per-IP / per-cookie rate limits, and clear audit trails on every mutation.

  • Automation

    Webhooks fire on ingest + query events, and an admin CLI handles ingest / query / eval / export-corpus / clear-cache.
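
To make "PII redaction before embedding" concrete, here is a minimal sketch using only the Python standard library. The two regex patterns, the placeholder labels, and the function name are illustrative assumptions; a production detector would need far broader coverage than this.

```python
import re

# Illustrative patterns only (assumed for this example), not the engine's rules.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def redact_pii(text):
    """Replace each matched span with a bracketed label before embedding."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


chunk = "Contact jane.doe@example.com or +1 415-555-0100 for the Q3 memo."
redacted = redact_pii(chunk)
print(redacted)  # Contact [EMAIL] or [PHONE] for the Q3 memo.
```

Running redaction at ingest means the raw values never reach the embedding model or the vector store, so they cannot surface later in a retrieved passage.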

p50: 412 ms
p95: 1.18 s
refusal: 2.1%

Plays nicely with the model garden

OpenAI • Anthropic • Cohere • Voyage • Qdrant • Postgres • Mistral • Hugging Face • Groq • Together

“We replaced three different internal search tools with RAG Engine. Confidence scores ended an entire category of support tickets — people now trust the answer.”


Jayesh Koli

AI Engineer