Atlas Metis: RAG Engine
Platform Status Brief
March 2026
Confidential
01 — Overview
Platform Status

The Atlas Metis RAG Engine is a multi-tenant RAG-as-a-Service platform. The core product — document ingestion, hybrid search, AI-powered answers, and real-time health monitoring — is fully operational. Scale features (batch processing, connector syncs, cost tracking) are built but require wiring before production launch.

55 API Routes
44 Python Modules
7 Health Checks per Tenant
End-to-End Pipeline Verified


02 — Operations
How a Client Gets Onboarded
Current onboarding time: < 5 minutes from tenant creation to first query.
03 — Current State
Operational Systems
Single File Ingestion
Upload PDF, DOCX, TXT, CSV, audio — auto-process to searchable vectors
Hybrid Search
Semantic + keyword search combined via Reciprocal Rank Fusion
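The fusion step above can be sketched in a few lines. This is a generic Reciprocal Rank Fusion implementation, not the platform's actual code; k=60 is the constant from the original RRF paper, and the document IDs are illustrative.

```python
def rrf_fuse(rankings, k=60):
    """Combine ranked lists of doc IDs via Reciprocal Rank Fusion.

    A document's fused score is sum(1 / (k + rank)) over every list
    it appears in, so items ranked well by either retriever rise.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

semantic = ["doc_a", "doc_b", "doc_c"]   # vector-search order
keyword  = ["doc_b", "doc_d", "doc_a"]   # keyword/BM25 order
fused = rrf_fuse([semantic, keyword])    # doc_b and doc_a lead
```

Documents appearing in both lists (doc_a, doc_b) outrank documents appearing in only one, which is the behavior that makes hybrid search robust to either retriever missing a result.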
Cohere Reranking
Cross-encoder reranking boosts raw scores from ~0.016 to ~0.97
LLM Generation
GPT-4o-mini generates grounded answers with source citations
Self-RAG Validation
Validates chunk relevance before generating — catches hallucinations
SSE Streaming
Real-time streaming responses via Server-Sent Events
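SSE frames are plain text: optional `event:` lines, `data:` lines, and a blank-line terminator. A minimal framing helper, assuming JSON payloads and an `answer` event name (both illustrative):

```python
import json

def sse_event(data, event=None):
    """Serialize one Server-Sent Events frame.

    A frame is optional `event:` plus `data:` lines, terminated by a
    blank line; browser EventSource clients reassemble them natively.
    """
    lines = []
    if event:
        lines.append(f"event: {event}")
    lines.append(f"data: {json.dumps(data)}")
    return "\n".join(lines) + "\n\n"

frame = sse_event({"token": "Hel"}, event="answer")
```

In a FastAPI service, frames like these would typically be yielded from an async generator wrapped in a `StreamingResponse` with `media_type="text/event-stream"`.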
Multi-Tenant Isolation
Verified — Tenant A cannot see Tenant B’s data
Dual API Key Auth
Tenant keys (scoped) + Admin keys (full access) with bcrypt
Job Tracking
Every ingestion tracked with status, timing, and error capture
Health Diagnostics
7 automated checks per tenant with alert creation and resolution
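The alert lifecycle (open on failure, auto-resolve on recovery) can be sketched as below. The check names and the callable-per-check shape are assumptions for illustration, not the platform's actual diagnostics API.

```python
def run_diagnostics(checks, open_alerts):
    """Run health checks; open alerts on failure, auto-resolve on pass.

    `checks` maps check name -> zero-arg callable returning True when
    healthy. `open_alerts` is the set of currently alerting check names;
    returns the per-check results plus the updated alert set.
    """
    results = {}
    alerts = set(open_alerts)
    for name, check in checks.items():
        healthy = check()
        results[name] = healthy
        if not healthy:
            alerts.add(name)        # create (or keep) the alert
        else:
            alerts.discard(name)    # auto-resolve once the check passes
    return results, alerts

checks = {"db_reachable": lambda: True, "embeddings_api": lambda: False}
results, alerts = run_diagnostics(checks, open_alerts={"db_reachable"})
# db_reachable recovered (alert resolved); embeddings_api now alerting
```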
Admin Dashboard
Fleet overview, alert center, metrics bar, diagnostics trigger
Usage Tracking
Queries, tokens, rerank units tracked per tenant per period
04 — Issues
Issues by Priority
P0 Critical — Dead Code
URL Ingestion: Endpoint creates a DB record but never fetches or processes the URL
Batch Upload: Creates job records but never processes files; only single upload works
Connector Sync: Downloads files from cloud sources but skips the entire ingestion pipeline
P1 Major Gaps
Cost Tracking: Always $0.00; no cost calculation from token counts
Collection Counters: Document/chunk counts stay at 0 after ingestion
Celery Fallbacks: Connector sync and eval runs crash without Redis; no synchronous fallback
Concept Clustering: Code is complete but never triggered during ingestion
Auth Performance: O(n) bcrypt comparisons per request; doesn't scale past ~50 keys
Query Cache: Complete module exists but is never wired into the pipeline
OAuth Refresh: Google Drive/Dropbox tokens expire after ~1 hour with no refresh
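The cost-tracking gap reduces to multiplying tracked usage counters by per-unit rates. A sketch of that calculation; the rates below are placeholders, not actual OpenAI or Cohere pricing, and the counter names mirror the usage metrics listed above:

```python
# Placeholder per-unit rates; real values must come from provider pricing.
PRICING = {
    "embedding_tokens": 0.13 / 1_000_000,   # $/token (assumed)
    "llm_input_tokens": 0.15 / 1_000_000,   # $/token (assumed)
    "llm_output_tokens": 0.60 / 1_000_000,  # $/token (assumed)
    "rerank_units": 2.00 / 1_000,           # $/rerank unit (assumed)
}

def period_cost(usage):
    """Turn per-tenant usage counters into a dollar cost for the period."""
    return round(sum(PRICING[kind] * count for kind, count in usage.items()), 6)

usage = {"embedding_tokens": 200_000, "llm_input_tokens": 1_000_000,
         "llm_output_tokens": 100_000, "rerank_units": 500}
cost = period_cost(usage)
```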
P2 Pre-Production
HyDE Fallback: Advertised but not implemented
Faithfulness Check: Code exists but is never called post-generation
Rate Limiting: Limits stored per key but never enforced
File Size Limits: No upload size cap; memory-exhaustion risk
Connector Types: 8 of 11 declared types are not implemented
05 — Architecture
Technical Architecture
External: Client App
Gateway: FastAPI (55 routes, async)
Security: Auth Layer (dual API keys, bcrypt)
Ingestion Pipeline: Parse → Chunk → Embed → Store
Retrieval Pipeline: Search → Rerank → Validate → Generate
Diagnostics: 7 Health Checks → Alerts → Auto-Resolve
Database: Supabase (pgvector, RLS)
Embeddings + LLM: OpenAI (3072-dim embeddings, GPT-4o-mini)
Reranking: Cohere (Rerank v3.5)
Task Queue: Redis / Celery (not currently running)
06 — Roadmap
Path to Production
A
Fix P0s — Critical Dead Code
Estimated: 1–2 days
  • Wire URL ingestion to sync processing pipeline
  • Wire batch upload to process files (not just create records)
  • Complete connector sync → ingestion pipeline connection
  • Add sync fallbacks for all Celery-dependent endpoints
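The sync-fallback item above follows a simple pattern: try to enqueue the task, and run it inline when the broker is unreachable. A stdlib-only sketch; `run_task` and the `enqueue` hook are hypothetical names standing in for a Celery task's `.delay()`:

```python
def run_task(task_fn, *args, enqueue=None):
    """Enqueue via Celery when a broker is available; otherwise run
    inline so the endpoint still works without Redis.

    `enqueue` stands in for `task_fn.delay`; any exception while
    queuing (e.g. connection refused) triggers the sync fallback.
    """
    if enqueue is not None:
        try:
            return {"mode": "async", "result": enqueue(*args)}
        except Exception:
            pass  # broker down: fall through to inline execution
    return {"mode": "sync", "result": task_fn(*args)}

def ingest(doc_id):
    return f"ingested:{doc_id}"

def broken_enqueue(doc_id):
    raise ConnectionError("redis unreachable")

out = run_task(ingest, "doc-42", enqueue=broken_enqueue)
# Broker unreachable, so the document is ingested synchronously
```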
B
Fix P1s — Major Gaps
Estimated: 2–3 days
  • Implement cost calculation from token counts (OpenAI/Cohere pricing)
  • Fix collection document/chunk counters
  • Wire concept clustering into ingestion pipeline
  • Add API key caching (prefix lookup) to replace O(n) bcrypt scan
  • Wire query cache module into retrieval pipeline
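The prefix-lookup fix above usually means minting keys shaped `prefix.secret`: the public prefix indexes the stored hash for an O(1) lookup, so each request costs one dict lookup plus one hash comparison instead of a bcrypt scan over every key. A stdlib sketch of that shape; sha256 stands in for bcrypt only so the example is self-contained (production should keep the slow bcrypt hash):

```python
import hashlib
import hmac
import secrets

def _digest(secret: str) -> str:
    # sha256 stand-in for bcrypt; keep bcrypt in production.
    return hashlib.sha256(secret.encode()).hexdigest()

def issue_key(store: dict) -> str:
    """Mint a `prefix.secret` key; store only the hashed secret,
    indexed by the public prefix for O(1) lookup."""
    prefix, secret = secrets.token_hex(4), secrets.token_hex(16)
    store[prefix] = _digest(secret)
    return f"{prefix}.{secret}"

def verify_key(store: dict, api_key: str) -> bool:
    """One dict lookup plus one constant-time compare per request,
    replacing the O(n) scan across all stored keys."""
    prefix, _, secret = api_key.partition(".")
    stored = store.get(prefix)
    return stored is not None and hmac.compare_digest(stored, _digest(secret))

store = {}
key = issue_key(store)
```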
C
Production Hardening
Estimated: 3–5 days
  • Deploy Celery + Redis for async processing
  • Implement rate limiting middleware
  • Add file upload size limits
  • OAuth token refresh for connectors
  • Docker Compose + Railway deployment config
  • End-to-end smoke test suite
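For the rate-limiting middleware, a per-key token bucket is a common shape: each key accrues `rate` tokens per second up to a burst `capacity`, and each request spends one. A sketch with assumed limits (2 req/s, burst of 3) and an injectable clock so the behavior is deterministic:

```python
import time

class TokenBucket:
    """Per-API-key token bucket: refills at `rate` tokens/second up
    to `capacity`; each request spends one token or is rejected."""

    def __init__(self, rate: float, capacity: float, now=time.monotonic):
        self.rate, self.capacity, self.now = rate, capacity, now
        self.tokens, self.last = capacity, now()

    def allow(self) -> bool:
        t = self.now()
        # Refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity, self.tokens + (t - self.last) * self.rate)
        self.last = t
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

# Deterministic demo with a fake clock: burst of 3, then refill.
clock = [0.0]
bucket = TokenBucket(rate=2.0, capacity=3.0, now=lambda: clock[0])
burst = [bucket.allow() for _ in range(4)]   # 3 allowed, 4th rejected
clock[0] = 1.0                               # one second later: +2 tokens
later = bucket.allow()                       # allowed again
```

In middleware, one bucket would be kept per API key (in Redis for multi-worker deployments), with the limits loaded from the per-key values already stored.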
Total estimated effort: 6–10 working days of focused work, or 2–3 weeks of calendar time to production-ready including testing and buffer.
Built by Atlas Minds
atlas-minds.com
March 2026