Architecture & Technical Specifications
Everything your technical team needs to evaluate Shodh: architecture, benchmarks, API specs, and security.
Benchmarks
Real-world performance metrics, measured on commodity hardware (8-core CPU, 32 GB RAM).
Architecture Overview
Modular components designed for flexibility and performance.
┌─────────────────────────────────────────────────────────────────┐
│                       CLIENT APPLICATION                        │
│                  (Python SDK / REST API / MCP)                  │
└─────────────────────────────┬───────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                         QUERY PIPELINE                          │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────────────┐   │
│  │   Rewriter   │→ │  Retriever   │→ │  Reranker + Filter   │   │
│  └──────────────┘  └──────────────┘  └──────────────────────┘   │
└─────────────────────────────┬───────────────────────────────────┘
                              │
        ┌─────────────────────┼─────────────────────┐
        ▼                     ▼                     ▼
┌───────────────┐     ┌───────────────┐     ┌───────────────┐
│ Vector Store  │     │  BM25 Index   │     │  Graph Store  │
│    (HNSW)     │     │   (Sparse)    │     │  (Optional)   │
└───────────────┘     └───────────────┘     └───────────────┘
        ▲                     ▲
        │                     │
┌───────────────────────────────────────────────────────────────┐
│                       INDEXING PIPELINE                       │
│  ┌──────────┐  ┌───────────┐  ┌──────────┐  ┌──────────────┐  │
│  │  Parser  │→ │  Chunker  │→ │ Embedder │→ │   Storage    │  │
│  └──────────┘  └───────────┘  └──────────┘  └──────────────┘  │
└───────────────────────────────────────────────────────────────┘
Document Processor
Handles PDF, DOCX, TXT, Markdown, HTML with structure preservation. Chunking with overlap, metadata extraction.
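To make "chunking with overlap" concrete, here is a minimal character-based sketch. The function name `chunk_text`, its parameters, and the returned dictionary shape are illustrative assumptions, not Shodh's actual chunker, which may well split on tokens or document structure instead.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[dict]:
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper for illustration only; not part of the Shodh SDK.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size]
        # Keep character offsets as simple metadata for later citation.
        chunks.append({"text": piece, "start": start, "end": start + len(piece)})
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


chunks = chunk_text("abcdefghij", chunk_size=4, overlap=2)
for c in chunks:
    print(c["start"], c["end"], c["text"])
```

Each chunk shares its last `overlap` characters with the start of the next one, so a sentence that straddles a boundary still appears whole in at least one chunk when it is shorter than the overlap.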
Embedding Engine
Local embedding models (BGE, E5, all-MiniLM) with ONNX Runtime. Optional cloud fallback.
Vector Store
HNSW index with configurable M and ef parameters. Supports LMDB, RocksDB, or in-memory storage.
Query Pipeline
Hybrid retrieval combining dense vector search, sparse lexical scoring (BM25), and reranking.
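One common way to merge a dense ranking with a BM25 ranking is reciprocal rank fusion (RRF). The sketch below is an assumption about how such a merge could work; the page does not say which fusion method Shodh uses, and the function name and the constant `k=60` are illustrative.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in.
    Hypothetical helper; RRF is swapped in here purely as an example.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


dense_hits = ["doc3", "doc1", "doc2"]   # e.g. from the vector store
sparse_hits = ["doc1", "doc4", "doc3"]  # e.g. from the BM25 index
fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, and the fused list would then be passed to a reranker for final ordering.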
Generation Layer
LLM integration with local (Ollama, llama.cpp) or cloud (OpenAI, Anthropic) providers.
Citation Engine
Source attribution with page/line references. Confidence scoring and hallucination detection.
API Reference
RESTful API with comprehensive endpoints. Python SDK available.
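For teams calling the REST API directly rather than through the SDK, a request to the query endpoint could be built roughly as follows. This is a sketch under stated assumptions: the POST method, the JSON field names (`query`, `filters`, `top_k`), the `Authorization: Bearer` header, and the `localhost:8000` base URL are all hypothetical, since the page lists only the endpoint paths.

```python
import json
import urllib.request


def build_query_request(base_url, question, filters=None, top_k=5, api_key=None):
    """Build (but do not send) a request for the /api/query endpoint.

    Payload shape, HTTP method, and auth header are assumptions.
    """
    payload = {"query": question, "filters": filters or {}, "top_k": top_k}
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"  # hypothetical scheme
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/query",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


req = build_query_request(
    "http://localhost:8000",  # assumed local deployment address
    "What are the payment terms?",
    filters={"type": "legal"},
)
print(req.get_method(), req.full_url)
# To send it: urllib.request.urlopen(req) against a running server.
```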
/api/ingest          Index documents with metadata
/api/query           RAG query with citations
/api/search          Vector similarity search
/api/chat            Conversational RAG
/api/documents       List indexed documents
/api/documents/:id   Remove document
from shodh import RAG
# Initialize
rag = RAG(storage_path="./my_docs")
# Index documents
rag.ingest("contracts/", metadata={"type": "legal"})
# Query with citations
result = rag.query(
    "What are the payment terms?",
    filters={"type": "legal"},
    top_k=5
)
print(result.answer)
for citation in result.citations:
    print(f" - {citation.source}:{citation.page}")
Security & Compliance
Built for enterprises with strict data governance requirements.
Data Isolation
Multi-tenant architecture with workspace isolation. Data never crosses tenant boundaries.
Authentication
API key authentication, JWT tokens, OAuth2 integration. Role-based access control.
Encryption
AES-256 encryption at rest. TLS 1.3 in transit. Optional client-side encryption.
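On the client side, the TLS 1.3 requirement can be enforced with Python's standard `ssl` module. This is a minimal sketch of a client context that refuses older protocol versions; the helper name is illustrative, and how a Shodh server is configured for TLS is not covered on this page.

```python
import ssl


def make_tls13_context() -> ssl.SSLContext:
    """Create a client context that only negotiates TLS 1.3 or newer."""
    # create_default_context() enables certificate and hostname verification.
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    return ctx


ctx = make_tls13_context()
print(ctx.minimum_version.name)
```

The context can be passed to `urllib.request.urlopen(..., context=ctx)` or `http.client.HTTPSConnection(..., context=ctx)`, so a connection to a server that only offers TLS 1.2 fails during the handshake instead of silently downgrading.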
Audit Logging
Complete audit trail of all operations. Exportable logs for compliance.
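An exportable audit trail is often stored as one JSON object per line (JSON Lines). The sketch below shows what a single entry might look like; the field names and the `audit_record` helper are assumptions for illustration, not Shodh's actual log schema.

```python
import json
from datetime import datetime, timezone


def audit_record(actor: str, action: str, resource: str, **details) -> str:
    """Serialize one audit event as a single JSON line.

    Hypothetical schema; a real deployment would define its own fields.
    """
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "actor": actor,        # who performed the operation
        "action": action,      # e.g. "ingest", "query", "delete"
        "resource": resource,  # e.g. a workspace or document ID
        "details": details,    # free-form extra context
    }
    return json.dumps(entry, sort_keys=True)


line = audit_record("alice", "query", "workspace-1", top_k=5)
print(line)
```

Appending each line to a file gives a log that standard compliance tooling can ingest without a custom parser.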
Data Residency
On-premise deployment ensures data never leaves your infrastructure. No cloud dependency.
Compliance Ready
Designed for GDPR, HIPAA, and SOC 2 requirements. Self-hosted for maximum control.
Deployment Options
Flexible deployment to match your infrastructure requirements.
On-Premise
Full deployment on your infrastructure. Maximum control and data sovereignty.
Hybrid
Local indexing with optional cloud LLM. Best of both worlds.
Edge / Fleet
Distributed deployment for IoT, robotics, and multi-site scenarios.
Integrations
Connect with your existing tools and workflows.
Ready to Evaluate?
Try it yourself in Google Colab, or schedule a technical deep-dive with our team.