Technical Deep-Dive

Architecture & Technical Specifications

Everything your technical team needs to evaluate Shodh: architecture, benchmarks, API specs, and security.

Performance

Benchmarks

Real-world performance metrics, measured on commodity hardware (8-core CPU, 32 GB RAM).

Query Latency (P95): <10 ms (on 1M+ documents)
Citation Accuracy: 97% (with source verification)
Indexing Speed: 10K docs/min (with embeddings)
Memory Footprint: ~200 MB (base system)
Concurrent Users: 1000+ (per node)
Vector Dimensions: 384-4096 (configurable)
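For teams reproducing the latency figure on their own hardware: a P95 value is the latency at or below which 95% of queries complete. The sketch below computes it with the nearest-rank method. The `percentile` helper and the sample latencies are illustrative assumptions, not Shodh code or Shodh measurements.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least
    pct percent of all samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(pct / 100 * len(ordered))  # 1-based rank
    return ordered[max(rank, 1) - 1]

# Hypothetical per-query latencies in milliseconds (illustrative only).
latencies_ms = [4.1, 5.0, 3.8, 6.2, 9.7, 4.4, 5.5, 7.9, 4.9, 12.3]
p95 = percentile(latencies_ms, 95)
print(f"P95 latency: {p95} ms")
```

With only a handful of samples the P95 is dominated by the slowest query, so a meaningful comparison needs thousands of measured queries against a realistically sized index.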
System Design

Architecture Overview

Modular components designed for flexibility and performance.


┌─────────────────────────────────────────────────────────────────┐
│                        CLIENT APPLICATION                        │
│                  (Python SDK / REST API / MCP)                   │
└─────────────────────────────┬───────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                         QUERY PIPELINE                           │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────────────┐   │
│  │   Rewriter   │→ │   Retriever  │→ │   Reranker + Filter  │   │
│  └──────────────┘  └──────────────┘  └──────────────────────┘   │
└─────────────────────────────┬───────────────────────────────────┘
                              │
        ┌─────────────────────┼─────────────────────┐
        ▼                     ▼                     ▼
┌───────────────┐    ┌───────────────┐    ┌───────────────┐
│  Vector Store │    │   BM25 Index  │    │  Graph Store  │
│    (HNSW)     │    │   (Sparse)    │    │  (Optional)   │
└───────────────┘    └───────────────┘    └───────────────┘
        ▲                     ▲
        │                     │
┌───────────────────────────────────────────────────────────────┐
│                      INDEXING PIPELINE                         │
│  ┌──────────┐  ┌───────────┐  ┌──────────┐  ┌──────────────┐  │
│  │  Parser  │→ │  Chunker  │→ │ Embedder │→ │   Storage    │  │
│  └──────────┘  └───────────┘  └──────────┘  └──────────────┘  │
└───────────────────────────────────────────────────────────────┘
              

Document Processor

Handles PDF, DOCX, TXT, Markdown, and HTML while preserving document structure. Supports chunking with overlap and metadata extraction.

Rust + Python bindings
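Chunking with overlap means consecutive chunks share some text, so a sentence cut at one chunk boundary still appears whole in a neighbouring chunk. A minimal character-based sketch follows; `chunk_text` and its parameters are hypothetical and are not Shodh's actual chunker, which may split on tokens or document structure instead.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size character windows that overlap.

    Each chunk records its start offset so a citation can point back
    to the position in the source document.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append({"start": start, "text": text[start:start + chunk_size]})
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks

for chunk in chunk_text("abcdefghij", chunk_size=4, overlap=2):
    print(chunk["start"], chunk["text"])
```

A larger overlap improves recall at chunk boundaries but increases the number of chunks to embed and store.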

Embedding Engine

Local embedding models (BGE, E5, all-MiniLM) with ONNX Runtime. Optional cloud fallback.

ONNX + SIMD optimization

Vector Store

HNSW index with configurable M and ef parameters. Supports LMDB, RocksDB, or in-memory storage.

Custom Rust implementation
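When choosing M and a vector dimension, a rough capacity estimate helps. The sketch below uses the rule of thumb documented for the open-source hnswlib library (about `dim * 4 + M * 2 * 4` bytes per stored vector). Whether this matches Shodh's custom Rust implementation is an assumption, and `hnsw_memory_bytes` is a hypothetical helper, so treat the result as an order-of-magnitude guide only.

```python
def hnsw_memory_bytes(num_vectors, dim, m, bytes_per_float=4, bytes_per_link=4):
    """Approximate HNSW index size using the hnswlib rule of thumb.

    Assumes float32 vectors and roughly 2 * M neighbour links per
    element on the base layer; upper layers and metadata are ignored.
    """
    vector_bytes = dim * bytes_per_float
    link_bytes = m * 2 * bytes_per_link
    return num_vectors * (vector_bytes + link_bytes)

# Example: 1M vectors at 384 dimensions with M = 16.
size = hnsw_memory_bytes(1_000_000, dim=384, m=16)
print(f"{size / 1e9:.2f} GB")
```

Raising M improves recall on high-dimensional data at the cost of memory and build time, while ef trades query latency for recall at search time.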

Query Pipeline

Hybrid retrieval combining dense vector search, sparse lexical search (BM25), and reranking.

Configurable pipeline
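One common way to merge a dense result list with a BM25 result list before reranking is reciprocal rank fusion (RRF). The page does not state which fusion method Shodh uses, so the sketch below is an illustration of the general technique under that assumption; the function name and document IDs are made up.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs.

    Each document scores 1 / (k + rank) per list it appears in;
    scores are summed and documents are returned best first.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score, breaking ties by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

dense_hits = ["doc3", "doc1", "doc7"]   # hypothetical vector-search results
sparse_hits = ["doc1", "doc9", "doc3"]  # hypothetical BM25 results
fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)
```

RRF only needs ranks, not raw scores, which sidesteps the problem that cosine similarities and BM25 scores live on different scales.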

Generation Layer

LLM integration with local (Ollama, llama.cpp) or cloud (OpenAI, Anthropic) providers.

Abstracted LLM interface

Citation Engine

Source attribution with page/line references. Confidence scoring and hallucination detection.

Probabilistic verification
Developer API

API Reference

RESTful API with comprehensive endpoints. Python SDK available.

Core Endpoints
POST    /api/ingest         Index documents with metadata
POST    /api/query          RAG query with citations
GET     /api/search         Vector similarity search
POST    /api/chat           Conversational RAG
GET     /api/documents      List indexed documents
DELETE  /api/documents/:id  Remove document
Python Example
from shodh import RAG

# Initialize
rag = RAG(storage_path="./my_docs")

# Index documents
rag.ingest("contracts/", metadata={"type": "legal"})

# Query with citations
result = rag.query(
    "What are the payment terms?",
    filters={"type": "legal"},
    top_k=5
)

print(result.answer)
for citation in result.citations:
    print(f"  - {citation.source}:{citation.page}")
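The same query can in principle be issued against the REST endpoint without the SDK. The sketch below only builds the HTTP request using Python's standard library. The base URL, the JSON field names (`query`, `filters`, `top_k`, mirrored from the SDK keyword arguments), and the bearer-token header are all assumptions, since the page does not publish the request schema.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumed host and port

def build_query_request(question, filters=None, top_k=5, api_key=None):
    """Build (but do not send) a POST request for /api/query.

    The payload shape is a guess modelled on the SDK call; check the
    real API reference before relying on it.
    """
    payload = {"query": question, "filters": filters or {}, "top_k": top_k}
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"  # assumed auth scheme
    return urllib.request.Request(
        f"{BASE_URL}/api/query",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )

req = build_query_request("What are the payment terms?", filters={"type": "legal"})
print(req.get_method(), req.full_url)
```

To send it against a running server, pass the request to `urllib.request.urlopen(req)` and decode the JSON response body.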
Enterprise Security

Security & Compliance

Built for enterprises with strict data governance requirements.

Data Isolation

Multi-tenant architecture with workspace isolation. Data never crosses tenant boundaries.

Authentication

API key authentication, JWT tokens, OAuth2 integration. Role-based access control.

Encryption

AES-256 encryption at rest. TLS 1.3 in transit. Optional client-side encryption.

Audit Logging

Complete audit trail of all operations. Exportable logs for compliance.

Data Residency

On-premise deployment ensures data never leaves your infrastructure. No cloud dependency.

Compliance Ready

Designed to support GDPR, HIPAA, and SOC 2 requirements. Self-hosted for maximum control.

Infrastructure

Deployment Options

Flexible deployment to match your infrastructure requirements.

On-Premise

Full deployment on your infrastructure. Maximum control and data sovereignty.

Docker / Kubernetes ready
Air-gapped support
16GB RAM minimum

Hybrid

Local indexing with optional cloud LLM. Best of both worlds.

Data stays local
Use any LLM provider
8GB RAM minimum

Edge / Fleet

Distributed deployment for IoT, robotics, and multi-site scenarios.

Zenoh mesh networking
Offline-first design
ARM64 support
Ecosystem

Integrations

Connect with your existing tools and workflows.

Python SDK: Available
REST API: Available
MCP Server: Available
LangChain: Coming Soon
LlamaIndex: Coming Soon
Ollama: Supported

Ready to Evaluate?

Try it yourself in Google Colab, or schedule a technical deep-dive with our team.