Technical Deep-Dive

Architecture & Technical Specifications

Everything your technical team needs to evaluate Shodh. Architecture, benchmarks, API specs, and security.


Performance

Benchmarks

Real-world performance metrics, measured on commodity hardware (8-core CPU, 32 GB RAM).

Query Latency (P95): <10 ms (on 1M+ documents)
Citation Accuracy: 97% (with source verification)
Indexing Speed: 10K docs/min (with embeddings)
Memory Footprint: ~200 MB (base system)
Concurrent Users: 1000+ (per node)
Vector Dimensions: 384-4096 (configurable)
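
To check a P95 latency figure like the one above on your own hardware, one option is to time a batch of queries and take the 95th percentile. The sketch below uses the nearest-rank method. `run_query` is a hypothetical stand-in for whatever client call you use; it is not part of the Shodh API.

```python
import math
import time
from typing import Callable, List


def percentile(samples: List[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples <= it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def measure_latencies_ms(run_query: Callable[[str], object], queries: List[str]) -> List[float]:
    """Time each call to run_query (hypothetical callable) and return latencies in ms."""
    latencies = []
    for q in queries:
        start = time.perf_counter()
        run_query(q)
        latencies.append((time.perf_counter() - start) * 1000)
    return latencies
```

Calling `percentile(measure_latencies_ms(run_query, queries), 95)` on a few thousand representative queries gives a number comparable to the P95 quoted here, provided the corpus size and hardware are similar.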

System Design

Architecture Overview

Modular components designed for flexibility and performance.


┌─────────────────────────────────────────────────────────────────┐
│                        CLIENT APPLICATION                        │
│                  (Python SDK / REST API / MCP)                   │
└─────────────────────────────┬───────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                         QUERY PIPELINE                           │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────────────┐   │
│  │   Rewriter   │→ │   Retriever  │→ │   Reranker + Filter  │   │
│  └──────────────┘  └──────────────┘  └──────────────────────┘   │
└─────────────────────────────┬───────────────────────────────────┘
                              │
        ┌─────────────────────┼─────────────────────┐
        ▼                     ▼                     ▼
┌───────────────┐    ┌───────────────┐    ┌───────────────┐
│  Vector Store │    │   BM25 Index  │    │  Graph Store  │
│    (HNSW)     │    │   (Sparse)    │    │  (Optional)   │
└───────────────┘    └───────────────┘    └───────────────┘
        ▲                     ▲
        │                     │
┌───────────────────────────────────────────────────────────────┐
│                      INDEXING PIPELINE                         │
│  ┌──────────┐  ┌───────────┐  ┌──────────┐  ┌──────────────┐  │
│  │  Parser  │→ │  Chunker  │→ │ Embedder │→ │   Storage    │  │
│  └──────────┘  └───────────┘  └──────────┘  └──────────────┘  │
└───────────────────────────────────────────────────────────────┘

Document Processor

Handles PDF, DOCX, TXT, Markdown, and HTML with structure preservation. Supports chunking with overlap and metadata extraction.

Rust + Python bindings
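
As an illustration of chunking with overlap, here is a minimal word-based sketch. The function name, the default sizes, and the choice of words as the unit are all assumptions made for the example; the actual Document Processor may chunk by tokens, sentences, or document structure.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> List[str]:
    """Split text into word-based chunks; consecutive chunks share `overlap` words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of indexing some text twice.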

Embedding Engine

Local embedding models (BGE, E5, all-MiniLM) with ONNX Runtime. Optional cloud fallback.

ONNX + SIMD optimization

Vector Store

HNSW index with configurable M and ef parameters. Supports LMDB, RocksDB, or in-memory storage.

Custom Rust implementation
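
For capacity planning, a rough estimate of index size can be derived from the vector count, the dimension, and M. The sketch below assumes float32 vectors and 2*M four-byte neighbor IDs per vector at the bottom HNSW layer, a convention used by some common HNSW implementations. Shodh's custom implementation may differ, so treat the result as an order-of-magnitude figure only.

```python
def estimate_index_bytes(num_vectors: int, dim: int, m: int = 16, bytes_per_float: int = 4) -> int:
    """Rough estimate: raw vectors plus layer-0 HNSW links.

    Assumes float32 vectors and 2*M four-byte neighbor IDs per vector at
    layer 0. Upper layers, metadata, and storage overhead are ignored.
    """
    vector_bytes = dim * bytes_per_float
    link_bytes = 2 * m * 4
    return num_vectors * (vector_bytes + link_bytes)


for dim in (384, 4096):
    size_gib = estimate_index_bytes(1_000_000, dim) / 2**30
    print(f"1M vectors at {dim} dims: about {size_gib:.2f} GiB")
```

Under these assumptions, one million vectors need roughly 1.5 GiB at 384 dimensions and roughly 15 GiB at 4096, so the dimension setting dominates memory planning far more than M does.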

Query Pipeline

Hybrid retrieval combining dense vector search, sparse lexical search (BM25), and reranking.

Configurable pipeline
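
One common way to merge dense and BM25 result lists before reranking is reciprocal rank fusion (RRF). The page does not say which fusion method Shodh uses, so the sketch below is an illustration of the general idea, with the conventional constant k = 60 and made-up document IDs.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse ranked ID lists: score(d) = sum over lists of 1 / (k + rank of d)."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


dense_hits = ["doc3", "doc1", "doc7"]   # hypothetical vector-search results
sparse_hits = ["doc1", "doc9", "doc3"]  # hypothetical BM25 results
fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)
```

RRF only uses ranks, so it needs no score normalization between the dense and sparse retrievers. Documents found by both lists rise to the top.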

Generation Layer

LLM integration with local (Ollama, llama.cpp) or cloud (OpenAI, Anthropic) providers.

Abstracted LLM interface

Citation Engine

Source attribution with page/line references. Confidence scoring and hallucination detection.

Probabilistic verification
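
To make the idea of a citation confidence score concrete, here is a deliberately simple lexical heuristic: the fraction of a claim's distinct tokens that also appear in the cited passage. This is not the probabilistic verification described above, and the function names are hypothetical. It only shows what a score between 0 and 1 attached to a citation could look like.

```python
import re
from typing import Set


def _tokens(text: str) -> Set[str]:
    """Lowercase alphanumeric tokens, de-duplicated."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def support_score(claim: str, source_passage: str) -> float:
    """Fraction of the claim's distinct tokens that also appear in the source passage."""
    claim_tokens = _tokens(claim)
    if not claim_tokens:
        return 0.0
    return len(claim_tokens & _tokens(source_passage)) / len(claim_tokens)
```

A claim whose score falls below a chosen threshold could be flagged for review. A production system would more plausibly use an entailment model or similar, since token overlap cannot detect negation or paraphrase.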

Developer API

API Reference

RESTful API with comprehensive endpoints. Python SDK available.

Core Endpoints

POST    /api/ingest          Index documents with metadata
POST    /api/query           RAG query with citations
GET     /api/search          Vector similarity search
POST    /api/chat            Conversational RAG
GET     /api/documents       List indexed documents
DELETE  /api/documents/:id   Remove document
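
For teams calling the REST API directly rather than through the SDK, a request to `/api/query` might be built as follows. Only the method and path come from the endpoint list above. The host and port, the JSON field names (`query`, `top_k`), and the bearer-token header are assumptions for illustration, so check the actual API specification before relying on them.

```python
import json
import urllib.request


def build_query_request(base_url: str, question: str, api_key: str, top_k: int = 5) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /api/query endpoint."""
    payload = {"query": question, "top_k": top_k}  # field names are assumptions
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # auth header scheme is an assumption
        },
        method="POST",
    )


# Hypothetical local deployment; replace the URL and key with your own.
req = build_query_request("http://localhost:8000", "What are the payment terms?", "YOUR_API_KEY")
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it. The example stops short of that so it runs without a server.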

Python Example

from shodh import RAG

# Initialize
rag = RAG(storage_path="./my_docs")

# Index documents
rag.ingest("contracts/", metadata={"type": "legal"})

# Query with citations
result = rag.query(
    "What are the payment terms?",
    filters={"type": "legal"},
    top_k=5
)

print(result.answer)
for citation in result.citations:
    print(f"  - {citation.source}:{citation.page}")

Enterprise Security

Security & Compliance

Built for enterprises with strict data governance requirements.

Data Isolation

Multi-tenant architecture with workspace isolation. Data never crosses tenant boundaries.

Authentication

API key authentication, JWT tokens, OAuth2 integration. Role-based access control.

Encryption

AES-256 encryption at rest. TLS 1.3 in transit. Optional client-side encryption.

Audit Logging

Complete audit trail of all operations. Exportable logs for compliance.

Data Residency

On-premise deployment ensures data never leaves your infrastructure. No cloud dependency.

Compliance Ready

Designed for GDPR, HIPAA, and SOC 2 requirements. Self-hosted for maximum control.

Infrastructure

Deployment Options

Flexible deployment to match your infrastructure requirements.

On-Premise

Full deployment on your infrastructure. Maximum control and data sovereignty.

Docker / Kubernetes ready

Air-gapped support

16GB RAM minimum

Hybrid

Local indexing with optional cloud LLM. Best of both worlds.

Data stays local

Use any LLM provider

8GB RAM minimum

Edge / Fleet

Distributed deployment for IoT, robotics, and multi-site scenarios.

Zenoh mesh networking

Offline-first design

ARM64 support

Ecosystem

Integrations

Connect with your existing tools and workflows.

Python SDK: Available
REST API: Available
MCP Server: Available
LangChain: Coming Soon
LlamaIndex: Coming Soon
Ollama: Supported

Ready to Evaluate?

Try it yourself in Google Colab, or schedule a technical deep-dive with our team.
