Self-Hosted · AI-Powered

Your organization's second brain

Transform scattered documents, emails, and files into an intelligent knowledge system. AI-generated answers with verifiable citations, powered by your data, on your infrastructure.

$ hippocortex query "What were the key decisions in the Q4 planning meeting?"

Searching 2,847 documents across 14 sources...
Found 12 relevant chunks · Reranked · ACL filtered

The Q4 planning meeting established three key priorities:
1. Migrate auth service to OAuth 2.1 [meeting-notes-oct12.pdf:3]
2. Launch self-serve onboarding by Nov 15 [slack-thread-#product:142]
3. Reduce P95 latency below 200ms [jira-PERF-847, email-cto-oct14]

3 sources · 4 citations · confidence: high
Hybrid · Vector + Keyword Search
441+ · Tests Passing
100% · Self-Hosted
Zero · Vendor Lock-In

Your knowledge is scattered.
Your search is broken.

Knowledge workers spend 20% of their time searching for information. Traditional tools can't connect the dots.

Without Hippocortex

  • Keyword search misses context

    Searching "Q4 decisions" doesn't find the email that says "we agreed to prioritize latency"

  • Answers without sources

    AI chatbots hallucinate confidently, with no way to verify where information came from

  • No access control

    Sensitive documents leak into AI responses, and HR data gets mixed with public content

  • Knowledge silos

    Information stays trapped in emails, Slack, Drive, and wikis, with no unified view

With Hippocortex

  • Semantic understanding

    Hybrid vector + keyword search finds answers by meaning, not just matching words

  • Every answer has citations

    Click any claim to see the original source. Frozen snapshots preserve provenance

  • ACL filtering built in

    Permissions checked before content reaches the AI. 13 red-team tests verify isolation

  • Unified knowledge base

    One system ingests all sources and builds an auto-generated wiki organized by topic

Everything you need for
intelligent knowledge management

A complete system, not a library. Ingest, search, synthesize, and govern your organization's knowledge.

Hybrid Search

Vector embeddings + full-text search fused via Reciprocal Rank Fusion. Graph-enhanced retrieval connects entities across documents.
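
Reciprocal Rank Fusion itself is only a few lines: every ranked list contributes 1 / (k + rank) for each chunk it returns, and the sums decide the fused order. A minimal sketch of the idea in Python (the function name and the k=60 default are illustrative, not Hippocortex's internal API):

def rrf_fuse(vector_hits, keyword_hits, k=60):
    """Fuse two best-first lists of chunk IDs via Reciprocal Rank Fusion."""
    scores = {}
    for hits in (vector_hits, keyword_hits):
        for rank, chunk_id in enumerate(hits, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# A chunk ranked 2nd by vectors and 5th by keywords (1/62 + 1/65)
# outscores one that is 1st in only a single list (1/61).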

Enterprise Governance

Fine-grained ACLs, sensitivity labels, pre-ranking permission filtering. GDPR-ready with tombstone propagation for right-to-forget.
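
Pre-ranking permission filtering means the candidate set is restricted to what the caller may read before any scoring or generation happens. A hedged sketch of that pattern as a single Postgres query (the table, column, and parameter names are assumptions for illustration, not the real schema):

ACL_FILTERED_CANDIDATES = """
    SELECT c.id, c.content
    FROM chunks c
    JOIN document_acls a ON a.document_id = c.document_id
    WHERE a.principal_id = ANY(%(principals)s)        -- the caller's user + group IDs
      AND c.sensitivity <= %(max_sensitivity)s        -- sensitivity-label ceiling
    ORDER BY c.embedding <=> %(query_embedding)s      -- pgvector cosine distance
    LIMIT 50;
"""
# Because the permission join lives inside the retrieval query itself,
# chunks the caller cannot read never reach reranking or the LLM.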

Multi-Source Ingestion

Email (Gmail, Outlook), cloud drives, PDFs, web pages, Markdown. Thread-aware email processing with quote detection and delta extraction.
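
Delta extraction keeps only the text a reply actually adds, so quoted history isn't indexed once per message in a thread. A simplified sketch of quote detection (the real extractors are format-aware and handle many more client styles):

import re

QUOTE_MARKERS = (
    re.compile(r"^>"),                                       # "> quoted line"
    re.compile(r"^On .+ wrote:$"),                           # reply attribution
    re.compile(r"^-{2,}\s*Original Message\s*-{2,}", re.I),
)

def extract_delta(body: str) -> str:
    """Return only the lines a message adds on top of its quoted history."""
    fresh = []
    for line in body.splitlines():
        if any(marker.match(line.strip()) for marker in QUOTE_MARKERS):
            break                                            # everything below is quoted
        fresh.append(line)
    return "\n".join(fresh).strip()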

Verifiable Citations

Every AI-generated answer links to source chunks with frozen snapshots. Click to verify. Provenance tracking from ingestion to answer.
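
One way to picture the provenance chain is a small record per citation that pins an answer to an immutable snapshot of the chunk it came from; the field names below are illustrative, not the actual data model:

from dataclasses import dataclass

@dataclass(frozen=True)
class Citation:
    answer_id: str         # the generated answer this citation supports
    chunk_id: str          # content-hash ID of the retrieved chunk
    source_uri: str        # e.g. "meeting-notes-oct12.pdf" or a mail thread
    snapshot_sha256: str   # hash of the snapshot frozen at ingestion time
    span: tuple[int, int]  # character offsets of the cited text in the chunk

# Because the snapshot is frozen at ingestion, the citation still resolves
# even if the upstream document is later edited or deleted.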

Auto-Generated Wiki

AI synthesizes clean, topic-organized wiki pages from raw documents. Taxonomy with 5 categories, bidirectional linking, and staleness detection.
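
Staleness detection can be as simple as comparing when a wiki page was last synthesized against the newest ingestion timestamp among the chunks it cites; a small sketch under that assumption (not the actual implementation):

from datetime import datetime

def is_stale(page_generated_at: datetime, cited_chunk_times: list[datetime]) -> bool:
    """A wiki page is stale once any chunk it cites changed after it was written."""
    return any(t > page_generated_at for t in cited_chunk_times)

# A page synthesized on Oct 1 is flagged for re-synthesis as soon as one of
# its cited documents is re-ingested with changes on Oct 14.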

Knowledge Graph

Automatic entity extraction (11 types), relationship mapping, and interactive graph visualization. Multi-hop reasoning across documents.
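
Multi-hop reasoning over that graph amounts to following typed edges between entities extracted from different documents. A toy sketch with an in-memory adjacency map (the triples and helper are illustrative; per the stack below, the real graph lives in Postgres):

from collections import defaultdict

# (subject, relation, object) triples produced by the enrichment stage
triples = [
    ("auth-service", "owned_by", "platform-team"),
    ("platform-team", "decided_in", "q4-planning-meeting"),
]

graph = defaultdict(list)
for subj, rel, obj in triples:
    graph[subj].append((rel, obj))

def reachable(start: str, hops: int) -> set[str]:
    """Entities reachable from `start` in at most `hops` edges."""
    frontier, seen = {start}, {start}
    for _ in range(hops):
        frontier = {obj for node in frontier for _, obj in graph[node]} - seen
        seen |= frontier
    return seen - {start}

# Two hops from "auth-service" reach the meeting where its owners made a
# decision, even though no single document states both facts together.
print(reachable("auth-service", hops=2))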

From raw data to
intelligent answers

A production-grade pipeline that processes, enriches, and indexes your documents for instant retrieval.

1. Ingest

Connect your sources. Upload documents, link email accounts, paste URLs. Connectors handle authentication and incremental sync.

Gmail · Outlook · PDF · Web · S3
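
Incremental sync usually boils down to persisting a cursor per connector and only fetching items newer than it on the next run; a sketch of that pattern (the connector interface shown is a placeholder, not the real one):

from datetime import datetime, timezone

# last successful sync time per connector; persisted in a real deployment
sync_state: dict[str, datetime] = {}

def incremental_sync(name: str, fetch_changes, ingest) -> None:
    """Fetch only items changed since the last run and hand them to the pipeline.

    fetch_changes(since) returns new or updated items (everything when since
    is None); ingest(item) pushes one item into parse -> chunk -> enrich.
    """
    cursor = sync_state.get(name)              # None on the very first run
    for item in fetch_changes(since=cursor):
        ingest(item)
    sync_state[name] = datetime.now(timezone.utc)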

2. Parse & Chunk

Documents are parsed with format-aware extractors, then split into semantic chunks that preserve context and structure.

Markdown output · 200-1500 tokens · Content-hash IDs
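
Content-hash IDs make chunking idempotent: re-ingesting an unchanged document yields the same IDs, so nothing is re-embedded. A simplified sketch that splits on blank lines (the real chunker is semantic and targets the 200-1500 token range):

import hashlib

def chunk_markdown(markdown: str, max_chars: int = 4000) -> list[dict]:
    """Split Markdown into blocks of up to ~max_chars, keyed by content hash."""
    pieces, buf = [], ""
    for block in markdown.split("\n\n"):
        if buf and len(buf) + len(block) > max_chars:
            pieces.append(buf)
            buf = block
        else:
            buf = f"{buf}\n\n{block}" if buf else block
    if buf:
        pieces.append(buf)
    return [
        {"id": hashlib.sha256(text.encode()).hexdigest()[:16], "text": text}
        for text in pieces
    ]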

3. Enrich

LLMs extract entities, assign topic tags, generate summaries, and classify content. All enrichments run in parallel for speed.

11 entity types · Topic tags · Summaries
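
Running the enrichments concurrently per chunk maps naturally onto asyncio; in this sketch the helpers are stand-ins for the real LLM calls:

import asyncio

# Stand-ins for the LLM-backed enrichment calls.
async def extract_entities(text: str): return ["auth-service"]
async def assign_topic_tags(text: str): return ["infrastructure"]
async def summarize(text: str): return text[:120]
async def classify(text: str): return "internal"

async def enrich(chunk_text: str) -> dict:
    """All four enrichments for one chunk run in parallel, not sequentially."""
    entities, tags, summary, label = await asyncio.gather(
        extract_entities(chunk_text),
        assign_topic_tags(chunk_text),
        summarize(chunk_text),
        classify(chunk_text),
    )
    return {"entities": entities, "tags": tags, "summary": summary, "label": label}

print(asyncio.run(enrich("The auth service will migrate to OAuth 2.1 in Q4.")))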

4. Embed & Index

Vector embeddings stored in PostgreSQL with pgvector HNSW index. Full-text search via GIN index. No separate vector database needed.

pgvector HNSW · tsvector GIN · Postgres-canonical
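
Because Postgres is canonical, a single table can carry both indexes; a hedged sketch of what that DDL could look like (table and column names are assumptions, and vector(512) matches the 512-dimensional Voyage AI embeddings listed in the stack below):

SCHEMA_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE IF NOT EXISTS chunks (
    id          text PRIMARY KEY,    -- content-hash ID from the chunking step
    document_id text NOT NULL,
    content     text NOT NULL,
    embedding   vector(512),         -- 512-dim embedding
    tsv         tsvector GENERATED ALWAYS AS (to_tsvector('english', content)) STORED
);

-- Approximate nearest-neighbour search over embeddings
CREATE INDEX IF NOT EXISTS chunks_embedding_hnsw
    ON chunks USING hnsw (embedding vector_cosine_ops);

-- Full-text keyword search over the same rows
CREATE INDEX IF NOT EXISTS chunks_tsv_gin
    ON chunks USING gin (tsv);
"""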

5. Search & Generate

Hybrid search finds relevant chunks, ACL filtering enforces permissions, reranking prioritizes quality, and an LLM generates cited answers.

RRF fusion · ACL filter · Rerank · Citations
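
Putting those stages in order, the answer path reads roughly like this; each stage callable here is a placeholder for the real retrieval, permission, reranking, and generation services:

def answer(query: str, principals: list[str], *, search, readable_by, rerank, generate):
    """Step 5 end to end: hybrid search, ACL filter, rerank, cited generation."""
    candidates = search(query)                                         # RRF-fused vector + keyword hits
    permitted = [c for c in candidates if readable_by(c, principals)]  # permissions before quality
    top_chunks = rerank(query, permitted)[:12]                         # best chunks for the prompt
    return generate(query, top_chunks)                                 # answer text with per-claim citations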

Production-grade stack,
no exotic dependencies

Proven technologies that your team already knows. Everything runs in Docker Compose on a single machine.

FastAPI · Async Python backend
PostgreSQL · Vectors + relations + graph
Redis · Cache + rate limiting
Prefect · Durable orchestration
Next.js · React 19 frontend
Docker · Full stack in 5 containers
Voyage AI · Embeddings (512-dim)
OpenRouter · LLM gateway + fallbacks

Built different

Not a library, not a vector database. A complete, self-hosted knowledge system.

Capability comparison (Hippocortex vs. RAG libraries, commercial RAG platforms, and vector DBs):

  • Self-hosted / data control
  • Production-ready application
  • ACL / governance
  • Knowledge graph
  • Citation provenance (partial support elsewhere)
  • Auto wiki synthesis
  • No vendor lock-in (partial support elsewhere)
  • Single-DB architecture

Ready to build your
organization's memory?

Deploy on your infrastructure with Docker Compose. No vendor lock-in, no usage-based pricing, full data control.

docker compose up -d