Terraphim AI is a local-first semantic search engine built in Rust. It constructs knowledge graphs from your documents, matches patterns with compiled automata, and ranks results through graph traversal. Every computation happens on your machine.
Terraphim indexes documents from multiple sources -- local files, Confluence wikis, Jira tickets, Discourse forums, email inboxes -- and builds a knowledge graph unique to each role you configure. The graph captures concepts and their relationships; concept mentions are matched in document text by Aho-Corasick automata.
Search queries are expanded through a thesaurus, scored by your chosen relevance function (TitleScorer, BM25Plus, or TerraphimGraph), and returned with sub-millisecond latency. The entire 29-crate Rust workspace compiles to native binaries, WebAssembly, and Tauri desktop applications. Built by Applied Knowledge Systems Ltd.
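A minimal sketch of that pipeline: expand the query through a synonym thesaurus, then rank documents by expanded-term overlap. Function names (`expand_query`, `score`) and the toy overlap scorer are illustrative only; the real engine uses the scorers named above.

```rust
use std::collections::HashMap;

/// Expand a query term with its thesaurus synonyms.
/// Hypothetical helper, not Terraphim's actual API.
fn expand_query<'a>(term: &'a str, thesaurus: &HashMap<&str, Vec<&'a str>>) -> Vec<&'a str> {
    let mut terms = vec![term];
    if let Some(syns) = thesaurus.get(term) {
        terms.extend(syns.iter().copied());
    }
    terms
}

/// Toy relevance: count how many expanded terms appear in the document.
fn score(doc: &str, terms: &[&str]) -> usize {
    terms.iter().filter(|t| doc.contains(**t)).count()
}
```

With a thesaurus mapping `"kg"` to `"knowledge graph"`, a search for `kg` also retrieves documents that only mention the full phrase.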
Each role maintains its own knowledge graph. Concept nodes are extracted from indexed documents, and semantic edges encode the relationships between them. The thesaurus provides synonym expansion for improved recall, while path-connectivity checks between matched concepts keep search results coherent.
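A connectivity check of that kind can be sketched as BFS over an adjacency map. The `ConceptGraph` type here is a hypothetical stand-in for Terraphim's internal graph structures.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Minimal per-role concept graph with undirected semantic edges.
struct ConceptGraph {
    edges: HashMap<String, Vec<String>>,
}

impl ConceptGraph {
    fn new() -> Self {
        Self { edges: HashMap::new() }
    }

    fn add_edge(&mut self, a: &str, b: &str) {
        self.edges.entry(a.to_string()).or_default().push(b.to_string());
        self.edges.entry(b.to_string()).or_default().push(a.to_string());
    }

    /// Breadth-first search: is there any path between two concepts?
    fn connected(&self, from: &str, to: &str) -> bool {
        let mut seen = HashSet::new();
        let mut queue = VecDeque::new();
        queue.push_back(from.to_string());
        while let Some(node) = queue.pop_front() {
            if node == to {
                return true;
            }
            if seen.insert(node.clone()) {
                if let Some(next) = self.edges.get(&node) {
                    queue.extend(next.iter().cloned());
                }
            }
        }
        false
    }
}
```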
Aho-Corasick automata built from thesaurus entries perform simultaneous multi-pattern matching in a single text pass. LeftmostLongest matching strategy. The engine compiles to WebAssembly at approximately 200KB compressed, enabling browser-based autocomplete with identical matching behaviour.
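The LeftmostLongest semantics can be shown with a naive scanner: at each position, the earliest match wins, and among matches starting at the same position the longest pattern wins. This sketch assumes ASCII input and scans per pattern; the real engine gets the same answer from a single automaton pass.

```rust
/// Naive leftmost-longest multi-pattern matcher (illustrative only).
/// Returns (byte offset, matched text) pairs for ASCII input.
fn leftmost_longest<'a>(text: &'a str, patterns: &[&str]) -> Vec<(usize, &'a str)> {
    let mut matches = Vec::new();
    let mut i = 0;
    while i < text.len() {
        // Among patterns matching at this position, prefer the longest.
        let best = patterns
            .iter()
            .filter(|p| text[i..].starts_with(**p))
            .max_by_key(|p| p.len());
        match best {
            Some(p) => {
                matches.push((i, &text[i..i + p.len()]));
                i += p.len(); // skip past the match: no overlaps
            }
            None => i += 1,
        }
    }
    matches
}
```

Given patterns `"graph"` and `"knowledge graph"`, the text `"a knowledge graph"` yields a single match for the longer phrase rather than two fragments.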
Seven pluggable data source types: ripgrep for local files, Atlassian for Confluence and Jira, Discourse for forums, JMAP for email, Quickwit for log analysis, MCP for AI tool integration, and Atomic Server for linked data. Each implements a common indexing interface.
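Such a shared interface might look like the trait below. The trait name, methods, and the stubbed ripgrep connector are assumptions for illustration, not the actual Terraphim types.

```rust
/// A document returned by any source connector.
struct Document {
    id: String,
    body: String,
}

/// Hypothetical common indexing interface the seven connectors could share.
trait IndexSource {
    /// Connector name, e.g. "ripgrep" or "jmap".
    fn name(&self) -> &'static str;
    /// Fetch documents matching a needle from this backend.
    fn index(&self, needle: &str) -> Vec<Document>;
}

struct RipgrepSource {
    root: String,
}

impl IndexSource for RipgrepSource {
    fn name(&self) -> &'static str {
        "ripgrep"
    }

    fn index(&self, needle: &str) -> Vec<Document> {
        // A real connector would shell out to ripgrep over `self.root`;
        // this stub returns a canned result to keep the sketch self-contained.
        vec![Document {
            id: format!("{}/note.md", self.root),
            body: needle.to_string(),
        }]
    }
}
```

Because callers hold a `dyn IndexSource`, adding an eighth backend is a new impl, not a change to the search core.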
Multi-backend storage with transparent cache warm-up. Memory, DashMap, SQLite, and S3 backends ordered by speed. Zstd compression for objects over 1 MB.
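The warm-up behaviour amounts to: probe tiers fastest-first, and on a hit backfill every faster tier so the next read is warm. Plain in-memory maps stand in here for the Memory/DashMap/SQLite/S3 backends, and the zstd step is omitted.

```rust
use std::collections::HashMap;

/// Tiered lookup with cache warm-up (simplified sketch).
/// `tiers` is ordered fastest to slowest.
fn tiered_get(key: &str, tiers: &mut [HashMap<String, String>]) -> Option<String> {
    // Find the fastest tier that holds the key.
    let hit = tiers.iter().position(|t| t.contains_key(key))?;
    let value = tiers[hit][key].clone();
    // Warm every faster tier so subsequent reads stop earlier.
    for tier in &mut tiers[..hit] {
        tier.insert(key.to_string(), value.clone());
    }
    Some(value)
}
```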
Tokio async throughout. Bounded channels for backpressure. Structured concurrency with scoped tasks. Non-blocking operations. Zero runtime garbage collection.
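Bounded-channel backpressure can be demonstrated with the standard library alone: `send` blocks once the buffer is full, so a fast producer is paced by its consumer. `std::sync::mpsc::sync_channel` stands in for Tokio's bounded async mpsc, which behaves analogously with `.await` instead of blocking.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Produce `items` values through a channel bounded at `capacity`.
/// The producer stalls whenever `capacity` items are in flight.
fn produce_consume(items: usize, capacity: usize) -> Vec<usize> {
    let (tx, rx) = sync_channel(capacity);
    let producer = thread::spawn(move || {
        for i in 0..items {
            tx.send(i).unwrap(); // blocks when the buffer is full: backpressure
        }
        // `tx` dropped here, which ends the receiver's iteration.
    });
    let received: Vec<usize> = rx.iter().collect();
    producer.join().unwrap();
    received
}
```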
Firecracker microVMs for untrusted execution. Knowledge graph validation before operations. 1Password CLI integration. Zero telemetry. Full source auditability.
The Model Context Protocol server exposes autocomplete, text matching, thesaurus management, and graph connectivity as tools. stdio and SSE transports. Optional OAuth bearer tokens.
Ollama for local inference. OpenRouter for cloud models. Provider-agnostic LlmClient trait. Document summarisation, intelligent descriptions, context-aware chat.
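A provider-agnostic trait of that shape might look as follows; the method names, the echoing Ollama stub, and the `summarise` helper are illustrative assumptions, not the actual `LlmClient` definition.

```rust
/// Hypothetical provider-agnostic LLM client interface.
trait LlmClient {
    fn provider(&self) -> &'static str;
    fn complete(&self, prompt: &str) -> String;
}

struct OllamaClient {
    model: String,
}

impl LlmClient for OllamaClient {
    fn provider(&self) -> &'static str {
        "ollama"
    }

    fn complete(&self, prompt: &str) -> String {
        // A real client would call the local Ollama HTTP API; this stub
        // echoes the prompt to keep the sketch self-contained.
        format!("[{}] {}", self.model, prompt)
    }
}

/// Features like summarisation depend only on the trait, so swapping
/// Ollama for OpenRouter is a construction-time choice.
fn summarise(client: &dyn LlmClient, doc: &str) -> String {
    client.complete(&format!("Summarise: {}", doc))
}
```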
Native binaries. WebAssembly modules. Tauri desktop. Svelte frontend with Bulma CSS. Multi-platform Docker builds for linux/amd64, arm64, and arm/v7.