The Search Engine That Never Leaves Your Machine


Applied Knowledge Systems Ltd

There is a quiet revolution happening in how software engineers interact with their own knowledge. Not in the cloud. Not through a chatbot proxy. But locally, on their own hardware, with a search engine that understands the shape of their thinking. Terraphim AI is that engine -- a privacy-first semantic search system built in Rust, designed to run without network access, and structured around the radical idea that your knowledge graph belongs to you.

Built by Applied Knowledge Systems, Terraphim takes a fundamentally different approach to search. Rather than indexing the web or querying a remote API, it builds knowledge graphs from your own repositories -- your documentation, your notes, your email, your project management tools -- and makes that knowledge instantly searchable through Aho-Corasick automata that deliver sub-millisecond concept extraction.

Your data never leaves your machine. No telemetry. No cloud dependency. Every search happens locally, every graph is yours alone.
Core Design Principle

How It Thinks


The knowledge graph is the foundation. Terraphim constructs per-role semantic graphs that map concepts, synonyms, and documents into a navigable structure. A systems engineer sees different search results than a product manager, because their knowledge graphs are shaped by their roles -- different thesauruses, different relevance functions, different haystacks to search.

The thesaurus system drives query expansion. When you search for "persistence," Terraphim does not just match that string. It understands that persistence connects to caching, storage backends, SQLite, S3, and DashMap through the knowledge graph. It expands your query semantically, then scores results using configurable relevance functions: BM25 for statistical text relevance, TitleScorer for simple matching, or TerraphimGraph for full semantic ranking that traverses the concept graph.
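
To make the mechanism concrete, here is a minimal sketch of thesaurus-driven expansion and term-count scoring. The function names (`expand`, `score`) and data shapes are illustrative assumptions for this article, not Terraphim's actual API, and a plain `HashMap` stands in for the real automata-backed thesaurus:

```rust
use std::collections::HashMap;

/// Expand a query term into itself plus its thesaurus neighbours.
/// (Hypothetical sketch; the real system walks the knowledge graph.)
fn expand(query: &str, thesaurus: &HashMap<&str, Vec<&str>>) -> Vec<String> {
    let mut terms = vec![query.to_string()];
    if let Some(related) = thesaurus.get(query) {
        terms.extend(related.iter().map(|s| s.to_string()));
    }
    terms
}

/// Score a document by how many expanded terms it mentions,
/// matched case-insensitively. A stand-in for BM25/TerraphimGraph.
fn score(doc: &str, terms: &[String]) -> usize {
    let lower = doc.to_lowercase();
    terms.iter().filter(|t| lower.contains(&t.to_lowercase())).count()
}
```

The real scorers are far richer, but the shape is the same: expand first, then rank against the expanded term set rather than the literal query string.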

Haystacks are where data lives. Terraphim searches across multiple backends simultaneously. Local files via Ripgrep. Confluence and Jira through the Atlassian integration. Discourse forums. Email via JMAP. Logseq notes. ClickUp tasks. Quickwit indices. Even other AI tools through the Model Context Protocol. Each haystack is configured per role, and results are merged and ranked by the active relevance function.
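
A role's configuration ties these pieces together. The sketch below shows the general shape; the field names and values are illustrative assumptions, not the project's actual configuration schema:

```json
{
  "name": "Terraphim Engineer",
  "relevance_function": "terraphim-graph",
  "haystacks": [
    { "service": "ripgrep", "location": "~/projects/docs" },
    { "service": "atlassian", "location": "https://example.atlassian.net" }
  ]
}
```

The key idea is that haystacks and the relevance function are properties of the role, so switching roles switches both where Terraphim looks and how it ranks what it finds.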

Twenty-Nine Crates, One Binary


The architecture is a Cargo workspace of twenty-nine crates. At the foundation sits terraphim_automata, which builds Aho-Corasick finite state machines from thesaurus entries for pattern matching. Above it, terraphim_rolegraph constructs and queries the knowledge graph. The terraphim_middleware crate orchestrates haystack indexing and search across all configured backends. And terraphim_service ties everything together -- search, document management, AI summarisation, and chat.
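
As a rough mental model (crate names are taken from this article; the paths and comments are assumptions about the layout, not a verbatim copy of the repository), the workspace looks something like:

```toml
[workspace]
members = [
  "crates/terraphim_automata",    # Aho-Corasick automata from thesaurus entries
  "crates/terraphim_rolegraph",   # knowledge graph construction and queries
  "crates/terraphim_middleware",  # haystack indexing across backends
  "crates/terraphim_persistence", # speed-ordered storage backends
  "crates/terraphim_service",     # search, documents, summarisation, chat
  # ...remaining crates omitted
]
```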

The persistence layer is particularly elegant. It supports multiple backends ordered by speed: memory, DashMap, SQLite, S3. When data is loaded from a slower backend, it is automatically cached to the fastest one using a fire-and-forget pattern. Objects over one megabyte are compressed with zstd. If the schema evolves and cached data fails to deserialise, the cache entry is quietly deleted and data is re-fetched from persistent storage. It handles the mess so you do not have to.
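
The warm-up pattern can be sketched in a few lines. This is a simplified illustration with hypothetical names: the real crate spawns the write-back as a fire-and-forget task and speaks to heterogeneous backends, whereas here the layers are plain in-memory maps and the promotion is done inline:

```rust
use std::collections::HashMap;

/// Speed-ordered storage layers: index 0 is the fastest.
/// (Illustrative stand-in for memory > DashMap > SQLite > S3.)
struct LayeredStore {
    layers: Vec<HashMap<String, String>>,
}

impl LayeredStore {
    /// Read through the layers, then warm every faster layer
    /// with the value so the next read is cheap.
    fn get(&mut self, key: &str) -> Option<String> {
        let hit = self.layers.iter().position(|l| l.contains_key(key))?;
        let value = self.layers[hit][key].clone();
        for layer in &mut self.layers[..hit] {
            layer.insert(key.to_string(), value.clone());
        }
        Some(value)
    }
}
```

A first read that lands in a slow layer pays the full cost once; subsequent reads hit the fastest layer directly.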

For secure execution, Terraphim includes Firecracker microVM integration. Untrusted code runs in sandboxed virtual machines that boot in under two seconds. The system intelligently selects between local execution, Firecracker isolation, and hybrid modes depending on the operation type. It is security without ceremony.
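
The mode-selection logic might reduce to something like the following. The enum, function, and decision criteria are hypothetical simplifications invented for this sketch; the actual system weighs more signals than two booleans:

```rust
/// Where an operation runs. (Hypothetical; names are illustrative.)
#[derive(Debug, PartialEq)]
enum ExecMode {
    Local,       // trusted code, no isolation needed
    Firecracker, // untrusted code, full microVM sandbox
    Hybrid,      // untrusted code that still needs local resources
}

/// Pick an execution mode from two assumed signals:
/// whether the code is untrusted, and whether it needs local I/O.
fn select_mode(untrusted: bool, needs_local_io: bool) -> ExecMode {
    match (untrusted, needs_local_io) {
        (false, _) => ExecMode::Local,
        (true, false) => ExecMode::Firecracker,
        (true, true) => ExecMode::Hybrid,
    }
}
```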

Sub-millisecond concept extraction. Sub-two-second VM boot. Zero bytes transmitted to the cloud.
Performance Characteristics

In Practice


The terminal interface is where Terraphim feels most alive. The interactive REPL loads your role configuration, builds the automata, and gives you a prompt. From there, every search happens locally, every result is scored against your personal knowledge graph, and every interaction refines the system's understanding of your domain.

Exhibit A: A typical search session
# Launch the interactive REPL
$ terraphim-agent
Terraphim AI v0.9.0 -- Privacy-first semantic search
Loading role: Terraphim Engineer
Thesaurus: 2,847 concepts | Automata built in 12ms

# Search across all configured haystacks
terraphim> /search "async persistence caching"

Searching 4 haystacks... (BM25 + TerraphimGraph)

  [0.94] terraphim_persistence/src/lib.rs
         Multi-backend storage with transparent cache warm-up.
         Memory > DashMap > SQLite > S3, ordered by speed.

  [0.87] terraphim_rolegraph/src/graph.rs
         Per-role knowledge graph with node/edge relationships.
         Thesaurus-driven concept extraction and expansion.

  [0.81] crates/terraphim_config/src/roles.rs
         Role configuration with relevance function selection.

3 results in 47ms | 0 network calls | 0 bytes transmitted

# Ask the local LLM to summarise
terraphim> /chat "How does cache warm-up work?"

When data is loaded from a slower backend, it is automatically
cached to the fastest backend via a fire-and-forget tokio::spawn.
Objects over 1MB are compressed with zstd before caching.
Schema evolution is handled gracefully through cache invalidation.

Terraphim also compiles to WebAssembly. The autocomplete engine runs in the browser at roughly 200 kilobytes compressed, compatible with Chrome 57 and later, Firefox 52 and later, and Safari 11 and later. The desktop application, built with Svelte and Tauri, provides real-time search, knowledge graph visualisation, and configuration management across macOS, Linux, and Windows. There is also a full MCP server that exposes autocomplete, text matching, fuzzy search, and graph connectivity as tools for AI development environments.

Start Searching Locally

Terraphim AI is open source under MIT and Apache 2.0 licences. Clone, build, run. No account required. No data leaves your machine.