Local AI Assistant Promises Complete Data Privacy
By the Terraphim Development Team
Terraphim AI is a privacy-first AI assistant that operates entirely on your local device. Unlike cloud-dependent search tools, it keeps every query, every index, and every result under your control. The system uses knowledge graphs, semantic embeddings, and multiple search algorithms to deliver relevant results without ever transmitting your data externally.
Built with Rust for performance and memory safety, Terraphim manages a workspace of 29 specialised crates covering everything from text matching and autocomplete to agent supervision and knowledge graph orchestration. The architecture separates concerns across core service, persistence, middleware, and frontend layers.
The system supports multiple persistence backends -- memory, DashMap, SQLite, and S3 -- with transparent cache warm-up between layers. Objects exceeding 1MB are automatically compressed with zstd before caching, and schema evolution is handled gracefully: failed deserialisation triggers cache invalidation and re-fetch from persistent storage.
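The layered lookup described above can be sketched as follows. This is an illustrative sketch, not Terraphim's actual persistence API: the `Backend` trait, `layered_get`, and the names used here are invented for the example, and the compression decision is shown as logic only (real code would invoke zstd at that point).

```rust
use std::collections::HashMap;

// Hypothetical 1 MiB threshold, per the article's description.
const COMPRESS_THRESHOLD: usize = 1024 * 1024;

// Illustrative backend trait; not Terraphim's real interface.
trait Backend {
    fn get(&self, key: &str) -> Option<Vec<u8>>;
    fn put(&mut self, key: &str, value: Vec<u8>);
}

struct MemoryBackend {
    store: HashMap<String, Vec<u8>>,
}

impl Backend for MemoryBackend {
    fn get(&self, key: &str) -> Option<Vec<u8>> {
        self.store.get(key).cloned()
    }
    fn put(&mut self, key: &str, value: Vec<u8>) {
        self.store.insert(key.to_string(), value);
    }
}

/// Try each layer in order (fastest first); on a hit in a slower
/// layer, warm the faster layers above it before returning.
fn layered_get(layers: &mut [Box<dyn Backend>], key: &str) -> Option<Vec<u8>> {
    for i in 0..layers.len() {
        if let Some(value) = layers[i].get(key) {
            for warm in layers[..i].iter_mut() {
                warm.put(key, value.clone());
            }
            return Some(value);
        }
    }
    None
}

/// Decide whether a value should be compressed before caching.
/// (A real implementation would run zstd here; the size check is the point.)
fn should_compress(value: &[u8]) -> bool {
    value.len() > COMPRESS_THRESHOLD
}
```

The warm-up loop is what makes repeated reads cheap: after one slow-path hit, every faster layer holds a copy.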
"Privacy is not a feature. It is the architecture."
Knowledge Graphs Power Semantic Understanding
Technical Report
At the heart of Terraphim lies a custom knowledge graph system built on Aho-Corasick automata with LeftmostLongest matching. Documents are processed through a pipeline of concept extraction, graph construction, and semantic indexing. The resulting RoleGraph maintains per-role document-to-concept relationships, enabling personalised search that adapts to your domain.
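The LeftmostLongest semantics can be demonstrated with a naive scan. Terraphim's real pipeline uses an Aho-Corasick automaton for speed; this ASCII-only sketch only shows the matching rule, namely that when several concept terms start at the same position, the longest one wins.

```rust
/// Naive leftmost-longest multi-pattern matching (ASCII-only sketch):
/// at each position take the longest pattern that matches, emit it,
/// and resume scanning after the match.
fn leftmost_longest<'a>(text: &'a str, patterns: &[&str]) -> Vec<(usize, &'a str)> {
    let mut matches = Vec::new();
    let mut i = 0;
    while i < text.len() {
        // Among patterns starting at i, prefer the longest one.
        let best = patterns
            .iter()
            .filter(|p| text[i..].starts_with(**p))
            .max_by_key(|p| p.len());
        match best {
            Some(p) => {
                matches.push((i, &text[i..i + p.len()]));
                i += p.len();
            }
            None => i += 1,
        }
    }
    matches
}
```

Given the patterns "knowledge", "graph", and "knowledge graph", the text "a knowledge graph index" yields the single concept "knowledge graph" rather than two shorter overlapping ones, which is exactly what a thesaurus-driven extractor wants.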
Five relevance functions are available: TitleScorer for basic text matching; BM25, BM25F, and BM25Plus for advanced statistical ranking; and TerraphimGraph for full semantic graph-based ranking with thesaurus expansion. Users configure relevance per role, switching strategies as the query context demands.
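For readers unfamiliar with the statistical rankers, a textbook BM25 per-term score looks like this. This is a generic sketch, not Terraphim's implementation; the variants build on the same core (BM25F weights term frequencies per field before scoring, BM25Plus adds a small delta to the normalised frequency).

```rust
/// Textbook BM25 score contribution of one term for one document.
/// k1 and b are the usual free parameters (commonly k1 ≈ 1.2, b ≈ 0.75).
fn bm25_term_score(
    tf: f64,      // term frequency in the document
    doc_len: f64, // document length (in terms)
    avg_len: f64, // average document length in the corpus
    n_docs: f64,  // total number of documents
    df: f64,      // number of documents containing the term
    k1: f64,
    b: f64,
) -> f64 {
    // Smoothed inverse document frequency: rare terms weigh more.
    let idf = ((n_docs - df + 0.5) / (df + 0.5) + 1.0).ln();
    // Length-normalised saturation of the raw term frequency.
    let norm = tf + k1 * (1.0 - b + b * doc_len / avg_len);
    idf * tf * (k1 + 1.0) / norm
}
```

The saturation term is why BM25 outranks raw term counting: the second occurrence of a term adds less than the first, and long documents are penalised for their length.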
Eight Source Types Now Searchable
Integration Correspondent
Terraphim's haystack system connects to local files via Ripgrep, Confluence and Jira through Atlassian APIs, Discourse forums, email via JMAP protocol, task management through ClickUp, personal knowledge via Logseq, Rust documentation through QueryRs, and AI tools via the Model Context Protocol.
Each source becomes a searchable haystack within the knowledge graph. The MCP server exposes autocomplete, text processing, thesaurus management, graph connectivity checks, and fuzzy search as standard AI tools, supporting stdio, SSE/HTTP, and OAuth transports.
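The fuzzy-search tool mentioned above can be illustrated with an edit-distance ranker. This is a hypothetical sketch of the idea, not the MCP server's actual scorer: candidate thesaurus terms are ranked by Levenshtein distance to the query and filtered by a cutoff.

```rust
/// Classic Levenshtein edit distance via a rolling DP row.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut cur = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            cur.push((prev[j] + cost).min(prev[j + 1] + 1).min(cur[j] + 1));
        }
        prev = cur;
    }
    prev[b.len()]
}

/// Hypothetical fuzzy autocomplete: keep terms within max_dist edits
/// of the query, closest (then alphabetical) first.
fn fuzzy_complete<'a>(query: &str, terms: &[&'a str], max_dist: usize) -> Vec<&'a str> {
    let mut scored: Vec<(usize, &str)> = terms
        .iter()
        .map(|t| (levenshtein(query, t), *t))
        .filter(|(d, _)| *d <= max_dist)
        .collect();
    scored.sort();
    scored.into_iter().map(|(_, t)| t).collect()
}
```

Exposed as an MCP tool, this lets an AI client recover the canonical thesaurus term even when the user's query contains a typo.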
Getting Started in Three Commands
Technical Desk
Installation
git clone https://github.com/terraphim/terraphim-ai
cd terraphim-ai
cargo build --release
cargo run -- --config terraphim_engineer_config.json
The server exposes a REST API with endpoints for search, configuration, document summarisation, and chat completion. The Svelte-based desktop application connects to the backend for real-time search, knowledge graph visualisation, and role management.
Desktop Application
cd desktop
yarn install
yarn run tauri dev
For terminal users, the TUI provides an interactive REPL with hierarchical commands, ASCII graph rendering, and Firecracker VM management with sub-2 second boot times.
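Hierarchical command dispatch of the kind the TUI offers can be sketched with a slice pattern match. The command names here are invented for illustration and are not the TUI's actual command set.

```rust
/// Minimal sketch of hierarchical command dispatch for a REPL.
/// Commands are split into tokens and matched as nested subcommands.
fn dispatch(line: &str) -> String {
    let parts: Vec<&str> = line.split_whitespace().collect();
    match parts.as_slice() {
        ["graph", "show"] => "rendering ASCII graph".to_string(),
        ["graph", "stats"] => "graph statistics".to_string(),
        ["vm", "boot", name] => format!("booting microVM '{}'", name),
        ["help"] => "available: graph show|stats, vm boot <name>".to_string(),
        _ => format!("unknown command: {}", line),
    }
}
```

The nested structure keeps the top-level namespace small while letting each subsystem (graph rendering, VM management) own its own verbs.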
Editorial: Why Local-First Matters
Opinion
In an era where every keystroke risks becoming training data for someone else's model, Terraphim takes a principled stance: your knowledge belongs to you. There is no analytics endpoint. No usage tracking. No cloud dependency. External connections to Confluence, Jira, or email are explicit, user-initiated, and authenticated with credentials you control through 1Password.
The Firecracker microVM integration extends this philosophy to code execution. Untrusted operations run in isolated virtual machines with sub-500ms allocation times. Knowledge graph validation occurs before execution, ensuring that even sandboxed commands align with your domain understanding.