Shipped

JARVIS

Local Windows voice assistant with 18 tools and a 3-tier safety layer.

<2s

speech-to-answer latency

18

tools in the registry

100%

local — no cloud APIs

Problem

Cloud voice assistants ship everything you say to a third party. That is not an option for sensitive data.

Solution

Streaming STT (Deepgram) + local LLM (Ollama qwen3:8b) + local TTS (Kokoro ONNX). Tool registry with 18 tools (system, files, web, vault). 3-tier safety layer: every command is matched against regexes and sorted into allow, confirm, or deny.
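A regex-based allow/confirm/deny gate of this kind could look roughly like the sketch below. It is an illustration under assumptions, not the project's code: the patterns, the names `Tier`, `RULES`, `classify` and `run_command`, the deny-first ordering, and the fallback to "confirm" for unmatched commands are all hypothetical.

```python
import re
from enum import Enum
from typing import Callable, List, Tuple


class Tier(Enum):
    ALLOW = "allow"      # run immediately
    CONFIRM = "confirm"  # ask the user first
    DENY = "deny"        # never run


# Hypothetical rules, checked most-restrictive first so that a dangerous
# fragment anywhere in the command wins over a harmless prefix.
RULES: List[Tuple[Tier, re.Pattern]] = [
    (Tier.DENY, re.compile(r"\b(format|diskpart)\b|\brm\s+-rf\b", re.IGNORECASE)),
    (Tier.CONFIRM, re.compile(r"\b(del|rmdir|shutdown|taskkill)\b", re.IGNORECASE)),
    (Tier.ALLOW, re.compile(r"^(dir|echo|type|whoami)\b", re.IGNORECASE)),
]


def classify(command: str) -> Tier:
    """Return the tier of the first rule whose regex matches the command."""
    for tier, pattern in RULES:
        if pattern.search(command):
            return tier
    # Assumption: anything unrecognised is held for confirmation.
    return Tier.CONFIRM


def run_command(
    command: str,
    execute: Callable[[str], str],
    confirm: Callable[[str], bool],
) -> str:
    """Gate a tool call: deny, ask for confirmation, or execute."""
    tier = classify(command)
    if tier is Tier.DENY:
        return "denied"
    if tier is Tier.CONFIRM and not confirm(command):
        return "cancelled"
    return execute(command)
```

Passing `execute` and `confirm` in as callables keeps the gate independent of how a tool actually runs or how the voice UI asks for a yes/no, so the tiering logic can be tested without touching the system.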

Outcome

Production-ready voice UX, semantic memory (model2vec embeddings stored in SQLite), and speech-to-answer latency under 2s. Zero cloud cost after setup.
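To make the semantic memory idea concrete, here is a minimal sketch of storing embeddings as SQLite BLOBs and ranking them by cosine similarity. Everything in it is assumed for illustration: the table layout, the `Memory` class and its method names, and above all `embed`, which is a toy hashed bag-of-words stand-in so the example runs with the standard library alone. In the real system a model2vec model would produce the vectors.

```python
import math
import sqlite3
import struct
import zlib
from typing import List, Tuple

DIM = 256  # toy dimensionality, chosen arbitrarily for this sketch


def embed(text: str) -> List[float]:
    """Toy stand-in for a model2vec embedding: hashed word counts, L2-normalised."""
    vec = [0.0] * DIM
    for word in text.lower().split():
        vec[zlib.crc32(word.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def pack(vec: List[float]) -> bytes:
    """Serialise a vector as little-endian float32 for a BLOB column."""
    return struct.pack(f"<{len(vec)}f", *vec)


def unpack(blob: bytes) -> List[float]:
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


class Memory:
    """Hypothetical memory store: one row per remembered sentence."""

    def __init__(self, path: str = ":memory:") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS memory ("
            "id INTEGER PRIMARY KEY, text TEXT NOT NULL, embedding BLOB NOT NULL)"
        )

    def remember(self, text: str) -> None:
        self.db.execute(
            "INSERT INTO memory (text, embedding) VALUES (?, ?)",
            (text, pack(embed(text))),
        )
        self.db.commit()

    def recall(self, query: str, k: int = 3) -> List[Tuple[float, str]]:
        """Return the k stored texts most similar to the query, best first."""
        q = embed(query)
        scored = []
        for text, blob in self.db.execute("SELECT text, embedding FROM memory"):
            # Both vectors are unit length, so the dot product is the cosine.
            score = sum(a * b for a, b in zip(q, unpack(blob)))
            scored.append((score, text))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]
```

A brute-force scan like this is usually adequate for a personal assistant's memory, where the table holds thousands of rows rather than millions. A larger store would need an index or a vector extension, which this sketch does not cover.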

Stack

Python
Ollama
Deepgram Nova-3
Kokoro ONNX
SQLite + model2vec

Capabilities

Local LLM inference
Tool-calling agents
Safety layer
Semantic memory
