JARVIS
Local Windows voice assistant with 18 tools and a 3-tier safety layer.
<2s
speech-to-answer latency
18
tools in the registry
100%
local — no cloud APIs
Problem
Cloud voice assistants ship everything you say to a third party, which is not an option for sensitive data.
Solution
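Streaming STT (Deepgram) feeds a local LLM (Ollama qwen3:8b), which answers through local TTS (Kokoro ONNX). A registry exposes 18 tools in four groups (system, files, web, vault). A 3-tier safety layer matches each command against regex rules and assigns it to one of three tiers: allow, confirm, or deny.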
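The 3-tier safety check can be pictured roughly as follows. This is a minimal sketch, not the project's actual code: the `Tier` enum, the `classify` and `gate` functions, and every regex pattern are hypothetical, and the choice to check deny rules first and fall back to allow is an assumption.

```python
import re
from enum import Enum
from typing import Callable


class Tier(Enum):
    ALLOW = "allow"
    CONFIRM = "confirm"
    DENY = "deny"


# Hypothetical rule sets; the real patterns are not given in the source.
DENY_PATTERNS = [
    r"\bformat\s+[a-z]:",   # formatting a drive
    r"\brm\s+-rf\b",        # recursive force delete
    r"\bdel\s+/s\b",        # recursive delete on Windows
]
CONFIRM_PATTERNS = [
    r"\bshutdown\b",
    r"\bdel\b",
    r"\btaskkill\b",
]


def classify(command: str) -> Tier:
    """Return the safety tier for a command string.

    Deny rules are checked first so a destructive command can never be
    downgraded to "confirm". Anything unmatched is allowed (an assumption;
    a stricter design could default to CONFIRM instead).
    """
    for pattern in DENY_PATTERNS:
        if re.search(pattern, command, re.IGNORECASE):
            return Tier.DENY
    for pattern in CONFIRM_PATTERNS:
        if re.search(pattern, command, re.IGNORECASE):
            return Tier.CONFIRM
    return Tier.ALLOW


def gate(command: str, confirm: Callable[[str], bool] = lambda _cmd: False) -> bool:
    """Decide whether a command may run.

    `confirm` is a callback that asks the user (for example by voice) and
    returns True on approval. The default refuses, so an unanswered
    confirmation never runs.
    """
    tier = classify(command)
    if tier is Tier.DENY:
        return False
    if tier is Tier.CONFIRM:
        return confirm(command)
    return True
```

In a voice assistant the `confirm` callback would typically speak the command back and wait for a yes or no; denied commands are never passed to the callback at all.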
Outcome
Production-ready voice UX, semantic memory (model2vec embeddings stored in SQLite), and speech-to-answer latency under 2 s. Zero cloud cost after setup.
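One way to picture semantic memory on top of SQLite is sketched below. It is an illustration under stated assumptions, not the project's implementation: the `Memory` class, the table layout, and the helper names are hypothetical, and `embed` is a toy hashed bag-of-words stand-in used only to keep the example self-contained. A real model2vec model would replace it.

```python
import math
import sqlite3
import struct
import zlib

DIM = 256  # size of the toy embedding; a real model defines its own


def embed(text: str) -> list[float]:
    """Stand-in embedding: hashed bag of words, L2-normalised.

    This is NOT model2vec. It only matches on shared words.
    """
    vec = [0.0] * DIM
    for word in text.lower().split():
        vec[zlib.crc32(word.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def pack(vec: list[float]) -> bytes:
    """Serialise a vector as float32 values for a BLOB column."""
    return struct.pack(f"{len(vec)}f", *vec)


def unpack(blob: bytes) -> list[float]:
    """Inverse of pack(): 4 bytes per float32."""
    return list(struct.unpack(f"{len(blob) // 4}f", blob))


class Memory:
    """Hypothetical memory store: text plus its embedding in one table."""

    def __init__(self, path: str = ":memory:") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS memory ("
            "id INTEGER PRIMARY KEY, text TEXT NOT NULL, vec BLOB NOT NULL)"
        )

    def add(self, text: str) -> None:
        self.db.execute(
            "INSERT INTO memory (text, vec) VALUES (?, ?)",
            (text, pack(embed(text))),
        )
        self.db.commit()

    def search(self, query: str, k: int = 3) -> list[str]:
        """Return the k stored texts most similar to the query.

        Vectors are unit length, so the dot product is the cosine
        similarity. This scans every row, which is fine for small stores.
        """
        q = embed(query)
        scored = []
        for text, blob in self.db.execute("SELECT text, vec FROM memory"):
            score = sum(a * b for a, b in zip(q, unpack(blob)))
            scored.append((score, text))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:k]]
```

A brute-force scan like this keeps everything in one SQLite file with no extra service to run; a larger store would likely need an index or a vector extension.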
Stack
- Python
- Ollama
- Deepgram Nova-3
- Kokoro ONNX
- SQLite + model2vec
Capabilities
- Local LLM inference
- Tool-calling agents
- Safety layer
- Semantic memory
Want something similar built for your company?