nicedreamzapp/claude-code-local
nicedreamzapp/claude-code-localRun Claude Code with local AI on Apple Silicon. 122B model at 41 tok/s with Google TurboQuant. No cloud, no API fees.
From the README
š§ Claude Code Local
Run a 122 billion parameter AI on your MacBook.No cloud. No fees. No data leaves your machine.
š¤ What Is This?
Your MacBook has a powerful GPU built right into the chip. This project uses that GPU to run a massive AI model ā the same kind that powers ChatGPT and Claude ā entirely on your computer.
š« No internet needed š° No monthly subscription š No one sees your code or data ā Full Claude Code experience ā write code, edit files, manage projects, control your browser
š± You (Mac or Phone)
ā
š¤ Claude Code ā the AI coding tool you know
ā
ā” MLX Native Server ā our server (200 lines of Python)
ā
š§ Qwen 3.5 122B ā 122 billion parameter brain
ā
š„ļø Apple Silicon GPU ā your M-series chip does all the work
š± Control From Your Phone
You don't have to be at your Mac to use this. We built a remote control pipeline:
š± Your iPhone š» Your Mac
ā ā
āāā iMessage āāāāāāāāāāāāāāāāāā>ā
ā āāā Claude Code
ā āāā MLX Server
ā āāā Qwen 3.5 122B
ā āāā (does the work)
ā š” **Pro tip:** Anthropic's Dispatch doesn't read your CLAUDE.md. Mention it in your message or it'll miss your custom setup. Our iMessage system doesn't have this problem.
## š Benchmarks
We built and tested three different approaches. Each one got faster.
### ā” Speed Comparison
Tokens per Second
š Ollama (Gen 1) āāāāāāāāāāāāāāāāāāāāāāāāāāāāāā 30 tok/s š llama.cpp (Gen 2) āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā 41 tok/s š MLX Native (Gen 3) āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā 65 tok/s
### ā±ļø Real-World Claude Code Task
How long to ask Claude Code to write a function:
š“ Ollama + Proxy āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā 133 seconds š llama.cpp + Proxy āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā 133 seconds š„ MLX Native (no proxy) āāāāāā 17.6 seconds
7.5x faster ā”
### š Side-by-Side
| | š Ollama | š llama.cpp + TurboQuant | š **MLX Native (ours)** |
|---|:---:|:---:|:---:|
| **Speed** | 30 tok/s | 41 tok/s | **65 tok/s** |
| **Claude Code task** | 133s | 133s | **17.6s** |
| **Needs a proxy?** | ā Yes | ā Yes | ā
**No** |
| **Lines of code** | N/A | N/A (C++ fork) | **~200 Python** |
| **Apple native?** | ā Generic | ā Ported | ā
**MLX** |
### āļø vs Cloud APIs
| | š„ļø **Our Local Setup** | āļø Claude Sonnet | āļø Claude Opus |
|---|:---:|:---:|:---:|
| Speed | 65 tok/s | ~80 tok/