The Workstation Duo
A two-machine AI lab. Local inference, distributed builds, zero cloud lock-in.
🖥️
Primary Workstation
Main Dev · Local LLM Inference
Intel Core i9-14900K (24C/32T)
NVIDIA GeForce RTX 5070 Ti · 16GB GDDR7 VRAM
32GB DDR5 RAM @6400MHz
1.8TB NVMe SSD
Pop!_OS 24.04 (Linux)
Qwen 3.5 · Granite 4.1 · Phi 4 · Nemotron · Llama 3.2
CUDA 13.0 / NVIDIA Driver 550+
🖥️
Secondary Workstation
Distributed Builds · GPU Compute
Intel Core i7-12700KF (12C/20T)
2 x 16GB AMD Radeon RX 9060 XT · 32GB VRAM total
32GB DDR4 RAM @2400MHz
4TB SSD
Linux Mint 22.3 (Cinnamon)
ROCm · OpenCL · Distributed training workloads
Software Stack
Ollama →
Local LLM inference server. Runs Qwen, Granite, Phi, Nemotron, and more.
Astro →
Static site generator. This site is Astro + Bun + Tailwind, built locally.
Bun →
JavaScript runtime and bundler. Faster than Node for this workload.
Tailwind CSS v4 →
Utility-first CSS. Uses @theme directive for custom design tokens.
Docker →
Containerized services. Everything that can be containerized, is.
Cloudflare →
Edge deployment. DNS, Pages, and Workers for global distribution.
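To make the Ollama entry in the stack above concrete, here is a minimal TypeScript sketch of sending one prompt to a local Ollama server over its HTTP API. It assumes Ollama is listening on its default port (11434). The function names `buildGenerateRequest` and `generate` are illustrative and not part of this site's code.

```typescript
// Sketch: send one prompt to a local Ollama server.
// Assumption: Ollama is running on its default port, 11434.
const OLLAMA_URL = "http://localhost:11434/api/generate";

interface GenerateRequest {
  model: string;
  prompt: string;
  stream: boolean;
}

// Build the JSON body for a single, non-streamed completion.
function buildGenerateRequest(model: string, prompt: string): GenerateRequest {
  return { model, prompt, stream: false };
}

// POST the request and return the generated text.
async function generate(model: string, prompt: string): Promise<string> {
  const res = await fetch(OLLAMA_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildGenerateRequest(model, prompt)),
  });
  if (!res.ok) {
    throw new Error(`Ollama returned HTTP ${res.status}`);
  }
  const data = (await res.json()) as { response: string };
  return data.response;
}
```

Calling `await generate("llama3.2", "Summarize this file")` would return the model's reply as a string. The model tag here is an assumption: it has to match a model that has already been pulled on the machine. Nothing in this flow leaves localhost.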
Why Local?
Prompts sent to a closed cloud service can end up as training data. Files uploaded to a remote API may be stored, analyzed, and potentially reproduced. For personal projects, code, and proprietary data, that risk is unacceptable.
Local models run at 30-80 tok/s on modern GPUs, depending on model size and quantization. They're private, cost nothing per token, and work offline. The Workstation Duo proves you don't need a $20/month subscription to get world-class AI; you need the right hardware and the knowledge to use it.
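A tok/s figure like the one above can be checked on your own hardware. Ollama's non-streamed response reports `eval_count` (tokens generated) and `eval_duration` (generation time in nanoseconds), so throughput is a single division. The sketch below is an illustration only; `OllamaTimings` and `tokensPerSecond` are hypothetical names.

```typescript
// Subset of the timing fields Ollama includes in a completed response.
interface OllamaTimings {
  eval_count: number;    // tokens generated
  eval_duration: number; // time spent generating, in nanoseconds
}

// Convert Ollama's timing fields into tokens per second.
function tokensPerSecond(t: OllamaTimings): number {
  if (t.eval_duration <= 0) return 0; // avoid dividing by zero
  const seconds = t.eval_duration / 1e9;
  return t.eval_count / seconds;
}

// Example: 256 tokens generated in 4 seconds.
const rate = tokensPerSecond({ eval_count: 256, eval_duration: 4_000_000_000 });
console.log(`${rate} tok/s`); // prints "64 tok/s"
```

The result covers generation only. Prompt processing is reported separately, so the end-to-end time for a long prompt will be higher than this number alone suggests.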
When cloud is needed, it's chosen consciously — not by default. That's the difference.