The time is now

After seeing the huge news about DeepSeek-R1's release last month and its effect on Nvidia's stock, after dabbling in running LLMs locally last year for a research project, aaaaand after just deploying my first server for my home lab and looking for a project to run on it, I figured now was a better time than ever to get started!

Technology Stack

After some research for this project, I decided on the following technology stack:

| Technology | Product/Platform | Reason |
| --- | --- | --- |
| Server | Windows 11 | Modern OS and flexible for other projects down the road. |
| Virtual Machine Platform | Windows Subsystem for Linux (WSL) | Native to Windows and supports GPU acceleration, with a shared Docker daemon and images between Linux and Windows. |
| Containerization Platform | Docker Desktop | Free for personal use, very popular, and I wanted to expand my skills with the platform. |
| Large Language Model (LLM) | llama2 | Made by Meta and lightweight (7B parameters) for quick testing and deployments. |
| LLM Engine | Ollama | Free, open source, and a new tool I wanted to explore! |
| Web Server and Graphical User Interface (GUI) | Open WebUI | Open source and conveniently packaged together with Ollama. |
| Virtual Private Network (VPN) | Tailscale | Free, mostly open source, and helps streamline networking. |
| Network Serving | Tailscale Serve | Lets you privately share a locally hosted service within your tailnet (not internet-facing). |

Configuration

  1. I first installed and configured Windows 11 on my server.
  2. Installed and configured Tailscale.
  3. Enabled the WSL Windows feature and installed a fresh Debian distribution (see the first sketch after this list).
  4. Installed Docker Desktop and configured it to use the WSL 2 based engine.
  5. Pulled the bundled Ollama + Open WebUI Docker image down to my Debian VM and installed the necessary GPU drivers (second sketch below).
  6. Once it was deployed locally, I went into the Tailscale admin console to enable HTTPS traffic and Tailscale Serve (third sketch below).
  7. With Tailscale Serve configured, I was able to open Open WebUI on my iPhone and start talking to the LLM! The experience was very similar to using the official ChatGPT app (outside of it being a 7-billion-parameter model versus 4o's reported ~200 billion; my RTX 2070 Super stays strong!).
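
For reference, here's roughly what steps 2 and 3 look like from an elevated PowerShell prompt. This is a minimal sketch rather than my exact command history, and it assumes a recent Windows 11 build with winget available (Tailscale also ships a regular installer if you'd rather skip winget):

```
# Install Tailscale (alternatively, use the installer from tailscale.com)
winget install Tailscale.Tailscale

# Enable the WSL feature and install a Debian distribution
# (a reboot may be required before the distro finishes setup)
wsl --install -d Debian
```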
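
Here's a sketch of step 5. One note on drivers: with the WSL 2 backend, the NVIDIA driver is installed on the Windows side and surfaced into the distro automatically, which you can verify with nvidia-smi. Open WebUI publishes an image tag that bundles Ollama, so a single container covers both; the port and volume names below are the ones suggested in the Open WebUI docs, not anything specific to my setup:

```
# Verify the GPU is visible inside WSL (driver lives on the Windows side)
nvidia-smi

# Run the bundled Open WebUI + Ollama image with GPU support,
# persisting models and app data in named volumes
docker run -d --gpus=all \
  -p 3000:8080 \
  -v ollama:/root/.ollama \
  -v open-webui:/app/backend/data \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:ollama

# Pull the llama2 model inside the container
# (models can also be downloaded from the Open WebUI admin settings)
docker exec open-webui ollama pull llama2
```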
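
And a sketch of step 6. HTTPS certificates (and MagicDNS) have to be enabled in the Tailscale admin console first; after that, Tailscale Serve proxies the local port to an HTTPS URL that's only reachable from devices on my tailnet:

```
# Proxy local port 3000 to the tailnet over HTTPS, running in the background
tailscale serve --bg 3000

# Confirm what's being served and at which https://<machine>.<tailnet>.ts.net URL
tailscale serve status
```

From there, the Open WebUI login page loads on any device signed into the tailnet, which is how it ended up on my iPhone.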

Resources