Project Avocado: Building AI in the Garage - Month 1 Recap
From Bare Metal to AI Agent Platform (Jan 24 - Apr 6, 2025)
Imagine building something truly cutting-edge, not in a sterile corporate lab, but in your own space, with your own hardware – like the tech legends of old. That's the spirit behind Project Avocado, an ambitious venture kicked off ten weeks ago by Terry Dynamics. The goal? To cultivate a powerful, self-hosted AI platform capable of complex reasoning and real-world interaction, potentially laying the groundwork for the AGI revolution many anticipate.
This isn't just about running models; it's about understanding the entire stack, from installing the operating system (Arch Linux on bare metal!) to fine-tuning network configurations, wrestling with GPU drivers, building custom APIs, and exploring the nascent field of AI agents.
The Journey So Far: Ten Whirlwind Weeks
These first ten weeks were a sprint, transforming a blank server into a functioning AI development hub:
- Laying the Foundation (Week 1): Starting with a fresh Arch Linux install, the initial focus was on bedrock infrastructure: stable networking (Ethernet!), secure remote access (SSH), NVIDIA drivers, CUDA, and Docker with GPU support. This groundwork enabled the first major breakthrough: running the DeepSeek-Coder 6.7B model locally using `llama-cpp-python`, achieving respectable code generation performance (~10.5 tokens/sec) even on the initial GTX 1070 Ti GPU.
- Integrating Models (Weeks 2-3): The platform quickly evolved beyond a single model. Mistral-7B was integrated using the `transformers` library, requiring a Python version upgrade (building 3.12 from source). Ollama was introduced as a versatile serving backend, simplifying the deployment of models like Mistral and, later, Google's Gemma 3 12B.
- Building Interfaces (Weeks 3-6): Accessing the AI needed to be user-friendly. A feature-rich Discord bot ("Avocado") was brought online, complete with custom commands and integration with the local LLMs. Simultaneously, work began on a web presence: the Terry Dynamics AI Lab dashboard. This involved setting up Nginx as a reverse proxy, battling infamous CORS errors, integrating the Ollama Web UI, and even adding access to ComfyUI for image generation – all accessible via the `api.terrydynamics.com` domain.
- Hardware Evolution & Agent Exploration (Weeks 7-8): Recognizing the VRAM limitations, a crucial hardware upgrade arrived: the powerful NVIDIA P40 GPU with 24GB VRAM. This unlocked the potential for much larger models. Alongside this, initial steps were taken into the world of AI agents using the OpenManus framework, developing a custom adapter to connect it to the local Ollama instance and achieving basic tool execution.
- Refinement & Roadblocks (Weeks 9-10): With Gemma 3 integrated via Ollama, the focus shifted to refining the API service – adding concurrency controls (locking), session management, and workarounds for Ollama's VRAM handling. A GitHub workflow was established. An attempt to build local RAG tooling using project logs and integrate it with VS Code via Continue.dev hit persistent snags and was ultimately paused to maintain focus on core goals. Most recently, an unexpected public IP change required urgent DNS updates, and exploration into using a second GPU (RTX 3060 Ti) began, revealing driver initialization challenges.
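Figures like the ~10.5 tokens/sec above come from simple wall-clock benchmarking: count the tokens generated, divide by elapsed time. Here's a minimal sketch of that kind of measurement, with a stand-in `generate` callable instead of a real `llama-cpp-python` model call (the names here are illustrative, not the project's actual code):

```python
import time

def benchmark_generation(generate, prompt):
    """Time one generation call and report throughput in tokens/sec.

    `generate` is any callable returning a list of tokens; in practice
    it would wrap a llama-cpp-python or Ollama model call.
    """
    start = time.perf_counter()
    tokens = generate(prompt)
    elapsed = time.perf_counter() - start
    return len(tokens), len(tokens) / elapsed

# Stub model: pretends to emit 20 tokens instantly.
def fake_generate(prompt):
    return ["tok"] * 20

n_tokens, tok_per_sec = benchmark_generation(fake_generate, "write a sort function")
print(f"{n_tokens} tokens at {tok_per_sec:.1f} tok/s")
```

Swapping in the real model call is the only change needed to reproduce the benchmark; averaging over several prompts gives a steadier number than a single run.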
Visualizing the Flow: Inside the LLM
The animation running at the top of this page is more than just eye candy. It's a visual metaphor, inspired by the complex processes happening within the language models hosted on Avocado. Think of the dots as neurons or concepts, and the faint lines as the static connections learned during training. The pulsing nodes are activated concepts, the traveling yellow particles are information in flight (your prompt being processed), and the bright flashing lines are 'attention' mechanisms: the model focusing on relevant connections to generate a response. It's a glimpse into the dynamic dance of data that allows these AI systems to reason and create.
Overcoming Garage-Built Hurdles
Building from the ground up isn't without its challenges:
- Environment Wrangling: Making Python virtual environments play nice with Arch Linux's system packages required careful management.
- Hardware Limits: Early workarounds for the 8GB 1070 Ti led to the P40 upgrade. Even the P40 brings its own constraint: as a passively cooled data center card, it expects server-grade airflow and needs dedicated cooling in a standard PC case.
- Configuration Complexity: Debugging CORS, Nginx proxies, systemd services, and Docker networking demanded meticulous attention to detail.
- Tooling Integration: Adapting tools like OpenManus or Continue.dev often requires custom code and troubleshooting.
- Infrastructure Stability: Unexpected IP changes highlight the reliance on external factors like DNS.
- Multi-GPU Setup: Getting GPUs of different architectures (the Pascal-based P40 alongside an Ampere RTX 3060 Ti) recognized and utilized by the driver is proving to be the current frontier.
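The venv-vs-system-packages friction is easy to trip over on Arch, where pacman owns site-packages and pip refuses system-wide installs under PEP 668. One habit that helps is having project scripts detect when they're running outside a virtual environment. A small, generic check (not project code, just a sketch of the idea):

```python
import sys

def in_virtualenv():
    """True when running inside a venv/virtualenv.

    Inside a venv, sys.prefix points at the environment while
    sys.base_prefix still points at the base system Python.
    """
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)

if not in_virtualenv():
    print("warning: running against the system Python; "
          "prefer `python -m venv .venv` to avoid clashing with pacman-managed packages")
```

Dropping a check like this at the top of an entry-point script turns a confusing mid-run import error into an immediate, actionable warning.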
Where Avocado Stands Now
After ten intensive weeks, Project Avocado boasts:
- A stable Arch Linux server powered by an NVIDIA P40 (24GB VRAM).
- Multiple LLMs (Mistral 7B, Gemma 3 12B) served via Ollama.
- A custom API for Gemma 3 handling concurrency and sessions.
- A functional web dashboard (pending DNS fix) and integrated UIs for Ollama and ComfyUI.
- A versatile Discord bot providing LLM access and other tools.
- An initial agent framework (OpenManus) capable of basic tool use.
- An ongoing effort to enable a second GPU (RTX 3060 Ti) for specialized tasks.
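The "concurrency and sessions" point deserves a sketch: with a single GPU, generations have to be serialized, while chat history is tracked per session. Here is a minimal shape that kind of service could take, using an asyncio lock and a stub in place of the real Ollama call (all names are illustrative, not the actual API code):

```python
import asyncio

class ChatService:
    """Serializes model calls and tracks per-session history (sketch)."""

    def __init__(self, model_call):
        self.model_call = model_call   # async fn: message history -> reply
        self.lock = asyncio.Lock()     # the GPU serves one generation at a time
        self.sessions = {}             # session_id -> list of messages

    async def chat(self, session_id, user_message):
        history = self.sessions.setdefault(session_id, [])
        history.append({"role": "user", "content": user_message})
        async with self.lock:          # queue concurrent requests here
            reply = await self.model_call(history)
        history.append({"role": "assistant", "content": reply})
        return reply

async def echo_model(history):
    # Stand-in for a real Ollama request: echo the last user message.
    return "echo: " + history[-1]["content"]

async def main():
    svc = ChatService(echo_model)
    r1 = await svc.chat("alice", "hello")
    r2 = await svc.chat("alice", "again")
    return r1, r2, len(svc.sessions["alice"])

r1, r2, n = asyncio.run(main())
print(r1, r2, n)
```

The lock is the crude-but-effective version of resource management; a scheduler or per-model queue would be the natural refinement once more models (or GPUs) are in play.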
The most pressing immediate issue is updating the DNS record for `api.terrydynamics.com` to the new server IP (`67.161.79.201`) to restore web access.
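Catching that kind of silent IP change earlier is scriptable: periodically resolve the domain and compare against the address the server believes it has. A minimal sketch with the resolver passed in so it can be stubbed (in practice it would be `socket.gethostbyname`; the stale address below is a documentation-range placeholder):

```python
def dns_is_current(domain, expected_ip, resolve):
    """Return (ok, resolved_ip); `resolve` maps a hostname to an IP string."""
    resolved = resolve(domain)
    return resolved == expected_ip, resolved

# Stubbed resolver standing in for socket.gethostbyname.
def stale_resolver(domain):
    return "203.0.113.10"   # pretend the DNS record still holds the old IP

ok, seen = dns_is_current("api.terrydynamics.com", "67.161.79.201", stale_resolver)
print("DNS up to date" if ok else f"stale record: {seen}")
```

Run from cron with an alert on mismatch, a check like this turns a surprise outage into a routine notification.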
The Road Ahead: Cultivating an Agent
With the foundational infrastructure largely in place and the powerful P40 ready, the next phase for Project Avocado shifts decisively towards its core vision: **agent development**. This means:
- Leveraging Gemma 3 (and potentially other large models suited for the P40) within the agent framework.
- Defining more complex tasks and tools (the "MCPs", i.e. Model Context Protocol servers) for the agent to interact with the system and potentially the outside world.
- Refining the system architecture for robust model management and resource allocation (moving beyond workarounds like Ollama restarts).
- Resolving the multi-GPU setup to potentially offload tasks like image generation to the 3060 Ti.
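Once the second GPU comes online, the "offload" idea boils down to routing: each task type maps to a backend pinned to a device. A toy dispatcher showing the shape of that decision (the routing table and backend names are invented for illustration, not the project's configuration):

```python
# Hypothetical routing table: task type -> (backend, GPU).
ROUTES = {
    "chat":  ("ollama/gemma3:12b", "P40"),
    "code":  ("ollama/deepseek-coder", "P40"),
    "image": ("comfyui", "RTX 3060 Ti"),
}

def route(task_type):
    """Pick a backend and device for a task, defaulting to chat."""
    return ROUTES.get(task_type, ROUTES["chat"])

backend, gpu = route("image")
print(f"image generation -> {backend} on {gpu}")
```

Keeping the table in one place means the agent framework never needs to know which GPU is which; it just names the kind of work it wants done.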
Project Avocado is more than just code and hardware; it's an exploration into the future of AI, built with passion and perseverance. Stay tuned as we continue this journey, pushing the boundaries of what's possible with self-hosted AI.