Cluster Architecture (Homelab)

My homelab cluster is the foundation behind nearly everything I build — from AI services and theological search systems to livestreaming infrastructure and my Bible platform. The environment is designed around low-cost, scalable hardware using Kubernetes, allowing me to experiment with distributed systems, CI/CD pipelines, self-hosted tooling, vector search, AI agents, and cloud-native development practices from home.

The rack combines ARM nodes, self-hosted services, storage, networking, and GPU-assisted AI workloads into a single ecosystem that I can continuously expand and refine over time.

How my homelab Kubernetes cluster is architected.

Overview

The cluster is built primarily around lightweight Kubernetes (K3s) running on multiple nodes. The goal is to maintain a modular infrastructure that supports:

Self-hosted applications
AI microservices
Development environments
CI/CD pipelines
Streaming infrastructure
Databases and caching
Vector search and retrieval systems
Experimental AI agents

The setup is designed to mimic production-style infrastructure while remaining affordable and power efficient.

Core Infrastructure

Kubernetes Distribution

I use K3s because it is lightweight, simple to manage, and works well on ARM-based hardware while still supporting production-like workflows.

Key benefits:

Low resource overhead
Fast deployments
Simple cluster management
ARM compatibility
Easy scaling

Hardware Layout

The rack currently consists of multiple low-power nodes running cluster workloads along with a separate GPU development machine.

Cluster Nodes

The Kubernetes nodes handle:

API services
Databases
Internal tooling
Web applications
Background workers
CI/CD runners
AI orchestration services

GPU Development Machine

Outside the cluster, I run local AI models inside Docker containers on a dedicated development machine with an RTX 3090.

This system is used for:

Running local LLMs
Embedding generation
AI experimentation
Agent workflows
Model testing
Retrieval pipelines

Long term, the plan is to integrate the GPU machine more tightly into the cluster for shared orchestration and scheduling.

Networking

The cluster uses internal networking for service-to-service communication while exposing selected applications through ingress controllers and reverse proxies.

Typical traffic flow:

Internet
   ↓
Reverse Proxy / Ingress
   ↓
Kubernetes Services
   ↓
Pods / Microservices

Storage

Persistent workloads use mounted storage volumes for:

Databases
User uploads
Media
Vector indexes
Application data
Logs

Assets such as videos, game files, and uploaded media are stored externally on Amazon S3.

Databases

The stack currently uses multiple storage layers depending on workload type.

PostgreSQL

Primary relational database used for:

User accounts
App data
Content systems
Metadata
Authentication

Redis

Used for:

Caching
Queues
Session storage
Temporary state
High-speed lookups

Vector Storage

Vector search infrastructure is used for retrieval systems and AI workflows involving:

Theology datasets
Semantic search
AI memory systems
Embeddings
Knowledge retrieval

AI Infrastructure

AI services are a major part of the architecture.

Current AI Workloads

The infrastructure supports:

Local LLM inference
AI agents
Embedding pipelines
Retrieval systems
Tool-using agents
Conversational systems
Bible-focused AI tools

Bible Logic

One of the core systems connected to the infrastructure is Bible Logic, an AI assistant integrated into my Bible platform.

The broader goal is building infrastructure capable of supporting persistent AI systems with memory, retrieval, tools, and orchestration.

CI/CD

Source control and deployment pipelines are self-hosted.

Gitea

Gitea is used for:

Git repositories
Source management
Internal development
Deployment workflows

CI Runners

CI runners automate:

Builds
Deployments
Container publishing
Infrastructure updates
Kubernetes rollouts

This allows code pushed to repositories to automatically deploy into the cluster.

Application Hosting

The cluster hosts multiple types of applications including:

APIs
Web frontends
AI services
Streaming services
Internal dashboards
Background workers
Experimental tooling

Applications are containerized and deployed through Kubernetes manifests and automated pipelines.

Goals of the Architecture

The main goals of the homelab are:

Learn production infrastructure
Build scalable systems
Self-host applications
Develop AI tooling
Practice DevOps workflows
Experiment with distributed systems
Create a flexible R&D environment

Future Expansion

Planned improvements include:

More cluster nodes
GPU integration into Kubernetes
Improved observability
Dedicated storage systems
High availability services
Expanded AI orchestration
More automation
Edge AI workloads
Multi-cluster experimentation

Conclusion

This homelab has become more than just a place to host applications — it functions as an active research and engineering environment where I can develop real-world skills across infrastructure, AI engineering, Kubernetes, distributed systems, and backend development.

By combining self-hosted services, GPU-powered AI workloads, and automated deployment pipelines into a unified rack setup, the cluster gives me the flexibility to rapidly prototype ideas while learning how modern production systems are designed and operated at scale.

Menu

Cluster Architecture (Homelab)

Cluster Architecture (Homelab)