Storage Setup
Introduction
In a homelab environment like my Raspberry Pi–based K3s Kubernetes cluster, storage is one of the most important foundations to get right early. Unlike cloud environments where persistent volumes are abstracted and managed automatically, bare-metal clusters require intentional design decisions around persistence, reliability, and performance.
In my rack setup—which includes multiple Raspberry Pi nodes running K3s alongside other services in Docker on my development machine—storage has to remain simple, lightweight, and resilient. I don’t have enterprise SAN hardware, so the goal is to build a system that is easy to maintain, survives node restarts, and integrates cleanly with Kubernetes workloads like my Bible app backend, CI/CD pipelines using Gitea runners, and AI services running in containers.
This setup focuses on balancing simplicity and practicality while leaving room for scaling later.
Storage Goals
Before choosing a solution, I defined a few core goals:
- Persistent data must survive pod restarts and node reboots
- Minimal overhead for Raspberry Pi hardware
- Easy backup and restore strategy
- Works well with K3s without complex external dependencies
- Flexible enough to support databases, file uploads, and application state
Given my stack (FastAPI services, PostgreSQL, Redis, and app media storage), storage needs to support both structured and unstructured data.
Storage Architecture Overview
My current homelab storage architecture is split into three layers:
1. Local Node Storage (Primary Layer)
Each Raspberry Pi node uses its own local storage (SD card or SSD depending on node role). This is used for:
- Kubernetes system data
- Container runtime storage
- Ephemeral workloads
- Temporary caching
This layer is fast and simple but not reliable for critical persistence on its own.
2. Persistent Volume Layer (K3s Local Path Provisioner)
For Kubernetes-native persistence, I use K3s’s built-in local path provisioner.
This allows me to create PersistentVolumeClaims (PVCs) that map directly to directories on the node filesystem.
Example use cases:
- PostgreSQL database storage
- Redis persistence (if enabled)
- Application uploads (user images, media, documents)
- Gitea repositories and CI artifacts
Typical path on nodes:
/var/lib/rancher/k3s/storage
This approach keeps everything lightweight and avoids needing external storage systems like NFS or Ceph.
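As a minimal sketch of what such a claim looks like, the helper below prints a PVC manifest that targets the `local-path` storage class K3s ships with. The function name `pvc_manifest` and the claim name, namespace, and size in the usage line are illustrative assumptions, not values from my cluster.

```shell
# Print a PVC manifest that uses the K3s "local-path" storage class.
# Arguments (all illustrative): claim name, namespace, requested size.
pvc_manifest() {
  name="$1"; namespace="$2"; size="$3"
  cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${name}
  namespace: ${namespace}
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-path
  resources:
    requests:
      storage: ${size}
EOF
}

# Usage (assumes kubectl is pointed at the cluster):
#   pvc_manifest postgres-data default 5Gi | kubectl apply -f -
```

Generating the manifest from a function keeps claims consistent across apps. Note that the requested size is mostly informational here: as far as I know, the local path provisioner does not enforce capacity limits, so disk usage still has to be watched on the node itself.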
3. External/Off-Cluster Storage (Development Machine Integration)
In my broader rack setup, my development machine (with an RTX 3090 and Docker-based model containers) acts as an auxiliary compute and storage companion.
I occasionally share data between the cluster and this machine, covering:
- Dev machine Docker volumes
- Kubernetes workloads (via sync or backup jobs)
- Model artifacts and datasets for AI pipelines
This is not tightly coupled to Kubernetes, but it supports my workflow for AI experimentation and training data.
Persistent Data Strategy
For production-like stability inside the cluster, I separate workloads by persistence needs:
Stateless workloads
- API services (FastAPI, Node.js backends)
- Frontend apps
- Workers and schedulers
These can be freely rescheduled across nodes.
Stateful workloads
- PostgreSQL (primary database for apps like my Bible platform)
- Gitea (repositories and CI/CD metadata)
- Redis (if persistence is enabled)
- File storage services
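A local-path volume lives on a single node, so pods that use one are effectively tied to that node. Where placement matters (for example, keeping a database on an SSD-backed node), I pin the workload explicitly with node affinity.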
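One way to express that pinning, sketched under assumptions: the helper below builds a patch that adds a `nodeSelector` on the standard `kubernetes.io/hostname` label. The function name `pin_patch`, the node name `pi-node-2`, and the StatefulSet name `postgres` are hypothetical.

```shell
# Build a patch that pins a workload's pods to a single node by hostname.
# $1: node name as shown by `kubectl get nodes` (illustrative).
pin_patch() {
  node="$1"
  printf '{"spec":{"template":{"spec":{"nodeSelector":{"kubernetes.io/hostname":"%s"}}}}}\n' "$node"
}

# Usage (assumes a StatefulSet named "postgres" exists):
#   kubectl patch statefulset postgres -p "$(pin_patch pi-node-2)"
```

Setting the selector before the first pod starts matters most, since that first scheduling decision is what determines which node the volume's directory is created on.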
Backup Strategy
Because local-path storage is not inherently redundant, backups are essential.
My backup approach includes:
- Nightly cron-based copies of volume directories (a simple rsync-based approach rather than true point-in-time snapshots)
- Git-based backups for code (via Gitea)
- Database dumps for PostgreSQL
- Manual sync to external storage on my dev machine
- Long-term optional cloud backup (future expansion)
Example PostgreSQL backup flow:
kubectl exec postgres-pod -- pg_dumpall > backup.sql
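For nightly runs, a one-off dump file gets overwritten each time, so dated file names and a retention limit help. The sketch below is one way to do that; the helper names `backup_name` and `prune_backups`, the `/backups` directory, and the seven-file retention are assumptions for illustration.

```shell
# Name a dump file by date, e.g. pg-2025-09-20.sql.gz
backup_name() {
  printf 'pg-%s.sql.gz\n' "$(date +%F)"
}

# Keep only the newest N dumps in a directory.
# $1: backup directory, $2: number of dumps to keep.
# ISO dates in the file names sort chronologically, so a reverse
# sort puts the newest first and tail selects everything older.
prune_backups() {
  dir="$1"; keep="$2"
  ls -1 "$dir"/pg-*.sql.gz 2>/dev/null | sort -r | tail -n +"$((keep + 1))" |
    while IFS= read -r old; do rm -f -- "$old"; done
}

# Usage from a cron job (paths and retention are illustrative):
#   kubectl exec postgres-pod -- pg_dumpall | gzip > "/backups/$(backup_name)"
#   prune_backups /backups 7
```

A dump is only useful if it restores, so it is worth occasionally loading one into a scratch database to confirm it is complete.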
Storage for the Bible App Ecosystem
A large part of this setup is designed around my Bible app infrastructure.
Storage is used for:
- User avatars and cover images (stored in S3, with a local fallback in development)
- Uploaded media content
- Devotional content and blog assets
- Game marketplace assets (JavaScript games stored in S3)
- Video ad assets and streaming metadata
The system is designed so that Kubernetes handles orchestration, while storage is abstracted depending on environment (local vs production-ready S3).
Scaling Considerations
As the cluster grows beyond Raspberry Pi nodes, I may introduce:
- NFS server on a dedicated node or mini-NAS
- Longhorn for distributed block storage
- MinIO for S3-compatible object storage
- Ceph (only if the cluster becomes significantly larger)
For now, simplicity wins. K3s + local-path provisioner keeps everything understandable and debuggable.
Operational Notes
A few key lessons from running this setup:
- SD cards are not reliable long-term for heavy write workloads (SSDs are preferred)
- PVC cleanup must be monitored to avoid orphaned storage usage
- Database workloads should always be pinned and not freely rescheduled
- Backups matter more than redundancy at small scale
- Simplicity beats overengineering in early homelab stages
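To make the PVC cleanup point concrete, here is a rough sketch for spotting leftover directories under the storage path. It assumes the provisioner names each directory `<pv-name>_<namespace>_<pvc-name>`, which is worth verifying on your own nodes first; the function name `find_orphans` and the temp file paths are hypothetical.

```shell
# Print storage directories whose PV no longer exists in the cluster.
# $1: file with one directory name per line (from the storage path)
# $2: file with one live PV name per line
# Assumes directory names look like <pv-name>_<namespace>_<pvc-name>.
find_orphans() {
  dirs_file="$1"; pvs_file="$2"
  while IFS= read -r dir; do
    pv="${dir%%_*}"   # text before the first underscore
    grep -qxF -- "$pv" "$pvs_file" || printf '%s\n' "$dir"
  done < "$dirs_file"
}

# Usage on a node (file paths are illustrative):
#   ls /var/lib/rancher/k3s/storage > /tmp/dirs.txt
#   kubectl get pv --no-headers -o custom-columns=NAME:.metadata.name > /tmp/pvs.txt
#   find_orphans /tmp/dirs.txt /tmp/pvs.txt
```

The function only reports candidates and deletes nothing, so each directory can be checked by hand before it is removed.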
Conclusion
This storage setup is intentionally lightweight, reflecting the constraints and goals of a Raspberry Pi–based K3s homelab. Instead of introducing heavy distributed storage systems too early, the focus is on clarity, maintainability, and integration with real workloads like my Bible app, CI/CD pipelines, and AI services running across my rack.
As the system evolves, this foundation allows me to layer in more advanced storage solutions without needing to redesign everything from scratch. For now, local-path storage combined with disciplined backups provides a stable and practical base for development and experimentation.
Written by
5+ years building production systems · AI, Backend & Infrastructure · Founder of Bible Logic
Full-stack engineer with 5+ years of hands-on experience designing and shipping production systems — from Nuxt 3 frontends and Nitro APIs to self-hosted Kubernetes clusters, RAG pipelines, and real-time AI applications. Everything I write comes from systems I've designed, deployed, and operated in production.

