SaaS

Bible Verse RAG · Hybrid Inference · Real-Time

Full-stack SaaS — Nuxt 3, PostgreSQL, Weaviate vector search, BullMQ queues, RAG-backed AI with hybrid Llama/OpenAI routing, WebSocket/WebRTC real-time, and FFmpeg media processing.

A full-stack spiritual platform engineered across five independently scalable domains: scripture content delivery, RAG-backed AI study with hybrid model routing, real-time community interaction, async media processing, and auth and infrastructure services.

1. HTTP request → Nitro server → PostgreSQL scripture query or Weaviate vector retrieval
2. AI queries routed through custom prompt orchestration → Llama 3.2 8B primary, OpenAI API fallback
3. Media uploads dispatched to BullMQ → FFmpeg transcoding worker → MinIO object storage
4. Live events broadcast via WebSocket sync; livestreams run over WebRTC peer connections
5. Session caching, rate limiting, and cross-instance pub/sub handled by Redis

Domain-driven architecture — each system boundary owns its data access, service logic, and scaling surface without coupling to the others.

by Donavan Jones, 22 January 2024
  • Bible: Core Engine
  • AI: Bible Logic
  • Live: Streaming
  • Social: Community
System Snapshot
Frontend
  • Nuxt 3
  • Vue 3
  • Tailwind CSS
  • TypeScript
Backend
  • Node.js
  • Nitro Server
  • Prisma ORM
  • PostgreSQL
  • Redis
  • BullMQ
  • Weaviate
Infrastructure
  • MinIO (Object Storage)
  • FFmpeg (Media Processing)
  • WebRTC (Peer Communication)
  • WebSockets (Realtime Sync)
AI
  • Llama 3.2 8B (primary inference)
  • OpenAI API (hybrid fallback)
  • Stable Diffusion 2.1
  • Whisper (Speech-to-Text)
  • ElevenLabs (Text-to-Speech)
  • Custom Prompt Orchestration Layer
Use Cases
  • AI-assisted scripture study via RAG pipeline
  • Real-time community discussion and livestreaming
  • Semantic search across Bible and Quran corpora
  • Async audio/video processing for devotional media
Project Scope
  • 100+ Pages
  • RAG Pipeline
  • Job Queues
  • AI Integrations
  • Real-time Streaming
  • Community Platform
  • Media Processing Pipeline
  • Scalable Backend Architecture

Engineering Overview

Bible Verse is a full-stack SaaS platform built around a domain-driven architecture with five independently scalable system boundaries: scripture content delivery, AI-assisted study, real-time community interaction, async media processing, and infrastructure services.

The core engineering challenge is not storing scripture — it is building a system that delivers contextually accurate AI responses, handles concurrent real-time sessions, and processes media asynchronously without blocking the main request path. Each of these problems requires a different architectural pattern, and the platform composes them into a single deployment unit.

The Nitro server coordinates across domains without owning business logic: resolving scripture from PostgreSQL, triggering Weaviate retrievals for AI context, dispatching BullMQ jobs for media, and publishing Redis events for real-time broadcast.

  • 37K+ Verses Indexed: Bible + Quran in Weaviate
  • 5 AI Models: Llama · OpenAI · Whisper · ElevenLabs · SD
  • 3 Job Queues: Transcoding · Notifications · AI Batch
  • 5 Bounded Domains: Independently scalable contexts

Technical Highlights

RAG Pipeline — Weaviate Vector Search

Scripture content is embedded into Weaviate's vector database. At query time, semantically relevant passages are retrieved and injected into the prompt before inference — grounding AI responses in actual scripture text rather than relying on model memory.

Hybrid AI Architecture

Llama 3.2 8B handles primary inference for cost efficiency. A custom orchestration layer classifies query complexity and routes complex or low-confidence cases to the OpenAI API as a fallback, keeping routine queries cheap while preserving quality on hard edge cases.

BullMQ Async Job Queues

Three dedicated queues handle work outside the request path: audio transcoding, notification dispatch, and AI inference batching. No heavy operation blocks API response time — all long-running work is enqueued and processed by dedicated workers.

WebSocket + WebRTC Real-Time Layer

Community features use WebSockets for message sync and reaction broadcasting. Livestream sessions use WebRTC for peer-to-peer video. Redis pub/sub routes events across server instances so state stays consistent regardless of which node a client connects to.

FFmpeg Media Pipeline

Audio and video uploads are accepted immediately and enqueued as BullMQ jobs. A dedicated worker runs FFmpeg transformations, writes processed output to MinIO, and notifies the client via WebSocket on completion. Raw uploads never touch the public delivery path.

Cross-Corpus Semantic Search

Bible and Quran content share a single Weaviate embedding collection. A vector similarity query with metadata filters can surface semantically related passages from either corpus in one call — no separate indices, no post-hoc merging.


System Architecture

Bible Verse follows a modular, domain-driven structure. Each bounded context owns its own data access, service layer, and scaling surface. The Nitro server acts as the coordination layer without being a bottleneck — domain services communicate through typed interfaces, not shared tables.

[Architecture diagram not rendered. Legend: Client request, Data ops, Queue dispatch, Real-time, AI pipeline, Storage]

AI study request: client → Nitro API → Weaviate retrieval → prompt construction → Llama inference (or OpenAI fallback) → SSE stream → client.

Livestream session: WebRTC signaling through Nitro → peer connection established → Redis pub/sub synchronizes chat and reactions across all participants.


System Components

Five bounded domains compose the platform at runtime. Each domain owns its schema, service logic, and data access — no cross-domain direct database queries.

Scripture & Content Engine

Bible + Quran — Content Delivery

  • Multi-translation rendering from a structured PostgreSQL schema with verse-level granularity
  • Per-user highlights, annotations, bookmarks, and reading progress stored relationally
  • Weaviate vector index shared across Bible and Quran for unified semantic search
  • Devotional content pipeline with author tooling and scheduled publish support
  • Full-text and semantic search paths run in parallel; results merged by relevance score
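
The last bullet says the full-text and semantic paths run in parallel and are merged by relevance score. The source does not say how the two score scales are reconciled, so the following is only a minimal sketch assuming min-max normalisation and "best score wins" de-duplication. The `SearchHit` shape and the function names are hypothetical.

```typescript
// Hypothetical result shape; field names are assumptions for illustration.
interface SearchHit {
  verseId: string; // e.g. "bible:john:3:16"
  score: number; // raw relevance score from its own search path
}

// Scale scores to [0, 1] so hits from the two paths become comparable.
function normalise(hits: SearchHit[]): SearchHit[] {
  if (hits.length === 0) return [];
  const scores = hits.map((h) => h.score);
  const min = Math.min(...scores);
  const range = Math.max(...scores) - min;
  return hits.map((h) => ({
    verseId: h.verseId,
    score: range === 0 ? 1 : (h.score - min) / range,
  }));
}

// Merge both lists, keep the best score per verse, sort by descending score.
function mergeHits(fullText: SearchHit[], semantic: SearchHit[], limit = 10): SearchHit[] {
  const best = new Map<string, number>();
  for (const hit of [...normalise(fullText), ...normalise(semantic)]) {
    const current = best.get(hit.verseId);
    if (current === undefined || hit.score > current) best.set(hit.verseId, hit.score);
  }
  return Array.from(best.entries())
    .map(([verseId, score]) => ({ verseId, score }))
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}
```

Min-max scaling is sensitive to outliers; a rank-based fusion would be a reasonable alternative if the raw scores are not well behaved.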

AI Reasoning Layer

Bible Logic — RAG-Backed Inference

  • Weaviate retrieval fetches top-K passages by vector similarity before any inference call
  • Prompt orchestration layer assembles context window from retrieved passages + user query
  • Llama 3.2 8B handles primary inference; OpenAI API activated above a complexity threshold
  • Server-Sent Events stream tokens to the client as they are generated
  • Confidence scoring triggers fallback routing without exposing the switch to the user
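
The token streaming over Server-Sent Events mentioned in the list above comes down to writing `data:` frames separated by a blank line. This is a small sketch of that framing only. The JSON payload shape and the closing `done` event are assumptions, not details taken from the platform.

```typescript
// Format one Server-Sent Events frame.
// A multi-line payload needs one "data:" line per line of text.
function formatSseEvent(data: string, event?: string): string {
  const lines: string[] = [];
  if (event !== undefined) lines.push(`event: ${event}`);
  for (const line of data.split("\n")) lines.push(`data: ${line}`);
  return lines.join("\n") + "\n\n"; // a blank line terminates the frame
}

// Turn generated tokens into frames, ending with a hypothetical "done" event
// so the client knows when to close the stream.
function tokenFrames(tokens: string[]): string[] {
  const frames = tokens.map((token) => formatSseEvent(JSON.stringify({ token })));
  frames.push(formatSseEvent("{}", "done"));
  return frames;
}
```

In a real handler each frame would be written to the response as soon as its token arrives rather than collected in an array first.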

Real-Time Engagement

Live + Community — WebSocket & WebRTC

  • WebRTC peer-to-peer video with Nitro-backed signaling for low-latency livestreams
  • WebSocket event bus broadcasts chat messages, reactions, and thread updates in real time
  • Redis pub/sub synchronises events across multiple server instances without sticky sessions
  • BullMQ handles async notification dispatch so community alerts never block the main thread
  • Presence tracking and user connection state maintained in Redis with TTL expiry
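
Presence with TTL expiry, as in the last bullet, can be illustrated without Redis. The sketch below is an in-memory stand-in with an injected clock so expiry is testable. The class and method names are hypothetical, and a real deployment would rely on Redis key expiry instead of this map.

```typescript
// In-memory stand-in for presence keys with a TTL.
class PresenceTracker {
  // roomId -> (userId -> expiry timestamp in ms)
  private rooms = new Map<string, Map<string, number>>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Each heartbeat pushes the user's expiry forward by one TTL.
  heartbeat(roomId: string, userId: string): void {
    const room = this.rooms.get(roomId) ?? new Map<string, number>();
    room.set(userId, this.now() + this.ttlMs);
    this.rooms.set(roomId, room);
  }

  // Users whose expiry has passed are dropped and not reported.
  online(roomId: string): string[] {
    const room = this.rooms.get(roomId);
    if (room === undefined) return [];
    const current = this.now();
    const users: string[] = [];
    room.forEach((expiresAt, userId) => {
      if (expiresAt > current) users.push(userId);
      else room.delete(userId);
    });
    return users;
  }
}
```

A client that stops sending heartbeats simply ages out, so no explicit disconnect handling is needed for crashed tabs or dropped connections.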

Media Processing Pipeline

Upload → Queue → Transcode → Deliver

  • Uploads accepted immediately via MinIO pre-signed URL — API stays non-blocking
  • BullMQ job enqueued on upload; dedicated FFmpeg worker processes audio and video
  • Output formats include MP3, MP4, and HLS segment sets for adaptive streaming
  • Processed assets written to MinIO; DB record updated; client notified via WebSocket
  • Failed jobs retry with exponential back-off; dead-letter queue captures persistent failures
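
The retry behaviour in the last bullet can be sketched as a pure decision function. This is a generic doubling schedule under assumed settings (`maxAttempts`, `baseDelayMs`), not a statement of how the platform's queues are configured.

```typescript
// Hypothetical retry settings; the real values are not given in the source.
interface RetryPolicy {
  maxAttempts: number;
  baseDelayMs: number;
}

type NextStep = { action: "retry"; delayMs: number } | { action: "dead-letter" };

// Delay doubles with each failed attempt: base, 2x base, 4x base, ...
function backoffDelayMs(attemptsMade: number, baseDelayMs: number): number {
  return baseDelayMs * Math.pow(2, attemptsMade - 1);
}

// Decide what happens after a job has failed `attemptsMade` times.
function onFailure(attemptsMade: number, policy: RetryPolicy): NextStep {
  if (attemptsMade >= policy.maxAttempts) return { action: "dead-letter" };
  return { action: "retry", delayMs: backoffDelayMs(attemptsMade, policy.baseDelayMs) };
}
```

With `maxAttempts: 3` and `baseDelayMs: 1000`, the first failure retries after 1000 ms, the second after 2000 ms, and the third sends the job to the dead-letter path.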

Auth & Infrastructure Services

Sessions · Rate Limits · Data Access

  • JWT access tokens (15 min TTL) with Redis-backed refresh token rotation
  • Server-side session revocation via Redis without waiting on token expiry
  • Per-IP and per-user rate limiting enforced at Nitro middleware before any handler runs
  • Role-based access control evaluated at the service layer — route guards are UI-only
  • PostgreSQL connection pooling via pg; Prisma client types generated from the Prisma schema
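
The rate limiting in the list above is described elsewhere in this write-up as sliding-window counters. Below is a minimal in-memory sketch of that technique, assuming a simple timestamp log per key. The class name is hypothetical, and the Redis-backed version the platform uses is not shown in the source.

```typescript
// Sliding-window limiter: at most `limit` requests per key in any `windowMs` span.
class SlidingWindowLimiter {
  private hits = new Map<string, number[]>();
  private limit: number;
  private windowMs: number;
  private now: () => number;

  constructor(limit: number, windowMs: number, now: () => number = Date.now) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.now = now;
  }

  // Returns true and records the hit if the key is under its limit.
  allow(key: string): boolean {
    const current = this.now();
    const cutoff = current - this.windowMs;
    // Keep only timestamps that still fall inside the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    const allowed = recent.length < this.limit;
    if (allowed) recent.push(current);
    this.hits.set(key, recent);
    return allowed;
  }
}
```

A middleware would call `allow` once per request with a key such as the client IP plus route, and respond with HTTP 429 when it returns false.
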
Core Technologies
  • Nuxt 3
  • Vue 3
  • TypeScript
  • Nitro Server
  • Prisma ORM
  • PostgreSQL
  • Redis
  • BullMQ
  • Weaviate
  • MinIO
  • FFmpeg
  • WebRTC
  • WebSockets
  • Llama 3.2 8B
  • OpenAI API
  • Whisper STT
  • ElevenLabs TTS
  • Stable Diffusion 2.1
  • Zod
  • Pinia

Challenges Solved

Real-Time State Consistency Across Instances

Problem

Chat messages, reactions, and community updates needed to stay consistent for users connected to different server instances simultaneously.

Solution

Redis pub/sub acts as the cross-instance event bus. WebSocket handlers publish events to a Redis channel; all server instances subscribe and broadcast to their locally connected clients. PostgreSQL is the authoritative state store — Redis is the delivery mechanism only.
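
The fan-out pattern can be illustrated with an in-memory broker standing in for Redis. Everything here is a sketch: `InMemoryBroker`, `ServerInstance`, and the channel name are hypothetical, and the PostgreSQL write that should precede publishing is omitted.

```typescript
type Listener = (message: string) => void;

// Stand-in for Redis pub/sub: every subscriber of a channel gets every message.
class InMemoryBroker {
  private channels = new Map<string, Listener[]>();

  subscribe(channel: string, listener: Listener): void {
    const listeners = this.channels.get(channel) ?? [];
    listeners.push(listener);
    this.channels.set(channel, listeners);
  }

  publish(channel: string, message: string): void {
    for (const listener of this.channels.get(channel) ?? []) listener(message);
  }
}

// One server process. It never talks to other instances directly.
class ServerInstance {
  // Stand-in for frames written to locally connected WebSocket clients.
  readonly sentToLocalClients: string[] = [];
  private name: string;
  private broker: InMemoryBroker;
  private channel: string;

  constructor(name: string, broker: InMemoryBroker, channel: string) {
    this.name = name;
    this.broker = broker;
    this.channel = channel;
    broker.subscribe(channel, (message) => this.broadcastLocal(message));
  }

  // Called when a locally connected client sends a chat message.
  handleClientMessage(text: string): void {
    this.broker.publish(this.channel, text);
  }

  private broadcastLocal(message: string): void {
    this.sentToLocalClients.push(`${this.name} -> ${message}`);
  }
}
```

Because the publishing instance is also a subscriber, it delivers to its own clients through the same path as every other instance, which keeps a single code path for local and remote delivery.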

Context-Accurate AI for Theological Queries

Problem

General-purpose models hallucinate or over-generalize when answering specific scripture-referenced or theological questions.

Solution

A RAG pipeline retrieves relevant passages from Weaviate before inference. The prompt is assembled with retrieved context injected before the user query. The orchestration layer escalates low-confidence queries from Llama to the OpenAI API rather than letting the local model guess.
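
Prompt assembly with retrieved context placed before the user query might look like the sketch below. The instruction wording, the `Passage` shape, and the character budget are all assumptions made for illustration. The actual orchestration layer is not shown in the source.

```typescript
// Hypothetical shape of a retrieved passage.
interface Passage {
  reference: string; // e.g. "John 3:16"
  text: string;
}

// Build a prompt with retrieved passages first and the user question last.
// Passages are assumed to arrive sorted by relevance; lower-ranked ones are
// dropped once the character budget is used up.
function buildPrompt(passages: Passage[], question: string, maxContextChars = 4000): string {
  const header =
    "Answer using only the passages below and cite their references. " +
    "If the passages do not contain the answer, say so.";
  const contextLines: string[] = [];
  let used = 0;
  for (const passage of passages) {
    const line = `[${passage.reference}] ${passage.text}`;
    if (used + line.length > maxContextChars) break;
    contextLines.push(line);
    used += line.length;
  }
  return [header, "", "Passages:", ...contextLines, "", `Question: ${question}`].join("\n");
}
```

Telling the model to admit when the passages are insufficient gives the orchestration layer a signal it could use for the low-confidence escalation described above.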

Non-Blocking Media Processing

Problem

Audio and video uploads require CPU-intensive FFmpeg transcoding that cannot run synchronously in the API request path without causing timeouts.

Solution

Uploads are accepted immediately and enqueued as BullMQ jobs. A dedicated worker handles FFmpeg transforms, writes output to MinIO, updates the database record, and notifies the client via WebSocket. The API layer remains non-blocking throughout.

Semantic Search Across Multiple Corpora

Problem

Users needed meaning-based passage discovery across Bible and Quran without knowing exact verse references. Keyword search alone was not sufficient.

Solution

Both corpora share a single Weaviate collection. Vector similarity queries with metadata filters scope results to one or both texts. A single retrieval call surfaces related passages from different scriptures without separate indices or post-hoc merging.
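
To make "vector similarity with a metadata filter" concrete, here is an in-memory sketch that filters by corpus and ranks by cosine similarity. It is not Weaviate client code. The `VerseVector` shape and function names are hypothetical, and the two-dimensional vectors in the usage note are toy data.

```typescript
type Corpus = "bible" | "quran";

// Hypothetical stored object: an embedding plus the metadata used for filtering.
interface VerseVector {
  id: string;
  corpus: Corpus;
  vector: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// One call covers one or both corpora: filter on metadata, then rank by similarity.
function searchVerses(items: VerseVector[], query: number[], corpora: Corpus[], k: number): string[] {
  return items
    .filter((item) => corpora.indexOf(item.corpus) !== -1)
    .map((item) => ({ id: item.id, score: cosineSimilarity(item.vector, query) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((result) => result.id);
}
```

Passing `["bible", "quran"]` ranks both texts in one list, while passing a single corpus scopes the same query without touching a second index.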

Hybrid AI Cost vs. Quality Trade-Off

Problem

Routing all queries through the OpenAI API is cost-prohibitive at scale. Running all queries through the local model alone risks quality degradation on complex theological edge cases.

Solution

The orchestration layer classifies query complexity using lightweight heuristics before inference. Standard Q&A routes to Llama 3.2 8B locally. Queries exceeding a complexity threshold escalate to the OpenAI API. The routing is transparent to the user — only the backend path changes.
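
The source does not list the heuristics used, so the following is only an illustration of what "lightweight heuristics plus a threshold" can mean. The three signals, the keyword list, and the threshold of 2 are assumptions.

```typescript
type Backend = "llama" | "openai";

// Score a query with three cheap signals. All three are illustrative guesses.
function complexityScore(query: string): number {
  let score = 0;
  const words = query.trim().split(/\s+/).filter((w) => w.length > 0);
  if (words.length > 40) score += 1; // long, multi-part questions
  if (/\b(compare|contrast|reconcile|why)\b/i.test(query)) score += 1; // analytical wording
  const references = query.match(/[A-Za-z]+\s\d+:\d+/g) ?? [];
  if (references.length > 1) score += 1; // more than one verse reference
  return score;
}

// Route to the local model unless the score reaches the threshold.
function chooseBackend(query: string, threshold = 2): Backend {
  return complexityScore(query) >= threshold ? "openai" : "llama";
}
```

A single-reference lookup such as "What does John 3:16 say?" scores 0 and stays on the local model, while a comparative question citing two passages scores 2 and escalates.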

Domain Isolation Without Microservices Overhead

Problem

Five functionally distinct domains needed clear boundaries, but splitting into separate services would introduce network latency, deployment complexity, and distributed tracing overhead at current scale.

Solution

Domains are isolated as typed service modules within the monorepo. No domain accesses another's database tables directly. Each exposes a contract consumed by the Nitro API layer — the same isolation benefits as microservices, in a single deployable unit until scale justifies extraction.

Testing Async BullMQ Workers

Problem

BullMQ workers run outside the HTTP request cycle, making them hard to test deterministically — standard API testing tools don't reach the processor functions directly.

Solution

Each worker exports its processor function independently of the queue registration. Unit tests call the processor directly with fixture data. Integration tests spin up a dedicated Redis test instance and verify the full enqueue → process → database update → WebSocket notification cycle end-to-end.
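
A processor written this way takes its side effects as injected dependencies, so a unit test can pass fakes and call it like any other function. The sketch below assumes a transcoding job. The type names, field names, and dependency interface are hypothetical.

```typescript
// Hypothetical job payload and dependencies.
interface TranscodeJobData {
  uploadId: string;
  sourceKey: string;
}

interface TranscodeDeps {
  transcode(sourceKey: string): Promise<string>; // resolves to the output object key
  markProcessed(uploadId: string, outputKey: string): Promise<void>;
  notify(uploadId: string): void;
}

// In a real module this function would be exported on its own, and a separate
// file would register it with the queue. Tests import only this function.
async function processTranscode(data: TranscodeJobData, deps: TranscodeDeps): Promise<string> {
  const outputKey = await deps.transcode(data.sourceKey);
  await deps.markProcessed(data.uploadId, outputKey);
  deps.notify(data.uploadId);
  return outputKey;
}

// Fakes used by a unit test: they record calls instead of doing real work.
function makeFakeDeps(calls: string[]): TranscodeDeps {
  return {
    transcode: async (sourceKey) => {
      calls.push(`transcode:${sourceKey}`);
      return `processed/${sourceKey}`;
    },
    markProcessed: async (uploadId, outputKey) => {
      calls.push(`db:${uploadId}:${outputKey}`);
    },
    notify: (uploadId) => {
      calls.push(`notify:${uploadId}`);
    },
  };
}
```

The recorded call list lets a test assert ordering as well as outcome, for example that the client is notified only after the database record has been updated.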

WebRTC Signaling Reliability

Problem

WebRTC peer connections require a signaling channel to exchange SDP offers and ICE candidates. If the signaling server drops mid-negotiation, the connection never completes and the session is silently lost.

Solution

The Nitro WebSocket handler maintains a per-session signaling state machine. If a client reconnects during negotiation, the server replays the last unacknowledged signaling message. ICE candidate buffering prevents race conditions between offer/answer exchange and candidate delivery.
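
ICE candidate buffering is a small queue: candidates that arrive before the remote description has been applied are held, then flushed in order. The source does not say whether this lives on the client or in the signaling handler, so this sketch is environment-neutral. Candidates are plain strings and `apply` stands in for whatever consumes them.

```typescript
// Holds ICE candidates until the remote description has been applied.
class CandidateBuffer {
  private pending: string[] = [];
  private ready = false;
  private apply: (candidate: string) => void;

  constructor(apply: (candidate: string) => void) {
    this.apply = apply;
  }

  // Early candidates are queued rather than dropped or applied too soon.
  onCandidate(candidate: string): void {
    if (this.ready) this.apply(candidate);
    else this.pending.push(candidate);
  }

  // Call once the remote SDP is in place; flushes the queue in arrival order.
  onRemoteDescriptionSet(): void {
    this.ready = true;
    for (const candidate of this.pending) this.apply(candidate);
    this.pending = [];
  }
}
```

In a browser, `apply` would typically wrap `RTCPeerConnection.addIceCandidate`, which rejects if it is called before a remote description has been set.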


Engineering Perspective

Frontend — Nuxt 3

Vue 3 Composition API throughout — no Options API. SSR on public scripture and devotional pages for SEO; SPA mode for authenticated dashboard routes. Pinia stores scoped per domain. Nuxt auto-imports remove component boilerplate. Strict TypeScript shared between client and server layer.

API Layer — Nitro Server

H3 event handlers with Zod schema validation at every public endpoint. Service layer pattern — controllers are thin, logic lives in typed service modules. Middleware chain: auth → rate limit → logging → handler. Async work dispatched to BullMQ queues rather than awaited in the response path.

Domain-Driven Architecture

Five bounded contexts: Scripture, AI Reasoning, Community, Media, and Infrastructure. Each owns its schema, service logic, and repository layer. Cross-domain calls go through typed service interfaces — no direct DB table access across boundaries. Enables future service extraction without a rewrite.

Data Layer — Four Stores

PostgreSQL + Prisma for relational content and user data with typed migrations. Weaviate vectors indexed by book, chapter, and verse with metadata filters for scoped retrieval. Redis for session store, rate-limit counters, and pub/sub. MinIO for S3-compatible object storage with signed URL delivery.

Auth & Session Management

JWT-based auth with short-lived access tokens and Redis-backed refresh token rotation. Redis tracks active sessions enabling server-side revocation without waiting on token expiry. Rate limiting applied per-IP and per-user at the Nitro middleware layer before any handler executes. Role-based access enforced at the service layer, not just route guards.

Type Safety — End to End

Strict TypeScript across frontend, Nitro server, and shared utilities. Zod schemas at API boundaries generate both runtime validation and inferred TypeScript types — one source of truth. Prisma client types flow from the schema; no hand-written model interfaces. ESLint and Prettier enforced via pre-commit hooks.


Observability & DevOps

Structured Logging

Every request generates a structured log entry with a correlation ID threaded through all downstream calls — PostgreSQL queries, Weaviate retrievals, BullMQ dispatches, and AI inference. Errors include stack traces, request context, and the authenticated user ID. Log level controlled per environment via runtime config.

Health Checks & Circuit Breakers

Each external dependency (PostgreSQL, Redis, Weaviate, MinIO) has a dedicated health check endpoint polled by the load balancer. The AI inference path wraps Llama and OpenAI calls in a circuit breaker: if the error rate exceeds a configured threshold, new requests route to the fallback immediately rather than queuing behind timeouts.
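
A circuit breaker for the inference path can be sketched in a few lines. This version is a simplification: it trips on consecutive failures instead of an error rate, and the threshold and cool-down values are assumptions. The clock is injected so the behaviour is testable.

```typescript
// Minimal circuit breaker: opens after N consecutive failures, then allows a
// trial request once the cool-down has elapsed (the "half-open" state).
class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;
  private failureThreshold: number;
  private cooldownMs: number;
  private now: () => number;

  constructor(failureThreshold: number, cooldownMs: number, now: () => number = Date.now) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  canRequest(): boolean {
    if (this.openedAt === null) return true;
    return this.now() - this.openedAt >= this.cooldownMs;
  }

  recordSuccess(): void {
    this.failures = 0;
    this.openedAt = null;
  }

  recordFailure(): void {
    this.failures += 1;
    if (this.failures >= this.failureThreshold) this.openedAt = this.now();
  }
}

// Route to the fallback while the primary's breaker is open.
function pickInferencePath(primary: CircuitBreaker): "primary" | "fallback" {
  return primary.canRequest() ? "primary" : "fallback";
}
```

While the breaker is open, callers skip the primary entirely, which is what avoids stacking new requests behind calls that are already timing out.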

Metrics & Alerting

Prometheus metrics track request latency, BullMQ queue depth, worker throughput, and AI inference time per model. Grafana dashboards surface queue backlog and Redis memory pressure. Alertmanager notifies on sustained queue stalls, high error rates, and inference latency spikes.

Deployment & Environments

Docker Compose for local development with full service parity — same PostgreSQL, Redis, Weaviate, and MinIO versions as production. Environment config managed via Nuxt runtime config. CI pipeline runs lint, type check, and integration tests before any deploy step executes.


Database Design

PostgreSQL Schema

Scripture modelled at verse granularity — Books → Chapters → Verses with translation variants in a separate table to support multi-translation queries without denormalization. B-tree indexes on verse reference columns; GIN indexes on full-text search fields. User data isolated in separate schemas with FK constraints enforced at the DB layer.

Weaviate Collection Design

A single ScriptureVerse collection stores embeddings for both Bible and Quran. Properties include text, translation, corpus (bible|quran), book, chapter, and verse. The corpus and translation fields act as metadata filters, enabling single-query cross-corpus retrieval without managing separate collections. Vectorizer: text2vec-transformers with a multilingual model.

Redis Key Architecture

Four namespaces: session:{userId} for refresh tokens (7-day TTL), ratelimit:{ip}:{route} for sliding window counters, presence:{roomId} for WebSocket connection state (30s TTL with heartbeat renewal), and events:{instanceId} pub/sub channels for cross-instance broadcast. No business data lives in Redis.
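
Centralising key construction keeps the four namespaces consistent across handlers and workers. The sketch below mirrors the patterns and TTLs listed above. The `redisKeys` object and the constant names are hypothetical.

```typescript
// TTLs taken from the description above, expressed in seconds.
const SESSION_TTL_SECONDS = 7 * 24 * 60 * 60; // 7 days
const PRESENCE_TTL_SECONDS = 30; // renewed by heartbeat

// One builder per namespace so no caller assembles key strings by hand.
const redisKeys = {
  session: (userId: string): string => `session:${userId}`,
  rateLimit: (ip: string, route: string): string => `ratelimit:${ip}:${route}`,
  presence: (roomId: string): string => `presence:${roomId}`,
  events: (instanceId: string): string => `events:${instanceId}`,
};
```

One thing to watch with the `ratelimit:{ip}:{route}` pattern is that IPv6 addresses contain colons, so keys of that form should only be built, never split back apart on `:`.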

Prisma Migration Strategy

All schema changes go through Prisma Migrate, with no hand-written SQL in production. Migrations run in CI before the deploy step; the pipeline blocks if pending migrations are detected. A shadow database catches destructive migration issues in development before they reach staging. Seeder scripts populate translations and scripture content for local dev.


Future Roadmap

Near Term

Infrastructure

  • Docker Compose setup for reproducible local development
  • Kubernetes manifests with horizontal pod autoscaling
  • Structured logging with correlation IDs across all API paths
  • Health checks and circuit breakers around AI inference services
Near Term

Features

  • Mobile app via Capacitor wrapping the existing Nuxt frontend
  • Push notifications for community and prayer activity
  • Offline reading mode with local-first scripture sync
  • User-configurable daily reading plan engine
Mid Term

AI Improvements

  • Fine-tuned scripture embedding model for higher retrieval precision
  • Multi-language support with language-scoped Weaviate collections
  • AI-generated study guides from reading session history
  • On-device inference exploration via WebLLM or Ollama
Long Term

Vision

  • Federated community nodes for church and study group self-hosting
  • Plugin architecture for additional corpora (Torah, Hadith, etc.)
  • Public API for third-party devotional app integrations
  • Analytics dashboard for reading consistency and engagement patterns