Engineering Overview

Bible Verse is a full-stack SaaS platform built around a domain-driven architecture with five independently scalable system boundaries: scripture content delivery, AI-assisted study, real-time community interaction, async media processing, and infrastructure services.

The core engineering challenge is not storing scripture — it is building a system that delivers contextually accurate AI responses, handles concurrent real-time sessions, and processes media asynchronously without blocking the main request path. Each of these problems requires a different architectural pattern, and the platform composes them into a single deployment unit.

The Nitro server coordinates across domains without owning business logic: resolving scripture from PostgreSQL, triggering Weaviate retrievals for AI context, dispatching BullMQ jobs for media, and publishing Redis events for real-time broadcast.

37K+ verses indexed (Bible + Quran in Weaviate)
AI models: Llama · OpenAI · Whisper · ElevenLabs · Stable Diffusion
Job queues: Transcoding · Notifications · AI Batch
Bounded domains: independently scalable contexts

Technical Highlights

RAG Pipeline — Weaviate Vector Search

Scripture content is embedded into Weaviate's vector database. At query time, semantically relevant passages are retrieved and injected into the prompt before inference — grounding AI responses in actual scripture text rather than relying on model memory.
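
The injection step can be sketched as a pure function, assuming the retrieval call has already returned passages as plain objects. The `Passage` shape, the `buildPrompt` name, and the instruction wording are illustrative assumptions, not taken from the platform's code.

```typescript
// Hypothetical shape of a passage returned by the retrieval step.
interface Passage {
  reference: string; // e.g. "John 3:16"
  text: string;
  distance: number; // vector distance; smaller means more similar
}

// Assemble a grounded prompt: closest passages first, user query last.
function buildPrompt(passages: Passage[], query: string, topK = 3): string {
  const context = [...passages]
    .sort((a, b) => a.distance - b.distance)
    .slice(0, topK)
    .map((p) => `[${p.reference}] ${p.text}`)
    .join("\n");

  return [
    "Answer using only the passages below. Cite references in brackets.",
    "",
    "Passages:",
    context,
    "",
    `Question: ${query}`,
  ].join("\n");
}
```

Keeping prompt assembly free of I/O makes it easy to test which passages reach the model for a given query.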

Hybrid AI Architecture

Llama 3.2 8B handles primary inference for cost efficiency. A custom orchestration layer classifies query complexity and routes complex or low-confidence cases to the OpenAI API as a fallback, which keeps routine queries cheap while preserving answer quality on the hard cases.

BullMQ Async Job Queues

Three dedicated queues handle work outside the request path: audio transcoding, notification dispatch, and AI inference batching. Long-running work is enqueued and processed by dedicated workers, so no heavy operation delays an API response.

WebSocket + WebRTC Real-Time Layer

Community features use WebSockets for message sync and reaction broadcasting. Livestream sessions use WebRTC for peer-to-peer video. Redis pub/sub routes events across server instances so state stays consistent regardless of which node a client connects to.

FFmpeg Media Pipeline

Audio and video uploads are accepted immediately and enqueued as BullMQ jobs. A dedicated worker runs FFmpeg transformations, writes processed output to MinIO, and notifies the client via WebSocket on completion. Raw uploads never touch the public delivery path.

Cross-Corpus Semantic Search

Bible and Quran content share a single Weaviate embedding collection. A vector similarity query with metadata filters can surface semantically related passages from either corpus in one call — no separate indices, no post-hoc merging.

System Architecture

Bible Verse follows a modular, domain-driven structure. Each bounded context owns its own data access, service layer, and scaling surface. The Nitro server acts as the coordination layer without being a bottleneck — domain services communicate through typed interfaces, not shared tables.

AI study request: client → Nitro API → Weaviate retrieval → prompt construction → Llama inference (or OpenAI fallback) → SSE stream → client.

Livestream session: WebRTC signaling through Nitro → peer connection established → Redis pub/sub synchronizes chat and reactions across all participants.

System Components

Five bounded domains compose the platform at runtime. Each domain owns its schema, service logic, and data access — no cross-domain direct database queries.

Scripture & Content Engine

Bible + Quran — Content Delivery

Multi-translation rendering from a structured PostgreSQL schema with verse-level granularity
Per-user highlights, annotations, bookmarks, and reading progress stored relationally
Weaviate vector index shared across Bible and Quran for unified semantic search
Devotional content pipeline with author tooling and scheduled publish support
Full-text and semantic search paths run in parallel; results merged by relevance score
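
The merge of the two search paths might look like the sketch below. It assumes each path already reports a score normalized to the 0..1 range, which the source does not state; `SearchHit` and `mergeHits` are hypothetical names.

```typescript
interface SearchHit {
  verseId: string;
  score: number; // assumed normalized to 0..1 by each search path
  source: "fulltext" | "semantic";
}

// Deduplicate by verse, keep the better-scoring hit, sort best first.
function mergeHits(fullText: SearchHit[], semantic: SearchHit[]): SearchHit[] {
  const best = new Map<string, SearchHit>();
  for (const hit of [...fullText, ...semantic]) {
    const current = best.get(hit.verseId);
    if (!current || hit.score > current.score) best.set(hit.verseId, hit);
  }
  return Array.from(best.values()).sort((a, b) => b.score - a.score);
}
```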

AI Reasoning Layer

Bible Logic — RAG-Backed Inference

Weaviate retrieval fetches top-K passages by vector similarity before any inference call
Prompt orchestration layer assembles context window from retrieved passages + user query
Llama 3.2 8B handles primary inference; OpenAI API activated above a complexity threshold
Server-Sent Events stream tokens to the client as they are generated
Confidence scoring triggers fallback routing without exposing the switch to the user
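
Token streaming over Server-Sent Events comes down to writing small text frames. The sketch below shows one way to encode them; the event names `token` and `done` and the JSON payload shape are assumptions made for illustration.

```typescript
// Encode one generated token as a Server-Sent Events frame.
// JSON-encoding keeps a newline inside a token from splitting the frame.
function toSseFrame(token: string, event = "token"): string {
  return `event: ${event}\ndata: ${JSON.stringify({ token })}\n\n`;
}

// Terminal frame so the client can tell a finished stream from a dropped one.
function doneFrame(): string {
  return "event: done\ndata: {}\n\n";
}
```

On the client, an `EventSource` listener for the chosen event name would parse `event.data` as JSON and append the token to the rendered answer.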

Real-Time Engagement

Live + Community — WebSocket & WebRTC

WebRTC peer-to-peer video with Nitro-backed signaling for low-latency livestreams
WebSocket event bus broadcasts chat messages, reactions, and thread updates in real time
Redis pub/sub synchronizes events across multiple server instances without sticky sessions
BullMQ handles async notification dispatch so community alerts never block the main thread
Presence tracking and user connection state maintained in Redis with TTL expiry

Media Processing Pipeline

Upload → Queue → Transcode → Deliver

Uploads accepted immediately via MinIO pre-signed URL — API stays non-blocking
BullMQ job enqueued on upload; dedicated FFmpeg worker processes audio and video
Output formats include MP3, MP4, and HLS segment sets for adaptive streaming
Processed assets written to MinIO; DB record updated; client notified via WebSocket
Failed jobs retry with exponential back-off; dead-letter queue captures persistent failures
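
The retry policy in the last step can be illustrated with a small decision function. The base delay, cap, and attempt limit below are placeholder values, and `backoffDelayMs` and `onFailure` are hypothetical helpers rather than BullMQ APIs; in BullMQ itself the equivalent behavior is configured through the `attempts` and `backoff` job options.

```typescript
// Delay before retry number `attempt` (1-based): doubles each time, up to a cap.
function backoffDelayMs(attempt: number, baseMs = 1000, capMs = 60000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt - 1));
}

type NextStep =
  | { action: "retry"; delayMs: number }
  | { action: "dead-letter" };

// After a failure, either schedule a retry or hand the job to the dead-letter queue.
function onFailure(attemptsMade: number, maxAttempts = 5): NextStep {
  return attemptsMade < maxAttempts
    ? { action: "retry", delayMs: backoffDelayMs(attemptsMade) }
    : { action: "dead-letter" };
}
```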

Auth & Infrastructure Services

Sessions · Rate Limits · Data Access

JWT access tokens (15 min TTL) with Redis-backed refresh token rotation
Server-side session revocation via Redis without waiting on token expiry
Per-IP and per-user rate limiting enforced at Nitro middleware before any handler runs
Role-based access control evaluated at the service layer — route guards are UI-only
PostgreSQL connection pooling via pg; Prisma client types generated from the Prisma schema

Core Technologies

Nuxt 3 · Vue 3 · TypeScript · Nitro Server · Prisma ORM · PostgreSQL · Redis · BullMQ · Weaviate · MinIO · FFmpeg · WebRTC · WebSockets · Llama 3.2 8B · OpenAI API · Whisper STT · ElevenLabs TTS · Stable Diffusion 2.1 · Zod · Pinia

Challenges Solved

Real-Time State Consistency Across Instances

Problem

Chat messages, reactions, and community updates needed to stay consistent for users connected to different server instances simultaneously.

Solution

Redis pub/sub acts as the cross-instance event bus. WebSocket handlers publish events to a Redis channel; all server instances subscribe and broadcast to their locally connected clients. PostgreSQL is the authoritative state store — Redis is the delivery mechanism only.
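
The fan-out pattern can be shown without a running Redis. In the sketch below an in-memory `Bus` stands in for the Redis channel, and each `ServerInstance` relays whatever arrives on the bus to its own clients. All class and method names are illustrative.

```typescript
type Handler = (message: string) => void;

// In-memory stand-in for a Redis pub/sub channel (assumption for this sketch).
class Bus {
  private handlers: Handler[] = [];
  subscribe(handler: Handler): void {
    this.handlers.push(handler);
  }
  publish(message: string): void {
    this.handlers.forEach((handler) => handler(message));
  }
}

// One server instance: subscribes once, relays to its locally connected clients.
class ServerInstance {
  readonly delivered: string[] = []; // what this instance's clients received
  private readonly bus: Bus;

  constructor(name: string, bus: Bus) {
    this.bus = bus;
    bus.subscribe((message) => this.delivered.push(`${name}:${message}`));
  }

  // Called by a WebSocket handler once the message is persisted.
  handleClientMessage(message: string): void {
    this.bus.publish(message);
  }
}

const bus = new Bus();
const nodeA = new ServerInstance("a", bus);
const nodeB = new ServerInstance("b", bus);
nodeA.handleClientMessage("hello");
```

A message accepted by one instance reaches clients on both, which is why sticky sessions are not needed.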

Context-Accurate AI for Theological Queries

Problem

General-purpose models hallucinate or over-generalize when answering specific scripture-referenced or theological questions.

Solution

A RAG pipeline retrieves relevant passages from Weaviate before inference. The prompt is assembled with retrieved context injected before the user query. The orchestration layer escalates low-confidence queries from Llama to the OpenAI API rather than letting the local model guess.

Non-Blocking Media Processing

Problem

Audio and video uploads require CPU-intensive FFmpeg transcoding that cannot run synchronously in the API request path without causing timeouts.

Solution

Uploads are accepted immediately and enqueued as BullMQ jobs. A dedicated worker handles FFmpeg transforms, writes output to MinIO, updates the database record, and notifies the client via WebSocket. The API layer remains non-blocking throughout.

Semantic Search Across Multiple Corpora

Problem

Users needed meaning-based passage discovery across Bible and Quran without knowing exact verse references. Keyword search alone was not sufficient.

Solution

Both corpora share a single Weaviate collection. Vector similarity queries with metadata filters scope results to one or both texts. A single retrieval call surfaces related passages from different scriptures without separate indices or post-hoc merging.

Hybrid AI Cost vs. Quality Trade-Off

Problem

Routing all queries through the OpenAI API is cost-prohibitive at scale. Running all queries through the local model alone risks quality degradation on complex theological edge cases.

Solution

The orchestration layer classifies query complexity using lightweight heuristics before inference. Standard Q&A routes to Llama 3.2 8B locally. Queries exceeding a complexity threshold escalate to the OpenAI API. The routing is transparent to the user — only the backend path changes.
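
A rough sketch of heuristic routing follows. The source does not list the actual heuristics, so the three signals used here (query length, number of verse references, comparative wording) and the threshold are assumptions chosen only to show the shape of the decision.

```typescript
type Route = "llama" | "openai";

// Illustrative signals only: long queries, several verse references,
// or comparative wording each add one point of complexity.
function complexityScore(query: string): number {
  const words = query.trim().split(/\s+/).length;
  const references = (query.match(/[A-Z][a-z]+ \d+:\d+/g) || []).length;
  const comparative = /\b(compare|contrast|reconcile|versus)\b/i.test(query) ? 1 : 0;
  return (words > 40 ? 1 : 0) + (references > 1 ? 1 : 0) + comparative;
}

// Queries whose score exceeds the threshold escalate to the hosted model.
function routeQuery(query: string, threshold = 1): Route {
  return complexityScore(query) > threshold ? "openai" : "llama";
}
```

Because the classifier runs before inference, a misrouted query costs only one extra model call; the threshold can be tuned against observed answer quality.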

Domain Isolation Without Microservices Overhead

Problem

Five functionally distinct domains needed clear boundaries, but splitting into separate services would introduce network latency, deployment complexity, and distributed tracing overhead at current scale.

Solution

Domains are isolated as typed service modules within the monorepo. No domain accesses another's database tables directly. Each exposes a contract consumed by the Nitro API layer. This gives most of the isolation benefit of microservices in a single deployable unit, and a domain can be extracted into its own service once scale justifies it.
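
One way to picture the contract boundary: the AI domain receives an interface, never the Scripture domain's storage. The names below (`ScriptureService`, `createScriptureService`, `buildStudyContext`) are hypothetical, and the `Map` stands in for PostgreSQL tables.

```typescript
interface Verse {
  reference: string;
  text: string;
}

// Contract the Scripture domain exposes; other domains depend only on this.
interface ScriptureService {
  getVerse(reference: string): Verse | undefined;
}

// Scripture domain: owns its data. The Map stands in for its database tables.
function createScriptureService(rows: Verse[]): ScriptureService {
  const byReference = new Map<string, Verse>();
  rows.forEach((verse) => byReference.set(verse.reference, verse));
  return { getVerse: (reference) => byReference.get(reference) };
}

// AI domain: receives the contract and cannot reach the underlying storage.
function buildStudyContext(scripture: ScriptureService, references: string[]): string[] {
  return references
    .map((reference) => scripture.getVerse(reference))
    .filter((verse): verse is Verse => verse !== undefined)
    .map((verse) => `${verse.reference}: ${verse.text}`);
}
```

Extracting the Scripture domain later would mean swapping the factory for an HTTP or RPC client that satisfies the same interface.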

Testing Async BullMQ Workers

Problem

BullMQ workers run outside the HTTP request cycle, making them hard to test deterministically — standard API testing tools don't reach the processor functions directly.

Solution

Each worker exports its processor function independently of the queue registration. Unit tests call the processor directly with fixture data. Integration tests spin up a dedicated Redis test instance and verify the full enqueue → process → database update → WebSocket notification cycle end-to-end.
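
The unit-test half of this approach might look like the following sketch. The job payload, the dependency interface, and the queue name in the comment are assumptions; only the pattern of keeping the processor separate from queue registration comes from the description above.

```typescript
// Hypothetical job payload and injected dependencies.
interface TranscodeJobData {
  uploadKey: string;
  format: "mp3" | "mp4";
}
interface TranscodeDeps {
  transcode(uploadKey: string, format: string): Promise<string>; // resolves to output key
  markProcessed(uploadKey: string, outputKey: string): Promise<void>;
  notify(uploadKey: string, outputKey: string): void;
}

// Defined on its own so tests can call it without a queue or Redis.
async function processTranscode(data: TranscodeJobData, deps: TranscodeDeps): Promise<string> {
  const outputKey = await deps.transcode(data.uploadKey, data.format);
  await deps.markProcessed(data.uploadKey, outputKey);
  deps.notify(data.uploadKey, outputKey);
  return outputKey;
}

// Queue registration would live elsewhere and only wrap the function, e.g.
// new Worker("transcode", (job) => processTranscode(job.data, deps), { connection })

// Unit-test style call with fakes in place of FFmpeg, PostgreSQL, and WebSocket.
const calls: string[] = [];
const fakeDeps: TranscodeDeps = {
  transcode: async (uploadKey, format) => {
    calls.push("transcode");
    return `processed/${uploadKey}.${format}`;
  },
  markProcessed: async () => {
    calls.push("db");
  },
  notify: () => {
    calls.push("notify");
  },
};
const outcome = processTranscode({ uploadKey: "abc", format: "mp3" }, fakeDeps);
```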

WebRTC Signaling Reliability

Problem

WebRTC peer connections require a signaling channel to exchange SDP offers and ICE candidates. If the signaling server drops mid-negotiation, the connection never completes and the session is silently lost.

Solution

The Nitro WebSocket handler maintains a per-session signaling state machine. If a client reconnects during negotiation, the server replays the last unacknowledged signaling message. ICE candidate buffering prevents race conditions between offer/answer exchange and candidate delivery.
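
ICE candidate buffering, the last piece mentioned above, can be sketched independently of any browser API. `CandidateBuffer` is a hypothetical helper; in a real client the `apply` callback would wrap `RTCPeerConnection.addIceCandidate`.

```typescript
// Holds ICE candidates that arrive before the remote description is set,
// then flushes them in arrival order.
class CandidateBuffer {
  private pending: string[] = [];
  private remoteDescriptionSet = false;
  private readonly apply: (candidate: string) => void;

  constructor(apply: (candidate: string) => void) {
    this.apply = apply;
  }

  addCandidate(candidate: string): void {
    if (this.remoteDescriptionSet) this.apply(candidate);
    else this.pending.push(candidate);
  }

  // Call once the offer or answer has been applied to the peer connection.
  markRemoteDescriptionSet(): void {
    this.remoteDescriptionSet = true;
    this.pending.forEach((candidate) => this.apply(candidate));
    this.pending = [];
  }
}
```

Candidates that race ahead of the offer/answer exchange are held rather than dropped, which is the failure mode the buffering is meant to prevent.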

Engineering Perspective

Frontend — Nuxt 3

Vue 3 Composition API throughout — no Options API. SSR on public scripture and devotional pages for SEO; SPA mode for authenticated dashboard routes. Pinia stores scoped per domain. Nuxt auto-imports remove component boilerplate. Strict TypeScript shared between client and server layer.

API Layer — Nitro Server

H3 event handlers with Zod schema validation at every public endpoint. Service layer pattern — controllers are thin, logic lives in typed service modules. Middleware chain: auth → rate limit → logging → handler. Async work dispatched to BullMQ queues rather than awaited in the response path.
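
The middleware ordering can be illustrated with a small synchronous composer. This is a simplified model and not H3's API: real handlers are async, and the request limit of 2 and the numeric status return values are arbitrary choices for the sketch.

```typescript
interface RequestContext {
  ip: string;
  userId?: string;
  log: string[];
}
type Next = () => number; // returns an HTTP status for brevity
type Middleware = (ctx: RequestContext, next: Next) => number;

// Run middlewares left to right; each one decides whether to call next().
function compose(middlewares: Middleware[], handler: (ctx: RequestContext) => number) {
  return (ctx: RequestContext): number => {
    const run = (index: number): number =>
      index < middlewares.length
        ? middlewares[index](ctx, () => run(index + 1))
        : handler(ctx);
    return run(0);
  };
}

const auth: Middleware = (ctx, next) => (ctx.userId ? next() : 401);

const hits = new Map<string, number>();
const rateLimit: Middleware = (ctx, next) => {
  const count = (hits.get(ctx.ip) || 0) + 1;
  hits.set(ctx.ip, count);
  return count > 2 ? 429 : next(); // limit of 2 is arbitrary
};

const logging: Middleware = (ctx, next) => {
  ctx.log.push("request accepted");
  return next();
};

const handle = compose([auth, rateLimit, logging], () => 200);
```

Because `auth` short-circuits first, unauthenticated requests never reach the rate limiter or the handler.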

Domain-Driven Architecture

Five bounded contexts: Scripture, AI Reasoning, Community, Media, and Infrastructure. Each owns its schema, service logic, and repository layer. Cross-domain calls go through typed service interfaces — no direct DB table access across boundaries. Enables future service extraction without a rewrite.

Data Layer — Four Stores

PostgreSQL + Prisma for relational content and user data with typed migrations. Weaviate for verse embeddings, with book, chapter, and verse stored as metadata so retrieval can be scoped by filter. Redis for session store, rate-limit counters, and pub/sub. MinIO for S3-compatible object storage with signed URL delivery.

Auth & Session Management

JWT-based auth with short-lived access tokens and Redis-backed refresh token rotation. Redis tracks active sessions enabling server-side revocation without waiting on token expiry. Rate limiting applied per-IP and per-user at the Nitro middleware layer before any handler executes. Role-based access enforced at the service layer, not just route guards.

Type Safety — End to End

Strict TypeScript across frontend, Nitro server, and shared utilities. Zod schemas at API boundaries generate both runtime validation and inferred TypeScript types — one source of truth. Prisma client types flow from the schema; no hand-written model interfaces. ESLint and Prettier enforced via pre-commit hooks.

Observability & DevOps

Structured Logging

Every request generates a structured log entry with a correlation ID threaded through all downstream calls — PostgreSQL queries, Weaviate retrievals, BullMQ dispatches, and AI inference. Errors include stack traces, request context, and the authenticated user ID. Log level controlled per environment via runtime config.
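
A minimal sketch of a request-scoped logger follows, assuming log lines are emitted as JSON. `createRequestLogger`, the field names, and the sample messages are illustrative; the source does not name the logging library in use.

```typescript
type Level = "info" | "error";
type Fields = Record<string, unknown>;

// Returns a logger bound to one request's correlation ID, so every
// downstream call logs with the same identifier.
function createRequestLogger(correlationId: string, sink: (line: string) => void) {
  const write = (level: Level, msg: string, fields: Fields = {}): void =>
    sink(JSON.stringify({ level, msg, correlationId, ...fields }));
  return {
    info: (msg: string, fields?: Fields) => write("info", msg, fields),
    error: (msg: string, fields?: Fields) => write("error", msg, fields),
  };
}

const lines: string[] = [];
const log = createRequestLogger("req-123", (line) => lines.push(line));
log.info("weaviate.retrieve", { topK: 5 });
log.error("inference.failed", { model: "llama" });
```

Passing the bound logger into service calls is one simple way to thread the ID; Node's `AsyncLocalStorage` is an alternative that avoids the extra parameter.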

Health Checks & Circuit Breakers

Each external dependency (PostgreSQL, Redis, Weaviate, MinIO) has a dedicated health check endpoint polled by the load balancer. The AI inference path wraps Llama and OpenAI calls in a circuit breaker: if the error rate exceeds a configured threshold, new requests route to the fallback immediately rather than queuing behind timeouts.
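
The breaker behavior can be sketched as below. Note the simplification: this version trips on consecutive failures rather than an error rate, and it is synchronous, whereas real inference calls are async. The class name and parameters are assumptions.

```typescript
// Minimal circuit breaker: opens after `maxFailures` consecutive failures and
// stays open for `cooldownMs`, during which calls go straight to the fallback.
class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;
  private readonly maxFailures: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(maxFailures: number, cooldownMs: number, now: () => number = Date.now) {
    this.maxFailures = maxFailures;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  call<T>(primary: () => T, fallback: () => T): T {
    // While open, skip the primary entirely instead of waiting on it.
    if (this.openedAt !== null && this.now() - this.openedAt < this.cooldownMs) {
      return fallback();
    }
    try {
      const result = primary();
      this.failures = 0;
      this.openedAt = null;
      return result;
    } catch {
      this.failures += 1;
      if (this.failures >= this.maxFailures) this.openedAt = this.now();
      return fallback();
    }
  }
}
```

After the cooldown, the next call tries the primary again; a success closes the breaker and a failure reopens it.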

Metrics & Alerting

Prometheus metrics track request latency, BullMQ queue depth, worker throughput, and AI inference time per model. Grafana dashboards surface queue backlog and Redis memory pressure. Alertmanager notifies on sustained queue stalls, high error rates, and inference latency spikes.

Deployment & Environments

Docker Compose for local development with full service parity — same PostgreSQL, Redis, Weaviate, and MinIO versions as production. Environment config managed via Nuxt runtime config. CI pipeline runs lint, type check, and integration tests before any deploy step executes.

Database Design

PostgreSQL Schema

Scripture modelled at verse granularity — Books → Chapters → Verses with translation variants in a separate table to support multi-translation queries without denormalization. B-tree indexes on verse reference columns; GIN indexes on full-text search fields. User data isolated in separate schemas with FK constraints enforced at the DB layer.

Weaviate Collection Design

A single ScriptureVerse collection stores embeddings for both Bible and Quran. Properties include text, translation, corpus (bible|quran), book, chapter, and verse. The corpus and translation fields act as metadata filters, enabling single-query cross-corpus retrieval without managing separate collections. Vectorizer: text2vec-transformers with a multilingual model.
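
The metadata filters could be built as plain objects. The sketch below mirrors the general shape of Weaviate's GraphQL `where` filter (`path`, `operator`, `valueText`, `operands`); the helper names and the sample translation code are assumptions, and the exact client call that would consume the filter is left out.

```typescript
type Corpus = "bible" | "quran";

// Mirrors the shape of a Weaviate GraphQL `where` filter.
interface WhereFilter {
  operator: "Equal" | "And" | "Or";
  path?: string[];
  valueText?: string;
  operands?: WhereFilter[];
}

const equal = (field: string, value: string): WhereFilter => ({
  operator: "Equal",
  path: [field],
  valueText: value,
});

// Scope a query to one or both corpora and, optionally, a single translation.
function scriptureFilter(corpora: Corpus[], translation?: string): WhereFilter {
  if (corpora.length === 0) throw new Error("at least one corpus is required");
  const corpusFilter: WhereFilter =
    corpora.length === 1
      ? equal("corpus", corpora[0])
      : { operator: "Or", operands: corpora.map((corpus) => equal("corpus", corpus)) };
  return translation
    ? { operator: "And", operands: [corpusFilter, equal("translation", translation)] }
    : corpusFilter;
}
```

For example, `scriptureFilter(["bible", "quran"])` scopes one vector query to both texts, while `scriptureFilter(["bible"], "kjv")` narrows it to a single translation.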

Redis Key Architecture

Four namespaces: session:{userId} for refresh tokens (7-day TTL), ratelimit:{ip}:{route} for sliding window counters, presence:{roomId} for WebSocket connection state (30s TTL with heartbeat renewal), and events:{instanceId} pub/sub channels for cross-instance broadcast. No business data lives in Redis.
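
Centralizing the key formats in one module keeps the namespaces consistent across handlers. A minimal sketch, using the key patterns and TTLs listed above; the `keys` and `TTL_SECONDS` names are hypothetical.

```typescript
// TTLs in seconds, taken from the namespace description.
const TTL_SECONDS = {
  session: 7 * 24 * 60 * 60, // 7 days
  presence: 30, // renewed by heartbeat
} as const;

// One builder per namespace so key formats are defined in a single place.
const keys = {
  session: (userId: string) => `session:${userId}`,
  rateLimit: (ip: string, route: string) => `ratelimit:${ip}:${route}`,
  presence: (roomId: string) => `presence:${roomId}`,
  events: (instanceId: string) => `events:${instanceId}`,
};
```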

Prisma Migration Strategy

All schema changes go through Prisma Migrate, with no hand-written SQL in production. Migrations run in CI before the deploy step; the pipeline blocks if pending migrations are detected. A shadow database catches destructive migration issues in development before they reach staging. Seeder scripts populate translations and scripture content for local dev.

Future Roadmap

Near Term

Infrastructure

Docker Compose setup for reproducible local development
Kubernetes manifests with horizontal pod autoscaling
Structured logging with correlation IDs across all API paths
Health checks and circuit breakers around AI inference services

Near Term

Features

Mobile app via Capacitor wrapping the existing Nuxt frontend
Push notifications for community and prayer activity
Offline reading mode with local-first scripture sync
User-configurable daily reading plan engine

Mid Term

AI Improvements

Fine-tuned scripture embedding model for higher retrieval precision
Multi-language support with language-scoped Weaviate collections
AI-generated study guides from reading session history
On-device inference exploration via WebLLM or Ollama

Long Term

Vision

Federated community nodes for church and study group self-hosting
Plugin architecture for additional corpora (Torah, Hadith, etc.)
Public API for third-party devotional app integrations
Analytics dashboard for reading consistency and engagement patterns
