STATUS · NOT YET LIVE

Lumineltek is a personal project in active development and is not yet publicly available. This case study documents architecture and decisions from the build in progress.

CASE STUDY · LUMINELTEK · PERSONAL PROJECT

Lumineltek
RAG-as-a-Service, from scratch.

A multi-tenant SaaS platform for AI-powered document processing and semantic search. Built end-to-end: Node.js + Postgres + pgvector on the back, Next.js 15 on the front, a public API for external consumers, and billing, permissions, and grounding verification baked in as infrastructure, not features.

01 / the thesis

Most "chat with your docs" tools are demos wearing a production badge. The retrieval is a black box. The answer isn't attributable to a specific passage. Permissions are bolted on after the first enterprise customer asks for them. Billing is duct tape over Stripe.

Lumineltek is an attempt at the serious version: a RAG platform where answers are traceable, the search backend is swappable, tenants are isolated from the root of the schema, and the public API is a first-class surface. If a customer asks "which passage did this sentence come from?" the answer is always available. If they ask "who saw this document yesterday?" there's an audit trail.

02 / ingestion pipeline

Documents flow through four distinct phases. Each phase is a pg-boss worker. Each phase writes its output to disk before the next starts.

  upload ──► Phase 0         Phase 1          Phase 2         Phase 3
            conversion ──► extraction ───► chunking ────► indexing
                               │
                               ▼
                         Chandra (Datalab)
                         primary. OpenAI Vision
                         as deprecated fallback.

One endpoint handles PDF, DOCX, PPTX, XLSX, HTML, EPUB, and images. Chandra is called with include_markdown_in_chunks=true, so the response is pre-flattened blocks with both HTML and markdown. No tree walking, no HTML stripping on our side.
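The phase chain above can be sketched as pg-boss workers that each hand off to the next queue. Queue names and the payload shape are assumptions, and `Queue` is a structural stand-in for the pg-boss client (a real app would `import PgBoss from "pg-boss"`):

```typescript
// Assumed queue names for the four phases; not Lumineltek's internal names.
const PHASES = ["convert", "extract", "chunk", "index"] as const;
type Phase = (typeof PHASES)[number];

interface PhaseJob { documentId: string; }

// Structural stand-in for the pg-boss client (pg-boss v10 delivers jobs in batches).
interface Queue {
  work(name: string, handler: (jobs: { data: PhaseJob }[]) => Promise<void>): Promise<void>;
  send(name: string, data: PhaseJob): Promise<void>;
}

// Phase N hands off to phase N+1; the last phase ends the chain.
function nextPhase(current: Phase): Phase | null {
  const i = PHASES.indexOf(current);
  return i < PHASES.length - 1 ? PHASES[i + 1] : null;
}

async function registerWorkers(boss: Queue): Promise<void> {
  for (const phase of PHASES) {
    await boss.work(phase, async (jobs) => {
      for (const job of jobs) {
        // ... run the phase, checkpoint its output to disk ...
        const next = nextPhase(phase);
        if (next) await boss.send(next, { documentId: job.data.documentId });
      }
    });
  }
}
```

Because each hop is a durable queue message, a crashed worker doesn't lose the document's place in the pipeline.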

DECISION · CHECKPOINTING

Each phase checkpoints to disk before handing off. A worker crash during extraction resumes from the last checkpoint instead of re-running expensive LLM calls. Felt like overkill in week one. By month three it had paid for itself a dozen times over.
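A minimal sketch of that checkpoint-then-resume pattern, with hypothetical helper names and an assumed checkpoint directory:

```typescript
import { promises as fs } from "node:fs";
import * as path from "node:path";

const CHECKPOINT_DIR = "/var/lib/lumineltek/checkpoints"; // assumed location

function checkpointPath(documentId: string, phase: string): string {
  return path.join(CHECKPOINT_DIR, documentId, `${phase}.json`);
}

// Reuse the phase's output if a checkpoint exists; otherwise run the phase
// (possibly expensive LLM calls) and persist the result before handing off.
async function runPhaseWithCheckpoint<T>(
  documentId: string,
  phase: string,
  run: () => Promise<T>,
): Promise<T> {
  const file = checkpointPath(documentId, phase);
  try {
    return JSON.parse(await fs.readFile(file, "utf8")) as T;
  } catch {
    const result = await run();
    await fs.mkdir(path.dirname(file), { recursive: true });
    await fs.writeFile(file, JSON.stringify(result));
    return result;
  }
}
```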

DECISION · STORE EVERYTHING CHANDRA GIVES

Chandra returns bbox, polygon, images, section hierarchy. We store all of it, even when the UI doesn't render it yet. Re-processing later (to build, say, an interactive page overlay that highlights source regions) is far more expensive than the storage.
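A plausible shape for a stored block, keeping every field the case study says Chandra returns; the field names here are illustrative, not a published schema:

```typescript
// Assumed storage shape for one extracted block.
interface ExtractionBlock {
  documentId: string;
  page: number;
  html: string;                      // pre-flattened by Chandra
  markdown: string;                  // via include_markdown_in_chunks=true
  bbox: [number, number, number, number]; // page-space bounding box
  polygon: Array<[number, number]>;  // region outline, for future overlays
  sectionPath: string[];             // section hierarchy, root to leaf
  imageRefs: string[];               // stored even if the UI never renders them
}
```

Keeping `bbox` and `polygon` from day one is what makes a later "highlight the source region on the page" feature a UI task instead of a re-extraction of every document.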

03 / retrieval and search

The semantic search pipeline runs in six stages.

swappable search provider

A provider abstraction selects the backend at runtime:

SEARCH_PROVIDER=pgvector        # or: elasticsearch

Both are fully implemented behind a shared interface. pgvector keeps operations simple (one database). Elasticsearch scales past what pgvector comfortably handles. Having both working means we can start simple and scale horizontally without a platform rewrite.
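One way that shared interface can look, with assumed names and a factory keyed by the `SEARCH_PROVIDER` value:

```typescript
interface SearchHit { chunkId: string; score: number; }

// The contract both backends implement (method names are assumptions).
interface SearchProvider {
  index(chunkId: string, embedding: number[], text: string): Promise<void>;
  query(embedding: number[], topK: number): Promise<SearchHit[]>;
}

// Select the backend at runtime from a registry of factories.
function makeProvider(
  name: string,
  impls: Record<string, () => SearchProvider>,
): SearchProvider {
  const factory = impls[name];
  if (!factory) throw new Error(`unknown SEARCH_PROVIDER: ${name}`);
  return factory();
}

// Hypothetical wiring:
// const provider = makeProvider(process.env.SEARCH_PROVIDER ?? "pgvector", {
//   pgvector: () => new PgVectorProvider(pool),
//   elasticsearch: () => new ElasticsearchProvider(es),
// });
```

Because callers only ever see `SearchProvider`, swapping pgvector for Elasticsearch is a config change, not a refactor.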

04 / the "as-a-service" surface

multi-tenancy from the root

tenant
  └── workspace
        └── domain
              ├── roles         (40+ permission domains, fine-grained RBAC)
              ├── documents
              └── API keys

40+ permission domains is a lot. Getting the taxonomy right up front saved enormous retrofit cost later.
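A fine-grained check over that taxonomy can be sketched as roles granting (domain, action) pairs; the domain, action, and role names below are invented for illustration:

```typescript
type Action = "read" | "write" | "admin";

interface Role {
  name: string;
  grants: Record<string, Action[]>; // permission domain -> allowed actions
}

// Every request is checked against the caller's tenant-scoped roles.
function can(roles: Role[], domain: string, action: Action): boolean {
  return roles.some((r) => (r.grants[domain] ?? []).includes(action));
}

const reviewer: Role = {
  name: "reviewer",
  grants: { documents: ["read"], audit: ["read"] },
};

// can([reviewer], "documents", "read")  -> true
// can([reviewer], "documents", "write") -> false
```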

public API v1

API keys are first-class. Each key scoped to a tenant, rate-limited, usage-metered. The platform bills on real consumption (tokens, extractions, storage), not "10 docs/month" heuristics. Payments flow through Lemon Squeezy (plans, top-ups, subscriptions, webhooks).
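Consumption-based billing needs a per-key usage ledger underneath. A minimal in-memory sketch (names are assumptions; the real thing would persist to Postgres and flush totals to Lemon Squeezy on a schedule):

```typescript
type Meter = "tokens" | "extractions" | "storageBytes";

// Accumulates metered consumption per API key.
class UsageLedger {
  private totals = new Map<string, number>(); // key: `${apiKeyId}:${meter}`

  record(apiKeyId: string, meter: Meter, amount: number): void {
    const key = `${apiKeyId}:${meter}`;
    this.totals.set(key, (this.totals.get(key) ?? 0) + amount);
  }

  total(apiKeyId: string, meter: Meter): number {
    return this.totals.get(`${apiKeyId}:${meter}`) ?? 0;
  }
}
```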

integrations

Confluence and Jira connectors pull content into the ingestion pipeline automatically. A generic DMS integration handles arbitrary document sources. The same 4-phase pipeline applies regardless of origin.

05 / quality infrastructure

Most RAG demos ship without any of this quality infrastructure. That gap is exactly why a platform version needs it.

GROUNDING AS INFRASTRUCTURE

In compliance-heavy contexts, "the answer looks right" isn't enough. Reviewers need to see which passage supports each claim, confidence on each, and an audit trail of who accessed what. Lumineltek treats all of that as infrastructure, not a feature.
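Concretely, "traceable" can mean every sentence of an answer carries the chunk that supports it plus a verifier confidence. The field names here are illustrative, not Lumineltek's API:

```typescript
interface GroundedSentence {
  text: string;
  sourceChunkId: string;  // the passage that supports the claim
  confidence: number;     // 0..1 from the grounding verifier
}

interface GroundedAnswer {
  sentences: GroundedSentence[];
}

// Surface any sentence the verifier could not attribute confidently,
// so a reviewer inspects it instead of trusting it.
function lowConfidence(
  answer: GroundedAnswer,
  threshold = 0.7,
): GroundedSentence[] {
  return answer.sentences.filter((s) => s.confidence < threshold);
}
```

With this shape, "which passage did this sentence come from?" is a field lookup, and the audit trail is a matter of logging who fetched which `sourceChunkId`.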

06 / what I learned

TAKEAWAY

The serious version of RAG is almost entirely about the stuff around the model: ingestion quality, attribution, permissions, billing, verification. Build that layer right and the product earns trust. Skip it and no amount of prompt tuning will save you.
