AIVA

Document Intelligence Platform

AIVA is a document-intelligence platform for internal company documents: HR, contracts, financials, and asset records. The hard part was not orchestration; it was making messy PDFs, scans, blueprints, and rotated tables usable. I built a custom Textract parser that turns those files into tagged, searchable content, then a LangGraph RAG pipeline that answers over it without flooding the model context. It now runs as a multi-tenant platform with streaming chat and a live SharePoint crawl.

Role: Solo Founder + Lead Engineer
Period: 2024 to present
Status: Production

AWS Textract · Microsoft Graph · LangGraph · Weaviate · FastMCP · SvelteKit

— Chapter 01

System shape

How the system fits together.

AIVA turns messy enterprise documents into searchable knowledge, then answers questions over them. The ingestion side now crawls a full company SharePoint.

Fig. 01 — AIVA architecture

— Chapter 02

Decisions and outcomes

The calls that shaped it.

01

The parser solves the data problem head-on: Textract classifies every region of every page, tables become their own CSV files, figures are pulled by exact bounding box and read by a vision model (diagrams → Mermaid), and each document lands as one clean, normalized folder. Everything else stands on it.
02

Tables never get inlined, since that burns context and drowns the search. Each stays a file behind an inline tag the chunker can't split, and an MCP server turns it into a tool the model queries at runtime to filter, aggregate, and join real rows.
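
To make the "tag the chunker can't split" idea concrete, here is a minimal sketch, not the production chunker. It assumes tags follow the [CSV_MCP:…] form and treats each tag as one atomic unit during greedy packing. The regex, the function names, and the `max_chars` limit are all hypothetical.

```python
import re

# Hypothetical tag pattern modelled on the [CSV_MCP:...] form;
# the real payload format is an assumption here.
TAG_RE = re.compile(r"\[CSV_MCP:[^\]]*\]")


def tokenize(text: str) -> list[str]:
    """Split text into atomic units: whole tags, or whitespace-delimited words."""
    units: list[str] = []
    pos = 0
    for match in TAG_RE.finditer(text):
        units.extend(text[pos:match.start()].split())
        units.append(match.group(0))  # the tag stays in one piece
        pos = match.end()
    units.extend(text[pos:].split())
    return units


def chunk(text: str, max_chars: int = 200) -> list[str]:
    """Greedily pack units into chunks of at most max_chars characters.

    A single unit longer than max_chars gets a chunk of its own
    instead of being cut in half.
    """
    chunks: list[str] = []
    current = ""
    for unit in tokenize(text):
        candidate = unit if not current else current + " " + unit
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = unit
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Because a tag is never broken across a chunk boundary, any chunk that retrieval returns carries a complete pointer to its table file.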
03

Every question is routed by cost. A simple lookup is answered from a template in ~200 ms; a substantive question runs the full path: three-layer discovery (keyword catalog → summaries → filtered hybrid search), RRF fusion, an optional Cohere rerank, composition, then verification of the answer against its own evidence. User feedback tunes the ranking.
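
The RRF step can be sketched in a few lines. This is the textbook reciprocal-rank-fusion formula, score(d) = Σ 1 / (k + rank), not AIVA's actual code. The constant k = 60 is the commonly used default and is an assumption here, as are the function name and the document ids.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked id lists with reciprocal-rank fusion.

    Each list contributes 1 / (k + rank) per document (rank starts at 1).
    Results are sorted by descending score, ties broken by id.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Two hypothetical search legs: a keyword leg and a vector leg.
keyword_leg = ["a", "b", "c"]
vector_leg = ["b", "c", "d"]
fused = rrf_fuse([keyword_leg, vector_leg])
```

Here the fused order is b, c, a, d: documents that appear in both legs outrank a document that tops only one of them, and no score normalization between the legs is needed.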
04

The recent push was scale: from a 5-document prototype to a real company SharePoint crawl. One orchestrator now handles local files and SharePoint, skips unchanged content, keeps moved documents attached to the same identity, and streams crawl progress into the console.
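
One way to get skip-unchanged and move-tolerant identity is to key documents by a hash of their bytes. The sketch below illustrates that idea under stated assumptions: the `Registry` class, its in-memory storage, and the three action names are hypothetical, and the real orchestrator may resolve identity differently.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Registry:
    """In-memory stand-in for a content-addressed document index."""

    # content hash -> every path where that content has been seen
    paths_by_hash: dict[str, set[str]] = field(default_factory=dict)

    def observe(self, path: str, content: bytes) -> tuple[str, str]:
        """Return (doc_id, action) for a file seen during a crawl.

        "ingest": content never seen before, parse and index it.
        "skip":   same content at a known path, nothing to do.
        "alias":  known content at a new path (a move or a duplicate),
                  so record the path without re-parsing.
        """
        doc_id = hashlib.sha256(content).hexdigest()
        known_paths = self.paths_by_hash.get(doc_id)
        if known_paths is None:
            self.paths_by_hash[doc_id] = {path}
            return doc_id, "ingest"
        if path in known_paths:
            return doc_id, "skip"
        known_paths.add(path)
        return doc_id, "alias"
```

With this shape, a file that moves between folders keeps its document id, and only genuinely new bytes reach the expensive parse step.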
05

I run it like production, not a demo: every sub-project goes through spec → adversarial review by a second model → plan → TDD, one commit each, and that reviewer catches a real issue at every gate. The docs say plainly what's proven by tests versus what hasn't had a real end-to-end run.
06

The console around it: a Svelte streaming chat (live pipeline · evidence · citations), the live SharePoint corpus tree, an upload → parse → index pipeline, and a per-query inspector. Plus RAGAS + side-by-side eval, LangSmith tracing, and every reported bug locked behind a regression test.

— Aside

The interesting work isn't the stack. It's the boundaries.

— Chapter 03

How it runs

What it runs on.

01
Custom AWS Textract parser (async API; LAYOUT / TABLES / FORMS); figures captured by a vision model (diagrams → Mermaid); three output modes (text · text+folders · tag-pointer)
02
Tag-aware chunker that never splits a tag; tables kept as files, referenced by [CSV_MCP:…] tags
03
LangGraph three-path query graph (template · suggestion · full RAG) with speculative draft-and-verify
04
Three-layer document discovery: SQLite FTS catalog → AI summary vectors → filtered chunk hybrid
05
Weaviate hybrid search (BM25F + vector) with native multi-tenant isolation per department; chunk vectors from text-embedding-3-large (3072-d)
06
Reciprocal-rank fusion across search legs + optional Cohere cross-encoder rerank; per-chunk user feedback adjusts ranking
07
FastMCP server: an LLM agent that chains schema / filter / aggregate / join / validate tools over pandas DataFrames
08
Source-agnostic ingestion orchestrator (DocumentSource: local + SharePoint) with durable, single-transaction, restartable jobs
09
SharePoint connector: app-only Microsoft Graph auth, throttle-aware parallel crawl, live folder tree (since-date filter · scan-lock · delta-link checkpoint)
10
Content-addressed identity & cache: skip-unchanged, dedup, stable doc-id aliases, idempotent 16-department tenant resolver
11
OpenAI GPT-4.1-mini / nano across routing, extraction, and compose (structured + streamed); GPT-4o-mini for summaries; LangSmith tracing
12
Svelte 5 + SvelteKit + Tailwind console: streaming chat · live SharePoint corpus tree · parse → index pipeline · per-query inspector; RAGAS + side-by-side eval
13
Docker Compose for the app, Weaviate, and Redis / Valkey
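
To illustrate what the table tools in item 07 do, here is a dependency-free sketch of a filter step followed by an aggregate step over one table file. It deliberately swaps pandas for the standard-library csv module so it runs anywhere. The function names, the column names, and the sample rows are made up for illustration and are not the server's real tool signatures.

```python
import csv
import io
from collections import defaultdict


def load_rows(csv_text: str) -> list[dict[str, str]]:
    """Parse CSV text into a list of row dicts keyed by header name."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def filter_rows(rows: list[dict[str, str]], column: str, value: str) -> list[dict[str, str]]:
    """Keep only rows whose column equals value (exact string match)."""
    return [row for row in rows if row[column] == value]


def aggregate_sum(rows: list[dict[str, str]], group_by: str, value_column: str) -> dict[str, float]:
    """Sum value_column per distinct group_by value."""
    totals: dict[str, float] = defaultdict(float)
    for row in rows:
        totals[row[group_by]] += float(row[value_column])
    return dict(totals)


# Made-up sample table standing in for one extracted CSV file.
SAMPLE = """department,quarter,amount
HR,Q1,100.5
HR,Q2,50
Finance,Q1,200
"""

q1_rows = filter_rows(load_rows(SAMPLE), "quarter", "Q1")
q1_totals = aggregate_sum(q1_rows, "department", "amount")
```

The point of exposing steps like these as tools is that the model asks for the rows it needs at query time, rather than carrying the whole table in its context.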
