Play 06
Document Intelligence
Medium🔧 Skeleton
Extract, classify, and structure document data with OCR + LLM.
Feed PDFs, invoices, receipts, and forms into Azure Document Intelligence for OCR, then GPT-4o extracts structured fields into typed JSON. Cosmos DB stores results. The pipeline handles multi-page documents, handwriting, tables, and stamps. PII masking built in.
Architecture Pattern
OCR+LLM extraction, structured output, form recognition
Azure Services
Document IntelligenceAzure OpenAI (gpt-4o)Blob StorageCosmos DB
DevKit (.github Agentic OS)
- agent.md — doc extraction persona
- instructions.md — extraction rules
- mcp/index.js — doc validation tools
- plugins/ — extraction pipeline, form parser, data mapper
TuneKit (AI Config)
- config/openai.json — extraction prompts, gpt-4o multimodal
- config/extraction-schema.json — field definitions per doc type
- config/guardrails.json — PII masking
- evaluation/test-set.jsonl — doc samples per type
Tuning Parameters
Extraction promptsConfidence thresholdsField schemasDocument classification rules
Estimated Cost
Dev/Test
$80–200/mo
Production
$1.2K–3K/mo