Material Kai Vision Platform - Documentation
AI-Powered Material Intelligence System
Production-grade platform serving 5,000+ users with 99.5%+ uptime. Transforms material catalog PDFs into searchable knowledge using 12 AI models across a 14-stage processing pipeline.
📚 Documentation
🎯 Start Here
INDEX.md - Complete documentation index with learning paths
BLOG-POST-OVERVIEW.md - Comprehensive blog post overview (6,000+ words)
📖 Core Documentation
overview.md - Complete platform overview
- Executive summary with key metrics
- Architecture overview
- AI models integration (12 models)
- 14-stage PDF processing pipeline
- Multi-modal search capabilities
- Database architecture
- Production metrics
system-architecture.md - System architecture & design
- Three-tier architecture
- Hybrid architecture pattern
- Technology stack
- Authentication & security
- API endpoints (113)
- Scalability & monitoring
ai-models-guide.md - AI models reference
- 12 AI models overview
- Claude Sonnet 4.5 & Haiku 4.5
- GPT-4o & embeddings
- Llama 4 Scout 17B Vision
- OpenAI CLIP (5 types)
- Model usage by stage
- Cost optimization
pdf-processing-pipeline.md - PDF processing pipeline
- 14-stage pipeline breakdown
- Products + Metadata extraction (inseparable)
- Document entities (certificates, logos, specs)
- Stage-by-stage details
- Checkpoint recovery (9 checkpoints)
- Performance metrics
- API endpoint
api-endpoints.md - API reference
- 113 endpoints across 14 categories
- RAG Routes (25)
- Admin Routes (18)
- Search Routes (18)
- Documents Routes (11)
- AI Services Routes (10)
- Images Routes (5)
- Document Entities Routes (5) ✨ NEW
- PDF Routes (4)
- Products Routes (3)
- Embeddings Routes (3)
- Together AI Routes (3)
- Anthropic Routes (3)
- Monitoring Routes (3)
- AI Metrics Routes (2)
database-schema-complete.md - Database schema
- Core tables (products, chunks, images, document_entities)
- Products + Metadata architecture (JSONB)
- Document entities (certificates, logos, specifications)
- Product-document relationships
- Relationship tables with relevance scores
- Row-Level Security (RLS)
- Indexes & performance
- Storage capacity
- Backup & recovery
job-queue-system.md - Job queue & async processing
- Supabase-native job queue
- Checkpoint-based recovery
- Auto-recovery for stuck jobs
- Real-time progress tracking
- Priority queuing
- Health monitoring
features-guide.md - Platform features
- Intelligent PDF processing
- Multi-modal search
- Materials catalog
- Product management
- Admin dashboard
- RAG system
- Real-time monitoring
- Metadata management
- Image management
- Workspace isolation
- Batch processing
- Security features
deployment-guide.md - Production deployment
- Deployment architecture
- Frontend (Vercel)
- Backend (Self-hosted)
- Database (Supabase)
- CI/CD pipeline
- Secrets management
- Monitoring & alerts
- Rollback procedures
troubleshooting-guide.md - Common issues & solutions
- Critical issues (API down, database, OOM)
- Common issues (PDF processing, search, latency, auth)
- Performance optimization
- Support resources
product-discovery-architecture.md - Product discovery system
- Products + Metadata architecture (inseparable)
- Document entities (certificates, logos, specifications)
- Factory/group identification for agentic queries
- Product-document relationships
- API endpoints for entity management
- Future extensibility (marketing, bank statements)
🎓 Learning Paths
For New Developers
- overview.md - Understand the platform
- system-architecture.md - Learn the architecture
- pdf-processing-pipeline.md - Understand the pipeline
- job-queue-system.md - Learn async job processing
- api-endpoints.md - Learn the API
- deployment-guide.md - Understand deployment
For API Integration
- api-endpoints.md - All endpoints
- ai-models-guide.md - AI models used
- database-schema-complete.md - Data structure
- system-architecture.md - Authentication
For Operations
- deployment-guide.md - Deployment process
- job-queue-system.md - Job monitoring & recovery
- troubleshooting-guide.md - Common issues
- system-architecture.md - Monitoring
- database-schema-complete.md - Backup strategy
For Product Managers
- overview.md - Platform overview
- features-guide.md - All features
- pdf-processing-pipeline.md - Processing pipeline
- ai-models-guide.md - AI capabilities
📊 Quick Reference
Key Numbers
- 5,000+ users in production
- 99.5%+ uptime SLA
- 12 AI models integrated
- 14 processing pipeline stages
- 113 API endpoints
- 14 API categories
- 6 embedding types
- 200+ metafield types
- 95%+ product detection accuracy
- 85%+ search relevance
- 90%+ material recognition accuracy
Technology Stack
- Frontend: React 18, TypeScript, Vite, Shadcn/ui, Vercel
- Backend: FastAPI, Python 3.11, Uvicorn, self-hosted
- Database: PostgreSQL 15, pgvector, Supabase
- AI: Claude, GPT-4o, Llama, CLIP, LlamaIndex
API Categories
- PDF Processing (12 endpoints)
- Document Management (13 endpoints)
- Search APIs (8 endpoints)
- Image Analysis (5 endpoints)
- RAG System (7 endpoints)
- Embeddings (3 endpoints)
- Products (6 endpoints)
- Admin & Monitoring (8 endpoints)
- AI Services (11+ endpoints)
🔗 External Resources
API Documentation:
- Swagger UI: /docs
- ReDoc: /redoc
- OpenAPI Schema: /openapi.json
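The endpoint and category counts quoted in this index can be checked against the live OpenAPI schema. The following is a minimal sketch, not part of the platform's codebase: the base URL is a placeholder, the helper names are hypothetical, and it assumes routes are grouped by OpenAPI tags (an operation with several tags is counted once per tag).

```python
import json
import urllib.request
from collections import Counter
from typing import Any

# HTTP verbs that the OpenAPI spec allows as operation keys inside a path item.
HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options", "trace"}


def count_operations_by_tag(schema: dict[str, Any]) -> Counter:
    """Count OpenAPI operations per tag; untagged operations go under 'untagged'."""
    counts: Counter = Counter()
    for path_item in schema.get("paths", {}).values():
        for method, operation in path_item.items():
            if method.lower() not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            for tag in operation.get("tags") or ["untagged"]:
                counts[tag] += 1
    return counts


def fetch_schema(base_url: str) -> dict[str, Any]:
    """Download /openapi.json from the given base URL (hypothetical helper)."""
    url = f"{base_url.rstrip('/')}/openapi.json"
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


if __name__ == "__main__":
    # Placeholder address: replace with the real API host.
    schema = fetch_schema("http://localhost:8000")
    counts = count_operations_by_tag(schema)
    for tag, total in counts.most_common():
        print(f"{tag}: {total}")
    print(f"{len(counts)} categories, {sum(counts.values())} tagged operations")
```

Running the script against a deployment prints one line per tag followed by the totals, which makes it easy to spot when the numbers in this index drift from the API.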
Dashboards:
Repositories:
📝 Documentation Standards
All documentation follows these standards:
- ✅ Clear, concise language
- ✅ Code examples where applicable
- ✅ Structured with headers
- ✅ Links to related docs
- ✅ No task lists or planning documents
- ✅ Production-focused content
- ✅ Updated regularly
Support
For questions or issues:
Last Updated: October 31, 2025
Version: 1.0.1
Status: Production
Total Documentation: 11 comprehensive guides
Total Lines: 6,000+
Coverage: 100% of platform features