System Architecture

Complete technical architecture of the Material Kai Vision Platform.

🏗️ Three-Tier Architecture

┌─────────────────────────────────────────────────────────────┐
│  FRONTEND TIER (Vercel Edge Network)                        │
│  React 18 + TypeScript + Vite + Shadcn/UI                   │
│  - Materials Catalog                                        │
│  - Search Hub (Semantic, Vector, Hybrid, Visual)            │
│  - Admin Dashboard                                          │
│  - Real-time Monitoring                                     │
│  - 3D Material Visualization                                │
└─────────────────────────────────────────────────────────────┘
                     ↓ (HTTPS REST API) ↓
┌─────────────────────────────────────────────────────────────┐
│  API TIER (MIVAA - FastAPI)                                 │
│  Python 3.11 + FastAPI + Uvicorn                            │
│  Deployed: v1api.materialshub.gr                            │
│  - 115 REST API endpoints (15 categories)                   │
│  - 9-stage PDF processing pipeline (optimized)              │
│  - Memory-safe image processing (10-15MB constant)          │
│  - Real-time CLIP embedding generation                      │
│  - RAG system (Claude 4.5 + Direct Vector DB)               │
│  - Search APIs (Multi-Vector, Semantic, Hybrid)             │
│  - AI Services (Claude 4.5, GPT, Qwen3-VL, SigLIP)          │
│  - Product Management + Metadata Management                 │
│  - Duplicate Detection & Merging (factory-based)            │
│  - Admin & Monitoring                                       │
│  - Background job processing                                │
└─────────────────────────────────────────────────────────────┘
                  ↓ (PostgreSQL + pgvector) ↓
┌─────────────────────────────────────────────────────────────┐
│  DATA TIER (Supabase PostgreSQL 15 + pgvector)              │
│  - Documents & Chunks                                       │
│  - Products & Metafields                                    │
│  - Images & Embeddings                                      │
│  - Vector Indexes (pgvector)                                │
│  - Row-Level Security (RLS)                                 │
│  - Real-time subscriptions                                  │
│  - Storage (Supabase Storage)                               │
└─────────────────────────────────────────────────────────────┘

🔌 Hybrid Architecture Pattern

Key Design: The frontend calls the MIVAA API directly, with no proxy Edge Functions in between.

Frontend (Vercel)
    ↓
    └─→ MIVAA API (v1api.materialshub.gr)
            ↓
            ├─→ Supabase (Data)
            ├─→ OpenAI (Embeddings)
            ├─→ Anthropic (Claude)
            ├─→ HuggingFace Endpoint (Qwen3-VL)
            └─→ Supabase Storage (Images)

Benefits:

✅ Reduced latency (no proxy layer)
✅ Lower costs (fewer Edge Functions)
✅ Simpler architecture
✅ Better error handling
✅ Direct authentication

📊 Database Schema

Core Tables

workspaces

Multi-tenant isolation
User workspace association
Metadata storage

documents

PDF metadata
Processing status
File references
Workspace association

chunks

Text segments
Quality scores
Document references
Embedding references

products

Extracted products
Metadata
Chunk associations
Image associations

images

Extracted images
Analysis results
Quality scores
Storage references

metafields

Structured metadata
Product associations
Chunk associations
Type definitions

embeddings

Vector storage (pgvector 0.8.0, halfvec float16; about 50% less storage than float32 vectors)
7 types: text, visual, understanding, color, texture, style, material
HNSW + IVFFlat similarity indexes (halfvec_cosine_ops)
Chunk/image references
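
The storage saving from halfvec can be checked with a little arithmetic. The sketch below uses the per-row sizes given in the pgvector documentation (4 bytes per dimension for vector, 2 bytes per dimension for halfvec, each plus an 8-byte header) and the 1024D and 768D embedding sizes listed under AI Integration. The function names are illustrative, not part of the platform code.

```python
# Per-row storage for pgvector column types, per the pgvector documentation:
# vector = 4 bytes per dimension, halfvec = 2 bytes per dimension,
# each plus an 8-byte header.
HEADER_BYTES = 8


def vector_bytes(dims: int) -> int:
    """Size in bytes of one float32 `vector` value."""
    return 4 * dims + HEADER_BYTES


def halfvec_bytes(dims: int) -> int:
    """Size in bytes of one float16 `halfvec` value."""
    return 2 * dims + HEADER_BYTES


def savings(dims: int) -> float:
    """Fraction of storage saved by halfvec relative to vector."""
    return 1 - halfvec_bytes(dims) / vector_bytes(dims)


if __name__ == "__main__":
    # Dimensions taken from this document: 1024D text, 768D visual.
    for name, dims in {"voyage-3.5 text": 1024, "SigLIP2 visual": 768}.items():
        print(
            f"{name}: {vector_bytes(dims)} B -> {halfvec_bytes(dims)} B "
            f"({savings(dims):.1%} saved)"
        )
```

Because of the fixed header, the saving is slightly under 50% per value; index size and row overhead are not modeled here.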

background_jobs

Async job tracking
Status monitoring
Progress tracking
Error handling

job_progress

Real-time progress updates
Stage tracking
Checkpoint data
Performance metrics

🔐 Authentication & Security

Triple Authentication Support

Supabase JWT (Frontend)
- HS256 algorithm
- "authenticated" audience
- 24-hour expiry
- User identification
MIVAA JWT (Internal)
- Service-to-service
- Long-lived tokens
- API operations
API Keys (External)
- Simple authentication
- Rate limiting
- External integrations
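
As an illustration of the Supabase JWT checks listed above (HS256 signature, "authenticated" audience, expiry), here is a minimal sketch using only the Python standard library. It is not the MIVAA implementation: the function names and the demo secret are assumptions, and a production service would more likely rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    # JWT segments drop base64 padding, so restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def sign_hs256(claims: dict, secret: str) -> str:
    """Build a compact HS256 token (used here only to create demo input)."""
    header = {"alg": "HS256", "typ": "JWT"}
    parts = [
        b64url_encode(json.dumps(obj, separators=(",", ":")).encode())
        for obj in (header, claims)
    ]
    signing_input = ".".join(parts)
    signature = hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return signing_input + "." + b64url_encode(signature)


def verify_hs256(
    token: str,
    secret: str,
    audience: str = "authenticated",
    now: Optional[float] = None,
) -> dict:
    """Return the claims if signature, audience, and expiry all check out."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(b64url_decode(header_b64))
        claims = json.loads(b64url_decode(payload_b64))
        signature = b64url_decode(signature_b64)
    except ValueError as exc:
        raise ValueError("malformed token") from exc
    if not isinstance(header, dict) or not isinstance(claims, dict):
        raise ValueError("malformed token")
    if header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = hmac.new(
        secret.encode(), f"{header_b64}.{payload_b64}".encode(), hashlib.sha256
    ).digest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("bad signature")
    aud = claims.get("aud")
    if audience not in (aud if isinstance(aud, list) else [aud]):
        raise ValueError("wrong audience")
    exp = claims.get("exp")
    current = time.time() if now is None else now
    if not isinstance(exp, (int, float)) or exp <= current:
        raise ValueError("token expired")
    return claims


if __name__ == "__main__":
    demo_secret = "demo-secret"  # hypothetical; never hard-code real secrets
    issued = time.time()
    token = sign_hs256(
        {"sub": "user-1", "aud": "authenticated", "exp": issued + 24 * 3600},
        demo_secret,
    )
    print(verify_hs256(token, demo_secret)["sub"])
```

Checking the algorithm header explicitly and comparing signatures with `hmac.compare_digest` avoids algorithm-confusion and timing pitfalls.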

Row-Level Security (RLS)

All tables use RLS policies that restrict access based on workspace membership. Users can only read, insert, update, and delete data that belongs to their own workspace, enforced via auth.uid() checks.
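
The effect of such a policy can be modeled in a few lines of Python. This is only an in-memory illustration of the membership predicate; in the platform the check runs inside PostgreSQL, and the `MEMBERSHIPS` data and `Row` type below are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Row:
    id: int
    workspace_id: str


# Hypothetical membership data: (user_id, workspace_id) pairs.
MEMBERSHIPS = {("user-a", "ws-1"), ("user-b", "ws-2")}


def can_access(user_id: str, row: Row) -> bool:
    """Policy predicate: the row's workspace must be one the user belongs to."""
    return (user_id, row.workspace_id) in MEMBERSHIPS


def visible_rows(user_id: str, rows: Iterable[Row]) -> List[Row]:
    """Rows a query would return for this user once the policy is applied."""
    return [row for row in rows if can_access(user_id, row)]


if __name__ == "__main__":
    table = [Row(1, "ws-1"), Row(2, "ws-2"), Row(3, "ws-1")]
    print(visible_rows("user-a", table))
```

The same predicate applies to reads and writes alike, which is why a user cannot insert or update rows in another workspace.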

🚀 API Endpoints (108 - Consolidated from 113)

14 Categories

RAG & Document Processing (27 endpoints)
- Upload, extract, process PDFs (consolidated from /api/pdf/extract/*)
- Job status tracking
- Progress streaming
- Metadata management (scope detection, application, listing, statistics)
- Document upload, query, chat
- Search with multiple strategies
Search APIs (6 endpoints)
- Semantic search
- Vector search
- Hybrid search
- Visual search
- Material search
- Multi-vector search
Admin Routes (18 endpoints)
- Job management and monitoring
- System health and metrics
- Data backup and cleanup
- Metadata management
Document Entities (5 endpoints)
- Certificates management
- Logos management
- Specifications management
- Entity relationships
Products (3 endpoints)
- Product management
- Product relationships
Images (5 endpoints)
- Image analysis
- Batch processing
- Similarity search
- OCR processing
AI Services (10 endpoints)
- Classification
- Boundary detection
- Validation
- Enrichment
- Product discovery
Background Jobs (7 endpoints)
- Job creation
- Status tracking
- Progress updates
- Statistics
Anthropic APIs (3 endpoints)
- Claude integration
- Vision analysis
HuggingFace Endpoint APIs (3 endpoints)
- Qwen3-VL integration
- Vision analysis
Monitoring Routes (3 endpoints)
- System health
- Service status
- Performance metrics
AI Metrics Routes (2 endpoints)
- Model performance
- Usage statistics

Consolidation Notes:

✅ PDF Extraction endpoints (/api/pdf/extract/*) consolidated into /api/rag/documents/upload
✅ All extraction functionality available via RAG pipeline with deep processing mode
✅ Internal utilities preserved in app/core/extractor.py

🤖 AI Integration

12 AI Models

Anthropic:

Claude Sonnet 4.5 (Product discovery, enrichment)
Claude Haiku 4.5 (Fast validation)
Semantic Chunking (Text segmentation)

OpenAI:

GPT-4o (Alternative discovery)
text-embedding-3-small (retired 2026-04; Voyage AI is now the sole text embedder)

Voyage AI:

voyage-3.5 (Text + understanding embeddings, 1024D)

SigLIP2 (SLIG):

Visual embeddings (768D) via HuggingFace Cloud Endpoint
5 collections: visual, color, texture, style, material

HuggingFace Endpoint:

Qwen3-VL 32B Vision (Image analysis, OCR → feeds understanding embeddings via Voyage AI)

Direct Vector DB RAG:

Claude 4.5 + Multi-Vector Search (Document retrieval, synthesis)

📈 Scalability

Horizontal Scaling

Frontend:

Vercel Edge Network (global CDN)
Auto-scaling
99.99% uptime SLA

API:

FastAPI with Uvicorn
Load balancing
Horizontal pod autoscaling
Connection pooling

Database:

Supabase managed PostgreSQL
Automatic backups
Read replicas
pgvector indexes

Performance Optimization

Caching:

Redis for frequently accessed data
Query result caching
Embedding caching

Indexing:

pgvector halfvec indexes for similarity search (HNSW + IVFFlat)
Full-text search indexes
Composite indexes

Batch Processing:

Batch embeddings
Batch image analysis
Batch product creation
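
Batching of this kind usually comes down to grouping inputs before each provider call. The following is a minimal sketch, assuming a hypothetical `embed_batch` callable that takes a list of texts and returns one vector per text; the batch size of 64 is an arbitrary placeholder.

```python
from typing import Callable, Iterable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # final partial batch
        yield batch


def embed_in_batches(
    texts: Sequence[str],
    embed_batch: Callable[[List[str]], List[List[float]]],
    batch_size: int = 64,
) -> List[List[float]]:
    """Call the embedding backend once per batch and keep input order."""
    vectors: List[List[float]] = []
    for batch in batched(texts, batch_size):
        vectors.extend(embed_batch(batch))
    return vectors


if __name__ == "__main__":
    calls = []

    def fake_embed(batch: List[str]) -> List[List[float]]:
        # Stand-in for a real embedding API call.
        calls.append(len(batch))
        return [[float(len(text))] for text in batch]

    result = embed_in_batches(["a", "bb", "ccc", "dddd", "eeeee"], fake_embed, batch_size=2)
    print(result, calls)
```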

🔄 Data Flow

PDF Upload Flow

User uploads PDF (Frontend) ↓
Frontend calls MIVAA API ↓
MIVAA creates job record ↓
Background task starts ↓
14-stage pipeline executes ↓
Progress updates to database ↓
Frontend polls for updates ↓
Results stored in database ↓
Frontend displays results
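
The job-record, background-task, and polling steps above can be sketched with asyncio and an in-memory job store. This is an assumption-laden illustration: the real service persists jobs in the background_jobs and job_progress tables, and the three stage names, the `JOBS` dictionary, and every function name here are placeholders.

```python
import asyncio
import uuid
from dataclasses import dataclass
from typing import Dict, List, Optional, Set

# Placeholder stages; the real pipeline has more.
STAGES = ["extract", "chunk", "embed"]


@dataclass
class Job:
    id: str
    status: str = "pending"
    stage: str = ""
    progress: int = 0
    error: Optional[str] = None


JOBS: Dict[str, Job] = {}            # stand-in for the job table
_TASKS: Set[asyncio.Task] = set()    # keep references so tasks are not garbage collected


async def run_pipeline(job: Job, stages: List[str]) -> None:
    """Run each stage in turn, recording progress after every stage."""
    job.status = "processing"
    try:
        for index, stage in enumerate(stages, start=1):
            job.stage = stage
            await asyncio.sleep(0)  # stand-in for real work
            job.progress = round(100 * index / len(stages))
        job.status = "completed"
    except Exception as exc:  # record the failure instead of losing it
        job.status = "failed"
        job.error = str(exc)


async def upload_pdf(filename: str) -> str:
    """Create the job record, start the background task, return the job id."""
    job = Job(id=str(uuid.uuid4()))
    JOBS[job.id] = job
    task = asyncio.create_task(run_pipeline(job, STAGES))
    _TASKS.add(task)
    task.add_done_callback(_TASKS.discard)
    return job.id


async def poll_until_done(job_id: str, interval: float = 0.01) -> Job:
    """What the frontend does: poll the job until it reaches a final state."""
    while JOBS[job_id].status not in ("completed", "failed"):
        await asyncio.sleep(interval)
    return JOBS[job_id]


async def demo() -> Job:
    job_id = await upload_pdf("catalog.pdf")
    return await poll_until_done(job_id)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Returning the job id immediately keeps the upload request short, while progress is written where the poller can read it.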

Search Flow

User enters search query (Frontend) ↓
Frontend calls MIVAA search API ↓
Query Understanding (GPT-4o-mini)
  → Extracts: colors, finish, dimensions, pattern, style, material_type
  → Selects weight profile (e.g., "color_finish", "specification", "balanced") ↓
Dynamic weight profile applied to 7-vector fusion
  → 7 profiles: product_name, color_finish, specification, texture_pattern, style_aesthetic, material_search, balanced ↓
Parallel embedding search (asyncio.gather)
  → Text + Visual + Understanding + Color + Texture + Style + Material ↓
Weighted score fusion using selected profile ↓
Metadata filtering + soft boosts ↓
Results returned to frontend (with weight_profile in metadata)
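
The parallel search and weighted fusion steps can be sketched as follows. The embedding types and profile names come from the flow above, but the numeric weights, the `FAKE_INDEX` scores, and the function names are invented for illustration; the real per-type searches would query pgvector.

```python
import asyncio
from collections import defaultdict
from typing import Dict

EMBEDDING_TYPES = ["text", "visual", "understanding", "color", "texture", "style", "material"]

# Hypothetical weights; only the profile names come from this document.
WEIGHT_PROFILES: Dict[str, Dict[str, float]] = {
    "balanced": {t: 1 / len(EMBEDDING_TYPES) for t in EMBEDDING_TYPES},
    "color_finish": {
        "text": 0.10, "visual": 0.20, "understanding": 0.10, "color": 0.40,
        "texture": 0.10, "style": 0.05, "material": 0.05,
    },
}

# Stand-in similarity scores per embedding type (product id -> score).
FAKE_INDEX: Dict[str, Dict[str, float]] = {
    "text": {"oak-panel": 0.9, "red-tile": 0.2},
    "color": {"oak-panel": 0.2, "red-tile": 0.8},
}


async def search_one(embedding_type: str, query: str) -> Dict[str, float]:
    """Placeholder for one vector search against a single embedding type."""
    await asyncio.sleep(0)
    return FAKE_INDEX.get(embedding_type, {})


async def fused_search(query: str, profile_name: str, top_k: int = 3) -> dict:
    weights = WEIGHT_PROFILES[profile_name]
    # Run all per-type searches concurrently; gather preserves input order.
    per_type = await asyncio.gather(*(search_one(t, query) for t in EMBEDDING_TYPES))
    fused: Dict[str, float] = defaultdict(float)
    for embedding_type, hits in zip(EMBEDDING_TYPES, per_type):
        for product_id, score in hits.items():
            fused[product_id] += weights[embedding_type] * score
    ranked = sorted(fused.items(), key=lambda item: item[1], reverse=True)[:top_k]
    return {"results": ranked, "metadata": {"weight_profile": profile_name}}


if __name__ == "__main__":
    for profile in ("balanced", "color_finish"):
        print(profile, asyncio.run(fused_search("matte red tile", profile)))
```

With these made-up scores, switching from the balanced profile to color_finish moves the color-matched product to the top, which is the behavior the dynamic weight profile is meant to produce.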

🛠️ Technology Stack

Frontend:

React 18
TypeScript
Vite
Shadcn/ui
TailwindCSS
Vercel deployment

Backend:

FastAPI
Python 3.11
Uvicorn
Pydantic
SQLAlchemy

Database:

PostgreSQL 15
pgvector
Supabase
Redis (optional)

AI Services:

OpenAI API
Anthropic API (Claude 4.5)
HuggingFace Endpoint API (Qwen3-VL 32B)
Voyage AI (Embeddings)

Infrastructure:

Vercel (Frontend)
Self-hosted server (Backend)
Supabase (Database)
Supabase Storage (Images)

📊 Monitoring & Observability

Metrics

Request latency
Error rates
Processing time
API usage
Database performance
AI model costs

Logging

Structured logging
Error tracking
Performance profiling
Audit logs

Alerting

Health checks
Error thresholds
Performance degradation
Resource limits

🔒 Security Measures

✅ HTTPS/TLS encryption
✅ JWT authentication
✅ Row-Level Security (RLS)
✅ API rate limiting
✅ Input validation
✅ SQL injection prevention
✅ CORS configuration
✅ Audit logging

📈 Production Metrics

Uptime: 99.5%+
API Endpoints: 110 (15 categories)
Processing Speed: 1-15 minutes per PDF
Accuracy: 95%+ product detection
Scalability: 5,000+ concurrent users
Data Volume: 100,000+ products indexed

Last Updated: November 3, 2025
Version: 1.0.0
Status: Production