AI Models Integration Guide

Last Updated: 2025-12-26

Complete reference of all AI models used across the Material KAI Vision Platform.


AI Models Overview

Text Generation

| Model | Provider | Purpose | Capability | Cost (per 1M tokens) |
|---|---|---|---|---|
| Claude Sonnet 4.5 | Anthropic | Product discovery, enrichment | 95%+ accuracy | $3 input / $15 output |
| Claude Haiku 4.5 | Anthropic | Fast validation | Real-time | $0.80 input / $4 output |
| Claude Opus 4.5 | Anthropic | Complex reasoning | Highest accuracy | $15 input / $75 output |
| GPT-4o | OpenAI | Alternative discovery | 94%+ accuracy | $2.50 input / $10 output |
| GPT-4o Mini | OpenAI | Lightweight tasks | Fast & cheap | $0.15 input / $0.60 output |

Text Embeddings

| Model | Provider | Purpose | Capability | Cost (per 1M tokens) |
|---|---|---|---|---|
| voyage-3.5 | Voyage AI | PRIMARY text embeddings | 1024D vectors | $0.06 input |
| voyage-3 | Voyage AI | Alternative text embeddings | 1024D vectors | $0.06 input |
| voyage-3-lite | Voyage AI | Lightweight embeddings | 512D vectors | $0.02 input |
| text-embedding-3-small | OpenAI | LEGACY (CI changelog only; retired from production 2026-04) | 1536D vectors | $0.02 input |

Vision Models

| Model | Provider | Purpose | Capability | Cost (per 1M tokens) |
|---|---|---|---|---|
| Qwen3-VL-32B-Instruct | HuggingFace Endpoint | PRIMARY vision analysis | State-of-the-art OCR | Cloud endpoint |

Visual Embeddings

| Model | Provider | Purpose | Capability | Cost (per 1M tokens) |
|---|---|---|---|---|
| SLIG (SigLIP2) Visual | HuggingFace Endpoint | General visual embeddings | 768D vectors | Cloud endpoint |
| SLIG (SigLIP2) Color | HuggingFace Endpoint | Color-guided embeddings | 768D vectors | Cloud endpoint |
| SLIG (SigLIP2) Texture | HuggingFace Endpoint | Texture-guided embeddings | 768D vectors | Cloud endpoint |
| SLIG (SigLIP2) Style | HuggingFace Endpoint | Style-guided embeddings | 768D vectors | Cloud endpoint |
| SLIG (SigLIP2) Material | HuggingFace Endpoint | Material-guided embeddings | 768D vectors | Cloud endpoint |

Model Details

1. Claude Sonnet 4.5 (Anthropic)

Purpose: Product discovery, enrichment, validation

Capabilities:

Performance:

When to Use:


2. Claude Haiku 4.5 (Anthropic)

Purpose: Fast validation, real-time processing

Capabilities:

Performance:

When to Use:


3. GPT-4o (OpenAI)

Purpose: Alternative product discovery

Capabilities:

Performance:

When to Use:


4. Voyage AI voyage-3.5 (updated 2026-04)

Purpose: Generate text embeddings for semantic search (sole production text embedder)

Capabilities:

Performance:

Note: OpenAI text-embedding-3-small (1536D) was retired from the production path in 2026-04. It is only retained for the legacy CI changelog workflow.
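As a minimal sketch of what the 1536D → 1024D switch implies downstream, the helper below enforces the voyage-3.5 dimensionality contract before a vector is stored. The constant and function names are illustrative, not the platform's actual API:

```python
# Minimal sketch: enforce the 1024D contract for text embeddings after the
# 2026-04 switch away from text-embedding-3-small (1536D).
EXPECTED_TEXT_DIM = 1024   # voyage-3.5 output dimensionality
LEGACY_TEXT_DIM = 1536     # retired OpenAI text-embedding-3-small

def validate_text_embedding(vector: list[float]) -> list[float]:
    """Reject vectors that do not match the voyage-3.5 1024D contract."""
    if len(vector) == LEGACY_TEXT_DIM:
        raise ValueError(
            "Got a 1536D vector: text-embedding-3-small is retired from "
            "production; re-embed with voyage-3.5."
        )
    if len(vector) != EXPECTED_TEXT_DIM:
        raise ValueError(f"Expected {EXPECTED_TEXT_DIM}D, got {len(vector)}D")
    return vector
```

A guard like this catches any stale legacy vectors that leak in from the CI changelog path before they reach the production index.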

When to Use:


5. Qwen3-VL-32B-Instruct (HuggingFace Endpoint) - PRIMARY VISION MODEL

Purpose: State-of-the-art vision-language model for image analysis, OCR, material recognition

Endpoint Configuration:

Capabilities:

Performance:

When to Use:


6-10. SLIG (SigLIP2) Specialized Embeddings (updated 2026-04)

Purpose: Multi-modal visual embeddings via HuggingFace Cloud Endpoint. Replaced legacy OpenAI CLIP (256/512/1536D) and SigLIP-SO400M (1152D) in 2026-04 — those columns were dropped from the database.

5 Embedding Types (all 768D halfvec, written directly to VECS):

  • Visual Embeddings (768D) → image_slig_embeddings
  • Color Embeddings (768D) → image_color_embeddings
  • Texture Embeddings (768D) → image_texture_embeddings
  • Style Embeddings (768D) → image_style_embeddings
  • Material Embeddings (768D) → image_material_embeddings

Additionally, an Understanding Embedding (1024D, Voyage AI from Qwen3-VL vision_analysis JSON) → image_understanding_embeddings is generated inline for spec-based semantic search.
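The six embedding outputs above can be summarized as a routing table. The target table names and dimensions come from this document; the dict and helper names are hypothetical:

```python
# Illustrative routing for the six image-embedding outputs: five 768D SLIG
# vectors plus the 1024D Voyage AI "understanding" vector.
EMBEDDING_ROUTES: dict[str, tuple[str, int]] = {
    "visual":        ("image_slig_embeddings", 768),
    "color":         ("image_color_embeddings", 768),
    "texture":       ("image_texture_embeddings", 768),
    "style":         ("image_style_embeddings", 768),
    "material":      ("image_material_embeddings", 768),
    "understanding": ("image_understanding_embeddings", 1024),  # Voyage AI
}

def route_embedding(kind: str, vector: list[float]) -> str:
    """Return the target table for a vector, checking its dimensionality."""
    table, dim = EMBEDDING_ROUTES[kind]
    if len(vector) != dim:
        raise ValueError(f"{kind} embeddings must be {dim}D, got {len(vector)}D")
    return table
```

The dimension check matters because the understanding vector (1024D) must never land in a 768D halfvec column, and vice versa.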

Performance:

When to Use:


11. Anthropic Semantic Chunking

Purpose: Intelligent text segmentation

Capabilities:

Strategies:

Performance:


12. Direct Vector DB RAG System (Claude 4.5)

Purpose: Retrieval-Augmented Generation with Multi-Vector Search

Capabilities:

Performance:


📊 Model Usage by Pipeline Stage

| Stage | Primary Model | Secondary Model | Purpose |
|---|---|---|---|
| 0 | Claude Sonnet 4.5 | GPT-4o | Product discovery |
| 2 | Anthropic Chunking | - | Text segmentation |
| 4 | voyage-3.5 (Voyage AI) | - | Text embeddings (1024D, updated 2026-04) |
| 6 | Qwen3-VL-32B-Instruct | - | Image analysis |
| 7-10 | SLIG (SigLIP2, 5 types) | - | Visual embeddings (768D) |
| 11 | Claude Haiku 4.5 | Claude Sonnet 4.5 | Product validation |
| 13 | Claude Sonnet 4.5 | - | Quality enhancement |

💰 Cost Optimization

Strategies:

  1. Use Haiku for fast validation (≈4x cheaper than Sonnet at the listed rates)
  2. Batch embeddings to reduce API calls
  3. Cache results for repeated queries
  4. Use focused extraction to reduce image analysis
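The strategies above can be checked with back-of-the-envelope arithmetic using the per-1M-token prices from the overview table. The token counts below are hypothetical placeholders, not measured per-PDF figures:

```python
# Cost sketch using the per-1M-token prices listed in this document.
PRICES = {  # model -> (input $/1M tokens, output $/1M tokens)
    "claude-sonnet-4-5": (3.00, 15.00),
    "claude-haiku-4-5":  (0.80, 4.00),
    "voyage-3.5":        (0.06, 0.00),   # embeddings: input-only pricing
}

def call_cost(model: str, input_tokens: int, output_tokens: int = 0) -> float:
    """Dollar cost of one call at the listed per-1M-token rates."""
    in_rate, out_rate = PRICES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# Example: same hypothetical validation budget on Sonnet vs. Haiku
sonnet = call_cost("claude-sonnet-4-5", 50_000, 5_000)  # -> 0.225
haiku = call_cost("claude-haiku-4-5", 50_000, 5_000)    # -> 0.06
```

On this token budget the Haiku call costs $0.06 versus $0.225 for Sonnet, which is where the roughly 4x savings for validation comes from.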

Example Cost per PDF:


🔐 API Keys & Configuration

Required Environment Variables:

The model configuration maps each task to its designated model: discovery uses claude-sonnet-4-5, validation uses claude-haiku-4-5, text_embeddings uses voyage-3.5, vision uses Qwen/Qwen3-VL-32B-Instruct, and visual_embeddings uses SLIG.
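A minimal sketch of that task-to-model mapping, assuming a plain dict lookup; the dict name and helper are illustrative, not the platform's actual configuration API:

```python
# Task-to-model mapping as described in this document.
MODEL_CONFIG = {
    "discovery":         "claude-sonnet-4-5",
    "validation":        "claude-haiku-4-5",
    "text_embeddings":   "voyage-3.5",
    "vision":            "Qwen/Qwen3-VL-32B-Instruct",
    "visual_embeddings": "SLIG",
}

def model_for(task: str) -> str:
    """Resolve the designated model for a pipeline task, failing loudly."""
    try:
        return MODEL_CONFIG[task]
    except KeyError:
        raise ValueError(f"No model configured for task '{task}'") from None
```

Failing loudly on an unknown task key is a deliberate choice: a silent fallback to a default model would make misrouted pipeline stages hard to spot.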


📈 Performance Benchmarks

Accuracy:

Speed:

Cost:


New Models (2025-12-26)

voyage-3.5 (Voyage AI) - PRIMARY TEXT EMBEDDINGS

Purpose: High-quality text embeddings for semantic search and retrieval

Capabilities:

Performance:

When to Use:

Migration: Replaced text-embedding-3-small in production 2026-04 (1536D → 1024D, dict key `text_1536` → `text_1024`)
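A hypothetical sketch of that key migration: since a 1536D vector cannot be converted to 1024D, a record carrying the legacy key is stripped and flagged for re-embedding rather than rewritten in place. The function name and sentinel convention are illustrative:

```python
# Illustrative handler for the 2026-04 text_1536 -> text_1024 key migration.
def migrate_embedding_keys(record: dict) -> dict:
    """Rename the dict key and drop the stale vector so it gets re-embedded."""
    if "text_1536" in record:
        record.pop("text_1536")     # legacy 1536D vector is not convertible
        record["text_1024"] = None  # sentinel: re-embed with voyage-3.5
    return record
```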


SLIG (SigLIP2) - VISUAL EMBEDDINGS (HuggingFace Endpoint)

Purpose: Cloud-based visual embeddings for image similarity search

Endpoint Configuration:

Capabilities:

Performance:

When to Use:


Voyage-3-Lite (Voyage AI) - LIGHTWEIGHT EMBEDDINGS

Purpose: Fast, cost-effective embeddings for simple tasks

Capabilities:

Performance:

When to Use:


Model Selection Guide

Text Embeddings

  1. voyage-3.5 (PRIMARY) - All production text embeddings
  2. voyage-3-lite - Simple/fast tasks only
  3. text-embedding-3-small - Retired 2026-04 (CI changelog workflow only)

Vision Analysis

  1. Qwen3-VL-32B-Instruct (PRIMARY) - All production vision tasks (HuggingFace endpoint)
  2. Claude Sonnet 4.5 - Validation for low-confidence results

Visual Embeddings

  1. SLIG (SigLIP2) (PRIMARY) - All visual embeddings (768D, HuggingFace endpoint)
    • General visual (image_embedding mode)
    • Text-guided (color, texture, material, style) (text_embedding mode)
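The two endpoint modes above can be sketched as a request builder. The mode strings (`image_embedding`, `text_embedding`) come from this document; the payload shape and guidance prompts are assumptions, not the endpoint's actual schema:

```python
# Hypothetical request builder for the two SLIG endpoint modes.
GUIDED_PROMPTS = {
    "color":    "the dominant colors of this material",
    "texture":  "the surface texture of this material",
    "style":    "the design style of this material",
    "material": "the material composition",
}

def slig_request(kind: str, image_b64: str) -> dict:
    """Build a payload: general visual vs. text-guided embedding request."""
    if kind == "visual":
        return {"mode": "image_embedding", "image": image_b64}
    return {
        "mode": "text_embedding",
        "image": image_b64,
        "prompt": GUIDED_PROMPTS[kind],  # text guidance for the 4 guided types
    }
```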

Text Generation

  1. Claude Sonnet 4.5 (PRIMARY) - Complex reasoning
  2. Claude Haiku 4.5 - Fast validation
  3. GPT-4o - Alternative/fallback

Last Updated: December 26, 2025 · Version: 2.0.0 · Status: Production