API Configuration Examples - Dynamic AI Models

This document provides practical examples of how to use the dynamic AI model configuration system in the MIVAA PDF processing pipeline.


Overview

All internal pipeline endpoints (/api/internal/*) accept an optional ai_config parameter that allows you to customize which AI models are used at each stage. If not provided, the system uses DEFAULT_AI_CONFIG.


Table of Contents

  1. Basic Usage
  2. Pre-configured Profiles
  3. Endpoint-Specific Examples
  4. Advanced Configurations
  5. Cost Optimization
  6. Performance Tuning

Basic Usage

Default Configuration (No ai_config)

If you don't provide ai_config, the system uses these defaults:

Defaults Used:

Custom Configuration

You can override specific models while keeping others as default by providing an ai_config object with only the fields you want to change.
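A minimal sketch of how a partial override combines with the defaults. The helper name merge_ai_config and the placeholder default values are assumptions for illustration; they are not the actual contents of DEFAULT_AI_CONFIG, and the real merge presumably happens inside the pipeline. Only the field names are taken from this document.

```python
from typing import Any, Dict

# Placeholder defaults for illustration only; the real DEFAULT_AI_CONFIG
# values are not reproduced here.
EXAMPLE_DEFAULTS: Dict[str, Any] = {
    "classification_confidence_threshold": 0.7,
    "discovery_max_tokens": 4096,
    "metadata_max_tokens": 4096,
}


def merge_ai_config(defaults: Dict[str, Any], overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new config where fields in `overrides` replace the defaults."""
    merged = dict(defaults)   # copy so the defaults are never mutated
    merged.update(overrides)  # only the supplied fields change
    return merged


# Override a single field; every other field keeps its default value.
ai_config = {"metadata_max_tokens": 8192}
effective = merge_ai_config(EXAMPLE_DEFAULTS, ai_config)
print(effective)
```

Sending only the changed fields keeps requests short and means later changes to the defaults still apply to everything you did not override.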


Pre-configured Profiles

1. DEFAULT_AI_CONFIG (Balanced)

Best overall accuracy and reliability.

Use When: You need the best balance of accuracy, reliability, and performance.

2. FAST_CONFIG (Speed Optimized)

Faster processing with good accuracy.

Use When: You need faster processing times and can accept slightly lower accuracy.

Speed Improvements:

3. HIGH_ACCURACY_CONFIG (Quality Optimized)

Maximum accuracy for critical processing.

Use When: Accuracy is critical and processing time is not a concern.

Accuracy Improvements:

4. COST_OPTIMIZED_CONFIG (Budget Friendly)

Minimize costs while maintaining acceptable quality.

Use When: You need to minimize API costs.

Cost Savings:


Endpoint-Specific Examples

Endpoint 10: classify-images

Customize image classification models and thresholds via the ai_config parameter.

Endpoint 30: save-images-db

Customize visual embedding models (SigLIP/CLIP) via the ai_config parameter.

Note: Each image gets 5 embeddings (visual, color, texture, style, material), so a job with 65 images produces 65 × 5 = 325 embeddings in total.
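The arithmetic in the note can be checked with a short sketch. The function name total_embeddings is hypothetical; the five embedding types and the 65-image figure come from the note itself.

```python
EMBEDDING_TYPES = ("visual", "color", "texture", "style", "material")


def total_embeddings(image_count: int) -> int:
    """Number of embeddings stored when each image gets one per type."""
    if image_count < 0:
        raise ValueError("image_count must be non-negative")
    return image_count * len(EMBEDDING_TYPES)


print(total_embeddings(65))  # 325
```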

Endpoint 40: extract-metadata

Customize metadata extraction model and parameters.

Metadata Fields Extracted:

Endpoint 50: create-chunks

Customize chunking and text embedding models.


Advanced Configurations

High-Volume Processing

For processing large batches of PDFs, optimize for speed and cost:

Benefits:


Premium Quality Processing

For high-value catalogs requiring maximum accuracy:

Benefits:


Cost Optimization

Estimated Costs Per PDF (100 pages, 50 images)

DEFAULT_AI_CONFIG:

COST_OPTIMIZED_CONFIG:

HIGH_ACCURACY_CONFIG:


Performance Tuning

Reduce Processing Time

Speed Improvements:

Balance Speed and Quality

Benefits:


Testing Different Configurations

A/B Testing Example

Test two configurations side-by-side by submitting separate jobs with different ai_config values. Compare results to find the best configuration for your use case.
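One way to set up such a comparison is to build two request bodies that differ only in ai_config. This is a sketch under assumptions: the document_id and variant fields and the build_ab_jobs helper are hypothetical, and no request is actually sent, because the exact request schema is not described here.

```python
import json
from typing import Any, Dict, List


def build_ab_jobs(
    document_id: str, config_a: Dict[str, Any], config_b: Dict[str, Any]
) -> List[Dict[str, Any]]:
    """Build two request bodies that differ only in their ai_config."""
    return [
        # "document_id" and "variant" are hypothetical field names.
        {"document_id": document_id, "variant": label, "ai_config": cfg}
        for label, cfg in (("A", config_a), ("B", config_b))
    ]


jobs = build_ab_jobs(
    "catalog-001",
    {"classification_confidence_threshold": 0.8},
    {"classification_confidence_threshold": 0.6},
)
for job in jobs:
    print(json.dumps(job, sort_keys=True))
```

Keeping the input PDF identical across both jobs means any difference in the results can be attributed to the configuration.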


Best Practices

  1. Start with DEFAULT_AI_CONFIG: It provides the best balance for most use cases.

  2. Test Before Production: Use the NOVA test script to validate configurations.

  3. Monitor Costs: Track API usage and costs for different configurations.

  4. Optimize Iteratively: Start with quality, then optimize for speed/cost.

  5. Use Pre-configured Profiles: They're tested and optimized for specific scenarios.

  6. Document Your Choices: Keep track of which configurations work best for different PDF types.

  7. Consider PDF Complexity:

    • Simple catalogs → FAST_CONFIG or COST_OPTIMIZED_CONFIG
    • Complex technical docs → HIGH_ACCURACY_CONFIG
    • Mixed content → DEFAULT_AI_CONFIG
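The complexity guideline in item 7 can be expressed as a small lookup. This is only a sketch: the content-type keys and the pick_profile helper are hypothetical, and the profile names are returned as strings rather than as the actual configuration objects.

```python
PROFILE_BY_CONTENT = {
    # COST_OPTIMIZED_CONFIG is the listed alternative for simple catalogs.
    "simple_catalog": "FAST_CONFIG",
    "complex_technical": "HIGH_ACCURACY_CONFIG",
    "mixed": "DEFAULT_AI_CONFIG",
}


def pick_profile(content_type: str) -> str:
    """Return the suggested profile name, falling back to the default profile."""
    return PROFILE_BY_CONTENT.get(content_type, "DEFAULT_AI_CONFIG")


print(pick_profile("complex_technical"))
print(pick_profile("unknown"))
```

Falling back to DEFAULT_AI_CONFIG for unrecognized content follows the first best practice above.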

Troubleshooting

Low Classification Accuracy

Problem: Too many false positives/negatives in image classification.

Solution: Raise the confidence threshold and use Claude Sonnet for validation. In your ai_config, set classification_confidence_threshold to 0.8 and classification_validation_model to claude-sonnet-4-6-20260217.
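As a sketch, these two overrides could be written as follows. The check_threshold helper and the assumption that the threshold must lie between 0 and 1 are illustrative; the pipeline's own validation rules are not described here.

```python
from typing import Any, Dict

stricter_classification: Dict[str, Any] = {
    "classification_confidence_threshold": 0.8,
    "classification_validation_model": "claude-sonnet-4-6-20260217",
}


def check_threshold(config: Dict[str, Any]) -> float:
    """Return the threshold, assuming a valid value lies in [0, 1]."""
    value = float(config["classification_confidence_threshold"])
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"threshold out of range: {value}")
    return value


print(check_threshold(stricter_classification))
```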

Slow Processing

Problem: Pipeline takes too long to complete.

Solution: Use FAST_CONFIG, or switch to faster models and smaller token budgets: gpt-4o for discovery, claude-haiku-4-20250514 for validation, and discovery_max_tokens and metadata_max_tokens reduced to 2048.
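The token reduction can be sketched as a helper that caps both budgets. The helper name cap_token_budgets is hypothetical. The model switches are left out because the field name for the discovery model is not given in this document.

```python
from typing import Any, Dict

TOKEN_FIELDS = ("discovery_max_tokens", "metadata_max_tokens")


def cap_token_budgets(config: Dict[str, Any], limit: int = 2048) -> Dict[str, Any]:
    """Return a copy of `config` with both token budgets set to at most `limit`."""
    capped = dict(config)
    for field in TOKEN_FIELDS:
        # A missing field is set to the limit; a larger value is lowered to it.
        capped[field] = min(int(capped.get(field, limit)), limit)
    return capped


fast_overrides = cap_token_budgets({"discovery_max_tokens": 4096})
print(fast_overrides)
```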

High API Costs

Problem: API costs are too high.

Solution: Use COST_OPTIMIZED_CONFIG, or reduce classification_confidence_threshold to 0.6 and switch to gpt-4o for discovery, gpt for metadata extraction, and claude-haiku-4-20250514 for validation.

Poor Metadata Quality

Problem: Extracted metadata is incomplete or inaccurate.

Solution: Use Claude with a higher token limit. Set metadata_extraction_model to claude, metadata_temperature to 0.05, and metadata_max_tokens to 8192.
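A sketch of these overrides serialized as the ai_config part of a request body. Wrapping the overrides under an "ai_config" key is an assumption about the request shape, and the model value is copied as written above; a real request would likely need a full model identifier, which is not specified here.

```python
import json

# "claude" is the value given in the text; the full model identifier
# is not specified in this document.
metadata_overrides = {
    "metadata_extraction_model": "claude",
    "metadata_temperature": 0.05,
    "metadata_max_tokens": 8192,
}

request_fragment = json.dumps({"ai_config": metadata_overrides}, indent=2)
print(request_fragment)
```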


Summary

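The dynamic AI model configuration system lets you choose the models and parameters used at each stage of the PDF processing pipeline. Choose the right configuration based on your priorities:

  • Balanced accuracy, reliability, and performance → DEFAULT_AI_CONFIG
  • Faster processing → FAST_CONFIG
  • Maximum accuracy → HIGH_ACCURACY_CONFIG
  • Lowest API cost → COST_OPTIMIZED_CONFIG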

All configurations are production-ready and tested with the NOVA end-to-end test script.