This document provides practical examples of how to use the dynamic AI model configuration system in the MIVAA PDF processing pipeline.
All internal pipeline endpoints (/api/internal/*) accept an optional ai_config parameter that allows you to customize which AI models are used at each stage. If not provided, the system uses DEFAULT_AI_CONFIG.
If you don't provide ai_config, the system uses these defaults:
Defaults Used:
You can override specific models while keeping others as default by providing an ai_config object with only the fields you want to change.
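A partial override can be sketched as a merge of your fields over the defaults. This is a minimal illustration, not the pipeline's implementation: `EXAMPLE_DEFAULTS` and its values are placeholders (the real `DEFAULT_AI_CONFIG` values may differ), the helper names are hypothetical, and the request body shows only the `ai_config` field.

```python
import json
from typing import Any, Dict

# Placeholder defaults for illustration only. The real DEFAULT_AI_CONFIG is
# defined by the pipeline and its values may differ.
EXAMPLE_DEFAULTS: Dict[str, Any] = {
    "classification_confidence_threshold": 0.7,  # placeholder value
    "metadata_extraction_model": "claude",       # placeholder value
    "metadata_temperature": 0.1,                 # placeholder value
    "metadata_max_tokens": 4096,                 # placeholder value
}


def effective_config(overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Return defaults with overrides applied, rejecting unknown field names."""
    unknown = set(overrides) - set(EXAMPLE_DEFAULTS)
    if unknown:
        raise KeyError(f"unknown ai_config fields: {sorted(unknown)}")
    return {**EXAMPLE_DEFAULTS, **overrides}


def request_body(overrides: Dict[str, Any]) -> str:
    """Serialize only the overridden fields; other request fields are omitted."""
    return json.dumps({"ai_config": overrides})


overrides = {"metadata_max_tokens": 8192}
print(effective_config(overrides)["metadata_max_tokens"])  # 8192
print(request_body(overrides))  # {"ai_config": {"metadata_max_tokens": 8192}}
```

Sending only the changed fields keeps requests short and lets untouched stages follow any future change to the defaults.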
Best overall accuracy and reliability.
Use When: You need the best balance of accuracy, reliability, and performance.
Faster processing with good accuracy.
Use When: You need faster processing times and can accept slightly lower accuracy.
Speed Improvements:
Maximum accuracy for critical processing.
Use When: Accuracy is critical and processing time is not a concern.
Accuracy Improvements:
Minimize costs while maintaining acceptable quality.
Use When: You need to minimize API costs.
Cost Savings:
Customize image classification models and thresholds via the ai_config parameter.
Customize visual embedding models (SigLIP/CLIP) via the ai_config parameter.
Note: Each image produces 5 embeddings (visual, color, texture, style, material), so a document with 65 images yields 65 × 5 = 325 total embeddings.
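The arithmetic in the note generalizes to any image count. The helper below is a hypothetical sketch for capacity planning; only the five embedding types come from the note above.

```python
# The five per-image embedding types listed in the note above.
EMBEDDING_TYPES = ("visual", "color", "texture", "style", "material")


def total_embeddings(image_count: int) -> int:
    """Number of embeddings generated for a document with image_count images."""
    return image_count * len(EMBEDDING_TYPES)


print(total_embeddings(65))  # 325
```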
Customize metadata extraction model and parameters.
Metadata Fields Extracted:
Customize chunking and text embedding models.
For processing large batches of PDFs, optimize for speed and cost:
Benefits:
For high-value catalogs requiring maximum accuracy:
Benefits:
DEFAULT_AI_CONFIG:
COST_OPTIMIZED_CONFIG:
HIGH_ACCURACY_CONFIG:
Speed Improvements:
Benefits:
Test two configurations side-by-side by submitting separate jobs with different ai_config values. Compare results to find the best configuration for your use case.
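Before submitting the two jobs, it can help to record exactly which fields differ so that result differences can be attributed to them. The sketch below uses a hypothetical `diff_configs` helper; the field names and values are taken from the troubleshooting examples in this document and are not recommended pairings.

```python
from typing import Any, Dict, Tuple


def diff_configs(a: Dict[str, Any], b: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Map each field that differs to its (value in a, value in b) pair."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


config_a = {
    "metadata_extraction_model": "claude",
    "classification_confidence_threshold": 0.8,
    "metadata_max_tokens": 8192,
}
config_b = {
    "metadata_extraction_model": "claude",
    "classification_confidence_threshold": 0.6,
    "metadata_max_tokens": 2048,
}

# Only the two fields that differ are reported; the shared model is omitted.
for field, (value_a, value_b) in diff_configs(config_a, config_b).items():
    print(f"{field}: A={value_a} B={value_b}")
```

Changing one or two fields per comparison makes it easier to tell which setting caused a difference in output.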
Start with DEFAULT_AI_CONFIG: It provides the best balance for most use cases.
Test Before Production: Use the NOVA test script to validate configurations before deploying them.
Monitor Costs: Track API usage and costs for different configurations.
Optimize Iteratively: Start with quality, then optimize for speed/cost.
Use Pre-configured Profiles: They're tested and optimized for specific scenarios.
Document Your Choices: Keep track of which configurations work best for different PDF types.
Consider PDF Complexity:
Problem: Too many false positives/negatives in image classification.
Solution: Raise the confidence threshold and use Claude Sonnet for validation. In your ai_config, set classification_confidence_threshold to 0.8 and classification_validation_model to claude-sonnet-4-6-20260217.
Problem: Pipeline takes too long to complete.
Solution: Use FAST_CONFIG, or switch to faster models and lower the token limits: use gpt-4o for discovery and claude-haiku-4-20250514 for validation, and reduce discovery_max_tokens and metadata_max_tokens to 2048.
Problem: API costs are too high.
Solution: Use COST_OPTIMIZED_CONFIG, or lower the threshold and switch to cheaper models: reduce classification_confidence_threshold to 0.6, and use gpt-4o for discovery, gpt for metadata extraction, and claude-haiku-4-20250514 for validation.
Problem: Extracted metadata is incomplete or inaccurate.
Solution: Use Claude with a lower temperature and a higher token limit. Set metadata_extraction_model to claude, metadata_temperature to 0.05, and metadata_max_tokens to 8192.
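The four fixes above can be collected into a lookup of `ai_config` overrides. This is a sketch under stated assumptions: `discovery_model` is an assumed field name (the text names the discovery model but not its field), the validation model is assumed to map to `classification_validation_model` in every case, and the problem keys and the `overrides_for` helper are hypothetical.

```python
from typing import Any, Dict

TROUBLESHOOTING_OVERRIDES: Dict[str, Dict[str, Any]] = {
    "poor_classification": {
        "classification_confidence_threshold": 0.8,
        "classification_validation_model": "claude-sonnet-4-6-20260217",
    },
    "slow_processing": {
        "discovery_model": "gpt-4o",  # assumed field name
        "classification_validation_model": "claude-haiku-4-20250514",
        "discovery_max_tokens": 2048,
        "metadata_max_tokens": 2048,
    },
    "high_costs": {
        "classification_confidence_threshold": 0.6,
        "discovery_model": "gpt-4o",  # assumed field name
        "metadata_extraction_model": "gpt",
        "classification_validation_model": "claude-haiku-4-20250514",
    },
    "poor_metadata": {
        "metadata_extraction_model": "claude",
        "metadata_temperature": 0.05,
        "metadata_max_tokens": 8192,
    },
}


def overrides_for(problem: str) -> Dict[str, Any]:
    """Return a copy of the overrides for one problem, safe to modify."""
    return dict(TROUBLESHOOTING_OVERRIDES[problem])


print(overrides_for("poor_metadata")["metadata_max_tokens"])  # 8192
```

Some fixes pull in opposite directions (for example, the classification fix raises the threshold to 0.8 while the cost fix lowers it to 0.6), so apply one at a time and re-test.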
The dynamic AI model configuration system lets you control which AI models and parameters are used at each stage of the PDF processing pipeline. Choose the right configuration based on your priorities:
All configurations are production-ready and tested with the NOVA end-to-end test script.