Complete documentation for async processing architecture and concurrency limits across all product generation methods.
All three product generation methods (PDF Processing, Web Scraping, XML Import) use fully async processing with unified concurrency limits. The shared design provides:
✅ Fully Async: All I/O operations use async/await
✅ Semaphore-based: Concurrency controlled via asyncio.Semaphore
✅ Batch Processing: Large datasets processed in batches
✅ Retry Logic: Automatic retry with exponential backoff
✅ Circuit Breakers: Prevent cascading failures
✅ Timeout Guards: Prevent infinite hangs
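The semaphore and batching points above can be sketched as follows. This is a minimal illustration, not the actual service code: `process_in_batches`, `make_batches`, and the `worker` callable are hypothetical names, and the defaults (batch size 15, 5 concurrent) are taken from the image classification limits listed in this document.

```python
import asyncio
from typing import Awaitable, Callable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def make_batches(items: List[T], batch_size: int) -> List[List[T]]:
    """Split items into consecutive batches of at most batch_size."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


async def process_in_batches(
    items: List[T],
    worker: Callable[[T], Awaitable[R]],  # hypothetical per-item coroutine
    batch_size: int = 15,                 # classification batch size
    max_concurrent: int = 5,              # Qwen Vision concurrency
) -> List[R]:
    """Run worker over items batch by batch with bounded concurrency."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(item: T) -> R:
        # At most max_concurrent workers hold the semaphore at once.
        async with semaphore:
            return await worker(item)

    results: List[R] = []
    for batch in make_batches(items, batch_size):
        # gather preserves input order within each batch.
        results.extend(await asyncio.gather(*(guarded(item) for item in batch)))
    return results
```

Processing one batch at a time bounds how many items are held in memory, while the semaphore bounds how many requests are in flight within that batch.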
All three methods use AsyncQueueService for background job processing. PDF Processing uses process_pdf_document(), Web Scraping uses process_scraping_session(), and XML Import uses process_import_job(). All are fully async and use the same downstream services for chunking, embedding, and image processing.
The AsyncQueueService is shared across all three methods. It queues jobs for chunking, embedding generation, and product enrichment via its queue_ai_analysis_jobs() method.
All methods update the background_jobs table in real time with the current job status, progress percentage, and metadata about the current stage.
Applies to: PDF, Web Scraping, XML Import
Service: ImageProcessingService
| Limit | Value | Purpose |
|---|---|---|
| Qwen Vision Concurrent | 5 | Fast material classification |
| Claude Validation Concurrent | 2 | Validation for uncertain cases |
| Batch Size | 15 images | Memory optimization |
Why these limits?
Applies to: PDF, Web Scraping, XML Import
Service: ImageProcessingService
| Limit | Value | Purpose |
|---|---|---|
| Concurrent Uploads | 10 | Supabase Storage upload limit |
Why 10?
Applies to: PDF, Web Scraping, XML Import
Service: ImageProcessingService
| Limit | Value | Purpose |
|---|---|---|
| Batch Size | 20 images | Memory optimization |
| Max Retries | 3 | Retry failed embeddings |
Why 20?
Applies to: XML Import
Service: ImageDownloadService
| Limit | Value | Purpose |
|---|---|---|
| Concurrent Downloads | 5 | Network bandwidth optimization |
| Max Retries | 3 | Retry failed downloads |
| Timeout | 30 seconds | Prevent hanging downloads |
| Max File Size | 10 MB | Prevent large file downloads |
Why these limits?
Applies to: XML Import
Service: DataImportService
| Limit | Value | Purpose |
|---|---|---|
| Batch Size | 10 products | Memory optimization |
| Image Downloads per Batch | 5 concurrent | Network optimization |
Why 10 products?
Applies to: PDF Processing
Service: PDFProcessor
| Limit | Value | Purpose |
|---|---|---|
| Max Workers | 2 | Memory optimization (reduced from 4) |
| Pages per Worker | 5 | Batch size for page processing |
| Max Pages in Memory | 10 | 2 workers × 5 pages |
Why 2 workers?
Applies to: PDF, Web Scraping
Service: ProductDiscoveryService
| Operation | Timeout | Purpose |
|---|---|---|
| Product Discovery | 300s (5 min) | AI analysis of full document |
| Per-product Extraction | 60s | Individual product metadata |
Applies to: PDF Processing
Service: PDFProcessor
| Operation | Timeout | Purpose |
|---|---|---|
| Full PDF Extraction | 7200s (2 hours) | Large PDFs with OCR |
| Per-page Extraction | Dynamic | Based on file size |
The per-page timeout is calculated dynamically: max(300, file_size_mb * 10 + num_pages * 5). This means a small PDF (10 pages, 5MB) gets the 300s minimum (the formula alone yields only 100s), while a large PDF (500 pages, 50MB) gets 3000s (50 min).
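The formula can be checked with a few lines of Python. The function name `per_page_timeout` is hypothetical; only the expression itself comes from this document.

```python
def per_page_timeout(file_size_mb: float, num_pages: int) -> float:
    """Dynamic timeout in seconds, with a 300s floor."""
    return max(300, file_size_mb * 10 + num_pages * 5)


# Small PDF: 5 * 10 + 10 * 5 = 100, so the 300s floor applies.
small = per_page_timeout(file_size_mb=5, num_pages=10)

# Large PDF: 50 * 10 + 500 * 5 = 3000 seconds (50 minutes).
large = per_page_timeout(file_size_mb=50, num_pages=500)
```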
Applies to: XML Import
Service: ImageDownloadService
| Operation | Timeout | Purpose |
|---|---|---|
| Per-image Download | 30s | Single image download |
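One way to combine the 30s per-image timeout with the retry limit is sketched below. The names `download_with_retry` and `fetch` are hypothetical, the 1-second base delay is an assumed value, and treating "3 retries" as up to 4 attempts in total is an interpretation rather than documented behavior of ImageDownloadService.

```python
import asyncio
from typing import Awaitable, Callable


async def download_with_retry(
    fetch: Callable[[str], Awaitable[bytes]],  # hypothetical download coroutine
    url: str,
    timeout: float = 30.0,    # per-image timeout from this document
    max_retries: int = 3,     # retry limit from this document
    base_delay: float = 1.0,  # assumed backoff base, not from the source
) -> bytes:
    """Fetch url, retrying timeouts and I/O errors with exponential backoff."""
    last_error: Exception = RuntimeError("no attempt made")
    for attempt in range(max_retries + 1):
        try:
            # wait_for cancels the fetch if it exceeds the timeout.
            return await asyncio.wait_for(fetch(url), timeout=timeout)
        except (asyncio.TimeoutError, OSError) as exc:
            last_error = exc
            if attempt < max_retries:
                # Delays grow as base_delay * 1, 2, 4, ...
                await asyncio.sleep(base_delay * 2 ** attempt)
    raise last_error
```

Passing the fetch function in as a parameter keeps the sketch independent of any particular HTTP client.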
Applies to: PDF, Web Scraping, XML Import
Service: QwenEndpointService, AIClientService
| Operation | Timeout | Purpose |
|---|---|---|
| Qwen Vision Request | 120s | Image classification |
| Claude Request | 120s | Validation |
Applies to: PDF, Web Scraping, XML Import
Service: QwenEndpointService
| Limit | Value | Purpose |
|---|---|---|
| Requests per Minute | 10 | API rate limit |
| Burst Limit | 5 | Short-term burst |
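A token bucket is one common way to enforce a per-minute rate together with a burst allowance. The sketch below is an assumption about how such limits could be implemented; this document does not say which algorithm QwenEndpointService uses, and `TokenBucket` is a hypothetical class that maps the burst limit to the bucket capacity.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow up to `burst` immediate requests, refilled at a steady rate."""

    def __init__(
        self,
        rate_per_minute: float = 10,  # requests per minute from this document
        burst: int = 5,               # burst limit from this document
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.refill_per_second = rate_per_minute / 60.0
        self.capacity = float(burst)
        self._tokens = float(burst)
        self._clock = clock
        self._last = clock()

    def try_acquire(self) -> bool:
        """Consume one token if available; return False when rate limited."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill for the elapsed time, never exceeding the bucket capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_per_second)
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False
```

A caller that receives `False` would typically wait briefly and try again rather than send the request.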
Applies to: PDF, Web Scraping, XML Import
Service: CircuitBreaker
| Limit | Value | Purpose |
|---|---|---|
| Failure Threshold | 5 | Open circuit after 5 failures |
| Recovery Timeout | 60s | Try again after 60s |
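The two circuit breaker settings can be illustrated with a small class. This is a minimal sketch under the thresholds listed above, not the actual CircuitBreaker service; `SimpleCircuitBreaker` and `CircuitOpenError` are hypothetical names, and counting consecutive failures is an assumption.

```python
import time
from typing import Any, Awaitable, Callable, Optional


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class SimpleCircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 5,      # open after 5 failures
        recovery_timeout: float = 60.0,  # allow a trial call after 60s
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def allow_request(self) -> bool:
        if self._opened_at is None:
            return True  # circuit closed
        # Once the recovery timeout has passed, let a trial request through.
        return self._clock() - self._opened_at >= self.recovery_timeout

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()  # open, or re-open after a failed trial

    async def call(self, func: Callable[..., Awaitable[Any]], *args: Any, **kwargs: Any) -> Any:
        if not self.allow_request():
            raise CircuitOpenError("circuit is open; request rejected")
        try:
            result = await func(*args, **kwargs)
        except Exception:
            self.record_failure()
            raise
        self.record_success()
        return result
```

Rejecting calls while the circuit is open is what keeps a failing downstream API from dragging every job into repeated timeouts.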
Applies to: All methods
Service: images.py API
| Limit | Value | Purpose |
|---|---|---|
| Exports per Hour | 5 | Prevent abuse |
All three methods use the SAME services with the SAME limits:
Used by: PDF, Web Scraping, XML Import
Shared limits across all methods: 5 concurrent HuggingFace Endpoint (Qwen3-VL) requests, 2 concurrent Claude requests, 10 concurrent uploads, classification batch size of 15 images, and CLIP batch size of 20 images.
Used by: PDF, Web Scraping, XML Import
Uses SigLIP2 via the SLIG cloud endpoint (HuggingFace Inference Endpoint, 768D) and generates five specialized 768D embedding types written directly to VECS: image_slig_embeddings (visual), image_color_embeddings, image_texture_embeddings, image_style_embeddings, and image_material_embeddings. It also generates an understanding embedding (1024D Voyage AI, derived from the Qwen3-VL vision_analysis), stored in image_understanding_embeddings. Updated 2026-04: the legacy google/siglip-so400m-patch14-384 (1152D) and CLIP 512D collections were dropped.
Used by: PDF, Web Scraping, XML Import
Shared background job processing for chunking, embedding generation, and product enrichment.
Used by: PDF, Web Scraping, XML Import
Shared chunking logic with chunk size of 1000 characters and overlap of 200 characters.
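A fixed-size chunker with overlap can be sketched as below. `chunk_text` is a hypothetical helper, not the shared service's API, and plain character slicing is an assumption: the real logic may respect sentence or paragraph boundaries.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> List[str]:
    """Split text into chunk_size-character chunks sharing `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap  # each chunk starts 800 characters after the last
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
    return chunks
```

With the default settings, a 2,500-character input yields three chunks starting at offsets 0, 800, and 1600, and each chunk repeats the last 200 characters of the previous one.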
Batch Processing: All methods process data in batches to prevent OOM
| Method | Batch Size | Memory Impact |
|---|---|---|
| PDF Image Classification | 15 images | ~500MB per batch |
| CLIP Embeddings | 20 images | ~300MB per batch |
| XML Product Import | 10 products | ~200MB per batch |
Concurrent Downloads: Controlled via semaphores
| Operation | Concurrency | Throughput |
|---|---|---|
| Image Downloads (XML) | 5 concurrent | ~5 images/sec |
| Image Uploads | 10 concurrent | ~10 images/sec |
Rate Limiting: Prevent API throttling
| API | Limit | Strategy |
|---|---|---|
| HuggingFace (Qwen3-VL) | 10 req/min | Semaphore (5 concurrent) |
| Claude | Circuit breaker | Semaphore (2 concurrent) |
| OpenAI | No limit | Batch processing |
| Feature | PDF | Web | XML |
|---|---|---|---|
| Main Processing | ✅ Fully async | ✅ Fully async | ✅ Fully async |
| Background Jobs | ✅ AsyncQueueService | ✅ AsyncQueueService | ✅ AsyncQueueService |
| Product Discovery | ✅ Async + timeout | ✅ Async + timeout | ✅ Async (queued) |
| Image Processing | ✅ Async + semaphores | ✅ Async + semaphores | ✅ Async + semaphores |
| Chunking | ✅ Async | ✅ Async | ✅ Async (queued) |
| Embeddings | ✅ Async | ✅ Async | ✅ Async (queued) |
| Limit | PDF | Web | XML |
|---|---|---|---|
| Qwen Vision | 5 concurrent | 5 concurrent | 5 concurrent |
| Claude Validation | 2 concurrent | 2 concurrent | 2 concurrent |
| Image Classification Batch | 15 images | 15 images | 15 images |
| Image Uploads | 10 concurrent | 10 concurrent | 10 concurrent |
| CLIP Batch | 20 images | 20 images | 20 images |
| Image Downloads | N/A | N/A | 5 concurrent |
| Product Batch | N/A | N/A | 10 products |
| PDF Workers | 2 workers | N/A | N/A |
| Timeout | PDF | Web | XML |
|---|---|---|---|
| Product Discovery | 300s (5 min) | 300s (5 min) | N/A |
| PDF Extraction | 7200s (2 hours) | N/A | N/A |
| Image Download | N/A | N/A | 30s |
| AI Classification | 120s | 120s | 120s |
Always log batch progress including the current batch number, total batches, and progress percentage to enable real-time tracking.
Use try/except blocks with detailed logging around every batch processing call. On failure, log the error with context and continue processing the next batch rather than aborting the entire job.
After each batch, explicitly delete the batch data reference and call the garbage collector to free memory. This is particularly important for large image batches.
Always update the background_jobs table after each stage with the current progress percentage and stage name so the frontend can display accurate real-time progress.
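The four practices above can be combined in a single batch loop. This is a sketch under stated assumptions: `run_batches`, `process_batch`, and `report_progress` are hypothetical names, and `report_progress` stands in for whatever code writes to the background_jobs table.

```python
import gc
import logging
from typing import Any, Awaitable, Callable, List, Sequence

logger = logging.getLogger(__name__)


async def run_batches(
    batches: Sequence[List[Any]],
    process_batch: Callable[[List[Any]], Awaitable[None]],   # hypothetical
    report_progress: Callable[[int, str], Awaitable[None]],  # hypothetical
) -> List[int]:
    """Process every batch, log progress, and return the failed batch numbers."""
    failed: List[int] = []
    total = len(batches)
    for index, batch in enumerate(batches, start=1):
        try:
            await process_batch(batch)
        except Exception:
            # Log with context and move on instead of aborting the whole job.
            logger.exception("Batch %d/%d failed; continuing", index, total)
            failed.append(index)
        progress = round(index / total * 100)
        logger.info("Batch %d/%d done (%d%%)", index, total, progress)
        await report_progress(progress, f"batch {index}/{total}")
        # Drop this loop's reference and run a collection pass. Memory is only
        # reclaimed if nothing else still references the batch.
        del batch
        gc.collect()
    return failed
```

Returning the failed batch numbers lets the caller decide whether to retry them or mark the job as partially complete.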
✅ All methods fully async: PDF, Web Scraping, XML Import
✅ Same concurrency limits: 5 HuggingFace/Qwen, 2 Claude, 10 uploads, 20 SLIG
✅ Same timeout guards: 300s discovery, 120s AI, 30s downloads
✅ Same rate limiting: 10 req/min HuggingFace/Qwen, circuit breaker Claude
✅ Same shared services: ImageProcessingService, RealEmbeddingsService, AsyncQueueService
✅ Memory optimized: Batch processing prevents OOM
✅ Network optimized: Semaphores prevent congestion
✅ API optimized: Rate limiting prevents throttling
The architecture is unified, consistent, and production-ready! 🚀