Last Updated: November 3, 2025
Version: 1.0.0
Status: ✅ Production
The MIVAA Relevancy System establishes intelligent relationships between chunks, products, and images using AI-powered scoring algorithms. This system ensures accurate search results, proper entity linking, and high-quality knowledge base organization.
Relevancy is a scored relationship (0.0-1.0) between two entities that indicates how closely they are related. Higher scores mean stronger relationships.
MIVAA uses 3 primary relationship tables to link entities:
Table: chunk_product_relationships
Purpose: Links text chunks to products they describe
Relationship Types:
source - Chunk is primary source describing the product
related - Chunk mentions or relates to the product
component - Chunk describes a component of the product
alternative - Chunk describes an alternative to the product
Table: product_image_relationships
Purpose: Links products to images that depict them
Relationship Types:
depicts - Image directly shows the product
illustrates - Image illustrates product features
variant - Image shows a product variant
related - Image is related to the product
Table: chunk_image_relationships
Purpose: Links text chunks to images they reference
Relationship Types:
illustrates - Image illustrates the chunk content
depicts - Image depicts what the chunk describes
related - Image is related to the chunk
example - Image provides an example of chunk content
Chunk → Product:
Formula: relevance_score = page_proximity(40%) + embedding_similarity(30%) + mention_score(30%)
Components:
Page Proximity (40%) - How close is the chunk to the product?
Values: 0.4, 0.2, 0.0
Embedding Similarity (30%) - How similar is the chunk content to the product?
Value: 0.15
Mention Score (30%) - Does the chunk mention the product name?
Values: 0.3, 0.0
Product → Image:
Formula: relevance_score = page_overlap(40%) + visual_similarity(40%) + detection_score(20%)
Components:
Page Overlap (40%) - Are the product and image on the same page?
Values: 0.4, 0.2, 0.0
Visual Similarity (40%) - How visually similar is the image to the product?
Value: 0.3
Detection Score (20%) - How confident is the AI that this image shows the product?
Value: 0.2
Chunk → Image:
Formula: relevance_score = same_page(50%) + visual_text_similarity(30%) + spatial_proximity(20%)
Components:
Same Page (50%) - Are the chunk and image on the same page?
Values: 0.5, 0.0
Visual-Text Similarity (30%) - Does the image content match the chunk text?
Value: 0.2
Spatial Proximity (20%) - How close are they on the page?
Values: 0.2, 0.1, 0.0
entity_linking_service.py
Location: mivaa-pdf-extractor/app/services/entity_linking_service.py
Key Methods:
link_images_to_products(document_id, image_to_product_mapping, product_name_to_id) - Links images to products with relevance scores
link_chunks_to_images(document_id) - Links chunks to images on the same page
link_chunks_to_products(document_id) - Links chunks to products with relevance scores
entityRelationshipService.ts
Location: src/services/entityRelationshipService.ts
Key Methods:
linkChunkToProduct(chunkId, productId, relationshipType, relevanceScore) - Returns ChunkProductRelationship
linkProductToImage(productId, imageId, relationshipType, relevanceScore) - Returns ProductImageRelationship
linkChunkToImage(chunkId, imageId, relationshipType, relevanceScore) - Returns ChunkImageRelationship
Claude/GPT analyzes the PDF, identifies products and their pages, and creates an image-to-product mapping.
Semantic chunking creates text chunks which are stored in the document_chunks table, each with a page_number.
Products are linked to Images (using image-to-product mapping), then Chunks are linked to Products (using page proximity + embeddings), then Chunks are linked to Images (using same-page detection). All relationships are stored with relevance scores.
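The three weighted formulas given earlier can be illustrated with one small helper. This is a minimal sketch, not the code in entity_linking_service.py: the function name `weighted_relevance`, the weight dictionaries, and the assumption that each component arrives as a normalized signal between 0.0 and 1.0 are all hypothetical. Only the weights themselves come from this document.

```python
from typing import Mapping

# Weights copied from the formulas in this document.
CHUNK_PRODUCT_WEIGHTS = {
    "page_proximity": 0.4,
    "embedding_similarity": 0.3,
    "mention_score": 0.3,
}
PRODUCT_IMAGE_WEIGHTS = {
    "page_overlap": 0.4,
    "visual_similarity": 0.4,
    "detection_score": 0.2,
}
CHUNK_IMAGE_WEIGHTS = {
    "same_page": 0.5,
    "visual_text_similarity": 0.3,
    "spatial_proximity": 0.2,
}


def weighted_relevance(signals: Mapping[str, float],
                       weights: Mapping[str, float]) -> float:
    """Combine component signals into one relevance score in 0.0-1.0.

    Assumption: each signal is normalized to 0.0-1.0 before weighting.
    Missing signals count as 0.0.
    """
    total = 0.0
    for name, weight in weights.items():
        value = signals.get(name, 0.0)
        value = min(1.0, max(0.0, value))  # clamp out-of-range inputs
        total += weight * value
    # Clamp and round so float noise never pushes the score past the bounds.
    return round(min(1.0, max(0.0, total)), 4)


if __name__ == "__main__":
    score = weighted_relevance(
        {"page_proximity": 1.0, "embedding_similarity": 0.5, "mention_score": 1.0},
        CHUNK_PRODUCT_WEIGHTS,
    )
    print(score)
```

Keeping the weights in data rather than in three separate functions means a weight change touches one dictionary, and the same clamping logic applies to every relationship type.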
Filter relationships by minimum relevance score. Recommended minimums: 0.7 for high-quality chunk-product relationships, 0.5 for product-image relationships.
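As an illustration of the thresholds above, the following sketch filters relationship rows in memory. The row shape (a dict with a `relevance_score` key) and the helper name `filter_by_relevance` are assumptions made for this example; in practice the same cut-off might be applied in the database query instead.

```python
from typing import Dict, List

# Recommended minimums from this document.
MIN_CHUNK_PRODUCT_SCORE = 0.7
MIN_PRODUCT_IMAGE_SCORE = 0.5


def filter_by_relevance(relationships: List[Dict], min_score: float) -> List[Dict]:
    """Keep only relationships whose relevance_score meets the minimum."""
    return [r for r in relationships if r["relevance_score"] >= min_score]


if __name__ == "__main__":
    # Hypothetical rows; the IDs are placeholders.
    rows = [
        {"chunk_id": "c1", "product_id": "p1", "relevance_score": 0.85},
        {"chunk_id": "c2", "product_id": "p1", "relevance_score": 0.55},
    ]
    print(filter_by_relevance(rows, MIN_CHUNK_PRODUCT_SCORE))
```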
When multiple relationships exist, prioritize by type:
Chunk → Product:
source (primary description)
component (part of product)
related (mentions product)
alternative (alternative to product)
Product → Image:
depicts (shows product directly)
illustrates (shows features)
variant (shows variant)
related (related image)
Relevance scores can be updated based on user feedback. Increase score by 0.1 when user confirms a relationship (capped at 1.0), decrease by 0.2 when user rejects (floored at 0.0).