Material-Specific OCR Processing
This document describes the material-specific OCR processing system implemented in the platform. The system enhances OCR extraction by using material type detection and material-specific metadata fields.
Overview
The material-specific OCR processing system consists of the following components:
- Material Type Detector: Detects the material type from OCR text and/or images
- Metadata Field Utilities: Retrieves metadata fields specific to a material type
- Material-Specific OCR Extractor: Extracts metadata using material-specific patterns and context-aware processing
Material Type Detection
The system first detects the material type from the OCR text, the image, or both. The detected type determines which metadata fields are used for extraction.
Detection Methods
- Text-based detection: Uses keyword matching and pattern recognition to identify material types from text
- Image-based detection: Uses ML models to classify images into material types
- Hybrid detection: Combines text and image detection for higher accuracy
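The detection methods above can be sketched roughly as follows. This is a minimal illustration, not the platform's implementation: the keyword lists, the function names `text_scores` and `detect_material_type`, the `image_scores` input (standing in for the output of an image classifier), and the equal default weighting are all assumptions.

```python
import re
from collections import Counter
from typing import Dict, Optional, Tuple

# Hypothetical keyword lists; the platform's real keywords are not documented here.
MATERIAL_KEYWORDS = {
    "tile": ["tile", "ceramic", "porcelain", "mosaic"],
    "wood": ["wood", "hardwood", "laminate", "oak"],
    "lighting": ["lamp", "fixture", "lumens", "wattage"],
    "furniture": ["chair", "table", "sofa"],
    "decoration": ["vase", "artwork", "rug"],
}


def text_scores(text: str) -> Dict[str, float]:
    """Score each material type by its share of keyword hits in the text."""
    words = Counter(re.findall(r"[a-z]+", text.lower()))
    raw = {
        material: sum(words[kw] for kw in keywords)
        for material, keywords in MATERIAL_KEYWORDS.items()
    }
    total = sum(raw.values())
    return {m: (s / total if total else 0.0) for m, s in raw.items()}


def detect_material_type(
    text: str,
    image_scores: Optional[Dict[str, float]] = None,
    text_weight: float = 0.5,
) -> Tuple[str, float]:
    """Text-only detection, or hybrid detection when image scores are given.

    Falls back to "all" (common fields only) when nothing matches.
    """
    scores = text_scores(text)
    if image_scores is not None:
        # Hybrid: weighted average of text and image scores (assumed scheme).
        scores = {
            m: text_weight * s + (1 - text_weight) * image_scores.get(m, 0.0)
            for m, s in scores.items()
        }
    best = max(scores, key=scores.get)
    if scores[best] == 0.0:
        return "all", 0.0
    return best, scores[best]
```

A weighted average is only one way to combine the two signals; a real system might instead prefer whichever detector reports the higher confidence.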
Supported Material Types
- tile: Ceramic, porcelain, mosaic, and other tile materials
- wood: Hardwood, engineered wood, laminate, and other wood materials
- lighting: Lamps, fixtures, and other lighting products
- furniture: Chairs, tables, sofas, and other furniture
- decoration: Decorative items like vases, artwork, and rugs
- all: Common fields applicable to all material types
Metadata Field Filtering
Once the material type is detected, the system retrieves metadata fields specific to that material type. This ensures that only relevant fields are used for extraction.
Field Selection Process
- Get all metadata fields for the detected material type
- Include common fields applicable to all material types
- Filter out inactive or irrelevant fields
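The selection steps above could look something like the sketch below. The `MetadataField` class, its attribute names, and the `select_fields` function are hypothetical; the platform's actual field model is not shown in this document.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MetadataField:
    """Hypothetical field record; attribute names are assumptions."""
    name: str
    material_type: str  # e.g. "tile", "wood", or "all" for common fields
    is_active: bool = True


def select_fields(fields: List[MetadataField], material_type: str) -> List[MetadataField]:
    """Keep active fields that belong to the detected type or to "all"."""
    return [
        f
        for f in fields
        if f.is_active and f.material_type in (material_type, "all")
    ]


# Example data, invented for illustration only.
FIELDS = [
    MetadataField("manufacturer", "all"),
    MetadataField("thickness", "tile"),
    MetadataField("wattage", "lighting"),
    MetadataField("legacy_code", "tile", is_active=False),
]

selected = [f.name for f in select_fields(FIELDS, "tile")]
```

Here `selected` contains the common field and the active tile field, while the lighting field and the inactive tile field are dropped.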
Material-Specific Extraction
The system uses material-specific extraction patterns and context-aware processing to extract metadata from OCR text.
Extraction Methods
- Pattern-based extraction: Uses regular expressions specific to each field and material type
- Hint-based extraction: Uses field hints to locate and extract values
- Context-aware extraction: Uses context and relationships between fields to enhance extraction
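As a rough sketch of how pattern-based and hint-based extraction might fit together, the example below tries a field's regular expression first and falls back to label hints. The function names, the "label: value" hint format, and the sample patterns are assumptions made for illustration.

```python
import re
from typing import List, Optional


def extract_with_pattern(text: str, pattern: str) -> Optional[str]:
    """Return the first capture group of the pattern, or None if no match.

    The pattern is expected to contain one capturing group for the value.
    """
    match = re.search(pattern, text, flags=re.IGNORECASE)
    if match is None:
        return None
    return match.group(1).strip()


def extract_with_hint(text: str, hints: List[str]) -> Optional[str]:
    """Find a line of the form "<hint>: <value>" and return the value."""
    wanted = {hint.lower() for hint in hints}
    for line in text.splitlines():
        label, sep, value = line.partition(":")
        if sep and label.strip().lower() in wanted:
            return value.strip() or None
    return None


def extract_field(text: str, pattern: Optional[str], hints: List[str]) -> Optional[str]:
    """Try the field's pattern first, then fall back to its hints."""
    value = extract_with_pattern(text, pattern) if pattern else None
    return value if value is not None else extract_with_hint(text, hints)


# Sample OCR text, invented for illustration.
OCR_TEXT = "Product: Carrara Matt\nSize: 60 x 60 cm\nThickness: 9 mm"

thickness = extract_field(
    OCR_TEXT, r"thickness\s*[:=]?\s*(\d+(?:\.\d+)?\s*mm)", ["thickness"]
)
size = extract_field(OCR_TEXT, None, ["size", "dimensions"])
```

Trying the stricter pattern before the looser hint keeps well-formed values from being replaced by whatever happens to follow a label.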
Context-Aware Processing
The system implements material-specific context-aware processing to enhance extraction:
- For tiles:
  - Extract size from dimensions and vice versa
  - Validate thickness values against typical ranges for tiles
- For wood:
  - Extract width and length from dimensions
  - Validate thickness values against typical ranges for wood
- For lighting:
  - Extract wattage and lumens from technical specifications
  - Convert between related quantities (e.g., estimate lumens from watts)
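The tile rules can be illustrated with a small sketch. The field names (`dimensions`, `size`, `thickness_mm`, `thickness_valid`), the function name `apply_tile_context`, and the thickness range are assumptions, not values taken from the platform; only the dimensions-to-size direction is shown.

```python
import re
from typing import Any, Dict

# Assumed plausible range in millimetres; not taken from the platform.
TILE_THICKNESS_RANGE_MM = (3.0, 30.0)


def apply_tile_context(fields: Dict[str, Any]) -> Dict[str, Any]:
    """Derive missing tile fields and flag implausible thickness values."""
    result = dict(fields)  # work on a copy so the input is not mutated

    # Derive "size" from the first two numbers in "dimensions" if absent.
    dims = result.get("dimensions")
    if dims and "size" not in result:
        m = re.search(
            r"(\d+(?:\.\d+)?)\s*[x×]\s*(\d+(?:\.\d+)?)", dims, re.IGNORECASE
        )
        if m:
            result["size"] = f"{m.group(1)}x{m.group(2)}"

    # Validate thickness against the assumed typical range.
    thickness = result.get("thickness_mm")
    if thickness is not None:
        low, high = TILE_THICKNESS_RANGE_MM
        result["thickness_valid"] = low <= float(thickness) <= high

    return result


checked = apply_tile_context({"dimensions": "60 x 60 x 0.9 cm", "thickness_mm": 9})
```

Flagging an out-of-range value rather than discarding it leaves the final decision to a later review step, which is one possible design among several.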
API Endpoints
The system exposes the following API endpoints:
Detect Material Type
POST /api/ocr/detect-material-type
Request body:
JavaScript code removed for compatibility
Response:
JavaScript code removed for compatibility
Extract Metadata
POST /api/ocr/extract-metadata
Request body:
JavaScript code removed for compatibility
Response:
JavaScript code removed for compatibility
Test Extraction Pattern
POST /api/ocr/test-extraction-pattern
Request body:
JavaScript code removed for compatibility
Response:
JavaScript code removed for compatibility
Integration with ML Training
The material-specific OCR system is integrated with the ML training pipeline:
- ML training uses material-specific metadata fields
- Training data is filtered by material type
- Models are trained to recognize material-specific properties
This integration tailors the ML models to each material type, which is intended to improve recognition accuracy.
Future Enhancements
Planned enhancements for the material-specific OCR system:
- Improved material type detection: Enhance detection accuracy with more advanced ML models
- More material types: Add support for additional material types like fabric, metal, and glass
- Enhanced context-aware processing: Implement more sophisticated context-aware extraction rules
- Feedback loop: Incorporate user feedback to improve extraction patterns and rules
- Multi-language support: Add support for extracting metadata from OCR text in multiple languages