Prompt-Based Extraction Architecture

Overview

All extraction services in the Material Kai Vision Platform follow a prompt-based architecture where extraction logic is controlled by AI prompts stored in the database, not hardcoded in services.

This document describes the standardized approach for building extraction services.


Architecture Principles

1. Database-Driven Prompts

All extraction prompts are stored in the prompts table, each scoped to a workspace and identified by prompt type, stage, category, name, and version.

2. AI-Powered Extraction

Services use an AI model (Claude or GPT) to interpret data and extract information.

3. No Hardcoded Logic

The correct approach is to load prompts from the database and use AI to interpret data, rather than embedding hardcoded regex patterns or logic directly in the service code.


Current Extraction Prompts

1. Product Discovery

2. Product Entity Creation

3. Material Properties

4. Factory Documents

5. Factory Metadata

6. Icon Metadata

7. Product Image Analysis


How to Build a New Extraction Service

Step 1: Create Database Prompt

Insert a new record into the prompts table specifying the workspace ID, prompt type (extraction), stage, category, name, detailed AI instructions, status (active), active flag, and version number.
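As a rough illustration of Step 1, the sketch below inserts a prompt record using Python's built-in sqlite3 module and an in-memory database. The column names (workspace_id, prompt_type, prompt_text, is_active, and so on) are assumptions inferred from the field list above, and the stage and category values are made up; the platform's real schema and database engine may differ.

```python
import sqlite3

# Hypothetical schema: column names are inferred from the field list in Step 1,
# not taken from the platform's actual prompts table.
SCHEMA = """
CREATE TABLE prompts (
    id           INTEGER PRIMARY KEY,
    workspace_id TEXT    NOT NULL,
    prompt_type  TEXT    NOT NULL,
    stage        TEXT    NOT NULL,
    category     TEXT    NOT NULL,
    name         TEXT    NOT NULL,
    prompt_text  TEXT    NOT NULL,
    status       TEXT    NOT NULL,
    is_active    INTEGER NOT NULL,
    version      INTEGER NOT NULL
)
"""


def create_prompt(conn, workspace_id, stage, category, name, prompt_text, version=1):
    """Insert an active extraction prompt and return its row id."""
    cur = conn.execute(
        "INSERT INTO prompts (workspace_id, prompt_type, stage, category, name,"
        " prompt_text, status, is_active, version)"
        " VALUES (?, 'extraction', ?, ?, ?, ?, 'active', 1, ?)",
        (workspace_id, stage, category, name, prompt_text, version),
    )
    conn.commit()
    return cur.lastrowid


# In-memory database so the sketch runs without any setup.
conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

# "discovery" and "warranty_terms" are illustrative values only.
prompt_id = create_prompt(
    conn,
    workspace_id="ws-demo",
    stage="discovery",
    category="warranty_terms",
    name="Warranty Extraction",
    prompt_text="Extract warranty terms and return JSON with keys 'duration' and 'coverage'.",
)
```

Parameterized queries keep the long instruction text out of the SQL string, which matters because prompt text often contains quotes.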

Step 2: Create Service Class

Create a service class that loads the prompt from the database on initialization, then calls the AI model with the combined prompt and data. Parse the structured JSON response returned by the AI.
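The shape of such a service might look like the following minimal sketch. The class name PromptExtractionService and the two injected callables (load_prompt and call_model) are hypothetical; they stand in for the platform's real database access and AI client, which are not shown in this document.

```python
import json
from typing import Callable


class PromptExtractionService:
    """Sketch of a service that loads its prompt once, then delegates to an AI model."""

    def __init__(
        self,
        load_prompt: Callable[[str, str], str],  # hypothetical: (stage, category) -> prompt text
        call_model: Callable[[str], str],        # hypothetical: full prompt -> raw model reply
        stage: str,
        category: str,
    ):
        # The prompt comes from the database at initialization, not from code.
        self.prompt = load_prompt(stage, category)
        self._call_model = call_model

    def extract(self, data: str) -> dict:
        """Combine the stored prompt with the input data and parse the JSON reply."""
        full_prompt = f"{self.prompt}\n\nDATA:\n{data}"
        raw = self._call_model(full_prompt)
        try:
            return json.loads(raw)
        except json.JSONDecodeError as exc:
            raise ValueError(f"Model did not return valid JSON: {raw[:80]!r}") from exc


# Stubs so the sketch runs without a database or an API key.
def fake_load_prompt(stage: str, category: str) -> str:
    return "Extract warranty terms and return JSON with key 'duration'."


def fake_model(prompt: str) -> str:
    return '{"duration": "10 years"}'


service = PromptExtractionService(fake_load_prompt, fake_model, "discovery", "warranty_terms")
result = service.extract("Warranty: 10 years")
```

Passing the loader and the model call in as parameters is a design choice of this sketch: it keeps the class free of extraction rules and lets tests swap in stubs like the ones above.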

Step 3: Integrate into Pipeline

Add the service call to the appropriate PDF processing stage.
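One way to picture the integration is a registry that maps each processing stage to the extractors that run in it. The PdfPipeline class, the stage name "metadata_extraction", and the stub extractor below are all assumptions made for illustration; the platform's actual pipeline API is not described in this document.

```python
from typing import Callable, Dict, List, Tuple

# An extractor takes page text and returns structured data,
# for example the extract method of a prompt-based service.
Extractor = Callable[[str], dict]


class PdfPipeline:
    """Hypothetical pipeline: each stage runs the extractors registered for it."""

    def __init__(self) -> None:
        self._extractors: Dict[str, List[Tuple[str, Extractor]]] = {}

    def register(self, stage: str, name: str, extractor: Extractor) -> None:
        """Attach an extractor to a named stage."""
        self._extractors.setdefault(stage, []).append((name, extractor))

    def run_stage(self, stage: str, text: str) -> Dict[str, dict]:
        """Run every extractor registered for the stage, keyed by extractor name."""
        return {name: fn(text) for name, fn in self._extractors.get(stage, [])}


pipeline = PdfPipeline()

# A stub stands in for a real service call here.
pipeline.register("metadata_extraction", "warranty", lambda text: {"source_length": len(text)})

results = pipeline.run_stage("metadata_extraction", "Warranty: 10 years")
```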


Benefits

  1. Flexibility: Change extraction logic by updating prompts, with no code changes
  2. Versioning: Track prompt changes with version numbers
  3. Customization: Use workspace-specific prompts for different use cases
  4. Consistency: All services follow the same pattern
  5. Maintainability: No hardcoded logic scattered across services
  6. Testability: Services are easy to test with different prompts

Admin Management

Prompts can be managed through:

All changes are tracked in the prompt_history table.
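To make the tracking idea concrete, the sketch below records the previous prompt text and version in prompt_history before applying an update. It is only an illustration: the prompt_history columns (prompt_id, old_text, old_version, changed_at) are assumptions, and the platform may record history differently, for example with a database trigger.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, simplified schemas for both tables.
conn.executescript("""
CREATE TABLE prompts (
    id          INTEGER PRIMARY KEY,
    name        TEXT,
    prompt_text TEXT,
    version     INTEGER
);
CREATE TABLE prompt_history (
    id          INTEGER PRIMARY KEY,
    prompt_id   INTEGER,
    old_text    TEXT,
    old_version INTEGER,
    changed_at  TEXT DEFAULT CURRENT_TIMESTAMP
);
INSERT INTO prompts (id, name, prompt_text, version)
VALUES (1, 'Warranty Extraction', 'v1 instructions', 1);
""")


def update_prompt(conn, prompt_id, new_text):
    """Archive the current prompt, then store the new text with a bumped version."""
    with conn:  # one transaction: commits on success, rolls back on error
        row = conn.execute(
            "SELECT prompt_text, version FROM prompts WHERE id = ?", (prompt_id,)
        ).fetchone()
        if row is None:
            raise KeyError(f"No prompt with id {prompt_id}")
        old_text, old_version = row
        conn.execute(
            "INSERT INTO prompt_history (prompt_id, old_text, old_version) VALUES (?, ?, ?)",
            (prompt_id, old_text, old_version),
        )
        conn.execute(
            "UPDATE prompts SET prompt_text = ?, version = ? WHERE id = ?",
            (new_text, old_version + 1, prompt_id),
        )
    return old_version + 1


new_version = update_prompt(conn, 1, "v2 instructions")
```

Writing the history row and the update in one transaction means a failed update never leaves an orphaned history entry.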