Knowledge Base & Documentation System

📋 Overview

The Knowledge Base & Documentation System manages product documentation, technical guides, and knowledge articles. It combines AI-powered semantic search with hierarchical categories, product links, version history, and comments.


Database Schema

Tables Created (6 total)

  1. kb_docs - Main documents table

    • Embeddings support (1024D vector with ivfflat index)
    • Embedding metadata (model, timestamp, status, error tracking)
    • Content fields (title, content, markdown, summary)
    • Status & visibility control (draft/published/archived, public/private/workspace)
    • View tracking and engagement metrics
    • price_doc_type (2026-04) — optional enum (price_list | discount_rule | contract_terms | promotion) for docs filed under the Pricing category; drives how the price_lookup agent tool combines documents
    • RLS policies for workspace isolation
  2. kb_categories - Category hierarchy

    • Parent/child relationships for nested categories
    • Color coding and icons for visual organization
    • Workspace isolation with RLS
    • Sort order for custom arrangement
  3. kb_doc_attachments - Product/material links

    • Multi-product linking (1 doc → many products)
    • Relationship types (primary, supplementary, related, certification, specification)
    • Relevance scoring (1-5 scale)
    • Workspace isolation
  4. kb_doc_versions - Version history

    • Track all changes with timestamps
    • Change type and description
    • Changed fields tracking
    • Immutable (no updates, only inserts)
    • Creator tracking
  5. kb_doc_comments - Comments & suggestions

    • Section-level feedback
    • Threading support (parent/child comments)
    • @mentions support (mentioned_users array)
    • Status tracking (open, resolved, archived)
    • Workspace isolation
  6. kb_search_analytics - Search tracking

    • Query tracking with search type
    • Click tracking (which document was clicked)
    • Performance metrics (search_time_ms)
    • Immutable (no updates, only inserts)
    • User tracking

Indexes Created

RLS Policies


Backend API Endpoints

API Routes Created (16+ endpoints)

Base Path: /api/kb

Document Management (5 endpoints)

  1. POST /api/kb/documents - Create document

    • Automatic embedding generation (1024D)
    • Smart embedding status tracking
    • Error handling with retry support
    • Returns: Document with embedding status
  2. GET /api/kb/documents/{doc_id} - Get document

    • Retrieve single document by ID
    • Returns: Full document with metadata
  3. PATCH /api/kb/documents/{doc_id} - Update document

    • Smart content change detection
    • Regenerates embedding ONLY if content changed
    • Skips embedding if only metadata changed
    • Returns: Updated document with embedding status
  4. DELETE /api/kb/documents/{doc_id} - Delete document

    • Cascading delete (removes attachments, versions, comments)
    • Returns: 204 No Content
  5. POST /api/kb/documents/from-pdf - Create from PDF

    • Extract text using PyMuPDF (text only, no chunking)
    • Automatic embedding generation
    • Returns: Document with extracted text
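
The status tracking and retry behaviour described for document creation can be sketched as follows. This is a minimal illustration, not the actual backend code: `embed_with_status`, `EmbeddingResult`, and the `max_attempts` retry budget are hypothetical names, and `embed_fn` stands in for whatever client calls the embedding provider. Only the 1024-dimension size and the pending/success/failed states come from this document.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, List, Optional

EMBEDDING_DIM = 1024  # vector size stated in this document


@dataclass
class EmbeddingResult:
    """Hypothetical record mirroring the embedding metadata kept per document."""
    status: str  # "pending", "success", or "failed"
    embedding: Optional[List[float]] = None
    error: Optional[str] = None
    generated_at: Optional[datetime] = None


def embed_with_status(
    text: str,
    embed_fn: Callable[[str], List[float]],
    max_attempts: int = 3,  # assumed retry budget, not taken from the source
) -> EmbeddingResult:
    """Call embed_fn up to max_attempts times and record the outcome."""
    result = EmbeddingResult(status="pending")
    for _ in range(max_attempts):
        try:
            vector = embed_fn(text)
            if len(vector) != EMBEDDING_DIM:
                raise ValueError(
                    f"expected {EMBEDDING_DIM} dimensions, got {len(vector)}"
                )
            result.status = "success"
            result.embedding = vector
            result.error = None
            result.generated_at = datetime.now(timezone.utc)
            return result
        except Exception as exc:
            # Keep the most recent error so it can be stored with the document.
            result.error = str(exc)
    result.status = "failed"
    return result
```

Passing the embedding call in as a function keeps the sketch independent of any particular provider SDK and makes the retry path easy to exercise with a fake.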

Search (1 endpoint)

  1. POST /api/kb/search - Search documents

    • Semantic Search: Vector similarity using pgvector cosine distance
      • Generates embedding for search query using Voyage AI voyage-3.5 (updated 2026-04)
      • Compares against stored document embeddings using <=> operator
      • Returns results with similarity scores (0.0 - 1.0)
      • Minimum threshold: 0.5 (configurable)
    • Full-Text Search: ILIKE-based keyword matching
      • Searches title and content fields
      • Case-insensitive matching
    • Hybrid Search: Combination of semantic + full-text
      • Weighted scoring for best results
    • Category filtering (optional)
    • Pagination support (default: 20 results)
    • Returns: Results with search time metrics (ms)

    The request body takes workspace_id, query, search_type (semantic, full_text, or hybrid), and an optional limit. Filters added in 2026-04: category_id, category_slug (e.g. "pricing"), price_doc_type (price_list | discount_rule | contract_terms | promotion), allowed_access_levels, and require_published (default false, for admin management). The response includes results, each carrying category_slug, category_name, price_doc_type, and similarity, plus search_time_ms and total_results.
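
As an illustration of the request body described above, the following sketch assembles and validates a search payload before it is sent to POST /api/kb/search. The helper name `build_search_body` is hypothetical. The field names and allowed values are taken from this section, the default limit of 20 follows the pagination note, and leaving unset optional filters out of the body is an assumption about what the API accepts.

```python
from typing import Any, Dict, Optional

SEARCH_TYPES = {"semantic", "full_text", "hybrid"}
PRICE_DOC_TYPES = {"price_list", "discount_rule", "contract_terms", "promotion"}


def build_search_body(
    workspace_id: str,
    query: str,
    search_type: str,
    limit: int = 20,
    category_slug: Optional[str] = None,
    price_doc_type: Optional[str] = None,
    require_published: bool = False,
) -> Dict[str, Any]:
    """Build a JSON-serialisable body for the search endpoint (hypothetical helper)."""
    if search_type not in SEARCH_TYPES:
        raise ValueError(f"unknown search_type: {search_type!r}")
    if price_doc_type is not None and price_doc_type not in PRICE_DOC_TYPES:
        raise ValueError(f"unknown price_doc_type: {price_doc_type!r}")
    if limit < 1:
        raise ValueError("limit must be a positive integer")

    body: Dict[str, Any] = {
        "workspace_id": workspace_id,
        "query": query,
        "search_type": search_type,
        "limit": limit,
        "require_published": require_published,
    }
    # Optional filters are only included when set (an assumption about the API).
    if category_slug is not None:
        body["category_slug"] = category_slug
    if price_doc_type is not None:
        body["price_doc_type"] = price_doc_type
    return body
```

For example, `build_search_body("ws-1", "oak flooring", "hybrid", category_slug="pricing", price_doc_type="price_list")` yields a body restricted to price lists in the Pricing category.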

    Architecture:

    • Frontend → MIVAA API /api/kb/search
    • MIVAA generates query embedding (Voyage AI)
    • MIVAA calls Supabase kb_match_docs() RPC function (unified 2026-04; accepts match_category_id, match_category_slug, match_price_doc_type, require_published)
    • Supabase performs vector similarity search using pgvector
    • Returns ranked results with similarity scores
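
The ranking step in this flow can be illustrated in plain Python. The sketch below is not the kb_match_docs() implementation, which runs inside Postgres. It only mirrors the arithmetic: pgvector's <=> operator returns cosine distance, similarity is taken as 1 minus that distance, results below the 0.5 threshold are dropped, and the rest are sorted and limited. The function names and the in-memory `docs` mapping are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine distance, the quantity pgvector's <=> operator computes."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def rank_by_similarity(
    query_vec: Sequence[float],
    docs: Dict[str, Sequence[float]],  # doc id -> stored embedding (illustrative)
    threshold: float = 0.5,            # minimum similarity from this document
    limit: int = 20,                   # default page size from this document
) -> List[Tuple[str, float]]:
    """Return (doc_id, similarity) pairs above the threshold, best first."""
    scored = []
    for doc_id, vec in docs.items():
        similarity = 1.0 - cosine_distance(query_vec, vec)
        if similarity >= threshold:
            scored.append((doc_id, similarity))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]
```

In production the same comparison would be done by the database against the ivfflat index rather than by a Python loop, so this form is useful mainly for unit tests and for reasoning about the threshold.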

See also: Pricing API for the admin-only flow that ingests docs under a "Pricing" category with price_doc_type sub-types and retrieves them via either the price_lookup agent tool (AI reasoning mode) or the search_knowledge_base gateway action (quick-pick direct mode).

Categories (2 endpoints)

  1. POST /api/kb/categories - Create category

    • Hierarchical support (parent/child)
    • Color and icon customization
    • Returns: Created category
  2. GET /api/kb/categories - List categories

    • Workspace filtering
    • Ordered by sort_order
    • Returns: All categories for workspace
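
Because categories come back as a flat list with parent/child links, a client usually has to nest them before rendering. The sketch below shows one way to do that. `build_category_tree` is a hypothetical helper, and the `id` and `parent_id` keys are assumed column names. Only `sort_order` is named in this document.

```python
from typing import Any, Dict, List


def build_category_tree(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Nest flat category rows under their parents, ordered by sort_order."""
    # Copy each row so the input is not mutated, and give it a children list.
    nodes = {row["id"]: {**row, "children": []} for row in rows}
    roots: List[Dict[str, Any]] = []
    for node in nodes.values():
        parent = nodes.get(node.get("parent_id"))
        if parent is None:
            # No parent, or the parent is not in this result set: treat as top level.
            roots.append(node)
        else:
            parent["children"].append(node)

    def sort_level(level: List[Dict[str, Any]]) -> None:
        level.sort(key=lambda n: n["sort_order"])
        for n in level:
            sort_level(n["children"])

    sort_level(roots)
    return roots
```

Rows whose parent is missing from the result set are shown at the top level rather than dropped, which keeps categories visible when filtering hides an ancestor.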

Product Attachments (3 endpoints)

  1. POST /api/kb/attachments - Attach document to product

    • Link document to 1+ products
    • Relationship type specification
    • Relevance scoring (1-5)
    • Returns: Attachment record
  2. GET /api/kb/documents/{doc_id}/attachments - Get document attachments

    • List all products linked to document
    • Returns: Array of attachments
  3. GET /api/kb/products/{product_id}/documents - Get product documents

    • List all documents linked to product
    • Returns: Array of documents
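
The attachment rules (one document linked to many products, a fixed set of relationship types, relevance on a 1-5 scale) can be checked client-side before calling POST /api/kb/attachments. The following is a sketch under assumptions: `build_attachments` and the payload keys `document_id`, `product_id`, `relationship_type`, and `relevance_score` are illustrative names, not confirmed request fields.

```python
from typing import Any, Dict, Iterable, List

RELATIONSHIP_TYPES = {
    "primary",
    "supplementary",
    "related",
    "certification",
    "specification",
}


def build_attachments(
    document_id: str,
    product_ids: Iterable[str],
    relationship_type: str,
    relevance_score: int,
) -> List[Dict[str, Any]]:
    """Build one attachment payload per product for a single document."""
    if relationship_type not in RELATIONSHIP_TYPES:
        raise ValueError(f"unknown relationship_type: {relationship_type!r}")
    if not 1 <= relevance_score <= 5:
        raise ValueError("relevance_score must be between 1 and 5")

    payloads = []
    seen = set()
    for product_id in product_ids:
        if product_id in seen:
            continue  # avoid linking the same product twice
        seen.add(product_id)
        payloads.append(
            {
                "document_id": document_id,   # assumed field name
                "product_id": product_id,     # assumed field name
                "relationship_type": relationship_type,
                "relevance_score": relevance_score,
            }
        )
    return payloads
```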

Health Check (1 endpoint)

  1. GET /api/kb/health - Health check
    • Service status
    • Feature availability
    • Endpoint listing
    • Returns: Health status

🔄 Embedding Generation Lifecycle

When Embeddings Are Generated

  1. CREATE Document

    • User creates new doc → Backend generates embedding (1024D)
    • Sync operation (happens immediately)
    • Status: pending → success or failed
  2. PDF Upload

    • User uploads PDF → Extract text → Generate embedding
    • Sync operation
    • Status tracked in database
  3. EDIT/MODIFY Document (Smart Detection)

    • User edits content → Check if content changed
    • IF content changed: Generate NEW embedding
    • IF only metadata changed: Skip embedding
    • Content fields that trigger re-embedding:
      • title, content, summary, seo_keywords, category_id
    • Metadata fields that DON'T trigger re-embedding:
      • status, visibility, view_count, timestamps
  4. SEARCH

    • User searches → Generate query embedding
    • Perform vector similarity search
    • Returns top N results
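
The smart detection in step 3 amounts to comparing an update against the stored document, looking only at the fields listed as triggering re-embedding. A minimal sketch follows. The helper names are hypothetical, while the field list is the one given above.

```python
from typing import Any, Dict, Set

# Fields listed in this document as triggering a new embedding.
CONTENT_FIELDS = frozenset(
    {"title", "content", "summary", "seo_keywords", "category_id"}
)


def changed_content_fields(current: Dict[str, Any], patch: Dict[str, Any]) -> Set[str]:
    """Return the content fields whose value in the patch differs from the stored one."""
    return {
        field
        for field in CONTENT_FIELDS
        if field in patch and patch[field] != current.get(field)
    }


def needs_reembedding(current: Dict[str, Any], patch: Dict[str, Any]) -> bool:
    """True only when the patch actually changes a content field."""
    return bool(changed_content_fields(current, patch))
```

Comparing values rather than just checking which keys are present means a PATCH that resends an unchanged title does not trigger a new embedding call.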

Embedding Metadata Tracking

Stored in kb_docs table:

Error Handling


📊 API Response Formats

Success responses include document fields such as id, workspace_id, title, content, embedding_status, embedding_generated_at, created_at, and view_count. Error responses include a detail message and status_code. Search responses include success, results, total_count, search_time_ms, and search_type.


Implementation Files

Backend Files

Database

Documentation


Key Features

  1. Automatic Embedding Generation - Text embeddings (1024D) for semantic search
  2. Smart Content Detection - Only regenerate embeddings when content changes
  3. PDF Text Extraction - PyMuPDF integration for text-only extraction
  4. Semantic Search - Vector similarity search using embeddings
  5. Product Attachment - Link documents to multiple products
  6. Category Hierarchy - Parent/child category relationships
  7. Version History - Track all document changes
  8. Comments System - Section-level feedback with threading
  9. Search Analytics - Track queries and clicks
  10. Workspace Isolation - RLS policies for multi-tenant security

📈 Metrics


🔧 Technical Stack


Frontend Components

Components (6 total)

  1. KnowledgeBaseManagement.tsx - Main admin page

    • Tabbed interface (Documents, Search, Categories, Product Links, Analytics)
    • Stats dashboard with real-time metrics
    • Integrated with GlobalAdminHeader for consistent UI
    • Route: /admin/knowledge-base
  2. DocumentList.tsx - Document management

    • Table view with status, embedding status, views, created date
    • Status filter (all, draft, published, archived)
    • Search filtering by title/content
    • Edit and delete actions
    • Direct Supabase queries for performance
  3. DocumentEditor.tsx - Document creation/editing

    • Modal dialog with full-screen editing
    • Title, content, summary, category selection
    • PDF upload with automatic text extraction
    • Edit/Preview tabs for content
    • Status and visibility controls
    • Smart embedding generation on save
  4. CategoryManager.tsx - Category management

    • Table view with icon, name, description, document count
    • Create category dialog
    • Color picker and icon selector
    • Edit and delete actions
  5. SearchInterface.tsx - Semantic search

    • Search type selector (semantic, full-text, hybrid)
    • Real-time search with performance metrics
    • Results display with similarity scores
    • AI indexed badge for documents with embeddings
  6. ProductAttachments.tsx - Product linking

    • Link documents to products
    • Relationship type selection (primary, supplementary, related, certification, specification)
    • Relevance scoring (1-5 stars)
    • Table view with product name, relationship, relevance

Service Layer

knowledgeBaseService.ts - API integration service

Integration Points

  1. App.tsx - Route registration

    • Updated /admin/knowledge-base route to use new component
    • Removed old MaterialKnowledgeBase import
    • Added AuthGuard and AdminGuard protection
  2. AdminDashboard.tsx - Navigation link

    • Updated "PDF Knowledge Base" to "Knowledge Base & Documentation"
    • Updated description to reflect new features
    • Badge shows "NEW v2.3.0"
  3. MIVAA Gateway - API routing

    • 13 Knowledge Base endpoints registered
    • Proper path and method mapping
    • Version updated to v2.3.0

UI/UX Features


System Metrics