Interon Site Knowledge Graph

Transform any website into a structured knowledge graph with AI-powered entity extraction, relationship mapping, and intelligent question generation. Built by Interon, South Africa.

What Does This Tool Do?

Site Knowledge Graph is an intelligent web crawler and knowledge extraction system that analyzes websites to create structured, machine-readable knowledge bases. Perfect for AI training, content analysis, and data-driven insights.

100%

Automated

Fully automated extraction and analysis

5+

Export Formats

Multiple output formats for any use case

∞

Use Cases

Unlimited applications and scenarios

How It Works

📝

1. Submit URL

Enter any website URL with custom depth and page limits
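As a rough illustration of how depth and page limits bound a crawl, the sketch below runs a breadth-first traversal over an in-memory link map instead of fetching real pages. The names `CrawlRequest`, `planCrawl`, `maxDepth`, and `maxPages` are assumptions made for this example, not the tool's actual API.

```typescript
// Hypothetical request shape; field names are assumptions, not the tool's API.
interface CrawlRequest {
  startUrl: string;
  maxDepth: number; // 0 means "only the start page"
  maxPages: number; // hard cap on the number of pages visited
}

// A static link map stands in for real fetching so the sketch runs offline.
type LinkMap = Record<string, string[]>;

interface PlannedPage {
  url: string;
  depth: number;
}

function planCrawl(req: CrawlRequest, links: LinkMap): PlannedPage[] {
  const visited = new Set<string>([req.startUrl]);
  const queue: PlannedPage[] = [{ url: req.startUrl, depth: 0 }];
  const order: PlannedPage[] = [];

  while (queue.length > 0 && order.length < req.maxPages) {
    const page = queue.shift()!;
    order.push(page);
    if (page.depth >= req.maxDepth) continue; // do not follow links past the depth limit
    for (const next of links[page.url] ?? []) {
      if (!visited.has(next)) {
        visited.add(next);
        queue.push({ url: next, depth: page.depth + 1 });
      }
    }
  }
  return order;
}

// Made-up site structure for demonstration only.
const demoLinks: LinkMap = {
  "https://example.com/": ["https://example.com/a", "https://example.com/b"],
  "https://example.com/a": ["https://example.com/a/deep"],
  "https://example.com/b": [],
};

const plan = planCrawl(
  { startUrl: "https://example.com/", maxDepth: 1, maxPages: 10 },
  demoLinks,
);
console.log(plan.map((p) => `${p.depth} ${p.url}`));
```

With `maxDepth: 1` the page at `/a/deep` is never queued, and lowering `maxPages` stops the traversal early regardless of depth.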

🕷️

2. Intelligent Crawl

Respects robots.txt, handles JavaScript, and extracts clean content
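To make the robots.txt step concrete, here is a deliberately simplified check, assuming only `User-agent: *` groups and plain `Disallow` prefixes matter. A real crawler would also honour `Allow` rules, wildcards, longest-match precedence, and crawl-delay. The function names are hypothetical.

```typescript
// Simplified robots.txt handling: only groups that name "*" and plain
// Disallow prefixes are considered. Allow rules and wildcards are ignored.
function disallowedPrefixes(robotsTxt: string): string[] {
  const prefixes: string[] = [];
  let appliesToAll = false; // true while inside a group that names "*"
  let previousWasAgent = false;

  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    const line = rawLine.replace(/#.*$/, "").trim(); // strip comments
    const separator = line.indexOf(":");
    if (separator === -1) continue;
    const field = line.slice(0, separator).trim().toLowerCase();
    const value = line.slice(separator + 1).trim();

    if (field === "user-agent") {
      // Consecutive User-agent lines share one group of rules.
      appliesToAll = (previousWasAgent && appliesToAll) || value === "*";
      previousWasAgent = true;
    } else {
      previousWasAgent = false;
      if (field === "disallow" && appliesToAll && value !== "") {
        prefixes.push(value);
      }
    }
  }
  return prefixes;
}

function isAllowed(url: string, robotsTxt: string): boolean {
  const path = new URL(url).pathname;
  return !disallowedPrefixes(robotsTxt).some((prefix) => path.startsWith(prefix));
}

const sampleRobots = [
  "User-agent: *",
  "Disallow: /admin",
  "Disallow: /cart # checkout pages",
].join("\n");

console.log(isAllowed("https://example.com/blog/post", sampleRobots)); // true
console.log(isAllowed("https://example.com/admin/users", sampleRobots)); // false
```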

🧠

3. Entity Extraction

AI identifies entities: organizations, products, services, concepts

🔗

4. Build Graph

Maps relationships between entities to create a knowledge graph
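One plausible way to turn extracted relationship mentions into weighted graph edges is sketched below. It assumes that an edge's weight is simply the number of times the same (from, type, to) triple was seen, which is an assumption made for this example and not a documented rule. All type and function names, and the sample data, are hypothetical.

```typescript
// One raw relationship observation found on a page (hypothetical shape).
interface RelationMention {
  from: string;
  type: string;
  to: string;
  sourceUrl: string;
}

// A merged edge in the graph. Weight = number of mentions (assumed rule).
interface GraphEdge {
  from: string;
  type: string;
  to: string;
  weight: number;
  sources: string[];
}

function buildEdges(mentions: RelationMention[]): GraphEdge[] {
  const edges = new Map<string, GraphEdge>();
  for (const m of mentions) {
    // JSON.stringify gives an unambiguous key even if names contain separators.
    const key = JSON.stringify([m.from, m.type, m.to]);
    const existing = edges.get(key);
    if (existing) {
      existing.weight += 1;
      if (!existing.sources.includes(m.sourceUrl)) {
        existing.sources.push(m.sourceUrl);
      }
    } else {
      edges.set(key, { from: m.from, type: m.type, to: m.to, weight: 1, sources: [m.sourceUrl] });
    }
  }
  return Array.from(edges.values());
}

// Made-up mentions for demonstration only.
const demoEdges = buildEdges([
  { from: "Acme", type: "offers", to: "Widget", sourceUrl: "https://example.com/" },
  { from: "Acme", type: "offers", to: "Widget", sourceUrl: "https://example.com/products" },
  { from: "Widget", type: "supports", to: "CSV export", sourceUrl: "https://example.com/products" },
]);
console.log(demoEdges);
```

Keeping the list of source URLs on each edge is what makes every relationship traceable back to the pages it came from.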

❓

5. Generate Questions

Creates AI-ready question sets derived from content and relationships

📊

6. Export Report

Download in multiple formats: PDF, JSON, CSV

The Comprehensive Report

Each analysis generates a detailed, multi-section report that provides deep insights into the website's content and structure:

📈

Site Overview

Domain info, crawl status, timestamps, and metadata about the analyzed website

🗺️

Crawl Summary

Total pages, depth analysis, fetch methods (HTTP vs JavaScript), and crawl duration

📄

Pages & Structure

Complete page inventory with URLs, titles, depth levels, and entity counts

🏢

Entity Summary

All extracted entities with types, confidence scores, mentions, and relationships

🔗

Relationships Map

Connections between entities with relationship types, weights, and sources

❓

Question Bank

AI-generated questions grouped by type and complexity level with confidence scores

Powerful Export Options

Every report comes with multiple export formats designed for different use cases. All exports are generated from the same analysis data, ensuring consistency:

📄 PDF Report | 🤖 JSON (AI-Ready) | 📊 Pages CSV | 🏢 Entities CSV | 🔗 Relationships CSV | ❓ Questions JSON | 📋 Questions CSV

📄 PDF Report

Print-ready, professionally formatted document with all sections, tables, and visualizations. Perfect for presentations and documentation.

🤖 AI-Ready JSON

Structured JSON format optimized for AI/ML pipelines, RAG systems, and programmatic processing. Includes full entity graph with relationships.
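For consumers of the JSON export, a typed loader with a basic integrity check might look like the sketch below. The exact schema is not documented here, so the `GraphExport` shape is an assumption that borrows its field names from the CSV column lists; the sample payload is invented.

```typescript
// Assumed export shape; field names mirror the CSV columns, not a published schema.
interface EntityRecord {
  name: string;
  type: string;
  source: string;
  confidence: number;
  mentionCount: number;
}

interface RelationshipRecord {
  from: string;
  relationType: string;
  to: string;
  weight: number;
  source: string;
}

interface GraphExport {
  entities: EntityRecord[];
  relationships: RelationshipRecord[];
}

// Returns relationships whose endpoints are missing from the entity list.
function danglingRelationships(graph: GraphExport): RelationshipRecord[] {
  const known = new Set(graph.entities.map((e) => e.name));
  return graph.relationships.filter((r) => !known.has(r.from) || !known.has(r.to));
}

// Invented payload for demonstration only.
const sampleExport: GraphExport = JSON.parse(`{
  "entities": [
    { "name": "Acme", "type": "organization", "source": "https://example.com/", "confidence": 0.9, "mentionCount": 4 },
    { "name": "Widget", "type": "product", "source": "https://example.com/products", "confidence": 0.8, "mentionCount": 2 }
  ],
  "relationships": [
    { "from": "Acme", "relationType": "offers", "to": "Widget", "weight": 2, "source": "https://example.com/" },
    { "from": "Acme", "relationType": "partners_with", "to": "Globex", "weight": 1, "source": "https://example.com/about" }
  ]
}`);

console.log(danglingRelationships(sampleExport).map((r) => r.to)); // [ 'Globex' ]
```

Running a check like this before loading the graph into a RAG index or database catches edges that point at entities the export never defined.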

📊 CSV Exports

Three separate CSV files:

  • Pages CSV: URL, title, depth, status, word count, entity count
  • Entities CSV: Name, type, source, confidence, mention count, relations
  • Relationships CSV: From entity, relation type, to entity, weight, source

Perfect for Excel analysis, database imports, and custom processing.
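As an illustration of what producing the Relationships CSV involves, the sketch below serializes rows with RFC 4180-style quoting so that entity names containing commas or quotes survive an Excel or database import. The helper names and the sample rows are hypothetical; the column order follows the list above.

```typescript
// Quote a field only when it contains a comma, quote, or line break.
function csvField(value: string | number): string {
  const text = String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

function toCsv(header: string[], rows: (string | number)[][]): string {
  const all: (string | number)[][] = [header, ...rows];
  return all.map((row) => row.map(csvField).join(",")).join("\r\n");
}

// Invented rows for demonstration only.
const relationshipsCsv = toCsv(
  ["from_entity", "relation_type", "to_entity", "weight", "source"],
  [
    ["Acme", "offers", "Widget", 2, "https://example.com/"],
    ["Acme, Inc.", "describes", 'The "Pro" plan', 1, "https://example.com/pricing"],
  ],
);
console.log(relationshipsCsv);
```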

❓ Question Bank Exports

AI-generated questions in JSON and CSV formats with question text, type, level, confidence, and source entities. Ready for training chatbots, creating FAQs, or testing comprehension models.

Key Features

πŸ›‘οΈ Ethical & Compliant Crawling

  • Respects robots.txt and crawl-delay directives
  • Permission-first approach - only crawl authorized sites
  • Rate limiting to avoid server overload
  • Duplicate detection via content hashing
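Duplicate detection via content hashing can be sketched as follows. To keep the example dependency-free it uses a 32-bit FNV-1a hash over whitespace-normalized, lowercased text; both the normalization and the hash choice are assumptions made for this sketch, and a production crawler would more likely use a cryptographic digest such as SHA-256 to avoid collisions. The class and function names are hypothetical.

```typescript
// 32-bit FNV-1a over UTF-16 code units. A stand-in for a stronger hash.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Normalize first so trivial whitespace or case changes hash identically.
function contentKey(text: string): string {
  return fnv1a(text.replace(/\s+/g, " ").trim().toLowerCase());
}

class DuplicateDetector {
  private seen = new Map<string, string>(); // content key -> first URL seen

  // Returns the URL of the earlier page with the same content, or null if new.
  check(url: string, text: string): string | null {
    const key = contentKey(text);
    const original = this.seen.get(key);
    if (original !== undefined) return original;
    this.seen.set(key, url);
    return null;
  }
}

const detector = new DuplicateDetector();
console.log(detector.check("https://example.com/a", "Hello   world\n"));
console.log(detector.check("https://example.com/a?ref=nav", " hello world"));
console.log(detector.check("https://example.com/b", "Something else entirely"));
```

The second call reports `https://example.com/a` as the original because the two texts normalize to the same string, while the first and third calls return `null`.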

🧠 Intelligent Extraction

  • AI-powered entity recognition (organizations, products, services, concepts)
  • Automatic relationship detection and mapping
  • Confidence scoring for every extraction
  • Source tracking for full traceability

⚡ Real-Time Progress Tracking

  • Live progress bar showing pages processed
  • Phase indicators: Crawling → Building Graph → Generating Report
  • Auto-refresh interface for seamless monitoring
  • Error handling with detailed diagnostics

📊 Rich Question Generation

  • 7 Question Types: Definition, How-To, Capability, Relationship, Coverage, Comparison, Gap Analysis
  • 4 Complexity Levels: Content Chunk, Page, Entity, Knowledge Graph
  • Confidence scores for answer reliability
  • AI guidance for interrogation best practices
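The question types and complexity levels listed above map naturally onto TypeScript union types. The sketch below models a question record using the fields named in the Question Bank export (text, type, level, confidence, source entities) and groups questions by type above a confidence threshold. The record shape, the assumption that confidence lies in the range 0 to 1, and the sample questions are illustrative only.

```typescript
const QUESTION_TYPES = [
  "Definition", "How-To", "Capability", "Relationship",
  "Coverage", "Comparison", "Gap Analysis",
] as const;
const COMPLEXITY_LEVELS = ["Content Chunk", "Page", "Entity", "Knowledge Graph"] as const;

type QuestionType = (typeof QUESTION_TYPES)[number];
type ComplexityLevel = (typeof COMPLEXITY_LEVELS)[number];

// Assumed record shape based on the export field list.
interface Question {
  text: string;
  type: QuestionType;
  level: ComplexityLevel;
  confidence: number; // assumed to be in [0, 1]
  sourceEntities: string[];
}

// Keep questions at or above the threshold, grouped by question type.
function groupByType(questions: Question[], minConfidence: number): Map<QuestionType, Question[]> {
  const groups = new Map<QuestionType, Question[]>();
  for (const q of questions) {
    if (q.confidence < minConfidence) continue;
    const bucket = groups.get(q.type) ?? [];
    bucket.push(q);
    groups.set(q.type, bucket);
  }
  return groups;
}

// Invented questions for demonstration only.
const demoQuestions: Question[] = [
  { text: "What is Widget?", type: "Definition", level: "Entity", confidence: 0.9, sourceEntities: ["Widget"] },
  { text: "How does Acme relate to Widget?", type: "Relationship", level: "Knowledge Graph", confidence: 0.7, sourceEntities: ["Acme", "Widget"] },
  { text: "What is Globex?", type: "Definition", level: "Page", confidence: 0.3, sourceEntities: ["Globex"] },
];

const grouped = groupByType(demoQuestions, 0.5);
console.log(Array.from(grouped.keys())); // [ 'Definition', 'Relationship' ]
```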

Use Cases

🤖

AI Training Data

Generate structured datasets for training chatbots, RAG systems, and knowledge-based AI models

🔍

Competitive Analysis

Analyze competitor websites to understand their offerings, messaging, and structure

📚

Content Audits

Review your own website's content structure, entity coverage, and information architecture

🔗

Knowledge Graphs

Build enterprise knowledge graphs from public documentation and resources

📖

Documentation Mining

Extract structured information from technical docs, wikis, and help centers

🎓

Educational Content

Create Q&A datasets and study materials from educational websites

Technical Architecture

Stack: Node.js + TypeScript + Fastify
Database: PostgreSQL (structured data) + Redis (queue/cache)
Crawling: HTTP + Playwright (JavaScript rendering)
Extraction: AI-powered entity recognition
Deployment: Railway (managed services)

Features:
✓ Type-safe with Prisma ORM
✓ Redis-backed job queue
✓ Parallel processing
✓ Content deduplication
✓ Relationship graph building
✓ Multi-format exports

Ready to Extract Knowledge?

Transform any website into structured knowledge in minutes. Start with a free AI Readiness audit or contact Interon for a full Knowledge Graph report.
