awesome-copilot/skills/mini-context-graph/references/ingestion.md
Nixon Kurian 746ba555b6 add mini-context-graph skill (#1580)
* add mini-context-graph skill

* remove pycache files

* filename case update to SKILL.md

* update readme
2026-05-05 14:04:37 +10:00


Ingestion Instructions

This file defines how the agent extracts entities and relations from a raw document.


Step 1: Read the Document

Read the provided text carefully. Identify:

  • Entities: noun phrases that refer to real-world objects, systems, components, actors, concepts, or events.
  • Relations: verb phrases that describe how one entity affects, contains, causes, uses, or is related to another.

Step 2: Extract Entities

For each entity:

  • Record its name (normalized: lowercase, strip leading/trailing whitespace)
  • Assign a type: a short label (1-3 words) that categorizes the entity
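
The two bullets above can be sketched as small helpers. This is a minimal illustration, not part of the skill's scripts: `normalize_name` and `is_short_label` are hypothetical names, and the check assumes the type label is meant to be one to three words.

```python
def normalize_name(raw: str) -> str:
    """Lowercase the name and strip leading/trailing whitespace."""
    return raw.strip().lower()


def is_short_label(entity_type: str, max_words: int = 3) -> bool:
    """Return True if the type is a non-empty label of at most max_words words.

    The three-word limit is an assumption about the intended label length.
    """
    words = entity_type.split()
    return 1 <= len(words) <= max_words


print(normalize_name("  Memory Leak "))          # memory leak
print(is_short_label("infrastructure"))          # True
print(is_short_label("a very long type label"))  # False
```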

Entity Type Examples

| Entity Name | Suggested Type |
|---|---|
| Python interpreter | software |
| memory leak | issue |
| operating system | system |
| database | infrastructure |
| user | actor |
| API endpoint | interface |
| server | infrastructure |

Rules:

  • Types must be general enough to reuse across documents
  • Do NOT create unique types per entity (e.g., avoid python-interpreter-type)
  • Use ontology.md normalization rules to canonicalize types

Step 3: Extract Relations

For each pair of entities that the text connects, whether stated outright or implied:

  • Record the source entity name
  • Record the target entity name
  • Record the relation type: a verb or verb phrase (normalized: lowercase)
  • Assign a confidence score between 0 and 1:
    • 1.0 = stated explicitly ("A causes B")
    • 0.8 = strongly implied ("A is linked to B")
    • 0.6 = weakly implied ("A may affect B")
    • < 0.6 = do NOT include
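
One way to apply the cutoff above is to filter candidate relations before they reach the output. The sketch below is illustrative only: `MIN_CONFIDENCE`, `CONFIDENCE_BY_STRENGTH`, and `filter_relations` are hypothetical names, and the strength labels are just a convenient way to spell out the three anchor values.

```python
MIN_CONFIDENCE = 0.6  # relations scoring below this are dropped

# Anchor values from the scale above; the key names are illustrative.
CONFIDENCE_BY_STRENGTH = {
    "explicit": 1.0,          # "A causes B"
    "strongly_implied": 0.8,  # "A is linked to B"
    "weakly_implied": 0.6,    # "A may affect B"
}


def filter_relations(relations: list) -> list:
    """Keep only relations whose confidence meets the minimum."""
    return [r for r in relations if r["confidence"] >= MIN_CONFIDENCE]


candidates = [
    {"source": "a", "target": "b", "type": "causes", "confidence": 1.0},
    {"source": "a", "target": "c", "type": "may affect", "confidence": 0.6},
    {"source": "a", "target": "d", "type": "relates to", "confidence": 0.4},
]
kept = filter_relations(candidates)
print([r["target"] for r in kept])  # ['b', 'c']
```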

Step 4: Output Format

Produce a JSON object in this exact format:

{
  "entities": [
    { "name": "entity name", "type": "entity type", "supporting_text": "exact quote mentioning this entity" }
  ],
  "relations": [
    {
      "source": "source entity name",
      "target": "target entity name",
      "type": "relation type",
      "confidence": 0.9,
      "supporting_text": "exact quote that justifies this relation"
    }
  ]
}

The supporting_text field is required for provenance. It must be a verbatim or near-verbatim quote from the document that mentions or supports the entity/relation. This is what links graph nodes and edges back to their source.
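
A provenance check along these lines can catch quotes that do not actually occur in the source. This is a minimal sketch under one assumption: "near-verbatim" is read here as "identical after ignoring case and whitespace differences". The helper names are hypothetical.

```python
import re


def _squash(text: str) -> str:
    """Lowercase and collapse every run of whitespace to a single space."""
    return re.sub(r"\s+", " ", text).strip().lower()


def quote_in_document(quote: str, document: str) -> bool:
    """Return True if the quote occurs in the document, ignoring case and spacing."""
    needle = _squash(quote)
    return bool(needle) and needle in _squash(document)


doc = "System crashes due to memory leaks.\nMemory leaks occur when objects are not released."
print(quote_in_document("system crashes due to memory leaks", doc))  # True
print(quote_in_document("objects are  not released", doc))           # True
print(quote_in_document("memory leaks cause slowdowns", doc))        # False
```

A stricter or looser definition (exact match, fuzzy matching) may suit some documents better; the rule itself only asks that the quote trace back to the source.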


Rules

  • All names and types must be lowercase
  • Only include relations where both entities are present in the entities list
  • Do NOT invent entities or relations not supported by the text
  • Prefer reusing existing entity and relation types from the ontology over creating new ones
  • One entity can appear in multiple relations (as source or target)
  • Always include supporting_text — this enables evidence retrieval and audit trails
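
The mechanical parts of these rules can be checked before ingestion. The following is a sketch, not the skill's own validator: `validate_extraction` is a hypothetical helper that covers lowercase names and types, relation endpoints, and the presence of supporting_text. Whether an entity is actually supported by the text still needs a reader's judgment.

```python
def validate_extraction(extraction: dict) -> list:
    """Return a list of rule violations; an empty list means the output passes."""
    errors = []
    entities = extraction.get("entities", [])
    relations = extraction.get("relations", [])
    names = {e.get("name") for e in entities}

    for e in entities:
        # Names and types must be present and lowercase.
        for field in ("name", "type"):
            value = e.get(field, "")
            if not value or value != value.lower():
                errors.append(f"entity {field} must be non-empty lowercase: {value!r}")
        if not e.get("supporting_text"):
            errors.append(f"entity {e.get('name')!r} is missing supporting_text")

    for r in relations:
        # Both endpoints must appear in the entities list.
        for end in ("source", "target"):
            if r.get(end) not in names:
                errors.append(f"relation {end} {r.get(end)!r} is not in the entities list")
        rel_type = r.get("type", "")
        if not rel_type or rel_type != rel_type.lower():
            errors.append(f"relation type must be non-empty lowercase: {rel_type!r}")
        if not r.get("supporting_text"):
            errors.append("relation is missing supporting_text")

    return errors
```

Calling it on the extraction output and refusing to ingest when the list is non-empty keeps malformed nodes and edges out of the graph.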

Step 5: Write Wiki Pages (Required)

After calling skill.ingest_with_content(...), you MUST write wiki pages:

5a. Write a summary page for the document

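# Assumes title, source, doc_id, entities, and relations are already in scope:
# the document metadata plus the Step 4 extraction output.
from scripts.tools import wiki_store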

wiki_store.write_page(
    category="summary",
    title=f"{title} Summary",
    content=f"""---
title: {title}
source_document: {doc_id}
tags: [summary]
---

# {title}

**Source:** {source}

## Key Claims

{chr(10).join(f'- [[{r["source"].replace(" ", "-")}]] {r["type"]} [[{r["target"].replace(" ", "-")}]] (confidence: {r["confidence"]})' for r in relations)}

## Entities

{chr(10).join(f'- [[{e["name"].replace(" ", "-")}]] ({e["type"]})' for e in entities)}

## Open Questions

- (Add questions from reading the document here)
""",
    summary=f"Summary of {title}",
)
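
To make the inline joins above easier to follow, the same line formatting can be pulled out into small functions. This is a sketch for illustration: `wikilink`, `key_claim_line`, and `entity_line` are hypothetical helpers that reproduce the formatting in the template, not functions provided by wiki_store.

```python
def wikilink(name: str) -> str:
    """Turn an entity name into a wikilink, replacing spaces with hyphens."""
    return "[[" + name.replace(" ", "-") + "]]"


def key_claim_line(relation: dict) -> str:
    """Format one relation as a 'Key Claims' bullet."""
    return (
        f'- {wikilink(relation["source"])} {relation["type"]} '
        f'{wikilink(relation["target"])} (confidence: {relation["confidence"]})'
    )


def entity_line(entity: dict) -> str:
    """Format one entity as an 'Entities' bullet."""
    return f'- {wikilink(entity["name"])} ({entity["type"]})'


relation = {"source": "memory leak", "target": "system crash",
            "type": "causes", "confidence": 1.0}
entity = {"name": "memory leak", "type": "issue"}

print(key_claim_line(relation))  # - [[memory-leak]] causes [[system-crash]] (confidence: 1.0)
print(entity_line(entity))       # - [[memory-leak]] (issue)
```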

5b. Write or update entity pages

For each new entity not already in the wiki, write an entity page:

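# Assumes wiki_store is imported as in 5a; entity_name and entity_type come
# from one entry of the entities list.
wiki_store.write_page(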
    category="entity",
    title=entity_name,
    content=f"""---
title: {entity_name}
type: {entity_type}
source_document: {doc_id}
tags: [{entity_type}]
---

# {entity_name}

(Description from the document or prior knowledge.)

## Relations

(List any wikilinks to related entities extracted from relations.)

## Mentioned in

- [[{doc_id}-summary]]
""",
    summary=f"{entity_name}: {entity_type}",
)

For existing entity pages, read the current page, append any new information and updated relations, and flag any contradictions with what the page already says.
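
The append step can be sketched on the page text alone. How wiki_store reads an existing page is not shown here, so the example below is an assumption-laden illustration: `add_mention` is a hypothetical helper, and it assumes "## Mentioned in" is the last section of the page, as in the template in 5b.

```python
def add_mention(page_text: str, summary_slug: str) -> str:
    """Append a summary wikilink under '## Mentioned in' unless it is already listed.

    Assumes '## Mentioned in' is the final section of the page.
    """
    link = f"- [[{summary_slug}]]"
    if link in page_text.splitlines():
        return page_text  # already recorded, leave the page untouched

    body = page_text.rstrip("\n")
    if "## Mentioned in" not in body.splitlines():
        # Older pages may lack the section, so create it first.
        body += "\n\n## Mentioned in\n"
    return body + "\n" + link + "\n"


page = "# memory leak\n\n## Mentioned in\n\n- [[doc-1-summary]]\n"
updated = add_mention(page, "doc-2-summary")
print(updated)
```

Calling it a second time with the same slug returns the page unchanged, so re-ingesting a document does not duplicate links.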


Example

Input document:

System crashes due to memory leaks.
Memory leaks occur when objects are not released.

Expected extraction output:

{
  "entities": [
    { "name": "system crash", "type": "issue",     "supporting_text": "system crashes due to memory leaks" },
    { "name": "memory leak",  "type": "issue",     "supporting_text": "memory leaks occur when objects are not released" },
    { "name": "object",       "type": "component", "supporting_text": "objects are not released" }
  ],
  "relations": [
    {
      "source": "memory leak",
      "target": "system crash",
      "type": "causes",
      "confidence": 1.0,
      "supporting_text": "System crashes due to memory leaks."
    },
    {
      "source": "object",
      "target": "memory leak",
      "type": "contributes to",
      "confidence": 0.9,
      "supporting_text": "Memory leaks occur when objects are not released."
    }
  ]
}