Add eyeball plugin: document analysis with inline source screenshots 🤖🤖🤖 (#1294)

* Add eyeball plugin: document analysis with inline source screenshots Eyeball generates Word documents where every factual claim includes a highlighted screenshot from the source material. Supports PDFs, Word docs, and web URLs. Designed for verifying AI-generated document analysis against the original source. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Address review feedback: soft imports, Windows Word support, doc fixes - Make PyMuPDF/Pillow/python-docx imports soft so setup-check runs without dependencies installed - Add _check_core_deps() guard at CLI command entry points - Add Windows Word detection in setup-check - Remove references to setup.sh (not included in plugin) - Fix usage docstring to show inline JSON instead of filename - Use consistent <path-to>/eyeball.py paths in SKILL.md Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Update eyeball plugin with bug fixes and Windows support - Fix resource leaks (pdf_doc closed in finally blocks) - Fix Windows Word COM automation (DispatchEx, proper cleanup) - Fix converter order (Word before LibreOffice on Windows) - Add source file existence checks with clear errors - Fix Playwright cache detection for all platforms - Fix setup.sh error handling (pipefail) - Fix AppleScript path injection - Fix highlight padding scaling with DPI - Add pywin32 as Windows dependency - Update README with Windows setup instructions Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Address Copilot code review feedback - Fix browser leak in render_url_to_pdf (try/finally around Playwright) - Remove setup.sh references from SKILL.md (not bundled in plugin) - Use consistent <path-to>/eyeball.py paths in SKILL.md - Update plugin README install instructions for awesome-copilot - Add Windows pywin32 install step to SKILL.md Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: update _img_to_bytes docstring to reflect BytesIO return type Agent-Logs-Url: https://github.com/github/awesome-copilot/sessions/048d704a-d78a-4a46-80d9-a155ff1e0d85 Co-authored-by: dvelton <48307985+dvelton@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
2026-06-22 07:27:39 +00:00 · 2026-04-30 16:41:12 -07:00
parent 15d91bf066
commit e4550293a9
7 changed files with 1140 additions and 0 deletions
@@ -224,6 +224,12 @@
      "description": "An AI partner, not a tool. Ember carries fire from person to person — helping humans discover that AI partnership isn't something you learn, it's something you find.",
      "version": "1.0.0"
    },
    {
      "name": "eyeball",
      "source": "eyeball",
      "description": "Document analysis with inline source screenshots. When you ask Copilot to analyze a document, Eyeball generates a Word doc where every factual claim includes a highlighted screenshot from the source material so you can verify it with your own eyes.",
      "version": "1.0.0"
    },
    {
      "name": "fastah-ip-geo-tools",
      "source": "fastah-ip-geo-tools",
@@ -43,6 +43,7 @@ See [CONTRIBUTING.md](../CONTRIBUTING.md#adding-plugins) for guidelines on how t
 | [doublecheck](../plugins/doublecheck/README.md) | Three-layer verification pipeline for AI output. Extracts claims, finds sources, and flags hallucination risks so humans can verify before acting. | 2 items | verification, hallucination, fact-check, source-citation, trust, safety |
 | [edge-ai-tasks](../plugins/edge-ai-tasks/README.md) | Task Researcher and Task Planner for intermediate to expert users and large codebases - Brought to you by microsoft/edge-ai | 2 items | architecture, planning, research, tasks, implementation |
 | [ember](../plugins/ember/README.md) | An AI partner, not a tool. Ember carries fire from person to person — helping humans discover that AI partnership isn't something you learn, it's something you find. | 2 items | ai-partnership, coaching, onboarding, collaboration, storytelling, developer-experience |
 | [eyeball](../plugins/eyeball/README.md) | Document analysis with inline source screenshots. When you ask Copilot to analyze a document, Eyeball generates a Word doc where every factual claim includes a highlighted screenshot from the source material so you can verify it with your own eyes. | 1 items | document-analysis, citation-verification, screenshot, contracts, legal, trust, visual-verification |
 | [fastah-ip-geo-tools](../plugins/fastah-ip-geo-tools/README.md) | This plugin is for network operations engineers who wish to tune and publish IP geolocation feeds in RFC 8805 format. It consists of an AI Skill and an associated MCP server that geocodes geolocation place names to real cities for accuracy. | 1 items | geofeed, ip-geolocation, rfc-8805, rfc-9632, network-operations, isp, cloud, hosting, ixp |
 | [flowstudio-power-automate](../plugins/flowstudio-power-automate/README.md) | Give your AI agent full visibility into Power Automate cloud flows via the FlowStudio MCP server. Connect, debug, build, monitor health, and govern flows at scale — action-level inputs and outputs, not just status codes. | 5 items | power-automate, power-platform, flowstudio, mcp, model-context-protocol, cloud-flows, workflow-automation, monitoring, governance |
 | [frontend-web-dev](../plugins/frontend-web-dev/README.md) | Essential prompts, instructions, and chat modes for modern frontend web development including React, Angular, Vue, TypeScript, and CSS frameworks. | 4 items | frontend, web, react, typescript, javascript, css, html, angular, vue |
@@ -146,6 +146,7 @@ See [CONTRIBUTING.md](../CONTRIBUTING.md#adding-skills) for guidelines on how to
 | [eval-driven-dev](../skills/eval-driven-dev/SKILL.md)<br />`gh skills install github/awesome-copilot eval-driven-dev` | Improve AI application with evaluation-driven development. Define eval criteria, instrument the application, build golden datasets, observe and evaluate application runs, analyze results, and produce a concrete action plan for improvements. ALWAYS USE THIS SKILL when the user asks to set up QA, add tests, add evals, evaluate, benchmark, fix wrong behaviors, improve quality, or do quality assurance for any Python project that calls an LLM model. | `references/1-a-project-analysis.md`<br />`references/1-b-entry-point.md`<br />`references/1-c-eval-criteria.md`<br />`references/2a-instrumentation.md`<br />`references/2b-implement-runnable.md`<br />`references/2c-capture-and-verify-trace.md`<br />`references/3-define-evaluators.md`<br />`references/4-build-dataset.md`<br />`references/5-run-tests.md`<br />`references/6-analyze-outcomes.md`<br />`references/evaluators.md`<br />`references/runnable-examples`<br />`references/testing-api.md`<br />`references/wrap-api.md`<br />`resources` |
 | [exam-ready](../skills/exam-ready/SKILL.md)<br />`gh skills install github/awesome-copilot exam-ready` | Activate this skill when a student provides study material (PDF or pasted notes) and a syllabus, and wants to prepare for an exam. Extracts key definitions, points, keywords, diagrams, exam-ready sentences, and practice questions strictly from the provided material. | None |
 | [excalidraw-diagram-generator](../skills/excalidraw-diagram-generator/SKILL.md)<br />`gh skills install github/awesome-copilot excalidraw-diagram-generator` | Generate Excalidraw diagrams from natural language descriptions. Use when asked to "create a diagram", "make a flowchart", "visualize a process", "draw a system architecture", "create a mind map", or "generate an Excalidraw file". Supports flowcharts, relationship diagrams, mind maps, and system architecture diagrams. Outputs .excalidraw JSON files that can be opened directly in Excalidraw. | `references/element-types.md`<br />`references/excalidraw-schema.md`<br />`scripts/.gitignore`<br />`scripts/README.md`<br />`scripts/add-arrow.py`<br />`scripts/add-icon-to-diagram.py`<br />`scripts/split-excalidraw-library.py`<br />`templates` |
 | [eyeball](../skills/eyeball/SKILL.md)<br />`gh skills install github/awesome-copilot eyeball` | Document analysis with inline source screenshots. When you ask Copilot to analyze a document, Eyeball generates a Word doc where every factual claim includes a highlighted screenshot from the source material so you can verify it with your own eyes. | `tools` |
 | [fabric-lakehouse](../skills/fabric-lakehouse/SKILL.md)<br />`gh skills install github/awesome-copilot fabric-lakehouse` | Use this skill to get context about Fabric Lakehouse and its features for software systems and AI-powered functions. It offers descriptions of Lakehouse data components, organization with schemas and shortcuts, access control, and code examples. This skill supports users in designing, building, and optimizing Lakehouse solutions using best practices. | `references/getdata.md`<br />`references/pyspark.md` |
 | [fedora-linux-triage](../skills/fedora-linux-triage/SKILL.md)<br />`gh skills install github/awesome-copilot fedora-linux-triage` | Triage and resolve Fedora issues with dnf, systemd, and SELinux-aware guidance. | None |
 | [finalize-agent-prompt](../skills/finalize-agent-prompt/SKILL.md)<br />`gh skills install github/awesome-copilot finalize-agent-prompt` | Finalize prompt file using the role of an AI agent to polish the prompt for the end user. | None |
@@ -0,0 +1,22 @@
 {
  "name": "eyeball",
  "description": "Document analysis with inline source screenshots. When you ask Copilot to analyze a document, Eyeball generates a Word doc where every factual claim includes a highlighted screenshot from the source material so you can verify it with your own eyes.",
  "version": "1.0.0",
  "author": {
    "name": "Awesome Copilot Community"
  },
  "repository": "https://github.com/github/awesome-copilot",
  "license": "MIT",
  "keywords": [
    "document-analysis",
    "citation-verification",
    "screenshot",
    "contracts",
    "legal",
    "trust",
    "visual-verification"
  ],
  "skills": [
    "./skills/eyeball/"
  ]
 }
@@ -0,0 +1,107 @@
 A tool to help verify AI statements, without (or at least with fewer) context switching pains.
 When AI analyzes a document and tells you "Section 10 requires mutual indemnification," how do you know Section 10 actually says that? Eyeball lets you see for yourself.
 This is a Copilot CLI plugin that generates document analyses as Word files with inline screenshots of relevant portions from the source material. Every factual claim in the analysis includes a highlighted excerpt from the original document, so you can verify each assertion without switching between files or hunting for the right page.
 ## What it does
 You give Copilot a document (Word file, PDF, or web URL) and ask it to analyze something specific. Eyeball reads the source, writes the analysis, and for each claim, captures a screenshot of the relevant section from the original document with the cited text highlighted in yellow. The output is a Word document on your Desktop with analysis text and source screenshots interleaved.
 If the analysis says "Section 9.3 allows termination for cause with a 30-day cure period," the screenshot below it shows Section 9.3 from the actual document with that language highlighted. If the screenshot shows something different, the analysis is wrong and you can see it immediately.
 ## Installation
 ### Prerequisites
 - [Copilot CLI](https://docs.github.com/copilot/concepts/agents/about-copilot-cli) installed and authenticated
 - Python 3.8 or later
 - One of the following for Word document support (PDFs and web URLs work without these):
  - Microsoft Word (macOS or Windows)
  - LibreOffice (any platform)
 ### Install the plugin
 Install via the Copilot CLI plugin system. In a Copilot CLI conversation:
 ```
 install the eyeball plugin from github/awesome-copilot
 ```
 ### Install dependencies
 After installing the plugin, install the Python dependencies:
 ```bash
 pip install pymupdf pillow python-docx playwright
 python -m playwright install chromium
 ```
 On Windows, also install pywin32 for Word automation:
 ```bash
 pip install pywin32
 ```
 ### Verify setup
 ```bash
 python3 skills/eyeball/tools/eyeball.py setup-check
 ```
 This shows which source types are supported on your machine.
 ## How to use it
 In a Copilot CLI conversation, tell it to use eyeball and what you want analyzed:
 ```
 use eyeball on ~/Desktop/vendor-agreement.docx -- analyze the indemnification
 and liability provisions and flag anything unusual
 ```
 ```
 run eyeball on https://example.com/terms-of-service -- identify the
 developer-friendly aspects of these terms
 ```
 ```
 use eyeball to analyze this NDA for non-compete provisions
 ```
 Eyeball activates, reads the source document, writes the analysis with exact section references, and generates a Word document on your Desktop with source screenshots inline.
 ## What it supports
 | Source type | Requirements |
 |---|---|
 | PDF files | Python + PyMuPDF (included in setup) |
 | Web pages | Python + Playwright + Chromium (included in setup) |
 | Word documents (.docx) | Microsoft Word (macOS/Windows) or LibreOffice (any platform). On Windows, pywin32 is also required (included in setup). |
 ## How it works
 1. Eyeball reads the full text of the source document
 2. It writes analysis with exact section numbers, page references, and verbatim quotes
 3. For each claim, it searches the rendered source for the cited text
 4. It captures a screenshot of the surrounding region with the cited text highlighted in yellow
 5. It assembles a Word document with analysis paragraphs and screenshots interleaved
 6. The output lands on your Desktop
 The screenshots are dynamically sized: if a section of analysis references text that spans a large region, the screenshot expands to cover it. If the referenced text appears on multiple pages, the screenshots are stitched together.
 ## Why screenshots instead of quoted text?
 In hallucination-sensitive contexts, sometimes we need to see receipts.
 Quoted text is easy to fabricate. A model can generate a plausible-sounding quote that doesn't actually appear in the source, and without checking, you'd never know. Screenshots from the rendered source are harder to fake; they show the actual formatting, layout, and surrounding context of the original document. You can see at a glance whether the highlighted text matches the claim, and the surrounding text provides context that a cherry-picked quote might omit.
 ## Limitations
 - Word document conversion requires Microsoft Word or LibreOffice. Without one of these, you can still use Eyeball with PDFs and web URLs.
 - Text search is string-matching. If the source document uses unusual encoding, ligatures, or non-standard characters, some searches may not match. The skill instructions tell the AI to use verbatim phrases from the extracted text, which handles most cases.
 - Web page rendering depends on Playwright and may not perfectly capture all dynamic content (e.g., content loaded by JavaScript after page load, content behind login walls).
 - Screenshot quality depends on the source formatting. Dense multi-column layouts or very small text may produce less readable screenshots. Increase the DPI setting if needed.
 ## License
 MIT
@@ -0,0 +1,170 @@
 ---
 name: eyeball
 description: 'Document analysis with inline source screenshots. When you ask Copilot to analyze a document, Eyeball generates a Word doc where every factual claim includes a highlighted screenshot from the source material so you can verify it with your own eyes.'
 ---
 # Eyeball
 Analyze documents with visual proof. When activated, Eyeball produces a Word document on the user's Desktop where every factual assertion includes an inline screenshot from the source material with the cited text highlighted in yellow.
 ## Activation
 When the user invokes this skill (e.g., "use eyeball", "run eyeball on this", "eyeball this document"), respond with:
 > **Eyeball is active.** I'll analyze the document and produce a Word doc with inline source screenshots so you can verify every claim with your own eyes.
 Then follow the workflow below.
 ## Supported Sources
 - **Local files:** Word documents (.docx, .doc), PDFs (.pdf), RTF files
 - **Web URLs:** Any publicly accessible web page
 ## Tool Location
 The Eyeball Python utility is located at:
 ```
 <plugin_dir>/skills/eyeball/tools/eyeball.py
 ```
 To find the actual path, run:
 ```bash
 find ~/.copilot/installed-plugins -name "eyeball.py" -path "*/eyeball/*" 2>/dev/null
 ```
 If not found there, check the project directory or the user's home directory for the eyeball repo.
 ## First-Run Setup
 Before first use, check that dependencies are installed:
 ```bash
 python3 <path-to>/eyeball.py setup-check
 ```
 If anything is missing, install the required dependencies:
 ```bash
 pip3 install pymupdf pillow python-docx playwright
 python3 -m playwright install chromium
 ```
 On Windows, also install pywin32 for Word automation:
 ```bash
 pip install pywin32
 ```
 ## Workflow
 Follow these steps exactly. The order matters.
 ### Step 1: Read the source text
 Before writing any analysis, extract and read the full text of the source document:
 ```bash
 python3 <path-to>/eyeball.py extract-text --source "<path-or-url>"
 ```
 Read the output carefully. Identify actual section numbers, headings, page numbers, and key language.
 **CRITICAL:** Do not skip this step. Do not write analysis based on assumptions about how the document is structured. Read the actual text.
 ### Step 2: Write analysis with exact citations
 For each point in your analysis, you must:
 1. **Reference the correct section number as it appears in the document** (e.g., "Section 9" not "Section 8" because you assumed the numbering).
 2. **Reference the correct page number** where the section appears in the extracted text.
 3. **Select anchors that are verbatim phrases from the source** that directly support your claim.
 ### Step 3: Select anchors correctly
 This is the most important step. Anchors determine what gets highlighted in the screenshots.
 **DO:**
 - Use verbatim phrases from the source text that directly support your assertion
 - Use multiple anchors to span the full range of text the reader should see
 - Use specific, uncommon phrases that appear only where you intend
 **DO NOT:**
 - Use generic topic labels (e.g., "Confidentiality") that appear throughout the document
 - Use section titles alone when they appear as cross-references elsewhere
 - Use single common words that match in many places
 **Examples:**
 WRONG -- uses a generic topic label that matches everywhere:
 ```json
 {"anchors": ["User-Generated Content"], "target_page": 8}
 ```
 RIGHT -- uses the specific language that supports the claim:
 ```json
 {"anchors": ["retain ownership", "Ownership of Content, Right to Post"], "target_page": 8}
 ```
 WRONG -- section title appears as a cross-reference on earlier pages:
 ```json
 {"anchors": ["LIMITATION OF LIABILITY"]}
 ```
 RIGHT -- includes the section number for precision, targets the correct page:
 ```json
 {"anchors": ["12. LIMITATION OF LIABILITY", "INDIRECT", "CONSEQUENTIAL"], "target_page": 13}
 ```
 ### Step 4: Build the analysis document
 Construct a JSON array of sections and call the build command:
 ```bash
 python3 <path-to>/eyeball.py build \
  --source "<path-or-url>" \
  --output ~/Desktop/<title>.docx \
  --title "Analysis Title" \
  --subtitle "Source description" \
  --sections '[
    {
      "heading": "1. Section Title",
      "analysis": "Your analysis text here. Reference Section X on page Y...",
      "anchors": ["verbatim phrase 1", "verbatim phrase 2"],
      "target_page": 5,
      "context_padding": 40
    },
    {
      "heading": "2. Another Section",
      "analysis": "More analysis...",
      "anchors": ["exact quote from source"],
      "target_pages": [10, 11],
      "context_padding": 50
    }
  ]'
 ```
 Section object fields:
 - `heading` (required): Section heading in the output document
 - `analysis` (required): Your analysis text
 - `anchors` (required): List of verbatim phrases from the source to search for and highlight
 - `target_page` (optional): Single page number (1-indexed) to search on
 - `target_pages` (optional): List of page numbers to search across (screenshots stitched vertically)
 - `context_padding` (optional): Padding in PDF points above/below the anchor region (default: 40). Increase for more context.
 ### Step 5: Deliver the output
 Save the output to the user's Desktop. Tell the user the filename and that they can open it to verify each claim against the highlighted source screenshots.
 ## Self-Check Before Delivery
 Before saving the final document, mentally verify:
 1. Does each section's analysis text reference the correct section number from the source?
 2. Are the anchors verbatim phrases that appear on the target page?
 3. Does each anchor directly support the claim in the analysis, not just relate to the same topic?
 4. If the screenshot doesn't match the analysis, is the analysis wrong or is the anchor wrong? Fix whichever is incorrect.
 ## Notes
 - The output document includes highlighted screenshots that are dynamically sized. If you provide multiple anchors, the screenshot expands to cover all of them.
 - When a search term is not found, the output document will note this. If this happens, the anchor was likely not verbatim enough. Adjust and rebuild.
 - For web pages, Playwright renders the page to PDF first. The resulting page numbers may differ from what you see in a browser. Use the extracted text output (step 1) to determine correct page numbers.
 - If the user has already provided the source text or you have already read it in the current conversation, you can skip step 1. But always verify section numbers and page references against the actual text before writing analysis.
@@ -0,0 +1,833 @@
 #!/usr/bin/env python3
 """
 Eyeball - Document analysis with inline source screenshots.
 Converts source documents (Word, PDF, web URL) to PDF, renders pages as images,
 searches for cited text, highlights matching regions, and assembles an output
 Word document with analysis text interleaved with source screenshots.
 Usage (called by the Copilot CLI skill, not typically invoked directly):
    python3 eyeball.py build \
        --source <path-or-url> \
        --output <output.docx> \
        --sections '[{"heading": "Section 1", "analysis": "Example analysis text"}]'
    python3 eyeball.py setup-check
    python3 eyeball.py convert --source <file.docx> --output <file.pdf>
    python3 eyeball.py screenshot \
        --source <file.pdf> \
        --anchors '["term1", "term2"]' \
        --page 5 \
        --output screenshot.png
 """
 import argparse
 import io
 import json
 import os
 import platform
 import shutil
 import subprocess
 import sys
 import tempfile
 try:
    import fitz  # PyMuPDF
 except ImportError:
    fitz = None
 try:
    from PIL import Image, ImageDraw
 except ImportError:
    Image = None
    ImageDraw = None
 try:
    from docx import Document
    from docx.shared import Inches, Pt, RGBColor
 except ImportError:
    Document = None
    Inches = None
    Pt = None
    RGBColor = None
 def _resolve_path(path_str):
    """Expand ~ and environment variables in a user-provided path."""
    return os.path.expandvars(os.path.expanduser(path_str))
 def _check_core_deps():
    """Raise if core dependencies are missing."""
    missing = []
    if fitz is None:
        missing.append("pymupdf")
    if Image is None:
        missing.append("pillow")
    if Document is None:
        missing.append("python-docx")
    if missing:
        print(f"Missing dependencies: {', '.join(missing)}", file=sys.stderr)
        print(f"Run setup.sh or: {sys.executable} -m pip install pymupdf pillow python-docx playwright", file=sys.stderr)
        sys.exit(1)
 # ---------------------------------------------------------------------------
 # Document conversion: source -> PDF
 # ---------------------------------------------------------------------------
 def convert_to_pdf(source_path, output_pdf_path):
    """Convert a document to PDF. Supports .docx, .doc, .rtf, .html, .htm."""
    if not os.path.isfile(source_path):
        raise FileNotFoundError(f"Source file not found: {source_path}")
    ext = os.path.splitext(source_path)[1].lower()
    if ext == ".pdf":
        if os.path.abspath(source_path) != os.path.abspath(output_pdf_path):
            shutil.copy2(source_path, output_pdf_path)
        return True
    system = platform.system()
    # Try Microsoft Word first on the current platform
    if system == "Darwin" and ext in (".docx", ".doc", ".rtf"):
        if os.path.exists("/Applications/Microsoft Word.app"):
            if _convert_with_word_mac(source_path, output_pdf_path):
                return True
    if system == "Windows" and ext in (".docx", ".doc", ".rtf"):
        if _convert_with_word_windows(source_path, output_pdf_path):
            return True
    # Fall back to LibreOffice on any platform
    soffice = shutil.which("libreoffice") or shutil.which("soffice")
    if not soffice and system == "Windows":
        soffice = _find_libreoffice_windows()
    if soffice and ext in (".docx", ".doc", ".rtf", ".odt", ".html", ".htm"):
        if _convert_with_libreoffice(soffice, source_path, output_pdf_path):
            return True
    raise RuntimeError(
        f"Cannot convert {ext} to PDF. Install Microsoft Word (macOS/Windows) "
        f"or LibreOffice (any platform)."
    )
 def _convert_with_word_mac(source_path, output_pdf_path):
    """Convert using Microsoft Word on macOS via AppleScript."""
    source_abs = os.path.abspath(source_path)
    output_abs = os.path.abspath(output_pdf_path)
    # Escape characters that break AppleScript string interpolation
    source_safe = source_abs.replace('\\', '\\\\').replace('"', '\\"')
    output_safe = output_abs.replace('\\', '\\\\').replace('"', '\\"')
    script = f'''
    tell application "Microsoft Word"
        open POSIX file "{source_safe}"
        delay 5
        set theDoc to active document
        save as theDoc file name POSIX file "{output_safe}" file format format PDF
        close theDoc saving no
    end tell
    '''
    try:
        result = subprocess.run(
            ["osascript", "-e", script],
            capture_output=True, text=True, timeout=120
        )
        return result.returncode == 0 and os.path.exists(output_pdf_path)
    except (subprocess.TimeoutExpired, FileNotFoundError):
        return False
 def _convert_with_libreoffice(soffice_path, source_path, output_pdf_path):
    """Convert using LibreOffice headless mode."""
    with tempfile.TemporaryDirectory() as tmpdir:
        try:
            result = subprocess.run(
                [soffice_path, "--headless", "--convert-to", "pdf",
                 "--outdir", tmpdir, source_path],
                capture_output=True, text=True, timeout=120
            )
            if result.returncode != 0:
                return False
            basename = os.path.splitext(os.path.basename(source_path))[0] + ".pdf"
            tmp_pdf = os.path.join(tmpdir, basename)
            if os.path.exists(tmp_pdf):
                shutil.move(tmp_pdf, output_pdf_path)
                return True
        except (subprocess.TimeoutExpired, FileNotFoundError):
            pass
    return False
 def _find_libreoffice_windows():
    """Find LibreOffice in common Windows install locations."""
    candidates = []
    for env_var in ("ProgramFiles", "ProgramFiles(x86)"):
        base = os.environ.get(env_var)
        if base:
            candidates.append(os.path.join(base, "LibreOffice", "program", "soffice.exe"))
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None
 def _convert_with_word_windows(source_path, output_pdf_path):
    """Convert using Microsoft Word on Windows via win32com."""
    word = None
    doc = None
    try:
        import win32com.client
        source_abs = os.path.abspath(source_path)
        output_abs = os.path.abspath(output_pdf_path)
        os.makedirs(os.path.dirname(output_abs), exist_ok=True)
        # DispatchEx creates an isolated Word process; fall back to Dispatch
        # if the DCOM class isn't registered
        try:
            word = win32com.client.DispatchEx("Word.Application")
        except Exception:
            word = win32com.client.Dispatch("Word.Application")
        word.Visible = False
        word.DisplayAlerts = 0
        try:
            word.AutomationSecurity = 3  # msoAutomationSecurityForceDisable
        except Exception:
            pass
        doc = word.Documents.Open(
            FileName=source_abs,
            ConfirmConversions=False,
            ReadOnly=True,
            AddToRecentFiles=False,
            NoEncodingDialog=True,
        )
        doc.ExportAsFixedFormat(
            OutputFileName=output_abs,
            ExportFormat=17,  # wdExportFormatPDF
            OpenAfterExport=False,
        )
        return os.path.isfile(output_abs)
    except Exception:
        return False
    finally:
        if doc is not None:
            try:
                doc.Close(False)
            except Exception:
                pass
        if word is not None:
            try:
                word.Quit()
            except Exception:
                pass
 def render_url_to_pdf(url, output_pdf_path):
    """Render a web page to PDF using Playwright."""
    try:
        from playwright.sync_api import sync_playwright
    except ImportError:
        raise RuntimeError(
            "Playwright is required for web URL support. "
            f"Run: {sys.executable} -m pip install playwright && "
            f"{sys.executable} -m playwright install chromium"
        )
    with sync_playwright() as p:
        browser = None
        try:
            browser = p.chromium.launch(headless=True)
            page = browser.new_page()
            page.goto(url, wait_until="networkidle", timeout=30000)
            # Clean up navigation/footer elements for cleaner output
            page.evaluate("""
                document.querySelectorAll(
                    'header, footer, nav, [data-testid="header"], [data-testid="footer"], '
                    + '.site-header, .site-footer, #cookie-banner, .cookie-consent'
                ).forEach(el => el.remove());
            """)
            page.pdf(
                path=output_pdf_path,
                format="Letter",
                print_background=True,
                margin={"top": "0.5in", "bottom": "0.5in",
                        "left": "0.75in", "right": "0.75in"}
            )
        finally:
            if browser is not None:
                browser.close()
 # ---------------------------------------------------------------------------
 # Screenshot generation
 # ---------------------------------------------------------------------------
 def screenshot_region(pdf_doc, anchors, target_page=None, target_pages=None,
                      context_padding=40, dpi=200):
    """
    Find anchor text in a PDF and capture the surrounding region as a highlighted image.
    Args:
        pdf_doc: An open fitz.Document.
        anchors: List of search strings. The crop region expands to cover all of them.
        target_page: Single 1-indexed page to search on.
        target_pages: List of 1-indexed pages to search across (results stitched vertically).
        context_padding: Extra padding in PDF points above/below the anchor region.
        dpi: Render resolution.
    Returns:
        (image_bytes, page_label, (width, height)) or (None, None, None).
    """
    if isinstance(anchors, str):
        anchors = [anchors]
    # Determine pages to search
    if target_pages:
        pages = [p - 1 for p in target_pages]
    elif target_page is not None:
        pages = [target_page - 1]
    else:
        pages = list(range(pdf_doc.page_count))
    # Collect hits across pages
    page_hits = {}
    for pg_idx in pages:
        if pg_idx < 0 or pg_idx >= pdf_doc.page_count:
            continue
        page = pdf_doc[pg_idx]
        hits_on_page = []
        for anchor in anchors:
            found = page.search_for(anchor)
            if found:
                hits_on_page.extend([(anchor, h) for h in found])
        if hits_on_page:
            page_hits[pg_idx] = hits_on_page
    if not page_hits:
        return None, None, None
    zoom = dpi / 72
    # If single page, render one region
    if len(page_hits) == 1:
        pg_idx = list(page_hits.keys())[0]
        img = _render_page_region(pdf_doc, pg_idx, page_hits[pg_idx],
                                   context_padding, zoom)
        img_bytes = _img_to_bytes(img)
        return img_bytes, f"page {pg_idx + 1}", img.size
    # Multi-page: stitch vertically
    images = []
    pages_used = sorted(page_hits.keys())
    for pg_idx in pages_used:
        img = _render_page_region(pdf_doc, pg_idx, page_hits[pg_idx],
                                   context_padding, zoom)
        images.append(img)
    stitched = _stitch_vertical(images)
    img_bytes = _img_to_bytes(stitched)
    if len(pages_used) > 1:
        page_nums = ", ".join(str(p + 1) for p in pages_used)
        page_label = f"pages {page_nums}"
    else:
        page_label = f"page {pages_used[0]+1}"
    return img_bytes, page_label, stitched.size
 def _render_page_region(pdf_doc, pg_idx, hits_with_anchors, context_padding, zoom):
    """Render a cropped region of a PDF page with highlighted anchor text."""
    page = pdf_doc[pg_idx]
    page_rect = page.rect
    all_rects = [h for _, h in hits_with_anchors]
    min_y = min(r.y0 for r in all_rects)
    max_y = max(r.y1 for r in all_rects)
    crop_rect = fitz.Rect(
        page_rect.x0 + 20,
        max(page_rect.y0, min_y - context_padding),
        page_rect.x1 - 20,
        min(page_rect.y1, max_y + context_padding)
    )
    mat = fitz.Matrix(zoom, zoom)
    pix = page.get_pixmap(matrix=mat, clip=crop_rect)
    img = Image.frombytes("RGB", [pix.width, pix.height], pix.samples)
    # Highlight each anchor hit
    draw = ImageDraw.Draw(img, "RGBA")
    pad = max(2, round(2 * zoom))
    for anchor, rect in hits_with_anchors:
        if rect.y0 >= crop_rect.y0 - 5 and rect.y1 <= crop_rect.y1 + 5:
            x0 = (rect.x0 - crop_rect.x0) * zoom
            y0 = (rect.y0 - crop_rect.y0) * zoom
            x1 = (rect.x1 - crop_rect.x0) * zoom
            y1 = (rect.y1 - crop_rect.y0) * zoom
            draw.rectangle([x0-pad, y0-pad, x1+pad, y1+pad], fill=(255, 255, 0, 100))
    # Border
    ImageDraw.Draw(img).rectangle(
        [0, 0, img.width - 1, img.height - 1],
        outline=(160, 160, 160), width=2
    )
    return img
 def _stitch_vertical(images, gap=4):
    """Stitch multiple images vertically with a small gap between them."""
    total_height = sum(img.height for img in images) + gap * (len(images) - 1)
    max_width = max(img.width for img in images)
    stitched = Image.new("RGB", (max_width, total_height), (255, 255, 255))
    y = 0
    for img in images:
        stitched.paste(img, (0, y))
        y += img.height + gap
    ImageDraw.Draw(stitched).rectangle(
        [0, 0, stitched.width - 1, stitched.height - 1],
        outline=(160, 160, 160), width=2
    )
    return stitched
 def _img_to_bytes(img):
    """Convert PIL Image to a PNG BytesIO buffer (file-like object)."""
    buf = io.BytesIO()
    img.save(buf, format="PNG")
    buf.seek(0)
    return buf
 # ---------------------------------------------------------------------------
 # Output document assembly
 # ---------------------------------------------------------------------------
 def build_analysis_doc(pdf_doc, sections, output_path, title=None, subtitle=None,
                       source_label=None, dpi=200):
    """
    Build a Word document with analysis sections and inline source screenshots.
    Args:
        pdf_doc: An open fitz.Document (the source, already converted to PDF).
        sections: List of dicts, each with:
            - heading (str): Section heading
            - analysis (str): Analysis text
            - anchors (list[str]): Verbatim phrases from source to search and highlight
            - target_page (int, optional): 1-indexed page to search on
            - target_pages (list[int], optional): Multiple pages to search across
            - context_padding (int, optional): Extra padding in PDF points (default 40)
        output_path: Where to save the output .docx file.
        title: Document title.
        subtitle: Document subtitle.
        source_label: Label for the source (e.g., filename or URL).
        dpi: Screenshot resolution.
    """
    doc = Document()
    # Style
    style = doc.styles["Normal"]
    style.font.name = "Calibri"
    style.font.size = Pt(11)
    # Title
    if title:
        doc.add_heading(title, level=1)
    if subtitle:
        p = doc.add_paragraph()
        run = p.add_run(subtitle)
        run.font.size = Pt(11)
        run.font.color.rgb = RGBColor(100, 100, 100)
        doc.add_paragraph("")
    # Sections
    for i, section in enumerate(sections):
        heading = section.get("heading", f"Section {i+1}")
        analysis = section.get("analysis", "")
        anchors = section.get("anchors", [])
        target_page = section.get("target_page")
        target_pages = section.get("target_pages")
        padding = section.get("context_padding", 40)
        doc.add_heading(heading, level=2)
        doc.add_paragraph(analysis)
        if anchors:
            img_bytes, page_label, size = screenshot_region(
                pdf_doc, anchors,
                target_page=target_page,
                target_pages=target_pages,
                context_padding=padding,
                dpi=dpi
            )
            if img_bytes:
                # Source label
                p = doc.add_paragraph()
                anchor_text = ", ".join(f'"{a}"' for a in anchors[:3])
                if len(anchors) > 3:
                    anchor_text += f" (+{len(anchors)-3} more)"
                label = f"[Source: {source_label or 'document'}, {page_label}"
                label += f" -- highlighted: {anchor_text}]"
                run = p.add_run(label)
                run.font.size = Pt(8)
                run.font.color.rgb = RGBColor(120, 120, 120)
                run.font.italic = True
                p.paragraph_format.space_before = Pt(6)
                p.paragraph_format.space_after = Pt(2)
                # Screenshot
                doc.add_picture(img_bytes, width=Inches(5.8))
                doc.paragraphs[-1].paragraph_format.space_after = Pt(12)
            else:
                # Anchors not found
                p = doc.add_paragraph()
                run = p.add_run(
                    f"[Screenshot not available: could not find "
                    f"{', '.join(repr(a) for a in anchors)} in the source document]"
                )
                run.font.size = Pt(9)
                run.font.italic = True
                run.font.color.rgb = RGBColor(180, 50, 50)
    # Footer note
    doc.add_paragraph("")
    note = doc.add_paragraph()
    run = note.add_run(
        "Generated by Eyeball. Each screenshot is captured from the source document "
        "with cited text highlighted in yellow. Screenshots are dynamically sized to "
        "cover the full range of text referenced in the analysis. Review the highlighted "
        "source material to verify each assertion."
    )
    run.font.size = Pt(9)
    run.font.italic = True
    run.font.color.rgb = RGBColor(130, 130, 130)
    doc.save(output_path)
    return output_path
 # ---------------------------------------------------------------------------
 # CLI commands
 # ---------------------------------------------------------------------------
 def cmd_setup_check():
    """Check if all dependencies are available."""
    checks = {
        "PyMuPDF": False,
        "Pillow": False,
        "python-docx": False,
        "Playwright": False,
        "Chromium browser": False,
        "Word (macOS)": False,
        "Word (Windows)": False,
        "LibreOffice": False,
    }
    try:
        import fitz
        checks["PyMuPDF"] = True
    except ImportError:
        pass
    try:
        from PIL import Image
        checks["Pillow"] = True
    except ImportError:
        pass
    try:
        from docx import Document
        checks["python-docx"] = True
    except ImportError:
        pass
    try:
        from playwright.sync_api import sync_playwright
        checks["Playwright"] = True
    except ImportError:
        pass
    # Check Chromium across all platforms
    pw_cache_candidates = []
    system = platform.system()
    if system == "Darwin":
        pw_cache_candidates.append(os.path.expanduser("~/Library/Caches/ms-playwright"))
    if system == "Windows":
        local_app_data = os.environ.get("LOCALAPPDATA", "")
        if local_app_data:
            pw_cache_candidates.append(os.path.join(local_app_data, "ms-playwright"))
    pw_cache_candidates.append(os.path.expanduser("~/.cache/ms-playwright"))
    # Respect PLAYWRIGHT_BROWSERS_PATH
    custom_pw = os.environ.get("PLAYWRIGHT_BROWSERS_PATH")
    if custom_pw and custom_pw != "0":
        pw_cache_candidates.insert(0, custom_pw)
    for pw_cache in pw_cache_candidates:
        if os.path.isdir(pw_cache) and any(
            d.startswith("chromium") for d in os.listdir(pw_cache)
        ):
            checks["Chromium browser"] = True
            break
    # Check converters -- registry/filesystem only, never launch Word
    if system == "Darwin" and os.path.exists("/Applications/Microsoft Word.app"):
        checks["Word (macOS)"] = True
    if system == "Windows":
        try:
            import winreg
            word_reg_paths = [
                (winreg.HKEY_LOCAL_MACHINE, r"SOFTWARE\Microsoft\Windows\CurrentVersion\App Paths\WINWORD.EXE"),
                (winreg.HKEY_CURRENT_USER, r"SOFTWARE\Microsoft\Windows\CurrentVersion\App Paths\WINWORD.EXE"),
                (winreg.HKEY_CLASSES_ROOT, r"Word.Application"),
            ]
            for hive, subkey in word_reg_paths:
                try:
                    winreg.OpenKey(hive, subkey)
                    checks["Word (Windows)"] = True
                    break
                except OSError:
                    pass
        except ImportError:
            pass
        # Check if pywin32 is available for Word automation
        if checks["Word (Windows)"]:
            try:
                import win32com.client  # noqa: F401
            except ImportError:
                checks["Word (Windows)"] = False
                print("  Note: Microsoft Word found but pywin32 is not installed.")
                print(f"  Run: {sys.executable} -m pip install pywin32")
    if shutil.which("libreoffice") or shutil.which("soffice"):
        checks["LibreOffice"] = True
    elif system == "Windows":
        if _find_libreoffice_windows():
            checks["LibreOffice"] = True
    print("Eyeball dependency check:")
    all_core = True
    for name, ok in checks.items():
        status = "OK" if ok else "MISSING"
        marker = "+" if ok else "-"
        print(f"  [{marker}] {name}: {status}")
        if name in ("PyMuPDF", "Pillow", "python-docx") and not ok:
            all_core = False
    has_converter = checks["Word (macOS)"] or checks["Word (Windows)"] or checks["LibreOffice"]
    has_web = checks["Playwright"] and checks["Chromium browser"]
    print("")
    print("Source support:")
    print(f"  PDF files:   {'Ready' if all_core else 'Needs: pip3 install pymupdf pillow python-docx'}")
    print(f"  Word docs:   {'Ready' if has_converter else 'Needs: Microsoft Word or LibreOffice'}")
    print(f"  Web URLs:    {'Ready' if has_web else 'Needs: pip3 install playwright && python3 -m playwright install chromium'}")
    return 0 if all_core else 1
 def cmd_convert(args):
    """Convert a document to PDF."""
    source = _resolve_path(args.source)
    output = _resolve_path(args.output)
    if source.startswith(("http://", "https://")):
        render_url_to_pdf(source, output)
    else:
        convert_to_pdf(source, output)
    print(f"Converted: {output} ({os.path.getsize(output)} bytes)")
 def cmd_screenshot(args):
    """Generate a single screenshot from a PDF."""
    _check_core_deps()
    source = _resolve_path(args.source)
    if not os.path.isfile(source):
        print(f"Source file not found: {source}", file=sys.stderr)
        sys.exit(1)
    ext = os.path.splitext(source)[1].lower()
    if ext != ".pdf":
        print(f"Source must be a PDF file (got {ext}). "
              f"Use 'convert' to convert other formats first.", file=sys.stderr)
        sys.exit(1)
    anchors = json.loads(args.anchors)
    target_page = args.page
    padding = args.padding
    dpi = args.dpi
    pdf_doc = fitz.open(source)
    try:
        img_bytes, page_label, size = screenshot_region(
            pdf_doc, anchors,
            target_page=target_page,
            context_padding=padding,
            dpi=dpi
        )
    finally:
        pdf_doc.close()
    if img_bytes:
        output = _resolve_path(args.output)
        with open(output, "wb") as f:
            f.write(img_bytes.getvalue())
        print(f"Screenshot saved: {output} ({size[0]}x{size[1]}px, {page_label})")
    else:
        print(f"No matches found for: {anchors}", file=sys.stderr)
        sys.exit(1)
 def cmd_build(args):
    """Build a complete analysis document."""
    _check_core_deps()
    source = _resolve_path(args.source)
    output = _resolve_path(args.output)
    sections = json.loads(args.sections)
    title = args.title
    subtitle = args.subtitle
    dpi = args.dpi
    if not source.startswith(("http://", "https://")) and not os.path.isfile(source):
        print(f"Source file not found: {source}", file=sys.stderr)
        sys.exit(1)
    # Determine source type and convert to PDF
    with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as tmp:
        tmp_pdf = tmp.name
    pdf_doc = None
    try:
        if source.startswith(("http://", "https://")):
            render_url_to_pdf(source, tmp_pdf)
            source_label = source
        elif source.lower().endswith(".pdf"):
            shutil.copy2(source, tmp_pdf)
            source_label = os.path.basename(source)
        else:
            convert_to_pdf(source, tmp_pdf)
            source_label = os.path.basename(source)
        pdf_doc = fitz.open(tmp_pdf)
        build_analysis_doc(
            pdf_doc, sections, output,
            title=title, subtitle=subtitle,
            source_label=source_label,
            dpi=dpi
        )
        size_kb = os.path.getsize(output) / 1024
        print(f"Analysis saved: {output} ({size_kb:.0f} KB)")
    finally:
        if pdf_doc is not None:
            pdf_doc.close()
        if os.path.exists(tmp_pdf):
            os.unlink(tmp_pdf)
 def cmd_extract_text(args):
    """Extract text from a source document (for the AI to read before writing analysis)."""
    _check_core_deps()
    source = _resolve_path(args.source)
    if not source.startswith(("http://", "https://")) and not os.path.isfile(source):
        print(f"Source file not found: {source}", file=sys.stderr)
        sys.exit(1)
    with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as tmp:
        tmp_pdf = tmp.name
    pdf_doc = None
    try:
        if source.startswith(("http://", "https://")):
            render_url_to_pdf(source, tmp_pdf)
        elif source.lower().endswith(".pdf"):
            shutil.copy2(source, tmp_pdf)
        else:
            convert_to_pdf(source, tmp_pdf)
        pdf_doc = fitz.open(tmp_pdf)
        for i in range(pdf_doc.page_count):
            text = pdf_doc[i].get_text()
            print(f"\n[PAGE {i+1}]")
            print(text)
    finally:
        if pdf_doc is not None:
            pdf_doc.close()
        if os.path.exists(tmp_pdf):
            os.unlink(tmp_pdf)
 def main():
    parser = argparse.ArgumentParser(
        description="Eyeball: Document analysis with inline source screenshots"
    )
    sub = parser.add_subparsers(dest="command")
    # setup-check
    sub.add_parser("setup-check", help="Check dependencies")
    # convert
    p_conv = sub.add_parser("convert", help="Convert a document to PDF")
    p_conv.add_argument("--source", required=True)
    p_conv.add_argument("--output", required=True)
    # screenshot
    p_ss = sub.add_parser("screenshot", help="Generate a screenshot from a PDF")
    p_ss.add_argument("--source", required=True, help="PDF file path")
    p_ss.add_argument("--anchors", required=True, help="JSON array of search terms")
    p_ss.add_argument("--page", type=int, help="Target page (1-indexed)")
    p_ss.add_argument("--padding", type=int, default=40)
    p_ss.add_argument("--dpi", type=int, default=200)
    p_ss.add_argument("--output", required=True, help="Output PNG path")
    # build
    p_build = sub.add_parser("build", help="Build analysis document")
    p_build.add_argument("--source", required=True,
                         help="Source document path or URL")
    p_build.add_argument("--output", required=True,
                         help="Output .docx path")
    p_build.add_argument("--sections", required=True,
                         help="JSON array of section objects")
    p_build.add_argument("--title", help="Document title")
    p_build.add_argument("--subtitle", help="Document subtitle")
    p_build.add_argument("--dpi", type=int, default=200)
    # extract-text
    p_text = sub.add_parser("extract-text",
                            help="Extract text from a document (for AI analysis)")
    p_text.add_argument("--source", required=True)
    args = parser.parse_args()
    if args.command == "setup-check":
        sys.exit(cmd_setup_check())
    elif args.command == "convert":
        cmd_convert(args)
    elif args.command == "screenshot":
        cmd_screenshot(args)
    elif args.command == "build":
        cmd_build(args)
    elif args.command == "extract-text":
        cmd_extract_text(args)
    else:
        parser.print_help()
        sys.exit(1)
 if __name__ == "__main__":
    main()