feat(skills): add IoT edge skills and align agent/instruction docs (#1431)

* feat(skills): add IoT edge skills and align agent/instruction docs
* fix(ci): handle fork permission errors in plugin structure check
* fix(ci): allow intentional Spanish vocabulary in codespell
* docs(skills): translate IoT edge skill content to English
* fix(ci): pass codespell and README validation
* chore: regenerate skills index after merge
2026-04-30 04:05:55 +00:00 · 2026-04-29 03:15:42 +02:00
parent bf9136726b
commit e2ae5cc559
15 changed files with 995 additions and 8 deletions
--- a/skills/arduino-azure-iot-edge-integration/SKILL.md
+++ b/skills/arduino-azure-iot-edge-integration/SKILL.md
@@ -0,0 +1,141 @@
+---
+name: arduino-azure-iot-edge-integration
+description: 'Design and implement Arduino integration with Azure IoT Hub and IoT Edge, including secure provisioning, resilient telemetry, command handling, and production guardrails.'
+---
+
+# Arduino Azure IoT Edge Integration
+
+Use this skill when the user needs to connect Arduino-class devices to Azure IoT, especially in edge-heavy scenarios (gateways, intermittent networks, offline buffering, and local actuation).
+
+## When to use it
+
+Use this skill for requests such as:
+
+- "I want to connect Arduino sensors to Azure"
+- "How do I send MQTT telemetry to IoT Hub?"
+- "I need an edge gateway for field devices"
+- "I want cloud-to-device commands and OTA configuration updates"
+
+## Mandatory documentation review
+
+Before recommending an IoT Edge topology or runtime behavior, review:
+
+- https://learn.microsoft.com/azure/iot-edge/
+
+If documentation cannot be consulted, proceed with explicit assumptions and highlight them in a dedicated section.
+
+## Official Arduino references and best practices (required)
+
+Before proposing firmware, wiring, or communication implementation details, consult official Arduino sources first:
+
+- https://www.arduino.cc/en/Guide
+- https://docs.arduino.cc/
+- https://docs.arduino.cc/language-reference/
+- references/arduino-official-best-practices.md
+
+When choosing between implementation alternatives, prioritize official Arduino guidance over community snippets unless there is a clear technical reason to deviate.
+
+## Objectives
+
+- Produce a secure end-to-end reference path from the Arduino device to cloud insights.
+- Handle unstable links (store-and-forward, retries, idempotency).
+- Define an actionable device and cloud backlog.
+
+## Integration patterns
+
+### Pattern A: Arduino direct to IoT Hub
+
+Use when connectivity is stable and cloud latency is acceptable.
+
+- Protocol: MQTT over TLS.
+- Identity: per-device credentials (SAS or X.509).
+- Telemetry payload: compact JSON with timestamp, device ID, metrics, and optional quality flags.
+
+### Pattern B: Arduino to local gateway, then IoT Edge
+
+Use when links are constrained, local control is required, or batching improves cost/reliability.
+
+- Arduino communicates with a local gateway (serial, BLE, local MQTT, RS-485, Modbus bridge).
+- The gateway publishes upstream through the IoT Edge runtime and routes data to IoT Hub.
+- Local modules can filter, aggregate, and trigger actions even during cloud outages.
+
+## Design flow
+
+### 1) Device contract
+
+Define:
+
+- Sensor catalog and units.
+- Sampling frequency and expected throughput.
+- Message schema versioning strategy.
+- Desired/reported device twin properties to control runtime behavior.
+
+### 2) Security baseline
+
+Require:
+
+- Unique identity per device.
+- No hardcoded secrets in source code or firmware artifacts.
+- Credential rotation strategy.
+- Signed firmware and a controlled update process when possible.
+
+### 3) Reliability and offline behavior
+
+Plan and document:
+
+- Backoff with jitter.
+- Local queue/buffer strategy with bounded size.
+- Duplicate suppression or downstream idempotent processing.
+- Fallback to last-known-good configuration.
+
+### 4) Cloud and edge routing
+
+Define routes for:
+
+- Raw telemetry to cold storage.
+- Curated telemetry to hot analytics.
+- Alerts to operations channels.
+- Commands and configuration back to edge/device.
+
+### 5) Observability
+
+Specify minimum operations telemetry:
+
+- Device heartbeat and firmware version.
+- Connectivity state transitions.
+- Message send success/error counters.
+- Gateway module health and restart reasons.
+
+## Reuse other skills
+
+When relevant, combine with:
+
+- `azure-smart-city-iot-solution-builder` for city-wide architecture and phased rollout.
+- `azure-resource-visualizer` for relationship diagrams.
+- `appinsights-instrumentation` for app and service telemetry patterns.
+
+Also use `references/arduino-official-best-practices.md` as a quality baseline for firmware and hardware recommendations.
+
+## Required output
+
+Always provide:
+
+1. Chosen connectivity pattern and rationale.
+2. Message contract (fields, units, sample payload).
+3. Security checklist for identity/credentials/updates.
+4. Reliability plan (retry, buffering, dedupe).
+5. Implementation backlog (firmware, gateway, cloud).
+
+## Output template
+
+1. Scenario and assumptions
+2. Recommended architecture
+3. Device and gateway contract
+4. Security and reliability controls
+5. Deployment plan and validation tests
+
+## Guidelines
+
+- Do not propose production deployments with shared credentials across devices.
+- Do not assume always-on connectivity in field deployments.
+- Do not omit command authorization and auditing in actuator scenarios.
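
As a minimal sketch of the Pattern A telemetry payload described in `SKILL.md` above (compact JSON with timestamp, device ID, metrics, and optional quality flags), the following Python could run on a gateway or in a test harness rather than on the Arduino itself. The short field names (`v`, `id`, `ts`, `m`, `q`), the `SCHEMA_VERSION` constant, and the `build_telemetry` helper are illustrative assumptions, not part of the skill or of any Azure SDK.

```python
import json
from typing import Mapping, Optional

SCHEMA_VERSION = 1  # hypothetical schema version carried in every message


def build_telemetry(device_id: str, ts_epoch_s: int,
                    metrics: Mapping[str, float],
                    quality: Optional[Mapping[str, str]] = None) -> str:
    """Serialize one reading as compact JSON (no insignificant whitespace)."""
    if not device_id:
        raise ValueError("device_id is required")
    if not metrics:
        raise ValueError("at least one metric is required")
    payload = {"v": SCHEMA_VERSION, "id": device_id,
               "ts": ts_epoch_s, "m": dict(metrics)}
    if quality:
        # Quality flags are optional and omitted entirely when absent.
        payload["q"] = dict(quality)
    return json.dumps(payload, separators=(",", ":"))


if __name__ == "__main__":
    print(build_telemetry("dev-001", 1700000000, {"temp_c": 21.5}))
```

Carrying the schema version in each message is one way to support the "message schema versioning strategy" item in the device contract; a real contract may prefer longer, self-describing field names when bandwidth allows.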
--- a/skills/arduino-azure-iot-edge-integration/references/arduino-iot-checklist.md
+++ b/skills/arduino-azure-iot-edge-integration/references/arduino-iot-checklist.md
@@ -0,0 +1,42 @@
+# Arduino Azure IoT Checklist
+
+Use this checklist before finalizing architecture or implementation guidance.
+
+## 0) Official Arduino Baseline
+
+- Official references reviewed from <https://www.arduino.cc/en/Guide> and <https://docs.arduino.cc/>.
+- Language/API calls validated against <https://docs.arduino.cc/language-reference/>.
+- Best practices reviewed from `references/arduino-official-best-practices.md`.
+
+## 1) Device Profile
+
+- MCU model and memory constraints documented.
+- Sensor list and sampling strategy defined.
+- Power model documented (mains, battery, sleep cycles).
+
+## 2) Connectivity
+
+- Selected transport documented (MQTT over TLS preferred).
+- Network failure behavior defined.
+- Local timestamp strategy defined for devices that lack RTC sync.
+
+## 3) Security
+
+- Unique identity per device.
+- No secrets in source control.
+- Credential rotation plan documented.
+- Firmware update and rollback plan documented.
+
+## 4) Edge and Cloud Flow
+
+- Routing from edge to IoT Hub documented.
+- Offline buffering limits defined.
+- Duplicate handling strategy documented.
+- Alerting thresholds and destinations defined.
+
+## 5) Validation
+
+- Connectivity soak test scenario.
+- Packet loss and reconnection test.
+- Command authorization test.
+- Firmware version and health reporting verification.
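
The "offline buffering limits" item in the checklist above can be illustrated with a small bounded buffer. This is a sketch under stated assumptions: `BoundedBuffer` is a hypothetical class, the drop-oldest policy is only one possible choice (some deployments prefer to drop the newest or lowest-priority data), and messages are modeled as `(message_id, body)` tuples.

```python
from collections import deque
from typing import Deque, List, Tuple


class BoundedBuffer:
    """FIFO buffer that evicts the oldest message once it is full."""

    def __init__(self, max_items: int) -> None:
        if max_items <= 0:
            raise ValueError("max_items must be positive")
        self._items: Deque[Tuple[str, str]] = deque(maxlen=max_items)
        self.dropped = 0  # count of evicted messages, useful as a health metric

    def push(self, message_id: str, body: str) -> None:
        if len(self._items) == self._items.maxlen:
            self.dropped += 1  # deque silently evicts the oldest entry on append
        self._items.append((message_id, body))

    def drain(self) -> List[Tuple[str, str]]:
        """Return buffered messages oldest-first and empty the buffer."""
        items = list(self._items)
        self._items.clear()
        return items
```

Exposing the `dropped` counter makes data loss during long outages visible to operations instead of silent.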
--- a/skills/arduino-azure-iot-edge-integration/references/arduino-official-best-practices.md
+++ b/skills/arduino-azure-iot-edge-integration/references/arduino-official-best-practices.md
@@ -0,0 +1,42 @@
+# Arduino Official References and Best Practices
+
+Use these official Arduino resources before finalizing firmware or hardware guidance.
+
+## Official References
+
+- Arduino main guide: <https://www.arduino.cc/en/Guide>
+- Arduino docs home: <https://docs.arduino.cc/>
+- Getting started path: <https://docs.arduino.cc/learn/starting-guide/getting-started-arduino/>
+- Arduino IDE usage: <https://docs.arduino.cc/learn/starting-guide/the-arduino-software-ide/>
+- Arduino language reference: <https://docs.arduino.cc/language-reference/>
+- Arduino programming reference overview: <https://docs.arduino.cc/learn/programming/reference/>
+- Arduino memory guide: <https://docs.arduino.cc/learn/programming/memory-guide/>
+- Arduino debugging fundamentals: <https://docs.arduino.cc/learn/microcontrollers/debugging/>
+- Arduino low-power design guide: <https://docs.arduino.cc/learn/electronics/low-power/>
+- Arduino communication protocols index: <https://docs.arduino.cc/learn/communication/>
+- Arduino style guide for libraries: <https://docs.arduino.cc/learn/contributions/arduino-library-style-guide/>
+
+## Firmware Best Practices
+
+- Keep `loop()` non-blocking; avoid long `delay()` calls in production logic.
+- Use `millis()`-based scheduling for periodic tasks.
+- Budget SRAM explicitly and avoid dynamic allocation in hot paths.
+- Validate sensor ranges and provide safe defaults for invalid readings.
+- Add startup self-checks and periodic health heartbeat messages.
+- Include the payload schema version and the firmware version in every telemetry stream.
+- Implement retry with exponential backoff and jitter for network operations.
+- Store credentials outside source code and rotate them according to policy.
+
+## Hardware and Power Best Practices
+
+- Document voltage levels, pin mapping, and current limits per peripheral.
+- Design for brownout and power fluctuation scenarios.
+- Use watchdog and safe recovery behavior where available.
+- Plan low-power modes for battery deployments and validate wake cycles.
+
+## Integration Best Practices for Azure IoT
+
+- Prefer secure transports (MQTT over TLS) and per-device identity.
+- Define idempotent upstream processing for duplicate message scenarios.
+- Include device health metrics (uptime, reset reason, RSSI where applicable).
+- Validate offline buffering bounds to avoid uncontrolled memory growth.
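
The "retry with exponential backoff and jitter" practice listed above can be sketched as a delay calculation. The version below uses the "full jitter" variant (a uniform random delay between zero and a capped exponential ceiling); the function name `backoff_delay`, the default `base_s` and `cap_s` values, and the exponent clamp are assumptions chosen for illustration. It is written in Python for readability; the same arithmetic ports directly to firmware.

```python
import random
from typing import Optional


def backoff_delay(attempt: int, base_s: float = 1.0, cap_s: float = 60.0,
                  rng: Optional[random.Random] = None) -> float:
    """Return a randomized delay in seconds for a zero-based retry attempt."""
    if attempt < 0:
        raise ValueError("attempt must be >= 0")
    if rng is None:
        rng = random.Random()
    # Clamp the exponent so very large attempt counts cannot overflow.
    exponent = min(attempt, 32)
    ceiling = min(cap_s, base_s * (2 ** exponent))
    # Full jitter: spread retries uniformly over [0, ceiling].
    return rng.uniform(0.0, ceiling)
```

Randomizing the whole interval, rather than adding a small offset to a fixed delay, helps avoid many devices reconnecting at the same instant after a shared outage.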
--- a/skills/azure-architecture-autopilot/README.md
+++ b/skills/azure-architecture-autopilot/README.md
@@ -0,0 +1,188 @@
+<h1 align="center">Azure Architecture Autopilot</h1>
+
+<p align="center">
+  <strong>Design → Diagram → Bicep → Deployment - all from natural language</strong>
+</p>
+
+<p align="center">
+  <img src="https://img.shields.io/badge/GitHub_Copilot-Skill-8957e5?logo=github" alt="Copilot Skill">
+  <img src="https://img.shields.io/badge/Azure-All_Services-0078D4?logo=microsoftazure&logoColor=white" alt="Azure">
+  <img src="https://img.shields.io/badge/Bicep-IaC-ff6f00" alt="Bicep">
+  <img src="https://img.shields.io/badge/70+-Service_Types-00bcf2" alt="Service Types">
+  <img src="https://img.shields.io/badge/License-MIT-green" alt="License">
+</p>
+
+<p align="center">
+  <b>Azure Architecture Autopilot</b> designs Azure infrastructure from natural language,<br>
+  generates interactive diagrams, produces modular Bicep templates, and deploys - all through conversation.<br>
+  It also scans existing resources, visualizes them as architecture diagrams, and refines them on the fly.
+</p>
+
+<!-- Hero image: interactive architecture diagram with 605+ Azure icons -->
+<p align="center">
+  <img src="assets/06-architecture-diagram.png" width="100%" alt="Interactive Azure architecture diagram with 605+ official icons">
+</p>
+
+<p align="center">
+  <em>↑ Auto-generated interactive diagram — drag, zoom, click for details, export to PNG</em>
+</p>
+
+<p align="center">
+  <img src="assets/08-deployment-succeeded.png" width="80%" alt="Deployment succeeded">
+  &nbsp;&nbsp;
+  <img src="assets/07-azure-portal-resources.png" width="80%" alt="Azure Portal — deployed resources">
+</p>
+
+<p align="center">
+  <em>↑ Real Azure resources deployed from the generated Bicep templates</em>
+</p>
+
+<p align="center">
+  <a href="#-how-it-works">How It Works</a> •
+  <a href="#-features">Features</a> •
+  <a href="#%EF%B8%8F-prerequisites">Prerequisites</a> •
+  <a href="#-usage">Usage</a> •
+  <a href="#-architecture">Architecture</a>
+</p>
+
+---
+
+## 🔄 How It Works
+
+```
+Path A: "Build me a RAG chatbot on Azure"
+         ↓
+  🎨 Design → 🔧 Bicep → ✅ Review → 🚀 Deploy
+
+Path B: "Analyze my current Azure resources"
+         ↓
+  🔍 Scan → 🎨 Modify → 🔧 Bicep → ✅ Review → 🚀 Deploy
+```
+
+| Phase | Role | What Happens |
+|:-----:|------|--------------|
+| **0** | 🔍 Scanner | Scans existing Azure resources via `az` CLI → auto-generates architecture diagram |
+| **1** | 🎨 Advisor | Interactive design through conversation — asks targeted questions with smart defaults |
+| **2** | 🔧 Generator | Produces modular Bicep: `main.bicep` + `modules/*.bicep` + `.bicepparam` |
+| **3** | ✅ Reviewer | Compiles with `az bicep build`, checks security & best practices |
+| **4** | 🚀 Deployer | `validate` → `what-if` → preview diagram → `create` (5-step mandatory sequence) |
+
+---
+
+## ✨ Features
+
+| | Feature | Description |
+|---|---------|-------------|
+| 📦 | **Zero Dependencies** | 605+ Azure icons bundled — no `pip install`, works offline |
+| 🎨 | **Interactive Diagrams** | Drag-and-drop HTML with zoom, click details, PNG export |
+| 🔍 | **Resource Scanning** | Analyze existing Azure infra → auto-generate architecture diagrams |
+| 💬 | **Natural Language** | *"It's slow"*, *"reduce costs"*, *"add security"* → guided resolution |
+| 📊 | **Live Verification** | API versions, SKUs, model availability fetched from MS Docs in real-time |
+| 🔒 | **Secure by Default** | Private Endpoints, RBAC, managed identity — no secrets in files |
+| ⚡ | **Parallel Preload** | Next-phase info loaded while waiting for user input |
+| 🌐 | **Multi-Language** | Auto-detects user language — responds in English, Korean, or any language |
+
+---
+
+## ⚙️ Prerequisites
+
+| Tool | Required | Install |
+|------|:--------:|---------|
+| **GitHub Copilot CLI** | ✅ | [Install guide](https://docs.github.com/copilot/concepts/agents/about-copilot-cli) |
+| **Azure CLI** | ✅ | `winget install Microsoft.AzureCLI` / `brew install azure-cli` |
+| **Python 3.10+** | ✅ | `winget install Python.Python.3.12` / `brew install python` |
+
+> No additional packages required — the diagram engine is bundled in `scripts/`.
+
+### 🤖 Recommended Models
+
+| | Models | Notes |
+|---|--------|-------|
+| 🏆 **Best** | Claude Opus 4.5 / 4.6 | Most reliable for all 5 phases |
+| ✅ **Recommended** | Claude Sonnet 4.5 / 4.6 | Best cost-performance balance |
+| ⚠️ **Minimum** | Claude Sonnet 4, GPT-5.1+ | May skip steps in complex architectures |
+
+---
+
+## 🚀 Usage
+
+### Path A — Build new infrastructure
+
+```
+"Build a RAG chatbot with Foundry and AI Search"
+"Create a data platform with Databricks and ADLS Gen2"
+"Deploy Fabric + ADF pipeline with private endpoints"
+"Set up a microservices architecture with AKS and Cosmos DB"
+```
+
+### Path B — Analyze & modify existing resources
+
+```
+"Analyze my current Azure infrastructure"
+"Scan rg-production and show me the architecture"
+"What resources are in my subscription?"
+```
+
+Then modify through conversation:
+```
+"Add 3 VMs to this architecture"
+"The Foundry endpoint is slow — what can I do?"
+"Reduce costs — downgrade AI Search to Basic"
+"Add private endpoints to all services"
+```
+
+### 📂 Output Structure
+
+```
+<project-name>/
+├── 00_arch_current.html         ← Scanned architecture (Path B)
+├── 01_arch_diagram_draft.html   ← Design diagram
+├── 02_arch_diagram_preview.html ← What-if preview
+├── 03_arch_diagram_result.html  ← Deployment result
+├── main.bicep                   ← Orchestration
+├── main.bicepparam              ← Parameter values
+└── modules/
+    └── *.bicep                  ← Per-service modules
+```
+
+---
+
+## 📁 Architecture
+
+```
+SKILL.md                            ← Lightweight router (~170 lines)
+│
+├── scripts/                         ← Embedded diagram engine
+│   ├── generator.py                 ← Interactive HTML generator
+│   ├── icons.py                     ← 605+ Azure icons (Base64 SVG)
+│   └── cli.py                       ← CLI entry point
+│
+└── references/                      ← Phase instructions + patterns
+    ├── phase0-scanner.md            ← 🔍 Resource scanning
+    ├── phase1-advisor.md            ← 🎨 Architecture design
+    ├── bicep-generator.md           ← 🔧 Bicep generation
+    ├── bicep-reviewer.md            ← ✅ Code review
+    ├── phase4-deployer.md           ← 🚀 Deployment pipeline
+    ├── service-gotchas.md           ← Required properties & PE mappings
+    ├── azure-common-patterns.md     ← Security & naming patterns
+    ├── azure-dynamic-sources.md     ← MS Docs URL registry
+    ├── architecture-guidance-sources.md
+    └── ai-data.md                   ← AI/Data service domain pack
+```
+
+> **Self-contained** — `SKILL.md` is a lightweight router. All phase logic lives in `references/`. The diagram engine is embedded in `scripts/` with no external dependencies.
+
+---
+
+## 📊 Supported Services (70+ types)
+
+All Azure services are supported. AI/Data services have optimized templates; other services are looked up automatically from MS Docs.
+
+**Key types:** `ai_foundry` · `openai` · `ai_search` · `storage` · `adls` · `keyvault` · `fabric` · `databricks` · `aks` · `vm` · `app_service` · `function_app` · `cosmos_db` · `sql_server` · `postgresql` · `mysql` · `synapse` · `adf` · `apim` · `service_bus` · `logic_apps` · `event_grid` · `event_hub` · `container_apps` · `app_insights` · `log_analytics` · `firewall` · `front_door` · `load_balancer` · `expressroute` · `sentinel` · `redis` · `iot_hub` · `digital_twins` · `signalr` · `acr` · `bastion` · `vpn_gateway` · `data_explorer` · `document_intelligence` ...
+
+
+---
+
+## 📄 License
+
+MIT © [Jeonghoon Lee](https://github.com/whoniiii)
--- a/skills/azure-smart-city-iot-solution-builder/SKILL.md
+++ b/skills/azure-smart-city-iot-solution-builder/SKILL.md
@@ -0,0 +1,156 @@
+---
+name: azure-smart-city-iot-solution-builder
+description: 'Design and plan end-to-end Azure IoT and Smart City solutions: requirements, architecture, security, operations, cost, and a phased delivery plan with concrete implementation artifacts.'
+---
+
+# Azure Smart City IoT Solution Builder
+
+Use this skill to build and standardize a complete workflow for Azure IoT and Smart City solutions.
+
+## When to use it
+
+Use this skill when the user asks for things like:
+
+- "I want to build an IoT solution on Azure"
+- "Smart City architecture for traffic, lighting, or waste"
+- "How do I connect devices, analytics, and alerts?"
+- "I need a roadmap and backlog for an urban platform"
+
+## Objectives
+
+- Convert a high-level idea into a deployable architecture.
+- Reuse existing Azure-focused skills whenever possible.
+- Produce concrete artifacts the team can implement.
+
+## Workflow
+
+### 0) Mandatory documentation review (before any architecture)
+
+Before proposing architecture or technology decisions that involve edge computing, review Azure IoT Edge documentation first:
+
+- https://learn.microsoft.com/azure/iot-edge/
+
+Minimum pages to review:
+
+- What is Azure IoT Edge
+- Runtime architecture
+- Supported systems
+- Version history/release notes
+- Relevant Linux/Windows quickstarts for the scenario
+
+If documentation cannot be consulted, state this explicitly and continue with clearly marked assumptions.
+
+### 1) Scope and constraints
+
+Collect and confirm:
+
+- City domain: mobility, parking, air quality, water, energy, public safety, waste, etc.
+- Scale: number of devices, telemetry frequency, retention, regions.
+- Latency and availability objectives.
+- Regulatory and privacy constraints.
+- Existing systems to integrate (SCADA, GIS, ERP, ticketing, APIs).
+
+### 2) Capability map
+
+Split the platform into layers:
+
+- Device and edge: onboarding, identity, firmware, OTA, edge processing.
+- Ingestion and messaging: command and control, event routing, buffering.
+- Data and analytics: hot path vs cold path, dashboards, historical analysis.
+- Operations: observability, incident flow, SLOs.
+- Governance: RBAC, secrets, policies, network isolation.
+
+### 3) Azure service selection (reference)
+
+- Device connectivity: Azure IoT Hub, Azure IoT Operations, IoT Edge.
+- Event streaming: Event Hubs, Service Bus, Event Grid.
+- Storage: Blob Storage, Data Lake, Cosmos DB, SQL.
+- Analytics: Azure Data Explorer, Stream Analytics, Fabric/Synapse.
+- APIs and applications: API Management, App Service, Container Apps, Functions.
+- Monitoring: Azure Monitor, Application Insights, Log Analytics.
+- Security: Key Vault, Defender for IoT, Private Endpoints, Managed Identity.
+
+### 4) Non-functional design
+
+Define and document:
+
+- Reliability model (zones/regions, retries, dead-letter handling, replay).
+- Security controls (zero trust, encryption, secret rotation, least privilege).
+- Cost controls (retention tiers, rightsizing, autoscaling, workload scheduling).
+- Data lifecycle (raw, curated, aggregated, archived).
+
+### 5) Delivery plan
+
+Create a phased execution plan:
+
+- Phase 1: Pilot district or single use case.
+- Phase 2: Multi-domain integration.
+- Phase 3: City-scale rollout and optimization.
+
+For each phase, include:
+
+- Exit criteria
+- Dependencies
+- Risks and mitigations
+- KPI set
+
+## Reuse other skills first
+
+There are two sources of skills:
+
+- Runtime-provided skills (external to this repository): only available when the Copilot host environment exposes them.
+- Local repository skills (this repository): available as local files under `skills/`.
+
+### Runtime-provided Azure skills (optional)
+
+If they are available in the execution environment, delegate to these specialized skills for deeper guidance:
+
+- `azure-kubernetes`
+- `azure-messaging`
+- `azure-observability`
+- `azure-storage`
+- `azure-rbac`
+- `azure-cost`
+- `azure-validate`
+- `azure-deploy`
+
+### Local repository alternatives (use in this repo)
+
+When runtime skills are not available, prioritize existing local skills in this repository:
+
+- `azure-architecture-autopilot` for architecture generation and refinement.
+- `azure-resource-visualizer` for resource relationship diagrams.
+- `azure-role-selector` for role selection guidance.
+- `az-cost-optimize` and `azure-pricing` for cost and pricing analysis.
+- `azure-deployment-preflight` for pre-deployment checks.
+- `appinsights-instrumentation` for telemetry instrumentation patterns.
+
+If no specialized skill is available, continue with this skill and keep assumptions explicit.
+
+## Required output artifacts
+
+Always provide these outputs:
+
+1. Smart City solution summary (scope, assumptions, constraints).
+2. Reference architecture (components and data flow).
+3. Security and governance checklist.
+4. Cost and scaling strategy.
+5. Phased implementation backlog (epics and milestones).
+
+## Output template
+
+Use this response structure:
+
+1. Context and objectives
+2. Proposed architecture
+3. Technology decisions and trade-offs
+4. Security, operations, and cost controls
+5. Phased implementation plan
+6. Risks and open questions
+
+## Guidelines
+
+- Do not jump to deployment before validating prerequisites.
+- Do not recommend single-region production for critical city workloads.
+- Do not omit operational ownership (who handles incidents, SLAs, change windows).
+- Clearly separate assumptions from confirmed facts.
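
The scale inputs collected in step 1 above (number of devices, telemetry frequency, retention) translate into a rough ingestion estimate. The arithmetic below is a back-of-the-envelope sketch: the function name `daily_ingestion` and the figures in the example (10,000 devices, one 200-byte message per minute, 30-day retention) are hypothetical, and the result ignores protocol overhead, compression, and any service-specific metering rules.

```python
from typing import Tuple


def daily_ingestion(devices: int, interval_s: float,
                    payload_bytes: int) -> Tuple[float, float]:
    """Return (messages per day, payload bytes per day) for a device fleet."""
    if devices < 0 or payload_bytes < 0 or interval_s <= 0:
        raise ValueError("inputs must be non-negative and interval_s positive")
    messages_per_day = devices * (86_400 / interval_s)
    bytes_per_day = messages_per_day * payload_bytes
    return messages_per_day, bytes_per_day


if __name__ == "__main__":
    msgs, size = daily_ingestion(devices=10_000, interval_s=60, payload_bytes=200)
    retention_days = 30
    print(f"{msgs:,.0f} messages/day")                      # 14,400,000 messages/day
    print(f"{size / 1e9:.2f} GB/day")                       # 2.88 GB/day
    print(f"{size * retention_days / 1e9:.1f} GB retained")  # 86.4 GB retained
```

An estimate like this is mainly useful for comparing retention tiers and sampling intervals early, before committing to a service tier.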
--- a/skills/azure-smart-city-iot-solution-builder/references/smart-city-solution-template.md
+++ b/skills/azure-smart-city-iot-solution-builder/references/smart-city-solution-template.md
@@ -0,0 +1,73 @@
+# Smart City IoT Solution Template
+
+Use this template to standardize outputs for each new smart city scenario.
+
+## 1. Use case summary
+
+- Domain:
+- Stakeholders:
+- Problem statement:
+- Success metrics:
+
+## 2. Device and data profile
+
+- Device types and count:
+- Telemetry schema:
+- Ingestion rate:
+- Command/control requirements:
+- Retention policy:
+
+## 3. Reference architecture
+
+- Edge and field layer:
+- Ingestion layer:
+- Processing layer:
+- Storage layer:
+- API and integration layer:
+- Monitoring and security layer:
+
+## 4. NFR checklist
+
+- Availability target:
+- Latency target:
+- Security controls:
+- Data privacy constraints:
+- DR strategy:
+- Cost target:
+
+## 5. Phased roadmap
+
+### Phase 1 - Pilot
+
+- Scope:
+- Deliverables:
+- Exit criteria:
+
+### Phase 2 - Scale
+
+- Scope:
+- Deliverables:
+- Exit criteria:
+
+### Phase 3 - Optimize
+
+- Scope:
+- Deliverables:
+- Exit criteria:
+
+## 6. Initial backlog baseline
+
+- Epic: Device onboarding and identity
+- Epic: Telemetry ingestion and routing
+- Epic: Real-time alerting and incident workflow
+- Epic: Historical analytics and reporting
+- Epic: Security and compliance hardening
+- Epic: Governance and cost optimization
+
+## 7. Risks
+
+- Vendor/device interoperability gaps
+- Network reliability in field locations
+- Data quality issues and schema drift
+- Over-retention that increases costs
+- Ambiguity in operational ownership
--- a/skills/python-azure-iot-edge-modules/SKILL.md
+++ b/skills/python-azure-iot-edge-modules/SKILL.md
@@ -0,0 +1,139 @@
+---
+name: python-azure-iot-edge-modules
+description: 'Build and operate Python Azure IoT Edge modules with robust messaging, deployment manifests, observability, and production readiness checks.'
+---
+
+# Python Azure IoT Edge Modules
+
+Use this skill to design, implement, and validate Python-based IoT Edge modules for telemetry processing, local inference, protocol translation, and edge-to-cloud integration.
+
+## When To Use
+
+Use this skill for requests like:
+
+- "I want to create a Python module for IoT Edge"
+- "How do I deploy edge modules with a manifest?"
+- "I need to filter/aggregate telemetry before uploading it"
+- "How do I handle disconnections and retries at the edge?"
+
+## Mandatory Docs Review
+
+Before recommending runtime behavior or deployment decisions, review:
+
+- https://learn.microsoft.com/azure/iot-edge/
+- https://learn.microsoft.com/es-es/azure/iot-edge/
+
+Minimum checks:
+
+- Runtime architecture and module lifecycle.
+- Supported host OS and versions.
+- Deployment model and configuration flow.
+- Current release/version guidance.
+
+If documentation cannot be fetched, proceed with explicit assumptions and flag them clearly.
+
+## Python Official References and Best Practices (Required)
+
+Before proposing Python implementation details, consult official Python sources:
+
+- https://www.python.org/
+- https://docs.python.org/3/
+- https://docs.python.org/3/reference/
+- https://docs.python.org/3/library/
+- references/python-official-best-practices.md
+
+Prefer official docs over community snippets unless there is a specific compatibility reason to deviate.
+
+## Goals
+
+- Deliver a production-focused module architecture and implementation plan.
+- Ensure reliable edge messaging under network variability.
+- Provide deployment, observability, and validation artifacts.
+
+## Module Use Cases
+
+- Protocol adapter (serial/Modbus/OPC-UA to IoT message format).
+- Telemetry enrichment and normalization.
+- Local anomaly detection or inference.
+- Command orchestration and local actuator control.
+
+## Delivery Workflow
+
+### 1) Contract and Interfaces
+
+Define:
+
+- Module inputs and outputs.
+- Message schema and versioning policy.
+- Routes and priorities for normal vs critical telemetry.
+- Desired properties used for dynamic configuration.
+
+### 2) Runtime and Packaging
+
+Specify:
+
+- Python runtime version target.
+- Container image strategy (base image, slim footprint, CVE hygiene).
+- Resource profile (CPU/memory bounds).
+- Startup and health checks.
+
+### 3) Reliability Design
+
+Implement and validate:
+
+- Retries with exponential backoff and jitter.
+- Graceful degradation on upstream failures.
+- Local queueing strategy where needed.
+- Idempotent processing for replayed messages.
+
+### 4) Security Controls
+
+Require:
+
+- No plaintext secrets in code or manifest.
+- Least-privilege module behavior.
+- Secure transport and trusted cert chain handling.
+- Traceability for command handling and state changes.
+
+### 5) Deployment and Operations
+
+Define:
+
+- Environment-specific deployment manifests.
+- Rollout strategy (pilot, staged, broad).
+- Rollback criteria.
+- SLOs and alerting conditions.
+
+## Reuse Other Skills
+
+When relevant, combine with:
+
+- `azure-smart-city-iot-solution-builder` for platform-level architecture.
+- `appinsights-instrumentation` for telemetry instrumentation approaches.
+- `azure-resource-visualizer` for architecture diagrams and dependency mapping.
+
+Also use `references/python-official-best-practices.md` as baseline quality criteria for module design and implementation guidance.
+
+## Required Output
+
+Always provide:
+
+1. Module design brief (purpose, inputs, outputs).
+2. Deployment model (image, manifest, env settings).
+3. Reliability and error-handling strategy.
+4. Security and operations checklist.
+5. Test matrix (functional, chaos, performance, rollback).
+
+## Output Template
+
+1. Context and assumptions
+2. Module architecture
+3. Deployment and configuration
+4. Reliability, security, observability
+5. Validation and rollout plan
+
+## Guardrails
+
+- Do not recommend direct production rollout without a pilot stage.
+- Do not embed secrets in Dockerfiles, source, or manifests.
+- Do not omit health probes, restart behavior, and rollback criteria.
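
The "idempotent processing for replayed messages" item in the reliability design above can be sketched as a wrapper that remembers recently seen message IDs. This is an illustration only: `IdempotentHandler`, the `message_id` key, and the bounded in-memory cache are assumptions, and it uses no IoT Edge SDK calls. A production module might need the seen-ID set to survive restarts.

```python
from collections import OrderedDict
from typing import Callable, Dict


class IdempotentHandler:
    """Invoke a handler at most once per message ID (bounded LRU memory)."""

    def __init__(self, handler: Callable[[Dict], None], max_ids: int = 1024) -> None:
        if max_ids <= 0:
            raise ValueError("max_ids must be positive")
        self._handler = handler
        self._seen: "OrderedDict[str, None]" = OrderedDict()
        self._max_ids = max_ids

    def handle(self, message: Dict) -> bool:
        """Return True if the message was processed, False if it was a duplicate."""
        message_id = message["message_id"]
        if message_id in self._seen:
            self._seen.move_to_end(message_id)  # refresh recency
            return False
        self._handler(message)
        # Record the ID only after the handler succeeds, so failures can be retried.
        self._seen[message_id] = None
        if len(self._seen) > self._max_ids:
            self._seen.popitem(last=False)  # evict the least recently seen ID
        return True
```

Bounding the cache keeps memory predictable on constrained edge hosts, at the cost of accepting a duplicate that arrives after its ID has been evicted.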
--- a/skills/python-azure-iot-edge-modules/references/python-edge-module-template.md
+++ b/skills/python-azure-iot-edge-modules/references/python-edge-module-template.md
@@ -0,0 +1,63 @@
+# Python IoT Edge Module Template
+
+Use this template to structure implementation proposals and reviews.
+
+## 0) Official Python Baseline
+
+- Official references reviewed from <https://www.python.org/> and <https://docs.python.org/3/>.
+- Language and stdlib usage validated against <https://docs.python.org/3/reference/> and <https://docs.python.org/3/library/>.
+- Best practices reviewed from `references/python-official-best-practices.md`.
+
+## 1) Module Summary
+
+- Module name:
+- Business capability:
+- Inputs:
+- Outputs:
+- Trigger conditions:
+
+## 2) Message Contract
+
+- Schema version:
+- Required fields:
+- Optional fields:
+- Error payload contract:
+
+## 3) Runtime Configuration
+
+- Python version:
+- Base image:
+- Environment variables:
+- Desired properties:
+- Resource limits:
+
+## 4) Resilience
+
+- Retry policy:
+- Backoff policy:
+- Queueing strategy:
+- Idempotency approach:
+- Timeout and circuit-breaker behavior:
+
+## 5) Security
+
+- Secret source (never inline):
+- Identity and permissions:
+- Command authorization model:
+- Audit log requirements:
+
+## 6) Observability
+
+- Health signals:
+- Business metrics:
+- Error metrics:
+- Correlation/trace requirements:
+- Alert thresholds:
+
+## 7) Validation Matrix
+
+- Happy path tests:
+- Malformed payload tests:
+- Network interruption tests:
+- Throughput and latency tests:
+- Rollback validation:
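
To make the "Runtime Configuration" section of the template above more concrete, the sketch below loads module settings from environment variables and validates them at startup. The variable names (`SAMPLE_INTERVAL_S`, `MAX_QUEUE_ITEMS`, `LOG_LEVEL`), their defaults, and the `ModuleConfig` class are hypothetical placeholders; a real module would use whatever names its deployment manifest defines.

```python
from dataclasses import dataclass
from typing import Mapping

_ALLOWED_LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"}


@dataclass(frozen=True)
class ModuleConfig:
    sample_interval_s: float
    max_queue_items: int
    log_level: str

    @classmethod
    def from_env(cls, env: Mapping[str, str]) -> "ModuleConfig":
        """Build a validated config from an environment-like mapping."""
        interval = float(env.get("SAMPLE_INTERVAL_S", "5"))
        max_items = int(env.get("MAX_QUEUE_ITEMS", "1000"))
        level = env.get("LOG_LEVEL", "INFO").upper()
        if interval <= 0:
            raise ValueError("SAMPLE_INTERVAL_S must be positive")
        if max_items <= 0:
            raise ValueError("MAX_QUEUE_ITEMS must be positive")
        if level not in _ALLOWED_LEVELS:
            raise ValueError(f"unsupported LOG_LEVEL: {level}")
        return cls(interval, max_items, level)
```

In a module entry point this might be called as `ModuleConfig.from_env(os.environ)`, so that a misconfigured deployment fails fast at startup rather than partway through message processing.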
--- a/skills/python-azure-iot-edge-modules/references/python-official-best-practices.md
+++ b/skills/python-azure-iot-edge-modules/references/python-official-best-practices.md
@@ -0,0 +1,48 @@
+# Python Official References and Best Practices
+
+Use these official Python resources before finalizing module architecture or implementation details.
+
+## Official References
+
+- Python home: <https://www.python.org/>
+- Python documentation portal: <https://docs.python.org/3/>
+- Python tutorial: <https://docs.python.org/3/tutorial/>
+- Python language reference: <https://docs.python.org/3/reference/>
+- Python standard library reference: <https://docs.python.org/3/library/>
+- Python HOWTOs: <https://docs.python.org/3/howto/>
+- Installing modules: <https://docs.python.org/3/installing/>
+- Distributing modules: <https://docs.python.org/3/distributing/>
+- PEP index: <https://peps.python.org/>
+- PyPA packaging guide: <https://packaging.python.org/>
+
+## Coding Best Practices
+
+- Target and pin an explicit Python major/minor runtime for each deployment.
+- Prefer explicit, readable code paths over clever compact logic.
+- Use type hints for public interfaces and critical data transformations.
+- Keep module responsibilities focused; separate protocol, business logic, and transport.
+- Validate and sanitize external inputs at boundaries.
+- Use structured exceptions with actionable error messages.
+- Log with enough context for incident triage (correlation id, module id, message id).
+
+## Reliability and Performance Best Practices
+
+- Avoid blocking operations in high-frequency message paths.
+- Enforce timeouts and bounded retries with exponential backoff and jitter.
+- Design idempotent handlers for replay and duplicate deliveries.
+- Use resource limits and monitor memory growth to prevent edge instability.
+- Define graceful shutdown behavior to flush buffered state safely.
+
+## Dependency and Supply Chain Best Practices
+
+- Pin dependencies and document upgrade cadence.
+- Prefer actively maintained libraries with clear release history.
+- Track vulnerabilities and update dependencies regularly.
+- Keep container images minimal and patched.
+
+## Testing Best Practices
+
+- Unit test parsing, validation, and routing logic.
+- Add integration tests for module I/O boundaries.
+- Add chaos tests for network loss, slow upstream, and restart scenarios.
+- Verify rollback behavior and state recovery in deployment tests.
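
Several of the coding practices above (type hints on public interfaces, validating external inputs at boundaries, structured exceptions with actionable messages) can be combined in one small parsing function. The sketch below is illustrative: the `Reading` type, the `InvalidReading` exception, the `id` and `temp_c` field names, and the default temperature range are assumptions, not part of the skill.

```python
import json
from dataclasses import dataclass


class InvalidReading(ValueError):
    """Raised when an incoming payload fails boundary validation."""


@dataclass(frozen=True)
class Reading:
    device_id: str
    temp_c: float


def parse_reading(raw: bytes, lo: float = -40.0, hi: float = 125.0) -> Reading:
    """Parse and validate one JSON-encoded temperature reading."""
    try:
        data = json.loads(raw)
    except ValueError as exc:  # covers JSONDecodeError and UnicodeDecodeError
        raise InvalidReading("payload is not valid JSON") from exc
    if not isinstance(data, dict):
        raise InvalidReading("payload must be a JSON object")
    device_id = data.get("id")
    temp = data.get("temp_c")
    if not isinstance(device_id, str) or not device_id:
        raise InvalidReading("'id' must be a non-empty string")
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(temp, bool) or not isinstance(temp, (int, float)):
        raise InvalidReading("'temp_c' must be a number")
    if not lo <= temp <= hi:
        raise InvalidReading(f"'temp_c' {temp} outside [{lo}, {hi}]")
    return Reading(device_id, float(temp))
```

Keeping validation in one function at the module boundary means downstream routing and business logic can rely on well-typed `Reading` objects.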