mirror of
https://github.com/github/awesome-copilot.git
synced 2026-05-01 12:45:56 +00:00
feat: add data-breach-blast-radius skill for pre-breach impact analysis (#1487)
* feat: add data-breach-blast-radius skill for pre-breach impact analysis
* fix: resolve codespell false positives (ZAR currency code, SME abbreviation)
* fix: remove ZAR abbreviation to pass codespell check
449
skills/data-breach-blast-radius/references/hardening-playbook.md
Normal file
@@ -0,0 +1,449 @@
# Hardening Playbook

Prioritized controls to reduce data breach blast radius. Controls are organized by **priority** (P0 to P3), tagged with an **impact category**, and include tech-stack-specific implementation patterns. Each control includes a **blast radius reduction estimate**.

> **How to use:** After identifying exposure vectors, match each to a control below. Sort your hardening roadmap by `(Blast_Radius_Reduction × Severity) / Effort`.
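
The sort key in the note above can be sketched in a few lines. This is only an illustration under stated assumptions: the numeric effort weights (Low = 1, Medium = 2, High = 3) and the 1 to 5 severity scale are hypothetical, since the playbook names effort levels but does not assign numbers to them.

```python
from dataclasses import dataclass

# Assumed weights: the playbook only names Low/Medium/High effort.
EFFORT_WEIGHT = {"Low": 1, "Medium": 2, "High": 3}


@dataclass
class Control:
    name: str
    reduction: float  # blast radius reduction as a fraction, e.g. 0.90
    severity: int     # hypothetical 1-5 severity of the matched exposure vector
    effort: str       # "Low", "Medium", or "High"

    @property
    def score(self) -> float:
        # (Blast_Radius_Reduction x Severity) / Effort
        return (self.reduction * self.severity) / EFFORT_WEIGHT[self.effort]


def build_roadmap(controls):
    """Return controls sorted from highest to lowest priority score."""
    return sorted(controls, key=lambda c: c.score, reverse=True)


controls = [
    Control("Field-level encryption for T1 data", 0.80, 5, "Medium"),
    Control("Fix IDOR/BOLA", 0.90, 5, "Low"),
    Control("Data retention + auto-deletion", 0.40, 3, "High"),
]
roadmap = build_roadmap(controls)
for control in roadmap:
    print(f"{control.score:5.2f}  {control.name}")
```

Dividing by effort pushes cheap, high-impact fixes to the top, which is why the low-effort authorization fix outranks the encryption work here even though both address equally severe vectors.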

---

## Control Priority Matrix

| Priority | Control | Blast Radius Reduction | Effort | Category |
|----------|---------|------------------------|--------|----------|
| P0 | Fix IDOR/BOLA: add ownership checks | 90% for affected vector | Low | Authorization |
| P0 | Remove sensitive fields from API responses | 85% for affected fields | Low | Data Minimization |
| P0 | Revoke publicly accessible storage (S3/Blob) | 100% for affected store | Low | Access Control |
| P0 | Remove plaintext credentials from code/logs | 100% for affected secret | Low | Secrets |
| P1 | Add field-level encryption for T1 data | 80% for encrypted fields | Medium | Encryption |
| P1 | Mask/tokenize PCI card data | 95% for card exposure | Medium | Tokenization |
| P1 | Remove PII from log statements | 70% for log exposure | Medium | Logging |
| P1 | Add authentication to unauthenticated endpoints | 95% for exposed endpoints | Low | Authentication |
| P2 | Implement data access audit logging | -50% detection time | Medium | Monitoring |
| P2 | Enable database activity monitoring | -60% detection time | Medium | Monitoring |
| P2 | Add rate limiting to sensitive endpoints | 60% reduction in data harvesting | Low | Rate Limiting |
| P2 | Column-level encryption for T2 sensitive data | 70% for encrypted columns | Medium | Encryption |
| P3 | Implement data retention + auto-deletion | 40% reduction in stale data exposure | High | Data Lifecycle |
| P3 | Separate analytics store from production PII | 60% for analytics breach | High | Architecture |
| P3 | Pseudonymize behavioral tracking data | 70% for behavioral data | Medium | Pseudonymization |

---

## P0: Fix Immediately (< 1 day)

### 1. Fix Authorization: IDOR / BOLA

**What it fixes:** Broken Object Level Authorization, where users can access other users' data by changing an ID.

**Detection pattern in code:**
```python
# VULNERABLE: no ownership check
@app.get("/api/orders/{order_id}")
def get_order(order_id: int):
    return db.query(Order).filter(Order.id == order_id).first()

# SECURE: ownership check
@app.get("/api/orders/{order_id}")
def get_order(order_id: int, current_user: User = Depends(get_current_user)):
    order = db.query(Order).filter(
        Order.id == order_id,
        Order.user_id == current_user.id  # ownership check
    ).first()
    if not order:
        raise HTTPException(status_code=404)
    return order
```

```typescript
// VULNERABLE
app.get('/api/users/:id/profile', authenticate, async (req, res) => {
  const user = await User.findById(req.params.id);
  res.json(user);
});

// SECURE
app.get('/api/users/:id/profile', authenticate, async (req, res) => {
  if (req.params.id !== req.user.id && !req.user.isAdmin) {
    return res.status(403).json({ error: 'Forbidden' });
  }
  const user = await User.findById(req.params.id);
  res.json(user);
});
```

```csharp
// VULNERABLE
[HttpGet("orders/{orderId}")]
public async Task<IActionResult> GetOrder(int orderId)
{
    var order = await _db.Orders.FindAsync(orderId);
    return Ok(order);
}

// SECURE
[HttpGet("orders/{orderId}")]
[Authorize]
public async Task<IActionResult> GetOrder(int orderId)
{
    var userId = User.FindFirst(ClaimTypes.NameIdentifier)?.Value;
    var order = await _db.Orders
        .Where(o => o.Id == orderId && o.UserId == userId)
        .FirstOrDefaultAsync();
    if (order == null) return NotFound();
    return Ok(order);
}
```

---

### 2. Remove Sensitive Fields from API Responses

**What it fixes:** Over-fetching, where APIs return more data than the client needs.

**Pattern:**
```typescript
// VULNERABLE: returns all fields including passwordHash, ssn
const user = await User.findById(id);
res.json(user);

// SECURE: explicit projection
const user = await User.findById(id).select('id name email createdAt');
res.json(user);
```

```python
# SECURE: Pydantic response model (FastAPI)
from pydantic import BaseModel

class UserPublicResponse(BaseModel):
    id: int
    name: str
    email: str
    # NOTE: password_hash, ssn, date_of_birth NOT included

@app.get("/api/users/{id}", response_model=UserPublicResponse)
def get_user(id: int):
    return db.query(User).filter(User.id == id).first()
```

```java
// SECURE: response DTO that omits sensitive fields
// (no @JsonIgnore needed, because the fields are not declared at all)
public class UserResponse {
    public String id;
    public String name;
    public String email;
    // passwordHash, ssn not included in DTO
}
```
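
Where no response-model framework is available, the same allowlist idea can be sketched with a plain dictionary projection. The field names and the `to_public_user` helper below are hypothetical; the point is that the response is built from an explicit list of permitted fields, so a newly added sensitive column is excluded by default.

```python
# Hypothetical allowlist of fields that may leave the API.
PUBLIC_USER_FIELDS = ("id", "name", "email", "created_at")


def to_public_user(record: dict) -> dict:
    """Project a stored user record onto the public allowlist."""
    return {field: record[field] for field in PUBLIC_USER_FIELDS if field in record}


stored = {
    "id": 42,
    "name": "Ada",
    "email": "ada@example.com",
    "password_hash": "not-for-clients",
    "ssn": "000-00-0000",
}
print(to_public_user(stored))
```

An allowlist fails closed: forgetting to update it hides a field, whereas forgetting to update a denylist leaks one.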

---

### 3. Remove Plaintext Credentials from Code

**Detection patterns:**
```
# Patterns to search for in all files:
password\s*=\s*["'][^"']+["']
api_key\s*=\s*["'][^"']+["']
secret\s*=\s*["'][^"']+["']
token\s*=\s*["'][^"']+["']
connectionString\s*=\s*["'][^"']+["']
```
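
A minimal sketch of how the patterns above could be applied across a source tree. The function names, the file-suffix list, and the case-insensitive matching are assumptions added for illustration; a dedicated secret scanner will catch far more than these five regular expressions.

```python
import re
from pathlib import Path

# The five variable names from the detection patterns above.
SECRET_NAMES = ["password", "api_key", "secret", "token", "connectionString"]
SECRET_PATTERNS = [
    re.compile(name + r"""\s*=\s*["'][^"']+["']""", re.IGNORECASE)
    for name in SECRET_NAMES
]


def scan_text(text: str):
    """Return (line_number, stripped_line) pairs that match any pattern."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in SECRET_PATTERNS):
            hits.append((number, line.strip()))
    return hits


def scan_tree(root: str, suffixes=(".py", ".ts", ".cs", ".java")):
    """Scan files under root whose suffix is in the (assumed) suffix list."""
    findings = {}
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            hits = scan_text(path.read_text(errors="ignore"))
            if hits:
                findings[str(path)] = hits
    return findings


sample = 'DEBUG = True\npassword = "hunter2"\napi_key = os.environ["API_KEY"]\n'
print(scan_text(sample))
```

Only the hardcoded literal is reported; the line that reads the key from the environment does not match because no quoted value follows the `=` directly.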

**Fix pattern:**
```python
# VULNERABLE
DATABASE_URL = "postgresql://user:p@ssw0rd@prod-db.example.com/mydb"

# SECURE
import os
DATABASE_URL = os.environ["DATABASE_URL"]  # raises KeyError at startup if unset
# In production: use Azure Key Vault, AWS Secrets Manager, or GCP Secret Manager
```

---

## P1: Fix This Week

### 4. Field-Level Encryption for Tier 1 Data

Encrypt sensitive fields **before** storing them. The encryption key lives in a KMS, not in the database.

**Python / SQLAlchemy + Azure Key Vault:**
```python
from azure.keyvault.secrets import SecretClient  # used to fetch the key; retrieval not shown here
from cryptography.fernet import Fernet

# Encrypt at write time
def encrypt_field(value: str, key: bytes) -> str:
    f = Fernet(key)
    return f.encrypt(value.encode()).decode()

# Decrypt at read time (only when authorized)
def decrypt_field(encrypted_value: str, key: bytes) -> str:
    f = Fernet(key)
    return f.decrypt(encrypted_value.encode()).decode()
```

**Node.js / Prisma + AWS KMS:**
```typescript
import { KMSClient, EncryptCommand, DecryptCommand } from "@aws-sdk/client-kms";

const kms = new KMSClient({ region: "us-east-1" });

async function encryptField(plaintext: string): Promise<string> {
  const { CiphertextBlob } = await kms.send(new EncryptCommand({
    KeyId: process.env.KMS_KEY_ARN,
    Plaintext: Buffer.from(plaintext),
  }));
  return Buffer.from(CiphertextBlob!).toString('base64');
}
```

**C# / EF Core + Azure Key Vault:**
```csharp
// Use Always Encrypted for SQL Server / Azure SQL
// Or manually encrypt with Azure Key Vault
services.AddDbContext<AppDbContext>(options =>
    options.UseSqlServer(connectionString)
           // EnableSensitiveDataLogging lives on DbContextOptionsBuilder,
           // not on the SQL Server options builder
           .EnableSensitiveDataLogging(false));

// In entity:
[Column(TypeName = "nvarchar(500)")]
public string EncryptedSsn { get; set; } // store Base64 ciphertext
```

**Fields that MUST be field-encrypted (Tier 1):**
- SSN / national ID numbers
- Passport numbers
- Full payment card numbers (better: use tokenization, see below)
- Medical record data / diagnoses
- Biometric templates

---

### 5. Tokenize Payment Card Data

**Never store full card numbers.** Use a PCI-compliant vault instead.

**Recommended providers:**
- Stripe (tokenizes via Elements/PaymentIntents, so you never touch card numbers)
- Braintree / PayPal
- Adyen
- Square

**Pattern:**
```typescript
// CORRECT: use Stripe's tokenization
const paymentMethod = await stripe.paymentMethods.create({
  type: 'card',
  card: { token: cardToken }, // token from client-side Stripe.js
});
// Store: paymentMethod.id (token), never the card number

// WRONG: never do this
const cardNumber = req.body.cardNumber; // Tier 1 card data, PCI-DSS violation
await db.save({ userId, cardNumber });  // DO NOT store raw card data
```

---

### 6. Remove PII from Log Statements

**Pattern to search for and fix:**
```python
# VULNERABLE
logger.info(f"User {user.email} logged in")
logger.debug(f"Payment by {user.full_name}, card ending {card_last4}")

# SECURE: log opaque identifiers, not PII
logger.info(f"User {user.id} authenticated", extra={"user_id": user.id})
logger.debug("Payment processed", extra={"user_id": user.id, "payment_id": payment_id})
```

```typescript
// VULNERABLE
console.log(`Processing order for ${user.email} at ${user.address}`);

// SECURE
logger.info('Processing order', { userId: user.id, orderId: order.id });
```

**Structured logging fields that are SAFE to log:**
- Internal user ID (UUID/opaque)
- Session ID (if short-lived and not externally shared)
- Transaction/correlation IDs
- Error codes and error types
- Timestamps
- HTTP status codes
- Duration/latency

**Structured logging fields that are UNSAFE:**
- Email addresses
- IP addresses (log only with the last octet masked)
- Full names
- Phone numbers
- Any Tier 1–3 sensitive fields
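
As a safety net for the unsafe fields listed above, a logging filter can scrub messages before they are written. The sketch below uses only the standard library; the `PiiScrubFilter` name, the regular expressions, and the replacement strings are assumptions, and pattern-based scrubbing is a backstop rather than a substitute for not logging PII in the first place.

```python
import logging
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Captures the first three octets so the last one can be masked.
IPV4_RE = re.compile(r"\b(\d{1,3}\.\d{1,3}\.\d{1,3})\.\d{1,3}\b")


def scrub(message: str) -> str:
    """Redact email addresses and mask the last octet of IPv4 addresses."""
    message = EMAIL_RE.sub("[email redacted]", message)
    return IPV4_RE.sub(r"\1.xxx", message)


class PiiScrubFilter(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        # Format the message first so PII passed via args is scrubbed too.
        record.msg = scrub(record.getMessage())
        record.args = ()
        return True  # never drop the record, only rewrite it


logger = logging.getLogger("app")
logger.addFilter(PiiScrubFilter())
```

Note that the IPv4 pattern will also rewrite any dotted four-part number, such as some version strings, which is usually an acceptable cost for a log scrubber.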

---

## P2: Fix This Sprint

### 7. Implement Data Access Audit Logging

Every read/write of Tier 1 and Tier 2 data must be logged to an immutable audit log.

**What to log:**
```
{
  timestamp: ISO8601,
  actor_id: "user UUID",
  actor_role: "admin|user|service",
  action: "READ|WRITE|DELETE|EXPORT",
  resource_type: "User|HealthRecord|PaymentMethod",
  resource_id: "UUID of accessed record",
  fields_accessed: ["email", "phone"],  // NOT the values
  ip_address: "masked IP",
  result: "success|denied",
  correlation_id: "request trace ID"
}
```

**Do NOT log the actual sensitive field values in the audit log.**

**Separation:** Store audit logs in a **separate** database/storage account with stricter access controls than the application database.
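
One way to produce records in the shape described in this section is a small builder that records field names only and masks the IP address. The function names, the zeroed last octet, and the generated correlation ID are assumptions for illustration; writing the event to a separate, append-only store is left out.

```python
import json
import uuid
from datetime import datetime, timezone

ALLOWED_ACTIONS = {"READ", "WRITE", "DELETE", "EXPORT"}


def mask_ip(ip: str) -> str:
    """Zero the last octet of an IPv4 address; drop anything else."""
    parts = ip.split(".")
    if len(parts) == 4:
        return ".".join(parts[:3] + ["0"])
    return "masked"  # non-IPv4 input is not recorded in this sketch


def build_audit_event(actor_id, actor_role, action, resource_type,
                      resource_id, fields_accessed, ip_address,
                      result, correlation_id=None):
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {action}")
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "actor_id": actor_id,
        "actor_role": actor_role,
        "action": action,
        "resource_type": resource_type,
        "resource_id": resource_id,
        "fields_accessed": sorted(fields_accessed),  # names only, never values
        "ip_address": mask_ip(ip_address),
        "result": result,
        "correlation_id": correlation_id or str(uuid.uuid4()),
    }


event = build_audit_event(
    actor_id="user-123", actor_role="admin", action="READ",
    resource_type="User", resource_id="user-456",
    fields_accessed=["phone", "email"], ip_address="203.0.113.42",
    result="success",
)
print(json.dumps(event, indent=2))
```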

---

### 8. Rate Limit Sensitive Endpoints

Prevents automated bulk data harvesting even if an auth vulnerability exists.

```typescript
// Express + express-rate-limit
import rateLimit from 'express-rate-limit';

// Aggressive limit for data export endpoint
const exportLimiter = rateLimit({
  windowMs: 60 * 60 * 1000,  // 1 hour
  max: 5,                    // max 5 exports per hour per IP
  message: 'Too many export requests'
});

// Standard limit for data lookup
const lookupLimiter = rateLimit({
  windowMs: 15 * 60 * 1000,  // 15 minutes
  max: 100
});

app.get('/api/export', exportLimiter, authMiddleware, exportController);
app.get('/api/users/:id', lookupLimiter, authMiddleware, userController);
```
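
To make the mechanism concrete, here is a minimal in-memory fixed-window counter, one simple way such a limiter can work. The `FixedWindowLimiter` class is hypothetical and is not how any particular library is implemented; a production deployment would normally use a shared store so that limits hold across multiple server processes.

```python
import time


class FixedWindowLimiter:
    """Allow at most max_requests per key within each window."""

    def __init__(self, max_requests: int, window_seconds: float, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock      # injectable so the limiter can be tested
        self._windows = {}      # key -> (window_start, request_count)

    def allow(self, key: str) -> bool:
        now = self.clock()
        start, count = self._windows.get(key, (now, 0))
        if now - start >= self.window_seconds:
            start, count = now, 0  # window expired: start a fresh one
        if count >= self.max_requests:
            self._windows[key] = (start, count)
            return False
        self._windows[key] = (start, count + 1)
        return True


# Mirrors the export limit above: 5 requests per hour per client key.
export_limiter = FixedWindowLimiter(max_requests=5, window_seconds=3600)
results = [export_limiter.allow("203.0.113.7") for _ in range(6)]
print(results)
```

The key can be a client IP, as in the Express example, or an authenticated user ID; keying on the user ID limits harvesting through a single compromised account even when requests come from many addresses.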

---

## P3: Fix This Quarter

### 9. Implement Data Retention and Auto-Deletion

**Every table with personal data must have a defined retention policy.**

```sql
-- Add retention column to all PII tables
ALTER TABLE users ADD COLUMN retention_expires_at TIMESTAMP;
ALTER TABLE health_records ADD COLUMN retention_expires_at TIMESTAMP;

-- Set retention at insert time
INSERT INTO users (email, retention_expires_at)
VALUES ($1, NOW() + INTERVAL '7 years');

-- Scheduled job to hard-delete expired records (or anonymize)
DELETE FROM users
WHERE retention_expires_at < NOW()
  AND deletion_notified_at IS NOT NULL;  -- ensure user was notified
```

**Python scheduled cleanup:**
```python
from apscheduler.schedulers.asyncio import AsyncIOScheduler

async def purge_expired_records():
    await db.execute(
        "DELETE FROM user_sessions WHERE expires_at < NOW()"
    )
    # Anonymize users (don't delete if financial records must be retained)
    await db.execute("""
        UPDATE users SET
            email = CONCAT('deleted_', id, '@redacted.invalid'),
            phone = NULL,
            address = NULL,
            date_of_birth = NULL,
            deleted_at = NOW()  -- mark as processed so the row is skipped next run
        WHERE retention_expires_at < NOW() AND deleted_at IS NULL
    """)

scheduler = AsyncIOScheduler()
scheduler.add_job(purge_expired_records, 'cron', hour=2)  # 2 AM daily
scheduler.start()
```
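
The anonymization step can be tried end to end against an in-memory SQLite database. This is a sketch under stated assumptions: the schema is reduced to a few columns, SQLite's `||` concatenation and ISO-8601 text timestamps stand in for the `CONCAT` and `NOW()` used above, and the `anonymize_expired` helper is hypothetical. It stamps `deleted_at` so that a second run does not touch the same rows again.

```python
import sqlite3


def anonymize_expired(conn: sqlite3.Connection, now_iso: str) -> int:
    """Anonymize users whose retention period has passed; return rows changed."""
    cursor = conn.execute(
        """
        UPDATE users SET
            email = 'deleted_' || id || '@redacted.invalid',
            phone = NULL,
            deleted_at = :now
        WHERE retention_expires_at < :now AND deleted_at IS NULL
        """,
        {"now": now_iso},
    )
    conn.commit()
    return cursor.rowcount


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, phone TEXT, "
    "retention_expires_at TEXT, deleted_at TEXT)"
)
conn.executemany(
    "INSERT INTO users (email, phone, retention_expires_at) VALUES (?, ?, ?)",
    [
        ("old@example.com", "555-0100", "2020-01-01T00:00:00"),
        ("new@example.com", "555-0101", "2999-01-01T00:00:00"),
    ],
)
changed = anonymize_expired(conn, "2025-01-01T00:00:00")
print(changed, conn.execute("SELECT email, phone FROM users ORDER BY id").fetchall())
```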

---

### 10. Pseudonymize Behavioral and Analytics Data

Replace direct user identifiers in analytics with pseudonymous tokens.

```python
import hashlib
import hmac
import os
from datetime import datetime, timezone

# Secret HMAC key, stored in Key Vault; fails fast with KeyError if unset
PSEUDONYM_SALT = os.environ["PSEUDONYM_SALT"]

def pseudonymize_user_id(real_user_id: str) -> str:
    """
    One-way: analyst can track behavior across sessions
    but cannot identify the real user without the salt.
    """
    return hmac.new(
        PSEUDONYM_SALT.encode(),
        real_user_id.encode(),
        hashlib.sha256
    ).hexdigest()

# In analytics event
analytics.track({
    "user_id": pseudonymize_user_id(str(user.id)),  # NOT real user ID
    "event": "page_viewed",
    "page": request.path,
    "timestamp": datetime.now(timezone.utc).isoformat()
})
```

---

## Quick Win Checklist (Complete in < 1 day)

- [ ] Search all files for hardcoded secrets → move to env vars / Key Vault
- [ ] Check all `SELECT *` queries → add explicit column list excluding sensitive fields
- [ ] Verify storage buckets/containers → block public access
- [ ] Remove `console.log` / `logger.debug` calls that print request bodies
- [ ] Add `HttpOnly; Secure; SameSite=Strict` to all session cookies
- [ ] Verify that `/api/admin/*` routes require admin role check
- [ ] Confirm password reset tokens expire in < 15 minutes
- [ ] Check that 500 error responses don't include stack traces in production
- [ ] Verify `.env` and secret files are in `.gitignore`
- [ ] Run `git log --all --full-history -- "*.env"` to check for historical secret commits
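
For the session-cookie item in the checklist, the sketch below shows the three attributes being set with Python's standard library. The cookie name `session` and the helper function are assumptions; most web frameworks expose equivalent settings in their own configuration.

```python
from http.cookies import SimpleCookie


def session_cookie_header(session_id: str) -> str:
    """Build a Set-Cookie value with HttpOnly, Secure, and SameSite=Strict."""
    cookie = SimpleCookie()
    cookie["session"] = session_id
    cookie["session"]["httponly"] = True     # not readable from JavaScript
    cookie["session"]["secure"] = True       # sent over HTTPS only
    cookie["session"]["samesite"] = "Strict" # not sent on cross-site requests
    cookie["session"]["path"] = "/"
    return cookie["session"].OutputString()


header = session_cookie_header("abc123")
print(header)
```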

---

## Blast Radius Reduction by Control Applied

When reporting the hardening roadmap, use these estimates:

| Control Applied | Blast Radius Reduction | Justification |
|-----------------|------------------------|---------------|
| Fix all IDOR vulnerabilities | 80–90% | Most breach scenarios exploit authorization flaws |
| Field encryption for T1 data | 75–85% | Encrypted data is useless without KMS key |
| Remove PII from logs | 40–60% | Log access is often less controlled than DB access |
| Tokenize payment data | 95% for card data | Standard PCI-DSS compliance eliminates card data scope |
| Rate limit data endpoints | 30–50% | Limits scale of automated harvesting attacks |
| Data retention enforcement | 20–40% | Reduces "data lake" effect: less data to steal |
| Audit logging + anomaly detection | 0% prevention, but -60% detection time | Breaches are caught faster |
| Pseudonymization of analytics | 60–70% for analytics data | Analytics data decoupled from identity |
| Architecture: separate analytics from PII | 50–70% | Breach of analytics store has no PII value |