# Hardening Playbook
Prioritized controls to reduce data breach blast radius, grouped by priority tier and impact category, with tech-stack-specific implementation patterns. Each control includes a blast radius reduction estimate.
How to use: After identifying exposure vectors, match each to a control below, then sort your hardening roadmap by `(Blast_Radius_Reduction × Severity) / Effort`.
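As a worked example, here is a minimal scoring sketch in Python (the control entries and the 1–5 severity/effort scales are illustrative assumptions, not values prescribed by this playbook):

```python
# Hypothetical scoring helper: rank controls by (reduction * severity) / effort.
# Assumed scales: reduction as a 0-1 fraction, severity 1-5, effort 1-5.
controls = [
    {"name": "Fix IDOR on /api/orders", "reduction": 0.90, "severity": 5, "effort": 1},
    {"name": "Field-encrypt SSN column", "reduction": 0.80, "severity": 5, "effort": 3},
    {"name": "Rate-limit /api/export", "reduction": 0.60, "severity": 3, "effort": 1},
]

def score(control: dict) -> float:
    return (control["reduction"] * control["severity"]) / control["effort"]

# Highest score = fix first
for control in sorted(controls, key=score, reverse=True):
    print(f"{score(control):5.2f}  {control['name']}")
```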
## Control Priority Matrix
| Priority | Control | Blast Radius Reduction | Effort | Category |
|---|---|---|---|---|
| P0 | Fix IDOR/BOLA — add ownership checks | 90% for affected vector | Low | Authorization |
| P0 | Remove sensitive fields from API responses | 85% for affected fields | Low | Data Minimization |
| P0 | Revoke publicly accessible storage (S3/Blob) | 100% for affected store | Low | Access Control |
| P0 | Remove plaintext credentials from code/logs | 100% for affected secret | Low | Secrets |
| P1 | Add field-level encryption for T1 data | 80% for encrypted fields | Medium | Encryption |
| P1 | Mask/tokenize PCI card data | 95% for card exposure | Medium | Tokenization |
| P1 | Remove PII from log statements | 70% for log exposure | Medium | Logging |
| P1 | Add authentication to unauthenticated endpoints | 95% for exposed endpoints | Low | Authentication |
| P2 | Implement data access audit logging | -50% detection time | Medium | Monitoring |
| P2 | Enable database activity monitoring | -60% detection time | Medium | Monitoring |
| P2 | Add rate limiting to sensitive endpoints | 60% reduction in data harvesting | Low | Rate Limiting |
| P2 | Column-level encryption for T2 sensitive data | 70% for encrypted columns | Medium | Encryption |
| P3 | Implement data retention + auto-deletion | 40% reduction in stale data exposure | High | Data Lifecycle |
| P3 | Separate analytics store from production PII | 60% for analytics breach | High | Architecture |
| P3 | Pseudonymize behavioral tracking data | 70% for behavioral data | Medium | Pseudonymization |
## P0 — Fix Immediately (< 1 day)

### 1. Fix Authorization: IDOR / BOLA
What it fixes: Broken Object Level Authorization — users can access other users' data by changing an ID.
Detection pattern in code:
```python
# VULNERABLE — no ownership check
@app.get("/api/orders/{order_id}")
def get_order(order_id: int):
    return db.query(Order).filter(Order.id == order_id).first()

# SECURE — ownership check
@app.get("/api/orders/{order_id}")
def get_order(order_id: int, current_user: User = Depends(get_current_user)):
    order = db.query(Order).filter(
        Order.id == order_id,
        Order.user_id == current_user.id,  # ownership check
    ).first()
    if not order:
        raise HTTPException(status_code=404)
    return order
```
```javascript
// VULNERABLE
app.get('/api/users/:id/profile', authenticate, async (req, res) => {
  const user = await User.findById(req.params.id);
  res.json(user);
});

// SECURE
app.get('/api/users/:id/profile', authenticate, async (req, res) => {
  if (req.params.id !== req.user.id && !req.user.isAdmin) {
    return res.status(403).json({ error: 'Forbidden' });
  }
  const user = await User.findById(req.params.id);
  res.json(user);
});
```
```csharp
// VULNERABLE
[HttpGet("orders/{orderId}")]
public async Task<IActionResult> GetOrder(int orderId)
{
    var order = await _db.Orders.FindAsync(orderId);
    return Ok(order);
}

// SECURE
[HttpGet("orders/{orderId}")]
[Authorize]
public async Task<IActionResult> GetOrder(int orderId)
{
    var userId = User.FindFirst(ClaimTypes.NameIdentifier)?.Value;
    var order = await _db.Orders
        .Where(o => o.Id == orderId && o.UserId == userId)
        .FirstOrDefaultAsync();
    if (order == null) return NotFound();
    return Ok(order);
}
```
### 2. Remove Sensitive Fields from API Responses
What it fixes: Over-fetching — APIs return more data than the client needs.
Pattern:
```javascript
// VULNERABLE — returns all fields including passwordHash, ssn
const user = await User.findById(id);
res.json(user);

// SECURE — explicit projection
const user = await User.findById(id).select('id name email createdAt');
res.json(user);
```
```python
# SECURE — Pydantic response model (FastAPI)
class UserPublicResponse(BaseModel):
    id: int
    name: str
    email: str
    # NOTE: password_hash, ssn, date_of_birth NOT included

@app.get("/api/users/{id}", response_model=UserPublicResponse)
def get_user(id: int):
    return db.query(User).filter(User.id == id).first()
```
```java
// SECURE — dedicated response DTO; sensitive fields simply don't exist on it
public class UserResponse {
    public String id;
    public String name;
    public String email;
    // passwordHash, ssn not included in the DTO
    // (alternatively, annotate entity fields with @JsonIgnore)
}
```
### 3. Remove Plaintext Credentials from Code
Detection patterns:
```text
# Patterns to search for in all files:
password\s*=\s*["'][^"']+["']
api_key\s*=\s*["'][^"']+["']
secret\s*=\s*["'][^"']+["']
token\s*=\s*["'][^"']+["']
connectionString\s*=\s*["'][^"']+["']
```
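A minimal scanner sketch that applies these patterns (Python; the file walk, skip rules, and output format are illustrative):

```python
import re
from pathlib import Path

# Compile the detection patterns above, case-insensitively
SECRET_PATTERNS = [re.compile(p, re.IGNORECASE) for p in (
    r'password\s*=\s*["\'][^"\']+["\']',
    r'api_key\s*=\s*["\'][^"\']+["\']',
    r'secret\s*=\s*["\'][^"\']+["\']',
    r'token\s*=\s*["\'][^"\']+["\']',
    r'connectionString\s*=\s*["\'][^"\']+["\']',
)]

for path in Path(".").rglob("*"):
    if not path.is_file() or ".git" in path.parts:
        continue
    try:
        text = path.read_text(errors="ignore")
    except OSError:
        continue
    for lineno, line in enumerate(text.splitlines(), start=1):
        if any(p.search(line) for p in SECRET_PATTERNS):
            print(f"{path}:{lineno}: possible hardcoded secret")
```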
Fix pattern:
```python
# VULNERABLE
DATABASE_URL = "postgresql://user:p@ssw0rd@prod-db.example.com/mydb"

# SECURE
import os
DATABASE_URL = os.environ.get("DATABASE_URL")
# In production: use Azure Key Vault, AWS Secrets Manager, or GCP Secret Manager
```
## P1 — Fix This Week

### 4. Field-Level Encryption for Tier 1 Data
Encrypt sensitive fields before storing them. The encryption key lives in a KMS, not in the database.
Python / SQLAlchemy + Azure Key Vault:
```python
import os
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient
from cryptography.fernet import Fernet

# Fetch the Fernet key from Key Vault at startup
# (the env var and secret name here are illustrative)
vault = SecretClient(
    vault_url=os.environ["KEY_VAULT_URL"],
    credential=DefaultAzureCredential(),
)
key = vault.get_secret("field-encryption-key").value.encode()

# Encrypt at write time
def encrypt_field(value: str, key: bytes) -> str:
    f = Fernet(key)
    return f.encrypt(value.encode()).decode()

# Decrypt at read time (only when authorized)
def decrypt_field(encrypted_value: str, key: bytes) -> str:
    f = Fernet(key)
    return f.decrypt(encrypted_value.encode()).decode()
```
Node.js / Prisma + AWS KMS:
```typescript
import { KMSClient, EncryptCommand, DecryptCommand } from "@aws-sdk/client-kms";

const kms = new KMSClient({ region: "us-east-1" });

async function encryptField(plaintext: string): Promise<string> {
  const { CiphertextBlob } = await kms.send(new EncryptCommand({
    KeyId: process.env.KMS_KEY_ARN,
    Plaintext: Buffer.from(plaintext),
  }));
  return Buffer.from(CiphertextBlob!).toString('base64');
}

async function decryptField(ciphertext: string): Promise<string> {
  // KMS identifies the key from metadata embedded in the ciphertext
  const { Plaintext } = await kms.send(new DecryptCommand({
    CiphertextBlob: Buffer.from(ciphertext, 'base64'),
  }));
  return Buffer.from(Plaintext!).toString('utf8');
}
```
C# / EF Core + Azure Key Vault:
```csharp
// Use Always Encrypted for SQL Server / Azure SQL,
// or manually encrypt with Azure Key Vault.
services.AddDbContext<AppDbContext>(options =>
    options.UseSqlServer(connectionString)
           .EnableSensitiveDataLogging(false)); // never log entity values

// In the entity:
[Column(TypeName = "nvarchar(500)")]
public string EncryptedSsn { get; set; } // store Base64 ciphertext
```
Fields that MUST be field-encrypted (Tier 1; a usage sketch follows this list):
- SSN / national ID numbers
- Passport numbers
- Full payment card numbers (better: use tokenization, see below)
- Medical record data / diagnoses
- Biometric templates
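A usage sketch tying the helpers above to a model (Python/SQLAlchemy; the `User` model, `session`, and column name are illustrative):

```python
# Hypothetical write path: the SSN is ciphertext before it reaches the database.
def create_user(session, name: str, ssn: str, key: bytes):
    user = User(
        name=name,
        encrypted_ssn=encrypt_field(ssn, key),  # Fernet ciphertext at rest
    )
    session.add(user)
    session.commit()
    return user

# Read path: decrypt only after an authorization check has passed.
def get_user_ssn(session, user_id: int, key: bytes) -> str:
    user = session.get(User, user_id)
    return decrypt_field(user.encrypted_ssn, key)
```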
### 5. Tokenize Payment Card Data
Never store full card numbers. Use a PCI-compliant vault instead.
Recommended providers:
- Stripe (tokenizes via Elements/PaymentIntents — you never touch card numbers)
- Braintree / PayPal
- Adyen
- Square
Pattern:
```javascript
// CORRECT — use Stripe's tokenization
const paymentMethod = await stripe.paymentMethods.create({
  type: 'card',
  card: { token: cardToken }, // token from client-side Stripe.js
});
// Store: paymentMethod.id (token) — never the card number

// WRONG — never do this
const cardNumber = req.body.cardNumber; // Tier 1 data; a PCI-DSS violation
await db.save({ userId, cardNumber }); // DO NOT store raw card data
```
### 6. Remove PII from Log Statements
Pattern to search for and fix:
```python
# VULNERABLE
logger.info(f"User {user.email} logged in")
logger.debug(f"Payment by {user.full_name}, card ending {card_last4}")

# SECURE — log opaque identifiers, not PII
logger.info(f"User {user.id} authenticated", extra={"user_id": user.id})
logger.debug("Payment processed", extra={"user_id": user.id, "payment_id": payment_id})
```

```javascript
// VULNERABLE
console.log(`Processing order for ${user.email} at ${user.address}`);

// SECURE
logger.info('Processing order', { userId: user.id, orderId: order.id });
```
Structured logging fields that are SAFE to log:
- Internal user ID (UUID/opaque)
- Session ID (if short-lived and not externally shared)
- Transaction/correlation IDs
- Error codes and error types
- Timestamps
- HTTP status codes
- Duration/latency
Structured logging fields that are UNSAFE (a scrubbing-filter sketch follows this list):
- Email addresses
- IP addresses (log only in masked form, e.g. with the final octet zeroed)
- Full names
- Phone numbers
- Any Tier 1–3 sensitive fields
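To enforce these rules mechanically, here is a minimal scrubbing-filter sketch using Python's standard `logging` module (the regexes cover only emails and IPv4 addresses; the other unsafe fields would need their own patterns):

```python
import logging
import re

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
IPV4_RE = re.compile(r"\b(\d{1,3}\.\d{1,3}\.\d{1,3})\.\d{1,3}\b")

class PiiScrubFilter(logging.Filter):
    """Redacts emails and zeroes the final IPv4 octet in log messages."""
    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()
        message = EMAIL_RE.sub("[email-redacted]", message)
        message = IPV4_RE.sub(r"\1.0", message)
        record.msg, record.args = message, None
        return True

logger = logging.getLogger("app")
logger.addFilter(PiiScrubFilter())
# logger.info("login from 203.0.113.42 by a@b.com")
# -> "login from 203.0.113.0 by [email-redacted]"
```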
## P2 — Fix This Sprint

### 7. Implement Data Access Audit Logging
Every read/write of Tier 1 and Tier 2 data must be logged to an immutable audit log.
What to log:
```javascript
{
  timestamp: "ISO8601",
  actor_id: "user UUID",
  actor_role: "admin|user|service",
  action: "READ|WRITE|DELETE|EXPORT",
  resource_type: "User|HealthRecord|PaymentMethod",
  resource_id: "UUID of accessed record",
  fields_accessed: ["email", "phone"], // NOT the values
  ip_address: "masked IP",
  result: "success|denied",
  correlation_id: "request trace ID"
}
```
Do NOT log the actual sensitive field values in the audit log.
Separation: Store audit logs in a separate database/storage account with stricter access controls than the application database.
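A minimal writer sketch for this schema (Python; `audit_db.insert` and `mask_ip` are illustrative stand-ins for your audit store's client and IP-masking helper):

```python
import uuid
from datetime import datetime, timezone

def mask_ip(ip: str) -> str:
    # Zero the final octet so the address is no longer a direct identifier
    parts = ip.split(".")
    return ".".join(parts[:3] + ["0"]) if len(parts) == 4 else "masked"

def write_audit_event(audit_db, *, actor_id: str, actor_role: str, action: str,
                      resource_type: str, resource_id: str,
                      fields_accessed: list, ip_address: str,
                      result: str, correlation_id: str = "") -> None:
    audit_db.insert("audit_events", {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "actor_id": actor_id,
        "actor_role": actor_role,            # admin|user|service
        "action": action,                    # READ|WRITE|DELETE|EXPORT
        "resource_type": resource_type,
        "resource_id": resource_id,
        "fields_accessed": fields_accessed,  # field NAMES only, never values
        "ip_address": mask_ip(ip_address),
        "result": result,                    # success|denied
        "correlation_id": correlation_id or str(uuid.uuid4()),
    })
```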
### 8. Rate Limit Sensitive Endpoints
Prevents automated bulk data harvesting even if an auth vulnerability exists.
```javascript
// Express + express-rate-limit
import rateLimit from 'express-rate-limit';

// Aggressive limit for data export endpoint
const exportLimiter = rateLimit({
  windowMs: 60 * 60 * 1000, // 1 hour
  max: 5, // max 5 exports per hour per IP
  message: 'Too many export requests'
});

// Standard limit for data lookup
const lookupLimiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100
});

app.get('/api/export', exportLimiter, authMiddleware, exportController);
app.get('/api/users/:id', lookupLimiter, authMiddleware, userController);
```
## P3 — Fix This Quarter

### 9. Implement Data Retention and Auto-Deletion
Every table with personal data must have a defined retention policy.
```sql
-- Add retention column to all PII tables
ALTER TABLE users ADD COLUMN retention_expires_at TIMESTAMP;
ALTER TABLE health_records ADD COLUMN retention_expires_at TIMESTAMP;

-- Set retention at insert time
INSERT INTO users (email, retention_expires_at)
VALUES ($1, NOW() + INTERVAL '7 years');

-- Scheduled job to hard-delete expired records (or anonymize)
DELETE FROM users
WHERE retention_expires_at < NOW()
  AND deletion_notified_at IS NOT NULL; -- ensure user was notified
```
Python scheduled cleanup:
```python
from apscheduler.schedulers.asyncio import AsyncIOScheduler

async def purge_expired_records():
    await db.execute(
        "DELETE FROM user_sessions WHERE expires_at < NOW()"
    )
    # Anonymize users (don't delete if financial records must be retained)
    await db.execute("""
        UPDATE users SET
            email = CONCAT('deleted_', id, '@redacted.invalid'),
            phone = NULL,
            address = NULL,
            date_of_birth = NULL
        WHERE retention_expires_at < NOW() AND deleted_at IS NULL
    """)

scheduler = AsyncIOScheduler()
scheduler.add_job(purge_expired_records, 'cron', hour=2)  # 2 AM daily
scheduler.start()
```
### 10. Pseudonymize Behavioral and Analytics Data
Replace direct user identifiers in analytics with pseudonymous tokens.
```python
import hashlib
import hmac
import os

PSEUDONYM_SALT = os.environ.get("PSEUDONYM_SALT")  # stored in Key Vault

def pseudonymize_user_id(real_user_id: str) -> str:
    """
    One-way: an analyst can track behavior across sessions
    but cannot identify the real user without the salt.
    """
    return hmac.new(
        PSEUDONYM_SALT.encode(),
        real_user_id.encode(),
        hashlib.sha256,
    ).hexdigest()

# In an analytics event
analytics.track({
    "user_id": pseudonymize_user_id(user.id),  # NOT the real user ID
    "event": "page_viewed",
    "page": request.path,
    "timestamp": datetime.utcnow().isoformat(),
})
```
## Quick Win Checklist (Complete in < 1 day)
- Search all files for hardcoded secrets → move to env vars / Key Vault
- Check all `SELECT *` queries → add explicit column lists excluding sensitive fields
- Verify storage buckets/containers → block public access
- Remove `console.log` / `logger.debug` calls that print request bodies
- Add `HttpOnly; Secure; SameSite=Strict` to all session cookies
- Verify that `/api/admin/*` routes require an admin role check
- Confirm password reset tokens expire in < 15 minutes
- Check that 500 error responses don't include stack traces in production
- Verify `.env` and secret files are in `.gitignore`
- Run `git log --all --full-history -- "*.env"` to check for historical secret commits
## Blast Radius Reduction by Control Applied
When reporting the hardening roadmap, use these estimates:
| Control Applied | Blast Radius Reduction | Justification |
|---|---|---|
| Fix all IDOR vulnerabilities | 80–90% | Most breach scenarios exploit authorization flaws |
| Field encryption for T1 data | 75–85% | Encrypted data is useless without KMS key |
| Remove PII from logs | 40–60% | Log access is often less controlled than DB access |
| Tokenize payment data | 95% for card data | Standard PCI-DSS compliance eliminates card data scope |
| Rate limit data endpoints | 30–50% | Limits scale of automated harvesting attacks |
| Data retention enforcement | 20–40% | Reduces "data lake" effect — less data to steal |
| Audit logging + anomaly detection | 0% prevention, but -60% detection time | Breaches are caught faster |
| Pseudonymization of analytics | 60–70% for analytics data | Analytics data decoupled from identity |
| Architecture: separate analytics from PII | 50–70% | Breach of analytics store has no PII value |