mirror of
https://github.com/github/awesome-copilot.git
synced 2026-03-19 07:35:17 +00:00
Revert "fetch -> web/fetch for everything"
This reverts commit ca790b1716.
@@ -197,7 +197,7 @@ A JSON representation showing 5-10 representative documents for the container
 "email": "john@example.com"
 },
 {
 "id": "order_456",
 "partitionKey": "user_123",
 "type": "order",
 "userId": "user_123",
@@ -254,7 +254,7 @@ A JSON representation showing 5-10 representative documents for the container
 [Explain the overall trade-offs made and optimizations used as well as why - such as the examples below]

 - **Aggregate Design**: Kept Orders and OrderItems together due to 95% access correlation - trades document size for query performance
 - **Denormalization**: Duplicated user name in Order document to avoid cross-partition lookup - trades storage for performance
 - **Normalization**: Kept User as separate document type from Orders due to low access correlation (15%) - optimizes update costs
 - **Indexing Strategy**: Used selective indexing instead of automatic to balance cost vs additional query needs
 - **Multi-Document Containers**: Used multi-document containers for [access_pattern] to enable transactional consistency
@@ -290,7 +290,7 @@ A JSON representation showing 5-10 representative documents for the container
 - ALWAYS update cosmosdb_requirements.md after each user response with new information
 - ALWAYS treat design considerations in the modeling file as evolving thoughts, not final decisions
 - ALWAYS consider Multi-Document Containers when entities have 30-70% access correlation
 - ALWAYS consider Hierarchical Partition Keys as an alternative to synthetic keys if the initial design recommends synthetic keys
 - ALWAYS consider data binning for massive-scale workloads of uniform events and batch-type writes to optimize size and RU costs
 - **ALWAYS calculate costs accurately** - use realistic document sizes and include all overhead
 - **ALWAYS present a final clean comparison** rather than multiple confusing iterations
@@ -343,7 +343,7 @@ In aggregate-oriented design, Azure Cosmos DB NoSQL offers multiple levels of ag
 Multiple entities combined into a single Cosmos DB document. This provides:

 • Atomic updates across all data in the aggregate
 • Single point-read retrieval for all data. Reference the document by id and partition key via the API (for example, `ReadItemAsync<Order>(id: "order0103", partitionKey: new PartitionKey("TimS1234"));` instead of a query such as `SELECT * FROM c WHERE c.id = "order0103" AND c.partitionKey = "TimS1234"`)
 • Subject to the 2MB document size limit

 When designing aggregates, consider both levels based on your requirements.
@@ -375,7 +375,7 @@ When designing aggregates, consider both levels based on your requirements.
 • **Cross-partition overhead**: Each physical partition adds ~2.5 RU base cost to cross-partition queries
 • **Massive scale implications**: 100+ physical partitions make cross-partition queries extremely expensive and not scalable.
 • Index overhead: Every indexed property consumes storage and write RUs
 • Update patterns: Frequent updates to indexed properties or full document replaces increase RU costs (the larger the document, the greater the RU increase)

 ## Core Design Philosophy

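The cross-partition fan-out arithmetic above can be sketched as a one-line helper (a minimal illustration; the function name and the ~2.5 RU constant are assumptions taken from the bullet, not an SDK API):

```javascript
// Estimate the base fan-out cost of a cross-partition query, using the
// ~2.5 RU per physical partition figure quoted in the text above.
function crossPartitionBaseCost(physicalPartitionCount, perPartitionRu = 2.5) {
  return physicalPartitionCount * perPartitionRu;
}

// With 100+ physical partitions, the base cost alone dwarfs a point read:
console.log(crossPartitionBaseCost(100)); // 250 RUs before any documents are read
```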
@@ -439,7 +439,7 @@ One-to-One: Store the related ID in both documents
 ```json
 // Users container
 { "id": "user_123", "partitionKey": "user_123", "profileId": "profile_456" }
 // Profiles container
 { "id": "profile_456", "partitionKey": "profile_456", "userId": "user_123" }
 ```

@@ -463,10 +463,10 @@ Frequently accessed attributes: Denormalize sparingly

 ```json
 // Orders document
 {
 "id": "order_789",
 "partitionKey": "user_123",
 "customerId": "user_123",
 "customerName": "John Doe" // Include customer name to avoid lookup
 }
 ```
@@ -493,7 +493,7 @@ When deciding aggregate boundaries, use this decision framework:
 Step 1: Analyze Access Correlation

 • >90% accessed together → Strong single-document aggregate candidate
 • 50-90% accessed together → Multi-document container aggregate candidate
 • <50% accessed together → Separate aggregates/containers

 Step 2: Check Constraints
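Step 1 of the framework above can be expressed as a tiny decision function (an illustrative sketch; the function name and return strings are ours, not part of any Cosmos DB API):

```javascript
// Map access correlation (0..1) to an aggregate strategy, using the
// thresholds from the decision framework above.
function chooseAggregateStrategy(accessCorrelation) {
  if (accessCorrelation >= 0.9) return "single-document aggregate";
  if (accessCorrelation >= 0.5) return "multi-document container aggregate";
  return "separate aggregates/containers";
}
```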
@@ -514,8 +514,8 @@ Based on Steps 1 & 2, select:
 Order + OrderItems:

 Access Analysis:
-• web/fetch order without items: 5% (just checking status)
-• web/fetch order with all items: 95% (normal flow)
+• Fetch order without items: 5% (just checking status)
+• Fetch order with all items: 95% (normal flow)
 • Update patterns: Items rarely change independently
 • Combined size: ~50KB average, max 200KB

@@ -587,7 +587,7 @@ Index overhead increases RU costs and storage. It occurs when documents have man
 When making aggregate design decisions:

 • Calculate read cost = frequency × RUs per operation
 • Calculate write cost = frequency × RUs per operation
 • Total cost = Σ(read costs) + Σ(write costs)
 • Choose the design with lower total cost

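The cost model above is straightforward to mechanize (a minimal sketch; the operation shape and function name are illustrative assumptions):

```javascript
// Total RU cost = Σ(frequency × RUs per operation) over all reads and writes.
function totalRuCost(operations) {
  // operations: array of { rps, ruPerOp }
  return operations.reduce((sum, op) => sum + op.rps * op.ruPerOp, 0);
}

// Example with numbers in the style of the order-processing comparison
// later in this file (1000 RPS reads at 5 RU, 100 RPS writes at 10 RU):
const total = totalRuCost([
  { rps: 1000, ruPerOp: 5 }, // reads
  { rps: 100, ruPerOp: 10 }, // writes
]);
console.log(total); // 6000 RU/s
```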
@@ -623,7 +623,7 @@ When facing massive write volumes, **data binning/chunking** can reduce write op
 ```json
 {
 "id": "chunk_001",
 "partitionKey": "account_test_chunk_001",
 "chunkId": 1,
 "records": [
 { "recordId": 1, "data": "..." },
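The binning idea above can be sketched as a pure helper that packs many small records into chunk documents so one write replaces many (the key format mirrors the `account_test_chunk_001` example; the helper name and chunk-size parameter are assumptions):

```javascript
// Pack records into chunk documents for batched writes.
function binRecords(records, chunkSize, keyPrefix) {
  const chunks = [];
  for (let i = 0; i < records.length; i += chunkSize) {
    const chunkNo = chunks.length + 1;
    const id = "chunk_" + String(chunkNo).padStart(3, "0");
    chunks.push({
      id,
      partitionKey: keyPrefix + "_" + id, // e.g. "account_test_chunk_001"
      chunkId: chunkNo,
      records: records.slice(i, i + chunkSize),
    });
  }
  return chunks;
}
```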
@@ -660,7 +660,7 @@ When multiple entity types are frequently accessed together, group them in the s
 [
 {
 "id": "user_123",
 "partitionKey": "user_123",
 "type": "user",
 "name": "John Doe",
 "email": "john@example.com"
@@ -668,7 +668,7 @@ When multiple entity types are frequently accessed together, group them in the s
 {
 "id": "order_456",
 "partitionKey": "user_123",
 "type": "order",
 "userId": "user_123",
 "amount": 99.99
 }
@@ -705,7 +705,7 @@ Promoting to Single Document Aggregate
 When multi-document analysis reveals:

 • Access correlation higher than initially thought (>90%)
-• All documents always web/fetched together
+• All documents always fetched together
 • Combined size remains bounded
 • Would benefit from atomic updates

@@ -728,7 +728,7 @@ Example analysis:

 Product + Reviews Aggregate Analysis:
 - Access pattern: View product details (no reviews) - 70%
 - Access pattern: View product with reviews - 30%
 - Update frequency: Products daily, Reviews hourly
 - Average sizes: Product 5KB, Reviews 200KB total
 - Decision: Multi-document container - low access correlation + size concerns + update mismatch
@@ -741,7 +741,7 @@ Short-circuit denormalization involves duplicating a property from a related ent
 2. The duplicated property is mostly immutable or application can accept stale values
 3. The property is small enough and won't significantly impact RU consumption

-Example: In an e-commerce application, you can duplicate the ProductName from the Product document into each OrderItem document, so that web/fetching order items doesn't require additional queries to retrieve product names.
+Example: In an e-commerce application, you can duplicate the ProductName from the Product document into each OrderItem document, so that fetching order items doesn't require additional queries to retrieve product names.

 ### Identifying relationship

@@ -788,14 +788,14 @@ StudentCourseLessons container:
 "type": "student"
 },
 {
 "id": "course_456",
 "partitionKey": "student_123",
 "type": "course",
 "courseId": "course_456"
 },
 {
 "id": "lesson_789",
 "partitionKey": "student_123",
 "type": "lesson",
 "courseId": "course_456",
 "lessonId": "lesson_789"
@@ -818,7 +818,7 @@ TenantData container:
 ```json
 {
 "id": "record_123",
 "partitionKey": "tenant_456_customer_789",
 "tenantId": "tenant_456",
 "customerId": "customer_789"
 }
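Building the synthetic key shown above is a one-liner; worth sketching only because the separator convention must stay consistent everywhere the key is written or queried (the helper name and `_` convention are assumptions inferred from the `tenant_456_customer_789` example):

```javascript
// Compose a synthetic partition key from tenant and customer IDs.
function syntheticPartitionKey(tenantId, customerId) {
  return tenantId + "_" + customerId;
}

console.log(syntheticPartitionKey("tenant_456", "customer_789")); // "tenant_456_customer_789"
```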
@@ -877,20 +877,20 @@ Azure Cosmos DB doesn't enforce unique constraints beyond the id+partitionKey co
 function createUserWithUniqueEmail(userData) {
     var context = getContext();
     var container = context.getCollection();

     // Check if email already exists
     var query = `SELECT * FROM c WHERE c.email = "${userData.email}"`;

     var isAccepted = container.queryDocuments(
         container.getSelfLink(),
         query,
         function(err, documents) {
             if (err) throw new Error('Error querying documents: ' + err.message);

             if (documents.length > 0) {
                 throw new Error('Email already exists');
             }

             // Email is unique, create the user
             var isAccepted = container.createDocument(
                 container.getSelfLink(),
@@ -900,11 +900,11 @@ function createUserWithUniqueEmail(userData) {
                     context.getResponse().setBody(document);
                 }
             );

             if (!isAccepted) throw new Error('The query was not accepted by the server.');
         }
     );

     if (!isAccepted) throw new Error('The query was not accepted by the server.');
 }
 ```
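The stored procedure above interpolates the email directly into the SQL string. Cosmos DB's server-side `queryDocuments` also accepts a parameterized query spec (`{query, parameters}`), which avoids quoting problems with values containing `"`. A small sketch of building such a spec (the helper name is ours; the spec shape is the standard Cosmos DB SQL query-spec format):

```javascript
// Build a parameterized Cosmos DB query spec instead of interpolating
// the value into the SQL string.
function emailLookupSpec(email) {
  return {
    query: "SELECT * FROM c WHERE c.email = @email",
    parameters: [{ name: "@email", value: email }],
  };
}
```

Inside the stored procedure, this object can be passed to `container.queryDocuments` in place of the raw query string.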
@@ -929,7 +929,7 @@ Hierarchical Partition Keys provide natural query boundaries using multiple fiel
 {
 "partitionKey": {
 "version": 2,
 "kind": "MultiHash",
 "paths": ["/accountId", "/testId", "/chunkId"]
 }
 }
@@ -944,7 +944,7 @@ Hierarchical Partition Keys provide natural query boundaries using multiple fiel
 - Data has natural hierarchy (tenant → user → document)
 - Frequent prefix-based queries
 - Want to eliminate synthetic partition key complexity
 - Apply only for Cosmos NoSQL API

 **Trade-offs**:
 - Requires dedicated tier (not available on serverless)
@@ -963,7 +963,7 @@ Implementation: Add a shard suffix using hash-based or time-based calculation:
 // Hash-based sharding
 partitionKey = originalKey + "_" + (hash(identifier) % shardCount)

 // Time-based sharding
 partitionKey = originalKey + "_" + (currentHour % shardCount)
 ```

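The two suffix strategies above can be made concrete as follows (a sketch under assumptions: the djb2-style hash is an illustrative stand-in for whatever stable hash the application uses, and the function names are ours):

```javascript
// Simple stable string hash (djb2 variant) - any deterministic hash works.
function hashCode(s) {
  let h = 5381;
  for (let i = 0; i < s.length; i++) h = ((h * 33) ^ s.charCodeAt(i)) >>> 0;
  return h;
}

// Hash-based sharding: spread writes by hashing an identifier.
function hashShardKey(originalKey, identifier, shardCount) {
  return originalKey + "_" + (hashCode(identifier) % shardCount);
}

// Time-based sharding: rotate the shard suffix by hour.
function timeShardKey(originalKey, currentHour, shardCount) {
  return originalKey + "_" + (currentHour % shardCount);
}
```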
@@ -993,7 +993,7 @@ EventLog container (problematic):
 • Result: Limited to 10,000 RU/s regardless of total container throughput

 Sharded solution:
 • Partition Key: date + "_" + shard_id (e.g., "2024-07-09_4")
 • Shard calculation: shard_id = hash(event_id) % 15
 • Result: Distributes daily events across 15 partitions

@@ -1002,7 +1002,7 @@ Sharded solution:
 When aggregate boundaries conflict with update patterns, prioritize based on RU cost impact:

 Example: Order Processing System
-• Read pattern: Always web/fetch order with all items (1000 RPS)
+• Read pattern: Always fetch order with all items (1000 RPS)
 • Update pattern: Individual item status updates (100 RPS)

 Option 1 - Combined aggregate (single document):
@@ -1010,7 +1010,7 @@ Option 1 - Combined aggregate (single document):
 - Write cost: 100 RPS × 10 RU (rewrite entire order) = 1000 RU/s

 Option 2 - Separate items (multi-document):
 - Read cost: 1000 RPS × 5 RU (query multiple items) = 5000 RU/s
 - Write cost: 100 RPS × 10 RU (update single item) = 1000 RU/s

 Decision: Option 1 better due to significantly lower read costs despite same write costs
@@ -1029,7 +1029,7 @@ Example: Session tokens with 24-hour expiration
 {
 "id": "sess_abc123",
 "partitionKey": "user_456",
 "userId": "user_456",
 "createdAt": "2024-01-01T12:00:00Z",
 "ttl": 86400
 }
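The `ttl` value above is expressed in seconds, so a 24-hour expiration is 24 × 3600 = 86400. A trivial helper keeps that conversion from being hand-computed (the function name is illustrative, not a Cosmos DB API):

```javascript
// Convert an expiration window in hours to the per-document ttl value
// Cosmos DB expects (seconds).
function ttlSeconds(hours) {
  return hours * 3600;
}

console.log(ttlSeconds(24)); // 86400, as in the session-token example
```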