Free SKILL.md scraped from GitHub. Clone the repo or copy the file directly into your Claude Code skills directory.
npx versuz@latest install guardiatechnology-ahrena-cursor-skills-kata-schema-designgit clone https://github.com/guardiatechnology/ahrena.gitcp ahrena/SKILL.MD ~/.claude/skills/guardiatechnology-ahrena-cursor-skills-kata-schema-design/SKILL.md---
name: kata-schema-design
description: "Design Data Schema. Schema design for new entity, domain, or expansion of existing model — entities, relationships, indexes, migrations, retention"
---
# Kata: Design Data Schema
> **Prefix:** `kata-` | **Type:** Repeatable Skill | **Scope:** Schema design for new entity, domain, or expansion of existing model — entities, relationships, indexes, migrations, retention
## Workflow
```
Progress:
- [ ] 1. Identify entities, value objects, aggregates
- [ ] 2. Model relationships and cardinalities
- [ ] 3. Decide relational vs. NoSQL
- [ ] 4. Define attributes and types
- [ ] 5. Define indexes for access patterns
- [ ] 6. Classify PII and retention policy
- [ ] 7. Migration strategy (if evolution)
- [ ] 8. Persist schema document
```
### Step 1: Entities, value objects, aggregates
Using `codex-data-modeling`:
- List entities with their own identity (`Refund`, `Payment`).
- Identify immutable value objects (`Money`, `Address`).
- Design aggregates: which entity is root; which are internal to the aggregate.
Rule: aggregate = transaction boundary. Small = greater concurrency.
### Step 2: Relationships
For each pair of entities:
- **1:1**: FK + unique constraint (consider merging into same table if always together).
- **1:N**: FK on the "many" entity.
- **N:M**: junction table with own attributes when there is relation metadata.
- **Polymorphic**: avoid; prefer separate tables or interface pattern.
Draw simple ER diagram in Mermaid:
```mermaid
erDiagram
PAYMENT ||--o{ REFUND : "has"
REFUND {
uuid entity_id PK
uuid payment_id FK
bigint amount_cents
string currency_code
string status
timestamp created_at
}
```
### Step 3: Relational vs. NoSQL
Decision tree in `codex-data-modeling`:
- **Aurora PostgreSQL** is the default for Guardia (transactional OLTP).
- **DynamoDB** for known access patterns + massive scale.
- **Mixed** is valid (relational OLTP + read model in DynamoDB/OpenSearch).
If decision is non-trivial → ADR via `kata-adr-write`.
### Step 4: Attributes and types
For each entity:
- Ahrena base attributes (`codex-entities`): `entity_id`, `entity_type`, `version`, `created_at`, `updated_at`, `created_by`, `updated_by`.
- Specific fields with explicit type and constraints.
- Money: `amount_cents` (bigint) + `currency_code` (char(3)) — never float.
- Timestamps: always UTC (`timestamp with time zone` in Postgres, epoch in DynamoDB).
- Enums: validate in code + `CHECK` constraint in DB.
For each column, decide: NOT NULL? DEFAULT? UNIQUE?
### Step 5: Indexes
List main queries (access patterns):
```
Q1: list refunds by payment_id ordered by created_at DESC
Q2: find refund by idempotency_key (unique)
Q3: count refunds in last 24h for fraud rule
```
Derived indexes:
```sql
CREATE INDEX idx_refund_payment_created ON refund (payment_id, created_at DESC);
CREATE UNIQUE INDEX idx_refund_idempotency ON refund (idempotency_key);
-- Q3 can use Q1 index (prefix (created_at DESC)) or need another
```
Check with `EXPLAIN` when possible.
### Step 6: PII and retention
Classify each column:
| Column | Class | Retention |
|---|---|---|
| `entity_id` | system | 7y (audit) |
| `customer_cpf` | PII | LGPD — soft delete 5y inactive |
| `amount_cents` | transactional | 7y (legal) |
| `notes` (free text) | potential PII | avoid free; if necessary, 90d retention |
Update `docs/data-retention.yaml` (`lex-data-retention`).
### Step 7: Migration (if evolution)
If it is evolution of existing schema, apply expand-contract (`lex-migrations-reversible`):
1. **Expand**: ADD COLUMN nullable; code writes in both.
2. **Backfill**: job in batches.
3. **Cut-over**: code uses only new.
4. **Contract**: DROP or NOT NULL applied.
Estimate duration per phase; if any exceeds 10min in prod, detail strategy (pg_repack, window).
### Step 8: Persist schema document
Structure in `.issues/{n}/03b-schema.md` (complements architecture.md):
```markdown
# Schema — Issue #{n}: {title}
## Entities
### Refund (aggregate root)
| Column | Type | Constraints | Notes |
|---|---|---|---|
| entity_id | UUID | PK | v4 |
| payment_id | UUID | FK → payment.entity_id, NOT NULL | |
| amount_cents | BIGINT | NOT NULL, CHECK > 0 | |
| idempotency_key | TEXT | UNIQUE | |
| ... | ... | ... | ... |
## Relationships
```mermaid
erDiagram
PAYMENT ||--o{ REFUND : "has"
```
## Indexes
| Name | Columns | Reason |
|---|---|---|
| idx_refund_payment_created | (payment_id, created_at DESC) | Q1 (list by payment) |
| ... | ... | ... |
## PII and retention
| Column | Class | Retention |
|---|---|---|
## Migration
(if applicable, expand-contract)
## Decision: DB
Aurora PostgreSQL (OLTP) + Redis cache for idempotency lookups.
Referenced ADR: docs/adr/ADR-XXX-aurora-for-refund.md
```
## Outputs
| Output | Format | Destination |
|--------|--------|-------------|
| Schema document | Markdown | `.issues/{n}/03b-schema.md` |
| ER diagram | Embedded Mermaid | In the document |
| Retention update | YAML | `docs/data-retention.yaml` |
| ADR (if necessary) | MADR Markdown | `docs/adr/ADR-*` |
## Restrictions
- **No over-design**: if entity is simple, do not force aggregate pattern.
- **Mandatory base attributes**: always `entity_id`, `created_at`, `updated_at`.
- **Justified indexes**: each index has documented access pattern; do not speculate.
- **Declared retention**: each new table updates `docs/data-retention.yaml`.