Free SKILL.md scraped from GitHub. Clone the repo or copy the file directly into your Claude Code skills directory.
npx versuz@latest install brycewang-stanford-awesome-agent-skills-for-empirical-research-skills-26-data-wise-scholar-skills-mathematical-proof-architectgit clone https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research.gitcp Awesome-Agent-Skills-for-Empirical-Research/SKILL.MD ~/.claude/skills/brycewang-stanford-awesome-agent-skills-for-empirical-research-skills-26-data-wise-scholar-skills-mathematical-proof-architect/SKILL.md---
name: proof-architect
description: Structured methodology for constructing and verifying mathematical proofs in statistical research
---
# Proof Architect
**Structured methodology for constructing and verifying mathematical proofs in statistical research**
Use this skill when working on: mathematical proofs, theorem development, derivations, consistency proofs, asymptotic arguments, identification proofs, or verifying proof correctness.
---
## Proof Structure Framework
### Standard Proof Components
Every rigorous statistical proof should contain:
1. **Claim Statement** - Precise mathematical statement of what is being proved
2. **Assumptions** - All conditions required (clearly enumerated A1, A2, ...)
3. **Notation** - Define all symbols before use
4. **Proof Body** - Logical sequence of justified steps
5. **Conclusion** - Explicit statement that claim is established
### Proof Skeleton Template
```latex
\begin{theorem}[Name]
\label{thm:name}
Under Assumptions \ref{A1}--\ref{An}, [precise claim].
\end{theorem}
\begin{proof}
The proof proceeds in [n] steps.
\textbf{Step 1: [Description]}
[Content with justification for each transition]
\textbf{Step 2: [Description]}
[Content]
\vdots
\textbf{Step n: Conclusion}
Combining Steps 1--[n-1], we obtain [result], completing the proof.
\end{proof}
```
---
## Proof Types in Statistical Methodology
### 1. Identification Proofs
**Goal**: Show that a causal/statistical quantity is uniquely determined from observed data distribution.
**Standard Structure**:
1. Define target estimand (e.g., $\psi = E[Y(a)]$)
2. State identifying assumptions (consistency, positivity, exchangeability)
3. Apply identification formula derivation
4. Show formula depends only on observable quantities
**Template**:
```latex
\begin{theorem}[Identification of $\psi$]
Under Assumptions \ref{A:consistency}--\ref{A:positivity}, the causal effect
$\psi = E[Y(a)]$ is identified by
\[
\psi = \int E[Y \mid A=a, X=x] \, dP(x).
\]
\end{theorem}
\begin{proof}
\begin{align}
E[Y(a)] &= E[E[Y(a) \mid X]] && \text{(law of iterated expectations)} \\
&= E[E[Y(a) \mid A=a, X]] && \text{(A\ref{A:exchangeability}: $Y(a) \indep A \mid X$)} \\
&= E[E[Y \mid A=a, X]] && \text{(A\ref{A:consistency}: $Y = Y(A)$)} \\
&= \int E[Y \mid A=a, X=x] \, dP(x) && \text{(definition)}
\end{align}
which depends only on the observed data distribution.
\end{proof}
```
### 2. Consistency Proofs
**Goal**: Show that an estimator converges to the true parameter value.
**Standard Structure**:
1. Define estimator $\hat{\theta}_n$
2. Define target parameter $\theta_0$
3. Establish convergence: $\hat{\theta}_n \xrightarrow{p} \theta_0$
**Key Tools**:
- Law of Large Numbers (LLN)
- Continuous Mapping Theorem
- Slutsky's Theorem
- M-estimation theory
**Template**:
```latex
\begin{theorem}[Consistency]
Under Assumptions \ref{A1}--\ref{An}, $\hat{\theta}_n \xrightarrow{p} \theta_0$.
\end{theorem}
\begin{proof}
Define $M_n(\theta) = n^{-1} \sum_{i=1}^n m(O_i; \theta)$ and
$M(\theta) = E[m(O; \theta)]$.
\textbf{Step 1: Uniform convergence}
By [ULLN conditions], $\sup_{\theta \in \Theta} |M_n(\theta) - M(\theta)| \xrightarrow{p} 0$.
\textbf{Step 2: Unique maximum}
$M(\theta)$ is uniquely maximized at $\theta_0$ (by identifiability).
\textbf{Step 3: Conclusion}
By standard M-estimation theory, Steps 1--2 imply $\hat{\theta}_n \xrightarrow{p} \theta_0$.
\end{proof}
```
### 3. Asymptotic Normality Proofs
**Goal**: Establish $\sqrt{n}(\hat{\theta}_n - \theta_0) \xrightarrow{d} N(0, V)$.
**Standard Structure**:
1. Taylor expansion around true value
2. Apply CLT to score/influence function
3. Invert Hessian/information matrix
4. State limiting distribution
**Key Tools**:
- Central Limit Theorem (CLT)
- Delta Method
- Influence Function Theory
- Semiparametric Efficiency Theory
**Template**:
```latex
\begin{theorem}[Asymptotic Normality]
Under Assumptions \ref{A1}--\ref{An},
\[
\sqrt{n}(\hat{\theta}_n - \theta_0) \xrightarrow{d} N(0, V)
\]
where $V = E[\phi(O)\phi(O)^\top]$ and $\phi$ is the influence function.
\end{theorem}
\begin{proof}
\textbf{Step 1: Score equation}
$\hat{\theta}_n$ solves $\mathbb{P}_n[\psi(O; \theta)] = 0$ where $\psi = \partial_\theta m$.
\textbf{Step 2: Taylor expansion}
\[
0 = \mathbb{P}_n[\psi(O; \hat{\theta}_n)] = \mathbb{P}_n[\psi(O; \theta_0)]
+ \mathbb{P}_n[\dot{\psi}(O; \tilde{\theta})](\hat{\theta}_n - \theta_0)
\]
\textbf{Step 3: Rearrangement}
\[
\sqrt{n}(\hat{\theta}_n - \theta_0) = -\left(\mathbb{P}_n[\dot{\psi}]\right)^{-1}
\sqrt{n} \mathbb{P}_n[\psi(O; \theta_0)]
\]
\textbf{Step 4: Apply CLT}
$\sqrt{n} \mathbb{P}_n[\psi(O; \theta_0)] \xrightarrow{d} N(0, \text{Var}(\psi))$ by CLT.
\textbf{Step 5: Slutsky}
$\mathbb{P}_n[\dot{\psi}] \xrightarrow{p} E[\dot{\psi}]$ by WLLN. Apply Slutsky's theorem.
\end{proof}
```
### 4. Efficiency Proofs
**Goal**: Show estimator achieves semiparametric efficiency bound.
**Standard Structure**:
1. Characterize the tangent space
2. Derive efficient influence function (EIF)
3. Show estimator's influence function equals EIF
4. Conclude variance achieves bound
**Template**:
```latex
\begin{theorem}[Semiparametric Efficiency]
$\hat{\theta}_n$ is semiparametrically efficient with influence function
\[
\phi(O) = [optimal formula]
\]
achieving the efficiency bound $V_{\text{eff}} = E[\phi(O)^2]$.
\end{theorem}
```
### 5. Double Robustness Proofs
**Goal**: Show estimator is consistent if either nuisance model is correctly specified.
**Standard Structure**:
1. Write estimating equation with both nuisance functions
2. Show bias term is product of two errors
3. Conclude: if either error is zero, estimator is consistent
**Template**:
```latex
\begin{theorem}[Double Robustness]
The estimator $\hat{\psi}_{DR}$ is consistent if either:
\begin{enumerate}
\item The outcome model $\mu(a,x) = E[Y \mid A=a, X=x]$ is correctly specified, or
\item The propensity score $\pi(x) = P(A=1 \mid X=x)$ is correctly specified.
\end{enumerate}
\end{theorem}
\begin{proof}
The estimating equation has the form:
\[
\psi - \hat{\psi}_{DR} = E\left[\frac{(A-\pi)(Y-\mu)}{\pi(1-\pi)}\right] + o_p(1)
\]
The bias term $(A-\pi)(Y-\mu)$ is zero in expectation if either:
\begin{itemize}
\item $E[A-\pi \mid X] = 0$ (propensity correctly specified), or
\item $E[Y-\mu \mid A, X] = 0$ (outcome correctly specified).
\end{itemize}
\end{proof}
```
---
## Proof Verification Checklist
### Level 1: Structure Check
- [ ] Claim clearly stated with all conditions
- [ ] All notation defined before use
- [ ] Logical flow apparent (steps labeled)
- [ ] Each step has explicit justification
- [ ] Conclusion explicitly stated
### Level 2: Step Validation
For each step, verify:
- [ ] Mathematical operation is valid
- [ ] Cited results apply (check conditions)
- [ ] Inequalities have correct direction
- [ ] Limits/integrals converge
- [ ] Dimensions/types match
### Level 3: Edge Cases
- [ ] Boundary cases handled (n=1, p=0, etc.)
- [ ] Degenerate cases addressed
- [ ] Assumptions actually used (not vacuous)
- [ ] What happens at assumption boundaries?
### Level 4: Consistency
- [ ] Result matches intuition
- [ ] Special cases recover known results
- [ ] Numerical verification possible?
- [ ] Consistent with simulation evidence?
---
## Common Proof Errors
### Technical Errors
| Error | Example | Fix |
|-------|---------|-----|
| Interchanging limits | $\lim \sum \neq \sum \lim$ | Verify DCT/MCT conditions |
| Division by zero | $1/\pi(x)$ when $\pi(x)=0$ | State positivity assumption |
| Incorrect conditioning | $E[Y \mid A,X] \neq E[Y \mid X]$ | Check independence structure |
| Wrong norm | $\|f\|_2$ vs $\|f\|_\infty$ | Verify which space |
| Missing measurability | Random variable not measurable | State measurability |
### Logical Errors
| Error | Example | Fix |
|-------|---------|-----|
| Circular reasoning | Using result to prove itself | Check logical dependency |
| Unstated assumption | "Clearly, X holds" | Make all assumptions explicit |
| Incorrect quantifier | $\exists$ vs $\forall$ | Be precise about scope |
| Missing case | Not handling $\theta = 0$ | Enumerate all cases |
### Statistical Errors
| Error | Example | Fix |
|-------|---------|-----|
| Confusing $\xrightarrow{p}$ and $\xrightarrow{d}$ | Different convergence modes | State which mode |
| Ignoring dependence | Applying iid CLT to dependent data | Check independence |
| Wrong variance | Using population variance for sample | Distinguish estimator/parameter |
---
## Notation Standards (VanderWeele Convention)
### Causal Quantities
| Symbol | Meaning |
|--------|---------|
| $Y(a)$ | Potential outcome under treatment $a$ |
| $Y(a,m)$ | Potential outcome under $A=a$, $M=m$ |
| $M(a)$ | Potential mediator under treatment $a$ |
| $NDE$ | Natural Direct Effect: $E[Y(1,M(0)) - Y(0,M(0))]$ |
| $NIE$ | Natural Indirect Effect: $E[Y(1,M(1)) - Y(1,M(0))]$ |
| $TE$ | Total Effect: $E[Y(1) - Y(0)] = NDE + NIE$ |
| $P_M$ | Proportion Mediated: $NIE/TE$ |
### Statistical Quantities
| Symbol | Meaning |
|--------|---------|
| $\theta_0$ | True parameter value |
| $\hat{\theta}_n$ | Estimator based on $n$ observations |
| $\phi(O)$ | Influence function |
| $\mathbb{P}_n$ | Empirical measure |
| $\mathbb{G}_n$ | Empirical process: $\sqrt{n}(\mathbb{P}_n - P)$ |
### Convergence
| Symbol | Meaning |
|--------|---------|
| $\xrightarrow{p}$ | Convergence in probability |
| $\xrightarrow{d}$ | Convergence in distribution |
| $\xrightarrow{a.s.}$ | Almost sure convergence |
| $O_p(1)$ | Bounded in probability |
| $o_p(1)$ | Converges to zero in probability |
---
## Proof Construction Workflow
### Step 1: Understand the Goal
- What exactly needs to be proved?
- What type of proof is this? (identification, consistency, etc.)
- What are the key challenges?
### Step 2: Gather Tools
- What theorems/lemmas are available?
- What regularity conditions will be needed?
- Are there similar proofs to reference?
### Step 3: Outline Structure
- Break into logical steps
- Identify the key technical step
- Plan how to handle edge cases
### Step 4: Write First Draft
- Fill in details for each step
- Be explicit about every transition
- Note where conditions are used
### Step 5: Verify
- Run through verification checklist
- Check each step independently
- Test special cases
### Step 6: Polish
- Improve notation consistency
- Add intuitive explanations
- Ensure assumptions are minimal
---
## Integration with Other Skills
This skill works with:
- **identification-theory** - For causal identification proofs
- **asymptotic-theory** - For inference proofs
- **methods-paper-writer** - For presenting proofs in manuscripts
- **proof-verifier** - For systematic verification
---
**Version**: 1.0
**Created**: 2025-12-08
**Domain**: Mathematical Statistics, Causal Inference
## Key References
- van der Vaart
- Lehmann
- Casella
- Bickel
- Serfling