---
name: responsible-ai-guide
description: "Resources for trustworthy, fair, and ethical AI research"
metadata:
  openclaw:
    emoji: "⚖️"
    category: "domains"
    subcategory: "ai-ml"
    keywords: ["responsible AI", "AI ethics", "fairness", "trustworthy AI", "AI safety", "bias"]
    source: "https://github.com/AthenaCore/AwesomeResponsibleAI"
---
# Responsible AI Guide
## Overview
A curated collection of resources for building trustworthy, fair, and ethical AI systems. It covers fairness metrics, bias detection and mitigation, explainability methods, privacy-preserving techniques, robustness testing, and governance frameworks. It is aimed at researchers working on AI safety and alignment, and at anyone deploying models in high-stakes domains.
## Topic Taxonomy
```
Responsible AI
├── Fairness
│   ├── Bias detection (data, model, outcome)
│   ├── Fairness metrics (demographic parity, equalized odds)
│   ├── Bias mitigation (pre/in/post-processing)
│   └── Intersectional fairness
├── Explainability
│   ├── Feature attribution (SHAP, LIME, IG)
│   ├── Concept-based (TCAV, concept bottleneck)
│   ├── Counterfactual explanations
│   └── Mechanistic interpretability
├── Privacy
│   ├── Differential privacy
│   ├── Federated learning
│   ├── Membership inference attacks
│   └── Machine unlearning
├── Robustness
│   ├── Adversarial attacks/defenses
│   ├── Distribution shift
│   ├── Uncertainty quantification
│   └── Out-of-distribution detection
├── Safety & Alignment
│   ├── RLHF and preference learning
│   ├── Constitutional AI
│   ├── Red teaming
│   └── Guardrails and filters
└── Governance
    ├── Model cards
    ├── Datasheets for datasets
    ├── AI impact assessments
    └── Regulatory compliance (EU AI Act)
```
## Key Tools
| Tool | Category | Purpose |
|------|----------|---------|
| **Fairlearn** | Fairness | Bias assessment + mitigation |
| **AI Fairness 360** | Fairness | IBM fairness toolkit |
| **SHAP** | Explainability | Shapley value explanations |
| **Captum** | Explainability | PyTorch interpretability |
| **Opacus** | Privacy | Differential privacy for PyTorch |
| **ART** | Robustness | Adversarial robustness toolbox |
| **Alibi** | Explainability | ML model explanations |
## Fairness Assessment
```python
from fairlearn.metrics import MetricFrame
from sklearn.metrics import accuracy_score, recall_score

# Assumes y_test (true labels), y_pred (model predictions), and
# demographics (one sensitive-attribute value per sample) are already defined.

# Assess fairness across demographic groups
metric_frame = MetricFrame(
    metrics={
        "accuracy": accuracy_score,
        "recall": recall_score,
    },
    y_true=y_test,
    y_pred=y_pred,
    sensitive_features=demographics,
)

print("Overall:")
print(metric_frame.overall)
print("\nBy group:")
print(metric_frame.by_group)
print("\nDifference (max - min):")
print(metric_frame.difference())
```
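
The taxonomy names demographic parity and equalized odds as fairness metrics. To make clear what those numbers measure, here is a minimal pure-Python sketch that computes both gaps by hand for binary labels. The function names (`group_rates`, `dp_gap`, `eo_gap`) and the toy data are hypothetical and not part of any library. The equalized odds gap is taken as the larger of the true positive rate gap and the false positive rate gap, which is one common convention rather than the only one.

```python
from collections import defaultdict


def group_rates(y_true, y_pred, groups):
    """Per-group selection rate, true positive rate, and false positive rate.

    Assumes binary 0/1 labels and that every group contains at least one
    positive and one negative true label (otherwise a division by zero occurs).
    """
    counts = defaultdict(lambda: {"n": 0, "sel": 0, "pos": 0, "tp": 0, "neg": 0, "fp": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        c = counts[g]
        c["n"] += 1
        c["sel"] += p          # predicted positive
        if t == 1:
            c["pos"] += 1
            c["tp"] += p       # true positive
        else:
            c["neg"] += 1
            c["fp"] += p       # false positive
    return {
        g: {
            "selection_rate": c["sel"] / c["n"],
            "tpr": c["tp"] / c["pos"],
            "fpr": c["fp"] / c["neg"],
        }
        for g, c in counts.items()
    }


def spread(rates, key):
    """Largest minus smallest value of one rate across groups."""
    values = [r[key] for r in rates.values()]
    return max(values) - min(values)


def dp_gap(y_true, y_pred, groups):
    """Demographic parity gap: spread of selection rates across groups."""
    return spread(group_rates(y_true, y_pred, groups), "selection_rate")


def eo_gap(y_true, y_pred, groups):
    """Equalized odds gap: the larger of the TPR spread and the FPR spread."""
    rates = group_rates(y_true, y_pred, groups)
    return max(spread(rates, "tpr"), spread(rates, "fpr"))


# Toy data, made up for illustration only.
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

print("Demographic parity gap:", dp_gap(y_true, y_pred, groups))
print("Equalized odds gap:", eo_gap(y_true, y_pred, groups))
```

A gap of 0 means the groups are treated identically under that metric. In practice a library such as Fairlearn is the safer choice, since it handles edge cases such as empty groups that this sketch ignores.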
## Reading Roadmap
```markdown
### Foundations
1. "Fairness and Machine Learning" (Barocas, Hardt, Narayanan)
2. "Datasheets for Datasets" (Gebru et al., 2021)
3. "Model Cards for Model Reporting" (Mitchell et al., 2019)
### Fairness
4. "On Fairness and Calibration" (Pleiss et al., 2017)
5. "Fairness Through Awareness" (Dwork et al., 2012)
### Explainability
6. "A Unified Approach to Interpreting Model Predictions" (SHAP, Lundberg and Lee, 2017)
7. "Why Should I Trust You?" (LIME, Ribeiro et al., 2016)
### Safety
8. "Constitutional AI" (Bai et al., 2022)
9. "Red Teaming Language Models" (Perez et al., 2022)
10. "Scaling Monosemanticity" (Anthropic, 2024)
```
## Use Cases
1. **Bias auditing**: Check models for demographic biases
2. **Compliance**: EU AI Act and regulatory requirements
3. **Model documentation**: Model cards and impact assessments
4. **Research ethics**: Ethical considerations for AI research
5. **Course material**: Teach responsible AI principles
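
For the model documentation use case, one option is to keep a model card as structured data and render it to markdown. The sketch below is a minimal, hypothetical example: the `ModelCard` class, its field names, and all values are illustrative assumptions. The section headings are loosely inspired by "Model Cards for Model Reporting" and do not follow an official schema.

```python
from dataclasses import dataclass, field


@dataclass
class ModelCard:
    """Hypothetical minimal model card; the field names are illustrative."""

    model_name: str
    intended_use: str
    training_data: str
    evaluation_data: str
    metrics: dict = field(default_factory=dict)  # metric name -> value
    ethical_considerations: list = field(default_factory=list)
    caveats: list = field(default_factory=list)

    def to_markdown(self) -> str:
        """Render the card as a markdown document."""
        lines = [
            f"# Model Card: {self.model_name}",
            "",
            "## Intended Use",
            self.intended_use,
            "",
            "## Training Data",
            self.training_data,
            "",
            "## Evaluation Data",
            self.evaluation_data,
            "",
            "## Metrics",
        ]
        lines += [f"- {name}: {value:.2f}" for name, value in self.metrics.items()]
        lines += ["", "## Ethical Considerations"]
        lines += [f"- {item}" for item in self.ethical_considerations]
        lines += ["", "## Caveats and Recommendations"]
        lines += [f"- {item}" for item in self.caveats]
        return "\n".join(lines)


# All values below are placeholders, not real results.
card = ModelCard(
    model_name="toy-classifier",
    intended_use="Illustration only; not a real model.",
    training_data="Placeholder description of the training set.",
    evaluation_data="Placeholder description of the evaluation set.",
    metrics={"accuracy": 0.90, "recall_gap_between_groups": 0.05},
    ethical_considerations=["Placeholder: list known risks and affected groups."],
    caveats=["Placeholder: list known limitations."],
)
card_md = card.to_markdown()
print(card_md)
```

Keeping the card as data means per-group metrics from a bias audit can be written into it programmatically instead of being copied by hand.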
## References
- [AwesomeResponsibleAI](https://github.com/AthenaCore/AwesomeResponsibleAI)
- [Fairlearn](https://fairlearn.org/)
- [EU AI Act](https://artificialintelligenceact.eu/)