---
name: bio-pathway-go-enrichment
description: Gene Ontology over-representation analysis using clusterProfiler enrichGO. Use when identifying biological functions enriched in a gene list from differential expression or other analyses. Supports all three ontologies (BP, MF, CC), multiple ID types, and customizable statistical thresholds.
tool_type: r
primary_tool: clusterProfiler
---
## Version Compatibility
Reference examples tested with: R stats (base), clusterProfiler 4.10+
Before using code patterns, verify installed versions match. If versions differ:
- R: `packageVersion('<pkg>')` then `?function_name` to verify parameters
If code fails with errors such as `could not find function`, `unused argument`, or `object not found`,
introspect the installed package and adapt the example to match the actual API rather than retrying.
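The version check described above can be scripted. The following is a minimal sketch: `check_pkg` is a hypothetical helper (not part of clusterProfiler), and the minimum version is the one stated in this section.

```r
# Hypothetical helper: is a package installed at or above a minimum version?
check_pkg <- function(pkg, min_version) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    message(pkg, ' is not installed')
    return(FALSE)
  }
  installed <- utils::packageVersion(pkg)
  ok <- installed >= min_version
  message(pkg, ' ', as.character(installed),
          if (ok) ' meets ' else ' is older than ',
          'the tested version ', min_version)
  ok
}

check_pkg('clusterProfiler', '4.10.0')

# If the package is installed, list the arguments enrichGO actually accepts
if (requireNamespace('clusterProfiler', quietly = TRUE)) {
  print(names(formals(clusterProfiler::enrichGO)))
}
```

If an argument used in the examples is missing from the printed names, consult `?enrichGO` for the installed version before adapting the call.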
# GO Over-Representation Analysis
## Core Pattern
**Goal:** Identify enriched Gene Ontology terms in a gene list from differential expression or similar analyses.
**Approach:** Test for over-representation of GO terms using the hypergeometric test via clusterProfiler enrichGO.
**"Run GO enrichment on my gene list"** → Test whether biological process, molecular function, or cellular component terms are over-represented among significant genes.
```r
library(clusterProfiler)
library(org.Hs.eg.db) # Human - change for other organisms
ego <- enrichGO(
  gene = gene_list,        # Character vector of gene IDs
  OrgDb = org.Hs.eg.db,    # Organism annotation database
  keyType = 'ENTREZID',    # ID type: ENSEMBL, SYMBOL, ENTREZID, etc.
  ont = 'BP',              # BP, MF, CC, or ALL
  pAdjustMethod = 'BH',    # p-value adjustment method
  pvalueCutoff = 0.05,
  qvalueCutoff = 0.2
)
```
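To make the over-representation test concrete, the sketch below reproduces the one-sided hypergeometric p-value for a single GO term in base R. All counts are made-up illustration values and the variable names are hypothetical; enrichGO performs this kind of calculation per term before applying the multiple-testing adjustment.

```r
# Hypothetical counts for a single GO term
N <- 18000  # background genes with any GO annotation
M <- 300    # background genes annotated to the term
n <- 400    # genes of interest with any GO annotation
k <- 20     # genes of interest annotated to the term

# P(X >= k): probability of drawing at least k annotated genes by chance
p_value <- phyper(k - 1, M, N - M, n, lower.tail = FALSE)

# The same quantity obtained by summing the upper tail of the distribution
p_check <- sum(dhyper(k:min(M, n), M, N - M, n))

# Ratios corresponding to the GeneRatio (k/n) and BgRatio (M/N) columns
gene_ratio <- k / n
bg_ratio <- M / N
fold_enrichment <- gene_ratio / bg_ratio
```

Note the `k - 1` in the `phyper` call: with `lower.tail = FALSE` the function returns P(X > q), so subtracting one yields the probability of observing at least `k` hits.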
## Prepare Gene List from DE Results
**Goal:** Extract significant gene IDs from differential expression results and convert to the format required by enrichGO.
**Approach:** Filter DE results by adjusted p-value and fold change, then convert gene symbols to Entrez IDs using bitr.
```r
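library(dplyr)
library(clusterProfiler)
library(org.Hs.eg.db)

de_results <- read.csv('de_results.csv')

# Keep genes passing both the adjusted p-value and fold-change thresholds
sig_genes <- de_results %>%
  filter(padj < 0.05, abs(log2FoldChange) > 1) %>%
  pull(gene_id)

# If using gene symbols, convert to Entrez IDs
gene_ids <- bitr(sig_genes, fromType = 'SYMBOL', toType = 'ENTREZID', OrgDb = org.Hs.eg.db)
gene_list <- unique(gene_ids$ENTREZID)  # Drop duplicates from one-to-many mappings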
```
## ID Conversion with bitr
**Goal:** Convert between gene identifier types (Ensembl, Symbol, Entrez) for compatibility with enrichment tools.
**Approach:** Use clusterProfiler bitr to map between ID types using organism annotation databases.
```r
# Check available key types
keytypes(org.Hs.eg.db)
# Convert between ID types
converted <- bitr(genes, fromType = 'ENSEMBL', toType = 'ENTREZID', OrgDb = org.Hs.eg.db)
# Multiple output types
converted <- bitr(genes, fromType = 'SYMBOL', toType = c('ENTREZID', 'ENSEMBL'), OrgDb = org.Hs.eg.db)
```
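bitr returns a data frame with one column per ID type. Inputs that cannot be mapped are absent from the result, and one-to-many mappings appear as multiple rows. The sketch below shows one way to audit such output in base R; the `converted` data frame is a hand-made stand-in for bitr output, and all gene names and IDs in it are hypothetical.

```r
# Input symbols and a hand-made stand-in for bitr() output (hypothetical IDs)
genes <- c('GENE_A', 'GENE_B', 'GENE_C', 'GENE_D')
converted <- data.frame(
  SYMBOL = c('GENE_A', 'GENE_B', 'GENE_B', 'GENE_C'),
  ENTREZID = c('1001', '1002', '1003', '1004'),
  stringsAsFactors = FALSE
)

# Inputs that did not map at all
unmapped <- setdiff(genes, converted$SYMBOL)

# Inputs that mapped to more than one target ID
multi_mapped <- unique(converted$SYMBOL[duplicated(converted$SYMBOL)])

# Fraction of inputs with at least one mapping
mapping_rate <- length(unique(converted$SYMBOL)) / length(genes)

# De-duplicated target IDs to pass on to enrichment
entrez_ids <- unique(converted$ENTREZID)
```

A low mapping rate usually points to the wrong `fromType` or to IDs carrying version suffixes, so it is worth checking before running enrichment.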
## With Background Universe
**Goal:** Improve enrichment specificity by restricting the background to genes actually tested in the experiment.
**Approach:** Pass all expressed genes (not just significant ones) as the universe parameter to enrichGO.
```r
# Use all expressed genes as background (recommended)
all_genes <- de_results$gene_id
universe_ids <- bitr(all_genes, fromType = 'SYMBOL', toType = 'ENTREZID', OrgDb = org.Hs.eg.db)
ego <- enrichGO(
  gene = gene_list,
  universe = universe_ids$ENTREZID,  # Background gene set
  OrgDb = org.Hs.eg.db,
  keyType = 'ENTREZID',
  ont = 'BP',
  pAdjustMethod = 'BH',
  pvalueCutoff = 0.05
)
```
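The effect of the background can be illustrated with the hypergeometric calculation that underlies over-representation analysis. In this sketch the same overlap is tested against a genome-wide background and against a smaller expressed-gene background. All counts and IDs are made up for illustration, and the variable names are hypothetical.

```r
# Hypothetical overlap: 20 of 400 genes of interest are annotated to the term
k <- 20
n <- 400

# Genome-wide background: 300 of 18000 genes carry the term
p_genome <- phyper(k - 1, 300, 18000 - 300, n, lower.tail = FALSE)

# Expressed-gene background: 250 of 9000 tested genes carry the term
p_expressed <- phyper(k - 1, 250, 9000 - 250, n, lower.tail = FALSE)

# Sanity check: genes of interest should all belong to the universe
gene_list <- c('1001', '1002', '1003')          # hypothetical IDs
universe <- c('1001', '1002', '1004', '1005')   # hypothetical IDs
outside <- setdiff(gene_list, universe)
```

With these numbers the restricted background gives the larger p-value, because the term is relatively more common among expressed genes. A non-empty `outside` vector suggests the gene list and universe were built from different ID sets and should be reconciled first.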
## All Three Ontologies
```r
# Run all ontologies at once
ego_all <- enrichGO(
  gene = gene_list,
  OrgDb = org.Hs.eg.db,
  keyType = 'ENTREZID',
  ont = 'ALL',             # BP, MF, and CC combined
  pAdjustMethod = 'BH',
  pvalueCutoff = 0.05
)
# Results include ONTOLOGY column
head(as.data.frame(ego_all))
```
## Make Results Readable
```r
# Convert Entrez IDs to gene symbols in results
ego_readable <- setReadable(ego, OrgDb = org.Hs.eg.db, keyType = 'ENTREZID')
# Or use readable = TRUE directly (only works with ENTREZID input)
ego <- enrichGO(
  gene = gene_list,
  OrgDb = org.Hs.eg.db,
  keyType = 'ENTREZID',
  ont = 'BP',
  readable = TRUE          # Converts to symbols
)
```
## Extract and Export Results
```r
# View top results
head(ego)
# Convert to data frame
results_df <- as.data.frame(ego)
# Key columns: ID, Description, GeneRatio, BgRatio, pvalue, p.adjust, qvalue, geneID, Count
# Export to CSV
write.csv(results_df, 'go_enrichment_results.csv', row.names = FALSE)
# Filter for specific criteria
sig_terms <- results_df[results_df$p.adjust < 0.01 & results_df$Count >= 5, ]
```
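The GeneRatio and BgRatio columns are stored as character strings of the form `k/n`, so they need parsing before numeric filtering or sorting. The sketch below shows one way to do this in base R. The `results_df` here is a hand-made stand-in for `as.data.frame(ego)` with illustrative values, and `parse_ratio` is a hypothetical helper. Recent clusterProfiler releases may already report a fold-enrichment column, in which case this step is unnecessary.

```r
# Hand-made stand-in for as.data.frame(ego); values are illustrative only
results_df <- data.frame(
  ID = c('TERM_1', 'TERM_2'),
  GeneRatio = c('8/400', '20/400'),
  BgRatio = c('360/18000', '300/18000'),
  stringsAsFactors = FALSE
)

# Hypothetical helper: convert 'k/n' strings to numeric ratios
parse_ratio <- function(x) {
  parts <- strsplit(x, '/', fixed = TRUE)
  vapply(parts, function(p) as.numeric(p[1]) / as.numeric(p[2]), numeric(1))
}

results_df$gene_ratio <- parse_ratio(results_df$GeneRatio)
results_df$bg_ratio <- parse_ratio(results_df$BgRatio)
results_df$fold_enrichment <- results_df$gene_ratio / results_df$bg_ratio

# Rank terms by fold enrichment, highest first
results_df <- results_df[order(-results_df$fold_enrichment), ]
```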
## Simplify Redundant Terms
**Goal:** Remove highly similar GO terms to reduce redundancy in enrichment results.
**Approach:** Cluster GO terms by semantic similarity and retain representative terms using the simplify function.
```r
# Remove redundant GO terms (keeps representative terms)
ego_simplified <- simplify(ego, cutoff = 0.7, by = 'p.adjust', select_fun = min)
```
## Different Organisms
```r
# Mouse
library(org.Mm.eg.db)
ego_mouse <- enrichGO(gene = genes, OrgDb = org.Mm.eg.db, ont = 'BP')
# Zebrafish
library(org.Dr.eg.db)
ego_zfish <- enrichGO(gene = genes, OrgDb = org.Dr.eg.db, ont = 'BP')
# Yeast
library(org.Sc.sgd.db)
ego_yeast <- enrichGO(gene = genes, OrgDb = org.Sc.sgd.db, ont = 'BP', keyType = 'ORF')
```
## Group GO Terms by Ancestor
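**Goal:** Classify genes by the GO terms found at a fixed level of the hierarchy for a high-level functional overview.
**Approach:** Use groupGO to assign genes to GO terms at a specific hierarchy level. This reports how genes are distributed across terms and does not perform an enrichment test.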
```r
# Classify genes by GO terms at a fixed hierarchy level
ggo <- groupGO(
  gene = gene_list,
  OrgDb = org.Hs.eg.db,
  ont = 'BP',
  level = 3,               # GO hierarchy level
  readable = TRUE
)
```
## Key Parameters
| Parameter | Default | Description |
|-----------|---------|-------------|
| gene | required | Vector of gene IDs |
| OrgDb | required | Organism database |
| keyType | ENTREZID | Input ID type |
| ont | BP | BP, MF, CC, or ALL |
| pvalueCutoff | 0.05 | Adjusted p-value threshold |
| qvalueCutoff | 0.2 | Q-value (FDR) threshold |
| pAdjustMethod | BH | BH, bonferroni, etc. |
| universe | NULL | Background genes |
| minGSSize | 10 | Min genes per term |
| maxGSSize | 500 | Max genes per term |
| readable | FALSE | Convert to symbols |
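The `pAdjustMethod` names match those of base R's `p.adjust`. As a rough illustration of how the choice of method changes which terms pass a cutoff, the sketch below applies BH and Bonferroni to made-up raw p-values; the vector and variable names are hypothetical.

```r
# Hypothetical raw p-values for five GO terms
p_raw <- c(0.001, 0.008, 0.02, 0.045, 0.30)

# Benjamini-Hochberg controls the false discovery rate
p_bh <- p.adjust(p_raw, method = 'BH')

# Bonferroni controls the family-wise error rate and is more conservative
p_bonf <- p.adjust(p_raw, method = 'bonferroni')

# Number of terms passing a 0.05 cutoff under each method
n_pass_bh <- sum(p_bh < 0.05)
n_pass_bonf <- sum(p_bonf < 0.05)

# Method names accepted by p.adjust
p.adjust.methods
```

Bonferroni-adjusted values are never smaller than BH-adjusted ones, so switching to it can only shrink the list of reported terms.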
## Related Skills
- kegg-pathways - KEGG pathway enrichment
- gsea - Gene Set Enrichment Analysis for GO
- enrichment-visualization - Visualize enrichment results
- differential-expression - Generate input gene lists