
analyze-test-failures

Analyzes failing test cases to determine whether failures indicate genuine bugs or incorrect test implementations. Use when debugging test failures, investigating test errors, classifying failures as test bugs vs implementation bugs vs ambiguous behavior, or when given specific failing test names or pytest output. Applies balanced investigative reasoning — never auto-fixes tests without establishing root cause first.

Source repository: github.com/Jamie-BitFlight/claude_skills (Versuz repo bundle: 264 indexed entries, SKILL.md and CLAUDE.md)

§ 02 — Install

Get analyze-test-failures.

Free SKILL.md scraped from GitHub. Clone the repo or copy the file directly into your Claude Code skills directory.

One-line install (Claude Code):

$ npx versuz@latest install jamie-bitflight-claude-skills-plugins-python-engineering-skills-analyze-test-failures

Or clone the repo:

$ git clone https://github.com/Jamie-BitFlight/claude_skills.git

---
name: analyze-test-failures
description: Analyzes failing test cases to determine whether failures indicate genuine bugs or incorrect test implementations. Use when debugging test failures, investigating test errors, classifying failures as test bugs vs implementation bugs vs ambiguous behavior, or when given specific failing test names or pytest output. Applies balanced investigative reasoning — never auto-fixes tests without establishing root cause first.
argument-hint: <test_file_or_test_name>
user-invocable: true
---

# Analyze Test Failures

Analyze failing test cases with a balanced, investigative approach.

## Context

Load and follow the standards in `/python-engineering:standards-for-python-development` when shared testing or quality rules from this plugin apply.

When tests fail, there are two primary possibilities:

1. **False positive**: The test itself is incorrect
2. **True positive**: The test discovered a genuine bug

Assuming tests are wrong by default is a dangerous anti-pattern that defeats the purpose of testing.

## Analysis Process

### 1. Initial Analysis

- Read the failing test carefully, understanding its intent
- Examine the test's assertions and expected behavior
- Review the error message and stack trace

### 2. Investigate the Implementation

- Check the actual implementation being tested
- Trace through the code path that leads to the failure
- Verify that the implementation matches the documented behavior

### 3. Apply Critical Thinking

For each failing test, ask:

- What behavior is the test trying to verify?
- Is this behavior clearly documented or implied by the API design?
- Does the current implementation actually provide this behavior?
- Could this be an edge case the implementation missed?

### 4. Make a Determination

Classify the failure as one of:

| Classification         | Meaning                           |
| ---------------------- | --------------------------------- |
| **Test Bug**           | Test's expectations are incorrect |
| **Implementation Bug** | Code doesn't behave as it should  |
| **Ambiguous**          | Intended behavior is unclear      |
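
The three classifications above can be sketched as a small Python enum. The `classify` helper and its two boolean inputs are hypothetical; they only illustrate one possible decision rule for a test that is already failing, not a prescribed algorithm.

```python
from enum import Enum


class Classification(Enum):
    """The three outcomes from the classification table."""

    TEST_BUG = "Test Bug"
    IMPLEMENTATION_BUG = "Implementation Bug"
    AMBIGUOUS = "Ambiguous"


def classify(spec_is_clear: bool, test_matches_spec: bool) -> Classification:
    """Hypothetical rule of thumb for a test that is currently failing.

    spec_is_clear: the intended behavior is documented or clearly implied.
    test_matches_spec: the test's expectation agrees with that intended behavior.
    """
    if not spec_is_clear:
        # Without a clear specification, neither side can be blamed yet.
        return Classification.AMBIGUOUS
    if test_matches_spec:
        # The test is right and still fails, so the code is at fault.
        return Classification.IMPLEMENTATION_BUG
    return Classification.TEST_BUG
```

The rule deliberately checks the specification first, so that an unclear requirement is never silently resolved in favor of either the test or the implementation.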

### 5. Document Reasoning

Provide a clear explanation, including:

- Evidence supporting the conclusion
- Specific mismatch between expectation and reality
- Recommended fix (to test or implementation)

## Example Analyses

### Example 1: Ambiguous Behavior

**Scenario**: Test expects `calculateDiscount(100, 0.2)` to return 20, but it returns 80

**Analysis**:

- Test assumes function returns discount amount
- Implementation returns price after discount
- Function name is ambiguous

**Determination**: Ambiguous

**Recommendation**: Check documentation or clarify intended behavior
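
Once the intended behavior has been clarified, one possible way to remove the ambiguity is to give each meaning its own name. The sketch below is an assumption about how that might look in Python; `discount_amount` and `discounted_price` are hypothetical names, not functions from the skill or its repository.

```python
def discount_amount(price: float, rate: float) -> float:
    """Return how much is taken off the price (the reading the test assumed)."""
    return price * rate


def discounted_price(price: float, rate: float) -> float:
    """Return the price after the discount (the reading the implementation assumed)."""
    return price - discount_amount(price, rate)


# Each reading of the original scenario now has an unambiguous home.
print(discount_amount(100, 0.2))   # 20.0
print(discounted_price(100, 0.2))  # 80.0
```

With explicit names, the original test can target `discount_amount` and a second test can cover `discounted_price`, so neither expectation has to be guessed from the function name.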

### Example 2: Implementation Bug

**Scenario**: Test expects `validateEmail("user@example.com")` to return true, but it returns false

**Analysis**:

- Test provides a valid email address
- The implementation's regex does not allow dots in the domain part
- Other valid emails also fail

**Determination**: Implementation Bug

**Recommendation**: Fix the regex so that it accepts dots in the domain, checking the address syntax against RFC 5322
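
As an illustration of this kind of bug, the sketch below contrasts a hypothetical buggy pattern with a simplified fix. Both patterns are assumptions made for this example; the fixed pattern is intentionally much looser than full RFC-level validation and only demonstrates the missing dot support.

```python
import re

# Hypothetical buggy pattern: the domain allows letters only, so "example.com" fails.
BUGGY_EMAIL = re.compile(r"[\w.+-]+@[A-Za-z]+")

# Simplified fix (assumption, not full RFC validation): the domain is made of
# dot-separated labels, and at least one dot is required.
FIXED_EMAIL = re.compile(r"[\w.+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def validate_email(address: str, pattern: re.Pattern = FIXED_EMAIL) -> bool:
    """Return True when the whole address matches the given pattern."""
    return pattern.fullmatch(address) is not None


print(validate_email("user@example.com", BUGGY_EMAIL))  # False: reproduces the failure
print(validate_email("user@example.com"))               # True: passes with the fix
```

Running the failing input against both patterns is a quick way to confirm that the regex, rather than the test, is the source of the mismatch.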

### Example 3: Test Bug

**Scenario**: Test expects `divide(10, 0)` to return 0, but it throws an error

**Analysis**:

- Test assumes division by zero returns 0
- Implementation throws DivisionByZeroError
- Division by zero is mathematically undefined, so raising an error is the conventional behavior

**Determination**: Test Bug

**Recommendation**: Update the test to expect an error, not 0
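
A corrected test for this scenario might look like the following sketch. It assumes a Python implementation of `divide`, where the built-in exception is named `ZeroDivisionError` rather than the `DivisionByZeroError` used in the scenario, and it uses the standard-library `unittest` module so that it runs without extra dependencies.

```python
import unittest


def divide(a: float, b: float) -> float:
    """Hypothetical implementation: Python raises ZeroDivisionError when b is 0."""
    return a / b


class DivideTests(unittest.TestCase):
    def test_divide_by_zero_raises(self):
        # Corrected expectation: an error, not a return value of 0.
        with self.assertRaises(ZeroDivisionError):
            divide(10, 0)

    def test_regular_division(self):
        self.assertEqual(divide(10, 2), 5)
```

In a pytest suite, the same expectation is usually written as `with pytest.raises(ZeroDivisionError):` around the call.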

## Output Format

For each failing test, provide:

```text
Test: [test name/description]
Failure: [what failed and how]

Investigation:
- Test expects: [expected behavior]
- Implementation does: [actual behavior]
- Root cause: [why they differ]

Determination: [Test Bug | Implementation Bug | Ambiguous]

Recommendation:
[Specific fix to either test or implementation]
```
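
If the report needs to be produced programmatically, the template above could be filled in by a small helper such as the sketch below. The `FailureReport` class and its field names are hypothetical; only the rendered layout follows the template.

```python
from dataclasses import dataclass


@dataclass
class FailureReport:
    """Hypothetical container for one analyzed test failure."""

    test: str
    failure: str
    expects: str
    actual: str
    root_cause: str
    determination: str  # "Test Bug", "Implementation Bug", or "Ambiguous"
    recommendation: str

    def render(self) -> str:
        """Render the report in the layout of the output template."""
        return "\n".join(
            [
                f"Test: {self.test}",
                f"Failure: {self.failure}",
                "",
                "Investigation:",
                f"- Test expects: {self.expects}",
                f"- Implementation does: {self.actual}",
                f"- Root cause: {self.root_cause}",
                "",
                f"Determination: {self.determination}",
                "",
                "Recommendation:",
                self.recommendation,
            ]
        )


report = FailureReport(
    test="divide(10, 0)",
    failure="Expected 0, but an error was raised",
    expects="division by zero returns 0",
    actual="raises an error",
    root_cause="the test encodes an incorrect expectation",
    determination="Test Bug",
    recommendation="Update the test to expect an error, not 0",
)
print(report.render())
```

Keeping the fields separate from the rendering makes it straightforward to collect several reports and group them by determination before printing.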

## Key Principles

- NEVER automatically assume the test is wrong
- ALWAYS consider that the test might have found a real bug
- When uncertain, lean toward investigating the implementation
- Tests are often your specification - they define expected behavior
- A failing test is a gift - it's either catching a bug or clarifying requirements

## Related Skills

- **test-failure-mindset**: Use `/python-engineering:test-failure-mindset` to set an investigative approach for the session
- **comprehensive-test-review**: Use `/python-engineering:comprehensive-test-review` for a full test suite review