Green CI is a lie when your tests are built on these patterns. Mock-only assertions, assertion-free tests, values that mirror the implementation, probing internals instead of behavior, and parametrize that skips the real edge cases — five ways to write a test suite that passes 100% and catches 0% of real bugs.
When you patch a dependency and then assert that the patch was called, you are testing that unittest.mock works — not that your code does. If you remove the entire function under test, the test still passes. The mock was called by the test setup itself, not by any code path that could ever break.
from unittest.mock import patch, MagicMock
def test_send_welcome_email():
with patch('services.email.send') as mock_send:
mock_send.return_value = {"status": "ok"}
result = send_welcome_email("user@example.com")
# Only verifies the mock was called
# Doesn't verify the email body, subject,
# recipient address, or return value
mock_send.assert_called_once()
from unittest.mock import patch, call
def test_send_welcome_email():
with patch('services.email.send') as mock_send:
mock_send.return_value = {"status": "ok"}
result = send_welcome_email("user@example.com")
# Verify what the code actually did:
# correct recipient, subject, template
mock_send.assert_called_once_with(
to="user@example.com",
subject="Welcome to the platform",
template="welcome",
)
assert result["status"] == "sent"
A mock is a stand-in for external I/O. What you should assert is the contract: what arguments did your code pass to the dependency, and what did your code do with the return value? Both matter. Neither is covered by assert_called_once() alone.
The test file imports the right module, patches the right thing, and has an assertion at the end. It looks complete. The anti-pattern is subtle: assert_called_once() is a real assertion, it just asserts almost nothing useful. Reviewers see mock + assertion and stop reading. The fact that the assertion is vacuous doesn't register without slowing down to think about what it actually verifies.
CodeSight flags test functions that patch a dependency but only assert on the mock object itself (e.g., assert_called, assert_called_once) without any assertion on the function return value, a side-effected variable, or the call arguments. Pure mock-presence tests with no behavioral assertion are surfaced with a suggestion to add outcome assertions.
A test that exercises code but never calls assert — or whose only assertion is assert True — will always be green. It confirms that the code doesn't raise an uncaught exception. That's useful for smoke tests, but it's written to look like a unit test and is treated as coverage. The implementation can return None, return garbage, or silently corrupt state, and the test stays green.
def test_calculate_discount():
order = Order(items=[
Item("shirt", price=50.0, qty=2),
Item("hat", price=20.0, qty=1),
])
# Runs the function. Doesn't verify anything.
order.calculate_discount(coupon="SAVE10")
# Green. Even if discount is calculated wrong.
# Even if the order total is mutated incorrectly.
def test_calculate_discount():
order = Order(items=[
Item("shirt", price=50.0, qty=2),
Item("hat", price=20.0, qty=1),
])
result = order.calculate_discount(coupon="SAVE10")
# 10% off (50*2 + 20*1 = 120) = 12.00 discount
assert result.discount_amount == 12.0
assert result.final_total == 108.0
assert result.coupon_applied == "SAVE10"
Smoke tests have value. Name them test_calculate_discount_does_not_raise and put them in a separate file. When a function body is exercised without being asserted on, it's a coverage metric lie — lines are marked green in the coverage report, but the behavior is unchecked.
The function is called. The test class looks like any other test. Code coverage shows the lines as executed. Coverage tools track execution, not verification — they don't know the difference between a line that ran with an assertion and a line that ran with none. CI is green, coverage is up, the PR ships. The bug surfaces when someone changes the discount formula and every assertion-free test keeps passing.
CodeSight performs AST analysis on test functions: any function named test_* that contains zero assert statements (or only assert True) is flagged as an assertion-free test. It distinguishes between pytest.raises context managers (legitimate exception tests) and truly empty test bodies.
When the expected value in a test is computed by copy-pasting the implementation logic — or by running the function once and baking its current output — the test verifies that the code equals itself. It doesn't verify that the code is correct. Refactoring the formula breaks the test, but the bug that introduced the wrong formula in the first place passed silently.
def calculate_tax(subtotal: float, rate: float) -> float:
return round(subtotal * rate / 100, 2)
def test_calculate_tax():
subtotal = 199.99
rate = 8.5
# Expected is computed the same way as the
# implementation — circular reference
expected = round(199.99 * 8.5 / 100, 2)
assert calculate_tax(subtotal, rate) == expected
def test_calculate_tax():
# 8.5% of $199.99 = $16.999... rounds to $17.00
# Computed manually or from a tax table
assert calculate_tax(199.99, 8.5) == 17.00
# Edge cases from real business rules:
assert calculate_tax(0.0, 8.5) == 0.0
assert calculate_tax(100.0, 0.0) == 0.0
assert calculate_tax(0.01, 8.5) == 0.0 # rounds down
This pattern is especially dangerous for rounding, formatting, and financial calculations where the spec is an external source of truth (a tax table, a product requirements doc, a legal standard) rather than derived from the code itself.
The test calls the function, computes an expected value, and asserts equality. That looks like a correct test. The circular dependency is invisible at a glance — both expressions use the same arithmetic. Even if you run them manually they produce the same number. The reviewer has to consciously ask "where does this expected value come from, and is it independent of the code being tested?" Most don't ask that question in a typical review.
CodeSight detects when a test's expected value contains the same arithmetic expression as the function under test — either by matching inline expressions or by detecting that the expected value is computed using the same operands and operators in the same order. It flags suspected circular test assertions and suggests using fixed literals from an independent source of truth.
Tests that call private methods, inspect internal state, or assert on intermediate variables lock the implementation in place without verifying correctness. If you rename a helper method, change a private variable, or refactor the internals without changing behavior, the tests break — even though the code is still correct. The tests are coupled to how the code works, not what it produces.
class ShoppingCart:
def __init__(self):
self._items = []
self._discount_applied = False
def add_item(self, item):
self._items.append(item)
def apply_discount(self, code):
if code == "SAVE10":
self._discount_applied = True
def test_cart_internals():
cart = ShoppingCart()
cart.add_item({"name": "shirt", "price": 50})
cart.apply_discount("SAVE10")
# Asserting on private state — tests internals
assert cart._discount_applied == True
assert len(cart._items) == 1
def test_cart_discount_applied_to_total():
cart = ShoppingCart()
cart.add_item({"name": "shirt", "price": 50})
cart.apply_discount("SAVE10")
# Assert on the public contract: what does
# the caller actually care about?
total = cart.get_total()
assert total == 45.0 # 10% off $50
def test_cart_invalid_discount_no_effect():
cart = ShoppingCart()
cart.add_item({"name": "shirt", "price": 50})
cart.apply_discount("FAKECODE")
assert cart.get_total() == 50.0
The rule of thumb: if you wouldn't expose it in a public API, don't assert on it in a test. Test the contract — inputs go in, outputs and side-effects come out. Internal representation is irrelevant as long as the contract holds.
Private method tests feel thorough. There's a named test for each internal step. Coverage numbers go up. The tests are specific and exact — they look like careful, rigorous work. The problem only surfaces when someone refactors the internals and the entire test suite goes red despite zero behavior change. That's when the team discovers the tests weren't checking correctness; they were just mirroring structure.
CodeSight flags test code that accesses attributes beginning with _ (private convention) on the object under test, or calls methods prefixed with _ directly in assertions. It distinguishes between incidental access (e.g., reading _id for comparison) and structural dependency (asserting on _internal_flag instead of observable output), and surfaces the latter as fragile implementation-coupled tests.
@pytest.mark.parametrize is one of pytest's most powerful features — and one of its most abused. Tests that parametrize only the happy path, or that vary only cosmetically (three strings that all go through the same code branch), create the illusion of thorough testing. The real edge cases — empty input, boundary values, type mismatches, nulls, zero, maximum values — are the ones that hit production and aren't in the parametrize list.
@pytest.mark.parametrize("username,expected", [
("alice", "Hello, alice!"),
("bob", "Hello, bob!"),
("charlie", "Hello, charlie!"),
])
def test_greeting(username, expected):
assert greet(username) == expected
# Three tests that all go through the exact same
# code path. Zero new branches covered.
# Empty string, None, very long name, whitespace
# — all untested and all break differently.
@pytest.mark.parametrize("username,expected", [
# Happy path
("alice", "Hello, alice!"),
# Whitespace handling
(" bob ", "Hello, bob!"),
# Case normalization (if spec requires it)
("CHARLIE", "Hello, charlie!"),
# Edge: empty string
("", "Hello, stranger!"),
# Edge: None input
(None, "Hello, stranger!"),
# Edge: very long input (truncation spec)
("a" * 300, "Hello, " + "a" * 50 + "...!"),
])
def test_greeting(username, expected):
assert greet(username) == expected
Each parametrize case should exercise a distinct code path or boundary condition. If two cases produce the same internal branch, one of them is redundant. The question to ask for every parametrize row: "which if branch or error condition does this case exercise that the previous cases don't?"
Parametrize looks thorough. Multiple test cases, clean table format, good names. Coverage tools mark the function as covered because the happy path executes all the lines. The empty-string branch often only appears in a deeply nested if not username check that the happy-path parametrize cases never reach. Reviewers see "5 test cases" and stop counting. They don't map cases to branches — that analysis takes longer than most reviewers spend on a test file.
CodeSight analyzes @pytest.mark.parametrize blocks and flags tests where all parametrize cases belong to the same input category: e.g., all non-empty strings, all positive numbers, all truthy booleans. It checks for the absence of boundary values (0, empty string, None, maximum, negative) and flags parametrize lists that vary cosmetically but cover only a single branch of the function under test.
Mock-only assertion detection, assertion-free test flagging, circular expected-value analysis, private-detail coupling alerts, and parametrize edge-case gaps. Every Python PR in 30 seconds, before it merges.
Install Free on GitHub 5 PRs/month free · No credit card · Uninstall in one click