
Small samples, loud conclusions.

The smallest datasets produce the loudest findings. Statistical discipline is what stops you publishing nonsense.

March 2026

We see this pattern constantly. A business pulls a number out of a small dataset, broadcasts it as a finding, and commits budget against it. A month later the number has moved and nobody quite remembers why.

One record can move the headline

In a small dataset, a single mis-extracted or mislabelled entry can shift the top-line percentage materially: flipping one record from one category to another moves the headline by exactly 100/N points, so a full point at N=100, two points at N=50, and more than three at N=30. Most internal analyses never flag this, so the number gets quoted as though it were stable.
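
To make the sensitivity concrete, here is a minimal Python sketch of the arithmetic (the sample sizes are illustrative):

```python
# Flipping one record from one category to another changes the numerator
# by 1 while the denominator stays fixed, so the headline percentage
# moves by exactly 100 / N points.
def one_record_shift(n: int) -> float:
    """Percentage-point shift in a headline rate when one of n records flips."""
    return 100 / n

for n in (30, 50, 100, 300):
    print(f"N={n:>3}: one flipped record moves the headline by {one_record_shift(n):.1f} points")
```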

Label trust is not automatic

Source data lies more often than people expect. Categories are applied inconsistently. Fields are mislabelled at input. Documents are misclassified by upstream systems. Before you trust a number, you have to audit the labels underneath it. The assumption that a dataset is clean because it is structured is a dangerous one.
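
As a toy illustration of the kind of inconsistency a label audit surfaces (the labels here are invented), the same category spelled three ways looks like three categories to any group-by:

```python
from collections import Counter

# Invented labels: "paid" appears in three spellings, "refund" in two.
labels = ["Paid", "paid", "PAID ", "Refund", "refund", "Paid"]

raw = Counter(labels)
normalised = Counter(label.strip().lower() for label in labels)

print(len(raw), raw)                # 5 distinct raw labels
print(len(normalised), normalised)  # 2 distinct after trimming and lowercasing
```

Normalisation only catches the mechanical inconsistencies; the mislabelled-at-input cases still need a human to pull a sample of records and check them against the source.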

Sample-size flags belong in the output

Every finding should carry its own caveat. The number is half the deliverable; the N, the uncertainty range, and the honest note on what could be wrong are the other half. Without those, an 80% based on 30 records and an 80% based on 3,000 records look identical in a slide deck. They are not the same number.
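
As a sketch of what a self-caveating number can look like, here is one way to attach the N and a Wilson score interval to a proportion (the 80% figures mirror the example above and are illustrative):

```python
import math

def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """95% Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

def report(successes: int, n: int) -> str:
    """Format a finding so the caveat travels with the number."""
    lo, hi = wilson_interval(successes, n)
    return f"{successes / n:.0%} (N={n:,}, 95% CI {lo:.0%}-{hi:.0%})"

print(report(24, 30))      # 80% (N=30, 95% CI 63%-90%)
print(report(2400, 3000))  # 80% (N=3,000, 95% CI 79%-81%)
```

Both reports quote the same 80%, but the first admits a spread of nearly thirty points while the second pins the number down to within a point or two.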

The defensible number is the valuable one

A loud finding you cannot defend is worse than no finding. It commits budget, it anchors the team, and it quietly corrodes trust in the next piece of analysis you do.

Statistical discipline is boring. Flagging small N is boring. Publishing caveats is boring. Boring is what makes a number survive a month of pressure.
