Benford's Law Calculator


Enter a list of numbers to see if they follow the distribution predicted by Benford's Law. This is often used in forensic accounting and fraud detection.

Practical Examples

Click on an example to load the data and see how Benford's Law applies to different scenarios.

Corporate Invoices

accounting

A list of invoice amounts from a company. Datasets like this often conform to Benford's Law.

152.34, 28, 475.9, 1102, 34.55, 621, 1987, 54.12, 134, 219.8, 112, 45, 88.7, 1045, 305, 17.6, 953, 1...

Potentially Fraudulent Data

fraud

A dataset where numbers are artificially generated to stay within a narrow range, which often violates Benford's Law.

850, 920, 780, 810, 950, 880, 760, 910, 830, 990, 750, 800, 870, 940, 820, 930, 790, 860, 900, 840, ...

World River Lengths

science

The lengths (in km) of major rivers around the world. Natural phenomena spanning multiple orders of magnitude often follow the law.

6650, 6400, 6300, 6275, 5539, 4880, 4700, 4500, 4444, 4345, 4258, 4180, 4090, 3778, 3700, 3650, 3530...

City Populations

population

A sample of US city populations. Population data is a classic example of a dataset that conforms to Benford's Law.

8175133, 3792621, 2695598, 2100263, 1526006, 1386607, 1321426, 945942, 822458, 672228, 649031, 62096...

Understanding Benford's Law: A Comprehensive Guide
An in-depth look at the first-digit phenomenon, its applications, and the mathematics behind it.

What is Benford's Law?

  • The First-Digit Phenomenon
  • The Mathematical Formula
  • Why Does It Work?
Benford's Law, also known as the First-Digit Law, is a statistical observation about the frequency of leading digits in many real-life sets of numerical data. Contrary to intuition, the digits 1 through 9 do not appear as the leading digit with equal frequency. Instead, 1 is the leading digit about 30.1% of the time, while 9 is the leading digit only about 4.6% of the time. This pattern was first noted by astronomer Simon Newcomb in 1881 and later rediscovered and popularized by physicist Frank Benford in 1938.
The Formula Behind the Law
The probability of a digit 'd' (from 1 to 9) being the first digit in a dataset that follows the law is given by the formula: P(d) = log10(1 + 1/d). This gives P(1) ≈ 0.301, P(2) ≈ 0.176, and so on down to P(9) ≈ 0.046, so the frequency decreases steadily from 1 to 9. The law applies not just to the first digit but can be generalized to the second digit, third digit, and combinations of digits, although the distribution becomes more uniform for later digits.
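As a quick check of the formula, the following Python sketch tabulates the first-digit probabilities. It also evaluates the standard second-digit generalization, which sums the first-digit formula over every possible first digit k. The function names are illustrative and are not taken from the calculator.

```python
import math

def first_digit_probability(d: int) -> float:
    """P(d) = log10(1 + 1/d) for a leading digit d in 1..9."""
    if not 1 <= d <= 9:
        raise ValueError("leading digit must be between 1 and 9")
    return math.log10(1 + 1 / d)

def second_digit_probability(d: int) -> float:
    """Probability that the second digit is d (0..9), summed over first digits k = 1..9."""
    if not 0 <= d <= 9:
        raise ValueError("second digit must be between 0 and 9")
    return sum(math.log10(1 + 1 / (10 * k + d)) for k in range(1, 10))

first = {d: first_digit_probability(d) for d in range(1, 10)}
second = {d: second_digit_probability(d) for d in range(0, 10)}

for d, p in first.items():
    print(f"first digit {d}: {p:.3f}")
for d, p in second.items():
    print(f"second digit {d}: {p:.3f}")
```

The second-digit probabilities cover a much narrower range than the first-digit ones, which is what "more uniform for later digits" means in practice.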
Conditions for Application
Benford's Law works best on data that spans several orders of magnitude. Data that is constrained to a narrow range (e.g., human heights in feet) will not conform. Key criteria for a dataset to follow the law include: numbers must represent magnitudes of events, have no pre-set limits, and should not be composed of assigned numbers like invoice or check numbers.
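One rough way to check the "several orders of magnitude" condition is to measure log10(max/min) over the non-zero values. The sketch below applies this to the visible (non-truncated) values from the Corporate Invoices and Potentially Fraudulent Data examples. The helper name and the idea of using this single number as a screening heuristic are assumptions for illustration, not part of the calculator.

```python
import math

def orders_of_magnitude(values):
    """Return log10(max/min) over the non-zero magnitudes in `values`."""
    magnitudes = [abs(v) for v in values if v != 0]
    if not magnitudes:
        raise ValueError("need at least one non-zero value")
    return math.log10(max(magnitudes) / min(magnitudes))

# Only the values visible in the examples; the full lists are truncated.
invoices_visible = [152.34, 28, 475.9, 1102, 34.55, 621, 1987, 54.12, 134,
                    219.8, 112, 45, 88.7, 1045, 305, 17.6, 953]
narrow_visible = [850, 920, 780, 810, 950, 880, 760, 910, 830, 990,
                  750, 800, 870, 940, 820, 930, 790, 860, 900, 840]

print(f"invoices span: {orders_of_magnitude(invoices_visible):.2f} orders of magnitude")
print(f"narrow-range span: {orders_of_magnitude(narrow_visible):.2f} orders of magnitude")
```

The invoice sample spans a little over two orders of magnitude, while the narrow-range sample spans well under one, so only the first is a plausible candidate for the law.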

Step-by-Step Guide to Using the Benford's Law Calculator

  • Inputting Your Data
  • Interpreting the Results Table
  • Understanding the Chi-Squared Test
Using this calculator is straightforward. Simply copy your list of numbers and paste them into the text area provided. The numbers can be separated by commas, spaces, or new lines. The calculator will automatically parse valid numbers and ignore text or invalid entries.
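The calculator's actual parser is not shown, but the behavior described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: tokens are split on commas and whitespace, anything that is not a finite number is skipped, and zero is skipped because it has no leading digit. The function names are hypothetical.

```python
import re
from decimal import Decimal, InvalidOperation

def parse_numbers(text):
    """Split on commas/whitespace and keep only tokens that are finite numbers."""
    values = []
    for token in re.split(r"[,\s]+", text):
        try:
            value = Decimal(token)
        except InvalidOperation:
            continue  # plain text, empty tokens, and malformed numbers are ignored
        if value.is_finite():
            values.append(value)
    return values

def leading_digits(text):
    """Return the first significant digit of every non-zero number found in `text`."""
    digits = []
    for value in parse_numbers(text):
        first = value.as_tuple().digits[0]  # Decimal stores no leading zeros
        if first != 0:                      # zero itself has no leading digit
            digits.append(first)
    return digits

print(leading_digits("152.34, 28 abc\n0.0045 0 -7.5"))
```

Because commas are treated as separators, a value written with a thousands separator such as 1,102 would be read as two numbers (1 and 102), so such separators should be removed before pasting.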
The Results Table Explained
After you press 'Calculate', the tool generates a table. This table shows each leading digit (1-9) and compares the 'Actual' frequency (from your data) against the 'Benford's' expected frequency. The 'Difference' column highlights how much your data deviates from the expected values, making it easy to spot significant discrepancies.
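A table of this shape can be reproduced with a few lines of Python. The sketch below assumes that 'Actual' is the share of values with a given leading digit and that 'Difference' is actual minus expected; the calculator may present these as percentages or counts instead. The sample data is the visible part of the Corporate Invoices example.

```python
import math
from collections import Counter
from decimal import Decimal

def leading_digit(value):
    """First significant digit of a non-zero number."""
    return Decimal(str(value)).as_tuple().digits[0]

def benford_table(values):
    """Return rows of (digit, actual share, Benford share, difference)."""
    digits = [leading_digit(v) for v in values if v != 0]
    if not digits:
        raise ValueError("need at least one non-zero value")
    counts = Counter(digits)
    rows = []
    for d in range(1, 10):
        actual = counts[d] / len(digits)
        expected = math.log10(1 + 1 / d)
        rows.append((d, actual, expected, actual - expected))
    return rows

# Only the values visible in the example; the full list is truncated.
invoices_visible = [152.34, 28, 475.9, 1102, 34.55, 621, 1987, 54.12, 134,
                    219.8, 112, 45, 88.7, 1045, 305, 17.6, 953]

rows = benford_table(invoices_visible)
print("digit  actual  benford  difference")
for d, actual, expected, diff in rows:
    print(f"{d:>5}  {actual:6.3f}  {expected:7.3f}  {diff:+10.3f}")
```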
The Chi-Squared (χ²) Test
To provide a statistical measure of conformity, the calculator performs a Chi-Squared goodness-of-fit test. This test quantifies the difference between your data's first-digit counts and the counts expected under Benford's distribution. The result includes the Chi-Squared value, the degrees of freedom (which is 8 for first-digit analysis, because there are nine digit categories), and a p-value. A small p-value (typically less than 0.05) indicates a statistically significant deviation, suggesting your data does not follow Benford's Law. A larger p-value means the test found no significant deviation, so the data is consistent with the law; it does not prove conformity.
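The test can be sketched in plain Python without a statistics library. The statistic is the usual sum of (observed - expected)² / expected over the nine digits. For an even number of degrees of freedom the upper-tail probability has a closed form, which is used here for 8 degrees of freedom. The function names are illustrative, and the calculator's own implementation may differ.

```python
import math

BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]

def chi_squared_sf_8(x):
    """Upper-tail probability P(X > x) for a chi-squared variable with 8 degrees of freedom.

    For 2m degrees of freedom: exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!, with m = 4 here.
    """
    half = x / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(4))

def benford_chi_squared(counts):
    """Return (statistic, degrees of freedom, p-value) for first-digit counts of digits 1..9."""
    if len(counts) != 9:
        raise ValueError("expected nine counts, one per leading digit")
    total = sum(counts)
    statistic = sum((obs - total * p) ** 2 / (total * p)
                    for obs, p in zip(counts, BENFORD))
    return statistic, 8, chi_squared_sf_8(statistic)

# Leading-digit counts of the 20 visible values in the narrow-range example
# (all values lie between 750 and 990, so only digits 7, 8, and 9 occur).
narrow_counts = [0, 0, 0, 0, 0, 0, 4, 9, 7]
stat, dof, p = benford_chi_squared(narrow_counts)
print(f"chi-squared = {stat:.1f}, dof = {dof}, p = {p:.3g}")
```

With only 20 values the expected counts are too small for the test to be reliable, but the deviation in this example is large enough that the p-value is far below 0.05 regardless.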

Real-World Applications of Benford's Law

  • Forensic Accounting and Fraud Detection
  • Election Auditing
  • Scientific Data Validation
Detecting Financial Fraud
One of the most powerful applications of Benford's Law is in forensic accounting. When people fabricate numbers (e.g., invoices, check payments, or expense claims), they tend to spread leading digits more evenly than the law predicts, which breaks its logarithmic pattern. Auditors use this to red-flag datasets that deviate significantly, pointing to potential fraud, manipulation, or errors.
Analyzing Election Data
Benford's Law has been used to analyze vote counts in elections. While a deviation from the law is not definitive proof of fraud (as precinct data may not meet the necessary criteria), it can serve as an indicator for further investigation. It helps identify statistical anomalies in voter turnouts or candidate totals that warrant a closer look.
Validating Scientific and Economic Data
Scientists and economists apply the law to validate the integrity of large datasets. Whether it's macroeconomic data, river lengths, or physical constants, data from natural processes often conforms to the law. A mismatch can suggest measurement errors, data processing issues, or even scientific misconduct.

Common Misconceptions and Correct Methods

  • Not All Data Sets Apply
  • It's a Red Flag, Not Proof
  • Sample Size Matters
Universal Applicability is a Myth
A common mistake is to apply Benford's Law to any dataset. As mentioned, it does not apply to data with a restricted range (like test scores from 0-100), assigned numbers (zip codes, phone numbers), or data shaped by human decisions (like prices deliberately set just below a round threshold, such as 9.99 or 99.99). Applying the law to such data will lead to false conclusions.
A Tool for Screening, Not Conviction
A significant deviation from Benford's Law is a statistical red flag, not a smoking gun. It indicates that the data is anomalous and requires further investigation to determine the cause. The cause could be fraud, but it could also be a data processing error, a natural characteristic of the dataset, or an incorrect application of the law.
The Importance of Sufficient Data
For the analysis to be statistically meaningful, the dataset should be sufficiently large. While there's no magic number, a sample size of at least 50-100 valid numbers is recommended, with larger datasets yielding more reliable results. A small sample may show deviations due to random chance alone.
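One way to put a rough number on "sufficiently large" is the common rule of thumb that every category in a Chi-Squared test should have an expected count of at least 5. This rule is an assumption brought in here, not a requirement stated by the calculator. Since digit 9 is the rarest leading digit, it determines the minimum sample size:

```python
import math

def min_sample_size(min_expected=5):
    """Smallest N for which the rarest leading digit (9) has an expected count >= min_expected."""
    p9 = math.log10(1 + 1 / 9)  # about 0.0458
    return math.ceil(min_expected / p9)

print(min_sample_size())  # 110
```

Under that rule of thumb, a first-digit Chi-Squared test needs about 110 valid numbers, which sits at the upper end of the range suggested above.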

Mathematical Derivation and Examples

  • Logarithmic Spacing
  • Scale Invariance
  • Base Invariance
Why Logarithms?
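The law follows from how leading digits are spaced on a logarithmic scale. Within each power of ten, numbers with leading digit d occupy the interval from log10(d) to log10(d + 1), whose width is log10(1 + 1/d). The gap between log10(1) and log10(2) (about 0.301) is much larger than the gap between log10(8) and log10(9) (about 0.051). Benford's Law arises when data is uniformly distributed on a logarithmic scale, or more precisely when the fractional part of log10(x) is uniformly distributed. This tends to happen with processes involving multiplicative growth (like investments or population growth).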
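Repeated doubling is about the simplest multiplicative process, and the powers of two are a classic deterministic example of a sequence whose leading digits follow the law. The sketch below counts the leading digits of 2^0 through 2^999 using exact integer arithmetic; the choice of 1000 terms is arbitrary.

```python
import math
from collections import Counter

# Leading digit of each of the first 1000 powers of two (exact integers, no rounding).
counts = Counter(int(str(2 ** n)[0]) for n in range(1000))

print("digit  observed  benford")
for d in range(1, 10):
    observed = counts[d] / 1000
    expected = math.log10(1 + 1 / d)
    print(f"{d:>5}  {observed:8.3f}  {expected:7.3f}")
```

Exactly 301 of these 1000 numbers start with 1, matching the predicted share of about 30.1%.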
Scale Invariance
A key property of datasets that follow Benford's law is scale invariance. If you take a set of values (like lengths in miles) and convert them to another unit (kilometers), the new set of numbers will still follow Benford's Law, at least approximately. Benford's distribution is the only first-digit distribution with this property, which is why the law can be applied to data recorded in different units or currencies.
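Scale invariance can be illustrated with the miles-to-kilometers conversion mentioned above. In this sketch the "lengths in miles" are simply the first 1000 powers of two, chosen because they are a deterministic Benford-like dataset rather than real measurements. One mile is exactly 1.609344 km, so multiplying by the integer 1609344 gives the kilometer value times 10^6, and the extra power of ten does not change the leading digit.

```python
import math
from collections import Counter

def leading_digit_frequencies(integers):
    """Share of each leading digit 1..9 among a list of positive integers."""
    counts = Counter(int(str(n)[0]) for n in integers)
    total = sum(counts.values())
    return [counts[d] / total for d in range(1, 10)]

miles = [2 ** n for n in range(1000)]      # stand-in dataset, not real lengths
km_scaled = [n * 1609344 for n in miles]   # km * 10**6; same leading digits as km

freq_miles = leading_digit_frequencies(miles)
freq_km = leading_digit_frequencies(km_scaled)

print("digit  miles   km      benford")
for d in range(1, 10):
    expected = math.log10(1 + 1 / d)
    print(f"{d:>5}  {freq_miles[d - 1]:.3f}  {freq_km[d - 1]:.3f}  {expected:.3f}")
```

Both columns stay close to the Benford shares even though almost every individual number changes its leading digit under the conversion.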
Base Invariance
While typically demonstrated in base-10, Benford's Law is not dependent on our decimal system. The principle holds true for other number bases as well: in base b, the probability of leading digit d (from 1 to b - 1) is P(d) = log_b(1 + 1/d). This suggests that the phenomenon reflects how the underlying quantities are distributed and is not just an artifact of how we write them.
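The base-b form of the law, P(d) = log_b(1 + 1/d) for d from 1 to b - 1, is easy to tabulate. The sketch below does this for a few bases; the function name is illustrative.

```python
import math

def benford_probabilities(base=10):
    """Leading-digit probabilities log_base(1 + 1/d) for d = 1 .. base - 1."""
    if base < 2:
        raise ValueError("base must be at least 2")
    return [math.log(1 + 1 / d, base) for d in range(1, base)]

for base in (2, 8, 10, 16):
    probs = benford_probabilities(base)
    print(f"base {base}: P(1) = {probs[0]:.3f}, P({base - 1}) = {probs[-1]:.3f}")
```

In base 2 every non-zero number starts with the digit 1, so the law reduces to P(1) = 1. In larger bases the probabilities are spread over more digits but keep the same decreasing shape.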