Sample statistics deviations - Dismounting Rider

Types of errors

The estimates of the mean and median weights of coin series are subject to four types of error:

1. The data analysed may not be a representative random sample. This may be due to a number of factors. For example, museums and major auction houses tend to select better preserved and therefore potentially heavier specimens. Another example is a situation where a significant part of the data comes from a single hoard of coins that were still in near-mint state, so that their weights are unaffected by wear in circulation.
2. The computed mean and median are sample statistics and are thus subject to sampling error. In other words, the data being analysed are a random sample from a probability distribution, and therefore the calculated means and medians are random variables that may deviate from the theoretical values.
3. The data are subject to rounding error. Weights are usually given to two decimal places; some publications and catalogues give weights to three decimal places, but this is not usual.
4. Coin weights are subject to measurement error, which may exceed the rounding error. The existence of this error can be observed on specimens published in multiple sources. For example, for coins that have appeared on the market more than once, it is sometimes possible to find differences between the weights given in the auction catalogues.

Numerical illustration

We now leave aside the first type of error and use a simulation to illustrate the effect of the other three types for coins of stater weight. As in the section Estimation of the weight standard, we assume that the current weight of a coin, W_observed, can be expressed as the difference between its original weight W_original when it left the mint and the weight loss L due to wear, physico-chemical processes caused by environmental influences over time, and possibly other factors, such as cutting metal off the edge (so-called clipping), i.e.

W_observed = W_original – L.

We will also assume that these random variables are independent, that W_original is normally distributed with parameters μ = 10.80 g and σ = 0.07 g, and that L is exponentially distributed with rate parameter λ = 6.00, so that the mean weight loss is 1/λ ≈ 0.167 g (this choice of distributions is inspired by the illustrative numerical example in the section Estimation of the weight standard). The quantity W_observed thus has the mirror-reversed exponentially modified Gaussian distribution; that is, −W_observed has the exponentially modified Gaussian distribution with parameters −μ, σ and λ. Its probability density is shown in Figure 1.

Figure 1: Probability density of the exponentially modified Gaussian distribution
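The density can also be evaluated directly. The following Python sketch assumes the standard closed form of the exponentially modified Gaussian density, reflected about zero, with the parameter values quoted above; the function names are ours and purely illustrative, not part of the original analysis. As a sanity check it integrates the density numerically, and the result should be very close to 1.

```python
import math

# Parameter values quoted in the text (weights in grams).
MU, SIGMA, LAM = 10.80, 0.07, 6.00


def std_normal_cdf(z):
    """Standard normal distribution function, via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def observed_weight_pdf(w, mu=MU, sigma=SIGMA, lam=LAM):
    """Density of W_observed = W_original - L (normal minus exponential)."""
    u = mu - w
    return (lam * math.exp(-lam * u + 0.5 * (lam * sigma) ** 2)
            * std_normal_cdf(u / sigma - lam * sigma))


def trapezoid(f, a, b, steps=5000):
    """Plain trapezoidal rule on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h


area = trapezoid(observed_weight_pdf, 9.0, 11.5)
print(f"area under the density on [9.0, 11.5]: {area:.4f}")
```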

Denote by A and M the mean and median of the quantity W_observed. For the chosen values of the parameters μ, σ and λ we have

A	=	10.633 g,
M	=	10.672 g.
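These two values can be checked independently. The mean follows from A = μ − 1/λ, and the median can be found by solving F(M) = 1/2 numerically. The sketch below assumes the closed-form distribution function of a normal variable minus an independent exponential variable; the bisection routine and all function names are our own illustrative choices.

```python
import math

# Parameter values quoted in the text (weights in grams).
MU, SIGMA, LAM = 10.80, 0.07, 6.00


def std_normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def observed_weight_cdf(w, mu=MU, sigma=SIGMA, lam=LAM):
    """P(W_observed <= w) for W_observed = W_original - L."""
    u = mu - w
    tail = math.exp(-lam * u + 0.5 * (lam * sigma) ** 2)
    return (1.0 - std_normal_cdf(u / sigma)
            + tail * std_normal_cdf(u / sigma - lam * sigma))


def median_by_bisection(lo=9.0, hi=12.0, tol=1e-9):
    """Solve cdf(w) = 0.5 on [lo, hi]; the cdf is increasing in w."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if observed_weight_cdf(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


mean_A = MU - 1.0 / LAM          # mean of a normal minus mean of an exponential
median_M = median_by_bisection()
print(f"A = {mean_A:.3f} g, M = {median_M:.3f} g")
```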

In addition, we will consider rounding to two decimal places and the measurement error. Let us denote the measurement error by ε and assume that it is normally distributed with zero mean and standard deviation 0.01 g. Its probability density is shown in Figure 2. The absolute value of the measurement error is at most 0.01 g with a probability of 68.3% and at most 0.02 g with a probability of 95.4%.
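The two probabilities are the usual one- and two-standard-deviation coverages of the normal distribution. A minimal Python sketch of the calculation, assuming only the zero-mean normal model stated above (the function name is illustrative):

```python
import math

SD_EPS = 0.01  # standard deviation of the measurement error, in grams


def prob_abs_error_at_most(bound, sd=SD_EPS):
    """P(|eps| <= bound) for eps ~ N(0, sd^2)."""
    return math.erf(bound / (sd * math.sqrt(2.0)))


for bound in (0.01, 0.02):
    p = prob_abs_error_at_most(bound)
    print(f"P(|eps| <= {bound:.2f} g) = {100 * p:.2f}%")
```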

Figure 2: Probability density of the measurement error

Denote by W^(E)_observed the value of W_observed+ε rounded to two decimal places. Formally expressed:

W^(E)_observed = ⌊100×(W_observed + ε) + 1/2⌋/100,

where ⌊.⌋ denotes the greatest integer less than or equal to the argument (the floor function).
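In code, this rounding rule can be sketched as follows. The function names round_half_up_2dp and recorded_weight are hypothetical helpers introduced here for illustration.

```python
import math


def round_half_up_2dp(x):
    """Round to two decimal places as floor(100 * x + 1/2) / 100."""
    return math.floor(100.0 * x + 0.5) / 100.0


def recorded_weight(w_observed, eps):
    """Weight as it would be recorded: true weight plus error, then rounded."""
    return round_half_up_2dp(w_observed + eps)


print(recorded_weight(10.672, 0.0))    # no measurement error
print(recorded_weight(10.672, 0.004))  # a small positive measurement error
```

Python's built-in round is deliberately avoided here, because it rounds exact halves to the nearest even digit, which differs from the floor-based formula.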

We will consider random samples of size n, where n = 10, 50, 100 and 500 coins. For each random sample from the mirror-reversed exponentially modified Gaussian distribution specified above, denote by w₁, … , w_n the exact weights of the coins in the sample and by w^(E)₁, … , w^(E)_n the weights perturbed by measurement errors and rounded to two decimal places. Denote the sample means and medians as follows:

A_n	=	(w₁ + … + w_n)/n,
A^(E)_n	=	(w^(E)₁ + … + w^(E)_n)/n,
M_n	=	median of w₁, … , w_n,
M^(E)_n	=	median of w^(E)₁, … , w^(E)_n.

Tables 1 and 2 show the probabilities of deviations of these sample statistics from the mean A and the median M. For example, according to Table 1, the probability that |A₁₀ − A| > 0.01 is 86.1%, where |.| denotes the absolute value of the argument. These probabilities were estimated using the Monte Carlo method, with 10⁷ (ten million) simulations for each sample size n.
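The simulation procedure can be sketched in a few lines of Python. This is a minimal illustration under the model assumptions stated above, not the code used to produce the tables: it uses far fewer simulations, the function and variable names are our own, and the median M is simply taken from the value quoted in the text.

```python
import math
import random
import statistics

# Model parameters quoted in the text (weights in grams).
MU, SIGMA, LAM = 10.80, 0.07, 6.00
SD_EPS = 0.01                 # standard deviation of the measurement error
MEAN_A = MU - 1.0 / LAM       # theoretical mean, about 10.633
MEDIAN_M = 10.672             # theoretical median, value quoted in the text


def round_half_up_2dp(x):
    """Round to two decimal places as floor(100 * x + 1/2) / 100."""
    return math.floor(100.0 * x + 0.5) / 100.0


def deviation_probabilities(n, trials, threshold, seed=0):
    """Estimate P(|statistic - target| > threshold) for the four sample statistics."""
    rng = random.Random(seed)
    counts = {"A": 0, "A_E": 0, "M": 0, "M_E": 0}
    for _ in range(trials):
        # Exact weights: normal original weight minus exponential weight loss.
        exact = [rng.gauss(MU, SIGMA) - rng.expovariate(LAM) for _ in range(n)]
        # Recorded weights: add measurement error, then round to 0.01 g.
        recorded = [round_half_up_2dp(w + rng.gauss(0.0, SD_EPS)) for w in exact]
        counts["A"] += int(abs(sum(exact) / n - MEAN_A) > threshold)
        counts["A_E"] += int(abs(sum(recorded) / n - MEAN_A) > threshold)
        counts["M"] += int(abs(statistics.median(exact) - MEDIAN_M) > threshold)
        counts["M_E"] += int(abs(statistics.median(recorded) - MEDIAN_M) > threshold)
    return {name: count / trials for name, count in counts.items()}


example = deviation_probabilities(n=10, trials=2000, threshold=0.01)
for name, p in example.items():
    print(f"{name}: {100 * p:.1f}%")
```

With only a few thousand simulations the estimates fluctuate by around one percentage point; many more simulations are needed to stabilise the first decimal place.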

Probabilities of absolute deviations exceeding each threshold (in g):
sample size	comparison	>0.01	>0.02	>0.03	>0.04	>0.05
10	A₁₀ − A	86.1%	72.5%	59.7%	47.9%	37.5%
10	A^(E)₁₀ − A	86.1%	72.5%	59.7%	48.0%	37.6%
50	A₅₀ − A	69.5%	43.3%	23.9%	11.6%	5.0%
50	A^(E)₅₀ − A	69.6%	43.4%	24.0%	11.7%	5.0%
100	A₁₀₀ − A	58.0%	26.8%	9.6%	2.7%	0.6%
100	A^(E)₁₀₀ − A	58.0%	26.9%	9.7%	2.7%	0.6%
500	A₅₀₀ − A	21.6%	1.3%	0.0%	0.0%	0.0%
500	A^(E)₅₀₀ − A	21.7%	1.4%	0.0%	0.0%	0.0%

Table 1: Probabilities of deviations of sample means from the mean

Probabilities of absolute deviations exceeding each threshold (in g):
sample size	comparison	>0.01	>0.02	>0.03	>0.04	>0.05
10	M₁₀ − M	85.1%	70.7%	57.3%	45.2%	34.8%
10	M^(E)₁₀ − M	85.1%	70.7%	57.3%	45.2%	34.9%
50	M₅₀ − M	68.8%	42.2%	22.9%	11.0%	4.8%
50	M^(E)₅₀ − M	68.8%	42.2%	23.0%	11.2%	4.9%
100	M₁₀₀ − M	57.3%	26.0%	9.2%	2.6%	0.6%
100	M^(E)₁₀₀ − M	57.3%	26.1%	9.4%	2.8%	0.7%
500	M₅₀₀ − M	21.0%	1.3%	0.0%	0.0%	0.0%
500	M^(E)₅₀₀ − M	24.0%	2.0%	0.1%	0.0%	0.0%

Table 2: Probabilities of deviations of sample medians from the median

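Not surprisingly, the results of these simulations show that for coins of higher weight (we considered staters), both rounding and measurement errors play a negligible role in estimating the mean and median. The probabilities with and without these errors differ by at most 0.2 percentage points, except for the median at n = 500, where they differ by up to 3 percentage points. The size of the analysed sample of coins, however, is a crucial factor.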

3 April 2024 – 18 May 2025