Bridging Probability Theory and Statistical Estimation
Bridging Probability Theory and Statistical Estimation is a summary of an exercise on the precision of estimating the mean value of a quantity from independent measurements.
The result of every measurement is assumed to be a real number \(X\), independently drawn from a normal distribution centered at an unknown "true value" \(X_0\) with unknown variance \(S^2\).
This note presents a mathematically grounded interpretation of the uncertainty in an estimate \(\bar{X}\) of an unknown true value \(X_0\), using the probability density function.
The approach builds a bridge between Kolmogorov-style probability theory and practical statistical inference — without requiring commitment to Bayesian or frequentist ideology.
The goal of this article is to propose precise terminology and to eliminate ambiguous terms that often lead to confusion.
Introduction
Editors and practitioners often face vague questions like:
“How precise is your estimate?”
“What’s the accuracy of your estimate?”
“What’s the error of your estimate?”
“How confident are you in your estimate?”
“What’s the possible deviation of your estimate from the true value?”
These questions must be reformulated as requests for clearly defined statistical quantities.
This task is challenging due to widespread misunderstandings and long-standing confusions. They persist even in the 21st century; some of them are discussed in the publications [1][2][3][4][5].
Here, we do not delve into these popular confusions, but provide simple formulas that, we hope, help to avoid the confusions and misinterpretations.
Model Assumptions
There is an unknown true value \(X_0 \in \mathbb{R}\) to be estimated.
The Expert performs \(N\) independent measurements \(X_1, X_2, \dots, X_N\), modeled as \[ X_i \sim \mathcal{N}(X_0, S^2) \] with both \(X_0\) and \(S\) unknown.
The Expert computes the following quantities:
Sample mean (point estimate): \[ \bar{X} = \frac{1}{N} \sum_{i=1}^N X_i \]
Sample standard deviation: \[ s = \sqrt{ \frac{1}{N - 1} \sum_{i=1}^N (X_i - \bar{X})^2 } \]
Naive standard error: \[ c_N = \frac{s}{\sqrt{N}} \]
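For illustration, here is a minimal Python sketch of these computations, assuming hypothetical true values \(X_0 = 10\) and \(S = 2\) (which, in a real experiment, the Expert does not know):

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Hypothetical true parameters, unknown to the Expert in practice.
X0, S, N = 10.0, 2.0, 8

# N independent measurements X_i ~ Normal(X0, S^2).
X = rng.normal(loc=X0, scale=S, size=N)

X_bar = X.mean()          # sample mean, the point estimate of X0
s = X.std(ddof=1)         # sample standard deviation, with the N-1 denominator
c_N = s / np.sqrt(N)      # naive standard error of the mean

print(f"X_bar = {X_bar:.3f},  s = {s:.3f},  c_N = {c_N:.3f}")
```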
Anticipating the main result, it can be mentioned that at small \(N\), the naive standard error \(c_N\) underestimates the uncertainty of the estimate \(\bar{X}\) of the "true value" \(X_0\).
Under the model assumptions, the estimate of this uncertainty that is unbiased in expectation is
\[ \sigma_N = \sqrt{\frac{N-1}{N-3}\ }\cdot c_N \]
It is considered below.
Likelihood-Like Density
Let
\[ f_N(x) = \frac{1}{c_N} \cdot \mathrm{Student}_{N-1}\!\left( \frac{x - \bar{X}}{c_N} \right) \]
This is the probability density function of a Student’s t-distribution with \(N\!-\!1\) degrees of freedom, centered at the sample mean \(\bar{X}\) and scaled by the standard error \(c_N\).
Function \(f_N\) acts as a confidence distribution density (or data-driven predictive density) for the value \(X_0\), under the model assumptions and conditional on the observed data.
One may interpret
\[ \int_A^B f_N(x) \, \mathrm dx \]
as probability that \(X_0\in(A,B)\), given the data and modeling assumptions.
It answers questions like:
"How close is \(\bar{X}\) likely to be to the true value \(X_0\)?"
"What’s the uncertainty of the estimate?"
Mean Square Width (Expected Squared Error)
The expected squared deviation is: \[ \sigma_N^2 = \int_{-\infty}^{\infty} (x - \bar{X})^2 \ f_N(x) \ \mathrm d x \]
which gives \[ \sigma_N=\sqrt{\frac{N-1}{N-3}}\;c_N \]
This is the corrected standard error. It has the property: \[ \mathbb{E}[(\bar{X} - X_0)^2]=\mathbb{E}[\sigma_N^2] \]
In such a way, \(\sigma_N\) properly accounts for small‑sample variability.
The correction factor \( \sqrt{(N{-}1)/(N{-}3)} \) arises from computing the second moment of the Student Distribution and accounts for extra variability in \(s\) due to estimating \(S\).
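That second-moment computation is easy to check numerically; a sketch comparing the integral \(\int_{-\infty}^{\infty} (x-\bar{X})^2 f_N(x)\,\mathrm dx\) with the closed form, using the same illustrative sample statistics:

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

X_bar, s, N = 10.4, 1.9, 8       # illustrative sample statistics
c_N = s / np.sqrt(N)
f_N = stats.t(df=N - 1, loc=X_bar, scale=c_N)

# Second central moment of f_N by numerical integration.
second_moment, _ = quad(lambda x: (x - X_bar) ** 2 * f_N.pdf(x),
                        -np.inf, np.inf)

# Closed form: sigma_N = sqrt((N-1)/(N-3)) * c_N.
sigma_N = np.sqrt((N - 1) / (N - 3)) * c_N

print(np.sqrt(second_moment), sigma_N)   # the two values should agree
```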
However, \(\sigma_N\) makes sense only for \(N>3\).
As for the probability density function \(f_N\), it is defined for any integer \(N>1\).
Future Measurement Expectation
In this section, an additional question is considered:
What should the Expert expect from a new, similar, independent measurement, the \((N{+}1)\)th?
Given the data and assuming the same measurement process, the predictive distribution for \(X_{N+1}\) has the conditional probability density \(g\), expressed as follows:
\[ g(x)= \frac{1}{s \cdot \sqrt{1 + \frac{1}{N} }} \mathrm{Student}_{N-1}\left(\frac{x-\bar{X}}{s \cdot \sqrt{1 + \frac{1}{N}}} \right) \]
This distribution density reflects both the randomness of the next measurement and the uncertainty in estimating \(X_0\).
The expected root-mean-square deviation of the next measurement from \(\bar{X}\) is:
\[ \sqrt{\int_{-\infty}^{\infty} (x - \bar{X})^2 \ g(x) \ \mathrm d x} = \sqrt{\frac{N-1}{N-3}} \cdot s \cdot \sqrt{1 + \frac{1}{N}} \]
It shows that the next measurement is expected to vary more than either the sample standard deviation \(s\) or the standard error \(c_N\), due to compounded uncertainty.
Use of this estimate helps avoid **overconfidence** in forecasting, especially at small \(N\), and acknowledges that even a fixed “true value” \(X_0\) does not guarantee low variance in future data.
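A sketch of \(g\) with the same illustrative sample statistics; the variance of the scaled Student t reproduces the displayed spread:

```python
import numpy as np
from scipy import stats

X_bar, s, N = 10.4, 1.9, 8            # illustrative sample statistics
scale = s * np.sqrt(1 + 1 / N)        # naive predictive scale

# g: predictive distribution of the next measurement X_{N+1}.
g = stats.t(df=N - 1, loc=X_bar, scale=scale)

# Central 95% predictive interval for X_{N+1}.
print("95% predictive interval:", g.interval(0.95))

# RMS deviation of X_{N+1} from X_bar; matches sqrt((N-1)/(N-3)) * scale.
print("predictive spread:", np.sqrt(g.var()),
      np.sqrt((N - 1) / (N - 3)) * scale)
```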
Practical Significance
The consideration above:
is defined within classical probability theory,
requires no specific ideology,
corrects the naive standard error \(\displaystyle \ c_N = \frac s{\sqrt N} \ \), which underestimates the uncertainty at small \(N\).
In the limit \(N \to \infty\), the Student's t-distribution converges to the normal distribution, and the naive standard error \(c_N\) becomes an increasingly accurate estimate of the uncertainty.
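For example, the correction factor \(\sqrt{(N-1)/(N-3)}\) is \(\sqrt{2}\approx 1.41\) at \(N=5\), about \(1.06\) at \(N=20\), and about \(1.01\) at \(N=100\).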
Comparison table:

| Quantity | Formula | Interpretation |
|---|---|---|
| \(c_N\) | \(\displaystyle\frac{s}{\sqrt{N}}\) | Naive standard error of the mean |
| \(\sigma_N\) | \(\displaystyle\sqrt{\frac{N-1}{N-3}} \cdot c_N\) | Corrected standard error, accounting for small \(N\) |
| Naive predictive SE | \(\displaystyle s \cdot \sqrt{1 + \frac{1}{N}}\) | Scale of \(g(x)\) (not its full spread) |
| Predictive spread | \(\displaystyle \sqrt{\frac{N-1}{N-3}} \cdot s \cdot \sqrt{1 + \frac{1}{N}}\) | Expected RMS deviation of the next measurement |
Notably, the same correction factor \(\sqrt{\frac{N - 1}{N - 3}}\) appears both in the standard error of the sample mean \(\bar{X}\) and in the expected deviation of a future measurement \(X_{N+1}\). This factor arises from the variance of the Student’s t-distribution with \(N-1\) degrees of freedom, and corrects for the additional uncertainty from estimating \(S\) with \(s\). While the predictive density \(g(x)\) is often presented using the naive scale \(s \cdot \sqrt{1 + \frac{1}{N}}\), the actual root-mean-square deviation includes the same factor as the corrected standard error of the mean.
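Concretely, the Student t-distribution with \(\nu\) degrees of freedom has variance \(\nu/(\nu-2)\) for \(\nu>2\); with \(\nu = N-1\) this gives

\[
\sigma_N^2 = \frac{N-1}{N-3}\, c_N^2,
\qquad
\sigma_N = \sqrt{\frac{N-1}{N-3}}\; c_N .
\]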
Originality
The formulas above are not original.
Various authors have discussed common misinterpretations and misleading uses related to quantifying the precision of the estimate of the mean value from a set of independent measurements [1][2][3][4][5][6].
We have tried to compile the results in the most concise and compact, yet correct and internally consistent, form.
Conclusion
This construction shows how statistical estimation can be framed as probabilistically coherent predictive inference, even within a non-Bayesian or fully deterministic worldview.
This framework enables Experts to translate vague or poorly posed questions about "accuracy" or "confidence" into well-defined probabilistic statements, entirely within classical probability theory and without requiring Bayesian priors.
All probabilistic statements here are conditional on the observed data and modeling assumptions, rather than arising from prior distributions.
Warning
Neither deduction nor proof of the formulas above is presented in this article.
However, the Editor and ChatGPT made every effort to catch and correct all mistakes and misprints.
If you spot a mistake that is not yet corrected here, please let the Editor know.
References
1. Richard D Morey, Rink Hoekstra, Jeffrey N Rouder, Michael D Lee, Eric-Jan Wagenmakers. The fallacy of placing confidence in confidence intervals. Psychon Bull Rev. 2015 Oct 8;23:103–123. doi:10.3758/s13423-015-0947-8. https://pmc.ncbi.nlm.nih.gov/articles/PMC4742505/ “Interval estimates – estimates of parameters that include an allowance for sampling uncertainty – have long been touted as a key component of statistical analyses. There are several kinds of interval estimates, but the most popular are confidence intervals (CIs): intervals that contain the true parameter value in some known proportion of repeated samples, on average. The width of confidence intervals is thought to index the precision of an estimate; CIs are thought to be a guide to which parameter values are plausible or reasonable; and the confidence coefficient of the interval (e.g., 95%) is thought to index the plausibility that the true parameter is included in the interval. We show in a number of examples that CIs do not necessarily have any of these properties, and can lead to unjustified or arbitrary inferences.”
2. Sander Greenland, Stephen J Senn, Kenneth J Rothman, John B Carlin, Charles Poole, Steven N Goodman, Douglas G Altman. Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations. Eur J Epidemiol. 2016 May 21;31:337–350. doi:10.1007/s10654-016-0149-3. https://pmc.ncbi.nlm.nih.gov/articles/PMC4877414/ “Misinterpretation and abuse of statistical tests, confidence intervals, and statistical power have been decried for decades, yet remain rampant. A key problem is that there are no interpretations of these concepts that are at once simple, intuitive, correct, and foolproof. Instead, correct use and interpretation of these statistics requires an attention to detail which seems to tax the patience of working scientists. This high cognitive demand has led to an epidemic of shortcut definitions and interpretations that are simply wrong, sometimes disastrously so—and yet these misinterpretations dominate much of the scientific literature. In light of this problem, we provide definitions and a discussion of basic statistics that are more general and critical than typically found in traditional introductory expositions. Our goal is to provide a resource for instructors, researchers, and consumers of statistics whose knowledge of statistical theory and technique may be limited but who wish to avoid and spot misinterpretations. We emphasize how violation of often unstated analysis protocols (such as selecting analyses for presentation based on the P values they produce) can lead to small P values even if the declared test hypothesis is correct, and can lead to large P values even if that hypothesis is incorrect. We then provide an explanatory list of 25 misinterpretations of P values, confidence intervals, and power. We conclude with guidelines for improving statistical interpretation and reporting.”
3. https://link.springer.com/article/10.3758/s13423-015-0947-8 — Psychonomic Bulletin & Review, Oct 2015. “Confidence intervals are thought to index the precision of an estimate… CIs do not necessarily have any of these properties and thus cannot be used uncritically in this way.” Why it’s relevant: highlights the error of treating CIs as direct measures of probability about parameters.
4. https://www.frontiersin.org/articles/10.3389/fpsyg.2022.948423/full — Frontiers in Psychology, 2022. “It is given as interpretation … that any value within the 95% confidence interval could reasonably be the true value … This is a very common problem and results in ‘confusion intervals.’” Why it’s relevant: shows the widespread nature of this misunderstanding.
5. https://pubmed.ncbi.nlm.nih.gov/27256121 — Eur J Epidemiol, Apr 2016. “There are no interpretations … that are at once simple, intuitive, correct, and foolproof … users routinely misinterpret them (e.g. interpreting 95% CI as ‘there is a 0.95 probability that the parameter is contained in the CI’).” Why it’s relevant: authoritative critique on core misinterpretations.
6. https://arxiv.org/abs/1807.06217 — arXiv, Jul 2018. “The so‑called ‘confidence curve’ … may assign arbitrarily low non‑zero probability to the true parameter; thus it is a misleading representation of uncertainty.” Why it’s relevant: supplies theoretical foundation for careful formulation of \(f_N(x)\).
Keywords
«Bayesian ideology», «Bayesian statistics», «Central Limit Theorem» (CLT), «ChatGPT», «Confidence interval», «Credible interval», «Duration5», «Expectation and variance», «Frequentist ideology», «Independence and conditional probability», «Law of Large Numbers» (LLN), «Maximum likelihood estimation» (MLE), «Mean square deviation», «Mean value», «Normal distribution», «Probability», «Probability Density Function», «Random variable», «Sampling distributions», «Standard deviation», «Standard Error», «Student Distribution», «Theory of Probability»,