Departmental Technical Reports (CS)

Publication Date

11-2015

Technical Report: UTEP-CS-15-87

To appear in Journal of Innovative Technology and Education

Abstract

In statistical analysis, we usually use the observed sample values x1, ..., xn to compute the values of several statistics v(x1, ..., xn), such as the sample mean and the sample variance. The usual formulas for these statistics implicitly assume that we know the exact values x1, ..., xn. In practice, the sample values X1, ..., Xn come from measurements and are thus only approximations to the actual (unknown) values x1, ..., xn of the corresponding quantity. Often, the only information that we have about each measurement error Δxi = Xi − xi is an upper bound Δi on its absolute value: |Δxi| ≤ Δi. In this case, the only information about each actual value xi is that it belongs to the interval [Xi − Δi, Xi + Δi]. It is therefore desirable to compute the range of each given statistic v(x1, ..., xn) over these intervals. It is known that estimating the range of a robust statistic (e.g., the median) is often computationally easier than estimating the range of its traditional equivalent (e.g., the mean). In this paper, we provide a qualitative explanation for this phenomenon.
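The interval setting described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the report; the function name `interval_range` and the sample numbers are hypothetical. The sketch relies on one standard fact: the median does not decrease when any single xi increases, so its range over the box of intervals [Xi − Δi, Xi + Δi] is attained at the vectors of lower and upper endpoints. The same shortcut does not apply to statistics that are not monotone in each xi.

```python
from statistics import median
from typing import Callable, Sequence, Tuple


def interval_range(
    stat: Callable[[Sequence[float]], float],
    measured: Sequence[float],
    bounds: Sequence[float],
) -> Tuple[float, float]:
    """Range of `stat` over the intervals [X_i - D_i, X_i + D_i].

    Assumption (not checked here): `stat` is nondecreasing in each
    argument, as the median is. Only then are the minimum and maximum
    attained at the all-lower and all-upper endpoint vectors.
    """
    if len(measured) != len(bounds):
        raise ValueError("measured values and bounds must have equal length")
    if any(d < 0 for d in bounds):
        raise ValueError("error bounds must be non-negative")

    # Endpoints of the interval containing each actual value x_i.
    lowers = [x - d for x, d in zip(measured, bounds)]
    uppers = [x + d for x, d in zip(measured, bounds)]
    return stat(lowers), stat(uppers)


# Hypothetical measurement results X_i and error bounds D_i.
X = [1.0, 2.0, 4.0]
D = [0.5, 0.25, 1.0]
median_lo, median_hi = interval_range(median, X, D)
print(median_lo, median_hi)
```

For these hypothetical inputs the intervals are [0.5, 1.5], [1.75, 2.25], and [3.0, 5.0], so the median of the actual values can lie anywhere between 1.75 and 2.25.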
