Sio-Long Lo

Publication Date



Technical Report: UTEP-CS-10-25


Once we have a collection of values x1, ..., xn corresponding to a class of objects, a usual way to decide whether a new object with the value x of the corresponding property belongs to this class is to check whether the value x belongs to the interval [E - k0 * s, E + k0 * s], where E = (1/n) * (x1 + ... + xn) is the sample mean, V = s^2 = (1/n) * ((x1 - E)^2 + ... + (xn - E)^2) is the sample variance, and the parameter k0 is determined by the degree of confidence with which we want to make the decision. For each value x, the degree of confidence that x belongs to the class depends on the smallest value k for which x belongs to the interval [E - k * s, E + k * s], i.e., on the ratio r = 1/k = s/(E - x). In practice, we often know only the intervals [xi] that contain the actual values xi. Different values xi from these intervals lead, in general, to different values of r, so it is desirable to compute the range [r] of the corresponding values of r. Polynomial-time algorithms are known for computing [r] under certain conditions; whether [r] can be computed in polynomial time in the general case was unknown. In this paper, we prove that the problem of computing [r] is NP-hard. A similar NP-hardness result is proven for the related ratio V/E, which is used in clustering.
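
The quantities defined in the abstract can be illustrated with a minimal Python sketch. The function names (sample_stats, belongs, ratio, ratio_range_grid) are hypothetical and do not come from the report. The first three functions follow the formulas above directly, using the 1/n form of the variance. The last function is only a brute-force grid search over the intervals [xi]: it takes exponential time in n and gives only an inner approximation of the range [r], which is consistent with, but does not demonstrate, the NP-hardness result. It is not one of the polynomial-time algorithms mentioned in the abstract.

```python
import itertools
import math


def sample_stats(xs):
    """Return (E, s): the sample mean and the square root of the 1/n variance."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((xi - mean) ** 2 for xi in xs) / n
    return mean, math.sqrt(var)


def belongs(xs, x, k0):
    """Check whether x lies in [E - k0 * s, E + k0 * s]."""
    mean, s = sample_stats(xs)
    return mean - k0 * s <= x <= mean + k0 * s


def ratio(xs, x):
    """Return r = s / (E - x), as written in the abstract (assumes E != x)."""
    mean, s = sample_stats(xs)
    return s / (mean - x)


def ratio_range_grid(intervals, x, steps=5):
    """Approximate the range [r] by trying a grid of points in each interval.

    Assumption: this is an illustrative brute-force search, exponential in
    the number of intervals; it yields an inner estimate of [r] only.
    """
    grids = [
        [lo + (hi - lo) * j / (steps - 1) for j in range(steps)]
        for lo, hi in intervals
    ]
    values = []
    for xs in itertools.product(*grids):
        mean, s = sample_stats(xs)
        if mean != x:  # skip combinations where r is undefined
            values.append(s / (mean - x))
    return min(values), max(values)
```

For example, with the exact values [1, 2, 3, 4, 5] we get E = 3 and s = sqrt(2), so ratio([1, 2, 3, 4, 5], 1) is sqrt(2)/2. Replacing each value by a narrow interval such as (0.9, 1.1) and calling ratio_range_grid gives an estimated range of r that contains the exact-value ratio.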