# Outlier detection under interval uncertainty: Algorithmic solvability and computational complexity

#### Abstract

In many application areas it is important to detect outliers. The traditional engineering approach to outlier detection is to start with some "normal" values $x_1, \ldots, x_n$, compute the sample average $E$ and the sample standard deviation $\sigma$, and then mark a value $x$ as an outlier if $x$ is outside the $k_0$-sigma interval $[E - k_0 \cdot \sigma,\ E + k_0 \cdot \sigma]$ (for some pre-selected parameter $k_0$). In practice, we often have only interval ranges $[\underline{x}_i, \overline{x}_i]$ for the normal values $x_1, \ldots, x_n$. In this case, we only have intervals of possible values for the bounds $E - k_0 \cdot \sigma$ and $E + k_0 \cdot \sigma$. We can therefore identify outliers as values that are outside all $k_0$-sigma intervals.

Once we identify a value as an outlier for a fixed $k_0$, it is also desirable to find out to what degree this value is an outlier, i.e., what is the largest value $k_0$ for which this value is an outlier.

In this thesis, we analyze the computational complexity of these outlier detection problems, and provide efficient algorithms that solve some of these problems (under reasonable conditions).
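The two tests described in the abstract can be illustrated with a short Python sketch. This is not the thesis's algorithm: the function names are hypothetical, the sample standard deviation is assumed to use the n − 1 divisor, and the interval version is a brute-force check that enumerates all 2^n combinations of interval endpoints, so it is only practical for small n. It relies on the fact that E − k0·σ is a concave function of the values and E + k0·σ is a convex one, so their extreme values over the box of possible inputs are attained at endpoint combinations.

```python
from itertools import product
from statistics import mean, stdev


def k_sigma_interval(values, k0):
    """Return (E - k0*sigma, E + k0*sigma) for point-valued data."""
    e = mean(values)
    s = stdev(values)  # assumption: sample standard deviation, n-1 divisor
    return e - k0 * s, e + k0 * s


def is_outlier(x, values, k0):
    """Classical test: x is an outlier if it lies outside the k0-sigma interval."""
    lo, hi = k_sigma_interval(values, k0)
    return x < lo or x > hi


def is_guaranteed_outlier(x, intervals, k0):
    """Brute-force interval test (exponential in len(intervals)).

    `intervals` is a list of (lower, upper) pairs. x counts as an outlier
    only if it lies outside the k0-sigma interval for every endpoint
    combination, which covers every possible choice of values inside
    the given ranges.
    """
    lowest_lower = float("inf")
    highest_upper = float("-inf")
    for corner in product(*intervals):
        lo, hi = k_sigma_interval(corner, k0)
        lowest_lower = min(lowest_lower, lo)
        highest_upper = max(highest_upper, hi)
    return x < lowest_lower or x > highest_upper


if __name__ == "__main__":
    exact = [9.0, 10.0, 11.0]
    ranges = [(8.5, 9.5), (9.5, 10.5), (10.5, 11.5)]
    print(is_outlier(12.5, exact, 2))              # point-valued test
    print(is_guaranteed_outlier(12.5, ranges, 2))  # interval-valued test
```

With exact values, 12.5 falls outside the 2-sigma interval, but once each value is only known to within ±0.5 it can no longer be declared an outlier with certainty, which is the distinction the abstract draws.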

#### Subject Area

Computer Science

#### Recommended Citation

Patangay, Praveen, "Outlier detection under interval uncertainty: Algorithmic solvability and computational complexity" (2003). *ETD Collection for University of Texas, El Paso*. AAIEP10373.

http://digitalcommons.utep.edu/dissertations/AAIEP10373