Data processing under a combination of interval and probabilistic uncertainty and its application to earth and environmental studies and engineering

Jan Bastian Beck, University of Texas at El Paso

Abstract

In many areas of science and engineering, we are interested in the values of physical quantities that are difficult (or even impossible) to measure directly. For example, it is very difficult to directly measure the amount of oil in a well or, more generally, any geophysical characteristic describing deep geological structures. To find the value of the desired quantity $y$, we measure related, easier-to-measure quantities $x_1, \ldots, x_n$ and then use the known relation $y = f(x_1, \ldots, x_n)$ between $y$ and the $x_i$ to estimate $y$. Measurements are never 100% accurate; as a result, the measured values $\tilde{x}_1, \ldots, \tilde{x}_n$ are, in general, different from the actual values $x_i$ of the measured quantities. Because of the measurement errors $\Delta x_i = \tilde{x}_i - x_i$, the result $\tilde{y} = f(\tilde{x}_1, \ldots, \tilde{x}_n)$ of data processing is, in general, different from the actual value $y = f(x_1, \ldots, x_n)$ of the desired quantity. It is desirable to estimate the resulting error $\Delta y = \tilde{y} - y$.

The actual estimation of the uncertainty $\Delta y$ in data processing depends on whether we only know the upper bounds on $\Delta x_i$ (interval uncertainty), or we also know the probabilities of different values of $\Delta x_i$ (probabilistic uncertainty), or we have partial information about these probabilities (a combination of interval and probabilistic uncertainty). There exist well-developed techniques for the cases of interval and probabilistic uncertainty. In this dissertation, we extend these techniques to the case of a combination of interval and probabilistic uncertainty. We analyze all the stages of estimating $\Delta y$: from extracting the information about the uncertainty $\Delta x_i$ of the inputs, to analyzing possible relations (constraints) between the inputs $x_i$, to actually estimating $\Delta y$. All these stages are illustrated with examples from Earth and environmental studies and from engineering.

We start with the methods for extracting information about the uncertainty of a given input from the measurement results. Several methods are known for solving this problem, such as methods based on the Kolmogorov-Smirnov confidence limits and methods based on Chebyshev inequalities. All these methods provide bounds for the cumulative distribution function (cdf), i.e., provide a p-box that contains the actual cdf with a guaranteed level of confidence. In Chapter 2, we show that Kolmogorov-Smirnov-type methods are optimal for providing a p-box, design a new optimal Kolmogorov-Smirnov-type method, and show that this method is indeed better than the previously known ones.

In Chapter 3, we show how we can extract the information about the relations (constraints) between different inputs.

In Chapter 4, we show, using gravity data as an example, how relations between constraints can be used to filter out erroneous measurement results and improve the accuracy of other data points.

In Chapters 5–7, we analyze how the information about the uncertainty of the inputs can be used to estimate the uncertainty of the results of data processing. (Abstract shortened by UMI.)
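As a rough illustration of what a Kolmogorov-Smirnov-type p-box looks like, the sketch below builds a confidence band around the empirical cdf of a sample of measurement errors. It is only a sketch under stated assumptions: it uses the Dvoretzky-Kiefer-Wolfowitz inequality as a simple stand-in for exact Kolmogorov-Smirnov critical values, it is not the optimal method designed in Chapter 2, and the function names and the sample values are hypothetical.

```python
import math
from bisect import bisect_right


def dkw_halfwidth(n, alpha=0.05):
    """Half-width eps of a band that holds with confidence >= 1 - alpha.

    Assumption: uses the Dvoretzky-Kiefer-Wolfowitz bound
    P(sup |F_n - F| > eps) <= 2 * exp(-2 * n * eps**2),
    not the dissertation's own method.
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return math.sqrt(math.log(2.0 / alpha) / (2.0 * n))


def pbox_from_sample(sample, alpha=0.05):
    """Return (xs, lower, upper): cdf bounds at the sorted sample points."""
    xs = sorted(sample)
    n = len(xs)
    eps = dkw_halfwidth(n, alpha)
    lower, upper = [], []
    for x in xs:
        # Empirical cdf: fraction of sample values <= x (handles ties).
        f_emp = bisect_right(xs, x) / n
        lower.append(max(0.0, f_emp - eps))
        upper.append(min(1.0, f_emp + eps))
    return xs, lower, upper


# Hypothetical measurement errors, for illustration only.
errors = [0.1, -0.2, 0.05, 0.3]
xs, lower, upper = pbox_from_sample(errors, alpha=0.05)
for x, lo, hi in zip(xs, lower, upper):
    print(f"x = {x:+.2f}: cdf in [{lo:.3f}, {hi:.3f}]")
```

With only four points the band is very wide, which reflects how little a small sample says about the cdf; the half-width shrinks in proportion to one over the square root of the sample size.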

Subject Area

Computer Science

Recommended Citation

Beck, Jan Bastian, "Data processing under a combination of interval and probabilistic uncertainty and its application to earth and environmental studies and engineering" (2005). ETD Collection for University of Texas, El Paso. AAI3153737.
http://digitalcommons.utep.edu/dissertations/AAI3153737
