Departmental Technical Reports (CS)
Copyright (c) 2018 University of Texas at El Paso. All rights reserved.
https://digitalcommons.utep.edu/cs_techrep
Recent documents in Departmental Technical Reports (CS)
Wed, 14 Mar 2018 02:13:56 PDT

How to Detect Crisp Sets Based on Subsethood Ordering of Normalized Fuzzy Sets? How to Detect Type-1 Sets Based on Subsethood Ordering of Normalized Interval-Valued Fuzzy Sets?
https://digitalcommons.utep.edu/cs_techrep/1217
Mon, 12 Mar 2018 14:06:32 PDT
If all we know about normalized fuzzy sets is which set is a subset of which, will we be able to detect crisp sets? It is known that we can do it if we allow all possible fuzzy sets, including non-normalized ones. In this paper, we show that a similar detection is possible if we only allow normalized fuzzy sets. We also show that we can detect type-1 fuzzy sets based on the subsethood ordering of normalized interval-valued fuzzy sets.
Christian Servin et al.

How to Efficiently Compute Ranges Over a Difference Between Boxes, With Applications to Underwater Localization
https://digitalcommons.utep.edu/cs_techrep/1216
Mon, 12 Mar 2018 14:06:22 PDT
When using underwater autonomous vehicles, it is important to localize them. Underwater localization is very approximate. As a result, instead of a single location x, we get a set X of possible locations of a vehicle. Based on this set of possible locations, we need to find the range of possible values of the corresponding objective function f(x). For missions on the ocean floor, it is beneficial to take into account that the vehicle is in the water, i.e., that the location of this vehicle is not in a set X' describing the under-floor matter. Thus, the actual set of possible locations of a vehicle is a difference set X−X'. So, it is important to find the ranges of different functions over such difference sets. In this paper, we propose an effective algorithm for solving this problem.
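As a point of comparison, one naive baseline for this task (not the algorithm proposed in the paper, which the abstract does not describe) is to split the difference X−X' into disjoint boxes and take the hull of the ranges over the pieces. The sketch below assumes axis-aligned boxes given as lists of (lo, hi) pairs and a linear objective f(x) = c·x; all function names are hypothetical. It treats each piece as a closed box, so it returns the closure of the true range.

```python
from typing import List, Optional, Tuple

Box = List[Tuple[float, float]]  # one (lo, hi) pair per coordinate


def box_difference(x: Box, y: Box) -> List[Box]:
    """Split the set difference x - y into boxes with disjoint interiors."""
    # If the boxes do not overlap with positive volume, nothing is removed.
    if any(xl >= yh or xh <= yl for (xl, xh), (yl, yh) in zip(x, y)):
        return [list(x)]
    pieces: List[Box] = []
    rest = list(x)  # shrinks, axis by axis, toward the intersection of x and y
    for i, ((xl, xh), (yl, yh)) in enumerate(zip(x, y)):
        if xl < yl:  # slab of x below y along axis i
            pieces.append(rest[:i] + [(xl, yl)] + rest[i + 1:])
        if xh > yh:  # slab of x above y along axis i
            pieces.append(rest[:i] + [(yh, xh)] + rest[i + 1:])
        rest[i] = (max(xl, yl), min(xh, yh))
    return pieces


def linear_range(c: List[float], box: Box) -> Tuple[float, float]:
    """Exact range of f(x) = sum(c[i] * x[i]) over a closed box."""
    lo = sum(ci * (l if ci >= 0 else h) for ci, (l, h) in zip(c, box))
    hi = sum(ci * (h if ci >= 0 else l) for ci, (l, h) in zip(c, box))
    return lo, hi


def range_over_difference(c: List[float], x: Box, y: Box) -> Optional[Tuple[float, float]]:
    """Hull of the ranges of a linear f over the pieces of x - y (None if empty)."""
    ranges = [linear_range(c, b) for b in box_difference(x, y)]
    if not ranges:
        return None
    return min(r[0] for r in ranges), max(r[1] for r in ranges)


# Illustrative values: the "under-floor" box removes the lower part of X.
X = [(0.0, 10.0), (0.0, 10.0)]
X_floor = [(0.0, 10.0), (0.0, 4.0)]
print(range_over_difference([1.0, 1.0], X, X_floor))  # (4.0, 20.0)
```

The number of pieces grows only linearly with the dimension (at most two per axis), but for a nonlinear objective each piece would still need its own range estimate, which this sketch does not address.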
Luc Jaulin et al.

Measures of Specificity Used in the Principle of Justifiable Granularity: A Theoretical Explanation of Empirically Optimal Selections
https://digitalcommons.utep.edu/cs_techrep/1215
Mon, 12 Mar 2018 14:06:11 PDT
To process huge amounts of data, one possibility is to combine some data points into granules, and then process the resulting granules. For each group of data points, if we try to include all data points into a granule, the resulting granule often becomes too wide and thus rather useless; on the other hand, if the granule is too narrow, it includes only a few of the corresponding points -- and is, thus, also rather useless. The need for the trade-off between coverage and specificity is formalized as the principle of justifiable granularity. The specific form of this principle depends on the selection of a measure of specificity. Empirical analysis has shown that exponential and power law measures of specificity are the most adequate. In this paper, we show that natural symmetries explain this empirically observed efficiency.
Olga Kosheleva et al.

How Many Monte-Carlo Simulations Are Needed to Adequately Process Interval Uncertainty: An Explanation of the Smart Electric Grid-Related Simulation Results
https://digitalcommons.utep.edu/cs_techrep/1214
Mon, 12 Mar 2018 14:06:00 PDT
One of the possible ways of dealing with interval uncertainty is to use Monte-Carlo simulations. A recent study of using this technique for the analysis of different smart electric grid-related algorithms shows that we need approximately 500 simulations to compute the corresponding interval range with 5% accuracy. In this paper, we provide a theoretical explanation for these empirical results.
Afshin Gholamy et al.

Type-2 Fuzzy Analysis Explains Ubiquity of Triangular and Trapezoid Membership Functions
https://digitalcommons.utep.edu/cs_techrep/1213
Mon, 12 Mar 2018 14:05:49 PDT
In principle, we can have many different membership functions. Interestingly, however, in many practical applications, triangular and trapezoidal membership functions are the most efficient ones. In this paper, we use a fuzzy approach to explain this empirical phenomenon.
Olga Kosheleva et al.

Lotfi Zadeh: a Pioneer in AI, a Pioneer in Statistical Analysis, a Pioneer in Foundations of Mathematics, and a True Citizen of the World
https://digitalcommons.utep.edu/cs_techrep/1212
Mon, 12 Mar 2018 14:05:38 PDT
Everyone knows Lotfi Zadeh as the Father of Fuzzy Logic. There have been -- and will be -- many papers on this important topic. What I want to emphasize in this paper is that his ideas go way beyond fuzzy logic:

he was a pioneer in AI;

he was a pioneer in statistical analysis; and

he was a pioneer in foundations of mathematics.

My goal is to explain these ideas to non-fuzzy folks. I also want to emphasize that he was a true Citizen of the World.

Vladik Kreinovich

Why Burgers Equation: Symmetry-Based Approach
https://digitalcommons.utep.edu/cs_techrep/1211
Mon, 12 Mar 2018 14:05:28 PDT
In many application areas ranging from shock waves to acoustics, we encounter the same partial differential equation, known as Burgers' equation. The fact that the same equation appears in different application domains, with different physics, makes us conjecture that it can be derived from fundamental principles. Indeed, in this paper, we show that this equation can be uniquely determined by the corresponding symmetries.
Leobardo Valera et al.

Why Learning Has Aha-Moments and Why We Should Also Reward Effort, Not Just Results
https://digitalcommons.utep.edu/cs_techrep/1210
Mon, 12 Mar 2018 14:05:17 PDT
Traditionally, in machine learning, the quality of the result improves steadily with time (usually slowly but still steadily). However, as we start applying reinforcement learning techniques to solve complex tasks -- such as teaching a computer to play a complex game like Go -- we often encounter a situation in which, for a long time, there is no improvement, and then suddenly, the system's efficiency jumps almost to its maximum. A similar phenomenon occurs in human learning, where it is known as the aha-moment. In this paper, we provide a possible explanation for this phenomenon, and show that this explanation leads to the need to reward students for effort as well, not only for their results.
Gerargo Uranga et al.

Why 70/30 or 80/20 Relation Between Training and Testing Sets: A Pedagogical Explanation
https://digitalcommons.utep.edu/cs_techrep/1209
Mon, 12 Mar 2018 14:05:07 PDT
When learning a dependence from data, to avoid overfitting, it is important to divide the data into the training set and the testing set. We first train our model on the training set, and then we use the data from the testing set to gauge the accuracy of the resulting model. Empirical studies show that the best results are obtained if we use 20-30% of the data for testing, and the remaining 70-80% of the data for training. In this paper, we provide a possible explanation for this empirical result.
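The split described above can be sketched with the standard library alone. This is a minimal illustration, not code from the paper: the function name `train_test_split` and the default test fraction of 0.25 (chosen from within the 20-30% range quoted above) are assumptions.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def train_test_split(
    data: Sequence[T], test_fraction: float = 0.25, seed: int = 0
) -> Tuple[List[T], List[T]]:
    """Shuffle the data and return (training set, testing set)."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    items = list(data)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(items)
    n_test = round(len(items) * test_fraction)
    return items[n_test:], items[:n_test]


# Example: a 70/30 split of 100 records.
train, test = train_test_split(range(100), test_fraction=0.3)
print(len(train), len(test))  # 70 30
```

Shuffling before cutting matters when the records arrive in a meaningful order (for example, sorted by time or by class); without it, the testing set may not be representative of the training set.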
Afshin Gholamy et al.

A New Kalman Filter Model for Nonlinear Systems Based on Ellipsoidal Bounding
https://digitalcommons.utep.edu/cs_techrep/1208
Mon, 12 Mar 2018 14:04:57 PDT
In this paper, we design a new filter model, called the set-membership Kalman filter, for nonlinear state estimation problems in which both random and unknown-but-bounded uncertainties are considered simultaneously in a discrete-time system. The main loop of this algorithm includes one prediction step and one correction step with measurement information, and the key part of each loop is solving an optimization problem. The solution of this optimization problem produces the optimal estimate of the state, which is bounded by ellipsoids. The new filter was applied to a highly nonlinear benchmark example and to a two-dimensional simulated trajectory estimation problem, in which it performed better than the extended Kalman filter. The sensitivity of the algorithm is discussed at the end.
Ligang Sun et al.

Why Skew Normal: A Simple Pedagogical Explanation
https://digitalcommons.utep.edu/cs_techrep/1207
Mon, 12 Mar 2018 14:04:46 PDT
In many practical situations, we only know a few first moments of a random variable, and out of all probability distributions which are consistent with this information, we need to select one. When we know the first two moments, we can use the Maximum Entropy approach and get the normal distribution. However, when we know the first three moments, the Maximum Entropy approach does not work. In such situations, a very efficient selection is a so-called skew normal distribution. However, it is not clear why this particular distribution should be selected. In this paper, we provide an explanation for this selection.
José Guadalupe Flores Muñiz et al.

When Is Data Processing Under Interval and Fuzzy Uncertainty Feasible: What If Few Inputs Interact? Does Feasibility Depend on How We Describe Interaction?
https://digitalcommons.utep.edu/cs_techrep/1206
Mon, 12 Mar 2018 14:04:36 PDT
It is known that, in general, data processing under interval and fuzzy uncertainty is NP-hard -- which means that, unless P = NP, no feasible algorithm is possible for computing the accuracy of the result of data processing. It is also known that the corresponding problem becomes feasible if the inputs do not interact with each other, i.e., if the data processing algorithm computes the sum of n functions, each depending on only one of the n inputs. In general, inputs x_{i} and x_{j} interact. If we take into account all possible interactions, and we use bilinear functions x_{i} * x_{j} to describe this interaction, we get an NP-hard problem. This raises two natural questions: what if only a few inputs interact? What if the interaction is described by some other functions? In this paper, we show that the problem remains NP-hard if we use different formulas to describe the inputs' interaction, and it becomes feasible if we only have O(log(n)) interacting inputs -- but remains NP-hard if the number of interacting inputs is O(n^{ϵ}) for any ϵ > 0.
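The two easy building blocks mentioned above can be illustrated with elementary interval arithmetic. The sketch below is an illustration only, with hypothetical helper names: it assumes the range of each single-input term is already known, and it handles one bilinear term in isolation.

```python
from typing import Iterable, Tuple

Interval = Tuple[float, float]  # (lower endpoint, upper endpoint)


def add_ranges(ranges: Iterable[Interval]) -> Interval:
    """Range of f_1(x_1) + ... + f_n(x_n), given the range of each term.

    This is exact only because every term depends on a different input,
    so all terms can reach their minima (or maxima) simultaneously.
    """
    lo = hi = 0.0
    for a, b in ranges:
        lo += a
        hi += b
    return lo, hi


def product_range(x: Interval, y: Interval) -> Interval:
    """Exact range of the bilinear term x_i * x_j over two intervals."""
    # A bilinear function attains its extrema at the corners of the box.
    corners = [x[0] * y[0], x[0] * y[1], x[1] * y[0], x[1] * y[1]]
    return min(corners), max(corners)


print(add_ranges([(1.0, 2.0), (-3.0, 4.0)]))    # (-2.0, 6.0)
print(product_range((-1.0, 2.0), (3.0, 4.0)))   # (-4.0, 8.0)
```

Adding the ranges of several bilinear terms that share inputs generally gives only an enclosure, not the exact range, because the shared inputs cannot take different values in different terms. This is the kind of interaction for which the abstract reports NP-hardness in general.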
Milan Hladík et al.

Which t-Norm Is Most Appropriate for Bellman-Zadeh Optimization
https://digitalcommons.utep.edu/cs_techrep/1205
Mon, 12 Mar 2018 14:04:25 PDT
In 1970, Richard Bellman and Lotfi Zadeh proposed a method for finding the maximum of a function under fuzzy constraints. The problem with this method is that it requires the knowledge of the minimum and the maximum of the objective function over the corresponding crisp set, and minor changes in this crisp set can lead to a drastic change in the resulting maximum. It is known that if we use a product "and"-operation (t-norm), the dependence on the maximum disappears. Natural questions are: what if we use other t-norms? Can we eliminate the dependence on the minimum? What if we use a different scaling in our derivation of the Bellman-Zadeh formula? In this paper, we provide answers to all these questions. It turns out that the product is the only t-norm for which there is no dependence on maximum, that it is impossible to eliminate the dependence on the minimum, and we also provide t-norms corresponding to the use of general scaling functions.
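The dependence on the maximum can be seen in a small numerical experiment. The sketch below assumes the commonly cited form of the Bellman-Zadeh criterion, in which we select the x that maximizes t(μ_c(x), (f(x) − m)/(M − m)), where μ_c is the membership function of the constraint, m and M are the minimum and maximum of f over the crisp set, and t is the "and"-operation. The abstract does not spell out this formula, so treat it, the grid search, and the toy functions as assumptions.

```python
from typing import Callable, Sequence


def bellman_zadeh_argmax(
    xs: Sequence[float],
    f: Callable[[float], float],
    mu_c: Callable[[float], float],
    m: float,
    M: float,
    t_norm: Callable[[float, float], float],
) -> float:
    """Grid point maximizing t_norm(mu_c(x), (f(x) - m) / (M - m))."""
    return max(xs, key=lambda x: t_norm(mu_c(x), (f(x) - m) / (M - m)))


# Toy problem: maximize f(x) = x under the fuzzy constraint "x is small".
xs = [k / 300 for k in range(301)]
f = lambda x: x
mu_c = lambda x: 1.0 - x
product = lambda a, b: a * b

for M in (1.0, 2.0):  # pretend the crisp set changed, moving the maximum of f
    x_min = bellman_zadeh_argmax(xs, f, mu_c, 0.0, M, min)
    x_prod = bellman_zadeh_argmax(xs, f, mu_c, 0.0, M, product)
    print(f"M={M}: min t-norm -> {x_min:.3f}, product t-norm -> {x_prod:.3f}")
```

In this toy setting, the selection made with the min t-norm moves from 0.5 to about 0.667 when M changes from 1 to 2, while the selection made with the product stays at 0.5: with the product, 1/(M − m) is a constant positive factor that does not affect where the maximum is attained.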
Vladik Kreinovich et al.

Optimization of Quadratic Forms and t-norm Forms on Interval Domain and Computational Complexity
https://digitalcommons.utep.edu/cs_techrep/1204
Mon, 12 Mar 2018 14:04:15 PDT
We consider the problem of maximization of a quadratic form over a box. We identify the NP-hardness boundary for sparse quadratic forms: the problem is polynomially solvable for O(log n) nonzero entries, but it is NP-hard if the number of nonzero entries is of the order n^{ε} for an arbitrarily small ε > 0. Then we inspect further polynomially solvable cases. We define a sunflower graph over the quadratic form and study efficiently solvable cases according to the shape of this graph (e.g. the case with small sunflower leaves or the case with a restricted number of negative entries). Finally, we define a generalized quadratic form, called t-norm form, where the quadratic terms are replaced by t-norms. We prove that the optimization problem remains NP-hard with an arbitrary Lipschitz continuous t-norm.
Milan Hladik et al.

How to Monitor Possible Side Effects of Enhanced Oil Recovery Process
https://digitalcommons.utep.edu/cs_techrep/1203
Mon, 12 Mar 2018 14:04:04 PDT
To extract all the oil from a well, petroleum engineers pump hot reactive chemicals into the well. This enhanced oil recovery process needs to be thoroughly monitored, since the aggressively hot liquid can seep out and, if unchecked, eventually pollute the sources of drinking water. At present, to monitor this process, engineers measure the seismic waves generated when the liquid fractures the minerals. However, the resulting seismic waves are weak in comparison with the background noise. Thus, the accuracy with which we can locate the spreading liquid based on these weak signals is low, and we get only a very crude approximate understanding of how the liquid propagates. To get a more accurate picture of the liquid propagation, we propose to use active seismic analysis: namely, we propose to generate strong seismic waves and use a large-N array of sensors to observe their propagation.
Jose Manuel Dominguez Esquivel et al.

From Traditional Neural Networks to Deep Learning: Towards Mathematical Foundations of Empirical Successes
https://digitalcommons.utep.edu/cs_techrep/1202
Mon, 12 Mar 2018 14:03:54 PDT
How do we make computers think? To make machines that fly, it is reasonable to look at the creatures that know how to fly: the birds. To make computers think, it is reasonable to analyze how we think -- this is the main origin of neural networks. At first, one of the main motivations was speed -- since even with slow biological neurons, we often process information fast. The need for speed motivated traditional 3-layer neural networks. At present, computer speed is rarely a problem, but accuracy is -- this motivated deep learning. In this paper, we concentrate on the need to provide mathematical foundations for the empirical success of deep learning.
Vladik Kreinovich

Italian Folk Multiplication Algorithm Is Indeed Better: It Is More Parallelizable
https://digitalcommons.utep.edu/cs_techrep/1201
Mon, 12 Mar 2018 14:03:44 PDT
Traditionally, many ethnic groups had their own versions of arithmetic algorithms. Nowadays, most of these algorithms are studied mainly as pedagogical curiosities, as an interesting way to make arithmetic more exciting to the kids: by appealing to their patriotic feelings -- if they are studying the algorithms traditionally used by their ethnic group -- or simply to their sense of curiosity. Somewhat surprisingly, we show that one of these algorithms -- a traditional Italian multiplication algorithm -- is actually in some reasonable sense better than the algorithm that we all normally use -- namely, it is easier to parallelize.
Martine Ceberio et al.

Reverse Mathematics Is Computable for Interval Computations
https://digitalcommons.utep.edu/cs_techrep/1200
Mon, 12 Mar 2018 14:03:34 PDT
For systems of equations and/or inequalities under interval uncertainty, interval computations usually provide us with a box all of whose points satisfy this system. Reverse mathematics means finding necessary and sufficient conditions, i.e., in this case, describing the set of all the points that satisfy the given system. In this paper, we show that while we cannot always exactly describe this set, it is possible to have a general algorithm that, given ε > 0, provides an ε-approximation to the desired solution set.
Martine Ceberio et al.

Virtual Agent Interaction Framework (VAIF): A Tool for Rapid Development of Social Agents
https://digitalcommons.utep.edu/cs_techrep/1199
Mon, 12 Mar 2018 14:03:24 PDT
Creating an embodied virtual agent is often a complex process. It involves 3D modeling and animation skills, advanced programming knowledge, and in some cases artificial intelligence or the integration of complex interaction models. Features like lip-syncing to an audio file, recognizing the users’ speech, or having the character move at certain times in certain ways, are inaccessible to researchers that want to build and use these agents for education, research, or industrial uses. VAIF, the Virtual Agent Interaction Framework, is an extensively documented system that attempts to bridge that gap and provide inexperienced researchers the tools and means to develop their own agents in a centralized, lightweight platform that provides all these features through a simple interface within the Unity game engine. In this paper we present the platform, describe its features, and provide a case study where agents were developed and deployed in mobile-device, virtual-reality, and augmented-reality platforms by users with no coding experience.
Ivan Gris et al.

Why Superforecasters Change Their Estimates on Average by 3.5%: A Possible Theoretical Explanation
https://digitalcommons.utep.edu/cs_techrep/1198
Mon, 12 Mar 2018 14:03:14 PDT
A recent large-scale study of people's forecasting ability has shown that there is a small group of superforecasters, whose forecasts are significantly more accurate than the forecasts of an average person. Since forecasting is important in many application areas, researchers have studied what exactly the superforecasters do differently -- and how we can learn from them, so that we will be able to forecast better. One empirical fact that came from this study is that, in contrast to most people, superforecasters make much smaller adjustments to their probability estimates. On average, their probability change is 3.5%. In this paper, we provide a possible theoretical explanation for this empirical value.
Olga Kosheleva et al.

How to Explain the Results of the Richard Thaler's 1997 Financial Times Contest
https://digitalcommons.utep.edu/cs_techrep/1197
Mon, 12 Mar 2018 14:03:04 PDT
In 1997, by using a letter published in Financial Times, Richard H. Thaler, the 2017 Nobel Prize winner in Economics, performed the following experiment: he asked readers to submit numbers from 0 to 100, so that the person whose number is the closest to 2/3 of the average will be the winner. An intuitive answer is to submit 2/3 of the average (50), i.e., 33 1/3. A logical answer, as can be explained, is to submit 0. The actual winning submission was -- depending on how we count -- 12 or 13. In this paper, we propose a possible explanation for this empirical result.
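The gap between the intuitive answer and the logical answer can be made concrete by iterating the reasoning: if everyone submits 33 1/3, the best reply is 2/3 of that, and so on. The sketch below only illustrates this iteration; it is not the explanation proposed in the paper, and the function name is hypothetical.

```python
from typing import List


def iterated_guesses(start: float = 50.0, factor: float = 2 / 3, levels: int = 5) -> List[float]:
    """Best replies after 1, 2, ..., `levels` rounds of 'others think like me'."""
    guesses = []
    guess = start
    for _ in range(levels):
        guess *= factor  # best reply to a population that all submits `guess`
        guesses.append(guess)
    return guesses


for level, g in enumerate(iterated_guesses(), start=1):
    print(f"level {level}: {g:.2f}")
```

The guesses shrink by a factor of 2/3 at each level and tend to 0, which is the logical answer mentioned above. The reported winning value of 12 or 13 lies between the third iterate (about 14.8) and the fourth (about 9.9).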
Olga Kosheleva et al.

How to Explain Empirical Distribution of Software Defects by Severity
https://digitalcommons.utep.edu/cs_techrep/1196
Mon, 12 Mar 2018 14:02:53 PDT
In the last decades, several tools have appeared that, given a software package, mark possible defects of different potential severity. Our empirical analysis has shown that in most situations, we observe the same distribution of software defects by severity. In this paper, we present this empirical distribution, and we use interval-related ideas to provide an explanation for this empirical distribution.
Francisco Zapata et al.

How Intelligence Community Interprets Imprecise (Fuzzy) Words, and How to Justify This Empirical-Based Interpretation
https://digitalcommons.utep.edu/cs_techrep/1195
Mon, 12 Mar 2018 14:02:42 PDT
To provide a more precise meaning to imprecise (fuzzy) words like "probable" or "almost certain", researchers analyzed how often intelligence predictions hedged by each corresponding word turned out to be true. In this paper, we provide a theoretical explanation for the resulting empirical frequencies.
Olga Kosheleva et al.

How to Gauge Repair Risk?
https://digitalcommons.utep.edu/cs_techrep/1194
Mon, 12 Mar 2018 14:02:31 PDT
At present, there exist several automatic tools that, given a software package, find locations of possible defects. A general tool does not take into account the specifics of a given program. As a result, while many defects discovered by this tool can be truly harmful, many of the alleged defects it uncovers are, for this particular software, reasonably (or even fully) harmless. A natural reaction is to repair all the alleged defects, but the problem is that every time we correct a program, we risk introducing new faults. From this viewpoint, it is desirable to be able to gauge the repair risk. This will help us decide which part of the repaired code is most likely to fail and thus needs the most testing, and even whether repairing a probably harmless defect is worth the effort at all -- if, as a result, we increase the probability of a program malfunction. In this paper, we analyze how repair risk can be gauged.
Francisco Zapata et al.