Departmental Technical Reports (CS)

Computing Covariance and Correlation in Optimally Privacy-Protected Statistical Databases: Feasible Algorithms

Joshua Day, University of Wisconsin - Whitewater
Ali Jalal-Kamali, The University of Texas at El Paso
Vladik Kreinovich, The University of Texas at El Paso

Publication Date

8-2013

Comments

Technical Report: UTEP-CS-13-43a

To appear in Proceedings of 3rd World Conference on Soft Computing, San Antonio, December 15-18, 2013.

Abstract

In many real-life situations, e.g., in medicine, it is necessary to process data while preserving the patients' confidentiality. One of the most efficient methods of preserving privacy is to replace the exact values with intervals that contain these values. For example, instead of an exact age, a privacy-protected database only records that the age is, e.g., between 10 and 20, or between 20 and 30, etc. From such data, it is important to compute the correlation and covariance between different quantities. For privacy-protected data, different values from the intervals lead, in general, to different estimates of the desired statistical characteristic. Our objective is then to compute the range of possible values of these estimates.

Algorithms for effectively computing such ranges have been developed for situations when the intervals come from the original surveys, e.g., when a person indicates whether his or her age is between 10 and 20, between 20 and 30, etc. These intervals, however, do not always lead to optimal privacy protection; it turns out that more complex, computer-generated "intervalization" can lead to better privacy under the same accuracy, or, alternatively, to more accurate estimates of statistical characteristics under the same privacy constraints. In this paper, we extend the existing efficient algorithms for computing covariance and correlation based on privacy-protected data to this more general case of interval data.
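
To make the notion of a "range of possible values" concrete, the following is a minimal illustrative sketch, not the feasible algorithm of the report. It assumes the population covariance (normalized by 1/n) and uses the fact that this covariance is linear in each individual value when the others are fixed, so its extreme values over a box of intervals are attained at interval endpoints. The function names `covariance` and `covariance_range` are hypothetical, and the enumeration is exponential in the number of records, so it is usable only for very small examples.

```python
from itertools import product


def covariance(xs, ys):
    """Population covariance (1/n normalization, an assumption of this sketch)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


def covariance_range(x_intervals, y_intervals):
    """Brute-force range of the covariance over interval-valued data.

    Each argument is a list of (lower, upper) pairs, one per record.
    Because the covariance is linear in each single x_i or y_i, its
    minimum and maximum over the box are reached at endpoint combinations,
    so it is enough to enumerate all of them (4**n combinations).
    """
    lowest, highest = float("inf"), float("-inf")
    for xs in product(*x_intervals):
        for ys in product(*y_intervals):
            c = covariance(xs, ys)
            lowest = min(lowest, c)
            highest = max(highest, c)
    return lowest, highest


if __name__ == "__main__":
    # Hypothetical records: age ranges paired with made-up measurement ranges.
    ages = [(10, 20), (20, 30), (30, 40)]
    values = [(1.0, 2.0), (1.5, 2.5), (2.0, 3.0)]
    print(covariance_range(ages, values))
```

The exponential cost of this enumeration is exactly why feasible (polynomial-time) algorithms for such ranges are of interest.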

tr13-43.pdf (91 kB)
Original file UTEP-CS-13-43

Included in

Computer Sciences Commons
