Whole-coal versus ash basis in coal geochemistry: A mathematical approach to consistent interpretations

Geboy NJ, Engle MA, Hower JC. Whole-coal versus ash basis in coal geochemistry: A mathematical approach to consistent interpretations. International Journal of Coal Geology. 1 July 2013;113:41-9.


Several standard methods require coal to be ashed prior to geochemical analysis. Researchers, however, are commonly interested in the composition of the whole coal, not its ash. Coal geochemical data for any given sample can therefore be reported on the ash basis on which they are analyzed or on the whole-coal basis to which the ash-basis data are back-calculated. Basic univariate (mean, variance, distribution, etc.) and bivariate (correlation coefficients, etc.) measures of the same suite of samples can be very different depending on which reporting basis the researcher uses. These differences are not real but are an artifact of the compositional nature of most geochemical data. The technical term for this artifact is subcompositional incoherence. Because compositional data are forced to a constant sum, such as 100% or 1,000,000 ppm, they possess curvilinear properties that make the Euclidean principles on which most statistical tests rely inappropriate, leading to erroneous results. Applying the isometric logratio (ilr) transformation to compositional data allows them to be represented in Euclidean space and evaluated using traditional tests without fear of producing mathematically inconsistent results. When applied to coal geochemical data, the transformation resolves the differences between the two reporting bases, as demonstrated in this paper using major oxide and trace metal data from the Pennsylvanian-age Pond Creek coal of eastern Kentucky, USA. Following ilr transformation, univariate statistics, such as mean and variance, still differ between the ash basis and the whole-coal basis, but in predictable and calculable ways. Further, the stability between two different components, a bivariate measure, is identical regardless of the reporting basis.
Applying ilr transformations addresses both the erroneous results of Euclidean-based measurements on compositional data and the inconsistencies observed in coal geochemical data reported on different bases.
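
The basis-independence described above can be illustrated with a minimal Python sketch. It is not the authors' code and does not use the Pond Creek data: the four-part ash composition and the ash yields are simulated, and the names `ilr_pivot`, `ash_basis`, `whole_coal`, and `whole_full` are hypothetical. The sketch assumes the whole-coal value of each component is its ash-basis value multiplied by the sample's ash fraction, and it uses one common choice of ilr basis (pivot coordinates); other orthonormal bases would serve equally well.

```python
import numpy as np


def ilr_pivot(x):
    """ilr coordinates using a pivot (Helmert-type) basis.

    Coordinate k compares the geometric mean of the first k parts
    with part k+1, scaled by sqrt(k / (k + 1)).
    """
    logx = np.log(np.asarray(x, dtype=float))
    n_parts = logx.shape[1]
    coords = []
    for k in range(1, n_parts):
        mean_log_first_k = logx[:, :k].mean(axis=1)
        coords.append(np.sqrt(k / (k + 1)) * (mean_log_first_k - logx[:, k]))
    return np.column_stack(coords)


# Simulated data (assumption): 50 samples, 4 ash components closed to 1,
# and an ash yield between 5% and 20% for each sample.
rng = np.random.default_rng(0)
ash_basis = rng.dirichlet([8.0, 4.0, 2.0, 1.0], size=50)
ash_fraction = rng.uniform(0.05, 0.20, size=50)

# Back-calculate to the whole-coal basis; the non-ash remainder closes the composition.
whole_coal = ash_basis * ash_fraction[:, None]
whole_full = np.column_stack([whole_coal, 1.0 - ash_fraction])

# Raw correlation between the first two components generally depends on the basis.
r_ash = np.corrcoef(ash_basis[:, 0], ash_basis[:, 1])[0, 1]
r_wc = np.corrcoef(whole_coal[:, 0], whole_coal[:, 1])[0, 1]
print("raw correlation, ash basis:       ", r_ash)
print("raw correlation, whole-coal basis:", r_wc)

# The log-ratio of two components cancels the ash fraction, so its variance
# (a measure of how stable the ratio is) is the same on both bases.
logratio_ash = np.log(ash_basis[:, 0] / ash_basis[:, 1]) / np.sqrt(2.0)
logratio_wc = np.log(whole_coal[:, 0] / whole_coal[:, 1]) / np.sqrt(2.0)
print("log-ratio variance, ash basis:       ", logratio_ash.var())
print("log-ratio variance, whole-coal basis:", logratio_wc.var())

# With the ash components ordered first, their ilr coordinates are unchanged
# in the whole-coal composition; the extra coordinate contrasts ash with non-ash.
ilr_ash = ilr_pivot(ash_basis)
ilr_wc = ilr_pivot(whole_full)
```

In this sketch the ash fraction multiplies every ash component of a sample equally, so it cancels in any ratio between two of them. That cancellation is why the log-ratio statistics agree across bases while the raw correlations need not.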