Date of Award

2013-01-01

Degree Name

Doctor of Philosophy

Department

Computational Science

Advisor(s)

Ming-Ying Leung

Second Advisor

Igor C. Almeida

Abstract

Glycosylphosphatidylinositol (GPI)-anchored proteins are involved in many biological processes and are of medical importance. Identifying and analyzing the entire collection of free and protein-linked GPIs within an organism (i.e., GPIomics) requires highly sensitive instruments. At present, liquid chromatography-tandem mass spectrometry (LC-MS/MS or -MSn) is the most efficient laboratory technique for these tasks. Because a typical MSn experiment produces hundreds of thousands of spectra, data analysis is a major bottleneck in high-throughput GPIomic projects, yet no computational tool for characterizing the chemical structures of GPIs is available to date. We propose a library-search algorithm that identifies GPIs by matching fragment peaks in the spectra with molecular masses derived from a collection of theoretical GPI structures, which is constructed from the properties of currently known GPIs. Each theoretically possible GPI structure is assessed by a scoring scheme that incorporates its fitness values for individual observed spectra as well as the frequency with which it is considered a good fit. The algorithm has been tested on a set of experimentally confirmed GPIs from the protozoan parasite Trypanosoma cruzi. The final list of predicted GPI candidates contains 76 of the 78 known structures in the test set. Three versions of the proposed algorithm have been developed. The first ran on a single computer and completed the predictions in approximately 10 days. The second uses HTCondor; with an average of 16 processors, it completed the job in 3 days, 19 hours, and 38 minutes. The third was implemented in MPI; with 72 processors, it completed in 22 hours and 38 minutes. Finally, a probability is assessed by a logistic regression model that can incorporate expert opinion. This computational tool is expected to accelerate the discovery and characterization of GPI molecules.
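
The abstract does not give the scoring formula, so the following Python sketch illustrates only the general idea of library-search scoring under stated assumptions. The function names, the mass tolerance `tol`, the `good_fit` threshold, and the way per-spectrum fitness is combined with good-fit frequency (here, a simple product) are all hypothetical choices for illustration and are not taken from the dissertation.

```python
from bisect import bisect_left


def count_matches(peaks, fragment_masses, tol=0.5):
    """Count theoretical fragment masses with an observed peak within +/- tol."""
    sorted_peaks = sorted(peaks)
    matched = 0
    for mass in fragment_masses:
        # First observed peak that is not below the lower edge of the window.
        i = bisect_left(sorted_peaks, mass - tol)
        if i < len(sorted_peaks) and sorted_peaks[i] <= mass + tol:
            matched += 1
    return matched


def fitness(peaks, fragment_masses, tol=0.5):
    """Assumed fitness: fraction of theoretical fragments matched in one spectrum."""
    if not fragment_masses:
        return 0.0
    return count_matches(peaks, fragment_masses, tol) / len(fragment_masses)


def score_candidate(spectra, fragment_masses, tol=0.5, good_fit=0.5):
    """Combine mean fitness with the fraction of spectra counted as a good fit.

    The product used here is an illustrative assumption, not the author's scheme.
    """
    fits = [fitness(peaks, fragment_masses, tol) for peaks in spectra]
    if not fits:
        return 0.0
    mean_fit = sum(fits) / len(fits)
    frequency = sum(f >= good_fit for f in fits) / len(fits)
    return mean_fit * frequency


# Toy data: three observed spectra (peak masses) and one candidate structure.
spectra = [[100.1, 200.0, 300.4], [100.0, 555.0], [999.0]]
candidate_fragments = [100.0, 200.0, 300.0]
print(score_candidate(spectra, candidate_fragments))
```

In a sketch like this, candidates from the theoretical library would be ranked by their combined score. Each candidate is scored independently of the others, which is what makes this kind of search straightforward to distribute across processors.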

Language

en

Provenance

Received from ProQuest

File Size

71 pages

File Format

application/pdf

Rights Holder

Juan Clemente Aguilar Bonavides
