Date of Award

2011-01-01

Degree Name

Master of Science

Department

Computer Science

Advisor(s)

Olac Fuentes

Abstract

Ribonucleic acid (RNA) is essential for all forms of life. RNA is made up of a long chain of nucleotide bases: guanine (G), uracil (U), cytosine (C), and adenine (A). An RNA strand can fold on itself, allowing G-C, A-U, and G-U bases to form hydrogen bonds; the resulting pattern of base pairs is known as the secondary structure. Knowing the secondary structure of an RNA chain is important because it allows researchers to better understand the chain's specific functions. RNA tends to form secondary structures that minimize its free energy. RNA secondary structure prediction is the attempt to predict the physical folding of an RNA molecule given its linear sequence.

A common approach to RNA secondary structure prediction is dynamic programming. Dynamic programming is based on the assumption that a given problem can be solved optimally by recursively solving its subproblems optimally. Dynamic programming approaches for secondary structure prediction have running times of O(n^3), where n is the length of the RNA sequence. There are two main problems with the dynamic programming approach to RNA secondary structure prediction. First, for very long chains, computing a prediction can take a substantial amount of time. Second, some foldings contain secondary structures that violate the assumption of optimal substructure.
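To make the O(n^3) cost concrete, the following is a minimal sketch of a textbook dynamic program in the style of Nussinov's algorithm. It is an illustration only, not the method evaluated in this thesis and not what MFOLD computes: it maximizes the number of base pairs as a crude stand-in for minimizing free energy. The function name `max_base_pairs` and the `min_loop` parameter (the minimum number of unpaired bases enclosed by a pair) are assumptions introduced for this example.

```python
# Allowed pairs: Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in seq (a simplified energy proxy)."""
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] holds the optimum for the subsequence seq[i..j], inclusive.
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):          # O(n) subsequence lengths
        for i in range(n - span):                # O(n) start positions
            j = i + span
            score = best[i][j - 1]               # case 1: base j stays unpaired
            for k in range(i, j - min_loop):     # O(n) partners for base j
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    # case 2: k pairs with j, splitting the problem into
                    # two independent subproblems (optimal substructure)
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]
```

The three nested loops account for the cubic running time, and the split in case 2 is where the optimal-substructure assumption enters: the two sides of a pair are solved independently, so structures whose pairs cross cannot be represented.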

In this thesis, I propose an approach to RNA secondary structure prediction that attempts to overcome the limitations of dynamic programming. The approach is based on depth-first search in combination with a set of heuristics. I use a preprocessing stage, first proposed by Weise for his genetic algorithms, to find palindromic sequences, which correspond to helical regions of RNA pairings. I then use depth-first search to find a subset of these structures that are mutually compatible and minimize the free energy. The search is further sped up by a set of heuristics that take into consideration palindrome length and likely compatibility with other potential structures. Two advantages of this depth-first search approach are that it does not rely on optimal substructure and that it is easily parallelizable. Experiments show that the proposed methodology is promising, both because of these advantages and because its results are competitive with those of MFOLD, a well-established secondary structure prediction algorithm.
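The two stages described above can be sketched as follows. This is a hedged illustration under stated assumptions, not the implementation from the thesis: the names `find_stems`, `positions`, and `best_stem_subset` are hypothetical, the number of paired bases stands in for free energy, two stems are treated as compatible when they share no base, and the only heuristics shown are "longest stems first" ordering and a simple upper-bound prune.

```python
PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}


def find_stems(seq, min_len=3, min_loop=3):
    """Preprocessing: list maximal stems (i, j, length), where seq[i+t]
    pairs with seq[j-t] for every t in range(length)."""
    n = len(seq)
    stems = []
    for i in range(n):
        for j in range(i + 1, n):
            if (seq[i], seq[j]) not in PAIRS:
                continue
            # Skip pairs that merely continue a stem starting further out.
            if i > 0 and j + 1 < n and (seq[i - 1], seq[j + 1]) in PAIRS:
                continue
            length = 0
            while ((j - length) - (i + length) > min_loop
                   and (seq[i + length], seq[j - length]) in PAIRS):
                length += 1
            if length >= min_len:
                stems.append((i, j, length))
    return stems


def positions(stem):
    """Set of base indices occupied by both strands of a stem."""
    i, j, length = stem
    return set(range(i, i + length)) | set(range(j - length + 1, j + 1))


def best_stem_subset(stems):
    """Depth-first search with backtracking over include/exclude choices."""
    ordered = sorted(stems, key=lambda s: s[2], reverse=True)  # longest first
    # remaining[k]: score gained if every stem from index k on were usable.
    remaining = [0] * (len(ordered) + 1)
    for k in range(len(ordered) - 1, -1, -1):
        remaining[k] = remaining[k + 1] + ordered[k][2]
    best = {"score": 0, "stems": []}

    def dfs(k, used, chosen, score):
        if score > best["score"]:
            best["score"], best["stems"] = score, list(chosen)
        # Prune when even the optimistic bound cannot beat the best so far.
        if k == len(ordered) or score + remaining[k] <= best["score"]:
            return
        stem = ordered[k]
        pos = positions(stem)
        if not (pos & used):                      # compatible: no shared base
            chosen.append(stem)
            dfs(k + 1, used | pos, chosen, score + stem[2])
            chosen.pop()                          # backtrack
        dfs(k + 1, used, chosen, score)           # branch that skips this stem

    dfs(0, frozenset(), [], 0)
    return best["score"], best["stems"]
```

Because the search never splits the sequence into independent subproblems, the compatibility test is the only place where structural restrictions are imposed, and the top-level include/exclude branches are independent of one another, which is what makes this style of search straightforward to distribute across workers.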

Provenance

Received from ProQuest

Copyright Date

2011

File Size

83 pages

File Format

application/pdf

Rights Holder

Christopher Roman Cuellar

Recommended Citation

Cuellar, Christopher Roman, "Prediction of Ribonucleic Acid Secondary Structures Using A Heuristic Backtracking Search" (2011). Open Access Theses & Dissertations. 2263.
https://scholarworks.utep.edu/open_etd/2263

Included in

Bioinformatics Commons, Computer Sciences Commons

Open Access Theses & Dissertations

Prediction of Ribonucleic Acid Secondary Structures Using A Heuristic Backtracking Search