[an error occurred while processing this directive] [an error occurred while processing this directive]


University of Saskatchewan, Department of Computer Science


2001-2002 Seminar Series

Normalized local sequence alignment

Abdullah Arslan
Department of Computer Science
University of California at Santa Barbara
Santa Barbara, California

DEPARTMENT SEMINAR
DATE: Monday, March 25, 2002
TIME: 2:30pm
PLACE: Anthropology 132
*** Everyone is welcome ***

Abstract

Detecting local similarities in biological sequences is an important problem which has applications in identifying homologous DNA regions which show evolutionary relatedness, or finding common substructures or functional units in proteins. Given two sequences X and Y, the local alignment problem aims to find a subsequence I from X and J from Y with the highest similarity score under a given scoring scheme. However, this classical notion of similarity does not take into account the lengths of the aligned subsequences I and J. As a consequence, some important local similarities may go undetected. The normalized local alignment aims to avoid this anomaly by using length-normalized similarity scores instead of ordinary scores. Formally, given two sequences X and Y, the normalized local alignment problem seeks for subsequences which maximize score(I,J)/(|I|+|J|) where the alignment length |I|+|J| is at least as large as some bound t. The purpose is to identify regions with high normalized scores over all biologically meaningful (sufficiently large) regions. There are several approaches for achieving this goal. One method proposed yields an efficient fractional programming algorithm with a slightly modified objective function. Another approach limits the length of one of the subsequences and aims to maximize the ordinary score with this length constraint. There is an approximation algorithm for the resulting constrained maximization problem, which can also be useful in detecting circular permutations in proteins. The final approach combines the ideas in the previous two algorithms and returns a local alignment with maximum achievable score over all alignments with length at least t. The length of the resulting alignment differs from the desired length t only by a prescribed fraction of t.

I will give an overview of these three approaches for normalized local sequence alignment, and point to further avenues of research for related questions.

About the speaker

Abdullah Arslan is currently a PhD candidate in Computer Science Department of University of California at Santa Barbara. He earned his Bachelors Degree in Computer Engineering from Middle East Technical University in Ankara, Turkey. Following the completion of his undergraduate study he started to work for the Central Bank of Turkey. He participated in developing and maintaining internal software for the Central Bank for more than three years. His desire for being in academia has increased gradually, and finally he decided to become a graduate student in Computer Science Department of University of North Texas. After earning his Master of Science Degree there, he moved to Santa Barbara to pursue a PhD degree. He expects to complete his PhD study in June 2002. [an error occurred while processing this directive]