[an error occurred while processing this directive]
[an error occurred while processing this directive]
2001-2002 Seminar Series
|
Normalized local sequence alignment
|
Abdullah Arslan
Department of Computer Science
University of California at Santa Barbara
Santa Barbara, California
|
DEPARTMENT SEMINAR
|
DATE:
|
Monday, March 25, 2002
|
|
TIME:
|
2:30pm
|
|
PLACE:
|
Anthropology 132
|
|
|
*** Everyone is welcome ***
|
|
Abstract
Detecting local similarities in biological sequences is an
important problem which has applications in identifying homologous DNA
regions which show evolutionary relatedness, or finding common
substructures or functional units in proteins. Given two sequences X
and Y, the local alignment problem aims to find a subsequence I from X
and J from Y with the highest similarity score under a given scoring
scheme. However, this classical notion of similarity does not take
into account the lengths of the aligned subsequences I and J.
As a consequence, some important local similarities may go undetected.
The normalized local alignment aims to avoid this anomaly
by using length-normalized similarity scores instead of ordinary scores.
Formally, given two sequences X and Y, the normalized local alignment
problem seeks for subsequences which maximize
score(I,J)/(|I|+|J|) where the alignment length |I|+|J|
is at least as large as some bound t.
The purpose is to identify regions with high normalized scores
over all biologically meaningful (sufficiently large) regions.
There are several approaches for achieving this goal.
One method proposed yields an efficient fractional programming
algorithm with a slightly modified objective function.
Another approach limits the length of one of the subsequences
and aims to maximize the ordinary score with this length constraint.
There is an approximation algorithm for the resulting
constrained maximization problem, which can also be useful in
detecting circular permutations in proteins.
The final approach combines the ideas in the previous two algorithms and
returns a local alignment with maximum achievable score over all alignments
with length at least t. The length of the resulting alignment
differs from the desired length t only by a prescribed fraction of t.
I will give an overview of these three approaches for
normalized local sequence alignment,
and point to further avenues of research for related questions.
About the speaker
Abdullah Arslan is currently a PhD candidate in Computer Science
Department of University of California at Santa Barbara.
He earned his Bachelors Degree in Computer Engineering from
Middle East Technical University in Ankara, Turkey.
Following the completion of his undergraduate study
he started to work for the Central Bank of Turkey.
He participated in developing and maintaining internal software
for the Central Bank for more than three years.
His desire for being in academia has increased gradually,
and finally he decided to become a graduate student
in Computer Science Department of University of North Texas.
After earning his Master of Science Degree there,
he moved to Santa Barbara to pursue a PhD degree.
He expects to complete his PhD study in June 2002.
[an error occurred while processing this directive]