Approximate Periods of Strings based on Distance Sum for DNA Sequence Analysis


KIPS Transactions on Software and Data Engineering, Vol. 2, No. 2, pp. 119-122, Feb. 2013
10.3745/KTSDE.2013.2.2.119,   PDF Download:

Abstract

Repetitive strings such as periods have been studied vigorously in so diverse fields as data compression, computer-assisted music analysis, bioinformatics, and etc. In bioinformatics, periods are highly related to repetitive patterns in DNA sequences so called tandem repeats. In some cases, quite similar but not the same patterns are repeated and thus we need approximate string matching algorithms to study tandem repeats in DNA sequences. In this paper, we propose a new definition of approximate periods of strings based on distance sum. Given two strings p (|p|=m) and x (|X||=n), we propose an algorithm that computes the minimum approximate period distance based on distance sum. Our algorithm runs in O (mn²) time for the weighted edit distance, and runs in O(m n) time for the edit distance, and runs in O(n) time for the Hamming distance.


Statistics
Show / Hide Statistics

Statistics (Cumulative Counts from September 1st, 2017)
Multiple requests among the same browser session are counted as one view.
If you mouse over a chart, the values of data points will be shown.


Cite this article
[IEEE Style]
J. H. Jeong, Y. H. Kim, J. C. Na, J. S. Sim, "Approximate Periods of Strings based on Distance Sum for DNA Sequence Analysis," KIPS Transactions on Software and Data Engineering, vol. 2, no. 2, pp. 119-122, 2013. DOI: 10.3745/KTSDE.2013.2.2.119.

[ACM Style]
Ju Hui Jeong, Young Ho Kim, Joong Chae Na, and Jeong Seop Sim. 2013. Approximate Periods of Strings based on Distance Sum for DNA Sequence Analysis. KIPS Transactions on Software and Data Engineering, 2, 2, (2013), 119-122. DOI: 10.3745/KTSDE.2013.2.2.119.