Assessing the best edit in perturbation-based iterative refinement algorithms to compute the median string

Date
2019-04
Author
Sánchez Mirabal, Pedro Aniel
Abreu Salas, José Ignacio
Seco N., Diego
Publisher
Pattern Recognition Letters
Description
SCOPUS publication article
Abstract
Many pattern recognition techniques, such as clustering, k-nearest neighbors classification, and instance reduction algorithms, require prototypes to represent pattern classes. In many applications, instances are encoded as strings, for example in contour representations or in biological data such as DNA, RNA, and protein sequences. Median strings have been used as representatives of a set of strings in different domains. Finding the median string is an NP-complete problem for several formulations. As an alternative, heuristic approaches have been proposed that iteratively refine an initial coarse solution by applying edit operations. We propose a novel algorithm that outperforms state-of-the-art heuristic approximations to the median string in terms of convergence speed. It does so by estimating the effect of a perturbation on the minimization of the expressions that define the median string. We present comparative experiments to validate these results.
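To make the idea of perturbation-based iterative refinement concrete, the following is a minimal Python sketch of a naive baseline. It is not the algorithm proposed in the article. It assumes the common formulation in which the median string minimizes the sum of Levenshtein distances to the strings in the set, it starts from the set median, and it evaluates every single-symbol edit by recomputing the full cost. All function names (`levenshtein`, `total_distance`, `perturbations`, `approximate_median`) are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Iterator, Sequence


def levenshtein(a: str, b: str) -> int:
    """Unit-cost edit distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def total_distance(candidate: str, strings: Sequence[str]) -> int:
    """Sum of distances from the candidate to every string in the set."""
    return sum(levenshtein(candidate, s) for s in strings)


def perturbations(candidate: str, alphabet: Iterable[str]) -> Iterator[str]:
    """Yield every string one edit operation away from the candidate."""
    symbols = sorted(set(alphabet))
    for i in range(len(candidate)):
        yield candidate[:i] + candidate[i + 1:]              # deletion
        for c in symbols:
            if c != candidate[i]:
                yield candidate[:i] + c + candidate[i + 1:]  # substitution
    for i in range(len(candidate) + 1):
        for c in symbols:
            yield candidate[:i] + c + candidate[i:]          # insertion


def approximate_median(strings: Sequence[str]) -> str:
    """Greedy refinement: accept the first edit that lowers the total cost."""
    if not strings:
        raise ValueError("need at least one string")
    alphabet = set("".join(strings))
    # Assumption: the set median is used as the initial coarse solution.
    best = min(strings, key=lambda s: total_distance(s, strings))
    best_cost = total_distance(best, strings)
    improved = True
    while improved:
        improved = False
        for cand in perturbations(best, alphabet):
            cost = total_distance(cand, strings)
            if cost < best_cost:
                best, best_cost = cand, cost
                improved = True
                break  # restart the search from the improved string
    return best
```

For example, `approximate_median(["abd", "aec", "xbc"])` starts from a string of the set and refines it with single edits. The loop terminates because the cost is a non-negative integer that strictly decreases with each accepted edit. Recomputing `total_distance` for every candidate edit is the expensive step in this sketch. According to the abstract, the article speeds up convergence by estimating the effect of a perturbation, which this baseline does not attempt.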