MIT Libraries DSpace@MIT

Longest Common Subsequence Over Constant-Sized Alphabets: Beating the Naive Approximation Ratio

Author(s)
Akmal, Shyan
Advisor
Williams, Virginia Vassilevska
Williams, Ryan
Terms of use
In Copyright - Educational Use Permitted. Copyright MIT. http://rightsstatements.org/page/InC-EDU/1.0/
Abstract
This thesis investigates the approximability of the Longest Common Subsequence (LCS) problem. The fastest known algorithm for solving the LCS problem runs in essentially quadratic time in the length of the input, and it is known that, under the Strong Exponential Time Hypothesis, there can be no polynomial improvement over this quadratic running time. However, no similar limitation holds for approximate computation of the LCS, except in certain restricted scenarios. When the two input strings come from an alphabet of size k, returning the subsequence formed by the most frequent symbol occurring in both strings achieves a 1/k approximation for the LCS. It is an open problem whether a better-than-1/k approximation can be achieved in truly subquadratic time (O(n^{2-δ}) time for some constant δ > 0). A recent result [Rubinstein and Song, SODA 2020] shows that a 1/2+ε approximation for the LCS over a binary alphabet is possible in truly subquadratic time, provided the input strings have the same length. In this thesis, we show that if, for some ε > 0, a 1/2+ε approximation is achievable for binary LCS in truly subquadratic time when the input strings can have differing lengths, then for every constant k there exists some δ_k > 0 such that a truly subquadratic time algorithm achieves a 1/k+δ_k approximation for LCS over a k-ary alphabet. Thus, for constant-factor LCS approximation, the case of binary strings is in some sense the hardest case. We also show that for every constant k, given two strings of equal length over a k-ary alphabet, one can obtain a 1/k+ε approximation for some constant ε > 0 in truly subquadratic time. This extends the Rubinstein and Song result to all alphabets of constant size, and gives the first nontrivial improvement over the naive 1/k approximation for the LCS of equal-length strings over alphabets of size k for all k ≥ 3.
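The naive 1/k approximation mentioned in the abstract can be illustrated with a short sketch. The code below is not taken from the thesis; the function names `naive_lcs_approx` and `lcs_length` are hypothetical, and the exact routine is the standard textbook quadratic dynamic program, included only for comparison. The 1/k guarantee follows from a pigeonhole argument: some symbol accounts for at least a 1/k fraction of any longest common subsequence, so that symbol must occur at least that many times in both strings.

```python
from collections import Counter


def naive_lcs_approx(a: str, b: str) -> str:
    """Return a common subsequence consisting of one repeated symbol.

    For each symbol, min(count in a, count in b) copies of it form a
    common subsequence; we return the longest such run. Runs in linear
    time for a constant-sized alphabet.
    """
    count_a, count_b = Counter(a), Counter(b)
    best_symbol, best_len = "", 0
    for symbol in sorted(count_a):  # sorted only to make ties deterministic
        shared = min(count_a[symbol], count_b[symbol])
        if shared > best_len:
            best_symbol, best_len = symbol, shared
    return best_symbol * best_len


def lcs_length(a: str, b: str) -> int:
    """Exact LCS length via the textbook O(len(a) * len(b)) dynamic program."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    a, b = "0110100", "1011011"   # binary alphabet, so k = 2
    approx = naive_lcs_approx(a, b)
    print(approx)                 # "111"
    # The approximation is within a factor k of the exact LCS length.
    assert 2 * len(approx) >= lcs_length(a, b)
```

In this example the symbol 1 occurs three times in the first string and five times in the second, so the sketch returns three copies of it. The results described in the abstract concern beating this baseline while staying in truly subquadratic time, which this sketch does not attempt.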
Date issued
2021-09
URI
https://hdl.handle.net/1721.1/140127
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses
