| dc.contributor.author | Wang, Qiwen | |
| dc.contributor.author | Jaggi, Sidharth | |
| dc.contributor.author | Medard, Muriel | |
| dc.contributor.author | Cadambe, Viveck R | |
| dc.contributor.author | Schwartz, Moshe | |
| dc.date.accessioned | 2021-10-27T20:30:46Z | |
| dc.date.available | 2021-10-27T20:30:46Z | |
| dc.date.issued | 2017 | |
| dc.identifier.uri | https://hdl.handle.net/1721.1/136088 | |
| dc.description.abstract | The problem of one-way file synchronization, henceforth called 'file updates', is studied in this paper. Specifically, a client edits a file, where the edits are modeled by insertions and deletions (InDels). An old copy of the file is stored remotely at a data-centre, and is also available to the client. We consider the problem of throughput- and computationally-efficient communication from the client to the data-centre, to enable the data-centre to update its old copy to the newly edited file. Two models for the source files and edit patterns are studied: the random pre-edit sequence left-to-right random InDel (RPES-LtRRID) process, and the arbitrary pre-edit sequence arbitrary InDel (APES-AID) process. In both models, we consider the regime in which the number of insertions and deletions is a small (but constant) fraction of the length of the original file. For both models, information-theoretic lower bounds on the best possible compression rates that enable file updates are derived (up to first-order terms). Conversely, a simple compression algorithm using dynamic programming (DP) and entropy coding (EC), henceforth called the DP-EC algorithm, achieves rates that are within a constant additive gap (which diminishes as the alphabet size increases) of the information-theoretic lower bounds for both models. For the RPES-LtRRID model, a dynamic-programming-run-length-compression (DP-RLC) algorithm is proposed, which achieves a compression rate matching the information-theoretic lower bound up to first-order terms. Therefore, when the insertion and deletion probabilities are small (such that first-order terms dominate), the rate achieved by DP-RLC is nearly optimal for the RPES-LtRRID model. | |
| dc.language.iso | en | |
| dc.publisher | Institute of Electrical and Electronics Engineers (IEEE) | |
| dc.relation.isversionof | 10.1109/TIT.2017.2705100 | |
| dc.rights | Creative Commons Attribution-Noncommercial-Share Alike | |
| dc.rights.uri | http://creativecommons.org/licenses/by-nc-sa/4.0/ | |
| dc.source | arXiv | |
| dc.title | File Updates Under Random/Arbitrary Insertions and Deletions | |
| dc.type | Article | |
| dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | |
| dc.relation.journal | IEEE Transactions on Information Theory | |
| dc.eprint.version | Original manuscript | |
| dc.type.uri | http://purl.org/eprint/type/JournalArticle | |
| eprint.status | http://purl.org/eprint/status/NonPeerReviewed | |
| dc.date.updated | 2019-06-20T17:58:32Z | |
| dspace.orderedauthors | Wang, Q; Jaggi, S; Medard, M; Cadambe, VR; Schwartz, M | |
| dspace.date.submission | 2019-06-20T17:58:33Z | |
| mit.journal.volume | 63 | |
| mit.journal.issue | 10 | |
| mit.metadata.status | Authority Work and Publication Information Needed | |