| dc.contributor.author | Zhang, Zhoutong | |
| dc.contributor.author | Cole, Forrester | |
| dc.contributor.author | Tucker, Richard | |
| dc.contributor.author | Freeman, William T. | |
| dc.contributor.author | Dekel, Tali | |
| dc.date.accessioned | 2021-10-28T14:08:44Z | |
| dc.date.available | 2021-10-28T14:08:44Z | |
| dc.date.issued | 2021-08 | |
| dc.identifier.issn | 0730-0301 | |
| dc.identifier.issn | 1557-7368 | |
| dc.identifier.uri | https://hdl.handle.net/1721.1/136702 | |
| dc.description.abstract | We present a method to estimate depth of a dynamic scene, containing arbitrary moving objects, from an ordinary video captured with a moving camera. We seek a geometrically and temporally consistent solution to this under-constrained problem: the depth predictions of corresponding points across frames should induce plausible, smooth motion in 3D. We formulate this objective in a new test-time training framework where a depth-prediction CNN is trained in tandem with an auxiliary scene-flow prediction MLP over the entire input video. By recursively unrolling the scene-flow prediction MLP over varying time steps, we compute both short-range scene flow to impose local smooth motion priors directly in 3D, and long-range scene flow to impose multi-view consistency constraints with wide baselines. We demonstrate accurate and temporally coherent results on a variety of challenging videos containing diverse moving objects (pets, people, cars), as well as camera motion. Our depth maps give rise to a number of depth-and-motion aware video editing effects such as object and lighting insertion. | en_US |
| dc.publisher | Association for Computing Machinery (ACM) | en_US |
| dc.relation.isversionof | 10.1145/3450626.3459871 | en_US |
| dc.rights | Creative Commons Attribution 4.0 International license | en_US |
| dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | en_US |
| dc.source | ACM | en_US |
| dc.subject | Computer Graphics and Computer-Aided Design | en_US |
| dc.title | Consistent depth of moving objects in video | en_US |
| dc.type | Article | en_US |
| dc.identifier.citation | ACM Transactions on Graphics, Volume 40, Issue 4, August 2021, Article No. 148, pp. 1–12 | en_US |
| dc.contributor.department | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory | |
| dc.eprint.version | Final published version | en_US |
| dc.type.uri | http://purl.org/eprint/type/JournalArticle | en_US |
| eprint.status | http://purl.org/eprint/status/PeerReviewed | en_US |
| dspace.date.submission | 2021-09-27T15:22:20Z | |
| mit.journal.volume | 40 | en_US |
| mit.journal.issue | 4 | en_US |
| mit.license | PUBLISHER_CC | |
| mit.metadata.status | Authority Work Needed | en_US |