Multicore Processing and Efficient On-Chip Caching for H.264 and Future Video Decoders

Finchelstein, Daniel Frederic; Sze, Vivienne; Chandrakasan, Anantha P.

dc.contributor.author	Finchelstein, Daniel Frederic
dc.contributor.author	Sze, Vivienne
dc.contributor.author	Chandrakasan, Anantha P.
dc.date.accessioned	2010-03-09T14:57:26Z
dc.date.available	2010-03-09T14:57:26Z
dc.date.issued	2009-10
dc.date.submitted	2009-05
dc.identifier.uri	http://hdl.handle.net/1721.1/52412
dc.description.abstract	Performance requirements for video decoding will continue to rise in the future due to the adoption of higher resolutions and faster frame rates. Multicore processing is an effective way to handle the resulting increase in computation. For power-constrained applications such as mobile devices, extra performance can be traded off for lower power consumption via voltage scaling. As memory power is a significant part of system power, it is also important to reduce unnecessary on-chip and off-chip memory accesses. This paper proposes several techniques that enable multiple parallel decoders to process a single video sequence; the paper also demonstrates several on-chip caching schemes. First, we describe techniques that can be applied to the existing H.264 standard, such as multiframe processing. Second, with an eye toward future video standards, we propose replacing the traditional raster-scan processing with an interleaved macroblock ordering; this can increase parallelism with minimal impact on coding efficiency and latency. The proposed architectures allow N parallel hardware decoders to achieve a speedup of up to a factor of N. For example, if N=3, the proposed multiple frame and interleaved entropy slice multicore processing techniques can achieve performance improvements of 2.64× and 2.91×, respectively. This extra hardware performance can be used to decode higher-definition videos. Alternatively, it can be traded off for dynamic power savings of 60% relative to a single nominal-voltage decoder. Finally, on-chip caching methods are presented that significantly reduce off-chip memory bandwidth, leading to a further increase in performance and energy efficiency. Data-forwarding caches can reduce off-chip memory reads by 53%, while using a last-frame cache can eliminate 80% of the off-chip reads.
The proposed techniques were validated and benchmarked using full-system Verilog hardware simulations based on an existing decoder; they should also be applicable to most other decoder architectures. The metrics used to evaluate the ideas in this paper are performance, power, area, memory efficiency, coding efficiency, and input latency.	en
dc.description.sponsorship	Texas Instruments Incorporated	en
dc.description.sponsorship	Nokia Corporation	en
dc.description.sponsorship	IEEE Circuits and Systems Society	en
dc.language.iso	en_US
dc.publisher	Institute of Electrical and Electronics Engineers	en
dc.relation.isversionof	http://dx.doi.org/10.1109/tcsvt.2009.2031459	en
dc.rights	Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.	en
dc.source	IEEE	en
dc.subject	video decoders	en
dc.subject	parallelism	en
dc.subject	multicore	en
dc.subject	low-power	en
dc.subject	H.264	en
dc.title	Multicore Processing and Efficient On-Chip Caching for H.264 and Future Video Decoders	en
dc.type	Article	en
dc.identifier.citation	Finchelstein, D.F., V. Sze, and A.P. Chandrakasan. “Multicore Processing and Efficient On-Chip Caching for H.264 and Future Video Decoders.” Circuits and Systems for Video Technology, IEEE Transactions on 19.11 (2009): 1704-1713. © 2009 IEEE	en
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science	en_US
dc.contributor.department	Massachusetts Institute of Technology. Microsystems Technology Laboratories	en_US
dc.contributor.approver	Chandrakasan, Anantha P.
dc.contributor.mitauthor	Sze, Vivienne
dc.contributor.mitauthor	Chandrakasan, Anantha P.
dc.relation.journal	IEEE Transactions on Circuits and Systems for Video Technology	en
dc.eprint.version	Final published version	en
dc.type.uri	http://purl.org/eprint/type/JournalArticle	en
eprint.status	http://purl.org/eprint/status/PeerReviewed	en
dspace.orderedauthors	Finchelstein, D.F.; Sze, V.; Chandrakasan, A.P.	en
dc.identifier.orcid	https://orcid.org/0000-0002-5977-2748
dc.identifier.orcid	https://orcid.org/0000-0003-4841-3990
mit.license	PUBLISHER_POLICY	en
mit.metadata.status	Complete

Files in this item

Name: Finchelstein-2009-Multicore ...
Size: 5.640Mb
Format: PDF

This item appears in the following Collection(s)

MIT Open Access Articles
