Resource-demand estimation for edge tensor processing units

Machine learning has shown tremendous success in a large variety of applications. The evolution of machine-learning applications from cloud-based systems to mobile and embedded devices has shifted the focus from purely quality-related aspects towards the resource demand of machine learning. For embedded systems, dedicated accelerator hardware promises the energy-efficient execution of neural network inferences. The precise resource demand of these accelerators in terms of execution time and power draw, however, is undocumented. Developers, therefore, face the challenge of fine-tuning their neural networks such that their resource demand matches the available budgets. This article presents Precious, a comprehensive approach to estimate the resource demand of an embedded neural network accelerator. We generate randomised neural networks, analyse them statically, execute them on an embedded accelerator while measuring their actual power draw and execution time, and train estimators that map the statically analysed neural network properties to the measured resource demand. In addition, this article provides an in-depth analysis of the neural networks’ resource demands and the responsible network properties. We demonstrate that the estimation error of Precious can be below 1.5% for both power draw and execution time. Furthermore, we discuss what estimator accuracy is practically achievable and how much effort is required to achieve sufficient accuracy.
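The workflow in the abstract (generate random networks, extract static properties, measure, fit an estimator) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the article: the single static feature (a multiply-accumulate count), the linear model, the function names, and above all `synthetic_measurement`, which stands in for a real measurement on hardware and returns made-up numbers.

```python
import random


def synthetic_measurement(macs, rng):
    """Placeholder for a hardware measurement (hypothetical, not real data).

    Assumes a fixed overhead plus a per-MAC cost with about 2% noise.
    """
    return 0.5 + 2e-6 * macs * rng.uniform(0.98, 1.02)


def fit_linear(xs, ys):
    """Ordinary least squares for y ~ a * x + b with one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def mean_abs_pct_error(model, xs, ys):
    """Mean absolute percentage error of the fitted model on (xs, ys)."""
    a, b = model
    total = sum(abs((a * x + b) - y) / y for x, y in zip(xs, ys))
    return 100.0 * total / len(xs)


def run_sketch(seed=0, n_networks=200, n_train=150):
    """Generate random 'networks', measure them, fit, and evaluate."""
    rng = random.Random(seed)
    # Each random network is reduced to one static property: its MAC count.
    macs = [rng.randint(100_000, 10_000_000) for _ in range(n_networks)]
    times = [synthetic_measurement(m, rng) for m in macs]
    # Fit on the first n_train networks, evaluate on the held-out rest.
    model = fit_linear(macs[:n_train], times[:n_train])
    error = mean_abs_pct_error(model, macs[n_train:], times[n_train:])
    return model, error


if __name__ == "__main__":
    model, error = run_sketch()
    print(f"slope={model[0]:.3e} intercept={model[1]:.3f} error={error:.2f}%")
```

Evaluating on held-out networks rather than on the training set is what makes the reported percentage a meaningful estimate of the error on unseen networks. A real setup would replace the placeholder with measured execution time or power draw and would likely use several static properties instead of one.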

Metadata
Author:Benedict Herzog, Stefan Reif, Judith Hemp, Timo Hönig, Wolfgang Schröder-Preikschat
URN:urn:nbn:de:hbz:294-109411
DOI:https://doi.org/10.1145/3520132
Parent Title (English):ACM transactions on embedded computing systems
Publisher:Association for Computing Machinery
Place of publication:New York City, New York
Document Type:Article
Language:English
Date of Publication (online):2024/02/23
Date of first Publication:2022/10/08
Publishing Institution:Ruhr-Universität Bochum, Universitätsbibliothek
Tag:Neural network accelerator; resource awareness
Volume:21
Issue:5, Article 58
First Page:1
Last Page:24
open_access (DINI-Set):open_access
faculties:Fakultät für Informatik
Licence (English):Creative Commons - CC BY-NC-ND 4.0 - Attribution-NonCommercial-NoDerivatives 4.0 International