Resource-demand estimation for edge tensor processing units
Machine learning has shown tremendous success in a large variety of applications. The evolution of machine-learning applications from cloud-based systems to mobile and embedded devices has shifted the focus from purely quality-related aspects towards the resource demand of machine learning. For embedded systems, dedicated accelerator hardware promises energy-efficient execution of neural network inferences. The precise resource demand of such inferences, in terms of execution time and power demand, however, is undocumented. Developers therefore face the challenge of fine-tuning their neural networks so that their resource demand matches the available budgets. This article presents Precious, a comprehensive approach to estimating the resource demand of an embedded neural network accelerator. We generate randomised neural networks, analyse them statically, execute them on an embedded accelerator while measuring their actual power draw and execution time, and train estimators that map the statically analysed neural network properties to the measured resource demand. In addition, this article provides an in-depth analysis of the neural networks’ resource demands and the network properties responsible for them. We demonstrate that the estimation error of Precious can be below 1.5% for both power draw and execution time. Furthermore, we discuss what estimator accuracy is practically achievable and how much effort is required to achieve sufficient accuracy.
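The last step of the pipeline described above, training an estimator that maps statically analysed network properties to measured resource demand, might be sketched as follows. This is a minimal illustration under loud assumptions: the single feature (a multiply-accumulate count), the synthetic cost model standing in for real measurements, and every function name are hypothetical and not taken from the article, whose estimators and features may differ.

```python
import random


def fit_linear(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_abs_pct_error(predicted, measured):
    """Mean absolute percentage error between predictions and measurements."""
    total = sum(abs(p - m) / m for p, m in zip(predicted, measured))
    return 100.0 * total / len(measured)


def synthetic_measurement(macs, rng):
    """Hypothetical stand-in for a real measurement on the accelerator.

    Assumed cost model: 0.5 ms fixed overhead plus 2 ns per
    multiply-accumulate, with up to 1% multiplicative noise.
    """
    ideal_ms = 0.5 + 2e-6 * macs
    return ideal_ms * (1.0 + rng.uniform(-0.01, 0.01))


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# "Randomised networks", reduced here to one statically derived feature each.
macs = [rng.randint(100_000, 5_000_000) for _ in range(200)]
times_ms = [synthetic_measurement(m, rng) for m in macs]

# Train on the first 150 samples and evaluate on the remaining 50.
train_x, test_x = macs[:150], macs[150:]
train_y, test_y = times_ms[:150], times_ms[150:]

slope, intercept = fit_linear(train_x, train_y)
predictions = [slope * x + intercept for x in test_x]
error_pct = mean_abs_pct_error(predictions, test_y)
print(f"mean absolute percentage error: {error_pct:.2f}%")
```

In a real setup, `synthetic_measurement` would be replaced by actual power and timing measurements, and the feature vector would presumably contain more than one network property.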
Author: | Benedict Herzog, Stefan Reif, Judith Hemp, Timo Hönig, Wolfgang Schröder-Preikschat |
---|---|
URN: | urn:nbn:de:hbz:294-109411 |
DOI: | https://doi.org/10.1145/3520132 |
Parent Title (English): | ACM transactions on embedded computing systems |
Publisher: | Association for Computing Machinery |
Place of publication: | New York City, New York |
Document Type: | Article |
Language: | English |
Date of Publication (online): | 2024/02/23 |
Date of first Publication: | 2022/10/08 |
Publishing Institution: | Ruhr-Universität Bochum, Universitätsbibliothek |
Tag: | Neural network accelerator; resource awareness |
Volume: | 21 |
Issue: | 5, Article 58 |
First Page: | 1 |
Last Page: | 24 |
open_access (DINI-Set): | open_access |
faculties: | Fakultät für Informatik |
Licence (English): | Creative Commons - CC BY-NC-ND 4.0 - Attribution-NonCommercial-NoDerivatives 4.0 International |