Full article - Open Access.


EFFICIENT DEPLOYMENT OF MACHINE LEARNING MODELS ON MICROCONTROLLERS: A COMPARATIVE STUDY OF QUANTIZATION AND PRUNING STRATEGIES.

Loureiro, Rafael Bessa; Sá, Paulo Henrique Miranda; Lisboa, Fernanda Vitória Nascimento; Peixoto, Rodrigo Matos; Nascimento, Lian Filipe Santana; Bonfim, Yasmin da Silva; Cruz, Gustavo Oliveira Ramos; Ramos, Thauan de Oliveira; Montes, Carlos Henrique Racobaldo Luz; Pagano, Tiago Palma; Pinheiro, Oberdan Rocha; Borges, Rafael

Full article:

With the advancement and growth of Internet of Things devices, the need for more complex and intelligent systems increases, presenting many challenges due to device limitations in memory, computation, and energy consumption. The objective of this study is to review the literature on quantization and pruning as optimization techniques for deploying machine learning models on microcontrollers. The methodology consists of surveying the literature for the relevant tools, models, and techniques. Results show that pruning preserves accuracy better, while quantization achieves greater reductions in inference time and model size, with the best results reducing model size by 90% and speeding up inference by 3.5x. Careful consideration of the trade-offs between inference time, model size, and accuracy is crucial for deployment on edge devices.
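To illustrate the two techniques compared in the study, the sketch below combines magnitude pruning (via the tensorflow_model_optimization toolkit) with post-training full-integer quantization (via the TensorFlow Lite converter), a workflow commonly used to prepare models for TensorFlow Lite Micro. The model architecture, synthetic data, and sparsity/quantization settings are illustrative assumptions, not the configurations evaluated in the paper.

# Illustrative sketch: magnitude pruning followed by post-training int8
# quantization. The model, data, and hyperparameters are placeholder assumptions.
import numpy as np
import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Placeholder data and model standing in for the models surveyed in the paper.
x_train = np.random.rand(256, 28, 28, 1).astype(np.float32)
y_train = np.random.randint(0, 10, size=(256,))
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10),
])

# Pruning: wrap the model so low-magnitude weights are progressively zeroed
# during training, ramping up to 90% sparsity.
batch_size, epochs = 32, 2
end_step = (len(x_train) // batch_size) * epochs
schedule = tfmot.sparsity.keras.PolynomialDecay(
    initial_sparsity=0.0, final_sparsity=0.9,
    begin_step=0, end_step=end_step, frequency=1)
pruned = tfmot.sparsity.keras.prune_low_magnitude(model, pruning_schedule=schedule)
pruned.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))
pruned.fit(x_train, y_train, epochs=epochs, batch_size=batch_size,
           callbacks=[tfmot.sparsity.keras.UpdatePruningStep()])
pruned = tfmot.sparsity.keras.strip_pruning(pruned)  # remove pruning wrappers

# Quantization: post-training full-integer quantization for int8 inference.
def representative_data_gen():
    for sample in x_train[:100]:
        yield [sample[np.newaxis, ...]]

converter = tf.lite.TFLiteConverter.from_keras_model(pruned)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
tflite_model = converter.convert()

with open("model.tflite", "wb") as f:
    f.write(tflite_model)  # flatbuffer to embed in microcontroller firmware

The resulting flatbuffer is typically converted to a C array (for example with xxd -i model.tflite) and compiled into the firmware alongside the TensorFlow Lite Micro interpreter.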

Keywords: edge computing, microcontroller, optimization, quantization, pruning

DOI: 10.5151/siintec2023-305873

How to cite:

Loureiro, Rafael Bessa; Sá, Paulo Henrique Miranda; Lisboa, Fernanda Vitória Nascimento; Peixoto, Rodrigo Matos; Nascimento, Lian Filipe Santana; Bonfim, Yasmin da Silva; Cruz, Gustavo Oliveira Ramos; Ramos, Thauan de Oliveira; Montes, Carlos Henrique Racobaldo Luz; Pagano, Tiago Palma; Pinheiro, Oberdan Rocha; Borges, Rafael. "EFFICIENT DEPLOYMENT OF MACHINE LEARNING MODELS ON MICROCONTROLLERS: A COMPARATIVE STUDY OF QUANTIZATION AND PRUNING STRATEGIES.", p. 181-188. In: . São Paulo: Blucher, 2023.
ISSN 2357-7592, DOI 10.5151/siintec2023-305873
