Abstract:
Convolutional neural network is now widely used in computer vision and deep learning applications. The most compute-intensive layer in convolutional neural networks is the convolutional layer, which should be accelerated in hardware. This paper aims to develop an efficient hardware-software co-design framework for machine learning applications on the PYNQ-Z2 board. To achieve this goal, we develop hardware implementations of convolutional IP core and use them as Python overlays. Experiments show that the hardware implementations of the convolutional IP core outperform their software implementations by factors of up to 9 times. Furthermore, we make use of the designed convolutional IP core as hardware accelerator in the handwritten digit recognition application with MNIST dataset. Thanks to the use of the hardware accelerator for the convolutional layers, the execution performance of the convolutional neural network has been improved by a factor of 6.2 times