Thursday, July 13, 2017

Age of Embedded GPU Processing

As deep neural networks (DNNs) become more popular in cognitive applications such as object detection, so do the embedded devices meant to power them. For the past month, I have been trying out vision applications, both for customer demos and for my own hobby projects. I have run the Renesas GR-PEACH camera module, and its video is noticeably slow (although considering that it runs on an ARM MCU, it is still very impressive). I then tried video processing on the ZedBoard, a Zynq-powered FPGA board, but the workflow is rather tedious and requires a lot of work. The latest device I am looking at is the NVIDIA Jetson TX2.

It is a credit-card-sized embedded board powered by a dual-core NVIDIA Denver 2 and a quad-core ARM Cortex-A57. It boasts 8 GB of 128-bit LPDDR4 memory and an integrated 256-core Pascal GPU. NVIDIA has a developer zone that provides resources for developers to play with the kit. So far I am still at the literature review stage. The board is meant for robotics and drone applications.

I am planning to run DNNs on it. To ease the CUDA learning curve, I am going to use MATLAB for development. The Parallel Computing Toolbox currently lets us write algorithms that make use of the GPU with simple syntax. In addition, the forthcoming GPU Coder will let users convert that MATLAB code into CUDA code and then deploy it onto an embedded board like the Jetson TX2.
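To give a feel for the learning curve mentioned above, here is a minimal sketch of what a hand-written CUDA program for a simple element-wise operation might look like. It is an illustrative SAXPY example (y = a*x + y), not code from this project; the kernel name `saxpy`, the `CUDA_CHECK` macro, the array size, and the launch configuration are all assumptions chosen for illustration.

```cuda
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a readable message if a CUDA call fails.
#define CUDA_CHECK(call)                                                  \
    do {                                                                  \
        cudaError_t err_ = (call);                                        \
        if (err_ != cudaSuccess) {                                        \
            std::fprintf(stderr, "CUDA error: %s (%s:%d)\n",              \
                         cudaGetErrorString(err_), __FILE__, __LINE__);   \
            std::exit(EXIT_FAILURE);                                      \
        }                                                                 \
    } while (0)

// Each GPU thread updates one element: y[i] = a * x[i] + y[i].
__global__ void saxpy(int n, float a, const float* x, float* y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {  // guard threads past the end of the array
        y[i] = a * x[i] + y[i];
    }
}

int main() {
    const int n = 1 << 20;  // illustrative array size
    const size_t bytes = n * sizeof(float);
    std::vector<float> x(n, 1.0f), y(n, 2.0f);

    // Allocate device memory and copy the inputs to the GPU by hand.
    float* d_x = nullptr;
    float* d_y = nullptr;
    CUDA_CHECK(cudaMalloc(&d_x, bytes));
    CUDA_CHECK(cudaMalloc(&d_y, bytes));
    CUDA_CHECK(cudaMemcpy(d_x, x.data(), bytes, cudaMemcpyHostToDevice));
    CUDA_CHECK(cudaMemcpy(d_y, y.data(), bytes, cudaMemcpyHostToDevice));

    // Launch enough 256-thread blocks to cover all n elements.
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    saxpy<<<blocks, threads>>>(n, 3.0f, d_x, d_y);
    CUDA_CHECK(cudaGetLastError());

    // Copy the result back (this call waits for the kernel to finish).
    CUDA_CHECK(cudaMemcpy(y.data(), d_y, bytes, cudaMemcpyDeviceToHost));

    // Every element should now be 3 * 1 + 2 = 5.
    float max_err = 0.0f;
    for (int i = 0; i < n; ++i) {
        max_err = std::fmax(max_err, std::fabs(y[i] - 5.0f));
    }
    std::printf("max error: %g\n", max_err);

    CUDA_CHECK(cudaFree(d_x));
    CUDA_CHECK(cudaFree(d_y));
    return max_err == 0.0f ? 0 : 1;
}
```

Even for a one-line computation, the allocation, host-to-device copies, launch configuration, and copy back all have to be written explicitly. With the Parallel Computing Toolbox, the same idea is expressed by wrapping the data in a `gpuArray` and writing the arithmetic as usual, which is the simplification I am hoping to lean on.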

The developer kit was launched around March 2017 and can now be purchased online at Amazon. It can be shipped to Southeast Asia free of charge.

