Accelerating Deep Convolutional Neural Networks Using Specialized Hardware

  • Kalin Ovtcharov,
  • Olatunji Ruwase,
  • Joo-Young Kim,
  • Jeremy Fowers,
  • ,
  • Eric Chung

We describe the design of a convolutional neural network (CNN) accelerator implemented on a Stratix V FPGA. The design achieves three times the throughput of previous FPGA-based CNN accelerators. We show that its throughput per watt is significantly higher than that of a GPU, and we project its performance when ported to an Arria 10 FPGA.