Last June, when Synopsys unveiled its latest-generation DesignWare EV6x vision processor core, the company also introduced the CNN880, an 880-MAC, 12-bit convolutional neural network (CNN) companion processor. Although the CNN880 is optional for Synopsys customers using the EV6x, it has been a key factor (often the lead factor, in fact) in more than 90% of EV6x customer engagements, according to Product Marketing Manager Gordon Cooper. And although an 880-MAC architecture sat at the high end of customers' compute requirements a year ago, Cooper claims that it now undershoots many applications' needs. As a result, and also likely in response to competitors' recent product introductions, Synopsys is preparing to release 1,760- and 3,520-MAC members of its CNN engine family (Figure 1).
Figure 1. Although the high-level block diagram of Synopsys' DesignWare EV6x vision processing core remains seemingly unchanged from last year's unveiling (top), the CNN engine has expanded into a three-variant family offering up to 3,520 12-bit MACs per clock cycle (bottom).
In a recent briefing, Cooper stressed that the CNN engine remains an optional add-on, not a requirement. While a radar-only ADAS implementation, for example, might greatly value the DesignWare EV6x's vector DSP capabilities, it won't necessarily need deep learning acceleration; the same might be the case in an augmented reality design. However, CNN support is a key requirement for the bulk of the vision processing design opportunities Cooper foresees, often in combination with classic computer vision functions (Figure 2). And this year, he claims, the CNN1760 variant will be the mainstream option chosen by his customers, with last year's CNN880 relegated to entry-level implementations and the CNN3520 taking over the high-end mantle.
Figure 2. Classic computer vision techniques and deep learning are often combined in applications. The DesignWare EV6x's vector DSP and CNN processing capabilities support both approaches.
While competitor Cadence has chosen to implement 8x8 MACs in its Vision C5 CNN core, Synopsys for now is staying with 12x12 MACs. Cooper acknowledges that recent academic studies have shown good results with smaller data types, but he believes that the bulk of his customers aren't yet ready to make the leap of faith below 12-bit precision. And, Cooper says, Synopsys' analysis suggests that migrating from 12-bit to 8-bit MACs wouldn't come close to matching the 50% silicon area reduction the company previously claimed for its move from 16-bit to 12-bit.
Like Cadence's offering, and unlike CEVA's, Synopsys' CNN engine natively processes all neural network layers; the CNN880, for example, includes 16 MACs tailored for fully connected layers, with the remaining 864 intended for convolutional layers. CEVA's XM-6, by contrast, acts as a neural network accelerator, directly handling convolutions but leveraging the companion vector DSP for other deep learning operations. As for silicon area, the EV61CNN880 variant shown in Figure 1 consumes less than 1 mm² on a 16 nm FinFET process, according to Cooper. All three CNN engine variants run at up to 1.28 GHz on that same fabrication foundation.
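As a back-of-the-envelope illustration of what these figures imply, peak throughput is simply the MAC count multiplied by the clock rate. The sketch below assumes that each MAC unit completes one multiply-accumulate per clock cycle and that every unit is busy at the 1.28 GHz maximum, so it gives an upper bound rather than a measured result. The function and dictionary names are illustrative, not Synopsys terminology.

```python
# Peak throughput = MAC units x clock frequency.
# Assumption: one multiply-accumulate per MAC unit per cycle, all units busy.
MAX_CLOCK_HZ = 1.28e9  # "up to 1.28 GHz" on 16 nm FinFET, per the article

# MAC counts for the three CNN engine variants named in the article.
ENGINE_MACS = {"CNN880": 880, "CNN1760": 1760, "CNN3520": 3520}


def peak_macs_per_second(mac_units, clock_hz=MAX_CLOCK_HZ):
    """Return the theoretical peak multiply-accumulates per second."""
    return mac_units * clock_hz


def convolution_macs(total_macs, fully_connected_macs):
    """Return the MACs left for convolutional layers after the FC share."""
    return total_macs - fully_connected_macs


if __name__ == "__main__":
    for name, macs in ENGINE_MACS.items():
        gmacs = peak_macs_per_second(macs) / 1e9
        print(f"{name}: {gmacs:.1f} GMAC/s peak")
    # The article gives the fully connected share only for the CNN880.
    print("CNN880 convolution MACs:", convolution_macs(880, 16))  # 864
```

Under these assumptions the CNN880 peaks at roughly 1.1 TMAC/s and the CNN3520 at roughly 4.5 TMAC/s; sustained throughput on a real network depends on layer shapes and memory bandwidth.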
Synopsys has spent the last year not only expanding its core product line, but also enhancing the associated development tools. The company offers architecture-tailored versions of common OpenCV open-source computer vision library functions, along with an OpenVX framework. Cooper believes that Synopsys’ support of OpenCL C is unique. And Synopsys' CNN graph mapping tool offers ever-expanding support (now including AlexNet, GoogLeNet and SqueezeNet, along with customer-customized graphs), and will be extended beyond Caffe to also support the TensorFlow framework in October (Figure 3). Finally, Cooper points to the companion encryption cores included in Synopsys' IP product line, noting that an increasing number of customers require design security for their deep learning model coefficients and other parameters.
Figure 3. Synopsys' graph mapping tool quantizes floating-point model sources into graphs compatible with the CNN engine's 12-bit MAC resources.
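The article does not describe how Synopsys' graph mapping tool performs this quantization. Purely as an illustration of the general idea, the sketch below applies a common symmetric, per-tensor scheme that maps floating-point weights onto signed 12-bit integer codes. The function names and the scaling rule are assumptions chosen for the example, not the tool's actual algorithm.

```python
# Generic symmetric quantization sketch (NOT Synopsys' documented method).
# Floats are mapped to signed integers in [-2047, 2047] for 12-bit storage.

def quantize_symmetric(values, bits=12):
    """Quantize a list of floats to signed integer codes plus one scale."""
    qmax = 2 ** (bits - 1) - 1  # 2047 for 12 bits
    max_abs = max(abs(v) for v in values)
    # Avoid dividing by zero when every value is 0.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate floating-point values."""
    return [c * scale for c in codes]


if __name__ == "__main__":
    weights = [0.5, -1.0, 0.25, 0.0]  # hypothetical layer weights
    codes, scale = quantize_symmetric(weights)
    restored = dequantize(codes, scale)
    worst = max(abs(w - r) for w, r in zip(weights, restored))
    print("codes:", codes)
    print("worst-case reconstruction error:", worst)
```

With this scheme the rounding error per weight is bounded by half the scale factor, which is one way to see why wider data types such as 12-bit leave more accuracy headroom than 8-bit.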
The CNN1760 and CNN3520 members of Synopsys' neural network processor IP family will be available for licensing beginning next month. The company's CNN graph mapping tool support is scheduled to expand beyond Caffe to also cover the TensorFlow framework beginning in October. For more information on Synopsys' CNN engine architecture and companion development tool suite, please see the company's two recent presentations at May's Embedded Vision Summit, "Designing Scalable Embedded Vision SoCs from Day 1" and "Moving CNNs from Academic Theory to Embedded Reality".