When Movidius unveiled the Fathom Neural Compute Stick, based on its Myriad 2 VPU (vision processor), at the May 2016 Embedded Vision Summit, the company targeted a $99 price tag and initially planned to support the TensorFlow framework, with support for Caffe and other frameworks to follow. A lot's changed in a year-plus, most notably Intel's acquisition of Movidius announced in September. The company's new version of the Neural Compute Stick drops the price by 20%, switches from plastic to aluminum packaging, increases onboard memory by 4x or 8x (depending on how you're counting), and reorders the associated deep learning framework support plans to prioritize Caffe (Figure 1).
Figure 1. While plastic packaging was selected for last year's initial beta iterations of Intel Movidius' Neural Compute Stick, the final production version switches to an aluminum chassis, along with substantially increasing the amount of integrated DRAM.
Jack Dashwood, Marketing Business Unit Lead for Intel's Movidius Group, reiterated in a recent briefing that last year's initial version of the hardware never exited private beta. When Movidius briefed BDTI on Fathom last April, the company planned to use a combination of the VPU and 512 MBytes of LPDDR3 SDRAM in a multi-die package. Shortly thereafter, the integrated DRAM plans grew to 1 GByte. And now, reflective of the increasingly large and complex (i.e., deeper) neural network models in use, the production version of the hardware embeds 4 GBytes of LPDDR3 SDRAM in the same multi-die "sandwich" package arrangement (Video 1). The VPU nominally runs at a 600 MHz clock frequency, with the DRAM clocking in at 933 MHz. The combination, according to Vice President and General Manager Remi El-Ouazzane, translates into more than 100 gigaflops of performance within a 1 W power envelope.
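The successive memory figures above are also the source of the "4x or 8x" characterization in the introduction; a quick sanity check of the quoted numbers:

```python
# Baselines from the article: 512 MB in the original Fathom spec,
# 1 GB in the revised spec, 4 GB in the production version.
original_mb, revised_mb, production_mb = 512, 1024, 4096
assert production_mb // original_mb == 8  # "8x" vs. the first spec
assert production_mb // revised_mb == 4   # "4x" vs. the revision
# More than 100 GFLOPS within a ~1 W envelope implies an efficiency
# of over 100 GFLOPS per watt.
gflops, watts = 100, 1.0
print(gflops / watts)  # → 100.0
```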
Video 1. The Neural Compute Stick combines the Myriad 2 VPU and DRAM in an easy-to-use USB stick form factor suitable for both R&D and low-volume production purposes.
The materials transition from plastic to aluminum improves durability, making the hardware suitable not only for development (with the VPU directly integrated within the system in the final production design) but also for low-volume production deployments. The price decrease to $79 (Mouser and RS Components are the lead distributors) is a reflection of Intel's "manufacturing and design expertise". One other (software-assisted) new hardware configuration enhancement also bears mention: by means of a multi-port USB3 hub, it's possible to combine multiple Neural Compute Stick-based Myriad 2 VPUs in a parallel multiprocessing array (Figure 2). According to Gary Brown, Vice President of Marketing, internal testing has confirmed that this "Multi-Stick mode" delivers near-linear performance increases up to four parallel sticks; the company is currently validating six- and eight-stick configurations.
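The multi-stick arrangement is, at the application level, straightforward data parallelism: each stick holds its own copy of the network, and the host round-robins independent inputs across them. A minimal sketch of the dispatch pattern follows; the `run_inference` function is a hypothetical stand-in, not the actual SDK API.

```python
from concurrent.futures import ThreadPoolExecutor

def run_inference(stick_id, frame):
    # Hypothetical per-stick inference call. With the real hardware, each
    # stick would be opened as a separate USB device with its own loaded
    # graph; here we just return a dummy result for illustration.
    return (stick_id, frame * 2)

def dispatch(frames, num_sticks=4):
    # Round-robin frames across sticks. Because each stick processes its
    # inputs independently, aggregate throughput can scale near-linearly
    # with stick count, as the article's internal-testing claim describes.
    with ThreadPoolExecutor(max_workers=num_sticks) as pool:
        futures = [pool.submit(run_inference, i % num_sticks, f)
                   for i, f in enumerate(frames)]
        return [fut.result() for fut in futures]

results = dispatch(list(range(8)), num_sticks=4)
```

Note that this scaling applies to throughput (frames per second across a stream), not to the latency of any single inference, since one input still runs on one stick.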
Figure 2. Internal testing has, according to company officials, shown near-linear performance increases in up to four-VPU configurations; the company believes that there is no theoretical maximum, and higher stick count validation is currently underway.
Conceptually, at least, the aspirations for the accompanying software development kit are unchanged from last year's announcement. As with similar toolsets from companies such as Cadence, CEVA and Synopsys, the Intel Movidius offering converts pre-trained floating-point deep learning models into 16-bit fixed-point equivalents compatible with the Myriad 2's processing resources. However, whereas the company initially planned to lead with TensorFlow support, subsequent market evaluation has led to a reprioritization in favor of Caffe. TensorFlow support is still planned "soon" (company officials were unwilling to be any more specific at this time), and other frameworks such as the Facebook-championed Caffe2 and Microsoft-developed Cognitive Toolkit are also in ongoing evaluation.
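To illustrate what such a float-to-fixed-point conversion involves in general terms (this is a generic symmetric-scaling sketch, not the SDK's actual conversion pipeline): each tensor's weights are mapped onto the 16-bit integer range via a per-tensor scale factor, trading a small amount of precision for much cheaper arithmetic and storage.

```python
def quantize_int16(weights):
    # Symmetric linear quantization: map the largest-magnitude weight
    # onto the int16 extreme, so every weight fits in 16 bits.
    scale = max(abs(w) for w in weights) / 32767.0
    return [round(w / scale) for w in weights], scale

def dequantize(quantized, scale):
    # Recover approximate floating-point values for comparison.
    return [q * scale for q in quantized]

w = [0.5, -1.0, 0.25]
q, s = quantize_int16(w)
# dequantize(q, s) closely approximates the original weights
```

Real toolchains refine this basic idea (per-channel scales, calibration over representative inputs, and so on), but the core space/precision trade-off is the same.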
Initially, the Movidius Neural Compute SDK ran only on Intel CPU-based PCs running Ubuntu Linux 16.04 (Video 2):
Video 2. The Neural Compute Stick's "getting started" video describes its initial x86-based hardware and Ubuntu Linux-based software requirements.
More recently, and especially interesting given Intel's x86-centric focus, an updated version of the SDK adds support for ARM-based SoCs such as those in Raspberry Pi platforms, as well as for Debian Linux (Video 3):
Video 3. Recent expansions in processor and operating system support enable Neural Compute Stick compatibility with, for example, the Raspberry Pi 3 Model B.
To its credit, Intel Movidius has assembled a robust (and growing) set of online resources for developer reference, including both product and technical FAQs, a quick-start guide, and an active discussion forum section. While Myriad 2 dates from mid-2014, the architecture seemingly still has plenty of life in it, judging from recent product announcements from DJI, Motorola and other customers. It will be interesting to see how this particular iteration of the VPU continues to evolve and stack up against competitive offerings, as well as how the combined Intel and Movidius operation plans to position it long-term against the newly introduced next-generation Myriad X, which InsideDSP plans to cover in an upcoming edition.