When I was a kid, 10,000 lines of code was considered a decent-sized application. Now, it seems, we’re on the verge of seeing applications with 10,000 threads. Or at least, that’s what graphics chip maker Nvidia is envisioning. Nvidia recently announced the Tesla product family, which includes a chip with 128 processors and hardware support for execution of thousands of threads.
Like other Nvidia chips, Tesla is a graphics processing unit (GPU), but this new chip is not really targeting graphics applications. Instead, Nvidia is using its GPU architecture to go after a different application space: high-performance computing. High-performance computing, or HPC, refers to computationally demanding but non-real-time applications, such as processing seismic data for oil exploration.
Nvidia isn’t the first company to get the idea of retargeting GPUs towards high-performance computing; last year, AMD introduced a similar initiative. It’s one of those times when multiple people have figured out that a technology developed for one thing (graphics) is also useful for something else (high-performance computing). But what I’m wondering is whether Nvidia’s chips will find yet another home, in embedded digital signal processing applications, where massively parallel, multi-core chips have generated significant interest but (as yet) few design wins.
Whether you think threads are a good way to increase performance or not (and some people don’t), Nvidia’s “massively threaded” approach is interesting, and clearly a radical departure from existing programming paradigms. Chips that make such departures typically have enormous problems getting substantial market traction. Not only is the technology unproven, but it’s often introduced by a new chip company, and new chip companies have an unnerving tendency to go belly up. This has been an ongoing concern with massively parallel chips targeting DSP applications. In this respect, Nvidia has an edge: Nvidia’s GPUs are already found in many PCs; you can walk into Fry’s and buy them. This is not some unknown start-up we’re talking about.
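To give a flavor of what “massively threaded” means in practice: in Nvidia’s CUDA programming model (the software toolkit Nvidia supplies for programming these chips), you don’t manage a handful of heavyweight operating-system threads — you launch one lightweight hardware thread per data element and let the chip’s scheduler sort them out. A minimal sketch, assuming the CUDA toolkit; the kernel name, block size, and gain operation are illustrative, not from Nvidia’s materials:

```cuda
// Each GPU thread scales one sample of a signal -- the kind of
// trivially parallel, data-heavy operation common in DSP workloads.
__global__ void scale(float *data, float gain, int n)
{
    // Derive this thread's global index from its block and thread IDs.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)            // guard threads that fall past the array end
        data[i] *= gain;
}

// Host side: launch enough 256-thread blocks to cover all n samples.
// For a million-sample buffer, this requests roughly a million
// logical threads in a single call.
void scale_on_gpu(float *d_data, float gain, int n)
{
    int threadsPerBlock = 256;
    int blocks = (n + threadsPerBlock - 1) / threadsPerBlock;
    scale<<<blocks, threadsPerBlock>>>(d_data, gain, n);
}
```

The notable design choice is that the hardware, not the programmer, maps those thousands of logical threads onto the 128 processors — which is exactly the departure from conventional thread programming that makes the approach both promising and unproven.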
But perhaps the biggest reason why massively parallel chips haven’t yet caught on in DSP applications is that they are highly complex, and engineers are leery of getting stuck with a complicated, untested development paradigm. Maybe Nvidia’s got an edge here, too.
Consider this combination: a supercomputer-like chip that can be bought inexpensively at a local computer store and plunked into the PCs of thousands of underfunded researchers and grad students, many of whom are willing to spend countless hours learning a complicated new development paradigm so that they can get their genomes sequenced (or whatever) and complete their PhDs in this lifetime. These researchers could constitute an army of guinea pigs for testing out the massively threaded approach. Couple this initial customer base with the growing demand in embedded signal processing applications for a proven, stable massively parallel chip technology. It could be the start of something interesting.