By 2025, computers and other instruments will generate more than 175 zettabytes of data. For context, there are a billion terabytes in a zettabyte. If you’re wondering how all that information will be analyzed, you’re not alone. At present, only one percent of data produced worldwide is ever evaluated.
To help solve this problem, the Defense Advanced Research Projects Agency (DARPA) has awarded a $1 million grant to Computer Science and Engineering Department professors Tajana Rosing and Sanjoy Dasgupta and Electrical and Computer Engineering Department professor Tara Javidi to explore how hyperdimensional (HD) computing can help address this informational onslaught. The project is called HyDREA (Hyperdimensional Computing: Robust, Efficient and Accurate).
HD computing seeks to better replicate human brain power in silico. The “hyper” part refers to the much larger size of each data representation: instead of 32 or 64 bits, a single vector can contain 10,000 bits or more.
“Let's imagine I have a 32-bit number,” said Rosing. “Today, each 32-bit number would have to be stored separately. But with hyperdimensional computing, I can combine the information into a single, 10,000-bit vector, which could, for example, represent all photos of a cat or all photos of a dog.”
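To make the idea of combining many items into one vector concrete, here is a minimal Python sketch. It assumes random bipolar (+1/-1) hypervectors and an element-wise majority vote, a combining step often called bundling in the HD literature. The function names and the choice of five example vectors are illustrative assumptions, not details of the HyDREA project.

```python
import random

DIM = 10_000  # vector size mentioned in the article


def random_hypervector(rng, dim=DIM):
    """Stand-in for an encoded item (for example, one cat photo)."""
    return [rng.choice((-1, 1)) for _ in range(dim)]


def bundle(vectors):
    """Combine several hypervectors with an element-wise majority vote."""
    sums = [sum(column) for column in zip(*vectors)]
    return [1 if s >= 0 else -1 for s in sums]


def similarity(a, b):
    """Normalized dot product: 1.0 for identical vectors, about 0 for unrelated ones."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


rng = random.Random(0)
cat_examples = [random_hypervector(rng) for _ in range(5)]  # hypothetical "cat" items
cat_class = bundle(cat_examples)  # one 10,000-element vector for the whole group
unrelated = random_hypervector(rng)

sim_member = similarity(cat_class, cat_examples[0])  # clearly positive
sim_unrelated = similarity(cat_class, unrelated)     # close to zero
print(sim_member, sim_unrelated)
```

The bundled vector stays measurably similar to each item that went into it, while an unrelated vector scores near zero. That is what allows a single vector to stand in for a whole group.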
Brains have similar mechanisms. Sensory data enters at relatively low dimensionality, but the mind expands those representations as it processes them. Data that started at a million bits can be enlarged to 200 million bits.
The 18-month HyDREA project will have to overcome a long list of technical challenges. Current memory is not designed to handle data at this scale. In addition, the team will need to develop new ways to process data and handle large dimensionality without losing speed.
“One issue is encoding the data, or taking whatever data we're currently collecting that's in 32- or 64-bit numbers, and figuring out the right way to map it into high dimensional space,” said Rosing. “This encoding process takes the majority of the time, and we are looking at how we could make it run a lot faster.”
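As a rough illustration of what such a mapping can look like, the sketch below uses a random projection followed by a sign function. This is one simple encoding that appears in the HD literature, and it is not necessarily the one the team is studying. The feature values, the names and the four-feature input are all assumptions made for the example.

```python
import random

DIM = 10_000  # target dimensionality


def make_projection(num_features, rng, dim=DIM):
    """One random bipolar (+1/-1) vector per input feature (assumed scheme)."""
    return [[rng.choice((-1, 1)) for _ in range(dim)] for _ in range(num_features)]


def encode(features, projection):
    """Map a short list of numbers to a bipolar hypervector.

    Each output component is the sign of a random signed sum of the inputs.
    """
    encoded = []
    for column in zip(*projection):
        total = sum(x * p for x, p in zip(features, column))
        encoded.append(1 if total >= 0 else -1)
    return encoded


def similarity(a, b):
    """Normalized dot product between two bipolar vectors."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


rng = random.Random(0)
projection = make_projection(4, rng)

sample = [0.9, 0.1, 0.4, 0.7]        # hypothetical low-dimensional reading
nearby = [0.88, 0.12, 0.41, 0.69]    # slightly perturbed copy of the sample
distant = [-0.9, 0.8, -0.3, 0.1]     # very different reading

sim_near = similarity(encode(sample, projection), encode(nearby, projection))
sim_far = similarity(encode(sample, projection), encode(distant, projection))
print(sim_near, sim_far)
```

Inputs that are close together land on similar hypervectors, and inputs that differ land far apart. The cost is visible as well: every input needs one pass over all 10,000 components for each feature, which suggests why encoding can dominate run time.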
The researchers also want to identify the best applications for HD computing. At present, no one fully understands the full range of possible HD applications, though that challenge lies further downstream.
“If I have multiple nodes that can do hyperdimensional computing, how should they communicate the data?” asked Rosing. “Should they communicate the hypervectors? Should they communicate something we learned from the hypervectors? What happens if some data gets blocked? Do we lose a lot of accuracy?” These are just a few of the questions that the project seeks to address.
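The question about blocked data can be explored with a small experiment. The Python sketch below assumes two random bipolar class hypervectors and a receiver that loses 40 percent of the components in transit. The labels, the loss rate and the dot-product comparison are assumptions chosen for illustration, not results from the project.

```python
import random

DIM = 10_000


def random_hypervector(rng, dim=DIM):
    """Random bipolar (+1/-1) hypervector standing in for a learned class."""
    return [rng.choice((-1, 1)) for _ in range(dim)]


def drop_components(vector, fraction, rng):
    """Zero out a random fraction of components to mimic blocked data."""
    return [0 if rng.random() < fraction else v for v in vector]


def dot(a, b):
    """Plain dot product; dropped components contribute nothing."""
    return sum(x * y for x, y in zip(a, b))


rng = random.Random(1)
classes = {
    "cat": random_hypervector(rng),  # hypothetical class vectors
    "dog": random_hypervector(rng),
}

# A node sends the "cat" hypervector, but 40% of it never arrives.
received = drop_components(classes["cat"], 0.4, rng)

scores = {label: dot(received, hv) for label, hv in classes.items()}
best = max(scores, key=scores.get)
print(best, scores)
```

In this toy setting the damaged vector is still matched to the right class by a wide margin, because the information is spread across all components rather than concentrated in a few. Whether that holds for real workloads and real networks is the kind of question the project is set up to answer.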
Overall, the team will have to develop new encoding and decoding strategies, fast HD algorithms and efficient hardware. On this last point, the HyDREA team is collaborating with Northrop Grumman to test HD computing on the aerospace company’s stochastic computing chip.
“At the end of the project we'll hopefully have a couple of hardware implementations that will show the power of HD,” said Rosing. “We want to show that HD computing can be made a thousand times more efficient and faster than current machine learning without losing accuracy.”