Software and Hardware Co-design for Scalable and Energy-efficient Neural Network Training with Processing-in-Memory

Jishen Zhao
Current Affiliation: University of California, San Diego
Monday, November 26, 2018 @ 11:00am
Room 1202, CSE Building

Abstract:

Neural networks (NNs) have been adopted in a wide range of application domains, such as image classification, speech recognition, object detection, and computer vision. However, training NNs, especially deep neural networks (DNNs), can be energy- and time-consuming because of frequent data movement between the processor and memory. Furthermore, training involves massive numbers of fine-grained operations with varied computation and memory-access characteristics; exploiting high parallelism across such diverse operations is challenging.

In this talk, I will describe our effort on software/hardware co-design of a heterogeneous processing-in-memory (PIM) system. Our hardware design incorporates hundreds of fixed-function arithmetic units and a programmable core on the logic layer of a 3D die-stacked memory, forming a heterogeneous PIM architecture attached to the CPU. Our software design offers a programming model and a runtime system that program, offload, and schedule the various NN training operations across the compute resources provided by the CPU and the heterogeneous PIM. By extending the OpenCL programming model and employing a hardware-heterogeneity-aware runtime system, we enable high program portability and easy program maintenance across various heterogeneous hardware, optimize system energy efficiency, and improve hardware utilization.

Furthermore, DNN training can require terabytes of memory capacity. To tackle this memory capacity challenge, we propose a scalable and elastic memory fabric architecture, which consists of (i) a random topology and a greedy routing protocol that can efficiently interconnect up to a thousand 3D memory stacks and (ii) a reconfiguration scheme that allows the memory fabric to scale elastically. In addition to scalability and flexibility, our memory fabric design also improves system throughput and reduces energy consumption compared with traditional memory fabric designs.
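To make the offloading idea concrete, here is a minimal sketch of a heterogeneity-aware dispatch policy: each training operation is classified by its arithmetic intensity (FLOPs per byte moved), and memory-bound operations are routed to PIM resources. This is an illustrative toy model, not the runtime described in the talk; the operation descriptors, the intensity threshold, and the `regular` flag are all assumptions.

```python
from dataclasses import dataclass
from enum import Enum, auto

class Resource(Enum):
    CPU = auto()            # host processor cores
    PIM_FIXED = auto()      # fixed-function arithmetic units on the logic layer
    PIM_CORE = auto()       # programmable core on the logic layer

@dataclass
class TrainingOp:
    name: str
    flops: float            # arithmetic work of the operation
    bytes_moved: float      # memory traffic of the operation
    regular: bool           # hypothetical flag: access pattern suits fixed-function units

def schedule(op: TrainingOp, intensity_threshold: float = 10.0) -> Resource:
    """Pick a compute resource from the op's arithmetic intensity (FLOPs/byte).

    The threshold is an assumed tuning knob, not a value from the talk.
    """
    intensity = op.flops / max(op.bytes_moved, 1.0)
    if intensity >= intensity_threshold:
        return Resource.CPU          # compute-bound: keep on the host cores
    if op.regular:
        return Resource.PIM_FIXED    # memory-bound and regular: fixed-function PIM
    return Resource.PIM_CORE         # memory-bound but irregular: programmable PIM

# Hypothetical fine-grained training operations with rough cost estimates.
ops = [
    TrainingOp("conv_forward",      flops=2e9, bytes_moved=5e7, regular=True),
    TrainingOp("sgd_weight_update", flops=1e6, bytes_moved=8e6, regular=True),
    TrainingOp("embedding_gather",  flops=1e5, bytes_moved=4e6, regular=False),
]
for op in ops:
    print(f"{op.name:20s} -> {schedule(op).name}")
```

Running this routes the compute-bound convolution to the CPU and the two memory-bound operations to the fixed-function and programmable PIM resources, respectively, which is the kind of placement a heterogeneity-aware runtime aims for.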
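The greedy routing idea for the memory fabric can likewise be illustrated with a toy model: memory stacks are placed at random coordinates, each stack links to a few nearby stacks, and a message is forwarded to whichever neighbor is geometrically closest to the destination. This is a generic greedy-routing sketch under assumed coordinates and graph construction, not the protocol proposed in the work, and the dead-end handling here is a deliberate simplification.

```python
import math
import random

def dist(a, b):
    """Euclidean distance between two stack coordinates."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def build_fabric(num_stacks=64, degree=4, seed=1):
    """Toy fabric: place stacks in a unit square and link each to its
    `degree` nearest stacks (links are made bidirectional)."""
    rng = random.Random(seed)
    coords = [(rng.random(), rng.random()) for _ in range(num_stacks)]
    links = {i: set() for i in range(num_stacks)}
    for i in range(num_stacks):
        nearest = sorted(range(num_stacks), key=lambda j: dist(coords[i], coords[j]))
        for j in nearest[1:degree + 1]:
            links[i].add(j)
            links[j].add(i)
    return coords, links

def greedy_route(src, dst, coords, links, max_hops=64):
    """Forward to the neighbor geometrically closest to the destination."""
    path, cur = [src], src
    while cur != dst and len(path) <= max_hops:
        nxt = min(links[cur], key=lambda n: dist(coords[n], coords[dst]))
        if dist(coords[nxt], coords[dst]) >= dist(coords[cur], coords[dst]):
            return None  # greedy dead end; a real protocol needs a fallback
        path.append(nxt)
        cur = nxt
    return path if cur == dst else None

coords, links = build_fabric()
print(greedy_route(0, 63, coords, links))
```

The appeal of greedy routing in this setting is that each stack forwards using only local neighbor information, so no global routing tables need to be rebuilt when stacks are added or removed, which is what makes an elastic, reconfigurable fabric practical.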

Bio:

Jishen Zhao is an Assistant Professor in the Computer Science and Engineering Department at the University of California, San Diego. Her research spans and stretches the boundary between computer architecture and system software, with a particular emphasis on memory and storage systems, domain-specific acceleration, and system reliability. Her research is driven by both emerging memory technologies (e.g., 3D integration and nonvolatile memories) and modern applications (e.g., big-data analytics, machine learning, and high-performance computing). Before joining UCSD, she was an Assistant Professor at UC Santa Cruz and a research scientist at HP Labs. She is a recipient of an NSF CAREER Award (2017) and a MICRO Best Paper Award Honorable Mention (2013).