- Redmond WA, US Boris BOBROV - Kirkland WA, US Kent D. CEDOLA - Bellevue WA, US Chad Balling MCBRIDE - North Bend WA, US George PETRE - Redmond WA, US Larry Marvin WALL - Seattle WA, US
Neural processing elements are configured with a hardware AND gate that performs a logical AND operation between a sign extend signal and the most significant bit (“MSB”) of an operand. The state of the sign extend signal can be based upon the type of the layer of a deep neural network (“DNN”) that generated the operand. If the sign extend signal is logical FALSE, no sign extension is performed. If the sign extend signal is logical TRUE, a concatenator concatenates the output of the hardware AND gate and the operand, thereby extending the operand from an N-bit unsigned binary value to an N+1 bit signed binary value. The neural processing element can also include a second hardware AND gate and a second concatenator for processing another operand in the same way. The outputs of the concatenators for both operands are provided to a hardware binary multiplier.
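The gate-level behavior described above is simple enough to model directly. Below is a minimal Python sketch of that datapath (illustrative only: the bit width, the function name, and the two's-complement reading of the N+1-bit result are assumptions, not details published in the abstract):

```python
def extend_operand(operand: int, n_bits: int, sign_extend: bool) -> int:
    """Model the AND gate plus concatenator described in the abstract."""
    msb = (operand >> (n_bits - 1)) & 1       # most significant bit of the operand
    extend_bit = int(sign_extend) & msb       # the hardware AND gate
    value = (extend_bit << n_bits) | operand  # concatenator: an (N+1)-bit result
    # Read the (N+1)-bit pattern as a two's-complement signed value.
    return value - (1 << (n_bits + 1)) if extend_bit else value

# Sign extend signal FALSE: 0b1111 stays the unsigned value 15.
assert extend_operand(0b1111, 4, sign_extend=False) == 15
# Sign extend signal TRUE: 0b1111 becomes 0b11111, i.e. -1.
assert extend_operand(0b1111, 4, sign_extend=True) == -1
# Both extended operands then feed the hardware binary multiplier.
assert extend_operand(0b1111, 4, True) * extend_operand(0b0011, 4, True) == -3
```

With the sign extend signal held FALSE the prepended bit is always 0, so the same datapath passes unsigned operands through unchanged, consistent with the abstract's per-layer choice of signed or unsigned treatment.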
Neural Network Processor Using Compression And Decompression Of Activation Data To Reduce Memory Bandwidth Utilization
- Redmond WA, US Benjamin Eliot LUNDELL - Seattle WA, US Larry Marvin WALL - Seattle WA, US Chad Balling McBRIDE - North Bend WA, US Amol Ashok AMBARDEKAR - Redmond WA, US George PETRE - Redmond WA, US Kent D. CEDOLA - Bellevue WA, US Boris BOBROV - Kirkland WA, US
International Classification:
G06N 3/04 G06N 3/063 H03M 7/30
Abstract:
A deep neural network (“DNN”) module can compress and decompress neuron-generated activation data to reduce the utilization of memory bus bandwidth. The compression unit can receive an uncompressed chunk of data generated by a neuron in the DNN module. The compression unit generates a mask portion and a data portion of a compressed output chunk. The mask portion encodes the presence and location of the zero and non-zero bytes in the uncompressed chunk of data. The data portion stores truncated non-zero bytes from the uncompressed chunk of data. A decompression unit can receive a compressed chunk of data from memory in the DNN processor or memory of an application host. The decompression unit decompresses the compressed chunk of data using the mask portion and the data portion. This can reduce memory bus utilization, allow a DNN module to complete processing operations more quickly, and reduce power consumption.
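As a rough illustration of the mask-plus-data layout this abstract describes, the following Python sketch compresses a chunk by recording a one-bit-per-byte mask of non-zero positions and packing only the non-zero bytes (a simplification: the chunk size here is arbitrary, and the byte "truncation" the abstract mentions is not modeled):

```python
def compress_chunk(chunk: bytes) -> tuple[bytes, bytes]:
    """Split an uncompressed chunk into a mask portion and a data portion.

    Each mask bit marks whether the corresponding input byte is non-zero;
    only the non-zero bytes are stored in the data portion.
    """
    mask = bytearray((len(chunk) + 7) // 8)
    data = bytearray()
    for i, b in enumerate(chunk):
        if b != 0:
            mask[i // 8] |= 1 << (i % 8)
            data.append(b)
    return bytes(mask), bytes(data)

def decompress_chunk(mask: bytes, data: bytes, length: int) -> bytes:
    """Rebuild the uncompressed chunk from the mask and data portions."""
    out = bytearray(length)
    nonzero = iter(data)
    for i in range(length):
        if (mask[i // 8] >> (i % 8)) & 1:
            out[i] = next(nonzero)
    return bytes(out)

# Sparse activation data compresses well: 64 bytes shrink to an
# 8-byte mask plus 16 non-zero bytes.
chunk = bytes([0, 0, 7, 0, 42, 0, 0, 0] * 8)
mask, data = compress_chunk(chunk)
assert decompress_chunk(mask, data, len(chunk)) == chunk
```

The scheme pays off exactly when activations are sparse, as is common after ReLU-style activation functions; a chunk with no zero bytes would grow slightly because of the mask overhead.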
Power-Efficient Deep Neural Network Module Configured For Executing A Layer Descriptor List
- Redmond WA, US Kent D. CEDOLA - Bellevue WA, US Larry Marvin WALL - Seattle WA, US Boris BOBROV - Kirkland WA, US George PETRE - Redmond WA, US Chad Balling McBRIDE - North Bend WA, US
International Classification:
G06N 3/063 G06F 13/28 G06F 17/15 G06F 13/16
Abstract:
A deep neural network (DNN) processor is configured to execute descriptors in layer descriptor lists. The descriptors define instructions for performing a pass of a DNN by the DNN processor. Several types of descriptors can be utilized: memory-to-memory move (M2M) descriptors; operation descriptors; host communication descriptors; configuration descriptors; branch descriptors; and synchronization descriptors. A DMA engine uses M2M descriptors to perform multi-dimensional strided DMA operations. Operation descriptors define the type of operation to be performed by neurons in the DNN processor and the activation function to be used by the neurons. M2M descriptors are buffered separately from operation descriptors and can be executed as soon as possible, subject to explicitly set dependencies. As a result, latency can be reduced and, consequently, the neurons can complete their processing faster. The DNN module can then be powered down earlier than it otherwise would have been, thereby saving power.
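A minimal sketch of how such a descriptor list might be represented and routed, assuming Python dataclasses and invented field names (the abstract does not publish a descriptor format):

```python
from dataclasses import dataclass
from enum import Enum, auto

class DescriptorType(Enum):
    M2M = auto()        # memory-to-memory move
    OPERATION = auto()  # neuron operation + activation function
    HOST_COMM = auto()
    CONFIG = auto()
    BRANCH = auto()
    SYNC = auto()

@dataclass
class Descriptor:
    kind: DescriptorType
    depends_on: frozenset = frozenset()  # explicitly set dependencies

def buffer_descriptors(layer_list: list[Descriptor]) -> tuple[list, list]:
    """Route M2M descriptors to their own buffer so the DMA engine can
    start a move as soon as its explicit dependencies are satisfied,
    instead of waiting in line behind operation descriptors."""
    m2m_buffer, other_buffer = [], []
    for d in layer_list:
        (m2m_buffer if d.kind is DescriptorType.M2M else other_buffer).append(d)
    return m2m_buffer, other_buffer
```

Separate buffering is what lets data movement overlap with computation: a move feeding the next layer can begin while the neurons are still executing the current layer's operation descriptor.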
Dynamically Partitioning Workload In A Deep Neural Network Module To Reduce Power Consumption
- Redmond WA, US Boris BOBROV - Kirkland WA, US Chad Balling McBRIDE - North Bend WA, US George PETRE - Redmond WA, US Kent D. CEDOLA - Bellevue WA, US Larry Marvin WALL - Seattle WA, US
International Classification:
G06N 3/063 G06N 3/04 G06N 3/08
Abstract:
A deep neural network (DNN) module is disclosed that can dynamically partition neuron workload to reduce power consumption. The DNN module includes neurons and a group partitioner and scheduler unit. The group partitioner and scheduler unit divides a workload for the neurons into partitions in order to maximize the number of neurons that can simultaneously process the workload. The group partitioner and scheduler unit then assigns a group of neurons to each of the partitions. The groups of neurons in the DNN module process the workload in their assigned partition to generate a partial output value. The neurons in each group can then sum their partial output values to generate a final output value for the workload. The neurons can be powered down once the groups of neurons have completed processing their assigned workload to reduce power consumption.
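The partition-then-reduce flow the abstract describes maps onto a short sketch. A minimal Python model, assuming a simple dot-product workload and invented helper names:

```python
def partition(items: list, num_groups: int) -> list[list]:
    """Divide a workload into roughly equal partitions so as many
    neuron groups as possible can work simultaneously."""
    size = -(-len(items) // num_groups)  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]

inputs  = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
weights = [0.5] * 8

# Each neuron group computes a partial output value for its partition...
partials = [
    sum(x * w for x, w in group)
    for group in partition(list(zip(inputs, weights)), num_groups=4)
]
# ...and the partial output values are summed into the final output.
final_output = sum(partials)
assert final_output == 18.0
```

In hardware, each group would be powered down as soon as its partial value is handed off, which is the power saving the abstract claims.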
Name / Title: Chad Rylan Mcbride; Chad McBride MD, Family Doctor
Phones & Addresses: 1220 Gossman Ln, Wenatchee, WA 98801; (509) 662-6000
Medicine Doctors
Dr. Chad R Mcbride, Wenatchee, WA - MD (Doctor of Medicine)
William Beaumont Army Medical Center, Cardiovascular Disease, 5005 N Piedras St Rm 4278, El Paso, TX 79920; (915) 742-1840 (phone), (915) 742-8306 (fax)
Education:
Medical School: Kansas City University of Medicine and Biosciences College of Osteopathic Medicine; Graduated: 2007
Languages:
English
Description:
Dr. McBride graduated from the Kansas City University of Medicine and Biosciences College of Osteopathic Medicine in 2007. He works in El Paso, TX, and specializes in Cardiovascular Disease. Dr. McBride is affiliated with William Beaumont Army Medical Center.
Logic Design Lead - Bus Interface Unit - Wii gaming CPU at IBM Systems & Technology Group
Location:
Redmond, Washington
Industry:
Computer Hardware
Work:
IBM Systems & Technology Group - Rochester, Minnesota Area since Jan 2009
Logic Design Lead - Bus Interface Unit - Wii gaming CPU
Microsoft 2012 - 2013
Senior Hardware Designer
Microsoft 2012 - 2012
Senior Hardware Designer
IBM Systems & Technology Group - Rochester, Minnesota Area Oct 2007 - Jan 2012
Logic Design Lead - Security Engine and CPU Bus Interface - Xbox 360 CPU
IBM Systems & Technology Group - San Jose Nov 2006 - Jan 2007
Logic Design - Cisco Switch Chip
Education:
University of Minnesota-Twin Cities 1996 - 2001
Master's, Electrical Engineering
Utah State University 1990 - 1996
BS, Electrical Engineering