Alexei V. Bourd - San Diego CA, US Andrew Gruber - Arlington MA, US Aleksandra L. Krstic - San Diego CA, US Robert J. Simpson - Espoo, FI Colin Sharp - Cardiff CA, US Chun Yu - San Diego CA, US
Assignee:
Qualcomm Incorporated - San Diego CA
International Classification:
G06F 9/38
US Classification:
712 27, 712E09071
Abstract:
This disclosure describes techniques for extending the architecture of a general purpose graphics processing unit (GPGPU) with parallel processing units to allow efficient processing of pipeline-based applications. The techniques include configuring local memory buffers connected to parallel processing units operating as stages of a processing pipeline to hold data for transfer between the parallel processing units. The local memory buffers allow on-chip, low-power, direct data transfer between the parallel processing units. The local memory buffers may include hardware-based data flow control mechanisms to enable transfer of data between the parallel processing units. In this way, data may be passed directly from one parallel processing unit to the next parallel processing unit in the processing pipeline via the local memory buffers, in effect transforming the parallel processing units into a series of pipeline stages.
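As a rough software analogy for the pipeline arrangement described in this abstract, the sketch below connects stages with small bounded buffers. It is not the patented hardware design: Python threads stand in for parallel processing units, `queue.Queue` with a `maxsize` stands in for a local memory buffer, and blocking `put`/`get` stands in for hardware flow control. The names `stage`, `run_pipeline`, and `buffer_size` are hypothetical.

```python
import queue
import threading

SENTINEL = object()  # marks end of stream (an assumption of this sketch)

def stage(func, inbox, outbox):
    """One pipeline stage: read from its input buffer, write to the next one."""
    while True:
        item = inbox.get()           # blocks while the buffer is empty
        if item is SENTINEL:
            outbox.put(SENTINEL)     # pass the end-of-stream marker along
            return
        outbox.put(func(item))       # blocks while the next buffer is full

def run_pipeline(funcs, inputs, buffer_size=2):
    """Run `inputs` through `funcs` in order, one thread per stage."""
    # One bounded buffer before each stage, plus one after the last stage.
    buffers = [queue.Queue(maxsize=buffer_size) for _ in range(len(funcs) + 1)]
    workers = [
        threading.Thread(target=stage, args=(f, buffers[i], buffers[i + 1]))
        for i, f in enumerate(funcs)
    ]
    results = []

    def drain():
        while True:
            item = buffers[-1].get()
            if item is SENTINEL:
                return
            results.append(item)

    collector = threading.Thread(target=drain)
    for w in workers:
        w.start()
    collector.start()
    for x in inputs:
        buffers[0].put(x)
    buffers[0].put(SENTINEL)
    for w in workers:
        w.join()
    collector.join()
    return results
```

Data moves from one stage to the next only through the small buffers, so a slow stage naturally stalls the stage feeding it once its buffer fills.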
Techniques For Handling Divergent Threads In A Multi-Threaded Processing System
Lin Chen - San Diego CA, US David Rigel Garcia Garcia - Ontario, CA Andrew E. Gruber - Arlington MA, US Guofang Jiao - San Diego CA, US
Assignee:
QUALCOMM Incorporated - San Diego CA
International Classification:
G06F 9/38 G06F 9/30
US Classification:
712234, 712220, 712E09016, 712E09045
Abstract:
This disclosure describes techniques for handling divergent thread conditions in a multi-threaded processing system. In some examples, a control flow unit may obtain a control flow instruction identified by a program counter value stored in a program counter register. The control flow instruction may include a target value indicative of a target program counter value for the control flow instruction. The control flow unit may select one of the target program counter value and a minimum resume counter value as a value to load into the program counter register. The minimum resume counter value may be indicative of a smallest resume counter value from a set of one or more resume counter values associated with one or more inactive threads. Each of the one or more resume counter values may be indicative of a program counter value at which a respective inactive thread should be activated.
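The selection step in this abstract can be illustrated with a minimal sketch. The abstract does not say which rule picks between the two values; the code below assumes the smaller of the target program counter value and the minimum resume counter value is chosen, and that the target is used when no thread is inactive. The function name `next_program_counter` is hypothetical.

```python
from typing import Iterable, Optional

def next_program_counter(target_pc: int, resume_counters: Iterable[int]) -> int:
    """Choose the value to load into the program counter register.

    `resume_counters` holds one resume counter value per inactive thread.
    ASSUMPTION: the smaller of the two candidate values is selected; the
    abstract only says that one of them is selected.
    """
    min_resume: Optional[int] = min(resume_counters, default=None)
    if min_resume is None:
        # No inactive threads, so there is nothing to resume.
        return target_pc
    return min(target_pc, min_resume)
```

Under this assumption, a branch to address 40 with inactive threads waiting at 12 and 30 would load 12, so the thread waiting at 12 is not skipped over.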
Selectively Activating A Resume Check Operation In A Multi-Threaded Processing System
Lin Chen - San Diego CA, US Yun Du - San Diego CA, US Andrew Gruber - Arlington MA, US
International Classification:
G06F 9/38
US Classification:
712234, 712E09045
Abstract:
This disclosure describes techniques for selectively activating a resume check operation in a single instruction, multiple data (SIMD) processing system. A processor is described that is configured to selectively enable or disable a resume check operation for a particular instruction based on information included in the instruction that indicates whether a resume check operation is to be performed for the instruction. A compiler is also described that is configured to generate compiled code which, when executed, causes a resume check operation to be selectively enabled or disabled for particular instructions. The compiled code may include one or more instructions that each specify whether a resume check operation is to be performed for the respective instruction. The techniques of this disclosure may be used to reduce the power consumption of and/or improve the performance of a SIMD system that utilizes a resume check operation to manage the reactivation of deactivated threads.
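A minimal sketch of the per-instruction flag idea follows. It assumes a straight-line program, one resume counter per thread, and a boolean `resume_check` field on each instruction; none of these details come from the abstract, and the names `Instruction` and `run_resume_checks` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Instruction:
    opcode: str
    resume_check: bool  # set by the (hypothetical) compiler

def run_resume_checks(program: List[Instruction],
                      resume_counters: List[Optional[int]],
                      active: List[bool]) -> int:
    """Walk the program once and return how many resume checks were performed.

    A thread is reactivated only at an instruction whose flag is set and
    whose address equals that thread's resume counter. `active` is updated
    in place.
    """
    checks_performed = 0
    for pc, instr in enumerate(program):
        if not instr.resume_check:
            continue  # check skipped for this instruction
        checks_performed += 1
        for thread, resume_pc in enumerate(resume_counters):
            if not active[thread] and resume_pc == pc:
                active[thread] = True
    return checks_performed
```

In this sketch the saving comes from the `continue` branch: instructions that cannot be a resume point never pay for the comparison.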
Methods And Apparatus For Tensor Object Support In Machine Learning Workloads
- San Diego CA, US Liang LI - San Diego CA, US Andrew Evan GRUBER - Arlington MA, US Jeffrey LEGER - Tyngsboro MA, US Balaji CALIDAS - San Diego CA, US
International Classification:
G06T 1/60
Abstract:
The present disclosure relates to methods and devices for graphics processing including an apparatus, e.g., a GPU. The apparatus may modify at least one texture memory object to support a data structure for one or more tensor objects. The apparatus may also determine one or more supported memory layouts for the one or more tensor objects based on the modified at least one texture memory object. Additionally, the apparatus may access data associated with the one or more tensor objects based on the one or more supported memory layouts, the data for each of the one or more tensor objects corresponding to at least one data instruction. The apparatus may also execute the at least one data instruction based on the accessed data associated with the one or more tensor objects.
Out Of Order Wave Slot Release For A Terminated Wave
- San Diego CA, US Chun YU - Rancho Santa Fe CA, US Andrew Evan GRUBER - Arlington MA, US Zilin YING - San Diego CA, US Baoguang YANG - Fremont CA, US
International Classification:
G06T 1/20 G06T 11/00
Abstract:
Methods, systems, and devices for image processing are described. A device may determine, based on a test operation, to terminate a first wave associated with a first slot of a set of slots. The device may update a terminated wave bit associated with the first slot based on the determination to terminate the first wave. In some aspects, the device may update a number of invocations field associated with the first wave based on the determination to terminate the first wave. The device may release the first slot based on updating the terminated wave bit and the number of invocations field. In some examples, the device may output the number of invocations field to a rendering backend of the device based on the terminated wave bit.
Diverse Redundancy Approach For Safety Critical Applications
- San Diego CA, US Jay Chunsup Yun - Carlsbad CA, US Donghyun Kim - San Diego CA, US Rahul Gulati - San Diego CA, US Brendon Lewis Johnson - San Diego CA, US Andrew Evan Gruber - Arlington MA, US
International Classification:
G06F 11/277 G06T 1/20 G06T 7/00 G06F 11/22
Abstract:
A graphics processing unit (GPU) of a GPU subsystem of a computing device operates in a first rendering mode to process graphics data to produce a first image. The GPU operates in a second rendering mode to process the graphics data to produce a second image. The computing device detects whether a fault has occurred in the GPU subsystem based at least in part on comparing the first image with the second image.
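The comparison step can be sketched in a few lines. The code below assumes images are nested lists of integer pixel values and that a fault is reported when any pixel differs by more than a tolerance; the abstract does not specify the comparison method. The two renderer callables and the names `images_match` and `detect_fault` are hypothetical.

```python
from typing import Callable, Sequence

Image = Sequence[Sequence[int]]

def images_match(first: Image, second: Image, tolerance: int = 0) -> bool:
    """Return True if both images have the same shape and close pixel values."""
    if len(first) != len(second):
        return False
    for row_a, row_b in zip(first, second):
        if len(row_a) != len(row_b):
            return False
        for a, b in zip(row_a, row_b):
            if abs(a - b) > tolerance:
                return False
    return True

def detect_fault(render_mode_one: Callable[[object], Image],
                 render_mode_two: Callable[[object], Image],
                 graphics_data: object,
                 tolerance: int = 0) -> bool:
    """Render the same data in two modes and flag a fault on mismatch."""
    first = render_mode_one(graphics_data)
    second = render_mode_two(graphics_data)
    return not images_match(first, second, tolerance)
```

A nonzero `tolerance` is one possible way to allow for small numerical differences between rendering modes; whether that is needed depends on the modes being compared.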
General Purpose Register Allocation In Streaming Processor
- San Diego CA, US Liang Han - San Diego CA, US Lin Chen - San Diego CA, US Chihong Zhang - San Diego CA, US Hongjiang Shang - San Diego CA, US Jing Wu - San Diego CA, US Zilin Ying - San Diego CA, US Chun Yu - Rancho Santa Fe CA, US Guofang Jiao - San Diego CA, US Andrew Gruber - Arlington MA, US Eric Demers - San Diego CA, US
International Classification:
G06F 9/30 G06F 9/38
Abstract:
Systems and techniques are disclosed for dynamic allocation of general purpose registers based on latency associated with instructions in processor threads. A streaming processor can include general purpose registers configured to store data associated with threads, and a thread scheduler configured to receive allocation information for the general purpose registers, the information describing general purpose registers that are to be assigned as persistent general purpose registers (pGPRs) and volatile general purpose registers (vGPRs). The plurality of general purpose registers can be allocated according to the received information. The streaming processor can include the general purpose registers allocated according to the received information, the allocation being based on execution latencies of instructions included in the threads.
- San Diego CA, US Vineet Goel - San Diego CA, US Maurice Franklin Ribble - Lancaster MA, US Andrew Evan Gruber - Arlington MA, US
International Classification:
G06T 1/20 G06T 1/60
Abstract:
A graphics processing unit (GPU) may rasterize a primitive into a plurality of samples, wherein vertices of the primitive are associated with VRS parameters. The GPU may determine a VRS quality group that comprises one or more sub regions of the plurality of samples based at least in part on the VRS parameters. The GPU may fragment shade a VRS tile that represents the VRS quality group, wherein the VRS tile comprises fewer samples than the VRS quality group. The GPU may amplify the stored VRS tile into shaded fragments that correspond to the VRS quality group.
"The big story that the Census tells us is that we are truly one region from Brigham City down to Santaquin.... We all need to work together as one region to address our challenges" on the small strip where most Utahns live, said Andrew Gruber, executive director of the Wasatch Front Regional Council.
Date: Mar 26, 2012
Category: Business
Source: Google
Googleplus
Andrew Gruber
Work:
On the Border Mexican Grill & Cantina - Bartender/ Server/ Marketing Team (6)