Victor W. Lee - Santa Clara CA, US Jayashankar Bharadwaj - Saratoga CA, US Daehyun Kim - San Jose CA, US Nalini Vasudevan - Sunnyvale CA, US Albert Hartono - Santa Clara CA, US Sara Baghsorkhi - San Jose CA, US
International Classification:
G06F 7/548
US Classification:
708/441
Abstract:
An apparatus and method are described for performing a vector reduction. For example, an apparatus according to one embodiment comprises: a reduction logic tree comprised of a set of N-1 reduction logic blocks used to perform reduction in a single operation cycle for N vector elements; a first input vector register storing a first input vector communicatively coupled to the set of reduction logic blocks; a second input vector register storing a second input vector communicatively coupled to the set of reduction logic blocks; a mask register storing a mask value controlling a set of one or more multiplexers, each of the set of multiplexers selecting a value directly from the first input vector register or an output containing a processed value from one of the reduction logic blocks; and an output vector register coupled to outputs of the one or more multiplexers to receive the values passed through by each of the multiplexers responsive to the control signals.
Apparatus And Method For Selecting Elements Of A Vector Computation
Jayashankar Bharadwaj - Saratoga CA, US Nalini Vasudevan - Sunnyvale CA, US Victor W. Lee - Santa Clara CA, US Daehyun Kim - San Jose CA, US Albert Hartono - Santa Clara CA, US Sara S. Baghsorkhi - San Jose CA, US
International Classification:
G06F 9/30
US Classification:
712/7
Abstract:
An apparatus and method are described for selecting elements to be used in a vector computation. For example, a method according to one embodiment includes the following operations: specifying whether to identify the first, last or next after last active element of an input mask register using an immediate value; identifying the first, last or next after last active element in the input mask register according to the immediate value; reading a value from an input vector register corresponding to the identified first, last or next after last active element in the input mask register; and writing the value to an output vector register.
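The selection steps in this abstract can be illustrated with a small scalar model. This is only a sketch under stated assumptions, not the patented instruction: the function name `select_element`, the string-valued `mode` argument (standing in for the immediate value), and the behavior when no element is active or when the "next after last" position falls outside the vector are all hypothetical choices that the abstract does not specify.

```python
def select_element(mask, vec, mode):
    """Return the element of `vec` picked out by `mask` and `mode`.

    mask: list of 0/1 values modelling the input mask register.
    vec:  list of values modelling the input vector register.
    mode: 'first', 'last', or 'next_after_last' (hypothetical stand-in
          for the immediate value mentioned in the abstract).
    """
    active = [i for i, bit in enumerate(mask) if bit]
    if not active:
        return None  # assumption: the abstract does not cover an empty mask
    if mode == "first":
        idx = active[0]
    elif mode == "last":
        idx = active[-1]
    elif mode == "next_after_last":
        idx = active[-1] + 1
        if idx >= len(vec):
            return None  # assumption: out-of-range position yields no value
    else:
        raise ValueError("unknown mode: " + repr(mode))
    return vec[idx]


mask = [0, 1, 1, 0]
vec = [10, 20, 30, 40]
print(select_element(mask, vec, "first"))            # 20
print(select_element(mask, vec, "last"))             # 30
print(select_element(mask, vec, "next_after_last"))  # 40
```

A real implementation would write the selected value to an output vector register rather than return it; the return value here simply stands in for that write.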
Systems, Apparatuses, And Methods For Setting An Output Mask In A Destination Writemask Register From A Source Write Mask Register Using An Input Writemask And Immediate
Victor W. Lee - Santa Clara CA, US Daehyun Kim - San Jose CA, US Jayashankar Bharadwaj - Saratoga CA, US Albert Hartono - Santa Clara CA, US Sara Baghsorkhi - San Jose CA, US Nalini Vasudevan - Sunnyvale CA, US
Assignee:
Intel Corporation - Santa Clara CA
International Classification:
G06F 9/30 G06F 15/80
Abstract:
Embodiments of systems, apparatuses, and methods for performing in a computer processor generation of a predicate mask based on vector comparison in response to a single instruction are described.
Method And Apparatus For Speculative Vectorization
Nalini Vasudevan - Sunnyvale CA, US Cheng Wang - San Ramon CA, US Youfeng Wu - Palo Alto CA, US Albert Hartono - Santa Clara CA, US Sara S. Baghsorkhi - San Jose CA, US
International Classification:
G06F 9/38 G06F 9/30 G06F 15/80
Abstract:
An apparatus and method for speculative vectorization. For example, one embodiment of a processor comprises: a queue comprising a set of locations for storing addresses associated with vectorized memory access instructions; and execution logic to execute a first vectorized memory access instruction to access the queue and to compare a new address associated with the first vectorized memory access instruction with existing addresses stored within a specified range of locations within the queue to detect whether a conflict exists, the existing addresses having been previously stored responsive to one or more prior vectorized memory access instructions.
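The queue-based conflict check described in this abstract can be modelled in a few lines of software. The sketch below is an illustration under assumptions, not the hardware design: the class name `AddressQueue`, the method `check_and_record`, and the use of a half-open `[start, end)` slot range to represent the "specified range of locations" are all hypothetical.

```python
class AddressQueue:
    """Toy model of a queue holding addresses of earlier memory accesses."""

    def __init__(self):
        self.slots = []  # addresses stored by prior accesses, oldest first

    def check_and_record(self, address, start, end):
        """Compare `address` with the entries in slots[start:end], then store it.

        Returns True when a matching address is found in that range, which
        models a detected conflict. The half-open range is an assumption.
        """
        conflict = address in self.slots[start:end]
        self.slots.append(address)
        return conflict


q = AddressQueue()
print(q.check_and_record(0x100, 0, 0))  # False: nothing stored yet
print(q.check_and_record(0x200, 0, 1))  # False: slot 0 holds 0x100
print(q.check_and_record(0x100, 0, 2))  # True: slot 0 holds 0x100
```

Restricting the comparison to a range of slots lets a caller ignore entries that cannot conflict, for example addresses recorded by accesses outside the current speculative region.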
Apparatus And Method For Propagating Conditionally Evaluated Values In Simd/Vector Execution Using An Input Mask Register
- Santa Clara CA, US Nalini Vasudevan - Sunnyvale CA, US Victor W. Lee - Santa Clara CA, US Daehyun Kim - San Jose CA, US Albert Hartono - Santa Clara CA, US Sara S. Baghsorkhi - San Jose CA, US
International Classification:
G06F 9/30 G06F 9/38
Abstract:
An apparatus and method for propagating conditionally evaluated values are disclosed. For example, a method according to one embodiment comprises: reading each value contained in an input mask register, each value being a true value or a false value and having a bit position associated therewith; for each true value read from the input mask register, generating a first result containing the bit position of the true value; for each false value read from the input mask register following the first true value, adding the vector length of the input mask register to a bit position of the last true value read from the input mask register to generate a second result; and storing each of the first results and second results in bit positions of an output register corresponding to the bit positions read from the input mask register.
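The rule stated in this abstract can be traced with a short scalar model. This is a sketch under assumptions: the function name `propagate_positions` is hypothetical, the vector length is taken to be the number of mask bits, and false values that appear before the first true value are given `None` because the abstract does not say what they produce.

```python
def propagate_positions(mask):
    """Model the propagation rule on a list of 0/1 mask values.

    True bit at position p            -> p
    False bit after some true bit     -> vector length + position of last true bit
    False bit before any true bit     -> None (assumption: unspecified in abstract)
    """
    vl = len(mask)  # assumption: vector length equals the number of mask bits
    out = []
    last_true = None
    for pos, bit in enumerate(mask):
        if bit:
            last_true = pos
            out.append(pos)
        elif last_true is not None:
            out.append(vl + last_true)
        else:
            out.append(None)
    return out


print(propagate_positions([0, 1, 0, 0, 1, 0]))  # [None, 1, 7, 7, 4, 10]
```

Adding the vector length makes propagated entries distinguishable from directly computed ones: any result greater than or equal to the vector length came from an earlier true position.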
Methods And Systems To Vectorize Scalar Computer Program Loops Having Loop-Carried Dependences
Methods and systems to convert a scalar computer program loop having loop-carried dependences into a vector computer program loop are disclosed. One such method includes, at runtime, identifying, by executing an instruction with one or more processors, a first loop iteration that cannot be executed in parallel with a second loop iteration due to a set of conflicting scalar loop operations. The first loop iteration is executed after the second loop iteration. The method also includes sectioning, by executing an instruction with one or more processors, a vector loop into vector partitions including a first vector partition. The first vector partition executes consecutive loop iterations in parallel and the consecutive loop iterations start at the second loop iteration and end before the first loop iteration.
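The partitioning idea in this abstract can be sketched for a loop of the form `a[writes[i]] = f(a[reads[i]])`. The code below is a simplified illustration, not the disclosed method: the function name `partition_iterations`, the greedy strategy, and the conservative choice to treat a repeated write location as a conflict are all assumptions.

```python
def partition_iterations(reads, writes):
    """Split iterations 0..n-1 into consecutive ranges that can run in parallel.

    A new partition starts at iteration i when i reads or writes a location
    that an earlier iteration in the current partition has written.
    Returns a list of half-open (start, end) iteration ranges.
    """
    n = len(reads)
    partitions = []
    start = 0
    written = set()  # locations written so far in the current partition
    for i in range(n):
        if reads[i] in written or writes[i] in written:
            partitions.append((start, i))  # partition ends before iteration i
            start = i
            written = set()
        written.add(writes[i])
    if start < n:
        partitions.append((start, n))
    return partitions


reads = [5, 6, 0, 7, 3]
writes = [0, 1, 2, 3, 4]
print(partition_iterations(reads, writes))  # [(0, 2), (2, 4), (4, 5)]
```

In this example iteration 2 reads location 0, which iteration 0 writes, so the first partition covers iterations 0 and 1 and ends before iteration 2, matching the "start at the second loop iteration and end before the first loop iteration" wording of the abstract.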
Methods And Systems To Vectorize Scalar Computer Program Loops Having Loop-Carried Dependences
- Santa Clara CA, US Nalini Vasudevan - Sunnyvale CA, US Albert Hartono - Santa Clara CA, US Sara S. Baghsorkhi - San Jose CA, US
International Classification:
G06F 9/45
Abstract:
Methods and systems to convert a scalar computer program loop having loop-carried dependences into a vector computer program loop are disclosed. One such method includes replacing a scalar recurrence operation in the scalar computer program loop with a first vector summing operation and a first vector recurrence operation. The first vector summing operation is to generate a first running sum and the first vector recurrence operation is to generate a first vector. In some examples, the first vector recurrence operation is based on the scalar recurrence operation. Disclosed methods also include inserting: 1) a renaming operation to rename the first vector; 2) a second vector summing operation that is to generate a second running sum; and 3) a second vector recurrence operation to generate a second vector based on the renamed first vector.
Method And Apparatus For Approximating Detection Of Overlaps Between Memory Ranges
- Santa Clara CA, US Nalini Vasudevan - Sunnyvale CA, US Sara S. Baghsorkhi - San Jose CA, US Cheng Wang - San Ramon CA, US Youfeng Wu - Palo Alto CA, US
International Classification:
G06F 11/07 G06F 3/06
Abstract:
A computer-implemented method for managing loop code in a compiler includes using a conflict detection procedure that detects across-iteration dependency for arrays of single memory addresses to determine whether a potential across-iteration dependency exists for arrays of memory addresses for ranges of memory accessed by the loop code.
Googleplus
Nalini Vasudevan
About:
Knowledge becomes wisdom only after it has been put to practical use.