Microsoft
Principal Scientist Manager and Lab Director
Microsoft
Principal Development Lead
Microsoft Aug 2006 - Aug 2008
Researcher
Microsoft Jun 1994 - May 2001
Software Design Engineer
Education:
University of Wisconsin - Madison 2001 - 2006
Doctorates, Doctor of Philosophy, Computer Science
Luther College 1990 - 1994
Bachelors, Bachelor of Arts, Computer Science
Skills:
Distributed Systems, Software Design, Software Engineering, Software Development, Algorithms, Machine Learning, Computer Science, Quality Assurance, C++, C#, Data Mining, Java, Information Retrieval, C
Alan Dale Halverson - Verona WI, US Krishnaram Kenthapadi - Mountain View CA, US Nina Mishra - Newark CA, US Umar Ali Syed - Philadelphia PA, US
Assignee:
Microsoft Corporation - Redmond WA
International Classification:
G06F 7/00 G06F 17/30
US Classification:
707706, 707746, 707765, 707770
Abstract:
Techniques and systems are disclosed for returning temporally-aware results from an Internet-based search query. To determine if a query is temporally-based, one or more query features are collected and input into a trained classifier, yielding a temporal classification for the query. Further, if a query is classified as temporal, the query results are shifted by determining an alternate set of results for the query, and returning one or more alternate results to one or more users. Based on user interactions with the one or more alternate results, the classifier can be updated, for example, by reclassifying the query as non-temporal if the user interactions identify it as such.
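The classify-then-update loop described in this abstract can be illustrated with a small sketch. It assumes a linear scorer over named query features and a perceptron-style correction driven by user feedback; the feature names (`recent_spike`, `news_clicks`), the threshold, and the learning rate are hypothetical, and the patent does not state which classifier or update rule is used.

```python
def classify(features, weights, threshold=0.5):
    """Return True if the weighted feature score marks the query as temporal."""
    score = sum(weights.get(name, 0.0) * value for name, value in features.items())
    return score >= threshold


def update_from_feedback(features, weights, was_temporal, lr=0.2):
    """Perceptron-style update (an assumed rule): adjust the weights only
    when user interactions contradict the current classification."""
    if classify(features, weights) != was_temporal:
        sign = 1.0 if was_temporal else -1.0
        for name, value in features.items():
            weights[name] = weights.get(name, 0.0) + sign * lr * value
    return weights


# Hypothetical features for one query.
weights = {"recent_spike": 0.4, "news_clicks": 0.3}
features = {"recent_spike": 1.0, "news_clicks": 1.0}

before = classify(features, weights)  # classified as temporal
# Users ignored the alternate (fresh) results, so treat the query as non-temporal.
update_from_feedback(features, weights, was_temporal=False)
after = classify(features, weights)
```

After one round of contradicting feedback, the same features fall below the threshold and the query is no longer treated as temporal.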
Performing Parallel Joins On Distributed Database Data
Nikhil Teletia - Madison WI, US Alan Dale Halverson - Verona WI, US José A. Blakeley - Redmond WA, US Milind Madhukar Joshi - Redmond WA, US Jose Aguilar Saborit - Dana Point CA, US
Assignee:
Microsoft Corporation - Redmond WA
International Classification:
G06F 17/30
US Classification:
707714, 707764
Abstract:
The present invention extends to methods, systems, and computer program products for performing parallel joins on distributed database data. Embodiments of the invention include a phased semi-join reduction strategy using replication and shuffle operations to join a first and a second data source. A filter building phase uses replication and pushes down a “Distinct” (e.g., SQL) operator to produce a list of join keys for the first data source (one side of the join). A shuffle phase for the second data source is modified to join to the key list produced in the first phase as a row filtering mechanism. A join phase then joins the first and second data sources.
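The three phases above can be sketched in a single process. This is a minimal illustration that assumes lists of dict rows stand in for distributed partitions; the function names, the hash partitioning, and the sample tables are hypothetical and are not taken from the patent.

```python
from collections import defaultdict


def build_key_filter(first_partitions, key):
    """Filter building phase: a 'distinct' over the join key of the first source.
    The resulting set is what would be replicated to every node."""
    return {row[key] for part in first_partitions for row in part}


def shuffle_with_filter(second_partitions, key, key_filter, num_nodes):
    """Shuffle phase: repartition the second source by join key, dropping rows
    whose key is not in the filter so they are never moved."""
    shuffled = [[] for _ in range(num_nodes)]
    for part in second_partitions:
        for row in part:
            if row[key] in key_filter:
                shuffled[hash(row[key]) % num_nodes].append(row)
    return shuffled


def join_phase(first_partitions, shuffled_second, key):
    """Join phase: hash join of the first source with the surviving rows."""
    by_key = defaultdict(list)
    for part in first_partitions:
        for row in part:
            by_key[row[key]].append(row)
    return [{**left, **right}
            for part in shuffled_second
            for right in part
            for left in by_key.get(right[key], [])]


# Hypothetical data: two partitions per table.
customers = [[{"cust_id": 1, "name": "Ann"}], [{"cust_id": 2, "name": "Bo"}]]
orders = [[{"cust_id": 1, "total": 10}, {"cust_id": 3, "total": 5}],
          [{"cust_id": 2, "total": 7}, {"cust_id": 9, "total": 1}]]

keys = build_key_filter(customers, "cust_id")
shuffled = shuffle_with_filter(orders, "cust_id", keys, num_nodes=2)
joined = sorted(join_phase(customers, shuffled, "cust_id"),
                key=lambda r: r["cust_id"])
```

Only two of the four order rows survive the filter, which is the point of the reduction: rows that cannot match are discarded before the shuffle rather than after it.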
Data Visibility For Nested Transactions In Distributed Systems
- Redmond WA, US Alan Dale HALVERSON - Verona WI, US Sandeep LINGAM - Redmond WA, US Srikumar RANGARAJAN - Sammamish WA, US
International Classification:
G06F 16/27 G06F 9/46 G06F 16/901
Abstract:
Methods for data visibility in nested transactions in distributed systems are performed by systems and devices. Distributed executions of queries are performed in processing systems according to isolation level protocols with unique nested transaction identifiers for data management and versioning across one or more data sets, one or more compute pools, etc., within a logical server via a single transaction manager that oversees the isolation semantics and data versioning. A distributed query processor of the systems and devices performs nested transaction versioning for distributed tasks by generating nested transaction identifiers, encoded in data rows, which are used to enforce correct data visibility. Data visibility is restricted to previously committed data from distributed transactions and tasks, and is blocked for distributed transactions and tasks that run concurrently. Local commits for completed transactions and tasks are used to minimize transaction manager interactions, and instant rollbacks are enabled for aborted transactions and tasks.
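The visibility rule in this abstract can be illustrated with a small sketch. It assumes each row carries the nested identifier of the task that wrote it and that a reader is handed the set of identifiers committed before it started; the `NestedId` and `Row` types, and the choice to let a task see its own writes, are hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NestedId:
    txn: int   # parent distributed transaction
    task: int  # nested task inside that transaction


@dataclass
class Row:
    value: str
    writer: NestedId  # nested transaction identifier encoded in the row


def visible(row, reader, committed):
    """A row is visible if its writer committed before the reader began,
    or (assumption) if the reader wrote it. Rows from concurrent or aborted
    writers are never in `committed`, so they stay hidden without cleanup."""
    return row.writer == reader or row.writer in committed


rows = [Row("a", NestedId(1, 1)),   # committed earlier
        Row("b", NestedId(2, 1)),   # concurrent task, not committed
        Row("c", NestedId(3, 1))]   # written by the reader itself
committed = {NestedId(1, 1)}
reader = NestedId(3, 1)

seen = [r.value for r in rows if visible(r, reader, committed)]
```

In this sketch an aborted task needs no undo work for readers, since its identifier simply never enters the committed set.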
Methods For Automatic Selection Of Degrees Of Parallelism For Efficient Execution Of Queries In A Database System
- Redmond WA, US Rathijit Sen - Madison WI, US Harshada Chavan - Plymouth MN, US Alan Halverson - Verona WI, US
International Classification:
G06F 17/30 G06F 9/52
Abstract:
Methods for automatic selection of degrees of parallelism for efficient execution of queries in a database system are performed by systems and devices. An incoming query associated with a query system is received and features of the incoming query are determined. A system state of the query system and a set of executing queries are also determined, along with a query state of each executing query in the set. At runtime of the incoming query, allocation of a degree of parallelism for executing the query is determined by calculating different possible execution times for the incoming query at least partially concurrently with the set of executing queries. Execution times are calculated for different parallel thread options and based on query features, system state, or query states of executing queries. The execution of the incoming query is initialized with the parallel thread option corresponding to a specific execution completion time.
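A minimal sketch of the selection step follows. It assumes a simple cost model in which a query has a serial part and a perfectly divisible parallel part, and in which system state is reduced to the number of cores not used by running queries; the model, the 5% tolerance, and all names are hypothetical, not the patent's estimator.

```python
def free_cores(total_cores, running_dops):
    """System state (simplified): cores left over by the executing queries."""
    return max(total_cores - sum(running_dops), 1)


def estimate_time(serial_work, parallel_work, dop, cores_available):
    """Assumed cost model: threads beyond the available cores give no speedup."""
    effective = min(dop, cores_available)
    return serial_work + parallel_work / effective


def choose_dop(serial_work, parallel_work, options, cores_available, tolerance=0.05):
    """Estimate a completion time for each thread option, then pick the
    smallest option whose estimate is within `tolerance` of the best one."""
    estimates = {d: estimate_time(serial_work, parallel_work, d, cores_available)
                 for d in options}
    best = min(estimates.values())
    return min(d for d, t in estimates.items() if t <= best * (1 + tolerance))


# Hypothetical state: 8 cores, two running queries using 2 threads each.
available = free_cores(8, [2, 2])
chosen = choose_dop(2.0, 16.0, [1, 2, 4, 8, 16], available)
```

Preferring the smallest option among near-equal estimates is a design choice of this sketch: it avoids handing out threads that the model predicts will not shorten the query.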
- Redmond WA, US Karthik S. Ramachandra - Madison WI, US Alan D. Halverson - Verona WI, US
Assignee:
Microsoft Technology Licensing, LLC - Redmond WA
International Classification:
G06F 17/30 G06F 9/50
Abstract:
According to examples, an apparatus may include a machine readable medium on which is stored machine readable instructions that may cause a processor to, for each of a plurality of resource setting levels, determine resource usage characteristics and execution times of executed workloads, assign, based on the resource usage characteristics of the executed workloads, each of the executed workloads into one of a plurality of resource bins, determine, for each of the resource bins, an average execution time of the executed workloads in the resource bin, determine a total average execution time of the executed workloads from the determined average execution times, identify a lowest total average execution time of the determined total average execution times, determine the resource setting level corresponding to the identified lowest total average execution time, and tune a resource setting to the determined resource setting level.
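The tuning procedure in this abstract maps closely onto a few lines of code. The sketch below assumes that resource usage is a single number between 0 and 1, that bins are fixed low/medium/high ranges, and that the "total average" is the unweighted mean of the per-bin averages; those choices and the sample observations are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def bin_for(usage, edges=(0.33, 0.66)):
    """Assign a workload to bin 0, 1, or 2 by how many edges its usage reaches."""
    return sum(usage >= e for e in edges)


def total_average_time(workloads):
    """Average execution time per bin, then average those bin averages."""
    bins = defaultdict(list)
    for usage, seconds in workloads:
        bins[bin_for(usage)].append(seconds)
    return mean(mean(times) for times in bins.values())


def best_setting(observations):
    """Return the resource setting level with the lowest total average time."""
    return min(observations, key=lambda level: total_average_time(observations[level]))


# Hypothetical observations: setting level -> [(resource usage, execution seconds)].
observations = {
    2: [(0.1, 4.0), (0.2, 6.0), (0.9, 20.0)],
    4: [(0.1, 5.0), (0.2, 5.0), (0.9, 9.0)],
    8: [(0.1, 6.0), (0.2, 8.0), (0.9, 8.0)],
}
tuned_level = best_setting(observations)
```

Averaging per bin first keeps a large number of light workloads from hiding how a setting affects the few heavy ones.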
Method For Optimization Of Imperative Code Executing Inside A Relational Database Engine
- Redmond WA, US Kwanghyun PARK - Madison WI, US Alan Dale HALVERSON - Verona WI, US Conor John CUNNINGHAM - Austin TX, US Cesar Alejandro GALINDO-LEGARIA - Redmond WA, US Kameswara Venkatesh EMANI - Mumbai, IN
International Classification:
G06F 17/30
Abstract:
Processing a database query. A method includes receiving a database query from a user. The database query includes one or more imperative functions. The one or more imperative functions are converted to one or more declarative query representations. The one or more declarative query representations include standardized relational operators included in a relational query language. Further, the one or more declarative query representations are optimizable by a query optimizer of the database. The database query is optimized at the query optimizer to create a query plan by evaluating any declarative query representation originally in the database query received from the user and the one or more declarative query representations.
- Redmond WA, US Vinitha Reddy Gankidi - Madison WI, US Alan D. Halverson - Verona WI, US Jignesh M. Patel - Madison WI, US
International Classification:
G06F 17/30 G06F 3/06
Abstract:
A split-index can be employed for access to external data. The index can be created on a primary data storage system for data stored externally on a secondary data storage system. After creation, the index can be utilized to expedite at least query execution over the externally stored data. The index can be updated upon detection of changes to data. Further, even when the index is not completely up to date, the index can be exploited for query execution. Furthermore, hybrid execution is enabled with the index and without the index.
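Hybrid execution over a partly stale index can be sketched as follows. The example assumes the external data is an append-only list of rows, so the index only needs to remember how many rows it has covered; the `SplitIndex` class and its methods are hypothetical and do not describe the patented design.

```python
from collections import defaultdict


class SplitIndex:
    """Index kept on the primary system for rows stored externally."""

    def __init__(self):
        self.positions = defaultdict(list)  # key value -> external row positions
        self.covered = 0                    # number of external rows indexed so far

    def refresh(self, external_rows, key):
        """Bring the index up to date with rows appended since the last refresh."""
        for pos in range(self.covered, len(external_rows)):
            self.positions[external_rows[pos][key]].append(pos)
        self.covered = len(external_rows)

    def lookup(self, external_rows, key, value):
        """Hybrid execution: use the index for covered rows and scan the
        not-yet-indexed tail, so a stale index still returns complete results."""
        hits = [external_rows[p] for p in self.positions.get(value, [])]
        hits += [r for r in external_rows[self.covered:] if r[key] == value]
        return hits


external = [{"id": 1, "city": "Madison"}, {"id": 2, "city": "Verona"}]
index = SplitIndex()
index.refresh(external, "city")
external.append({"id": 3, "city": "Madison"})  # arrives after the index was built

found_ids = [r["id"] for r in index.lookup(external, "city", "Madison")]
```

The lookup returns both the indexed row and the newly appended one even though the index has not been refreshed.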
Per-Node Custom Code Engine For Distributed Query Processing
- Redmond WA, US Nikhil Teletia - Madison WI, US Hideaki Kimura - Madison WI, US Alan D. Halverson - Verona WI, US Srinath Shankar - Madison WI, US Karthik Ramachandra - Madison WI, US
International Classification:
G06F 17/30
Abstract:
Distributed query processing is often performed by a set of nodes that apply MapReduce to a data set and materialize partial results to storage, which are then aggregated to produce the query result. However, this architecture requires a preconfigured set of database nodes; can only fulfill queries that utilize MapReduce processing; and may be slowed down by materializing partial results to storage. Instead, distributed query processing can be achieved by choosing a node for various portions of the query, and generating customized code for the node that only performs the query portion that is allocated to the node. The node executes the code to perform the query portion, and rather than materializing partial results to storage, streams intermediate query results to a next selected node in the distributed query. Node selection may involve matching the details of the query portion with the characteristics and capabilities of the available nodes.
Centennial, Colorado
Computer Programmer/Applications Software at Vario... Computer Applications, Author, Pianist, Lightworker, Networking
Wrote 2 books: Magnificent Transition and Love Power - The Clean Energy Fuel Of The Future