Bo Thiesson - Kirkland WA 98033 Christopher A. Meek - Kirkland WA 98033 David Maxwell Chickering - Redmond WA 98052 David Earl Heckerman - Bellevue WA 98008
International Classification:
G06N 3/02
US Classification:
706/52, 706/45
Abstract:
The invention employs mixtures of Bayesian networks to perform clustering. A mixture of Bayesian networks (MBN) consists of plural hypothesis-specific Bayesian networks (HSBNs) having possibly hidden and observed variables. A common external hidden variable is associated with the MBN, but is not included in any of the HSBNs. The number of HSBNs in the MBN corresponds to the number of states of the common external hidden variable, and each HSBN is based upon the hypothesis that the common external hidden variable is in a corresponding one of those states. In one mode of the invention, the MBN having the highest MBN score is selected for use in performing inferencing. The invention determines membership of an individual case in a cluster based upon a set of data of plural individual cases by first learning the structure and parameters of an MBN given that data and then using the MBN to compute the probability of each HSBN generating the data of the individual case.
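As a hedged illustration of the membership computation described above, the sketch below scores a case against each hypothesis-specific network and normalizes over the states of the common external hidden variable. The likelihood callables and weights are toy stand-ins, not learned HSBN parameters:

import numpy as np

def mbn_cluster_posterior(case, hsbn_likelihoods, mixture_weights):
    # P(C = c | case) for each state c of the common external hidden variable C.
    # hsbn_likelihoods: one callable per HSBN returning P(case | C = c).
    # mixture_weights:  P(C = c), one weight per HSBN.
    joint = np.array([w * f(case) for f, w in zip(hsbn_likelihoods, mixture_weights)])
    return joint / joint.sum()  # normalize over the states of C

# Toy stand-ins: two "HSBNs" reduced to single-variable Bernoulli likelihoods.
hsbns = [
    lambda x: 0.9 if x == 1 else 0.1,  # cluster 0 favors x = 1
    lambda x: 0.2 if x == 1 else 0.8,  # cluster 1 favors x = 0
]
print(mbn_cluster_posterior(1, hsbns, mixture_weights=[0.5, 0.5]))  # ~[0.82, 0.18]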
Mixtures Of Bayesian Networks With Decision Graphs
Bo Thiesson - Kirkland WA Christopher A. Meek - Kirkland WA David Maxwell Chickering - Redmond WA David Earl Heckerman - Bellevue WA
Assignee:
Microsoft Corporation - Redmond WA
International Classification:
G06N 3/02
US Classification:
706/52, 706/45
Abstract:
One aspect of the invention is the construction of mixtures of Bayesian networks. Another aspect of the invention is the use of such mixtures of Bayesian networks to perform inferencing. A mixture of Bayesian networks (MBN) consists of plural hypothesis-specific Bayesian networks (HSBNs) having possibly hidden and observed variables. A common external hidden variable is associated with the MBN, but is not included in any of the HSBNs. The number of HSBNs in the MBN corresponds to the number of states of the common external hidden variable, and each HSBN is based upon the hypothesis that the common external hidden variable is in a corresponding one of those states. In one mode of the invention, the MBN having the highest MBN score is selected for use in performing inferencing. In another mode of the invention, some or all of the MBNs are retained as a collection of MBNs which perform inferencing in parallel, their outputs being weighted in accordance with the corresponding MBN scores and the MBN collection output being the weighted sum of all the MBN outputs. In one application of the invention, collaborative filtering may be performed by defining the observed variables to be choices made among a sample of users and the hidden variables to be the preferences of those users.
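The weighted-collection mode lends itself to a short sketch: each retained MBN runs inference on the same query, and the collection output is the score-weighted sum. The inference callables and scores below are placeholders; in the abstract the weights derive from the learned MBN scores:

import numpy as np

def mbn_collection_inference(query, mbn_infer_fns, mbn_scores):
    # mbn_infer_fns: per-MBN inference callables returning an output vector.
    # mbn_scores:    per-MBN scores (e.g. log marginal likelihoods).
    scores = np.asarray(mbn_scores, dtype=float)
    weights = np.exp(scores - scores.max())  # numerically stable weighting
    weights /= weights.sum()
    outputs = np.array([f(query) for f in mbn_infer_fns])
    return weights @ outputs                 # weighted sum of the MBN outputs

Selecting only the highest-scoring MBN, as in the first mode, is the degenerate case where one weight is 1 and the rest are 0.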
Collaborative Filtering With Mixtures Of Bayesian Networks
Bo Thiesson - Kirkland WA Christopher A. Meek - Kirkland WA David Maxwell Chickering - Redmond WA David Earl Heckerman - Bellevue WA
Assignee:
Microsoft Corporation - Redmond WA
International Classification:
G06N 3/02
US Classification:
706/52, 706/45
Abstract:
One aspect of the invention is the construction of mixtures of Bayesian networks. Another aspect of the invention is the use of such mixtures of Bayesian networks to perform inferencing. A mixture of Bayesian networks (MBN) consists of plural hypothesis-specific Bayesian networks (HSBNs) having possibly hidden and observed variables. A common external hidden variable is associated with the MBN, but is not included in any of the HSBNs. The number of HSBNs in the MBN corresponds to the number of states of the common external hidden variable, and each HSBN is based upon the hypothesis that the common external hidden variable is in a corresponding one of those states. In one mode of the invention, the MBN having the highest MBN score is selected for use in performing inferencing. In another mode of the invention, some or all of the MBNs are retained as a collection of MBNs which perform inferencing in parallel, their outputs being weighted in accordance with the corresponding MBN scores and the MBN collection output being the weighted sum of all the MBN outputs. In one application of the invention, collaborative filtering may be performed by defining the observed variables to be choices made among a sample of users and the hidden variables to be the preferences of those users.
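For the collaborative-filtering application in the last sentence, a minimal sketch (simplified here to a mixture of independent Bernoulli models, whereas a real HSBN can encode dependencies among the item variables) predicts a user's unseen choice from their observed choices; all parameters are illustrative:

import numpy as np

pi = np.array([0.6, 0.4])             # P(C = c): one weight per HSBN/cluster
theta = np.array([[0.9, 0.8, 0.1],    # P(item chosen | C = 0)
                  [0.2, 0.1, 0.7]])   # P(item chosen | C = 1)

def predict_unseen_choice(choices, item):
    # choices: observed 0/1 choices over all items; `item` is the one to predict.
    known = [i for i in range(theta.shape[1]) if i != item]
    obs = np.asarray(choices)[known]
    lik = np.prod(np.where(obs == 1, theta[:, known], 1 - theta[:, known]), axis=1)
    post = pi * lik
    post /= post.sum()                # posterior over the hidden preference states
    return post @ theta[:, item]      # mixture prediction for the unseen item

# A user who chose items 0 and 1; their choice on item 2 is to be predicted:
print(predict_unseen_choice([1, 1, 0], item=2))  # ~0.11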
D. Maxwell Chickering - Redmond WA David E. Heckerman - Bellevue WA Christopher A. Meek - Kirkland WA Robert L. Rounthwaite - Fall City WA Amir Netz - Bellevue WA Thierry D'Hers - Issaquah WA
Assignee:
Microsoft Corporation - Redmond WA
International Classification:
G06F 17/30
US Classification:
707/10, 707/104.1
Abstract:
Visualization of high-dimensional data sets is disclosed, particularly the display of a network model for a data set. The network, such as a dependency or a Bayesian network, has a number of nodes having dependencies thereamong. The network can be displayed as items and connections, corresponding to nodes and dependencies, respectively. Selection of a particular item in one embodiment results in the display of the local distribution associated with the node for the item. In one embodiment, only a predetermined number of the items are shown, such as only the items representing the most popular nodes. Furthermore, in one embodiment, in response to receiving a user input, a sub-set of the connections is displayed, proportional to the user input. In another embodiment, a particular item is displayed in an emphasized manner, and the particular connections representing dependencies including the node represented by the particular item, as well as the items representing nodes also in these dependencies, are also displayed in the emphasized manner. Furthermore, in one embodiment, only an indicated sub-set of the items is displayed.
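The display thresholds described above reduce to a small filtering step: keep only the most popular nodes, then reveal a fraction of the strongest connections proportional to the user's input. A hypothetical sketch, with made-up data structures:

def visible_graph(nodes, edges, popularity, top_n, slider):
    # nodes: node ids; edges: (node, node, strength) triples;
    # popularity: node id -> popularity score; slider: user input in [0, 1].
    shown = set(sorted(nodes, key=lambda n: popularity[n], reverse=True)[:top_n])
    candidates = [e for e in edges if e[0] in shown and e[1] in shown]
    candidates.sort(key=lambda e: e[2], reverse=True)  # strongest dependencies first
    k = round(slider * len(candidates))                # subset proportional to input
    return shown, candidates[:k]

print(visible_graph(['a', 'b', 'c'], [('a', 'b', 0.9), ('b', 'c', 0.3)],
                    {'a': 5, 'b': 4, 'c': 1}, top_n=2, slider=1.0))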
Preference-Based Catalog Browser That Utilizes A Belief Network
David E. Heckerman - Bellevue WA Christopher A. Meek - Kirkland WA Usama M. Fayyad - Mercer Island WA
Assignee:
Microsoft Corporation - Redmond WA
International Classification:
G06F 17/60
US Classification:
705/27, 706/12
Abstract:
An electronic shopping aid is provided that assists a user in selecting a product from an electronic catalog of products based on their preferences for various features of the products. Since the electronic shopping aid helps a user select a product based on the user's preferences, it is referred to as a preference-based product browser. In using the browser, the user initially inputs an indication of their like or dislike for various features of the products as well as an indication of how strongly they feel about the like or dislike. The browser then utilizes this information to determine a list of products in which the user is most likely interested. As part of this determination, the browser performs collaborative filtering and bases the determination on what other users with similar characteristics (e.g., age and income) have liked. After creating this list, the browser displays the list and also displays a list of features for which the user has not indicated either a like or a dislike and which the browser has identified as being most relevant to the determination of the products that the user may like.
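A hedged sketch of the ranking step: the user's signed feature strengths score each product, and features the user has not yet rated are surfaced for further input. The belief-network and collaborative-filtering machinery of the abstract is not reproduced here, and occurrence counts stand in for true relevance:

def rank_products(products, prefs):
    # products: {name: set of features}; prefs: {feature: signed strength},
    # positive for likes, negative for dislikes.
    score = lambda feats: sum(w for f, w in prefs.items() if f in feats)
    return sorted(products, key=lambda p: score(products[p]), reverse=True)

def unrated_features(products, prefs):
    # Features the user has not rated, ordered by how often they occur
    # (a crude stand-in for "most relevant to the determination").
    counts = {}
    for feats in products.values():
        for f in feats - prefs.keys():
            counts[f] = counts.get(f, 0) + 1
    return sorted(counts, key=counts.get, reverse=True)

catalog = {'camera A': {'zoom', 'flash'}, 'camera B': {'zoom', 'waterproof'}}
prefs = {'zoom': 2.0, 'flash': -1.0}
print(rank_products(catalog, prefs), unrated_features(catalog, prefs))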
David E. Heckerman - Bellevue WA D. Maxwell Chickering - Bellevue WA John C. Platt - Bellevue WA Christopher A. Meek - Kirkland WA Bo Thiesson - Woodinville WA
Assignee:
Microsoft Corporation - Redmond WA
International Classification:
G06N 5/02
US Classification:
706/45, 706/26, 706/20, 706/25
Abstract:
Clustering for purposes of data visualization and making predictions is disclosed. Embodiments of the invention are operable on a number of variables that have a predetermined representation. The variables include input-only variables, output-only variables, and input-and-output variables. Embodiments of the invention generate a model that has a bottleneck architecture. The model includes a top layer of nodes of at least the input-only variables, one or more middle layers of hidden nodes, and a bottom layer of nodes of the output-only and the input-and-output variables. At least one cluster is determined from this model. The model can be a probabilistic neural network and/or a Bayesian network.
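The bottleneck architecture can be sketched as follows: input variables feed a narrow middle layer whose most active node serves as the cluster label, and the middle layer feeds the output variables. The weights here are random stand-ins rather than trained parameters:

import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 5, 3, 4          # 3 hidden nodes -> up to 3 clusters
W1 = rng.normal(size=(n_in, n_hidden))   # top layer -> bottleneck
W2 = rng.normal(size=(n_hidden, n_out))  # bottleneck -> bottom layer

def forward(x):
    h = np.exp(x @ W1)
    h /= h.sum()                         # softmax bottleneck activations
    y = 1 / (1 + np.exp(-(h @ W2)))      # sigmoid output-layer activations
    return h, y

def cluster_of(x):
    h, _ = forward(x)
    return int(np.argmax(h))             # cluster = most active bottleneck node

print(cluster_of(rng.normal(size=n_in)))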
System And Method For Approximating Probabilities Using A Decision Tree
Christopher A. Meek - Kirkland WA David M. Chickering - Bellevue WA Jeffrey R. Bernhardt - Woodinville WA Robert L. Rounthwaite - Fall City WA
Assignee:
Microsoft Corporation - Redmond WA
International Classification:
G06F 15/18
US Classification:
706/12, 706/14, 706/46
Abstract:
Disclosed is a system for approximating conditional probabilities using an annotated decision tree where predictor values that did not exist in training data for the system are tracked, stored, and referenced to determine if statistical aggregation should be invoked. Further disclosed is a system for storing statistics for deriving a non-leaf probability corresponding to predictor values, and a system for aggregating such statistics to approximate conditional probabilities.
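The back-off behavior admits a compact sketch: descend the annotated tree while each predictor value was seen in training, and when an unseen value is hit, approximate the conditional probability from the statistics stored at that internal node. The node layout below is hypothetical:

class Node:
    def __init__(self, counts, split=None, children=None):
        self.counts = counts            # target-value counts observed at this node
        self.split = split              # predictor used to branch, or None (leaf)
        self.children = children or {}  # predictor value -> child Node

def approx_probability(node, case, target_value):
    while node.split is not None:
        child = node.children.get(case.get(node.split))
        if child is None:               # unseen predictor value: aggregate here
            break
        node = child
    return node.counts.get(target_value, 0) / sum(node.counts.values())

# Unseen value 'blue' for predictor 'color' falls back to the root statistics:
root = Node({'yes': 30, 'no': 10}, split='color',
            children={'red': Node({'yes': 28, 'no': 2}),
                      'green': Node({'yes': 2, 'no': 8})})
print(approx_probability(root, {'color': 'blue'}, 'yes'))  # 0.75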
Classification System Trainer Employing Maximum Margin Back-Propagation With Probabilistic Outputs
Christopher A. Meek - Kirkland WA John C. Platt - Bellevue WA
Assignee:
Microsoft Corporation - Redmond WA
International Classification:
G06N 3/02
US Classification:
706/25, 706/12
Abstract:
A training system for a classifier utilizes both a back-propagation system to iteratively modify parameters of functions which provide raw output indications of desired categories, wherein the parameters are modified based on weight decay, and a probability-determining system with further parameters that are determined during iterative training. A margin error metric may be combined with weight decay, and a sigmoid is used to calibrate the raw outputs to probability percentages for each category. A method of training such a system involves gathering a training set of inputs and desired corresponding outputs. Classifier parameters are then initialized and an error margin is calculated with respect to the classifier parameters. Weight decay is then used to adjust the parameters. After a selected number of times through the training set, the parameters are deemed in final form, and an optimization routine is used to derive a set of probability transducer parameters for use in calculating the probable classification for each input.
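The two-stage recipe (margin-based training with weight decay, then a sigmoid probability transducer fit to the raw outputs) can be sketched in plain NumPy; this illustrates the general technique, not the patented training system:

import numpy as np

def train_max_margin(X, y, decay=0.01, lr=0.1, epochs=200):
    # y in {-1, +1}; hinge (margin) error with L2 weight decay.
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        active = y * (X @ w) < 1                     # cases violating the margin
        grad = -(y[active] @ X[active]) / len(y) + decay * w
        w -= lr * grad
    return w

def fit_sigmoid(scores, y, lr=0.1, epochs=500):
    # Fit P(y=1 | score) = 1 / (1 + exp(A*score + B)) by gradient descent
    # on the cross-entropy (a Platt-style probability transducer).
    A, B = -1.0, 0.0
    t = (y + 1) / 2                                  # targets in {0, 1}
    for _ in range(epochs):
        p = 1 / (1 + np.exp(A * scores + B))
        A -= lr * ((t - p) * scores).mean()
        B -= lr * (t - p).mean()
    return A, B

X = np.array([[1.0, 2.0], [2.0, 1.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
w = train_max_margin(X, y)
A, B = fit_sigmoid(X @ w, y)
print(1 / (1 + np.exp(A * (X @ w) + B)))             # calibrated probabilities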