Joshua Goodman - Redmond WA, US Vitor De Carvalho - Pittsburgh PA, US Kristin Bromm - Mountain View CA, US Denise Hui - San Mateo CA, US
Assignee:
Microsoft Corporation - Redmond WA
International Classification:
G06F 17/30
US Classification:
707001000
Abstract:
A computer-implemented implicit querying system comprises a scanning component that scans content of a document. An analysis component analyzes the scanned content and outputs a query based at least in part upon the analysis and frequency of use information associated with the query. The system can further comprise a weighting component that provides weights to text within the document based at least in part upon location of text within the document. The query can then be output to a user based at least in part upon the provided weights.
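The abstract above combines location-based weights with query frequency information. The following is a minimal sketch of that idea, not the patented implementation: the location names, the weight values, the log scaling, and the function names `score_candidates` and `best_query` are all assumptions made for illustration.

```python
import math
from collections import defaultdict

# Hypothetical location weights; the abstract does not give concrete values.
LOCATION_WEIGHTS = {"subject": 3.0, "body": 1.0}


def score_candidates(document, query_frequency):
    """Rank candidate query terms.

    document: mapping of location name -> text found at that location.
    query_frequency: mapping of term -> how often it appears in a query log.
    """
    scores = defaultdict(float)
    for location, text in document.items():
        weight = LOCATION_WEIGHTS.get(location, 1.0)
        for term in text.lower().split():
            if term in query_frequency:
                # Assumption: dampen raw log counts with log1p so that very
                # popular queries do not dominate the location weight.
                scores[term] += weight * math.log1p(query_frequency[term])
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


def best_query(document, query_frequency):
    """Return the highest-scoring term, or None if nothing matches."""
    ranked = score_candidates(document, query_frequency)
    return ranked[0][0] if ranked else None
```

In this sketch a term that appears in a heavily weighted location can outrank a term that is more frequent in the query log but appears only in the body.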
Joshua Goodman - Redmond WA, US Vitor de Carvalho - Pittsburgh PA, US
Assignee:
Microsoft Corporation - Redmond WA
International Classification:
G06F 17/30
US Classification:
707005000
Abstract:
Extraction analysis techniques biased, in part, by query frequency information from a query log file and/or search engine cache are employed along with machine learning processes to determine candidate keywords and/or phrases of web documents. Web oriented features associated with the candidate keywords and/or phrases are also utilized to analyze the web documents. A keyword and/or phrase extraction mechanism can be utilized to score keywords and/or phrases in a web document and estimate a likelihood that the keywords and/or phrases are relevant, for example, in an advertising system and the like.
Online Stratified Sampling For Classifier Evaluation
Paul N. Bennett - Kirkland WA, US Vitor R. Carvalho - Redmond WA, US
Assignee:
Microsoft Corporation - Redmond WA
International Classification:
G06F 17/30
US Classification:
707740, 707748, 707E17014, 707E1709
Abstract:
To determine if a set of items belongs to a class of interest, the set of items is binned into sub-populations based on a score, ranking, or trait associated with each item. The sub-populations may be created based on the score associated with each item, such as an equal score interval, or with the distribution of the items within the overall population, such as a proportion interval. A determination is made of how many samples are needed from each sub-population in order to make an estimation regarding the entire set of items. The precision and variance are then calculated for each sub-population and combined to provide an overall precision and variance value for the overall population.
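The binning and combination steps described in this abstract can be sketched with a textbook stratified estimator. This is an illustration under stated assumptions, not the patented method: scores are assumed to lie in [0, 1], each bin receives the same fixed number of samples (the abstract's per-bin sample-size determination is not modeled), and the names `bin_by_score` and `stratified_precision` are hypothetical.

```python
import random
from statistics import mean


def bin_by_score(items, num_bins):
    """Equal-score-interval binning of (score, label) pairs, score in [0, 1]."""
    bins = [[] for _ in range(num_bins)]
    for score, label in items:
        idx = min(int(score * num_bins), num_bins - 1)
        bins[idx].append((score, label))
    return bins


def stratified_precision(bins, samples_per_bin, rng):
    """Combine per-bin precision estimates into an overall estimate.

    Labels are 1 if the item belongs to the class of interest, else 0.
    Returns (precision, variance) for the whole population.
    """
    total = sum(len(b) for b in bins)
    precision, variance = 0.0, 0.0
    for b in bins:
        if not b:
            continue
        n = min(samples_per_bin, len(b))
        sample = rng.sample(b, n)
        p = mean(label for _, label in sample)
        weight = len(b) / total  # share of the population in this bin
        precision += weight * p
        if n > 1:
            # Unbiased variance of a sampled proportion, with a finite
            # population correction for sampling without replacement.
            fpc = 1 - n / len(b)
            variance += weight ** 2 * fpc * p * (1 - p) / (n - 1)
    return precision, variance
```

When every bin is sampled exhaustively, the finite population correction drives the variance to zero and the estimate equals the true precision.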
System And Method For Anomaly Detection For Time Series Data
- Mountain View CA, US Karen C. Lo - San Diego CA, US Vitor R. Carvalho - San Diego CA, US
Assignee:
Intuit Inc. - Mountain View CA
International Classification:
G06N 5/04 G06F 16/2458 G06N 20/20 G06N 5/00
Abstract:
Systems and methods may implement an anomaly detection process for time series data. The systems and methods may implement a model ensemble process comprising at least one machine learning model in a supervised class and at least one machine learning model in an unsupervised class.
Method And System For Facilitating User Support Using Multimodal Information
A method for facilitating user support using multimodal information involves obtaining an interaction between a user and a support agent, generating a question embedding from the interaction, obtaining a clickstream associated with the interaction, and generating a clickstream embedding from the clickstream. The question embedding and the clickstream embedding form a shared latent space representation. The method further involves decoding a problem summary from the shared latent space representation and providing the problem summary to the support agent.
- Mountain View CA, US Vitor R. Carvalho - San Diego CA, US Sparsh Gupta - San Diego CA, US
Assignee:
Intuit Inc. - Mountain View CA
International Classification:
G06F 17/27 G06N 3/04 G06N 3/08
Abstract:
A computer-implemented method is provided to perform text classification with a neural network system. The method includes providing a computing device to receive input datasets including user input question text and feed the datasets to the neural network system. The neural network system includes one or more neural networks configured to extract and concatenate character-based features, word-based features from the question datasets and clickstream embeddings of clickstream data to form a representation vector indicative of the question text and user behavior. A representation vector is fed into fully connected layers of a feed-forward network. The feed-forward network is configured to predict a first class and a second class associated with respective user input questions based on the representation vector.
Detecting Duplicated Questions Using Reverse Gradient Adversarial Domain Adaptation
Vitor R. Carvalho - San Diego CA, US Anusha Kamath - Pittsburgh PA, US
Assignee:
Intuit Inc. - Mountain View CA
International Classification:
G06F 16/215 G06F 16/28 G06F 16/2455
Abstract:
Detecting duplicated questions using reverse gradient adversarial domain adaptation includes applying a general network to multiple general question pairs to obtain a first set of losses. A target domain network is applied to multiple domain specific network pairs to obtain a second set of losses. Further, a domain distinguishing network is applied to a set of domain specific questions and a set of general questions to obtain a third set of losses. A set of accumulated gradients is calculated from the first set of losses, the second set of losses, and the third set of losses. Multiple features are updated according to the set of accumulated gradients to train the target domain network.
Collecting Data From A Statistically Significant Group Of Mobile Devices
- San Diego CA, US Brian Fink - Ramona CA, US Michael William Paddon - Shinjuku-ku, JP Craig Brown - Freshwater, AU Vitor Carvalho - San Diego CA, US
International Classification:
H04W 8/24 H04L 29/08 H04W 24/08
Abstract:
Methods, systems, and devices are described for wireless communication to enable data collection from wireless devices in an efficient manner. An aspect of the data collection approaches described herein may involve determining a smaller group of wireless devices from which to collect data. Determining the group may be performed such that the data collected is representative of the wireless devices as a whole. For example, a statistically significant group of wireless devices may be selected to be statistically representative of the wireless devices of the network. Various criteria may be identified for selecting the group. Such criteria may include a specified technique for selecting wireless devices for the group.
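One way to pick a group that is statistically representative is to size a simple random sample with Cochran's formula and a finite population correction. The abstract does not name a selection technique, so this is only a sketch under that assumption; the function names, the 5% margin of error, and the 95% confidence level (z = 1.96) are illustrative choices.

```python
import math
import random


def sample_size(population, margin=0.05, z=1.96, proportion=0.5):
    """Cochran's sample size with a finite population correction."""
    if population <= 0:
        return 0
    n0 = (z ** 2) * proportion * (1 - proportion) / (margin ** 2)
    n = n0 / (1 + (n0 - 1) / population)
    return min(population, math.ceil(n))


def select_devices(device_ids, rng, **kwargs):
    """Draw a simple random sample of devices of the computed size."""
    n = sample_size(len(device_ids), **kwargs)
    return rng.sample(device_ids, n)
```

For example, `select_devices(list(range(10000)), random.Random(0))` returns 370 distinct device identifiers under the default settings.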
Intuit
Principal Data Scientist and Group Manager
Snap Inc. Jun 2015 - Nov 2017
Lead Research Scientist and Manager
University of Washington Nov 2011 - Oct 2017
Advisory Board Member - Data Science Certificate
Qualcomm May 2012 - Jun 2015
Senior Staff Engineer
Intelius Nov 2010 - May 2012
Principal Scientist and Data Engineering Manager
Education:
Carnegie Mellon University 2005 - 2008
Doctorates, Doctor of Philosophy, Computer Science
Carnegie Mellon University
Master of Science, Masters, Computer Science
Universidade Estadual De Campinas
Master of Science, Masters, Telecommunications
Universidade Federal De Pernambuco
Bachelors, Bachelor of Science, Electrical Engineering
Skills:
Machine Learning, Information Retrieval, Data Mining, Algorithms, Text Mining, Natural Language Processing, Software Development, Information Extraction, Computer Science, Software Engineering, Scalability, Python, Big Data, Pattern Recognition, Software Design, MapReduce, Project Management, Artificial Intelligence, Perl, Hadoop, Computer Vision, Search, Solr, Record Linkage, Speech Recognition, Lucene, High Performance Computing, Predictive Modeling
Interests:
Crowdflower, Information Retrieval, Classification (Machine Learning), Seattle, Text Mining, San Diego, Natural Language Processing, Information Extraction, Intelius, Email, Big Data, Social Networks, The Economist, Breaking News, Search, The Wall Street Journal, Computer Science, Amazon Mechanical Turk, Machine Learning, Pittsburgh, Microsoft Com, Freakonomics (2005 Book), Woot
London - Senior Manager at Planet Hollywood. Planet Hollywood Ltd. (Paris Champs-Elysees, Paris EuroDisney, New York, Las Vegas and London), Senior Restaurant Manager currently based in London.
Head of the Services and Clients Division at Institu... Past: Director of Information Systems Services at Secretaria Geral do MTSS, Head of...
Porto - Assistant Professor at Faculdade de Economia do Por... Past: Assistant at Faculdade de Economia do Porto, Trainee Assistant at Faculdade de...