Text Analytics, Text Mining, Natural Language Processing, Information Retrieval, Information Extraction, Software Engineering, Machine Learning, Artificial Intelligence, Hadoop, Computer Science, Data Mining, Object Oriented Design, Java, Python, Distributed Systems, SQL, Eclipse, Programming, Recommender Systems, Sentiment Analysis, Perl, Software Development, R, C, Statistical Modeling, Algorithms, Big Data, Linux, JUnit, Predictive Analytics, Software Design, Pattern Recognition, Social Network Analysis, Predictive Modeling, Knowledge Representation, MapReduce, Human Computer Interaction, Search, Unix, Shell Scripting, Computational Linguistics, Data Visualization, Optimization, Apache Pig, Data Science, Analytics, Scalability, NoSQL
Systems, Methods, Interfaces And Software For Automated Collection And Integration Of Entity Data Into Online Databases And Professional Directories
Yohendran Arumainayagam - Stamford CT, US; Christopher C. Dozier - Minneapolis MN, US
Assignee:
Thomson Reuters Global Resources - Zug
International Classification:
G06F 17/30
US Classification:
707101, 707 4, 707 7, 707 10
Abstract:
An information-retrieval system includes a server that receives queries for documents from client devices and means for outputting results of queries to the client devices, with the results provided in association with one or more interactive control features that are selectable to invoke display of information regarding entities, such as professionals, referenced in the results.
Systems, Methods, Interfaces And Software For Automated Collection And Integration Of Entity Data Into Online Databases And Professional Directories
Yohendran Arumainayagam - Lakeville MN, US; Christopher C. Dozier - Minneapolis MN, US
Assignee:
Thomson Reuters Global Resources - Baar
International Classification:
G06F 17/30
US Classification:
707749, 707752, 707802, 707803
Abstract:
An information-retrieval system includes a server that receives queries for documents from client devices and means for outputting results of queries to the client devices, with the results provided in association with one or more interactive control features that are selectable to invoke display of information regarding entities, such as professionals, referenced in the results.
Systems, Methods, And Software For Assessing Ambiguity Of Medical Terms
Christopher Dozier - Minneapolis MN, US; Mark Chaudhary - Eagan MN, US; Ravi Kondadadi - Eagan MN, US
Assignee:
West Services, Inc. - Eagan MN
International Classification:
G06F 17/30
US Classification:
707005000
Abstract:
Some known medical terms may function as non-medical terms depending on their particular context. Accordingly, the present inventors devised systems, methods, and software that facilitate determining whether a term that is found in a medical corpus is likely to be a medical term when found in another corpus. An exemplary embodiment receives a term and computes an ambiguity score based on language models for a medical and a non-medical corpus.
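The abstract does not say how the ambiguity score is computed from the two language models. As a minimal sketch, assuming add-alpha smoothed unigram models and a log-probability ratio as the score (both are illustrative choices, and the function names and toy corpora are hypothetical), the idea might look like this:

```python
import math
from collections import Counter

def unigram_model(tokens, vocab_size, alpha=1.0):
    """Build an add-alpha smoothed unigram model; returns a term -> probability function."""
    counts = Counter(tokens)
    total = len(tokens)

    def prob(term):
        return (counts[term] + alpha) / (total + alpha * vocab_size)

    return prob

def ambiguity_score(term, p_medical, p_general):
    """Log-ratio of general to medical probability.

    Assumption: a positive score means the term is relatively more typical of
    the non-medical corpus, so it is more likely to be ambiguous elsewhere.
    """
    return math.log(p_general(term)) - math.log(p_medical(term))

# Toy corpora (hypothetical data, for illustration only).
medical = "patient received a stent after the culture showed infection".split()
general = "the culture of the city shaped the culture of its people".split()
vocab = set(medical) | set(general)

p_med = unigram_model(medical, len(vocab))
p_gen = unigram_model(general, len(vocab))

stent_score = ambiguity_score("stent", p_med, p_gen)
culture_score = ambiguity_score("culture", p_med, p_gen)
```

With these toy corpora, "culture" scores higher than "stent", which matches the intuition that "culture" is often used in a non-medical sense.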
Systems, Methods, And Software For Hyperlinking Names
Hyperlinking or associating documents to other documents based on the names of people in the documents has become more desirable. Although there is an automated system for installing such hyperlinks into judicial opinions, the system is not generally applicable to other types of names and documents, nor well suited to determine hyperlinks for names that might refer to two or more similarly named persons. Accordingly, the inventor devised systems, methods, and software that facilitate hyperlinking names in documents, regardless of type. One exemplary system includes a descriptor module and a linking module. The descriptor module develops descriptive patterns for selecting co-occurring document information that is useful in recognizing associations between names and professional classes. The linking module tags names in an input document, extracts co-occurring information using the descriptive patterns, and uses a Bayesian inference network that processes a (non-inverse-document-frequency) name-rarity score for each name along with the name and selected co-occurring document information to determine appropriate hyperlinks to other documents, such as entries in professional directories.
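The abstract does not define the name-rarity score or the structure of the Bayesian inference network. The sketch below is a deliberately simplified stand-in: it assumes rarity is the reciprocal of the expected number of people sharing a full name, uses that rarity as a prior, and combines co-occurring evidence as independent likelihood ratios. All names, frequencies, and ratios are hypothetical.

```python
def name_rarity(first, last, first_freq, last_freq, population):
    """Assumed rarity: 1 / expected number of people with this full name, capped at 1."""
    expected = first_freq[first] * last_freq[last] * population
    return min(1.0, 1.0 / expected)

def link_probability(rarity, likelihood_ratios):
    """Combine a rarity-based prior with evidence via Bayes' rule in odds form.

    Each likelihood ratio is P(evidence | same person) / P(evidence | different person).
    Treating the evidence as independent is a simplifying assumption.
    """
    prior = min(max(rarity, 0.001), 0.999)  # keep the odds finite
    odds = prior / (1.0 - prior)
    for ratio in likelihood_ratios:
        odds *= ratio
    return odds / (1.0 + odds)

# Hypothetical name frequencies and population size.
first_freq = {"john": 0.03, "zebediah": 0.000001}
last_freq = {"smith": 0.01, "quillfeather": 0.000001}
population = 1_000_000

# Hypothetical evidence: matching city and matching professional term.
evidence = [5.0, 3.0]

common = link_probability(
    name_rarity("john", "smith", first_freq, last_freq, population), evidence)
rare = link_probability(
    name_rarity("zebediah", "quillfeather", first_freq, last_freq, population), evidence)
```

Under these assumptions, the same co-occurring evidence justifies a link for the rare name but not for the common one, which is the ambiguity problem the abstract describes.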
Systems, Methods, And Software For Entity Relationship Resolution
Jack G. Conrad - Eagan MN, US; Christopher C. Dozier - Minneapolis MN, US; Sriharsha Veeramachaneni - St. Paul MN, US
International Classification:
G06F 17/30
US Classification:
707 5, 707E17014
Abstract:
To facilitate access to public records, the present inventors devised, among other things, an entity resolution system. The exemplary system includes a master records database of 300 million entities, which is partitioned into multiple distinct portions. The exemplary system extracts entity information from input public records and constructs one or more blocking queries against specific portions of the master records database to identify one or more sets of candidate records. Feature vectors are defined for the candidate records, and machine learning techniques, such as Support Vector Machines, are used to determine which of the candidate records from the master records database match the input public records. Candidate records that match are logically associated with the public records, enabling ready access via direct or indirect queries.
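The blocking, feature-vector, and classification steps can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the blocking key, the three features, and the hand-set linear weights (standing in for a trained linear SVM decision function) are all hypothetical.

```python
from collections import defaultdict
from difflib import SequenceMatcher

def block_key(record):
    """Hypothetical blocking key: last-name initial plus state."""
    return (record["last"][:1].lower(), record["state"])

def build_index(master_records):
    """Partition the master records by blocking key."""
    index = defaultdict(list)
    for rec in master_records:
        index[block_key(rec)].append(rec)
    return index

def similarity(x, y):
    return SequenceMatcher(None, x.lower(), y.lower()).ratio()

def features(a, b):
    """Feature vector for a candidate pair: name similarities and exact date-of-birth match."""
    return [
        similarity(a["first"], b["first"]),
        similarity(a["last"], b["last"]),
        1.0 if a["dob"] == b["dob"] else 0.0,
    ]

# Hand-set weights and bias, assumed here in place of a trained linear SVM.
WEIGHTS = [1.0, 1.5, 2.0]
BIAS = -3.0

def is_match(a, b):
    score = sum(w * f for w, f in zip(WEIGHTS, features(a, b))) + BIAS
    return score > 0

def resolve(record, index):
    """Run the blocking query, then classify each candidate."""
    candidates = index.get(block_key(record), [])
    return [c for c in candidates if is_match(record, c)]

master = [
    {"id": 1, "first": "Jon", "last": "Smith", "state": "MN", "dob": "1970-01-01"},
    {"id": 2, "first": "Jane", "last": "Smith", "state": "MN", "dob": "1982-05-09"},
    {"id": 3, "first": "John", "last": "Smith", "state": "CT", "dob": "1970-01-01"},
]
index = build_index(master)
incoming = {"first": "John", "last": "Smith", "state": "MN", "dob": "1970-01-01"}
matched_ids = [c["id"] for c in resolve(incoming, index)]
```

Blocking keeps the pairwise comparison tractable: only records sharing the key are scored, so record 3 is never compared even though its name and date of birth agree with the input.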
Systems, Methods, And Software For Entity Extraction And Resolution Coupled With Event And Relationship Extraction
Marc Light - St. Paul MN, US; Frank Schilder - St. Paul MN, US; Ravi Kumar Kondadadi - Eagan MN, US; Christopher C. Dozier - Minneapolis MN, US; Wenhui Liao - Minneapolis MN, US; Sriharsha Veeramachaneni - St. Paul MN, US
International Classification:
G06N 5/02
US Classification:
706 47, 706 50, 706 54
Abstract:
For automated text processing, the inventors devised, among other things, an exemplary system that includes an entity tagger, an entity resolver, a text segment classifier, and a relationship extractor. The entity tagger receives an input text segment and tags named entities within the segment as being a person, company, or place. The entity resolver accesses authority files and associates the persons and companies named in the text segment with specific entries in the files. The text segment classifier determines whether the text segment includes a relationship event, such as a job-change event or a merger-and-acquisition event, and if an event is detected, the relationship extractor determines the event role of the entities named in the segment. For example, for a merger-and-acquisition event, the extractor determines which named company was the acquirer and which was acquired.
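The four components can be chained as in the toy pipeline below. It is only a sketch of the data flow: a gazetteer lookup and regular expressions stand in for the trained tagger, classifier, and extractor, and the authority file contents and identifiers are hypothetical.

```python
import re

# Hypothetical authority file mapping company names to authority IDs.
AUTHORITY = {"Acme Corp": "ORG-001", "Globex Inc": "ORG-002"}

def tag_entities(text):
    """Entity tagger: gazetteer lookup assumed in place of a trained model."""
    found = []
    for name in AUTHORITY:
        for m in re.finditer(re.escape(name), text):
            found.append({"text": name, "type": "company", "start": m.start()})
    return sorted(found, key=lambda e: e["start"])

def resolve_entities(entities):
    """Entity resolver: attach the authority-file ID to each tagged entity."""
    for e in entities:
        e["authority_id"] = AUTHORITY.get(e["text"])
    return entities

def classify_segment(text):
    """Segment classifier: keyword rule assumed in place of a trained classifier."""
    if re.search(r"\b(acquir\w*|acquisition|merg\w*)\b", text, re.I):
        return "merger_acquisition"
    return None

def extract_roles(text, entities):
    """Relationship extractor: assign acquirer/acquired from word order and voice."""
    verb = re.search(r"\bacquir\w*", text, re.I)
    if verb is None:
        return {}
    before = [e for e in entities if e["start"] < verb.start()]
    after = [e for e in entities if e["start"] > verb.start()]
    if not before or not after:
        return {}
    passive = re.match(r"acquired by\b", text[verb.start():], re.I) is not None
    first, second = before[-1], after[0]
    acquirer, acquired = (second, first) if passive else (first, second)
    return {"acquirer": acquirer["authority_id"], "acquired": acquired["authority_id"]}

def process(text):
    entities = resolve_entities(tag_entities(text))
    event = classify_segment(text)
    if event is None:
        return None
    return {"event": event, "roles": extract_roles(text, entities)}

active = process("Acme Corp acquired Globex Inc for $2 billion.")
passive = process("Globex Inc was acquired by Acme Corp.")
no_event = process("Acme Corp opened a new office.")
```

Both the active and the passive sentence resolve to the same roles, while the third segment is filtered out by the classifier before role extraction runs.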
Systems, Methods And Software For Entity Relationship Resolution
Christopher Dozier - Minneapolis MN, US; Souptik Datta - Jersey City NJ, US; Merine Thomas - Eagan MN, US
International Classification:
G06F 17/30
US Classification:
707609
Abstract:
A method includes receiving an entity record, wherein the entity record comprises at least one entity field element, and resolving the entity record to an authority record associated with an initial authority file, wherein the authority record comprises at least one authority field element. The method further includes calculating a field element update measurement associated with the at least one entity field element and the at least one authority field element. If the field element update measurement meets or exceeds a threshold, the authority record is updated; otherwise, the authority record is not updated. The method further includes developing, in response to updating the authority record, an updated authority file associated with at least one updated authority record. The method also includes incorporating an additional authority record into the updated authority file.
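The abstract leaves the update measurement itself unspecified. As a minimal sketch of the threshold logic, assume the measurement is the confidence in the incoming record scaled by how much the incoming value differs from the stored one. That formula, the record layout, and the ID scheme are hypothetical.

```python
from difflib import SequenceMatcher

def update_measurement(entity_value, authority_value, source_confidence):
    """Hypothetical measurement: confidence in the source times the amount of change."""
    if not authority_value:
        return source_confidence  # nothing stored yet, so rely on the source alone
    ratio = SequenceMatcher(None, entity_value.lower(), authority_value.lower()).ratio()
    return source_confidence * (1.0 - ratio)

def apply_record(entity, authority, threshold=0.5):
    """Return a copy of the authority record with qualifying fields updated."""
    updated = dict(authority)
    for field, value in entity["fields"].items():
        m = update_measurement(value, authority.get(field, ""), entity["confidence"])
        if m >= threshold:  # meets or exceeds the threshold: update the field
            updated[field] = value
    return updated

def integrate(entity, authority_file, threshold=0.5):
    """Update the resolved authority record, or add a new one if none was resolved."""
    updated_file = dict(authority_file)
    auth_id = entity.get("resolved_to")
    if auth_id in updated_file:
        updated_file[auth_id] = apply_record(entity, updated_file[auth_id], threshold)
    else:
        new_id = "A%d" % (len(updated_file) + 1)
        updated_file[new_id] = dict(entity["fields"])
    return updated_file

initial_file = {"A1": {"name": "Jane Doe", "employer": "Acme Corp"}}

job_change = {"resolved_to": "A1", "confidence": 0.9,
              "fields": {"name": "Jane Doe", "employer": "Globex Inc"}}
weak_source = dict(job_change, confidence=0.3)
unresolved = {"resolved_to": None, "confidence": 0.9,
              "fields": {"name": "Sam Roe", "employer": "Initech"}}

after_update = integrate(job_change, initial_file)
after_weak = integrate(weak_source, initial_file)
after_addition = integrate(unresolved, after_update)
```

The confident record changes the employer field, the low-confidence record leaves the authority record untouched, and the unresolved record is added to the updated authority file as a new entry.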