An approach to alignment of transcripts with recorded audio is tolerant of moderate transcript inaccuracies, untranscribed speech, and significant non-speech noise. In one aspect, a number of search terms are formed from the transcript such that each search term is associated with a location within the transcript. Possible locations of the search terms are then determined in the audio recording. The audio recording and the transcript are then aligned using the possible locations of the search terms. In another aspect a search expression is accepted, and then a search is performed for spoken occurrences of the search expression in an audio recording. This search includes searching for text occurrences of the search expression in a text transcript of the audio recording, and searching for spoken occurrences of the search expression in the audio recording.
Jon A. Arrowood - Smyrna GA, US Robert W. Morris - Atlanta GA, US Kenneth K. Griggs - Roswell GA, US
Assignee:
Nexidia, Inc. - Atlanta GA
International Classification:
G10L 15/04
US Classification:
704254, 704257, 704 7
Abstract:
This invention relates to processing of audio files, and more specifically, to an improved technique of searching audio. More particularly, a method and system for processing audio using a multi-stage searching process is disclosed.
An approach to alignment of transcripts with recorded audio is tolerant of moderate transcript inaccuracies, untranscribed speech, and significant non-speech noise. In one aspect, a number of search terms are formed from the transcript such that each search term is associated with a location within the transcript. Possible locations of the search terms are then determined in the audio recording. The audio recording and the transcript are then aligned using the possible locations of the search terms. In another aspect a search expression is accepted, and then a search is performed for spoken occurrences of the search expression in an audio recording. This search includes searching for text occurrences of the search expression in a text transcript of the audio recording, and searching for spoken occurrences of the search expression in the audio recording.
Robert W. Morris - Atlanta GA, US Jon A. Arrowood - Smyrna GA, US Mark A. Clements - Lilburn GA, US Kenneth King Griggs - Roswell GA, US Peter S. Cardillo - Atlanta GA, US Marsal Gavalda - Sandy Springs GA, US
Assignee:
Nexidia Inc. - Atlanta GA
International Classification:
G10L 15/04 G06F 17/30
US Classification:
704251, 707E17039, 704E15001
Abstract:
In one aspect, a method for processing media includes accepting a query. One or more language patterns are identified that are similar to the query. A putative instance of the query is located in the media. The putative instance is associated with a corresponding location in the media. The media in a vicinity of the putative instance is compared to the identified language patterns and data characterizing the putative instance of the query is provided according to the comparing of the media to the language patterns, for example, as a score for the putative instance that is determined according to the comparing of the media to the language patterns.
Jon A. Arrowood - Smyrna GA, US Kenneth King Griggs - Roswell GA, US Marsal Gavalda - Sandy Springs GA, US Robert W. Morris - Atlanta GA, US
Assignee:
Nexidia Inc. - Atlanta GA
International Classification:
G10L 15/26 G06F 17/30
US Classification:
704235, 704E15043, 707759, 707769, 707705, 707802
Abstract:
Some general aspects relate to systems and methods for media processing. One aspect, for example, relates to a method for aligning a multimedia recording with a transcript. A group of search terms is formed from the transcript, with each search term being associated with a location within the transcript. Putative locations of the search terms are determined in a time interval of the multimedia recording. For each search term, zero or more putative locations are determined and, for at least some of the search terms, multiple putative locations are determined in the time interval of the multimedia recording. According to a first sequencing constraint, a first representation of a group of sequences each of a subset of the putative locations of the search terms is formed. A second representation of a group of sequences each of a subset of the search terms is formed. Using the first and the second representations, the time interval of the multimedia recording is partially aligned with the transcript.
Kenneth King Griggs - Roswell GA, US Jon A. Arrowood - Smyrna GA, US
Assignee:
Nexidia Inc. - Atlanta GA
International Classification:
G10L 15/00
US Classification:
704251, 704E15001
Abstract:
Systems, methods, and apparatus, including computer program products for accepting a predetermined vocabulary-dependent characterization of a set of audio signals, the predetermined characterization including an identification of putative occurrences of each of a plurality of vocabulary items in the set of audio signals, the plurality of vocabulary items included in the vocabulary; accepting a new vocabulary item not included in the vocabulary; accepting putative occurrences of the new vocabulary item in the set of audio signals; and generating, by an analysis engine of a speech processing system, an augmented characterization of the set of audio signals based on the identified putative occurrences of the new vocabulary item.
Jacob B. Garland - Marietta GA, US Drew Lanham - Menlo Park CA, US Daryl Kip Watters - Marietta GA, US Marsal Gavalda - Sandy Springs GA, US Mark Finlay - Tucker GA, US Kenneth K. Griggs - Roswell GA, US
Assignee:
Nexidia Inc. - Atlanta GA
International Classification:
G10L 15/04
US Classification:
704254, 704251, 704E15005
Abstract:
In an aspect, in general, a method for aligning an audio recording and a transcript includes receiving a transcript including a plurality of terms, each term of the plurality of terms associated with a time location within a different version of the audio recording, forming a plurality of search terms from the terms of the transcript, determining possible time locations of the search terms in the audio recording, determining a correspondence between time locations within the different version of the audio recording associated with the search terms and the possible time locations of the search terms in the audio recording, and aligning the audio recording and the transcript including updating the time location associated with terms of the transcript based on the determined correspondence.
An approach to alignment of transcripts with recorded audio is tolerant of moderate transcript inaccuracies, untranscribed speech, and significant non-speech noise. In one aspect, a number of search terms are formed from the transcript such that each search term is associated with a location within the transcript. Possible locations of the search terms are then determined in the audio recording. The audio recording and the transcript are then aligned using the possible locations of the search terms. In another aspect a search expression is accepted, and then a search is performed for spoken occurrences of the search expression in an audio recording. This search includes searching for text occurrences of the search expression in a text transcript of the audio recording, and searching for spoken occurrences of the search expression in the audio recording.
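The alignment step described in the abstract above (search terms tied to transcript locations, possible locations found in the audio, then an alignment built from those locations) can be illustrated with a minimal sketch. This is not the patented method: the `Hit` record, the `align` function, the positive confidence scores, and the rule of keeping the highest-scoring chain of hits whose transcript order and audio time both increase are all assumptions made for illustration. The phonetic search that would produce the hits is not shown.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Hit:
    """Hypothetical record for one possible location of a search term."""
    term_index: int  # position of the search term within the transcript
    time: float      # possible location in the audio, in seconds
    score: float     # assumed positive confidence for this location


def align(hits: List[Hit]) -> List[Hit]:
    """Return the highest-scoring chain of hits in which both the
    transcript position and the audio time strictly increase."""
    ordered = sorted(hits, key=lambda h: (h.term_index, h.time))
    if not ordered:
        return []

    n = len(ordered)
    best = [h.score for h in ordered]           # best chain score ending at each hit
    prev: List[Optional[int]] = [None] * n      # back-pointers for chain recovery

    for j in range(n):
        for i in range(j):
            a, b = ordered[i], ordered[j]
            # A hit may follow another only if it is later in both
            # the transcript and the audio.
            if a.term_index < b.term_index and a.time < b.time:
                candidate = best[i] + b.score
                if candidate > best[j]:
                    best[j] = candidate
                    prev[j] = i

    # Walk back from the best-scoring end point.
    k: Optional[int] = max(range(n), key=best.__getitem__)
    chain: List[Hit] = []
    while k is not None:
        chain.append(ordered[k])
        k = prev[k]
    chain.reverse()
    return chain


if __name__ == "__main__":
    # Made-up example data: three search terms, two possible locations each.
    example = [
        Hit(0, 1.0, 0.9), Hit(0, 50.0, 0.4),
        Hit(1, 5.0, 0.8), Hit(1, 0.5, 0.7),
        Hit(2, 9.0, 0.6), Hit(2, 3.0, 0.5),
    ]
    for hit in align(example):
        print(hit.term_index, hit.time)
```

Under these assumptions, search terms with no consistent location are simply left out of the chain, which is one way such a scheme could tolerate transcript errors and untranscribed speech; times for the remaining transcript text could then be interpolated between the kept anchors.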
Name / Title
Company / Classification
Phones & Addresses
Kenneth Griggs Principal
Screen On The Green Business Services at Non-Commercial Site
Medical School Uniformed Services University of the Health Sciences Hebert School of Medicine Graduated: 1994
Languages:
English Spanish
Description:
Dr. Griggs graduated from the Uniformed Services University of the Health Sciences Hebert School of Medicine in 1994. He works in Lumberton, NC and specializes in Diagnostic Radiology. Dr. Griggs is affiliated with Southeastern Regional Medical Center.
Liberty Elementary School White Settlement TX 1980-1986, Brewer Middle School White Settlement TX 1986-1988, Ore City Junior High School Ore City TX 1988-1989