Shinji Watanabe

age ~49

from Pittsburgh, PA

Shinji Watanabe Phones & Addresses

Pittsburgh, PA
Ellicott City, MD
Arlington, MA
Cambridge, MA

ISBN (Books And Publications)

Spell of a Bird

Author
Shinji Watanabe

ISBN #
0533121906

US Patents

Methods And Systems For Enhancing Audio Signals Corrupted By Noise
US Patent:

20200058314, Feb 20, 2020
Filed:

Aug 16, 2018
Appl. No.:

15/998765
Inventors:

- Cambridge MA, US
Shinji Watanabe - Baltimore MD, US
John Hershey - Winchester MA, US
Gordon Wichern - Boston MA, US
International Classification:

G10L 19/032
Abstract:

Systems and methods for audio signal processing include an input interface to receive a noisy audio signal that is a mixture of a target audio signal and noise. An encoder maps each time-frequency bin of the noisy audio signal to one or more phase-related values from one or more phase quantization codebooks of phase-related values indicative of the phase of the target signal. For each time-frequency bin of the noisy audio signal, a magnitude ratio value is calculated, indicative of a ratio of a magnitude of the target audio signal to a magnitude of the noisy audio signal. A filter cancels the noise from the noisy audio signal based on the phase-related values and the magnitude ratio values to produce an enhanced audio signal. An output interface outputs the enhanced audio signal.
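
As a rough illustration of the filtering step this abstract describes, the sketch below applies a magnitude ratio value and a codebook phase value to each time-frequency bin. The codebook contents, the name `enhance_bins`, and the reading of a phase-related value as a phase correction applied to the noisy bin are all assumptions made for illustration; the abstract does not specify them, and in the described system the codebook indices and ratios would come from a trained encoder.

```python
import cmath
import math

# Hypothetical 4-entry phase codebook (phase corrections in radians).
# The real codebook contents are not given in the abstract.
PHASE_CODEBOOK = [0.0, math.pi / 2, math.pi, -math.pi / 2]


def enhance_bins(noisy_bins, codebook_indices, magnitude_ratios):
    """Apply a complex mask to each time-frequency bin.

    noisy_bins:        complex spectrogram values of the noisy signal
    codebook_indices:  index into PHASE_CODEBOOK chosen for each bin
    magnitude_ratios:  estimated |target| / |noisy| for each bin
    """
    enhanced = []
    for x, idx, ratio in zip(noisy_bins, codebook_indices, magnitude_ratios):
        # Mask magnitude comes from the ratio, mask phase from the codebook.
        mask = ratio * cmath.exp(1j * PHASE_CODEBOOK[idx])
        enhanced.append(mask * x)
    return enhanced
```

Quantizing the phase to a small codebook keeps the per-bin prediction a classification problem, which is one plausible reason for the design the abstract outlines.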

Methods And Systems For Recognizing Simultaneous Speech By Multiple Speakers
US Patent:

20190318725, Oct 17, 2019
Filed:

Apr 13, 2018
Appl. No.:

15/952330
Inventors:

- Cambridge MA, US
Takaaki Hori - Lexington MA, US
Shane Settle - Moraga CA, US
Hiroshi Seki - Toyohashi, JP
Shinji Watanabe - Baltimore MD, US
John Hershey - Winchester MA, US
International Classification:

G10L 15/16
G10L 17/00
G10L 15/06
G10L 15/22
Abstract:

Systems and methods for a speech recognition system for recognizing speech, including overlapping speech by multiple speakers. The system includes a hardware processor and a computer storage memory that stores data along with computer-executable instructions that, when executed by the processor, implement a stored speech recognition network. An input interface receives an acoustic signal that includes a mixture of speech signals by multiple speakers, wherein the multiple speakers include target speakers. An encoder network and a decoder network of the stored speech recognition network are trained to transform the received acoustic signal into a text for each target speaker, such that the encoder network outputs a set of recognition encodings and the decoder network uses the set of recognition encodings to output the text for each target speaker. An output interface transmits the text for each target speaker.

Method And Apparatus For Open-Vocabulary End-To-End Speech Recognition
US Patent:

20190189115, Jun 20, 2019
Filed:

Dec 15, 2017
Appl. No.:

15/843055
Inventors:

- Cambridge MA, US
Shinji Watanabe - Baltimore MD, US
John Hershey - Winchester MA, US
International Classification:

G10L 15/16
G10L 15/22
G10L 15/14
G10L 15/02
Abstract:

A speech recognition system includes an input device to receive voice sounds, one or more processors, and one or more storage devices storing parameters and program modules including instructions which cause the one or more processors to perform operations. The operations include extracting an acoustic feature sequence from audio waveform data converted from the voice sounds, encoding the acoustic feature sequence into a hidden vector sequence using an encoder network having encoder network parameters, predicting first output label sequence probabilities by feeding the hidden vector sequence to a decoder network having decoder network parameters, predicting second output label sequence probabilities by a hybrid network using character-based language models (LMs) and word-level LMs, and searching, using a label sequence search module, for an output label sequence having a highest sequence probability by combining the first and second output label sequence probabilities provided from the decoder network and the hybrid network.

System And Method For End-To-End Speech Recognition
US Patent:

20180330718, Nov 15, 2018
Filed:

May 11, 2017
Appl. No.:

15/592527
Inventors:

- Cambridge MA, US
Shinji Watanabe - Arlington MA, US
John Hershey - Winchester MA, US
Assignee:

Mitsubishi Electric Research Laboratories, Inc. - Cambridge MA
International Classification:

G10L 15/16
G10L 15/02
G10L 15/14
G10L 19/00
G06N 7/00
G06N 3/04
G06N 3/08
Abstract:

A speech recognition system includes an input device to receive voice sounds, one or more processors, and one or more storage devices storing parameters and program modules including instructions executable by the one or more processors. The instructions include extracting an acoustic feature sequence from audio waveform data converted from the voice sounds, encoding the acoustic feature sequence into a hidden vector sequence using an encoder network having encoder network parameters, predicting first output label sequence probabilities by feeding the hidden vector sequence to a decoder network having decoder network parameters, predicting second output label sequence probabilities by a connectionist temporal classification (CTC) module using CTC network parameters and the hidden vector sequence from the encoder network, and searching, using a label sequence search module, for an output label sequence having a highest sequence probability by combining the first and second output label sequence probabilities provided from the decoder network and the CTC module.
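
The final search step, which combines the decoder and CTC probabilities, can be sketched as follows. The abstract does not say how the two are combined; the weighted sum of log-probabilities and the `ctc_weight` parameter below are assumptions, as are the function names.

```python
def combine_scores(attention_logprob, ctc_logprob, ctc_weight=0.3):
    """Interpolate two log-probabilities for the same label sequence.

    ctc_weight is a hypothetical tuning parameter in [0, 1].
    """
    return (1.0 - ctc_weight) * attention_logprob + ctc_weight * ctc_logprob


def best_hypothesis(hypotheses, ctc_weight=0.3):
    """Pick the label sequence with the highest combined score.

    hypotheses: list of (label_sequence, attention_logprob, ctc_logprob)
    """
    best = max(hypotheses, key=lambda h: combine_scores(h[1], h[2], ctc_weight))
    return best[0]
```

With `ctc_weight=0` the search reduces to picking the decoder's top hypothesis, so the weight controls how much the CTC module can veto sequences the decoder prefers.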

Method And System For Multi-Label Classification
US Patent:

20180157743, Jun 7, 2018
Filed:

Dec 7, 2016
Appl. No.:

15/371513
Inventors:

- Cambridge MA, US
Chiori Hori - Lexington MA, US
Shinji Watanabe - Arlington MA, US
John Hershey - Winchester MA, US
Bret Harsham - Newton MA, US
Jonathan Le Roux - Arlington MA, US
Assignee:

Mitsubishi Electric Research Laboratories, Inc. - Cambridge MA
International Classification:

G06F 17/30
G06N 3/04
G06N 3/08
Abstract:

A method for performing multi-label classification includes extracting a feature vector from an input vector including input data by a feature extractor, determining, by a label predictor, a relevant vector including relevant labels having relevant scores based on the feature vector, updating a binary masking vector by masking pre-selected labels having been selected in previous label selections, applying the updated binary masking vector to the relevant vector such that the relevant label vector is updated to exclude the pre-selected labels from the relevant labels, and selecting a relevant label from the updated relevant label vector based on the relevant scores of the updated relevant label vector.
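
The masking loop in this abstract can be sketched in a few lines. This is a simplified illustration, assuming the relevance scores are fixed across iterations; in the described method a label predictor produces them, and it may recompute them after each selection. The name `select_labels` is hypothetical.

```python
def select_labels(relevance_scores, num_labels):
    """Select labels one at a time, masking out those already chosen.

    relevance_scores: one score per candidate label
    num_labels:       how many labels to select
    Returns the indices of the selected labels, in selection order.
    """
    mask = [1] * len(relevance_scores)  # 1 = still selectable, 0 = masked
    selected = []
    for _ in range(min(num_labels, len(relevance_scores))):
        # Apply the binary mask so previously selected labels cannot win.
        masked = [s if m else float("-inf") for s, m in zip(relevance_scores, mask)]
        best = max(range(len(masked)), key=masked.__getitem__)
        selected.append(best)
        mask[best] = 0  # update the mask with the new selection
    return selected
```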

Method And System For Training Language Models To Reduce Recognition Errors
US Patent:

20170221474, Aug 3, 2017
Filed:

Feb 2, 2016
Appl. No.:

15/013239
Inventors:

- Cambridge MA, US
Chiori Hori - Lexington MA, US
Shinji Watanabe - Arlington MA, US
John Hershey - Winchester MA, US
International Classification:

G10L 15/06
G10L 15/16
G10L 15/02
G10L 15/08
Abstract:

A method and system for training a language model to reduce recognition errors, wherein the language model is a recurrent neural network language model (RNNLM), first acquires training samples. An automatic speech recognition (ASR) system is applied to the training samples to produce recognized words and probabilities of the recognized words, and an N-best list is selected from the recognized words based on the probabilities. Word errors are determined using reference data for hypotheses in the N-best list. The hypotheses are rescored using the RNNLM. Then, we determine gradients for the hypotheses using the word errors and gradients for words in the hypotheses. Lastly, parameters of the RNNLM are updated using a sum of the gradients.
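
To make the role of word errors concrete, the sketch below counts word errors for each N-best hypothesis against a reference and averages them under a softmax over the rescored hypothesis scores. Treating the training objective as this expected word error is an assumption; the abstract only states that gradients are determined using the word errors. The function names are hypothetical, and the RNNLM and its gradient computation are not shown.

```python
import math


def word_errors(reference, hypothesis):
    """Word-level edit distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, 1):
        cur = [i]
        for j, hyp_word in enumerate(hypothesis, 1):
            cur.append(min(prev[j] + 1,                            # deletion
                           cur[j - 1] + 1,                         # insertion
                           prev[j - 1] + (ref_word != hyp_word)))  # substitution
        prev = cur
    return prev[-1]


def expected_word_errors(reference, nbest, scores):
    """Average word errors over a non-empty N-best list, weighted by softmax(scores)."""
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]  # subtract max for stability
    total = sum(weights)
    return sum(w / total * word_errors(reference, hyp)
               for w, hyp in zip(weights, nbest))
```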

Method And System For Role Dependent Context Sensitive Spoken And Textual Language Understanding With Neural Networks
US Patent:

20170161256, Jun 8, 2017
Filed:

Dec 4, 2015
Appl. No.:

14/959132
Inventors:

- Cambridge MA, US
Takaaki Hori - Lexington MA, US
Shinji Watanabe - Arlington MA, US
John Hershey - Winchester MA, US
International Classification:

G06F 17/27
G06N 3/04
G06N 3/08
G10L 15/26
Abstract:

A method and system processes utterances that are acquired either from an automatic speech recognition (ASR) system or text. The utterances have associated identities of each party, such as role A utterances and role B utterances. The information corresponding to utterances, such as word sequence and identity, is converted to features. Each feature is received in an input layer of a neural network (NN). A dimensionality of each feature is reduced, in a projection layer of the NN, to produce a reduced dimensional feature. The reduced dimensional feature is processed to provide probabilities of labels for the utterances.

Method For Distinguishing Components Of An Acoustic Signal
US Patent:

20170011741, Jan 12, 2017
Filed:

May 5, 2016
Appl. No.:

15/147382
Inventors:

- Cambridge MA, US
Jonathan Le Roux - Arlington MA, US
Shinji Watanabe - Arlington MA, US
Zhuo Chen - New York NY, US
International Classification:

G10L 15/16
G10L 15/06
G10L 15/02
Abstract:

A method distinguishes components of a signal by processing the signal to estimate a set of analysis features, wherein each analysis feature defines an element of the signal and has feature values that represent parts of the signal, processing the signal to estimate input features of the signal, and processing the input features using a deep neural network to assign an associative descriptor to each element of the signal, wherein a degree of similarity between the associative descriptors of different elements is related to a degree to which the parts of the signal represented by the elements belong to a single component of the signal. The similarities between associative descriptors are processed to estimate correspondences between the elements of the signal and the components in the signal. Then, the signal is processed using the correspondences to distinguish component parts of the signal.
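
The idea of grouping signal elements by the similarity of their associative descriptors can be illustrated with a small sketch. Cosine similarity, the greedy grouping rule, and the `threshold` value are assumptions chosen for brevity; the abstract does not say how similarities are turned into correspondences, and the descriptors themselves would come from the deep neural network.

```python
import math


def cosine(u, v):
    """Cosine similarity between two descriptor vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def group_elements(descriptors, threshold=0.8):
    """Greedily assign each element to a component.

    An element joins the first component whose representative descriptor
    is similar enough; otherwise it starts a new component.
    Returns one component index per element.
    """
    representatives = []
    assignments = []
    for d in descriptors:
        for comp, rep in enumerate(representatives):
            if cosine(d, rep) >= threshold:
                assignments.append(comp)
                break
        else:
            representatives.append(d)
            assignments.append(len(representatives) - 1)
    return assignments
```

For example, descriptors `[1, 0]` and `[0.9, 0.1]` land in one component while `[0, 1]` starts another, mirroring the statement that similar descriptors indicate parts of the same signal component.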

YouTube

New Faculty Meet & Greet - Shinji Watanabe

Shinji Watanabe, associate professor in the Language Technologies Inst...

Duration:

2m 31s

KUDO UK FIGHTING TALK: Japan Renshi Shinji Wa...

Shinji Watanabe explains how he uses Kata practice as preparation for ...

Duration:

27m 22s

End-to-End Speech Recognition by Following my...

Carnegie Mellon University Course: 11-785, Intro to Deep Learning Offe...

Duration:

1h 29m 22s

KUDO UK Mal's 2 Minute Tip: Shinji Watanabe K...

Tokyo based Kudoka Shinji Watanabe Shihan demonstrates how the Muay Th...

Duration:

2m 11s

The inside story of that tournament, and abou...

This is an inside story about the Kanto qualifying round of the All Jap...

Duration:

18m 29s

How to do Naihanchi The tips for how to makin...

I received a few questions from some of you who watched the last video...

Duration:

12m 48s

Google+

Shinji Watanabe

Relationship:

Single

About:

Citizen Runner, Traveler, Permanent learner. All remarks are personal opinion.

Tagline:

Life is art.

Bragging Rights:

・Can get around familiar places without looking at a map ・Can recite all one hundred poems of the Hyakunin Isshu from memory

Shinji Watanabe

News

Controversy Continues After Engineered-Bird Flu-Study Published

rus in ferrets. By Masaki Imai, Tokiko Watanabe, Masato Hatta, Subash C. Das, Makoto Ozawa, Kyoko Shinya, Gongxun Zhong, Anthony Hanson, Hiroaki Katsura, Shinji Watanabe, Chengjun Li, Eiryo Kawakami, Shinya Yamada, Maki Kiso, Yasuo Suzuki, Eileen A. Maher, Gabriele Neumann & Yoshihiro Kawaoka. Nat
Date: May 02, 2012
Category: Sci/Tech
Source: Google