- Mountain View CA, US
Hainan Xu - Mountain View CA, US
Kartik Audhkhasi - Mountain View CA, US
Yinghui Huang - Mountain View CA, US
Assignee:
Google LLC - Mountain View CA
International Classification:
G10L 15/04
G10L 25/30
G06N 3/04
Abstract:
A method for subword segmentation includes receiving an input word to be segmented into a plurality of subword units. The method also includes executing a subword segmentation routine to segment the input word into a plurality of subword units by accessing a trained vocabulary set of subword units and selecting the plurality of subword units from the input word by greedily finding a longest subword unit from the input word that is present in the trained vocabulary set until an end of the input word is reached.
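
The greedy longest-match selection described in the abstract can be illustrated with a minimal Python sketch. This is not the claimed implementation. The function name `segment_word`, the `max_len` parameter, and the example vocabulary are hypothetical and chosen only for illustration. The abstract does not say what happens when no vocabulary entry matches at the current position, so the sketch assumes a fallback that emits the single character.

```python
def segment_word(word, vocab, max_len=None):
    """Greedy longest-match segmentation of `word` using the set `vocab`.

    Hypothetical helper: the name, signature, and fallback are assumptions.
    """
    if max_len is None:
        # Longest unit in the vocabulary bounds how far ahead we need to look.
        max_len = max((len(unit) for unit in vocab), default=1)

    units = []
    start = 0
    while start < len(word):  # repeat until the end of the input word is reached
        # Try the longest candidate first, then shrink one character at a time.
        end = min(len(word), start + max_len)
        while end > start + 1 and word[start:end] not in vocab:
            end -= 1
        # Assumption: if nothing longer matched, emit the single character,
        # even if that character is itself missing from the vocabulary.
        units.append(word[start:end])
        start = end
    return units


# Illustrative vocabulary (not a trained one).
example_vocab = {"un", "break", "able", "a", "b", "e", "k", "l", "n", "r", "u"}
print(segment_word("unbreakable", example_vocab))  # ['un', 'break', 'able']
```

Because each step commits to the longest matching unit, one left-to-right pass over the word is enough and no alternative segmentations are scored. A vocabulary that contains every single character would make the fallback unnecessary.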