Eric Cosatto - Highlands NJ Hans Peter Graf - Lincroft NJ
Assignee:
AT&T Corp. - New York NY
International Classification:
G06T 17/00
US Classification:
345/473
Abstract:
A method for modeling three-dimensional objects to create photo-realistic animations using a data-driven approach. The three-dimensional object is defined by a set of separate three-dimensional planes, each plane enclosing an area of the object that undergoes visual changes during animation. Recorded video is used to create bitmap data to populate a database for each three-dimensional plane. The video is analyzed in terms of both rigid movements (changes in pose) and plastic deformation (changes in expression) to create the bitmaps. The modeling is particularly well-suited for animations of a human face, where an audio track generated by a text-to-speech synthesizer can be added to the animation to create a photo-realistic "talking head".
Audio-Visual Selection Process For The Synthesis Of Photo-Realistic Talking-Head Animations
Eric Cosatto - Highlands NJ Hans Peter Graf - Lincroft NJ Gerasimos Potamianos - White Plains NY Juergen Schroeter - New Providence NJ
Assignee:
AT&T Corp. - New York NY
International Classification:
G06T 13/00
US Classification:
345/473
Abstract:
A system and method for generating photo-realistic talking-head animation from a text input utilizes an audio-visual unit selection process. The lip-synchronization is obtained by optimally selecting and concatenating variable-length video units of the mouth area. The unit selection process utilizes the acoustic data to determine the target costs for the candidate images and utilizes the visual data to determine the concatenation costs. The image database is prepared in a hierarchical fashion, including high-level features (such as a full 3D modeling of the head, geometric size and position of elements) and pixel-based, low-level features (such as a PCA-based metric for labeling the various feature bitmaps).
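The optimal selection and concatenation of variable-length units over target costs (acoustic match) and concatenation costs (visual match) can be sketched as a dynamic-programming (Viterbi-style) search. The function and cost names below are illustrative assumptions, not taken from the patent:

```python
# Illustrative sketch of unit selection by dynamic programming.
# candidates: one list of candidate unit ids per time slot.
# target_cost(t, u): acoustic mismatch of unit u at slot t (assumed given).
# concat_cost(p, u): visual mismatch between consecutive units (assumed given).

def select_units(candidates, target_cost, concat_cost):
    """Return the sequence of units (one per slot) minimizing total cost."""
    # best[u] = (cost of cheapest path ending in u, that path)
    best = {u: (target_cost(0, u), [u]) for u in candidates[0]}
    for t in range(1, len(candidates)):
        new_best = {}
        for u in candidates[t]:
            # cheapest predecessor for u at the previous slot
            cost, path = min(
                (best[p][0] + concat_cost(p, u), best[p][1])
                for p in candidates[t - 1]
            )
            new_best[u] = (cost + target_cost(t, u), path + [u])
        best = new_best
    return min(best.values())[1]
```

A real system would prune candidates and use PCA-based feature distances for the concatenation cost; this sketch only shows the search structure.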
Coarticulation Method For Audio-Visual Text-To-Speech Synthesis
Eric Cosatto - Highlands NJ Hans Peter Graf - Lincroft NJ Juergen Schroeter - New Providence NJ
Assignee:
AT&T Corp. - New York NY
International Classification:
G10L 13/08
US Classification:
704/260, 704/263, 704/276
Abstract:
A method for generating animated sequences of talking heads in text-to-speech applications wherein a processor samples a plurality of frames comprising image samples. Representative parameters are extracted from the image samples and stored in an animation library. The processor also samples a plurality of multiphones comprising images together with their associated sounds. The processor extracts parameters from these images comprising data characterizing mouth shapes, maps, rules, or equations, and stores the resulting parameters and sound information in a coarticulation library. The animated sequence begins with the processor considering an input phoneme sequence, recalling from the coarticulation library parameters associated with that sequence, and selecting appropriate image samples from the animation library based on that sequence. The image samples are concatenated together, and the corresponding sound is output, to form the animated synthesis.
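The synthesis loop described above (recall coarticulation parameters for each input phoneme, select matching image samples, concatenate images and sound) can be sketched roughly as follows; the library layouts and key names are assumptions made for illustration:

```python
# Illustrative sketch of the coarticulation-driven synthesis loop.
# coart_lib: phoneme -> {"mouth_shape": key, "sound": samples} (assumed layout).
# anim_lib: mouth-shape key -> image sample from the animation library.

def synthesize(phoneme_seq, coart_lib, anim_lib):
    """Map an input phoneme sequence to concatenated frames and sound."""
    frames, audio = [], []
    for ph in phoneme_seq:
        entry = coart_lib[ph]                          # recall stored parameters
        frames.append(anim_lib[entry["mouth_shape"]])  # select an image sample
        audio.extend(entry["sound"])                   # corresponding sound
    return frames, audio
```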
Digitally-Generated Lighting For Video Conferencing Applications
Andrea Basso - Ocean NJ, US Eric Cosatto - Highlands NJ, US David Crawford Gibbon - Lincroft NJ, US Hans Peter Graf - Lincroft NJ, US Shan Liu - Los Angeles CA, US
Assignee:
AT&T Corp. - New York NY
International Classification:
G06K 9/40
US Classification:
382/274, 382/103, 382/190, 382/291, 348/14.01
Abstract:
A method of improving the lighting conditions of a real scene or video sequence. Digitally generated light is added to a scene for video conferencing over telecommunication networks. A virtual illumination equation takes into account light attenuation as well as Lambertian and specular reflection. An image of an object is captured, and a virtual light source illuminates the object within the image. In addition, the object can be the head of the user. The position of the head is dynamically tracked so that a three-dimensional model representative of the head of the user is generated. Synthetic light is applied to a position on the model to form an illuminated model.
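An illumination equation combining attenuation, a Lambertian (diffuse) term, and a specular term can be sketched in Phong style as below; the coefficient names and default values are illustrative assumptions, not the patent's actual equation:

```python
import numpy as np

def shade(point, normal, light_pos, view_pos,
          light_color, albedo, k_d=0.8, k_s=0.2, shininess=16.0, atten=0.1):
    """Contribution of one synthetic point light at a surface point (sketch)."""
    L = light_pos - point
    dist = np.linalg.norm(L)
    L = L / dist                                   # unit vector toward light
    N = normal / np.linalg.norm(normal)            # unit surface normal
    V = view_pos - point
    V = V / np.linalg.norm(V)                      # unit vector toward viewer
    attenuation = 1.0 / (1.0 + atten * dist**2)    # falloff with distance
    diffuse = k_d * max(np.dot(N, L), 0.0) * albedo        # Lambertian term
    R = 2.0 * np.dot(N, L) * N - L                          # mirror direction
    specular = k_s * max(np.dot(R, V), 0.0) ** shininess    # specular highlight
    return attenuation * light_color * (diffuse + specular)
```

Applied per pixel over a tracked 3D head model, such a term relights the face without changing the physical scene.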
Method For Sending Multi-Media Messages Using Emoticons
Joern Ostermann - Morganville NJ, US Mehmet Reha Civanlar - Middletown NJ, US Eric Cosatto - Highlands NJ, US Hans Peter Graf - Lincroft NJ, US Yann Andre LeCun - Lincroft NJ, US
Assignee:
AT&T Corp. - New York NY
International Classification:
G10L 13/08, G10L 21/00, G06T 13/00
US Classification:
704/260, 704/272, 704/275, 704/270, 345/473
Abstract:
A system and method of providing sender-customization of multi-media messages through the use of emoticons is disclosed. The sender inserts the emoticons into a text message. As an animated face audibly delivers the text, emoticons associated with the message are started a predetermined period of time or number of words prior to the position of the emoticon in the message text and completed a predetermined length of time or number of words following the location of the emoticon. The sender may insert emoticons through the use of emoticon buttons that are icons available for choosing. Upon sender selection of an emoticon, an icon representing the emoticon is inserted into the text at the position of the cursor. Once an emoticon is chosen, the sender may also choose the amplitude for the emoticon, and the increased or decreased amplitude will be displayed in the icon inserted into the message text.
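The timing rule above (the expression starts a fixed number of words before the emoticon's position and completes a fixed number of words after it) can be sketched with a small helper; the parameter names and defaults are illustrative assumptions:

```python
def emoticon_window(word_index, lead_words=2, trail_words=2, total_words=30):
    """Return the (start, end) word indices over which the facial expression
    plays: it begins lead_words before the emoticon's position in the text
    and completes trail_words after it, clamped to the message bounds."""
    start = max(0, word_index - lead_words)
    end = min(total_words - 1, word_index + trail_words)
    return start, end
```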
Method For Sending Multi-Media Messages Using Customizable Background Images
Joern Ostermann - Morganville NJ, US Barbara Buda - Morristown NJ, US Mehmet Reha Civanlar - Middletown NJ, US Eric Cosatto - Highlands NJ, US Hans Peter Graf - Lincroft NJ, US Thomas M. Isaacson - Dunkirk MD, US Yann Andre LeCun - Lincroft NJ, US
Assignee:
AT&T Corp. - New York NY
International Classification:
G10L 13/00
US Classification:
704/260, 704/267, 704/258
Abstract:
A system and method of providing sender customization of multi-media messages through the use of inserted images or video. The images or video may be sender-created or predefined and available to the sender via a web server. The method relates to customizing a multi-media message created by a sender for a recipient, the multi-media message having an animated entity audibly presenting speech converted from text created by the sender. The method comprises receiving at least one image from the sender, associating each at least one image with a tag, presenting the sender with options to insert the tag associated with one of the at least one image into the sender text, and after the sender inserts the tag associated with one of the at least one images into the sender text, delivering the multi-media message with the at least one image presented as background to the animated entity according to a position of the tag associated with the at least one image in the sender text. In another embodiment of the invention, a template is provided to the sender to create multi-media messages using predefined static images or video clips. The method comprises providing the sender with a group of customizable multi-media message templates, each template of the group of templates including predefined parameters comprising a predefined text message, a predefined animated entity, a predefined background, predefined background music, and a predefined set of emoticons within the text of the message.
System And Method Of Providing Conversational Visual Prosody For Talking Heads
Eric Cosatto - Highlands NJ, US Hans Peter Graf - Lincroft NJ, US Thomas M. Isaacson - Huntingtown MD, US Volker Franz Strom - Jersey City NJ, US
Assignee:
AT&T Corp. - New York NY
International Classification:
G10L 11/00
US Classification:
704/275, 704/276, 704/272
Abstract:
A system and method of controlling the movement of a virtual agent while the agent is listening to a human user during a conversation is disclosed. The method comprises receiving speech data from the user, performing a prosodic analysis of the speech data and controlling the virtual agent movement according to the prosodic analysis.
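As a toy sketch of the idea (prosodic analysis of incoming speech driving listener motion), per-frame speech energy can serve as a crude prosodic proxy, with prominent frames triggering head nods; the threshold and mapping are invented for illustration and are not the patent's method:

```python
import numpy as np

def listening_motion(frame_energies, nod_threshold=0.6):
    """Map per-frame speech energy (crude prosodic proxy) to head-nod
    amplitude for a listening avatar; values above the threshold nod."""
    e = np.asarray(frame_energies, dtype=float)
    rng = e.max() - e.min()
    e = (e - e.min()) / (rng if rng else 1.0)   # normalize to [0, 1]
    return np.where(e > nod_threshold, e, 0.0)  # nod only on prominent frames
```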
Coarticulation Method For Audio-Visual Text-To-Speech Synthesis
Eric Cosatto - Highlands NJ, US Hans Peter Graf - Lincroft NJ, US Juergen Schroeter - New Providence NJ, US
Assignee:
AT&T Corp. - New York NY
International Classification:
G10L 19/00
US Classification:
704/260, 345/706
Abstract:
A method for generating animated sequences of talking heads in text-to-speech applications wherein a processor samples a plurality of frames comprising image samples. Representative parameters are extracted from the image samples and stored in an animation library. The processor also samples a plurality of multiphones comprising images together with their associated sounds. The processor extracts parameters from these images comprising data characterizing mouth shapes, maps, rules, or equations, and stores the resulting parameters and sound information in a coarticulation library. The animated sequence begins with the processor considering an input phoneme sequence, recalling from the coarticulation library parameters associated with that sequence, and selecting appropriate image samples from the animation library based on that sequence. The image samples are concatenated together, and the corresponding sound is output, to form the animated synthesis.