Comparative Analysis of Audio-visual Perceptual Mechanisms and their Applications on Coding Systems


    This work investigates how perceivers extract phonetically relevant visual information from dynamic audiovisual speech. It tests the hypothesis that low-resolution spatial and temporal information is sufficient for speech perception. Audiovisual perception studies were carried out using spatial and temporal low-pass filters applied to video image sequences of Japanese, English and Brazilian Portuguese sentences at a rate of 30 frames/s. Intelligibility tests were run using two different kinds of three-dimensional filtering. A detailed analysis of the frequency content of the video sequences also allows a deeper understanding of audiovisual speech perception.
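    As an illustration of what a spatial low-pass filter does to a single video frame, the following minimal sketch applies a box blur to a small grayscale image. It is not the filter used in the studies, which the text does not specify; the function name `box_blur`, the `radius` parameter and the checkerboard test frame are all assumptions made for this example.

```python
def box_blur(frame, radius):
    """Spatial low-pass: replace each pixel by the mean of its
    (2*radius + 1) x (2*radius + 1) neighbourhood.

    `frame` is a list of rows of grayscale values. Near the borders only
    the neighbours that fall inside the frame are averaged.
    Hypothetical helper, not taken from the original study.
    """
    height, width = len(frame), len(frame[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total, count = 0.0, 0
            for yy in range(max(0, y - radius), min(height, y + radius + 1)):
                for xx in range(max(0, x - radius), min(width, x + radius + 1)):
                    total += frame[yy][xx]
                    count += 1
            out[y][x] = total / count
    return out


# A one-pixel checkerboard is the finest detail a pixel grid can hold,
# so it is a convenient stand-in for "high spatial frequency" content.
checker = [[float((x + y) % 2) for x in range(6)] for y in range(6)]
blurred = box_blur(checker, radius=1)

# A uniform frame has no spatial detail and should pass through unchanged.
flat = [[0.5] * 6 for _ in range(6)]
flat_blurred = box_blur(flat, radius=1)
```

    In this sketch the interior of the checkerboard collapses from full black/white contrast to values close to mid-gray, while the uniform frame is left untouched. A larger `radius` removes progressively coarser detail.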



Part 1: Measuring Speech Reception in the Presence of Audiovisual Information

    Tests were developed to determine how visual information can complement noise- and band-corrupted speech. Speech intelligibility tests were performed using samples with and without visual information.

Part 2: Linking Production and Perception Through Spatial and Temporal Filtering of Visible Speech

    This part investigates how perceivers extract phonetically relevant visual information from dynamic audiovisual speech. It tests the hypothesis that low-resolution spatial and temporal information is sufficient for speech perception. We conducted audiovisual perception studies using spatial and temporal low-pass filters applied to the video image sequences for Japanese and English sentences. The results suggest that information at frequencies above the normal rate at which the vocal tract opens and closes can be removed without degrading the perception of visual speech information.
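    To make the idea of temporal low-pass filtering concrete, the sketch below filters the intensity of one pixel over time with a Hamming-windowed sinc FIR filter. Only the 30 frames/s rate is taken from the project description; the 6 Hz cutoff, the 21-tap length, the function names and the two test sinusoids are placeholder assumptions, not values reported by the studies.

```python
import math


def lowpass_kernel(cutoff_hz, frame_rate, num_taps):
    """Hamming-windowed sinc FIR kernel, normalised to unit gain at 0 Hz.

    `num_taps` should be odd so the kernel has a single centre tap.
    """
    fc = cutoff_hz / frame_rate          # cutoff in cycles per frame
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        t = n - mid
        if t == 0:
            sinc = 2 * fc                # limit of sin(2*pi*fc*t)/(pi*t) at t = 0
        else:
            sinc = math.sin(2 * math.pi * fc * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(sinc * window)
    total = sum(taps)
    return [tap / total for tap in taps]


def temporal_lowpass(signal, kernel):
    """Filter one pixel's intensity over time.

    The ends are padded by repeating the first and last values, and the
    symmetric kernel is centred on each frame, so no delay is introduced.
    """
    half = len(kernel) // 2
    padded = [signal[0]] * half + list(signal) + [signal[-1]] * half
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(signal))
    ]


FRAME_RATE = 30.0   # frames per second, as in the recordings described above
CUTOFF_HZ = 6.0     # hypothetical cutoff chosen only for this example

kernel = lowpass_kernel(CUTOFF_HZ, FRAME_RATE, num_taps=21)

# Three seconds of a slow (2 Hz) and a fast (12 Hz) intensity fluctuation.
slow = [math.sin(2 * math.pi * 2.0 * n / FRAME_RATE) for n in range(90)]
fast = [math.sin(2 * math.pi * 12.0 * n / FRAME_RATE) for n in range(90)]

slow_out = temporal_lowpass(slow, kernel)
fast_out = temporal_lowpass(fast, kernel)
```

    Away from the padded ends, the slow component passes through almost unchanged while the fast component is strongly attenuated. Applying the same kernel independently to every pixel position would filter a whole image sequence in time.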

Copyright © 2004 CEFALA/UFMG