Search Results

  • Adaptive modeling and high quality spectral estimation for speech enhancement
    Publication . Coelho, Luis; Braga, Daniela
    In this work, an adaptive modeling and spectral estimation scheme based on dual discrete Kalman filtering (DKF) is proposed for speech enhancement. Both the speech and the noise signals are modeled by an autoregressive structure, which provides an underlying time-frame dependency and improves time-frequency resolution. The model parameters are arranged to obtain a combined state-space model and are also used to calculate instantaneous power spectral density estimates. Speech enhancement is performed by a dual discrete Kalman filter that simultaneously estimates the models and the signals. This approach is particularly useful as a pre-processing module for parametric speech recognition systems that rely on time-dependent spectral models. System performance was evaluated by a set of human listeners and by spectral distance measures. In both cases, the use of this pre-processing module led to improved results.
  • ezGo: A voice operated wheelchair with biosignal monitoring for home environments
    Publication . Coelho, Luis; Braga, Daniela
    In this paper we present ezGo, an electric-powered wheelchair with a speech-based interface and biosignal monitoring instrumentation. The user can use voice, a natural communication method, to control the chair's movement and to obtain information about their health. Additionally, a set of semi-autonomous modes with macro recording enables the execution of navigation tasks with little effort and improved precision. The main purpose of the system is to provide severely disabled persons with an assistive device that can improve their confidence and daily independence. The results of usability tests showed that users consider ezGo a valuable help in their daily tasks and a very desirable addition to standard wheelchairs.
  • Homograph ambiguity resolution in front-end design for Portuguese TTS systems
    Publication . Braga, Daniela; Coelho, Luis; Resende Jr., Fernando Gil V.
    In this paper, a module for homograph disambiguation in Portuguese Text-to-Speech (TTS) is proposed. This module works with a part-of-speech (POS) parser, used to disambiguate homographs that belong to different parts of speech, and a semantic analyzer, used to disambiguate homographs that belong to the same part of speech. The proposed algorithms are meant to resolve a significant part of homograph ambiguity in European Portuguese (EP) (106 homograph pairs so far). The system is ready to be integrated into a Letter-to-Sound (LTS) converter. The algorithms were trained and tested with different corpora. The experiments yielded an accuracy rate of 97.8%. This methodology is also valid for Brazilian Portuguese (BP), since 95 homograph pairs are exactly the same as in EP. A comparison with a probabilistic approach was also carried out and the results are discussed.
  • Speech as the basic interface for assistive technology
    Publication . Teixeira, António; Braga, Daniela; Coelho, Luis; Fonseca, José Alberto; Alvarelhão, Joaquim; Martim, Inácio; Queirós, Alexandra; Rocha, Nelson; Calado, António; Sales-Dias, Miguel
    Speech interfaces for assistive technologies are not common and are usually replaced by other interfaces. The market they target is not considered attractive, and speech technologies are still not widespread. Industry still thinks they present some performance risks, especially speech recognition systems. As speech is the most elemental and natural means of communication, it has strong potential for enhancing inclusion and quality of life for broader groups of users with special needs, such as people with cerebral palsy and elderly people living at home. This work is a position paper in which the authors argue for the need to make speech the basic interface in assistive technologies. The main arguments are: speech is the easiest way to interact with machines; there is a growing market for embedded speech in assistive technologies, since the number of disabled and elderly people is expanding; speech technology is already mature enough to be used but needs adaptation to people with special needs; and there is still a lot of R&D to be done in this area, especially for the Portuguese market. The main challenges are presented and future directions are proposed.
  • A rule-based grapheme-to-phone converter for TTS systems in European Portuguese
    Publication . Braga, Daniela; Coelho, Luis; Vianna Resende, Fernando Gil
    In this paper, a linguistic rule-based grapheme-to-phone (G2P) transcription algorithm is described for European Portuguese. A complete set of phonological and phonetic transcription rules for the European Portuguese standard variety is presented. The algorithm was implemented and tested using online newspaper articles. The experiments yielded an accuracy rate of 98.80%. Future developments to increase this value are foreseen. Our purpose with this work is to develop a module/tool that can improve the naturalness of synthetic speech in European Portuguese. Other applications of this system, such as language teaching and learning, can be expected. These results, together with our perspectives on future improvements, demonstrate the importance of linguistic knowledge in the development of Text-to-Speech (TTS) systems.