LING 15 Lecture Notes - Lecture 9: Audio Signal, Spectrogram, Speech Recognition

Technology and language
Modeling language behavior
Speak commands instead of typing or using a mouse
Computer responds with synthetic speech
Automating language analysis
Other benefits
Search applications, research, error detection
Relationship among symbols
Info → meaning → propositions → words → glyphs and phonemes → segments →
acoustics
All human language is based around acoustics (physical substance of phonemes)
We are able to infer segments → phonemes → words → propositions → meaning
from acoustics
Speech synthesis: computer starts with a proposition and breaks it down into acoustics
Models
Model: some artificial construction which performs what the real thing does
Computers cannot laugh, but they can change propositions into acoustics
In speech synthesis, a computer is modeling speech
Computer can come up with words and generate human-like ordering of words
Text-to-speech: process in which computer takes text input and turns it into auditory (acoustic)
signal
Process of converting words and/or text to auditory output
Two approaches [both require stored data (program “looks up” what sounds to make)]
Whole word: uses stored recording of each word and replays when needed
Tell computer to pronounce “Hello world” → computer looks up acoustic
pronunciation of hello and of world → plays recordings in order
In general: computer is told to pronounce a word → computer looks up
recording in database → plays recording of sound
Problems:
Need multiple versions of each word
Requires a lot of storage space
Phonemic: determines phoneme order for each word
“Hello world” converted into a string of phonemes → look up acoustic
pronunciation of each phoneme → string the sounds together and play
Advantages:
Can predict phoneme string from spelling
Smaller number of recordings = needs less storage space
Problems:
Imperfect mapping: the physical segments for a phoneme differ by context
A phoneme sounds different depending on the neighboring vowels and
consonants, so it is hard to make the words sound natural and flow
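The two look-up approaches above can be sketched side by side. This is a minimal illustration, not how any real text-to-speech system is built: the dictionaries, the function names (whole_word_tts, phonemic_tts), and the ARPAbet-style phoneme labels are all assumptions made for the example, and placeholder strings stand in for stored audio recordings. The notes say the phoneme string can be predicted from spelling; the sketch swaps in a small hand-written pronunciation table for that step.

```python
# Hypothetical toy data: each string stands in for a stored audio recording.
WORD_RECORDINGS = {
    "hello": "<rec:hello>",
    "world": "<rec:world>",
}

# Hypothetical pronunciation table (ARPAbet-style labels chosen for illustration).
# It replaces the spelling-to-phoneme prediction step mentioned in the notes.
PRONUNCIATIONS = {
    "hello": ["HH", "AH", "L", "OW"],
    "world": ["W", "ER", "L", "D"],
}

# One stored recording per phoneme, shared by every word that uses it.
PHONEME_RECORDINGS = {
    p: f"<rec:{p}>" for p in ["HH", "AH", "L", "OW", "W", "ER", "D"]
}


def whole_word_tts(text):
    """Whole-word approach: look up one stored recording per word, in order."""
    return [WORD_RECORDINGS[word] for word in text.lower().split()]


def phonemic_tts(text):
    """Phonemic approach: word -> phoneme string -> one recording per phoneme."""
    recordings = []
    for word in text.lower().split():
        for phoneme in PRONUNCIATIONS[word]:
            recordings.append(PHONEME_RECORDINGS[phoneme])
    return recordings


if __name__ == "__main__":
    print(whole_word_tts("Hello world"))
    print(phonemic_tts("Hello world"))
```

In this sketch the whole-word table needs a new recording for every new word, while the phoneme table stays small and is reused across words. The same "L" recording is played in both "hello" and "world", which is exactly where the context problem appears: a single stored segment cannot match how the sound actually varies next to different vowels and consonants.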