Phonetic figures are sound.docx
- Document ID: 24022896
- Uploaded: 2023-05-23
- Format: DOCX
- Pages: 13
- Size: 136.85 KB
Phonetic figures are sound
Phonetic figures are sound-related figures of speech. They encompass various stylistic means, namely alliteration, assonance, cacophony, paronomasia (pun), and onomatopoeia. All of these concrete realisations of phonetic figures relate to sounds, as they represent repetition of consonant and vowel sounds (alliteration and assonance), clashes of sounds (cacophony), "play upon the sounds and meanings of words" (pun), and imitation of sounds (onomatopoeia).
3. Phonetics and Theory of Speech Production
Speech processing and language technology contain many special concepts and terms. To understand how different speech synthesis and analysis methods work, we must have some knowledge of speech production, articulatory phonetics, and some other related terminology. The basic theory of these topics is discussed briefly in this chapter. For more detailed information, see for example Fant (1970), Flanagan (1972), Witten (1982), O'Shaughnessy (1987), or Kleijn et al. (1998).
3.1 Representation and Analysis of Speech Signals
Continuous speech is a set of complicated audio signals, which makes producing them artificially difficult. Speech signals are usually considered as voiced or unvoiced, but in some cases they are something between these two. Voiced sounds consist of the fundamental frequency (F0) and its harmonic components produced by the vocal cords (vocal folds). The vocal tract modifies this excitation signal, causing formant (pole) and sometimes antiformant (zero) frequencies (Witten 1982). Each formant frequency also has an amplitude and a bandwidth, and it may sometimes be difficult to define some of these parameters correctly. The fundamental frequency and the formant frequencies are probably the most important concepts in speech synthesis and also in speech processing in general.
With purely unvoiced sounds, there is no fundamental frequency in the excitation signal and therefore no harmonic structure either, and the excitation can be considered as white noise. The airflow is forced through a vocal tract constriction, which can occur in several places between the glottis and the mouth. Some sounds are produced with a complete stoppage of airflow followed by a sudden release, producing an impulsive turbulent excitation often followed by a more protracted turbulent excitation (Kleijn et al. 1998). Unvoiced sounds are also usually quieter and less steady than voiced ones. The differences between these are easy to see from Figure 3.2, where the second and last sounds are voiced and the others unvoiced. Whispering is a special case of speech. When whispering a voiced sound there is no fundamental frequency in the excitation, and the first formant frequencies produced by the vocal tract are perceived.
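The contrast between the two excitation types can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the voiced excitation is idealized as a unit impulse train at F0, the unvoiced excitation as Gaussian white noise, and the 8 kHz sampling rate, the function names, and the autocorrelation check are choices made for this example, not part of the original text.

```python
import random

def voiced_excitation(f0_hz, sample_rate_hz, n_samples):
    """Idealized voiced excitation: one unit pulse per fundamental period."""
    period = round(sample_rate_hz / f0_hz)  # samples per glottal cycle
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]

def unvoiced_excitation(n_samples, seed=0):
    """Idealized unvoiced excitation: Gaussian white noise, no periodicity."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n_samples)]

def autocorrelation(x, lag):
    """Autocorrelation of x at the given lag, normalized by the signal energy."""
    energy = sum(v * v for v in x)
    return sum(x[n] * x[n + lag] for n in range(len(x) - lag)) / energy

fs = 8000                      # assumed sampling rate in Hz
lag = fs // 100                # one period of a 100 Hz fundamental (80 samples)
voiced = voiced_excitation(100, fs, 1600)
noise = unvoiced_excitation(1600)

# The periodic signal correlates strongly with itself one period later;
# the noise does not.
print("voiced:", autocorrelation(voiced, lag))
print("noise: ", autocorrelation(noise, lag))
```

A strong autocorrelation peak at the pitch period is one simple cue that a frame is voiced; real speech frames are of course far less clean than these idealized signals.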
Speech signals of the three vowels (/a/, /i/, /u/) are presented in the time and frequency domains in Figure 3.1. The fundamental frequency is about 100 Hz in all cases, and the formant frequencies F1, F2, and F3 of the vowel /a/ are approximately 600 Hz, 1000 Hz, and 2500 Hz, respectively. With the vowel /i/ the first three formants are 200 Hz, 2300 Hz, and 3000 Hz, and with /u/ 300 Hz, 600 Hz, and 2300 Hz. The harmonic structure of the excitation is also easy to perceive from the frequency-domain presentation.
Fig. 3.1. The time- and frequency-domain presentation of vowels /a/, /i/, and /u/.
It can be seen that the first three formants are inside the normal telephone channel (from 300 Hz to 3400 Hz), so the bandwidth needed for intelligible speech is not very wide. For higher quality, a bandwidth of up to 10 kHz may be used, which leads to a 20 kHz sampling frequency. Although the fundamental frequency is outside the telephone channel, the human hearing system is capable of reconstructing it from its harmonic components.
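The arithmetic behind these two remarks can be sketched as follows. The example assumes integer frequencies in Hz and uses the greatest common divisor of the surviving harmonics as a stand-in for the "missing fundamental"; that is plain arithmetic chosen for illustration, not a model of how hearing actually recovers pitch. The function names are hypothetical.

```python
from functools import reduce
from math import gcd

TELEPHONE_BAND_HZ = (300, 3400)  # band limits quoted in the text

def harmonics_in_band(f0_hz, band_hz=TELEPHONE_BAND_HZ):
    """Harmonics k * F0 that survive band-limiting (integer Hz assumed)."""
    low, high = band_hz
    return [k * f0_hz for k in range(1, high // f0_hz + 1)
            if low <= k * f0_hz <= high]

def implied_fundamental(harmonics_hz):
    """Largest frequency of which every given harmonic is an integer multiple."""
    return reduce(gcd, harmonics_hz)

def min_sampling_rate(bandwidth_hz):
    """Sampling theorem: sample at no less than twice the signal bandwidth."""
    return 2 * bandwidth_hz

harmonics = harmonics_in_band(100)       # F0 = 100 Hz lies below the band
print(harmonics[:4])                     # lowest harmonics that pass
print(implied_fundamental(harmonics))    # spacing of the harmonics
print(min_sampling_rate(10_000))         # 10 kHz bandwidth
```

Even though the 100 Hz component itself is filtered out, the harmonics that remain are still spaced 100 Hz apart, which is the information the fundamental can be inferred from.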
Another commonly used method to describe a speech signal is the spectrogram, which is a time-frequency-amplitude presentation of a signal. The spectrogram and the time-domain waveform of the Finnish word kaksi (two) are presented in Figure 3.2. Higher amplitudes are presented with darker gray levels, so the formant frequencies and trajectories are easy to perceive. Spectral differences between vowels and consonants are also easy to comprehend. Therefore, the spectrogram is perhaps the most useful presentation for speech research. From Figure 3.2 it is easy to see that vowels have more energy and that it is focused at lower frequencies. Unvoiced consonants have considerably less energy, and it is usually focused at higher frequencies. With voiced consonants the situation is somewhere between these two. In Figure 3.2 the frequency axis is in kilohertz, but it is also quite common to use an auditory spectrogram, where the frequency axis is replaced with a Bark or Mel scale, which is normalized for hearing properties.
Fig. 3.2. Spectrogram and time-domain presentation of the Finnish word kaksi (two).
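To make the idea of a hearing-normalized frequency axis concrete, the sketch below converts between Hz and mel. The text does not say which mel formula is meant; the example assumes one widely used variant, mel = 2595 log10(1 + f / 700), and other definitions give somewhat different numbers. The function names are illustrative.

```python
import math

def hz_to_mel(freq_hz):
    """Hz to mel using the assumed 2595 * log10(1 + f / 700) variant."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel under the same assumption."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

# Telephone band edges and two formant values quoted in this chapter.
for f in (300, 1000, 2300, 3400):
    print(f, "Hz ->", round(hz_to_mel(f), 1), "mel")
```

On such an axis, equal steps in Hz take up less room at high frequencies than at low ones, so the low-frequency region where vowel energy is concentrated is stretched out.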
To determine the fundamental frequency, or pitch, of speech, a method called cepstral analysis may be used, for example (Cawley 1996, Kleijn et al. 1998). The cepstrum is obtained by first windowing the signal and computing its Discrete Fourier Transform (DFT), then taking the logarithm of the power spectrum, and finally transforming it back to the time domain by the Inverse Discrete Fourier Transform (IDFT). The procedure is shown in Figure 3.3.
Fig. 3.3. Cepstral analysis.
Cepstral analysis provides a method for separating the vocal tract information from the excitation. Thus the reverse transformation can be carried out to provide a smoother power spectrum, a process known as homomorphic filtering.
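The window, DFT, log power spectrum, IDFT chain can be sketched as below. This is a minimal illustration, not a production pitch tracker: it uses a naive O(N^2) DFT from the standard library, a Hamming window, an idealized impulse train as the "speech" frame, and a search range of 60 to 400 Hz. All of these, together with the function names and the small constant added before the logarithm, are assumptions made for the example.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (inverse when requested)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out

def real_cepstrum(frame):
    """Window -> DFT -> log power spectrum -> inverse DFT."""
    n = len(frame)
    window = [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1))
              for i in range(n)]                      # Hamming window
    spectrum = dft([s * w for s, w in zip(frame, window)])
    log_power = [math.log(abs(v) ** 2 + 1e-12) for v in spectrum]
    return [v.real for v in dft(log_power, inverse=True)]

def estimate_f0(frame, sample_rate_hz, f0_min_hz=60.0, f0_max_hz=400.0):
    """Pick the strongest cepstral peak inside the plausible pitch range."""
    cepstrum = real_cepstrum(frame)
    q_min = int(sample_rate_hz / f0_max_hz)   # shortest period, in samples
    q_max = int(sample_rate_hz / f0_min_hz)   # longest period, in samples
    peak = max(range(q_min, q_max + 1), key=lambda q: cepstrum[q])
    return sample_rate_hz / peak

fs = 8000
# Idealized voiced frame: one pulse every 80 samples, i.e. F0 = 100 Hz.
frame = [1.0 if n % 80 == 0 else 0.0 for n in range(512)]
f0_estimate = estimate_f0(frame, fs)
print(f0_estimate)
```

In a real speech frame the slowly varying vocal tract envelope shows up at low quefrencies and the pitch peak at the period of the excitation, which is what makes the separation described above possible. Keeping only the low-quefrency part before transforming back yields the smoothed spectrum.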
The fundamental frequency or intonation contour over the sentence is important for correct prosody and natural-sounding speech. The different contours are usually analyzed from natural speech in specific situations and with specific speaker characteristics and then applied to rules to generate the synthetic speech. The fundamental frequency contour can be viewed as the composite set of hierarchical patterns shown in Figure 3.4. The overall contour is generated by the superposition of these patterns (Sagisaka 1990). Methods for controlling the fundamental frequency contours are described later in Chapter 5.
Fig. 3.4. Hierarchical levels of fundamental frequency (Sagisaka 1990).
3.2 Speech Production
Human speech is produced by the vocal organs presented in Figure 3.5. The main energy source is the lungs with the diaphragm. When speaking, the airflow is forced through the glottis between the vocal cords and the larynx to the three main cavities of the vocal tract: the pharynx and the oral and nasal cavities. From the oral and nasal cavities the airflow exits through the mouth and nose, respectively. The V-shaped opening between the vocal cords, called the glottis, is the most important sound source in the vocal system. The vocal cords may act in several different ways during speech. Their most important function is to modulate the airflow by rapidly opening and closing, causing a buzzing sound from which vowels and voiced consonants are produced. The fundamental frequency of vibration depends on the mass and tension of the cords and is about 110 Hz, 200 Hz, and 300 Hz for men, women, and children, respectively. With stop consonants the vocal cords may move suddenly from a completely closed position, in which they cut the airflow completely, to a totally open position, producing a light cough or a glottal stop. On the other hand, with unvoiced consonants, such as /s/ or /f/, they may be completely open. An intermediate position may also occur with, for example, phonemes like /h/.
Fig. 3.5. The human vocal organs.
(1) Nasal cavity, (2) Hard palate, (3) Alveolar ridge, (4) Soft palate (Velum), (5) Tip of the tongue (Apex), (6) Dorsum, (7) Uvula, (8) Radix, (9) Pharynx, (10) Epiglottis, (11) False vocal cords, (12) Vocal cords, (13) Larynx, (14) Esophagus, and (15) Trachea.
The pharynx connects the larynx to the oral cavity. It has almost fixed dimensions, but its length may be changed slightly by raising or lowering the larynx at one end and the soft palate at the other end. The soft palate also isolates or connects the route from the nasal cavity to the pharynx. At the bottom of the pharynx are the epiglottis and the false vocal cords, which prevent food from reaching the larynx and isolate the esophagus acoustically from the vocal tract. The epiglottis, the false vocal cords, and the vocal cords are closed during swallowing and open during normal breathing.
The oral cavity is one of the most important parts of the vocal tract. Its size, shape, and acoustics can be varied by the movements of the palate, the tongue, the lips, the cheeks, and the teeth. The tongue in particular is very flexible: the tip and the edges can be moved independently, and the entire tongue can move forward, backward, up, and down. The lips control the size and shape of the mouth opening through which speech sound is radiated. Unlike the oral cavity, the nasal cavity has fixed dimensions and shape. Its length is about 12 cm and its volume about 60 cm³. The air stream to the nasal cavity is controlled by the soft palate.
From a technical point of view, the vocal system may be considered as a single acoustic tube between the glottis and the mouth. The glottally excited vocal tract may then be approximated as a straight pipe closed at the vocal cords, where the acoustical impedance Zg = ∞, and open at the mouth (Zm = 0). In this case the volume-velocity transfer function of the vocal tract is (Flanagan 1972, O'Shaughnessy 1987)
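$$V(\omega) = \frac{1}{\cos(\omega l / c)} \qquad (3.1)$$
where l is the length of the tube, ω is the radian frequency, and c is the sound velocity. The denominator is zero at frequencies Fi = ωi/2π (i = 1, 2, 3, ...), where
$$\frac{\omega_i l}{c} = (2i - 1)\,\frac{\pi}{2} \quad \text{and} \quad F_i = \frac{(2i - 1)\,c}{4l} \qquad (3.2)$$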
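As a numeric illustration of the uniform-tube model, the sketch below evaluates the resonance frequencies of a tube closed at one end and open at the other, assuming the standard quarter-wavelength result F_i = (2i - 1) c / (4 l) and a transfer function of the form 1 / cos(ω l / c). The sound velocity of 340 m/s and the function names are assumptions made for this example.

```python
import math

SPEED_OF_SOUND_CM_PER_S = 34000.0  # assumed c = 340 m/s, expressed in cm/s

def tube_resonances(length_cm, count=3, c=SPEED_OF_SOUND_CM_PER_S):
    """Resonances F_i = (2i - 1) * c / (4 * l) of a closed-open uniform tube."""
    return [(2 * i - 1) * c / (4.0 * length_cm) for i in range(1, count + 1)]

def volume_velocity_gain(freq_hz, length_cm, c=SPEED_OF_SOUND_CM_PER_S):
    """Lossless transfer function 1 / cos(omega * l / c) at one frequency."""
    omega = 2.0 * math.pi * freq_hz
    return 1.0 / math.cos(omega * length_cm / c)

print(tube_resonances(17.0))              # tube of typical adult length
print(tube_resonances(8.5))               # half the length doubles each value
print(volume_velocity_gain(250.0, 17.0))  # gain halfway up to the first resonance
```

Because the model is lossless, the gain grows without bound as the frequency approaches a resonance; a real vocal tract has losses, so its formant peaks are finite and have a bandwidth.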
If l = 17 cm, V(ω) is infinite at frequencies Fi = 500, 1500, 2500, ... Hz, which means resonances every 1 kHz starting at 500 Hz. If the length l is other than 17 cm, the frequencies Fi will be scaled by the factor 17/l. The vocal tract may also be approximated with two or three sections of tube where the areas of adjacent sections are quite different ...