Chinese mistakes in commercial speech synthesizers
Commercial unit-selection voices may sound pleasant, but they do make mistakes. If you use one for language learning, make sure it is not your only source. For example, Gradint has a function to alternate between different synthesizers on different repeats (it also has a syllable-based voice, which should at least be predictable).
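The idea of alternating between synthesizers on different repeats can be sketched as follows. This is only a minimal illustration and is not Gradint's actual code: the `Synth` type, the `alternate_synths` function and the two stand-in voices are all hypothetical names, and each "voice" here merely returns a label instead of producing audio.

```python
from itertools import cycle, islice
from typing import Callable, List, Sequence

# Hypothetical stand-in for a synthesizer: takes text, returns a label
# describing what would be spoken (a real one would produce audio).
Synth = Callable[[str], str]

def alternate_synths(text: str, synths: Sequence[Synth], repeats: int) -> List[str]:
    """Render `text` `repeats` times, rotating through `synths` in order."""
    return [synth(text) for synth in islice(cycle(synths), repeats)]

# Two made-up voices, for demonstration only.
voice_a: Synth = lambda text: "A:" + text
voice_b: Synth = lambda text: "B:" + text

schedule = alternate_synths("糖尿病", [voice_a, voice_b], 3)
```

With two voices and three repeats, the first and third repeats use the first voice and the second repeat uses the other, so a mistake made by one voice is not the only version the learner hears.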
To demonstrate the trouble with unit-selection voices for language learning, below are some example Chinese mistakes that I found, usually after just a few minutes of experimenting with each voice.
|Google Translate (2011-05, using SVOX Yun which is also used by Android)||继续学院||The 学 is only half-pronounced. It seems like they had a recording of a whole 学 but some program played only half of it. You can't really hear the '-ue' of the 'xue'.|
|糖尿病||'n' of 尿 unclear|
|深省||Google correctly says this is "shēn xǐng", but its voice incorrectly says "shēn shěng" (the voice must be using a smaller dictionary than the transcriber)|
|绝||somewhat unclear when spoken in isolation|
|Beijing Infoquick SinoVoice (2011-05; online trial no longer available)||用出来||The main word 用 could be clearer; at least 来 (and possibly 出) should be neutral tone (轻声) but isn't|
|iFlyTek InterPhonic / Bider SpeechPlus (free trial no longer available)||bao3zheng4, bian4ming2, fou3ren4, jia3ru2, mei3zhou1, mu4du3, many others (via CSSML pinyin markup)||Incorrect syllables spoken (I'd have thought pinyin gives better control but it doesn't)|
|Neospeech Hui (2011-05, rebranded as ReadSpeaker Mandarin Female in 2019)||糖尿病||'n' of 尿 unclear|
|奉公守法||first syllable unclear|
|ScanSoft (Nuance) MeiLing (also used by Nokia)||深省||省 spoken as shěng instead of xǐng; no way to add a dictionary entry to override it|
|地, 行 and many other ambiguous hanzi||Engine often gets the wrong reading (e.g. dì instead of de in many adverbs, xíng instead of háng in 十四行诗), no way to override (except sometimes by writing wrong hanzi)|
|邮编||编 pitch too low for the context|
|切合实际，对||际 in 切合实际 by itself is correctly pronounced jì, but when followed by |
|絶 (variant of 絕/绝), 説 (variant of 說/说) and others||completely skipped, with no indication that there is a missing character in the text|
|用户界面||界 sounds too much like 3rd tone instead of 4th tone|
|齁声||Pitch falls from B to E-flat. Some drop in pitch of tone 1 at the end of a phrase is acceptable, but an augmented fifth? (Compare 中东, 拼车, etc)|
|人文学||Faults on 文 (but not in 人文 by itself). Sounds better if incorrectly written as 人闻学.|
|撞击||击 sounds like a truncated neutral tone instead of tone 1|
|电脑及资讯科技||something like half a 个 is inserted before the 及|
|劫难||sounds more like jián'àn than jiénàn (it must be a coded exception to 难's usual nán pronunciation but it seems the syllable boundary is wrong)|
|耳闻||ěr sounds like èr|
|Microsoft Lili (couldn't test but heard a demo)||才||spoken as an unclear cǎi instead of cái (the old "MS Simplified Chinese" voice actually gets this one right but gets 央行 wrong)|
|Neospeech Lily (no longer sold separately but used by NextSpeak and ImTranslator 2011-05 without the lexicon access)||糖尿病||'n' of 尿 very unclear|
|yong4chu5lai5, zhuan3lai2zhuan3qu4 (via pinyin lexicon)||Incorrectly read as yòngchūlai, zhuǎilái... but OK if input as hanzi 用出来, 转来转去|
|chan3chu2 or 铲除||says chù instead of chú|
|shan4yong4 or 善用||shèn instead of shàn in pinyin; "n"s clipped in hanzi|
|li4bi4 or 利弊||sounds like bībì|
|you2bian1 or 邮编||biān pitch too low for the context|
|jia1de5fu1||spoken as jiādìfū (maybe it's being treated as 加的夫, which might be right but a pinyin override shouldn't try to guess what the pinyin should have been; what if it came from 家的夫?)|
|Loquendo Lisheng (2011; interactive demo no longer available)||mu4du3, mu4du4||both words seem to end in dù (the du3 sounds OK if it's the last thing in the sentence)|
|Apple Ting-ting (in OS 10.7)||乐||always spoken as yuè even in words like 快乐 and 乐意 when it should be lè (fixed before macOS 11.4; these and other dictionary mistakes, such as pó instead of fán in 繁体字, are forgivable because the voice can work reasonably well from pinyin)|
|mu4du3, mu4du4||Both "du" sounds seem incomplete|
|yue4du2||dú fails to rise in pitch|
|zhi1 di4||dì sounds too neutral ("fa2 zhi1 di4" is worse as this zhī is high by comparison)|
|jing4qi2li3||q sounds like x in this context|
|ming2 que4||què glitches in mid-syllable (it's OK when said in isolation)|
|jing1juan4||juan sounds like a garbled jue (can also sound like jue in contexts e.g. jing1juan4ming2)|
|chang3kai1||chǎng sounds like a tone 1 higher than the kāi; if doubled to 敞开敞开, the second chǎng is better but is almost a full third tone instead of a half|
|kou3 zheng1guo1||guo sounds almost like gua (zheng1guo1 by itself is better except the pitch falls nearly a major sixth)|
|cheng2qiang2 tan1ta1||q becomes like x, and the pitch drops at the end|
|qu3dai4||tones not clear|
|ying3 pian4||n dropped (better in context)|
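The table above writes test inputs in numbered pinyin (e.g. mu4du3) but describes what was heard in tone-marked pinyin (e.g. dù). For readers who want to compare the two notations, here is a rough sketch of a converter. It assumes lowercase input, tone 5 for the neutral tone, and `v` as a way of typing ü; the function names are my own for illustration, and it does not insert the apostrophes that standard pinyin sometimes needs between syllables.

```python
import re

# Tone-marked forms of each vowel for tones 1-4 (tone 5 is left unmarked).
TONE_MARKS = {
    "a": "āáǎà",
    "e": "ēéěè",
    "i": "īíǐì",
    "o": "ōóǒò",
    "u": "ūúǔù",
    "ü": "ǖǘǚǜ",
}

def mark_syllable(letters: str, tone: int) -> str:
    """Put the tone mark for `tone` on the appropriate vowel of one syllable."""
    letters = letters.replace("v", "ü")  # assumed input convention for ü
    if tone == 5:
        return letters  # neutral tone: no mark
    if "a" in letters:
        pos = letters.index("a")      # 'a' always takes the mark
    elif "e" in letters:
        pos = letters.index("e")      # then 'e'
    elif "ou" in letters:
        pos = letters.index("o")      # in 'ou' the 'o' takes it
    else:
        vowels = [i for i, ch in enumerate(letters) if ch in TONE_MARKS]
        if not vowels:
            return letters            # nothing to mark
        pos = vowels[-1]              # otherwise the last vowel
    return letters[:pos] + TONE_MARKS[letters[pos]][tone - 1] + letters[pos + 1:]

def numbered_to_marked(text: str) -> str:
    """Convert every letters+digit syllable in `text`; other text is kept."""
    return re.sub(
        r"([a-zü]+)([1-5])",
        lambda m: mark_syllable(m.group(1), int(m.group(2))),
        text,
    )
```

For example, `numbered_to_marked("mu4du3")` gives `mùdǔ` and `numbered_to_marked("jia1de5fu1")` gives `jiādefū`, which makes it easier to see where a voice's output (such as jiādìfū) departs from the input.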
All material © Silas S. Brown unless otherwise stated.
Android is a trademark of Google LLC.
Apple is a trademark of Apple Inc.
Baidu is a trademark of Baidu Online Network Technology (Beijing) Co. Ltd.
Google is a trademark of Google LLC.
Loquendo is a trademark of Loquendo S.p.A.
Microsoft is a registered trademark of Microsoft Corp.
ScanSoft and Nuance are trademarks of Nuance Communications, Inc.
Any other trademarks I mentioned without realising are trademarks of their respective holders.