Annotation and consistency detection for a Chinese dual-mode emotional speech database
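Cite this article: JING Shaoling, MAO Xia, CHEN Lijiang, ZHANG Nana. Annotation and consistency detection for a Chinese dual-mode emotional speech database [J]. Journal of Beijing University of Aeronautics and Astronautics, 2015, 41(10): 1925-1934.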
Authors: JING Shaoling (景少玲), MAO Xia (毛峡), CHEN Lijiang (陈立江), ZHANG Nana (张娜娜)
Affiliation (all four authors): School of Electronic and Information Engineering, Beijing University of Aeronautics and Astronautics, Beijing 100191
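Funding: Specialized Research Fund for the Doctoral Program of Higher Education (20121102130001); Fundamental Research Funds for the Central Universities (YWF-14-DZXY-015)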
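Abstract: To address the shortage of emotional speech databases that carry rich emotion annotation, a Chinese dual-mode emotional speech database containing both speech and electroglottograph (EGG) information was built, annotated, and checked for annotation consistency. First, detailed annotation rules and methods were drawn up according to the characteristics of the database, and five annotators labeled the database following those rules. Second, to safeguard annotation quality and to test whether the rules were complete, the annotators carried out a trial annotation before the formal one; the trial set contained 280 utterances (7 emotions × 2 speakers × 20 utterances). Finally, a consistency detection algorithm was designed to match the annotation rules. The results show that, within a time error tolerance of 5 ms, the five annotators' labels for the same utterances agree at a rate above 60% on average; when the tolerance is widened to 8 ms and 10 ms, average consistency rises by 5% and 8%, respectively. The experiment indicates that the five annotators understood the speech in a fairly consistent way, that the annotation rules are fairly complete, and that the quality of the emotional speech database is relatively high.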

Keywords: Chinese; dual-mode; emotional speech database; speech annotation; consistency detection
Received: 2014-12-08

Annotations and consistency detection for Chinese dual-mode emotional speech database
JING Shaoling,MAO Xia,CHEN Lijiang,ZHANG Nana.Annotations and consistency detection for Chinese dual-mode emotional speech database[J].Journal of Beijing University of Aeronautics and Astronautics,2015,41(10):1925-1934.
Authors:JING Shaoling  MAO Xia  CHEN Lijiang  ZHANG Nana
Abstract: To address the lack of emotional speech databases with rich emotion annotation, a Chinese dual-mode emotional speech database containing speech and electroglottograph (EGG) information was established, annotated, and checked for annotation consistency. First, detailed annotation rules and methods were designed according to the characteristics of the database, and 5 annotators labeled the database in accordance with these rules. Second, to ensure annotation quality and to test the completeness of the annotation rules, the annotators labeled a subset of utterances as a trial before the formal annotation; the trial material comprised 280 utterances (7 emotions × 2 speakers × 20 utterances). Finally, a consistency detection algorithm was designed to match the annotation rules. The results show that, within a time error tolerance of 5 ms, the annotation consistency among the 5 annotators for the same utterances exceeds 60% on average. When the tolerance is increased to 8 ms and 10 ms, consistency increases by 5% and 8% on average, respectively. The experiment indicates that the 5 annotators understood the speech fairly consistently, that the annotation rules are fairly complete, and that the quality of the emotional speech database is relatively high.
Keywords:Chinese  dual-mode  emotional speech database  speech annotation  consistency detection
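
The abstract reports consistency under time error tolerances of 5 ms, 8 ms, and 10 ms but does not spell out the detection algorithm. The following is only a minimal sketch of one plausible tolerance-based measure, not the authors' method. It assumes each annotation is a list of boundary times in milliseconds, matches boundaries one-to-one with a greedy scan, and averages a Dice-style agreement score over all annotator pairs. The function names and the sample times are hypothetical.

```python
from itertools import combinations


def matched_boundaries(a, b, tol_ms):
    """Greedily count one-to-one boundary matches within tol_ms (times in ms)."""
    a, b = sorted(a), sorted(b)
    i = j = matches = 0
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= tol_ms:
            # Close enough: pair these two boundaries and move past both.
            matches += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1  # a[i] is too early to match b[j] or anything after it
        else:
            j += 1  # b[j] is too early to match a[i] or anything after it
    return matches


def pairwise_consistency(a, b, tol_ms):
    """Dice-style agreement between two annotators: 2 * matches / total boundaries."""
    total = len(a) + len(b)
    if total == 0:
        return 1.0  # two empty annotations agree trivially
    return 2 * matched_boundaries(a, b, tol_ms) / total


def mean_consistency(annotations, tol_ms):
    """Average pairwise consistency over every pair of annotators."""
    pairs = list(combinations(annotations, 2))
    return sum(pairwise_consistency(a, b, tol_ms) for a, b in pairs) / len(pairs)


# Hypothetical boundary times (ms) from three annotators for one utterance.
annotations = [
    [100, 250, 400],
    [103, 258, 400],
    [96, 250, 409],
]

for tol in (5, 8, 10):
    print(f"tolerance {tol} ms: {mean_consistency(annotations, tol):.3f}")
```

In this toy data the score rises as the tolerance widens, which mirrors the direction of the trend the abstract reports; the numbers themselves are illustrative and unrelated to the paper's results.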
This article is indexed in CNKI, Wanfang Data, and other databases.