Similar articles
20 similar articles found (search time: 31 ms)
1.
State-of-the-art in speaker recognition   Cited by: 1 (self-citations: 0, other citations: 1)
Recent advances in speech technologies have produced new tools that can be used to improve the performance and flexibility of speaker recognition. While fingerprint or iris identification techniques offer few degrees of freedom or alternative methods, speech offers much more flexibility and several levels at which to perform recognition: the system can force the user to speak in a particular manner, different for each attempt to enter. With voice input, the system also has other degrees of freedom, such as the use of knowledge or codes that only the user knows, or dialectal and semantic traits that are difficult to forge. This paper offers an overview of the state of the art in speaker recognition, with special emphasis on the pros and cons and on current research lines, which include improved classification systems and the use of high-level information by means of probabilistic grammars. In conclusion, speaker recognition is far from being a technology in which all the possibilities have already been explored.

2.
Automatic speaker recognition systems have been largely dominated by acoustic-spectral-based systems, relying on proper modelling of the short-term vocal tract of speakers. However, there is scientific and intuitive evidence that speaker-specific information is embedded in the speech signal in multiple short- and long-term characteristics. In this work, a multilevel speaker recognition system combining acoustic, phonotactic, and prosodic subsystems is presented and assessed by blind submission to the NIST 2005 Speaker Recognition Evaluation.

3.
4.
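Railway level-crossing alarm and control system   Cited by: 4 (self-citations: 2, other citations: 2)
To ensure traffic safety at level crossings, a flashing-light and voice alarm system with crossing-barrier control is installed at unattended crossings. Using a microcontroller and wireless communication, the system switches on the warning flasher and the voice loudspeaker simultaneously and lowers the barrier when a train approaches the crossing; when the train leaves, it switches off the flasher and the loudspeaker and raises the barrier. After several microcontrollers and sensors were analysed and compared, a low-power MCS51-series microcontroller and a magnetoelectric sensor were selected for the system.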

5.
Software designers at Hill Air Force Base have developed a voice recognition and speech synthesis system (Voice Control) for use with the F-16 Analog Test Station Sustainment (FATSS) project. The Voice Control system is reliable and speaker independent, and has a total added hardware cost of under $50.00 per station. In contrast to traditional voice recognition systems, operator training is not required. OO-ALC has developed a general-purpose internal interface (Voice Control) to the Speech Recognition and Text-To-Speech engines provided by Microsoft. Voice Control can be accessed by any 32-bit Windows software that has Windows messaging capability. This capability is available to standard programming languages such as LabWindows CVI, Borland or Microsoft C/C++, and Visual Basic, and even to commercial packages such as Lotus Notes and Microsoft Word. Through Voice Control, the computer uses both video and voice prompts to request input from the operator. The operator can enter data and control the software flow by voice command or from the keyboard or mouse. The Voice Control system allows for dynamic specification of a grammar set, or legal set of commands. The use of a reduced grammar set greatly increases recognition accuracy. The computer voice enables the operator to focus his attention away from the computer screen, which is required for activities such as probing a circuit card and taking readings. When the operator takes readings, the computer echoes his voice entries to ensure reliable entry. With electronic tuning, speech synthesis allows the operator to hear the resulting reading, enabling him to focus on the circuit card instead of constantly turning his head to see the computer screen. This paper describes the capability and functionality of the Voice Control system.

6.
Editor's Note: Mr. Fowler was the banquet speaker at the Tri Service Radar Symposium held at West Point in 1977. We invited Mr. Fowler to share his speech with the AES readership by providing this paper. The material and style correspond closely to his oral presentation.

7.
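A speech-interactive Olympic information query system is proposed. Integrating speech recognition, intelligent human-computer interaction, and related technologies, the system allows information about past Olympic Games to be queried by voice. For speech recognition and question understanding, statistical and rule-based methods are combined, and domain knowledge and a restricted language grammar are exploited, which significantly improves the accuracy of both. Experimental results show that the proposed methods are effective.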

8.
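Research on implementing an air traffic control simulator with IBM ViaVoice technology   Cited by: 2 (self-citations: 0, other citations: 2)
This paper describes how the automated pilot position of an air traffic control simulator is implemented. Building on a study of speech recognition technology and on the IBM ViaVoice speech recognition system, and following the functional requirements of the simulator, the main control software was modified so that speech recognition is integrated with the controller training control system. The automated pilot position recognises continuous control instructions spoken by arbitrary (speaker-independent) controller trainees and drives the simulated radar display accordingly.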

9.
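Speech recognition in a network remote-control system for robotic surgery   Cited by: 1 (self-citations: 0, other citations: 1)
After a broad survey of domestic and international papers and books on speech recognition research and its applications, speech recognition technology and its algorithms are briefly introduced. Drawing on the parts of Microsoft's speech development kit that are relevant to this project, the paper describes the method and implementation steps for the speech recognition component of a network remote-control system for robotic surgery. Using the development tools Microsoft VC 6.0 and Microsoft Speech API 5.1, this component is discussed in detail through a concrete example.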

10.
A new class of techniques for multisensor fusion and target recognition is proposed using sequence comparison by dynamic programming and multiple model estimation. The objective is to fuse information on the kinematic state and “nonkinematic” signature of unclassified targets, assessing the joint likelihood of all observed events for recognition. Relationships are shown to previous efforts in pattern recognition and state estimation. This research applies “classical” speech processing-related and other sequence comparison methods to moving target recognition, extends the efforts of previous researchers through improved fusion with kinematic information, relates the proposed techniques to Bayesian theory, and applies parameter identification methods to target recognition for improved understanding of the subject in general. The proposed techniques are evaluated and compared with existing approaches using the method of generalized ambiguity functions, which leads to a form of Cramer-Rao lower bound for target recognition.
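
The abstract does not say which sequence comparison algorithm is used. As an illustration only, the sketch below assumes dynamic time warping (DTW), a classical dynamic-programming method from speech processing, and matches an observed signature sequence against stored templates. The function names, the absolute-difference cost, and the template data are all hypothetical and are not taken from the paper.

```python
import math


def dtw_distance(a, b):
    """Dynamic-time-warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local cost
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def classify(observed, templates):
    """Return the label of the template closest to `observed` under DTW."""
    return min(templates, key=lambda label: dtw_distance(observed, templates[label]))
```

With these definitions, `classify([0, 1, 1, 2], {"ramp": [0, 1, 2], "flat": [1, 1, 1]})` picks `"ramp"`, because warping lets the repeated sample align with the template at no cost. Fusion with kinematic state estimates, which the abstract emphasises, is outside the scope of this sketch.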

11.
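夏鹏, 杨特, 徐江, 王乐, 杨智春. Acta Aeronautica et Astronautica Sinica (《航空学报》), 2021, 42(7): 224452
A time-delay neural network is introduced into dynamic load identification. Combining the "memory" property of time-delay neural networks, the theory of causal finite impulse response (FIR) systems, and the principle by which vibration responses are solved, a time-domain method is proposed that identifies dynamic loads in reverse order using a time-delay neural network. A load identification verification experiment was carried out on a rudder-surface model structure subjected to random dynamic loads at two points. The results show that the root-mean-square errors between the identified and true load time series at the two excitation points are 0.6354 and 2.5437, the corresponding correlation coefficients are 0.9657 and 0.8262, and the power spectral density curves also agree well. The proposed method requires no structural dynamics model and achieves high identification accuracy.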
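
The abstract reports root-mean-square error and correlation coefficient without defining them. The sketch below assumes the standard definitions (RMSE over paired samples and the Pearson correlation coefficient); the function names are hypothetical, and the paper may normalise its metrics differently.

```python
import math


def rmse(identified, true):
    """Root-mean-square error between two equal-length time series."""
    if len(identified) != len(true) or not true:
        raise ValueError("series must be non-empty and of equal length")
    squared = sum((x - y) ** 2 for x, y in zip(identified, true))
    return math.sqrt(squared / len(true))


def correlation(identified, true):
    """Pearson correlation coefficient between two equal-length time series."""
    if len(identified) != len(true) or not true:
        raise ValueError("series must be non-empty and of equal length")
    n = len(true)
    mean_x = sum(identified) / n
    mean_y = sum(true) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(identified, true))
    sxx = sum((x - mean_x) ** 2 for x in identified)
    syy = sum((y - mean_y) ** 2 for y in true)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)
```

Under these definitions, RMSE carries the units of the load, so the two reported errors are only comparable if the two excitation loads have similar magnitudes; the correlation coefficient is scale-free.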

12.
13.
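高谦, 刘马宝, 岳开宪. Acta Aeronautica et Astronautica Sinica (《航空学报》), 2009, 30(7): 1203-1207
Based on an analysis of speech signal characteristics in the aircraft cockpit environment, a speech enhancement method is proposed that enhances speech class by class according to noise type and then cascades the per-class algorithms in a given order; the effect of the cascade order on the enhancement is also discussed. After separate algorithms were implemented for removing impulsive, periodic, and broadband noise, an interactive speech enhancement system was developed in the MATLAB computing environment according to this method. Cascade experiments with several algorithms on this platform achieved good speech enhancement.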

14.
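To improve the accuracy of computer continuous speech recognition, the efficient global heuristic search of genetic algorithms, an artificial-intelligence control technique, is applied and an optimisation method is explored. Extracted speech feature parameters are used to generate the initial speech population, and the language focus in the speech recognition kernel program serves as the operator of the genetic algorithm, so that speech types are matched by efficient global heuristic search. Continuous speech recognition accuracy improved by 7%, a bold attempt towards genuine human-computer dialogue.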

15.
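Following an analysis of traditional call centres and their shortcomings, a VoiceXML-based call centre platform is designed and implemented. The platform uses the open-source project OpenVXI as its VoiceXML interpreter and, by integrating voice cards from Dongjin (东进公司), implements speech synthesis, speech recognition, call control, and other functions. With this platform, developers can quickly and easily build call centres and various value-added voice services. The overall design and key technologies of the platform are described, followed by a test case and its results.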

16.
Nowadays many systems need a portable medium, such as a smart card, to store sensitive data. The information can be protected by the user with a Personal Identification Number (PIN) or through biometrics. Unfortunately, no smart card today can verify the biometric template internally; the task is performed in the terminal instead. The author has developed the algorithms and data structures needed to solve this problem and has created a smart card with user biometric authentication, based on an Open Platform smart card (in this case, a JavaCard). To achieve these results, different biometric techniques have been studied: speaker verification, hand geometry, and iris recognition. Experimental results are given to show the viability of the prototype developed.

17.
This paper analyzes the performance of a number of speech processing techniques that have been considered for single voice channel per carrier transmissions via satellite. This transmission mode is applicable to demand assignment systems, which make possible the most efficient loading of the satellite transponder in applications where the traffic at individual earth terminals is not sufficient to justify a substantial number of dedicated channels. The analysis basically determines the operating point for the satellite transponder which minimizes the required earth terminal G/T. Both analog and digital modulation techniques are considered, and the effects of syllabic companding and voice actuation of carriers are demonstrated. The results are presented in general analytic form, applicable to any modulation technique for which carrier-to-noise density ratio, channel width, and guardband width may be specified. Full carrier modulation is assumed. Curves are also furnished showing the required G/T and uplink per carrier EIRP as a function of the number of voice channels per transponder and the transponder gain. "CCIR/CCITT type" speech quality is assumed, together with a transponder EIRP and bandwidth typical of current domestic satellite configurations.

18.
Speech recognition features desired by air traffic controllers, such as the ability to use complex messages and address hundreds of individual aircraft, could not be implemented a decade ago, but these tasks became possible with improved speech recognition engines and an increase in processing power and memory. Speech recognition was a key element in the air traffic controller (ATC) workstation used to support a Controller-Pilot Data Link Communications (CPDLC) system. Our work, under the direction of the Avionics Engineering Center at Ohio University, was in support of the Federal Aviation Administration's (FAA) Runway Incursion Reduction Program (RIRP) and the National Aeronautics and Space Administration's (NASA) Runway Incursion Prevention System (RIPS) conducted at the Dallas-Fort Worth International Airport (DFW). This paper examines the challenges and opportunities of developing voice recognition software solutions in ATC workstations using multiple dialects and accents, complex and varied grammars and terminology, accuracy, hardware restrictions, and user-training procedures.

19.
A system has been developed to enhance the quality of mutilated speech. A standard spectrogram analysis of the damaged speech is performed. The speech is then resynthesized as a sum of computer-generated sinusoids whose amplitudes and phases are derived partly from the given mutilated speech signal and partly from rules based on known properties of normal speech. The sinusoids selected are only approximate harmonics of the glottal pitch and are selected by a nonlinear, noncausal set of rules to reduce the nonspeech components in the synthesized speech output. The system has been shown to increase the quality of the mutilated speech appreciably.
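
The resynthesis step described above, a sum of sinusoids with given amplitudes and phases, can be sketched as follows. This is a minimal illustration under stated assumptions: the `synthesize` function and its `(frequency, amplitude, phase)` tuple layout are hypothetical, and the rule-based selection of sinusoids that the abstract describes is not reproduced.

```python
import math


def synthesize(partials, sample_rate, n_samples):
    """Sum of sinusoids.

    partials: iterable of (frequency_hz, amplitude, phase_rad) tuples.
    Returns a list of n_samples output samples.
    """
    partials = list(partials)
    samples = []
    for t in range(n_samples):
        value = 0.0
        for freq, amp, phase in partials:
            # Each partial contributes amp * cos(2*pi*f*t/fs + phase).
            value += amp * math.cos(2.0 * math.pi * freq * t / sample_rate + phase)
        samples.append(value)
    return samples
```

For example, `synthesize([(200.0, 1.0, 0.0), (400.0, 0.5, 0.0)], 8000, 160)` produces 20 ms of a two-partial tone at an 8 kHz sampling rate. In a real system the partial parameters would vary from frame to frame; here they are held constant for simplicity.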

20.
Conclusions

X-ray variability is seen in all types of AGN, but large-amplitude (factor of 2) outbursts on short timescales (days) occur rarely, perhaps once every 100 days. There is no strong dependence of variability on luminosity, but radio-powerful AGN, particularly BL Lacs and OVV QSOs, do vary most. Sensitive detectors, such as the EXOSAT ME, have been able to detect variability of smaller amplitude (20%) and on shorter timescales (1 hour) than previous experiments, but this too is not common. There is very little evidence of spectral variability during changes in intensity, so it is very likely that such changes are total power variations and not artefacts of variable obscuration. The variability timescales imply that most Seyfert galaxies are emitting well below the Eddington limit. On efficiency considerations, only two observations of X-ray variability, those of the QSO 1525+227 and the BL Lac H0322+022, require exotic black hole models, relativistic beaming, or a change in the assumed value of H0. The most dramatic observation of variability so far reported, that of repeated variations on a timescale of 4000 seconds in NGC4051, is probably related to a hydrodynamical timescale in the accretion disc and encourages us to believe that, with future observations, our understanding of AGN may approach that of galactic X-ray sources.

Many Seyferts do have a canonical α = 0.7 spectral index, but it is becoming increasingly clear that a wide variety of spectral indices exist, both in Seyfert galaxies and in other classes of AGN. Both thermal and non-thermal emission mechanisms are tenable explanations for most of these spectra as, in general, the very high energy observations which could distinguish between the two are not available.

Timing observations rarely require relativistic beaming; however, the (low) observed X-ray fluxes of BL Lacs and OVV QSOs generally do. Reacceleration of particles on short timescales is necessary to explain the continuous infrared to X-ray spectra of BL Lacs.

The status of soft excesses in the low energy spectra of Seyfert galaxies which have canonical medium energy spectra is not clear. A separate soft component has been detected in EXOSAT observations of NGC4151, but this need not be associated with the nuclear continuum source. No SSS or EXOSAT observations definitely require such excesses. EXOSAT is, in principle, very sensitive to soft excesses, but the uncertainty in the Boron filter calibration and in the value of the galactic absorption at present limits precise determinations.

For many AGN, the absorbing column along the line of sight can be accounted for entirely by absorption in our own galaxy. In cases where a substantial absorbing column is detected, variations in the column are occasionally seen, but it is not yet clear whether these variations are due to bulk movements of obscuring material or increased photoionisation (warm absorbers). All observations of iron lines are consistent with fluorescence in a cold gas which probably surrounds the X-ray emitting region in a sphere or shell-type geometry, though (by Gauss' law) this need not necessarily lie immediately next to the central black hole.

Detailed observations of the time-variability of the complete X-ray to radio spectrum offer the best hope of further progress in this complex but interesting field.
