当前位置:文档之家› 外文翻译---说话人识别

外文翻译---说话人识别

外文翻译---说话人识别
外文翻译---说话人识别

附录A 英文文献

Speaker Recognition

By Judith A. Markowitz, J. Markowitz Consultants

Speaker recognition uses features of a person?s voice to identify or verify that person. It is a well-established biometric with commercial systems that are more than 10 years old and deployed non-commercial systems that are more than 20 years old. This paper describes how speaker recognition systems work and how they are used in applications.

1. Introduction

Speaker recognition (also called voice ID and voice biometrics) is the only human-biometric technology in commercial use today that extracts information from sound patterns. It is also one of the most well-established biometrics, with deployed commercial applications that are more than 10 years old and non-commercial systems that are more than 20 years old.

2. How do Speaker-Recognition Systems Work

Speaker-recognition systems use features of a person?s voice and speaking style to:

●attach an identity to the voice of an unknown speaker

●verify that a person is who she/ he claims to be

●separate one person?s voice from other voices in a multi-speaker

environment

The first operation is called speak identification or speaker recognition; the second has many names, including speaker verification, speaker authentication, voice verification, and voice recognition; the third is speaker separation or, in some situations, speaker classification. This papers focuses on speaker verification, the most highly commercialized of these technologies.

2.1 Overview of the Process

Speaker verification is a biometric technology used for determining whether the person is who she or he claims to be. It should not be confused with speech recognition, a non-biometric technology used for identifying what a person is saying. Speech recognition products are not designed to determine who is speaking.

Speaker verification begins with a claim of identity (see Figure A1). Usually, the claim entails manual entry of a personal identification number (PIN), but a growing number of products allow spoken entry of the PIN and use speech recognition to identify the numeric code. Some applications replace manual or spoken PIN entry with bank cards, smartcards, or the number of the telephone being used. PINS are also eliminated when a speaker-verification system contacts the user, an approach typical of systems used to monitor home-incarcerated criminals.

Figure A1.

Once the identity claim has been made, the system retrieves the stored voice sample (called a voiceprint) for the claimed identity and requests spoken input from the person making the claim. Usually, the requested input is a password. The newly input speech is compared with the stored voiceprint and the results of that comparison are measured against an acceptance/rejection threshold. Finally, the system accepts the speaker as the authorized user, rejects the speaker as an impostor, or takes another action determined by the application. Some systems report a confidence level or other score indicating how confident it about its decision.

If the verification is successful the system may update the acoustic information in the stored voiceprint. This process is called adaptation. Adaptation is an unobtrusive solution for keeping voiceprints current and is used by many commercial speaker verification systems.

2.2 The Speech Sample

As with all biometrics, before verification (or identification) can be performed the person must provide a sample of speech (called enrolment). The sample is used to create the stored voiceprint.

Systems differ in the type and amount of speech needed for enrolment and verification. The basic divisions among these systems are

●text dependent

●text independent

●text prompted

2.2.1 Text Dependent

Most commercial systems are text dependent.Text-dependent systems expect the speaker to say a pre-determined phrase, password, or ID. By controlling the words that are spoken the system can look for a close match with the stored voiceprint. Typically, each person selects a private password, although some administrators prefer to assign passwords. Passwords offer extra security, requiring an impostor to know the correct PIN and password and to have a matching voice. Some systems further enhance security by not storing a human-readable representation of the password.

A global phrase may also be used. In its 1996 pilot of speaker verification Chase Manhattan Bank used …Verification by Chemical Bank?. Global phrases avoid the problem of forgotten passwords, but lack the added protection offered by private passwords.

2.2.2 Text Independent

Text-independent systems ask the person to talk. What the person says is different every time. It is extremely difficult to accurately compare utterances that are totally different from each other - particularly in noisy environments or over poor telephone connections. Consequently, commercial deployment of text-independent

verification has been limited.

2.2.3 Text Prompted

Text-prompted systems (also called challenge response) ask speakers to repeat one or more randomly selected numbers or words (e.g. “43516”, “27,46”, or “Friday, c omputer”). Text prompting adds time to enrolment and verification, but it enhances security against tape recordings. Since the items to be repeated cannot be predicted, it is extremely difficult to play a recording. Furthermore, there is no problem of forgetting a password, even though the PIN, if used, may still be forgotten.

2.3 Anti-speaker Modelling

Most systems compare the new speech sample with the stored voiceprint for the claimed identity. Other systems also compare the newly input speech with the voices of other people. Such techniques are called anti-speaker modelling. The underlying philosophy of anti-speaker modelling is that under any conditions a voice sample from a particular speaker will be more like other samples from that person than voice samples from other speakers. If, for example, the speaker is using a bad telephone connection and the match with the speaker?s voiceprint is poor, it is likely that the scores for the cohorts (or world model) will be even worse.

The most common anti-speaker techniques are

●discriminate training

●cohort modeling

●world models

Discriminate training builds the comparisons into the voiceprint of the new speaker using the voices of the other speakers in the system. Cohort modelling selects a small set of speakers whose voices are similar to that of the person being enrolled. Cohorts are, for example, always the same sex as the speaker. When the speaker attempts verification, the incoming speech is compared with his/her stored voiceprint and with the voiceprints of each of the cohort speakers. World models (also called background models or composite models) contain a cross-section of voices. The same world model is used for all speakers.

2.4 Physical and Behavioural Biometrics

Speaker recognition is often characterized as a behavioural biometric. This description is set in contrast with physical biometrics, such as fingerprinting and iris scanning. Unfortunately, its classification as a behavioural biometric promotes the misunderstanding that speaker recognition is entirely (or almost entirely) behavioural. If that were the case, good mimics would have no difficulty defeating speaker-recognition systems. Early studies determined this was not the case and identified mimic-resistant factors. Those factors reflect the size and shape of a speaker?s speaking mechanism (called the vocal tract).

The physical/behavioural classification also implies that performance of physical biometrics is not heavily influenced by behaviour. This misconception has led to the design of biometric systems that are unnecessarily vulnerable to careless and resistant users. This is unfortunate because it has delayed good human-factors design for those biometrics.

3. How is Speaker Verification Used?

Speaker verification is well-established as a means of providing biometric-based security for:

●telephone networks

●site access

●data and data networks

and monitoring of:

●criminal offenders in community release programmes

●outbound calls by incarcerated felons

●time and attendance

3.1 Telephone Networks

Toll fraud (theft of long-distance telephone services) is a growing problem that costs telecommunications services providers, government, and private industry US$3-5 billion annually in the United States alone. The major types of toll fraud include the following:

●Hacking CPE

●Calling card fraud

●Call forwarding

●Prisoner toll fraud

●Hacking 800 numbers

●Call sell operations

●900 number fraud

●Switch/network hits

●Social engineering

●Subscriber fraud

●Cloning wireless telephones

Among the most damaging are theft of services from customer premises equipment (CPE), such as PBXs, and cloning of wireless telephones. Cloning involves stealing the ID of a telephone and programming other phones with it. Subscriber fraud, a growing problem in Europe, involves enrolling for services, usually under an alias, with no intention of paying for them.

Speaker verification has two features that make it ideal for telephone and telephone network security: it uses voice input and it is not bound to proprietary hardware. Unlike most other biometrics that need specialized input devices, speaker verification operates with standard wireline and/or wireless telephones over existing telephone networks. Reliance on input devices created by other manufacturers for a purpose other than speaker verification also means that speaker verification cannot expect the consistency and quality offered by a proprietary input device. Speaker verification must overcome differences in input quality and the way in which speech frequencies are processed. This variability is produced by differences in network type (e.g. wireline v wireless), unpredictable noise levels on the line and in the background, transmission inconsistency, and differences in the microphone in telephone handset. Sensitivity to such variability is reduced through techniques such as speech enhancement and noise modelling, but products still need to be tested under expected conditions of use.

Applications of speaker verification on wireline networks include secure calling cards, interactive voice response (IVR) systems, and integration with security for

proprietary network systems. Such applications have been deployed by organizations as diverse as the University of Maryland, the Department of Foreign Affairs and International Trade Canada, and AMOCO. Wireless applications focus on preventing cloning but are being extended to subscriber fraud. The European Union is also actively applying speaker verification to telephony in various projects, including Caller Verification in Banking and Telecommunications, COST250, and Picasso.

3.2 Site access

The first deployment of speaker verification more than 20 years ago was for site access control. Since then, speaker verification has been used to control access to office buildings, factories, laboratories, bank vaults, homes, pharmacy departments in hospitals, and even access to the US and Canada. Since April 1997, the US Department of Immigration and Naturalization (INS) and other US and Canadian agencies have been using speaker verification to control after-hours border crossings at the Scobey, Montana port-of-entry. The INS is now testing a combination of speaker verification and face recognition in the commuter lane of other ports-of-entry.

3.3 Data and Data Networks

Growing threats of unauthorized penetration of computing networks, concerns about security of the Internet, and increases in off-site employees with data access needs have produced an upsurge in the application of speaker verification to data and network security.

The financial services industry has been a leader in using speaker verification to protect proprietary data networks, electronic funds transfer between banks, access to customer accounts for telephone banking, and employee access to sensitive financial information. The Illinois Department of Revenue, for example, uses speaker verification to allow secure access to tax data by its off-site auditors.

3.4 Corrections

In 1993, there were 4.8 million adults under correctional supervision in the United States and that number continues to increase. Community release programmes, such as parole and home detention, are the fastest growing segments of this industry. It is no longer possible for corrections officers to provide adequate monitoring of

those people.

In the US, corrections agencies have turned to electronic monitoring systems. Since the late 1980s speaker verification has been one of those electronic monitoring tools. Today, several products are used by corrections agencies, including an alcohol breathalyzer with speaker verification for people convicted of driving while intoxicated and a system that calls offenders on home detention at random times during the day.

Speaker verification also controls telephone calls made by incarcerated felons. Inmates place a lot of calls. In 1994, US telecommunications services providers made $1.5 billion on outbound calls from inmates. Most inmates have restrictions on whom they can call. Speaker verification ensures that an inmate is not using another inmate?s PIN to make a forbidden contact.

3.5 Time and Attendance

Time and attendance applications are a small but growing segment of the speaker-verification market. SOC Credit Union in Michigan has used speaker verification for time and attendance monitoring of part-time employees for several years. Like many others, SOC Credit Union first deployed speaker verification for security and later extended it to time and attendance monitoring for part-time employees.

4. Standards

This paper concludes with a short discussion of application programming interface (API) standards. An API contains the function calls that enable programmers to use speaker-verification to create a product or application. Until April 1997, when the Speaker Verification API (SV API) standard was introduced, all available APIs for biometric products were proprietary. SV API remains the only API standard covering a specific biometric. It is now being incorporated into proposed generic biometric API standards. SV API was developed by a cross-section of speaker-recognition vendors, consultants, and end-user organizations to address a spectrum of needs and to support a broad range of product features. Because it supports both high level functions (e.g. calls to enrol) and low level functions (e.g. choices of audio input features) it

facilitates development of different types of applications by both novice and experienced developers.

Why is it important to support API standards? Developers using a product with a proprietary API face difficult choices if the vendor of that product goes out of business, fails to support its product, or does not keep pace with technological advances. One of those choices is to rebuild the application from scratch using a different product. Given the same events, developers using a SV API-compliant product can select another compliant vendor and need perform far fewer modifications. Consequently, SV API makes development with speaker verification less risky and less costly. The advent of generic biometric API standards further facilitates integration of speaker verification with other biometrics. All of this helps speaker-verification vendors because it fosters growth in the marketplace. In the final analysis active support of API standards by developers and vendors benefits everyone.

附录B 中文翻译

说话人识别

作者:Judith A. Markowitz, J. Markowitz Consultants 说话人识别是用一个人的语音特征来辨认或确认这个人。有着10多年的商业系统和超过20年的非商业系统部署,它是一种行之有效的生物测定学。本文介绍了说话人识别系统的工作原理,以及它们在应用软件中如何被使用。

1. 介绍

说话人识别(也叫语音身份和语音生物测定学)是当今从声音模式提取信息的商业应用中唯一的人类生物特征识别技术。有着10多年的商业应用程序部署和超过20年的非商业系统,它也是最行之有效的生物测定学之一。

2. 说话人识别系统如何工作

说话人识别系统使用一个人的语音和说话风格来达到以下目的:

●为一个未知说话人的声音绑定一个身份

●确认一个人是他/她所宣称的

●在多说话人的环境中从其它的声音中区分出每一特定人的声音

第一个操作被称为说话人辨认或说话人识别;第二个有许多名字,包括说话人确认,说话人鉴定,声音确认和声音识别;第三个是说话人分离,某些情形下也叫说话人分类。本文着重这些技术中最高度商业化的说话人确认。

2.1 方法概览

说话人确认是决定一个人是否是他或她所宣称身份的一种生物测定技术。它不应同语音识别相混淆。后者是一种用来确定一个人说什么的非生物测定技术。语音识别产品不是被设计用来确定谁在发言的。

说话人确认以一个身份声明开始(见图B1)。通常情况下,声明需要手工输入个人识别码( PIN ) 但越来越多的产品允许发言输入密码并使用语音识别确定数字代码。一些应用程序用银行卡,智能卡,或使用中的电话号码取代个人识别码的手

动或语音输入。当一个说话人确认系统联系用户时,个人识别码也会被取消,一个典型的这种系统被用来监测在家服刑的罪犯。

用户:声明一个身份

系统:访问该身份的存储声纹

系统:提示用户输入密码

用户:说出密码

系统:比较密码和存储样本

系统:比较结果和阈值

系统:接受或拒绝身份声明

图B1

一旦身份声明被做出,系统会取回声明身份的存储语音样本(叫做声纹)并要求声明用户的语音输入。通常,要求的输入是一个密码。最新输入的语音同存储的声纹相比较,比较的结果用一个接受/拒绝的阈值进行衡量。最终,系统接受说话人为授权用户,或拒绝说话人为冒名顶替者,或做出应用程序定义的其它动作。一些系统报告一个可信度或其它评分来说明它的决定的可信程度。

如果确认成功,系统可能升级存储声纹的声学信息。这个过程叫做适应。适应是用来保持声纹正确性的一种稳妥的解决方案。它在许多商用说话人确认系统中被使用。

2.2 语音样本

同所有的生物认证一样,在确认(或辨认)可以被执行之前,一个语音样本必须被提供(这个过程也叫做登记)。这个样本被用来生成存储声纹。

在需要登记和确认的语音类型和数量方面,系统之间有区别。这些系统的基本分类是:

●文本相关

●文本无关

●文本提示型

2.2.1 文本相关

大部分的商业系统都是文本相关的。文本相关的系统期待用户说出事先定义好的词组、密码或者标识符。通过对被说出单词的控制,系统可以从存储的声纹中找出最为匹配的一个。一个典型的例子,每个用户可以选择一个私有的密码,尽管一些管理员更喜欢分配密码。因为冒名顶替者需要同时知道正确的个人身份号码和密码并且还要拥有一个相匹配的声音,所以密码提供了额外的安全性。有些系统通过不存储密码的人类可读性信息来进一步提高安全性。

通用短语也可以被使用。在1996年的说话人确认试验中,大通曼哈顿银行使用了“化学银行确认”。通用短语避免了忘记密码的问题,但是缺乏私有密码所提供的额外保护。

2.2.2 文本无关

文本无关的系统要求用户说话。该用户每次说的内容是不同的。精确的匹配完全不同的语音是非常困难的,尤其是在高噪音环境下或者非常差的电话连接中。因此,文本无关确认的商业化部署受到限制。

2.2.3 文本提示型

文本提示系统(也叫做口令应答)要求说话人重复一个或多个随机选择的数字或单词(例如“43516”、“27、46”或者“星期五、计算机”)。文本提示增加了登记和确认的时间,但是它提高了针对磁带录音的安全性。由于重述的条目不能被预测到,播放录音是非常困难的。此外,这里没有忘记密码的问题。即使是使用个人身份号码,它也可能被遗忘掉。

2.3 反说话人模型

大部分系统把新的语音样本同要求身份的存储声纹进行比较。另一些系统也把最近输入的语音同其它人的声音相比较。这种技术被叫做反说话人模型。反说话人模型的基本原理是在任何条件下,来自某一特定说话人的语音样本比起其它说话人的语音样本总是更像这个说话人的其它样本。例如,如果说话人使用一个差的电话连接并且这个说话人的声纹匹配也很差,很有可能同期组群(或世界模型)的得分会更差。

最常见的反说话人技术有:

●区别训练

●同期组群模型

●世界模型

区别训练在系统中建立了使用其它说话人声音的新说话人的声纹对照。同期组群模型挑选少数说话人。他们的声音与已登记人类似。例如,同期组群通常是相同性别的说话人。当说话人试图确认时,进入的语音与他/她的声纹及其每一个同期组群说话人的声纹进行比较。世界模型(又称背景模式或复合模式) 包含一个语音的横截面断片。同一个世界模型被用于所有的说话人。

2.4 物理和行为生物测定学

说话人识别通常表现为行为生物测定学的特征。这样的描述是设定在与物理生物测定学的对照中的,例如指纹,虹膜扫描。不幸的是,其作为行为生物测定学的分类促进了说话人识别被认为是完全(或者几乎完全)是行为性的误解。如果是那样的话,好的模仿者会毫无困难地击败说话人识别系统。早期的研究决定了事实并非如此。它们确定了模仿抵抗因素。这些因素反映了说话人发音器官(叫做声道)的大小和形状。

物理/行为的分类也暗示了物理生物测定学的性能不会受到很强的行为影响。这种误解曾导致不必要地易受粗心、有抵抗力的用户攻击的生物测定系统的设计。这是不幸的,因为它延缓了用于那些生物测定的好的人性因素设计。

3 说话人确认如何使用

说话人确认是一种行之有效的生物型安全手段。它常用于:

●电话网络

●站点访问

●数据和数据网络

此外,也用于以下情况的监测:

●罪犯的社区释放方案

●在押重犯的外拨电话

●时间和出勤

3.1 电话网络

收费欺诈(盗用长途电话服务)是一个日益严重的问题,仅在美国它每年花费电讯服务供应商、政府与私营行业3-5亿美元。主要的收费欺诈类型包括以下几种:

●黑客终端

●电话卡诈骗

●呼叫促进

●囚犯收费欺诈

●黑客800号码

●电话业务出售

●900号码欺骗

●交换机/网络攻击

●社会操纵

●欺诈订户

●克隆无线电话

其中最具破坏性的是客户端设备服务盗取,例如专用分组交换机和无线电话的克隆。克隆包括电话号码的盗取并用它编程其它话机。在欧洲,订户欺诈是一个日益严重的问题。它涉及通常化名的服务登记,使用者无意为化名支付费用。

说话人确认有两个特征使它非常适用于电话和电话网络安全:它使用输入的语音而无需进入私人的硬件。不像其它的生物测定需要特殊的输入设备,说话人识别可以在现有电话网络上的有线及/或无线电话上运转。输入设备制造厂商的目的不是说话人确认。依靠他们制造的输入设备意味着不能指望依靠一个专有输入装置来获得稳固性和质量。说话人识别必须克服输入设备和语音频率处理方式上的困难。可变性是由不同的网络类型(例如有线和无线),线路和环境中不可预知的噪音水平,传输不一致以及电话听筒的麦克风不同所引起。这种可变性的灵敏度可以通过类似语音增强和噪音模型的技术来减弱,但是产品仍旧需要在期待的使用环境下进行测试。

有线网络上的说话人识别应用包括安全呼叫卡,互动声讯系统和专有网络体系的安全整合。这种应用已经在马里兰大学、加拿大外交商贸部和美国石油公司多种组织中配置起来。无线应用的重点在于防止克隆,但目前正扩展至订户欺诈。欧盟也积极地将说话人识别运用到各种项目的电话业务中,其中包括银行和电信业的呼叫者确认系统,COST250系统和毕加索系统。

3.2站点访问

20多年前,第一个说话人确认系统的部署是用于站点访问控制的。从那时起,说话人确认已经被用于办公楼、工厂、实验室、银行保险箱、住宅、医院药剂部门,甚至进入美国和加拿大的访问控制。从1997年起,美国移民局(INS)和其他美国和加拿大的机构已经使用说话人确认来控制斯克比下班后的边境口岸和蒙大拿州的入境港。美国移民局正在其它入境港的通勤线测试一个说话人确认和人脸识别的联合系统。

3.3 数据和数据网络

日益增长的涉及到互联网安全的未经授权的计算机网络渗透威胁和有数据访问需求的场外雇员的增加已经导致数据和网络安全的说话人确认应用的高潮。

在使用说话人确认保护专有数据网络,电子资金银行转帐,电话银行的客户账户访问和雇员访问敏感金融信息方面,金融服务行业一直处于领先地位。例如,伊利诺斯州税务部使用说话人确认允许它的场外审计员对税务数据进行安全访问。3.4 惩教

1993年,美国共有480万成年人处于惩教监管之下并且这个数字仍在继续增加。社区释放方案,如假释和家庭拘留,是这个行业增长最快的部分。狱警为这些人提供足够的监控已经不再可能了。

在美国,惩教机构已经转向电子监控系统。从二十世纪八十年代后期开始,说话人确认已经成为那些电子监控工具中的一种。今天,包括用于酒后驾驶者的说话人确认酒精检测器和一个在白天随机时间呼叫家庭拘留罪犯的系统在内的好几种产品被惩教机构使用。

说话人确认也可以控制在押重犯的电话呼叫。囚犯的地方有很多电话。1994年,美国电信服务供应商从监狱对外呼叫中得到15亿美元。大部分犯人在呼叫对象上有限制。说话人确认确保犯人不使用另一位同室者的个人身份号码获得禁止的联系。

3.5 时间和出勤

时间和出勤应用是说话人确认市场中很小但持续增长的一部分。密歇根州的SOC信用合作社已经将说话人确认用于兼职员工的时间和出勤监测好几年了。和其它机构一样,SOC信用合作社首先配置说话人识别用于安全,之后扩展到兼职员工的时间和出勤监测。

4. 标准

本文用简短的讨论总结应用程序接口(API)标准。一个应用程序接口包含一个函数,程序员能够调用它生成一个说话人确定的产品或应用。直到1997年四月,说话人识别应用程序接口(SVAPI)才被提出。在此之前,所有可得到的生物测定产品的应用程序接口都是私有的。说话人识别应用程序接口仍旧是唯一涵盖一种具体生物测定学的应用程序接口标准。它现已被纳入拟议的生物测定学通用应用程序接口标准。说话人识别应用程序接口是由一个跨部门的说话人识别供应商,顾问和终端用户组织发展的,用于解决一系列的需求并支持一系列广泛的产品特色。因为既支持高层次的功能(如呼叫登记)又支持低层次的功能(如选择音频输入特点),它有利于被新手和有经验的开发商发展出不同类型的应用。

为什么支持应用程序接口标准非常重要呢?如果产品厂商生意倒闭、不再支持该产品或者没有跟上技术进步,那么使用私有应用程序接口产品的开发者就会面临困难的选择。其中的一个选择就是使用不同的产品从零开始重建应用程序。相同的情况下,使用兼容说话人识别应用程序接口产品的开发人员可以选择另一个兼容厂商,从而只需要作少得多的修改。因此,说话人识别应用程序接口使说话人确认的开发更小风险并更少费用。通用生物测定应用程序接口标准的出现使说话人确认同其它生物测定学的结合更加便利。所以的这些都是对说话人确认厂商有利的因为培养了市场的增长。归根究柢,开发者和厂商对应用程序接口标准的积极支持有益于每一个人。

机器人外文翻译

英文原文出自《Advanced Technology Libraries》2008年第5期 Robot Robot is a type of mechantronics equipment which synthesizes the last research achievement of engine and precision engine, micro-electronics and computer, automation control and drive, sensor and message dispose and artificial intelligence and so on. With the development of economic and the demand for automation control, robot technology is developed quickly and all types of the robots products are come into being. The practicality use of robot products not only solves the problems which are difficult to operate for human being, but also advances the industrial automation program. At present, the research and development of robot involves several kinds of technology and the robot system configuration is so complex that the cost at large is high which to a certain extent limit the robot abroad use. To development economic practicality and high reliability robot system will be value to robot social application and economy development. With the rapid progress with the control economy and expanding of the modern cities, the let of sewage is increasing quickly: With the development of modern technology and the enhancement of consciousness about environment reserve, more and more people realized the importance and urgent of sewage disposal. Active bacteria method is an effective technique for sewage disposal,The lacunaris plastic is an effective basement for active bacteria adhesion for sewage disposal. The abundance requirement for lacunaris plastic makes it is a consequent for the plastic producing with automation and high productivity. Therefore, it is very necessary to design a manipulator that can automatically fulfill the plastic holding. With the analysis of the problems in the design of the plastic holding manipulator and synthesizing the robot research and development condition in recent years, a economic scheme is concluded on the basis of the analysis of mechanical configuration, transform system, drive device and control system and guided by the idea of the characteristic and complex of mechanical configuration,

人形机器人论文中英文资料对照外文翻译

中英文资料对照外文翻译 最小化传感级别不确定性联合策略的机械手控制 摘要:人形机器人的应用应该要求机器人的行为和举止表现得象人。下面的决定和控制自己在很大程度上的不确定性并存在于获取信息感觉器官的非结构化动态环境中的软件计算方法人一样能想得到。在机器人领域,关键问题之一是在感官数据中提取有用的知识,然后对信息以及感觉的不确定性划分为各个层次。本文提出了一种基于广义融合杂交分类(人工神经网络的力量,论坛渔业局)已制定和申请验证的生成合成数据观测模型,以及从实际硬件机器人。选择这个融合,主要的目标是根据内部(联合传感器)和外部( Vision 摄像头)感觉信息最大限度地减少不确定性机器人操纵的任务。目前已被广泛有效的一种方法论就是研究专门配置5个自由度的实验室机器人和模型模拟视觉控制的机械手。在最近调查的主要不确定性的处理方法包括加权参数选择(几何融合),并指出经过训练在标准操纵机器人控制器的设计的神经网络是无法使用的。这些方法在混合配置,大大减少了更快和更精确不同级别的机械手控制的不确定性,这中方法已经通过了严格的模拟仿真和试验。 关键词:传感器融合,频分双工,游离脂肪酸,人工神经网络,软计算,机械手,可重复性,准确性,协方差矩阵,不确定性,不确定性椭球。 1 引言 各种各样的机器人的应用(工业,军事,科学,医药,社会福利,家庭和娱乐)已涌现了越来越多产品,它们操作范围大并呢那个在非结构化环境中运行 [ 3,12,15]。在大多数情况下,如何认识环境正在发生变化且每个瞬间最优控制机器人的动作是至关重要的。移动机器人也基本上都有定位和操作非常大的非结构化的动态环境和处理重大的不确定性的能力[ 1,9,19 ]。每当机器人操作在随意性自然环境时,在给定的工作将做完的条件下总是存在着某种程

人脸识别论文文献翻译中英文

人脸识别论文中英文 附录(原文及译文) 翻译原文来自 Thomas David Heseltine BSc. Hons. The University of York Department of Computer Science For the Qualification of PhD. -- September 2005 - 《Face Recognition: Two-Dimensional and Three-Dimensional Techniques》 4 Two-dimensional Face Recognition 4.1 Feature Localization Before discussing the methods of comparing two facial images we now take a brief look at some at the preliminary processes of facial feature alignment. This process typically consists of two stages: face detection and eye localisation. Depending on the application, if the position of the face within the image is known beforehand (for a cooperative subject in a door access system for example) then the face detection stage can often be skipped, as the region of interest is already known. Therefore, we discuss eye localisation here, with a brief discussion of face detection in the literature review(section 3.1.1). The eye localisation method is used to align the 2D face images of the various test sets used throughout this section. However, to ensure that all results presented are representative of the face recognition accuracy and not a product of the performance of the eye localisation routine, all image alignments are manually checked and any errors corrected, prior to testing and evaluation. We detect the position of the eyes within an image using a simple template based method. A training set of manually pre-aligned images of faces is taken, and each image cropped to an area around both eyes. The average image is calculated and used as a template. Figure 4-1 - The average eyes. Used as a template for eye detection. Both eyes are included in a single template, rather than individually searching for each eye in turn, as the characteristic symmetry of the eyes either side of the nose, provides a useful feature that helps distinguish between the eyes and other false positives that may be picked up in the background. Although this method is highly susceptible to scale(i.e. subject distance from the

工业机器人外文翻译

附录外文文献 原文 Industrial Robots Definition “A robot is a reprogrammable,multifunctional machine designed to manipulate materials,parts,tools,or specialized devices,through variable programmed motions for the performance of a variety of tasks.” --Robotics Industries Association “A robot is an automatic device that performs functions normally ascribrd to humans or a machine in orm of a human.” --Websters Dictionary The industrial robot is used in the manufacturing environment to increase productivity . It can be used to do routine and tedious assembly line jobs , or it can perform jobs that might be hazardous to do routine and tedious assembly line jobs , or it can perform jobs that might be hazardous to the human worker . For example , one of the first industrial robots was used to replace the nuclear fuel rods in nuclear power plants . A human doing this job might be exposed to harmful amounts of radiation . The industrial robot can also operate on the assembly line , putting together small components , such as placing electronic components on a printed circuit board . Thus , the human worker can be relieved of the routine operation of this tedious task . Robots can also be programmed to defuse bombs , to serve the handicapped , and to perform functions in numerous applications in our society . The robot can be thought of as a machine that will move an end-of-arm tool , sensor , and gripper to a preprogrammed location . When the robot arrives at this location , it will perform some sort of task . This task could be welding , sealing , machine loading , machine unloading , or a host of assembly jobs . Generally , this work can be accomplished without the involvement of a human being , except for programming and for turning the system on and off . The basic terminology of robotic systems is introduced in the following :

机器人结构论文中英文对照资料外文翻译文献

中英文对照资料外文翻译文献 FEM Optimization for Robot Structure Abstract In optimal design for robot structures, design models need to he modified and computed repeatedly. Because modifying usually can not automatically be run, it consumes a lot of time. This paper gives a method that uses APDL language of ANSYS 5.5 software to generate an optimal control program, which mike optimal procedure run automatically and optimal efficiency be improved. 1)Introduction Industrial robot is a kind of machine, which is controlled by computers. Because efficiency and maneuverability are higher than traditional machines, industrial robot is used extensively in industry. For the sake of efficiency and maneuverability, reducing mass and increasing stiffness is more important than traditional machines, in structure design of industrial robot. A lot of methods are used in optimization design of structure. Finite element method is a much effective method. In general, modeling and modifying are manual, which is feasible when model is simple. When model is complicated, optimization time is longer. In the longer optimization time, calculation time is usually very little, a majority of time is used for modeling and modifying. It is key of improving efficiency of structure optimization how to reduce modeling and modifying time. APDL language is an interactive development tool, which is based on ANSYS and is offered to program users. APDL language has typical function of some large computer languages. For example, parameter definition similar to constant and variable definition, branch and loop control, and macro call similar to function and subroutine call, etc. Besides these, it possesses powerful capability of mathematical calculation. The capability of mathematical calculation includes arithmetic calculation, comparison, rounding, and trigonometric function, exponential function and hyperbola function of standard FORTRAN language, etc. By means of APDL language, the data can be read and then calculated, which is in database of ANSYS program, and running process of ANSYS program can be controlled.

图像检测外文翻译参考文献

图像检测外文翻译参考文献(文档含中英文对照即英文原文和中文翻译)

译文 基于半边脸的人脸检测 概要:图像中的人脸检测是人脸识别研究中一项非常重要的研究分支。为了更有效地检测图像中的人脸,此次研究设计提出了基于半边脸的人脸检测方法。根据图像中人半边脸的容貌或者器官的密度特征,比如眼睛,耳朵,嘴巴,部分脸颊,正面的平均全脸模板就可以被构建出来。被模拟出来的半张脸是基于人脸的对称性的特点而构建的。图像中人脸检测的实验运用了模板匹配法和相似性从而确定人脸在图像中的位置。此原理分析显示了平均全脸模型法能够有效地减少模板的局部密度的不确定性。基于半边脸的人脸检测能降低人脸模型密度的过度对称性,从而提高人脸检测的速度。实验结果表明此方法还适用于在大角度拍下的侧脸图像,这大大增加了侧脸检测的准确性。 关键词:人脸模板,半边人脸模板,模板匹配法,相似性,侧脸。 I.介绍 近几年,在图像处理和识别以及计算机视觉的研究领域中,人脸识别是一个很热门的话题。作为人脸识别中一个重要的环节,人脸检测也拥有一个延伸的研究领域。人脸检测的主要目的是为了确定图像中的信息,比如,图像总是否存在人脸,它的位置,旋转角度以及人脸的姿势。根据人脸的不同特征,人脸检测的方法也有所变化[1-4]。而且,根据人脸器官的密度或颜色的固定布局,我们可以判定是否存在人脸。因此,这种基于肤色模型和模板匹配的方法对于人脸检测具有重要的研究意义[5-7]。 这种基于模板匹配的人脸检测法是选择正面脸部的特征作为匹配的模板,导致人脸搜索的计算量相对较大。然而,绝大多数的人脸都是对称的。所以我们可以选择半边正面人脸模板,也就是说,选择左半边脸或者有半边脸作为人脸匹配的模板,这样,大大减少了人脸搜索的计算。 II.人脸模板构建的方法 人脸模板的质量直接影响匹配识别的效果。为了减少模板局部密度的不确定性,构建人脸模板是基于大众脸的信息,例如,平均的眼睛模板,平均的脸型模板。这种方法很简单。 在模板的仿射变换的实例中,人脸检测的有效性可以被确保。构建人脸模板的过程如下[8]: 步骤一:选择正面人脸图像; 步骤二:决定人脸区域的大小和选择人脸区域; 步骤三:将选出来的人脸区域格式化成同一种尺寸大小;

外文翻译:机器人本科生外文翻译资料

外文翻译资料原文 学院 专业班级 学生姓名 指导教师

Robot Darrick Addison (dtadd95@https://www.doczj.com/doc/f914250803.html,), Senior Software Engineer/Consultant, ASC Technologies Inc. 01 Sep 2001 "A re-programmable, multifunctional manipulator designed to move material, parts, tools, or specialized devices through various programmed motions for the performance of a variety of tasks." -- From the Robot Institute of America, 1979 Darrick Addison, an experienced developer in databases, networks, user interfaces, and embedded systems, introduces the field of robotics and the issues surrounding robotic systems. He covers mechanical design, sensory systems, electronic control, and software. He also discusses microcontroller systems, including serial and memory-mapped interfacing, and talks about some of the available open source software options. The word "robot" originates from the Czech word for forced labor, or serf. It was introduced by playwright Karel Capek, whose fictional robotic inventions were much like Dr. Frankenstein's monster -- creatures created by chemical and biological, rather than mechanical, methods. But the current mechanical robots of popular culture are not much different from these fictional biological creations. Basically a robots consists of: ? A mechanical device, such as a wheeled platform, arm, or other construction, capable of interacting with its environment ?Sensors on or around the device that are able to sense the environment and give useful feedback to the device ?Systems that process sensory input in the context of the device's current situation and instruct the device to perform actions in response to the situation In the manufacturing field, robot development has focused on engineering robotic arms that perform manufacturing processes. In the space industry, robotics focuses on highly specialized, one-of-kind planetary rovers. Unlike a highly automated manufacturing plant, a planetary rover operating on the dark side of the moon -- without radio communication -- might run into unexpected situations. At a minimum, a planetary rover must have some source of sensory input, some way of interpreting that input, and a way of modifying its actions to respond to a changing world. Furthermore, the need to sense and adapt to a partially unknown environment requires intelligence (in other words, artificial intelligence).

智能避障机器人设计外文翻译

INTELLIGENT VEHICLE Our society is awash in “machine intelligence” of various kinds.Over the last century, we have witnessed more and more of the “drudgery” of daily living being replaced by devices such as washing machines. One remaining area of both drudgery and danger, however, is the daily act ofdriving automobiles 1.2 million people were killed in traffic crashes in 2002, which was 2.1% of all globaldeaths and the 11th ranked cause of death . If this trend continues, an estimated 8.5 million people will be dying every year in road crashes by 2020. In fact, the U.S. Department of Transportation has estimated the overall societal cost of road crashes annually in the United States at greater than $230 billion. When hundreds or thousands of vehicles are sharing the same roads at the same time, leading to the all too familiar experience of congested traffic. Traffic congestion undermines our quality of life in the same way air pollution undermines public health.Around 1990, road transportation professionals began to apply them to traffic and road management. Thus was born the intelligent transportation system(ITS). Starting in the late 1990s, ITS systems were developed and deployed. In developed countries, travelers today have access to signifi-cant amounts of information about travel conditions, whether they are driving their own vehicle or riding on public transit systems. As the world energy crisis, and the war and the energy

搬运机器人外文翻译

外文翻译 专业机械电子工程 学生姓名张华 班级 B机电092 学号 05 指导教师袁健

外文资料名称:Research,design and experiment of end effector for wafer transfer robot 外文资料出处:Industrail Robot:An International Journal 附件: 1.外文资料翻译译文 2.外文原文

晶片传送机器人末端效应器研究、设计和实验 刘延杰、徐梦、曹玉梅 张华译 摘要:目的——晶片传送机器人扮演一个重要角色IC制造行业并且末端执行器是一个重要的组成部分的机器人。本文的目的是使晶片传送机器人通过研究其末端执行器提高传输效率,同时减少晶片变形。 设计/方法/方法——有限元方法分析了晶片变形。对于在真空晶片传送机器人工作,首先,作者运用来自壁虎的超细纤维阵列的设计灵感研究机器人的末端执行器,和现在之间方程机器人的交通加速度和参数的超细纤维数组。基于这些研究,一种微阵列凹凸设计和应用到一个结构优化的末端执行器。对于晶片传送机器人工作在大气环境中,作者分析了不同因素的影响晶片变形。在吸收面积的压力分布的计算公式,提出了最大传输加速度。最后, 根据这些研究得到了一个新的种末端执行器设计大气机器人。 结果——实验结果表明, 通过本文研究应用晶片传送机器人的转换效率已经得到显着提高。并且晶片变形吸收力得到控制。 实际意义——通过实验可以看出,通过本文的研究,可以用来提高机器人传输能力, 在生产环境中减少晶片变形。还为进一步改进和研究末端执行器打下坚实的基础,。 创意/价值——这是第一次应用研究由壁虎启发了的超细纤维阵列真空晶片传送机器人。本文还通过有限元方法仔细分析不同因素在晶片变形的影响。关键词:晶片传送机器人末端执行器、超细纤维数组、晶片 1.介绍

人工智能专业外文翻译-机器人

译文资料: 机器人 首先我介绍一下机器人产生的背景,机器人技术的发展,它应该说是一个科学技术发展共同的一个综合性的结果,同时,为社会经济发展产生了一个重大影响的一门科学技术,它的发展归功于在第二次世界大战中各国加强了经济的投入,就加强了本国的经济的发展。另一方面它也是生产力发展的需求的必然结果,也是人类自身发展的必然结果,那么随着人类的发展,人们在不断探讨自然过程中,在认识和改造自然过程中,需要能够解放人的一种奴隶。那么这种奴隶就是代替人们去能够从事复杂和繁重的体力劳动,实现人们对不可达世界的认识和改造,这也是人们在科技发展过程中的一个客观需要。 机器人有三个发展阶段,那么也就是说,我们习惯于把机器人分成三类,一种是第一代机器人,那么也叫示教再现型机器人,它是通过一个计算机,来控制一个多自由度的一个机械,通过示教存储程序和信息,工作时把信息读取出来,然后发出指令,这样的话机器人可以重复的根据人当时示教的结果,再现出这种动作,比方说汽车的点焊机器人,它只要把这个点焊的过程示教完以后,它总是重复这样一种工作,它对于外界的环境没有感知,这个力操作力的大小,这个工件存在不存在,焊的好与坏,它并不知道,那么实际上这种从第一代机器人,也就存在它这种缺陷,因此,在20世纪70年代后期,人们开始研究第二代机器人,叫带感觉的机器人,这种带感觉的机器人是类似人在某种功能的感觉,比如说力觉、触觉、滑觉、视觉、听觉和人进行相类比,有了各种各样的感觉,比方说在机器人抓一个物体的时候,它实际上力的大小能感觉出来,它能够通过视觉,能够去感受和识别它的形状、大小、颜色。抓一个鸡蛋,它能通过一个触觉,知道它的力的大小和滑动的情况。第三代机器人,也是我们机器人学中一个理想的所追求的最高级的阶段,叫智能机器人,那么只要告诉它做什么,不用告诉它怎么去做,它就能完成运动,感知思维和人机通讯的这种功能和机能,那么这个目前的发展还是相对的只是在局部有这种智能的概念和含义,但真正完整意义的这种智能机器人实际上并没有存在,而只是随着我们不断的科学技术的发展,智能的概念越来越丰富,它内涵越来越宽。 下面我简单介绍一下我国机器人发展的基本概况。由于我们国家存在很多其

中英文对照资料外文翻译文献

中英文对照资料外文翻译文献 平设计任何时期平面设计可以参照一些艺术和专业学科侧重于视觉传达和介绍。采用多种方式相结合,创造和符号,图像和语句创建一个代表性的想法和信息。平面设计师可以使用印刷,视觉艺术和排版技术产生的最终结果。平面设计常常提到的进程,其中沟通是创造和产品设计。共同使用的平面设计包括杂志,广告,产品包装和网页设计。例如,可能包括产品包装的标志或其他艺术作品,举办文字和纯粹的设计元素,如形状和颜色统一件。组成的一个最重要的特点,尤其是平面设计在使用前现有材料或不同的元素。平面设计涵盖了人类历史上诸多领域,在此漫长的历史和在相对最近爆炸视觉传达中的第20和21世纪,人们有时是模糊的区别和重叠的广告艺术,平面设计和美术。毕竟,他们有着许多相同的内容,理论,原则,做法和语言,有时同样的客人或客户。广告艺术的最终目标是出售的商品和服务。在平面设计,“其实质是使以信息,形成以思想,言论和感觉的经验”。

在唐朝(618-906 )之间的第4和第7世纪的木块被切断打印纺织品和后重现佛典。阿藏印在868是已知最早的印刷书籍。在19世纪后期欧洲,尤其是在英国,平面设计开始以独立的运动从美术中分离出来。蒙德里安称为父亲的图形设计。他是一个很好的艺术家,但是他在现代广告中利用现代电网系统在广告、印刷和网络布局网格。于1849年,在大不列颠亨利科尔成为的主要力量之一在设计教育界,该国政府通告设计在杂志设计和制造的重要性。他组织了大型的展览作为庆祝现代工业技术和维多利亚式的设计。从1892年至1896年威廉?莫里斯凯尔姆斯科特出版社出版的书籍的一些最重要的平面设计产品和工艺美术运动,并提出了一个非常赚钱的商机就是出版伟大文本论的图书并以高价出售给富人。莫里斯证明了市场的存在使平面设计在他们自己拥有的权利,并帮助开拓者从生产和美术分离设计。这历史相对论是,然而,重要的,因为它为第一次重大的反应对于十九世纪的陈旧的平面设计。莫里斯的工作,以及与其他私营新闻运动,直接影响新艺术风格和间接负责20世纪初非专业性平面设计的事态发展。谁创造了最初的“平面设计”似乎存在争议。这被归因于英国的设计师和大学教授Richard Guyatt,但另一消息来源于20世纪初美国图书设计师William Addison Dwiggins。伦敦地铁的标志设计是爱德华约翰斯顿于1916年设计的一个经典的现代而且使用了系统字体设计。在20世纪20年代,苏联的建构主义应用于“智能生产”在不同领域的生产。个性化的运动艺术在2俄罗斯大革命是没有价值的,从而走向以创造物体的功利为目的。他们设计的建筑、剧院集、海报、面料、服装、家具、徽标、菜单等。J an Tschichold 在他的1928年书中编纂了新的现代印刷原则,他后来否认他在这本书的法西斯主义哲学主张,但它仍然是非常有影响力。Tschichold ,包豪斯印刷专家如赫伯特拜耳和拉斯洛莫霍伊一纳吉,和El Lissitzky 是平面设计之父都被我们今天所知。他们首创的生产技术和文体设备,主要用于整个二十世纪。随后的几年看到平面设计在现代风格获得广泛的接受和应用。第二次世界大战结束后,美国经济的建立更需要平面设计,主要是广告和包装等。移居国外的德国包豪斯设计学院于1937年到芝加哥带来了“大规模生产”极简到美国;引发野火的“现代”

外文翻译-多自由度步行机器人

多自由度步行机器人 摘要在现实生活中设计一款不仅可以倒下而且还可以站起来的机器人灵活智能机器人很重要。本文提出了一种两臂两足机器人,即一个模仿机器人,它可以步行、滚动和站起来。该机器人由一个头,两个胳膊和两条腿组成。基于远程控制,设计了双足机器人的控制系统,解决了机器人大脑内的机构无法与无线电联系的问题。这种远程控制使机器人具有强大的计算头脑和有多个关节轻盈的身体。该机器人能够保持平衡并长期使用跟踪视觉,通过一组垂直传感器检测是否跌倒,并通过两个手臂和两条腿履行起立动作。用实际例子对所开发的系统和实验结果进行了描述。 1 引言随着人类儿童的娱乐,对于设计的双足运动的机器人具有有站起来动作的能力是必不可少。 为了建立一个可以实现两足自动步行的机器人,设计中感知是站立还是否躺着的传感器必不可少。两足步行机器人它主要集中在动态步行,作为一种先进的控制问题来对待它。然而,在现实世界中把注意力集中在智能反应,更重要的是创想,而不是一个不会倒下的机器人,是一个倒下来可以站起来的机器人。 为了建立一个既能倒下又能站起来的机器人,机器人需要传感系统就要知道它是否跌倒或没有跌倒。虽然视觉是一个机器人最重要的遥感功能,但由于视觉系统规模和实力的限制,建立一个强大的视觉系统在机器人自己的身体上是困难的。如果我们想进一步要求动态反应和智能推理经验的基础上基于视觉的机器人行为研究,那么机器人机构要轻巧足以够迅速作出迅速反应,并有许多自由度为了显示驱动各种智能行为。至于有腿机器人,只有一个以视觉为基础的

小小的研究。面临的困难是在基于视觉有腿机器人实验研究上由硬件的显示所限制。在有限的硬件基础上是很难继续发展先进的视觉软件。为了解决这些问题和推进基于视觉的行为研究,可以通过建立远程脑的办法。身体和大脑相连的无线链路使用无线照相机和远程控制机器人,因为机体并不需要电脑板,所以它变得更加容易建立一个有许多自由度驱动的轻盈机身。 在这项研究中,我们制定了一个使用远程脑机器人的环境并且使它执行平衡的视觉和起立的手扶两足机器人,通过胳膊和腿的合作,该系统和实验结果说明如下。图 1 远程脑系统的硬件配置图 2 两组机器人的身体结构 2 远程脑系统 远程控制机器人不使用自己大脑内的机构。它留大脑在控制系统中并且与它用无线电联系。这使我们能够建立一个自由的身体和沉重大脑的机器人。身体和大脑的定义软件和硬件之间连接的接口。身体是为了适应每个研究项目和任务而设计的。这使我们提前进行研究各种真实机器人系统。 一个主要利用远程脑机器人是基于超级并行计算机上有一个大型及重型颅脑。虽然硬件技术已经先进了并拥有生产功能强大的紧凑型视觉系统的规模,但是硬件仍然很大。摄像头和视觉处理器的无线连接已经成为一种研究工具。远程脑的做法使我们在基于视觉机器人技术各种实验问题的研究上取得进展。 另一个远程脑的做法的优点是机器人机体轻巧。这开辟了与有腿移动机器人合作的可能性。至于动物,一个机器人有 4 个可以行走的四肢。我们的重点是基于视觉的适应行为的4肢机器人、机械动物,在外地进行试验还没有太多的研究。 大脑是提出的在母体环境中通过接代遗传。大脑和母体可以分享新设计

人脸检测外文翻译参考文献

人脸检测外文翻译参考文献(文档含中英文对照即英文原文和中文翻译) 译文: 基于PAC的实时人脸检测和跟踪方法 摘要: 这篇文章提出了复杂背景条件下,实现实时人脸检测和跟踪的一种方法。这种方法是以主要成分分析技术为基础的。为了实现人脸的检测,首先,我们要用一个肤色模型和一些动作信息(如:姿势、手势、眼色)。然后,使用PAC技术检测这些被检验的区域,从而判定人脸真正的位置。而人脸跟踪基于欧几里德(Euclidian)距离的,其中欧几里德距离在位于以前被跟踪的人脸和最近被检测的人脸之间的特征空间中。用于人脸跟踪的摄像控制器以这样的方法工作:利用平衡/(pan/tilt)平台,把被检测的人脸区域控制在屏幕的中央。这个方法还可以扩展到其他的系统中去,例如电信会议、入侵者检查系统等等。

1.引言 视频信号处理有许多应用,例如鉴于通讯可视化的电信会议,为残疾人服务的唇读系统。在上面提到的许多系统中,人脸的检测喝跟踪视必不可缺的组成部分。在本文中,涉及到一些实时的人脸区域跟踪[1-3]。一般来说,根据跟踪角度的不同,可以把跟踪方法分为两类。有一部分人把人脸跟踪分为基于识别的跟踪喝基于动作的跟踪,而其他一部分人则把人脸跟踪分为基于边缘的跟踪和基于区域的跟踪[4]。 基于识别的跟踪是真正地以对象识别技术为基础的,而跟踪系统的性能是受到识别方法的效率的限制。基于动作的跟踪是依赖于动作检测技术,且该技术可以被分成视频流(optical flow)的(检测)方法和动作—能量(motion-energy)的(检测)方法。 基于边缘的(跟踪)方法用于跟踪一幅图像序列的边缘,而这些边缘通常是主要对象的边界线。然而,因为被跟踪的对象必须在色彩和光照条件下显示出明显的边缘变化,所以这些方法会遭遇到彩色和光照的变化。此外,当一幅图像的背景有很明显的边缘时,(跟踪方法)很难提供可靠的(跟踪)结果。当前很多的文献都涉及到的这类方法时源于Kass et al.在蛇形汇率波动[5]的成就。因为视频情景是从包含了多种多样噪音的实时摄像机中获得的,因此许多系统很难得到可靠的人脸跟踪结果。许多最新的人脸跟踪的研究都遇到了最在背景噪音的问题,且研究都倾向于跟踪未经证实的人脸,例如臂和手。 在本文中,我们提出了一种基于PCA的实时人脸检测和跟踪方法,该方法是利用一个如图1所示的活动摄像机来检测和识别人脸的。这种方法由两大步骤构成:人脸检测和人脸跟踪。利用两副连续的帧,首先检验人脸的候选区域,并利用PCA技术来判定真正的人脸区域。然后,利用特征技术(eigen-technique)跟踪被证实的人脸。 2.人脸检测 在这一部分中,将介绍本文提及到的方法中的用于检测人脸的技术。为了改进人脸检测的精确性,我们把诸如肤色模型[1,6]和PCA[7,8]这些已经发表的技术结合起来。

相关主题
文本预览
相关文档 最新文档