
Efficient Acceleration Technology for On-board Object Detection

Cite this article:	HUYAN Lang, LI Ying, JIANG Dongmei, ZHANG Yanning, ZHOU Quan, WEI Jiayuan, LIU Juanni. Efficient Acceleration Technology for On-board Object Detection[J]. Journal of Astronautics (宇航学报), 2022, 43(11): 1544-1556.

Authors:	HUYAN Lang, LI Ying, JIANG Dongmei, ZHANG Yanning, ZHOU Quan, WEI Jiayuan, LIU Juanni

Affiliation:	1. School of Computer Science, Northwestern Polytechnical University, Xi’an 710129, China; 2. National Key Laboratory of Science and Technology on Space Microwave, Xi’an 710100, China

Funding:	National Key Research and Development Program of China (2020YFB1808003); Innovation Fund of China Aerospace Science and Technology Corporation (Y20 JTKJCX 02); National Key Laboratory Foundation (6142411432107); Talent Fund of China Academy of Space Technology (Y19 RCWYZ 04)

Received:	2022-03-23

Abstract:	Object detection algorithms based on deep convolutional neural networks have large parameter counts and heavy computational loads, while on-board computing resources, storage, and power are limited, so these algorithms are difficult to deploy in orbit. To address this problem, an efficient acceleration framework and implementation method for on-board object detection are proposed. First, a computing engine compatible with three convolution operators is designed, which improves resource utilization. Second, the object detection model is unrolled along two dimensions, channels and convolution kernels, which makes the accelerator highly parallel and scalable. Finally, the accelerator is implemented on several FPGA platforms and its performance is evaluated. Experimental results show that the proposed FPGA-based accelerator reaches up to 1843.2 GFLOPs (giga floating-point operations per second) with an inference time of 0.22 ms. Compared with accelerators reported in related literature, the proposed framework shows clear advantages in performance, power consumption, energy efficiency, and inference time. It is suited to deployment in resource-constrained environments and has good prospects for on-board satellite applications.
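The unrolling along channels and convolution kernels mentioned in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the authors' hardware design: the tile sizes `tc` and `tk`, the function names, and the use of NumPy are all hypothetical, and the abstract does not give the actual parallelism factors. Each tile stands in for one pass over an array of `tk * tc` multiply-accumulate units.

```python
import numpy as np

def conv2d_reference(x, w):
    """Direct convolution, stride 1, no padding. x: (C, H, W), w: (K, C, R, S)."""
    C, H, W = x.shape
    K, _, R, S = w.shape
    oh, ow = H - R + 1, W - S + 1
    out = np.zeros((K, oh, ow))
    for k in range(K):
        for c in range(C):
            for i in range(R):
                for j in range(S):
                    out[k] += w[k, c, i, j] * x[c, i:i + oh, j:j + ow]
    return out

def conv2d_unrolled(x, w, tc=4, tk=8):
    """Same convolution with the channel and kernel loops split into tiles.

    tc and tk are hypothetical parallelism factors (assumptions, not values
    from the paper). Within a tile, all tk * tc products are formed at once,
    which models the parallel hardware lanes.
    """
    C, H, W = x.shape
    K, _, R, S = w.shape
    oh, ow = H - R + 1, W - S + 1
    out = np.zeros((K, oh, ow))
    for k0 in range(0, K, tk):        # tile over output kernels
        for c0 in range(0, C, tc):    # tile over input channels
            w_tile = w[k0:k0 + tk, c0:c0 + tc]  # up to (tk, tc, R, S)
            x_tile = x[c0:c0 + tc]              # up to (tc, H, W)
            for i in range(R):
                for j in range(S):
                    window = x_tile[:, i:i + oh, j:j + ow]
                    # Multiply every kernel in the tile by every channel in the
                    # tile and accumulate over channels.
                    out[k0:k0 + tk] += np.einsum(
                        "kc,chw->khw", w_tile[:, :, i, j], window
                    )
    return out
```

Changing `tc` and `tk` changes only how the work is grouped, not the result, which is why this kind of unrolling lets the same design scale to devices with different amounts of compute resources.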

Keywords:	Object detection; Model quantization; Model acceleration; Computational intensity; Convolutional neural networks
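The keyword "computational intensity" is commonly read as the ratio of arithmetic operations to bytes moved to and from memory, as in a roofline analysis. The sketch below computes that ratio for one convolution layer. It assumes this roofline-style definition, which may differ from the one used in the paper; the function names, the one-byte value width (standing in for 8-bit quantization), and the bandwidth figure in the usage note are all hypothetical.

```python
def conv_layer_intensity(c_in, c_out, k, h_out, w_out, bytes_per_value=1):
    """Return (flops, bytes_moved, flops_per_byte) for a stride-1 conv layer.

    Idealisation (an assumption): every input, weight, and output value
    crosses the memory interface exactly once.
    """
    macs = c_in * c_out * k * k * h_out * w_out
    flops = 2 * macs  # one multiply and one add per multiply-accumulate
    h_in, w_in = h_out + k - 1, w_out + k - 1  # no padding
    values = (
        c_in * h_in * w_in          # input feature map
        + c_out * c_in * k * k      # weights
        + c_out * h_out * w_out     # output feature map
    )
    bytes_moved = values * bytes_per_value
    return flops, bytes_moved, flops / bytes_moved

def attainable_gflops(intensity, peak_gflops, bandwidth_gb_s):
    """Roofline bound: the lower of the compute limit and the memory limit."""
    return min(peak_gflops, intensity * bandwidth_gb_s)
```

For example, `attainable_gflops(conv_layer_intensity(64, 128, 3, 56, 56)[2], peak_gflops=1843.2, bandwidth_gb_s=10.0)` treats the 1843.2 GFLOPs figure from the abstract as the compute peak and pairs it with a made-up 10 GB/s memory bandwidth to show whether such a layer would be compute-bound or memory-bound.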
