面向集群环境的虚拟化GPU计算平台

引用本文：	杨经纬,马凯,龙翔.面向集群环境的虚拟化GPU计算平台[J].北京航空航天大学学报,2016,42(11):2340-2348.

作者姓名：	杨经纬, 马凯, 龙翔

作者单位：	北京航空航天大学计算机学院, 北京 100083

摘要：	针对集群系统的多节点多GPU环境，提出一种新型虚拟化GPU计算平台。该平台实现对集群系统所有节点上GPU资源的统一抽象与管理，构建公共GPU资源池。原有GPU应用程序可以不经任何修改而迁移到虚拟化GPU计算平台，并具备访问资源池内任何GPU的能力，编程人员无需显式针对多节点多GPU应用展开MPI编程。应用程序摆脱了单个节点上GPU资源的限制，并具备无差别地访问集群系统中任何可用GPU资源的能力，能有效提高系统总体资源利用率以及吞吐量。采用流水化通信技术，实现对虚拟化GPU计算平台的运行时开销以及节点间数据传输延迟的隐藏。实验表明:与非流水化通信相比，系统总体数据传输延迟降低了50%~70%，具备与节点机本地数据传输等同的通信性能。
关键词：	GPU; MPI; CUDA; 集群系统; 硬件加速; 并行计算; 高性能计算
Received:	2015-11-09
Virtualized GPU computing platform in clustered system environment

YANG Jingwei, MA Kai, LONG Xiang. Virtualized GPU computing platform in clustered system environment[J]. Journal of Beijing University of Aeronautics and Astronautics, 2016, 42(11): 2340-2348.

Authors:	YANG Jingwei, MA Kai, LONG Xiang

Institution:	School of Computer Science and Engineering, Beijing University of Aeronautics and Astronautics, Beijing 100083, China

Abstract:	A virtualized GPU computing platform is proposed for clustered systems with multiple nodes and multiple GPUs. The platform uniformly abstracts and manages the GPUs on all nodes as virtualized devices in a common resource pool. Legacy GPU programs can run on the platform without modification and can use any free virtualized GPU in the pool, so programmers do not need to write explicit MPI code for multi-node, multi-GPU applications. Programs are no longer limited to the GPUs of their local node and can access any available GPU in the cluster, which improves overall resource utilization and throughput. Pipelined communication hides the platform's run-time overhead and inter-node transmission latency behind intra-node memory copying and GPU computing. Experiments show that, compared with non-pipelined communication, total data transmission latency is reduced by approximately 50% to 70%, giving communication performance comparable to local intra-node data transfer.
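
The latency-hiding effect of pipelined communication can be illustrated with a toy timing model. This is a minimal sketch and not the authors' implementation: it assumes a transfer is split into equal chunks that pass through a fixed sequence of stages (for example, inter-node transfer followed by a host-to-device copy), and that each stage handles one chunk at a time. The function names, stage times, and chunk counts are hypothetical.

```python
def sequential_time(n_chunks, stage_times):
    """Non-pipelined: each chunk finishes every stage before the next chunk starts."""
    return n_chunks * sum(stage_times)


def pipelined_time(n_chunks, stage_times):
    """Pipelined: a chunk enters a stage as soon as it has left the previous
    stage and that stage is free, so different chunks overlap across stages."""
    finish = [0.0] * len(stage_times)  # time at which each stage becomes free
    for _ in range(n_chunks):
        prev_done = 0.0  # when this chunk left the previous stage
        for s, t in enumerate(stage_times):
            start = max(prev_done, finish[s])
            finish[s] = start + t
            prev_done = finish[s]
    return finish[-1]


if __name__ == "__main__":
    # Hypothetical example: 16 chunks, two stages of equal cost.
    stages = [1.0, 1.0]  # e.g. network transfer, host-to-device copy
    seq = sequential_time(16, stages)   # 32.0
    pipe = pipelined_time(16, stages)   # 17.0
    print(f"sequential={seq}, pipelined={pipe}, reduction={1 - pipe / seq:.1%}")
```

In this model the saving grows with the number of chunks and with the number of stages that can overlap. It says nothing about the real platform's overheads, which depend on hardware and chunk size.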

Keywords:	GPU; MPI; CUDA; clustered systems; hardware acceleration; parallel computing; high performance computing
This article is indexed in Wanfang Data and other databases.