Optimized BLAS and Its Effect on Performance of Parallel Programs



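Cite this article:	LONG Xiang, LI Zhong-ze, CHEN Jin. Optimized BLAS and Its Effect on Performance of Parallel Programs[J]. Journal of Beijing University of Aeronautics and Astronautics, 2001, 27(1): 79-82.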

Authors:	LONG Xiang, LI Zhong-ze, CHEN Jin

Affiliation:	Department of Computer Science and Engineering, Beijing University of Aeronautics and Astronautics

Funding:	National High Technology Research and Development Program of China (863 Program); 863-306-ZD-03

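Abstract:	Building high-performance parallel computing systems from SMP multiprocessor nodes connected by a high-speed network is a current trend. To make BLAS more efficient on SMP nodes, this paper discusses the multithreading of BLAS3. Taking SUMMA as an example of parallel computation, it also shows how improving the performance of the node-level program affects overall parallel performance. The experimental results show that, under certain conditions, multithreaded BLAS3 can achieve superlinear speedup on an SMP platform, and that the effective network bandwidth must be raised together with node computing performance for parallel computation to reach its full potential.
Keywords:	parallel processing; linear algebra; optimization; multithreading; BLAS; SUMMA
Article ID:	1001-5965(2001)01-0079-04
Received:	1999-05-04
Revised:	1999-05-04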
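
The abstract does not say how the authors multithreaded BLAS3. As a rough illustration only, one common way to split a Level 3 operation such as matrix multiplication across the processors of an SMP node is to give each thread a contiguous panel of rows of C. The sketch below assumes that partitioning; the names `gemm_rows`, `threaded_gemm`, and `num_threads` are hypothetical and are not taken from the paper. It is written in pure Python to show the decomposition, so it will not reproduce the speedups reported in the paper (CPython threads do not run this arithmetic in parallel).

```python
from concurrent.futures import ThreadPoolExecutor


def gemm_rows(a, b, c, row_start, row_end):
    """Accumulate rows [row_start, row_end) of C += A * B in place."""
    inner = len(b)
    cols = len(b[0])
    for i in range(row_start, row_end):
        a_row = a[i]
        c_row = c[i]
        for k in range(inner):
            a_ik = a_row[k]
            b_row = b[k]
            for j in range(cols):
                c_row[j] += a_ik * b_row[j]


def threaded_gemm(a, b, num_threads=2):
    """Multiply A (m x k) by B (k x n), one row panel of C per thread.

    Assumption: row-panel partitioning; the paper's actual scheme is not
    described in the abstract.
    """
    m = len(a)
    n = len(b[0])
    c = [[0.0] * n for _ in range(m)]
    # Split the m rows into num_threads contiguous panels (some may be empty).
    bounds = [m * t // num_threads for t in range(num_threads + 1)]
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        futures = [
            pool.submit(gemm_rows, a, b, c, bounds[t], bounds[t + 1])
            for t in range(num_threads)
        ]
        for future in futures:
            future.result()  # re-raise any exception from a worker
    return c
```

Because each thread writes a disjoint set of rows of C, no locking is needed. Each thread also touches a smaller part of A and C than a single-threaded run would, which is one plausible (but here unconfirmed) reason a multithreaded kernel could show superlinear speedup when the per-thread working set fits in cache.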
Optimized BLAS and Its Effect on Performance of Parallel Programs

LONG Xiang, LI Zhong-ze, CHEN Jin. Optimized BLAS and Its Effect on Performance of Parallel Programs[J]. Journal of Beijing University of Aeronautics and Astronautics, 2001, 27(1): 79-82.

Authors:	LONG Xiang, LI Zhong-ze, CHEN Jin

Institution:	Beijing University of Aeronautics and Astronautics, Dept. of Computer Science and Engineering

Abstract:	Using SMP boards as the compute nodes of a high-performance system is a current trend. The benefits of multithreading are discussed first, and BLAS3 is rewritten to achieve higher performance on a dual Pentium II system. To investigate the relation between the performance of a single compute node and that of the entire parallel system, SUMMA (Scalable Universal Matrix Multiplication Algorithm) is then taken as a case study. The results demonstrate that the higher the performance of an SMP compute node, the more sensitive the performance of the whole parallel system is to the capability of the SAN (System Area Network).

Keywords:	parallel processing; linear algebra; optimization; multithread; BLAS; SUMMA
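
For readers unfamiliar with SUMMA, the following is a minimal single-process simulation of its communication pattern, not the authors' implementation. It assumes a square `grid` x `grid` process grid, a matrix order divisible by `grid`, and a panel width equal to the block size. The helper names (`local_gemm_update`, `split_blocks`, `summa`) are hypothetical. At step k, the processes in grid column k broadcast their A blocks along their rows, the processes in grid row k broadcast their B blocks along their columns, and every process then performs a local update of its C block.

```python
def local_gemm_update(c, a, b):
    """C += A * B on small dense blocks stored as lists of lists."""
    for i in range(len(a)):
        for k in range(len(b)):
            a_ik = a[i][k]
            for j in range(len(b[0])):
                c[i][j] += a_ik * b[k][j]


def split_blocks(mat, grid, bs):
    """Cut a (grid*bs) x (grid*bs) matrix into a grid x grid array of blocks."""
    return [
        [
            [row[j * bs:(j + 1) * bs] for row in mat[i * bs:(i + 1) * bs]]
            for j in range(grid)
        ]
        for i in range(grid)
    ]


def summa(a, b, grid):
    """Simulate SUMMA for square matrices on a grid x grid process grid.

    Assumption: len(a) is divisible by grid; broadcasts are modelled by
    simply sharing references instead of sending messages.
    """
    n = len(a)
    bs = n // grid
    a_blk = split_blocks(a, grid, bs)
    b_blk = split_blocks(b, grid, bs)
    c_blk = [
        [[[0.0] * bs for _ in range(bs)] for _ in range(grid)]
        for _ in range(grid)
    ]
    for k in range(grid):
        # Row broadcast: process (i, k) sends its A block to all of row i.
        a_panel = [a_blk[i][k] for i in range(grid)]
        # Column broadcast: process (k, j) sends its B block to all of column j.
        b_panel = [b_blk[k][j] for j in range(grid)]
        # Local computation: this is where the node-level BLAS3 call would go.
        for i in range(grid):
            for j in range(grid):
                local_gemm_update(c_blk[i][j], a_panel[i], b_panel[j])
    # Gather the distributed C blocks back into one matrix.
    c = []
    for i in range(grid):
        for r in range(bs):
            row = []
            for j in range(grid):
                row.extend(c_blk[i][j][r])
            c.append(row)
    return c
```

Each step alternates a communication phase (the two broadcasts) with a computation phase (the local update). Speeding up only the local update shortens the computation phase while leaving the broadcasts unchanged, which is consistent with the abstract's conclusion that faster nodes make overall performance more sensitive to the network.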
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.