
大数据高性能排序算法的设计与实现
Cite this article: 陈洪雁, 万俊伟, 汪琦. 大数据高性能排序算法的设计与实现[J]. 飞行器测控学报, 2015, 34(2): 120-127.
Authors: 陈洪雁  万俊伟  汪琦
Affiliation: Beijing Institute of Tracking and Telecommunications Technology (all three authors)
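Abstract: To meet the demand for sorting big data, a task-driven parallel sorting algorithm is proposed. The algorithm uses task-driven scheduling, AIO (Asynchronous Input/Output), and a double-buffer mechanism to make full use of system resources. It optimizes quicksort by constructing equivalent sort keys, and its implementation processes tasks with multiple threads, controlling the degree of parallelism through the number of threads. Combining these techniques brings big-data sorting performance close to the theoretical limit, and when CPU (Central Processing Unit) resources are plentiful, asynchronous compression can push past that limit. The resulting system completes a full sort of more than 500 Gbyte of on-disk data in 2 000 s. Applying the same idea to database design would decouple connections from threads, allowing a database to support more connections and therefore a higher degree of concurrency.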

Keywords: domestic database  massive data  big data  sorting algorithm  independent and controllable  equivalent sort key
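
The abstract does not spell out how its equivalent sort keys are built. As a minimal sketch of one common interpretation, the Python below encodes a multi-field record into a single byte string whose plain bytewise order matches the original field-by-field order, so the sort can use one cheap comparison per pair. The record layout (a signed 64-bit integer followed by a string) and the name `equivalent_key` are assumptions made for illustration, not details from the paper.

```python
import struct

def equivalent_key(num: int, name: str) -> bytes:
    """Encode (num, name) as bytes that sort in the same order as the tuple.

    Hypothetical layout, not taken from the paper:
    - num is a signed 64-bit integer; adding 2**63 maps it to an unsigned
      value whose big-endian bytes compare in numeric order.
    - name is UTF-8 encoded; UTF-8 byte order matches code point order,
      which is how Python compares str values.
    """
    biased = num + (1 << 63)  # shift [-2**63, 2**63 - 1] to [0, 2**64 - 1]
    return struct.pack(">Q", biased) + name.encode("utf-8")

records = [(3, "b"), (-1, "z"), (3, "a"), (-5, ""), (0, "é")]

# Sorting on the encoded key gives the same order as sorting the tuples.
by_key = sorted(records, key=lambda r: equivalent_key(*r))
```

Because the integer prefix has a fixed width, ties on the first field fall through to the string bytes, which is what keeps the encoded order equivalent to the tuple order.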

Design and Implementation of High Performance Ranking Algorithm for Big Data
CHEN Hongyan, WAN Junwei, WANG Qi. Design and Implementation of High Performance Ranking Algorithm for Big Data[J]. Journal of Spacecraft TT&C Technology, 2015, 34(2): 120-127.
Authors: CHEN Hongyan, WAN Junwei, WANG Qi
Institution: Beijing Institute of Tracking and Telecommunications Technology (all three authors)
Abstract: A task-driven parallel sorting algorithm is proposed to meet the demand for sorting big data. Task-driven scheduling, AIO (Asynchronous Input/Output), and a double-buffer mechanism are employed to make full use of system resources. Quicksort is optimized by constructing equivalent sort keys. In the implementation, tasks are handled by multiple threads, and the degree of parallelism is controlled through the number of threads. Used together, these techniques bring the sorting performance close to the theoretical limit. When CPU (Central Processing Unit) resources are plentiful, asynchronous compression makes it possible to go beyond that limit. The resulting system completes a full sort of more than 500 Gbyte of disk data in 2 000 s. Applying this approach to database design would separate connections from threads, allowing the database to support a larger number of connections and thus a higher degree of concurrency.
Keywords:homegrown database  mass data  big data  ranking algorithm  independent and controllable  equivalent sorting key
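
The task-driven structure described in the abstract can be sketched roughly as follows. This is an illustrative outline, not the authors' implementation: the function name `parallel_sort`, the two-slot queue standing in for the double buffer, and the in-memory chunks standing in for asynchronous disk reads are all assumptions. Note also that Python threads share the interpreter lock, so this shows the control flow rather than the CPU parallelism a native implementation would get.

```python
import heapq
import queue
import threading

def parallel_sort(chunks, num_workers=4):
    """Sort chunks with a fixed number of worker threads, then merge the runs.

    chunks: iterable of lists (stand-ins for blocks read from disk).
    num_workers: thread count, which bounds the degree of parallelism.
    """
    tasks = queue.Queue(maxsize=2)  # two slots approximate a double buffer
    runs = []
    runs_lock = threading.Lock()

    def worker():
        while True:
            chunk = tasks.get()
            if chunk is None:       # sentinel: no more tasks
                return
            run = sorted(chunk)     # each task sorts one chunk into a run
            with runs_lock:
                runs.append(run)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for chunk in chunks:            # producer: blocks while both slots are full
        tasks.put(chunk)
    for _ in threads:
        tasks.put(None)             # one sentinel per worker
    for t in threads:
        t.join()
    return list(heapq.merge(*runs))  # k-way merge of the sorted runs

result = parallel_sort([[5, 1], [3, 2, 9], [], [4]], num_workers=2)
```

The bounded queue is what keeps the producer from running far ahead of the workers: while one chunk is being sorted, at most two more are waiting, which mirrors the idea of overlapping input with computation without unbounded buffering.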
This article is indexed in CNKI and other databases.