Collections

Big Data
  • YAN Jiankang,CHEN Gengsheng

    To meet the real-time processing needs of compute-intensive big data applications, the H-Storm heterogeneous computing platform is developed based on Apache Storm. Using the Multi-Process Service (MPS) feature, a Graphics Processing Unit (GPU) resource quantization and distributed calling mechanism is designed. A task scheduling strategy for H-Storm heterogeneous clusters is proposed, and a task scheduling algorithm based on GPU performance and load, together with an adaptive flow distribution decision mechanism for cooperative computing, is implemented. Experimental results show that, for 512×512 matrix multiplication, the throughput of the H-Storm heterogeneous computing platform increases by 54.9 times and the response delay decreases by 77 times compared with native Storm.

  • ZHU Jinshan,LIU Liangxu,ZHOU Chaolan,GUAN Bo
    To address the tidal-flow problem in the public bike systems of fast-developing cities, this paper proposes a station clustering algorithm based on SimRank that exploits the characteristics of public bike usage. Firstly, a definition of station similarity is proposed based on the relations between stations. Secondly, the SimRank algorithm is introduced to calculate the similarity between stations. Finally, according to the calculated similarity values, the stations are clustered following the idea of maximum similarity priority. Experimental results show that the clusters produced by the proposed algorithm accurately reflect bike flow characteristics and regional characteristics, and that the members of the same cluster are highly related.
  • GAO Yanjun,ZHANG Xueying,LI Fenglian,TIAN Yuchu
    In the distributed processing of all-to-all comparison problems on large data, existing data allocation strategies give little consideration to the special dependency between comparison tasks and data, which leads to low storage efficiency and imbalanced task allocation. To address this problem, a Data Allocation Algorithm Based on Graph Covering (DAABGC) is proposed. Firstly, theoretical analysis reduces the data allocation problem for large data to a graph covering problem. Then, optimal solutions to several graph covering instances are constructed and the data are allocated according to these solutions. Experimental results show that, compared with the Hadoop-based data allocation strategy, the proposed algorithm ensures that comparison tasks have 100% data locality and that the load is balanced between nodes. It also improves the storage saving rate and overall computing performance.
  • YI Jeongho,JIN Depeng
    To address the low operational efficiency and low coverage of late-night bus routes in China’s big cities, this paper introduces a bus route evaluation model and a Dijkstra algorithm model that considers regional equilibrium, and proposes an improved scheme for late-night bus lines. Taking the late-night bus network in Shanghai as an example, a late-night mobility demand display module, a late-night bus network evaluation module, and a new late-night bus network design module for urban areas are studied to achieve an integrated design from evaluation to optimization. By establishing an effective evaluation and optimization system for the urban public transportation network, the key technologies for applying mobile data to urban public transportation network optimization are discussed. Optimization results show that the improved late-night bus network covers nearly all areas where late-night mobility demand and taxi mobility data are high and, compared with the existing late-night bus lines, better meets late-night travel needs in Shanghai.
  • ZHUO Yu,YOU Jiali,WANG Jinlin,QI Weining,QIAO Nannan
    Selecting the proper service provider for different users raises many problems, the most important of which stems from the heterogeneity and dynamics of users’ networks. Therefore, based on the sea service architecture, a measurement and recommendation system for online video services is designed and implemented. The system simulates a large number of client nodes to take measurements and predicts the user’s Quality of Experience (QoE) from the measurement results, thereby providing users with real-time service source recommendations. The system is used to build a video measurement and recommendation service covering 10 video sites, and 9 months of data are observed and analyzed. Experimental results show that the system can recommend service sources for the content users want to watch, so that users obtain the best viewing experience available under their current network conditions.
  • NING Ke,SUN Tongjing,XU Jiejie
    To address the difficulty of applying the Nearest Neighbor Absorption First (NNAF) clustering algorithm to massive data, an improved algorithm based on MapReduce is proposed. By introducing the MapReduce parallel programming framework and using Canopy coarse clustering, it optimizes the calculation process and improves the handling of cluster intersections. Three different data sets are used to compare the K-means algorithm, the improved NNAF clustering algorithm, and the original NNAF clustering algorithm. Experimental results show that the improved algorithm preserves clustering quality to a certain extent while running faster, which makes it suitable for clustering analysis of massive data.
  • PENG Daqin,LAI Xiangwu,LIU Yanlin
    Existing routing algorithms do not consider the real-time transmission status and traffic characteristics of links. To address this, a multi-path routing algorithm based on real-time link status and traffic characteristics is proposed, building on the centralized, network-wide control of Software Defined Network (SDN). The algorithm divides data flows into large flows and small flows. Because large flows demand high throughput, they are routed according to path weight values. Because small flows are numerous and should be processed with low complexity, the path with the largest remaining bandwidth is selected as their routing path. Simulation results show that, compared with the Equal Cost Multi-path (ECMP) and Software-defined Hybrid Routing (SHR) mechanisms, the algorithm improves the average link utilization and network throughput of a fat-tree data center network.
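
The path selection described in the PENG et al. abstract above might be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the size threshold that separates large from small flows, the `path_weight` callback, and all node names are hypothetical, since the abstract does not give the weight formula or the classification rule.

```python
from typing import Callable, Dict, List, Tuple

Link = Tuple[str, str]
Path = List[str]


def remaining_bandwidth(path: Path, free_bw: Dict[Link, float]) -> float:
    """Bottleneck of a path: the smallest free bandwidth on any of its links."""
    return min(free_bw[link] for link in zip(path, path[1:]))


def choose_path(
    flow_bytes: int,
    candidates: List[Path],
    free_bw: Dict[Link, float],
    path_weight: Callable[[Path], float],
    large_flow_threshold: int = 10_000_000,  # hypothetical cut-off, not from the paper
) -> Path:
    """Pick a path for one flow.

    Large flows use the caller-supplied weight (a placeholder for the
    paper's unspecified path weight); small flows take the candidate
    with the largest remaining bandwidth.
    """
    if flow_bytes >= large_flow_threshold:
        return max(candidates, key=path_weight)
    return max(candidates, key=lambda p: remaining_bandwidth(p, free_bw))
```

A controller with a global view of the network would refresh `free_bw` from link statistics before each decision; how often that happens is a design choice the abstract does not specify.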
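
The SimRank step named in the ZHU et al. abstract above can be illustrated with the standard iterative SimRank recurrence. This sketch assumes station relations are given as a directed graph of in-neighbors; the decay factor, iteration count, and station names are illustrative assumptions, and the paper's own station-similarity definition and its clustering step are not reproduced.

```python
from itertools import product
from typing import Dict, List, Tuple


def simrank(
    in_neighbors: Dict[str, List[str]],
    decay: float = 0.8,       # assumed decay constant
    iterations: int = 10,     # assumed number of iterations
) -> Dict[Tuple[str, str], float]:
    """Iterative SimRank: two nodes are similar if their in-neighbors are similar.

    Every node, including nodes that only appear as neighbors, must be a key
    of `in_neighbors`.
    """
    nodes = list(in_neighbors)
    sim = {(a, b): 1.0 if a == b else 0.0 for a, b in product(nodes, nodes)}
    for _ in range(iterations):
        new_sim = {}
        for a, b in product(nodes, nodes):
            if a == b:
                new_sim[(a, b)] = 1.0
                continue
            ia, ib = in_neighbors[a], in_neighbors[b]
            if not ia or not ib:
                # A node without in-neighbors is similar only to itself.
                new_sim[(a, b)] = 0.0
                continue
            total = sum(sim[(x, y)] for x in ia for y in ib)
            new_sim[(a, b)] = decay * total / (len(ia) * len(ib))
        sim = new_sim
    return sim
```

The all-pairs loop is quadratic in the number of stations per iteration, which is acceptable for a small station graph but would need pruning or a sparse formulation for larger ones.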
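
The YI and JIN abstract above relies on Dijkstra's algorithm for route design. The sketch below is the plain textbook algorithm over a weighted stop graph; the regional-equilibrium consideration mentioned in the abstract is not modeled because the abstract does not define it, and the graph layout (adjacency lists of `(neighbor, cost)` pairs) is an assumption.

```python
import heapq
from typing import Dict, List, Tuple


def dijkstra(graph: Dict[str, List[Tuple[str, float]]], source: str) -> Dict[str, float]:
    """Return the shortest travel cost from `source` to every reachable stop.

    Edge costs must be non-negative.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry; a shorter route to u was already found
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```

One way to bias routes toward underserved regions would be to adjust edge costs before running the search, though whether the paper does this is not stated.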
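
The Canopy coarse clustering mentioned in the NING et al. abstract above can be sketched on a single machine as follows. This shows only the general Canopy technique with a loose threshold `t1` and a tight threshold `t2`; the MapReduce parallelization and the NNAF refinement from the paper are not shown, and the use of Euclidean distance is an assumption.

```python
import math
from typing import List, Sequence, Tuple


def canopy_clusters(
    points: Sequence[Sequence[float]], t1: float, t2: float
) -> List[Tuple[int, List[int]]]:
    """Group points into possibly overlapping canopies.

    Returns (center_index, member_indices) pairs. Points within t1 of a
    center join its canopy; points within t2 can no longer become centers.
    """
    if t1 <= t2:
        raise ValueError("t1 (loose threshold) must be greater than t2 (tight threshold)")
    remaining = list(range(len(points)))
    canopies = []
    while remaining:
        center = remaining[0]
        members = [
            i for i in range(len(points))
            if math.dist(points[center], points[i]) <= t1
        ]
        canopies.append((center, members))
        # Drop the center and everything tightly bound to it from the candidates.
        remaining = [
            i for i in remaining
            if math.dist(points[center], points[i]) > t2
        ]
    return canopies
```

Because canopies may overlap, a later, more precise clustering pass only needs to compare points that share at least one canopy, which is what makes the coarse step useful on massive data.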