• BLADYG
      A Novel Block-Centric Framework for the Analysis of LArge DYnamic Graphs
  • Summary   |   System Overview   |   Results   |   Downloads

    • Summary:
    • Recently, distributed processing of large dynamic graphs has become very popular, especially in certain domains such as social network analysis, Web graph analysis and spatial network analysis. In this context, many distributed/parallel graph processing systems have been proposed, such as Pregel, GraphLab, and Trinity. These systems can be divided into two categories: (1) vertex-centric and (2) block-centric approaches. In vertex-centric approaches, each vertex corresponds to a process, and message are exchanged among vertices. In block-centric approaches, the unit of computation is a block, a connected subgraph of the graph, and message exchanges occur among blocks. In this work, we are considering the issues of scale and dynamism in the case of block-centric approaches. We present BLADYG, a block-centric framework that addresses the issue of dynamism in large-scale graphs. BLADYG is implemented on top of AKKA framework.

    • System overview:

    BLADYG

    BLADYG System overview



    • Results:

    We have applied BLADYG framework to the problem of distributed k-core decomposition in large dynamic graphs. We have performed a set of experiments to evaluate the effectiveness and efficiency of BLADYG framework on a number of different real and synthetic datasets.

    • Experimental environment

    We have implemented BLADYG on top of the akka framework, a toolkit and runtime for building highly concurrent, distributed, resilient message-driven applications. In order to evaluate the performance of our approach, we used 9 m3.medium instances on Amazon EC2. Each m3.medium instance contained 1 virtual 64-bit CPU, 3.75 GB of main memory a 4 GB of local instance storage.
    We compared our BLADYG solution for k-core decomposition with a HBase-based solution proposed by Aksu et al. [1]. The HBase-based solution was tested using 9 m3.medium instances on Amazon EC2: 1 acting as hmaster, namenode and zookeeper node, 8 as datanode and hregion.

    • Datasets

    In order to evaluate and test the effectiveness of BLADYG, we performed an extensive set of experiments on a number of different real and synthetic datasets.

    Datasets
    Dataset Type Number of nodes (N) Number of edges (M) Diameter Average clustering coefficient Max(k)
    DS1Synthetic10,00070,62240.397733
    DS2Synthetic20,000144,74140.393538
    DS3Synthetic50,000365,88340.392942
    DS4Synthetic100,000734,41640.390846
    ego-FacebookReal4,03988,23480.6055115
    email-EnronReal36,692183,831110.497043
    roadNet-TXReal1,379,9171,921,6601,0540.04703
    roadNet-CAReal1,965,2062,766,6078490.04643
    com-LiveJournalReal3,997,96234,681,189170.2843296
    soc-LiveJournal1Real4,847,57168,993,773160.2742318
    Real datasets are provided by Stanford University (SNAP).
    Synthetic datasets were created using the graph data generator proposed by [2] [3].
    • Scalability
    • Results

    • We mention that the presented runtime values of the HBase-based approach correspond to the maintenance time of only one fixed k value core (k = max(k) in our experimental study). This means that, for each dataset, the maintenance process of the HBase-based approach needs to be repeated max(k) times in order to achieve the same results as our approach.

    • Downloads:
    • BLADYG source code is downloadable here.
      Our Java implementation of the HBase-based [1] approach is available here.
      • Related publications:
      • S. Aridhi, A. Montresor, Y. Velegrakis. BLADYG: A Graph Processing Framework for Large Dynamic Graphs. Big Data Research (BDR), Elsevier, 9(C), pp. 9-17, 2017.
      • S. Aridhi, M. Brugnara,A. Montresor, Y. Velegrakis. Distributed k-core Decomposition and Maintenance in Large Dynamic Graphs. In Proceedings of the 10th ACM International Conference on Distributed and Event-Based Systems (DEBS'16), pp. 161-168, Irvine, 2016.
      • C. Sakouhi, S. Aridhi, A. Guerrieri, S. Sassi and A. Montresor. DynamicDFEP: A distributed edge partitioning approach for large dynamic graphs. In Proceedings of the 20th International Database Engineering & Applications Symposium (IDEAS'16), pp. 61-168, Montreal, Canada, 2016.
      • Aridhi S., Montresor A., Velegrakis Y. BLADYG: A novel block-centric framework for the analysis of large dynamic graphs. In Proceedings of the ACM Workshop on High Performance Graph Processing (HPGP'16) @ HPDC'16, pp. 39-42, Kyoto, Japan, 2016.
      • References:
      • [1] Aksu H., Canim M., Yuan-Chi Chang, Korpeoglu I., Ulusoy O. Distributed k-Core View Materialization and Maintenance for Large Dynamic Graphs. IEEE Transactions on Knowledge and Data Engineering, vol.26, no.10, pp.2439-2452, Oct. 2014
        [2] Alessandra Sala, Lili Cao, Christo Wilson, Robert Zablit, Haitao Zheng, and Ben Y. Zhao. Measurement-calibrated Graph Models for Social Network Experiments. Proceedings of The 19th World Wide Web Conference (WWW 2010). ACM, New York, NY, USA, 861-870.
        [3] Alessandra Sala, Xiaohan Zhao, Christo Wilson, Haitao Zheng, and Ben Y. Zhao. Sharing Graphs using Differentially Private Graph Models. Proceedings of The 11th ACM SIGCOMM Internet Measurement Conference (IMC 2011). ACM, New York, NY, USA, 81-98.