We thank both reviewers for their detailed comments on our work. We have addressed all of the reviewers' observations and highlighted the changes in blue in the updated manuscript. These reviews have helped us improve the manuscript.
## Reviewer 1
### Reasons to accept:
The authors provide an innovative modelling approach that is of broader scientific significance and interest and the data sources are made publicly available for reproducibility.
Notably, various experiments with data sources spanning multiple domains are utilized to test for generalizability of their network approach that is situated with the scientific literature. The contribution is rigorously carved out and the results are presented in an adequate quality and elaboration degree.
**Answer:** Thank you for your detailed feedback on our work.
### Abstract:
**You introduce the HMN as an abbreviation, please refer to the generalized network model in the same way for better clarity. Please provide an example of the structural measures that you deliver.**
**Answer:** We have added the structural measures to the abstract and modified the text for better clarity.
**You state that heterogeneous multi-layered networks (HMN) are more efficient, but the comparison lacks context. Specify the benchmark or state-of-the-art (SOTA) methods against which you are comparing your approach.**
**Answer:** Our proposed HMN is more efficient than representing the same information through existing heterogeneous or multilayered networks such as \[90, 11\]. We discuss this in more detail in Section 4.1, Page 8, where we compare our representation with multiple existing representations \[10, 41, 97, 101, 1\]. The citation numbers refer to the updated manuscript.
**Explain the rationale behind choosing the Twitter network and air-transportation network as your use cases. What specific characteristics of these networks make them suitable for your research goals?**
**Answer:** We consider a Twitter dataset as an example of a real-life network that is better modeled as an HMN with heterogeneity in its layers. The air transportation network, on the other hand, is a popular multilayer network; hence, we decided to experiment with it as well. In practice, our model will be essential for modeling networks like the Fediverse (https://en.wikipedia.org/wiki/Fediverse), where each server can be considered a layer, and each layer is a heterogeneous network of its own. Because the Fediverse allows cross-server (in other words, cross-layer) following, it constitutes a true HMN in the real world.
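As a rough illustration of the Fediverse example, the sketch below stores each server as a layer with typed nodes and typed edges, plus a separate list of typed cross-layer edges. The class and field names (`Layer`, `HMN`, `add_cross_edge`) and the server and user names are hypothetical; this is not the formal definition given in the manuscript.

```python
from dataclasses import dataclass, field


@dataclass
class Layer:
    """One server, modeled as a heterogeneous network (hypothetical structure)."""
    name: str
    node_types: dict = field(default_factory=dict)  # node id -> type, e.g. "user" or "post"
    edges: list = field(default_factory=list)       # (source, target, edge type)

    def add_node(self, node, node_type):
        self.node_types[node] = node_type

    def add_edge(self, source, target, edge_type):
        if source not in self.node_types or target not in self.node_types:
            raise KeyError("both endpoints must exist in this layer")
        self.edges.append((source, target, edge_type))


@dataclass
class HMN:
    """Layers plus typed edges that cross layers (illustrative only)."""
    layers: dict = field(default_factory=dict)       # layer name -> Layer
    cross_edges: list = field(default_factory=list)  # ((layer, node), (layer, node), edge type)

    def add_layer(self, name):
        self.layers[name] = Layer(name)
        return self.layers[name]

    def add_cross_edge(self, src, dst, edge_type):
        # Each endpoint is a (layer name, node id) pair and must already exist.
        for layer_name, node in (src, dst):
            if node not in self.layers[layer_name].node_types:
                raise KeyError(f"{node!r} not found in layer {layer_name!r}")
        if src[0] == dst[0]:
            raise ValueError("use Layer.add_edge for edges inside one layer")
        self.cross_edges.append((src, dst, edge_type))


hmn = HMN()
a = hmn.add_layer("server_a")
b = hmn.add_layer("server_b")
a.add_node("alice", "user")
a.add_node("post_1", "post")
a.add_edge("alice", "post_1", "authored")
b.add_node("bob", "user")
# Cross-server following becomes a cross-layer edge.
hmn.add_cross_edge(("server_b", "bob"), ("server_a", "alice"), "follows")
```

Keeping cross-layer edges apart from the per-layer edge lists is only one possible design; it makes it easy to inspect a single server as an ordinary heterogeneous network.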
**You conducted multiple experiments with various datasets (e.g., molecule dataset, brain networks) to prove generalizability. Clearly state the research question/hypothesis and corresponding methodological choices for these experiments.**
**Answer:** We generated networks belonging to different categories, such as brain, crime, chemical, and biological networks, to show that our generation algorithm can model different types of networks. We have also updated the manuscript to include a clear methodology for generating each network in Tables 2-5 (Sections 7 and 7.3, Pages 15-19).
### Section 4.1.1:
**You use the MovieLens dataset for link prediction, which was not introduced earlier. Provide a clear introduction to the methods used and lay out all of the experiments beforehand to ensure coherence.**
**Answer:** We have updated the manuscript to provide a clear experiment design and more details about the MovieLens dataset in Section 4.1.1, Page 9.
**You mention "state-of-the-art GNN architectures available" (line 31, p.9) but do not specify which ones. Clarify this and explain why you chose to use t-SNE. Additionally, detail how you fine-tuned the hyperparameters of t-SNE.**
**Answer:** We have updated the manuscript, highlighting the GNN architectures and the hyperparameters used for t-SNE in Section 4.1.1, Page 10.
### Table Results:
**Report the results presented in the tables (e.g., Table 4 and Table 5\) within the text to ensure they are clearly contextualized.**
**Answer:** We have incorporated the reviewer's suggestion and added clear explanations of Tables 2-5 in Section 7.3, Pages 18-19 of the updated manuscript.
### Limitations Section:
**Include a section discussing the limitations of your study to provide a balanced view and acknowledge areas for future research.**
**Answer:** We have added a limitations section addressing the limitations of our work (Section 8, Page 20\).
### Figures:
**Figure 2: Improve the interpretation. The text mentions, "The first layer contains tweets and the second layer contains users," but it is unclear which layer is which in the figure.**
**Answer:** We thank the reviewer for pointing this out. We have addressed this ambiguity in Section 4.1 of the updated manuscript. The caption of Figure 2 has also been modified on Page 8 of the new manuscript.
**Figure 3: The description says "with each rectangle representing a layer. The first layer contains tweets and the second layer contains users," but as a reader, I only see one layer (rectangle). You seem to express layers through edge types, which is challenging to understand. Clarify this representation.**
**Answer:** We have addressed this ambiguity in the updated manuscript. The caption of Figure 3 was incorrect and has been rectified in the modified manuscript. Thank you for pointing this out.
**Figure 5: This figure is never mentioned or discussed in the text. Ensure that every figure is referenced and discussed. Additionally, the captions of the subplots are too small to read, and the font differs from the rest of the manuscript. Ensure consistency in formatting.**
**Answer:** We thank the reviewer for pointing this out. We have updated the manuscript to reference and discuss Figure 5 in Section 7.2 (Page 17\). We have also updated the figures to make the subplot captions more legible.
**Figure 6: The figure lacks an interpretation in the text. The caption does not adequately explain what subplots (a-d) specifically represent. Provide a detailed explanation in both the text and caption.**
**Answer:** We have updated the manuscript to include more discussion on Figure 6 in Section 7.3, Page 18.
## Reviewer 2
### The manuscript contains several extremely strong statements that are either false or conflicting with each other:
1. "A multi-layered network cannot support heterogeneity in a layer due to the absence of node or edge types." (Adding heterogeneity to layers is the main contribution of their manuscript)
**Answer:** We have added appropriate citations in support of this statement. We also ask the reviewer to kindly note that the main contribution of this paper is not only adding heterogeneity to a layer but also developing a generalized definition of the network data structure (in addition to providing a generalized network generation algorithm). This is proved in Lemmas 4.1 to 4.3. In other words, the majority of other network data structures can be considered special cases of our proposed model.
2. "it is difficult to obtain heterogeneous multi-layered networks despite a lot of real-world networks being HMN" (contradicting what was said before, and easy to argue against it)
**Answer:** We thank the reviewer for pointing this out. It was a typo; we omitted the keyword “datasets”. We meant to say that it is difficult to obtain heterogeneous multi-layered network **datasets** despite a lot of networks being HMN. We have now corrected the typo in the modified manuscript. Furthermore, with the availability of the Fediverse \[1\] and ActivityPub \[2\] protocols, which allow different social networking apps to communicate with one another, we expect to see more heterogeneous multilayer datasets in the future. In fact, once Fediverse-like datasets are available for research, we think it will be essential to use generalized network data structures such as the proposed HMN.
1. https://www.fediverse.to
2. ActivityPub by Evan Prodromou, Released August 2024, Publisher(s): O'Reilly Media, Inc, ISBN: 9781098169466\.
3. "except this work, there is no mention of heterogeneous multilayered networks in the literature" (from a quick search from Google Scholar, more than 200 papers contain the wording "multilayer heterogeneous network", \[1\] itself propose a framework very similar to the one described in the Manuscript, but they don't mention it, nor they explain why their model is better, or where it differs from it)
**Answer:** We agree with the reviewer that the term “heterogeneous multi-layered” has been used in several works; however, our statement in the paper concerns the definition of the data structure, not the literal name. We already highlighted (in the Introduction and Related Work) that in several works the name multilayer heterogeneous (or heterogeneous multilayer) is used synonymously for the data structure of a multilayer or multiplex network. The definitions of multilayer and multiplex networks are well known in the field \[3\], and one can verify that the definitions used for “multilayer-heterogeneous” networks in the literature are nothing but multilayer or multiplex data structures applied to specific applications \[1,2,4,5,6\]. The same is evident from the reviewer's statement under “Reason for Accept”: “One could reduce each network generated by the authors manuscript to a network as described in \[1\], but that would multiply the number of layers.” This is exactly the limitation of the multilayer networks \[3\] found in the literature and widely used for many applications, as mentioned above \[1,2,4,5,6\].
Furthermore, we agree that we overlooked some of these papers in our literature review. Following the reviewer's suggestion, we have now included several of them in the literature review (Section 3, Page 4 of the modified manuscript).
Regarding the comparison, because the definition of the data structure itself is different, a direct comparison would not be fair. The paper provides a generalized data structure and is not targeted at any particular application (although we do mention some advantages of our representation over existing heterogeneous or multilayer definitions in Section 4.1, Page 8).
1. Wan, M. Zhang, X. Li, L. Sun, X. Wang and K. Liu, Identification of Important Nodes in Multilayer Heterogeneous Networks Incorporating Multirelational Information, IEEE Transactions on Computational Social Systems 9(6) (2022), 1715–1724. doi:10.1109/TCSS.2022.3161305.
2. L. Gyanendro Singh, A. Mitra and S. Ranbir Singh, Sentiment Analysis of Tweets using Heterogeneous Multi-layer Network Representation and Embedding, in: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), ACL, Online, 2020, pp. 8932–8946. doi:10.18653/v1/2020.emnlp-main.718.
3. G. Bianconi, Multilayer Networks: Structure and Function, Oxford University Press, Oxford, 2018, p. 416\. ISBN 9780198753919\. doi:10.1093/oso/9780198753919.001.0001.
4. Y. Tian and O. Yağan, Spreading Processes With Layer-Dependent Population Heterogeneity Over Multilayer Networks, IEEE Transactions on Network Science and Engineering 11(5) (2024), 4106–4119. doi:10.1109/TNSE.2024.3396730.
5. Liu, A. Li, A. Zeng, J. Zhou, Y. Fan and Z. Di, Motif-based community detection in heterogeneous multilayer networks, Scientific Reports 14(1) (2024), 8769\. doi:10.1038/s41598-024-59120-5.
6. M. Bazzi, L.G.S. Jeub, A. Arenas, S.D. Howison and M.A. Porter, A framework for the construction of generative models for mesoscale structure in multilayer networks, Phys. Rev. Research 2(2) (2020), 023100\. doi:10.1103/PhysRevResearch.2.023100.
### The most problematic section is Section 7\.
**The authors introduced in Sec 6 an algorithm to generate HMN, and here they compare the network generated with their algorithm to real-life networks and show that their network is better suited to capture their characteristics compared to classical network generators. (Which is a lovely idea per se)**
**However, they fail to specify the parameters they used to generate the other network or why they chose such parameters. The result is an Erdos–Rènyi graph (they call it erdos-reyni) with a probability of connection p \> 0.5, which leads to an average number of links per node above 10^4. I have more concerns about Figure 6: the degree distribution of an ER graph in a log-log scale should be extremely narrow, the one they generate is not shown in its entirety, and it's very flat, spanning an entire order of magnitude. The Barabasi-Albert degree distribution is not a power-law.**
**Answer:** We have reported the parameters of our generated networks in the revised manuscript. Regarding the ER plot, we use a regression plot with a log scale on both axes. We chose regression plots with log scales to clearly distinguish between the degree distributions of the models. To illustrate the difference, we provide the log-log plots of the ER model (20000 nodes, p = 0.5) and the BA model (20000 nodes, m = 3), with and without regression, in the figures (links below; also added to the supplementary material).




If we plot the ER degree distribution on a log-log scale without regression, it is extremely narrow, as the reviewer notes. This is one of the reasons we chose the regression plot for clarity of presentation.
Kindly note that the regression plot smooths the degree distribution, which may exclude some edge-case data points, but it retains the overall characteristics of the distribution.
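A back-of-the-envelope check of why the unsmoothed ER curve is so narrow, using only the standard binomial degree law of G(n, p) and the parameters quoted above (n = 20000, p = 0.5). The helper name `er_degree_stats` is ours and the snippet is a sketch, not the code behind the figures.

```python
import math


def er_degree_stats(n, p):
    """Mean and standard deviation of a node's degree in an ER graph G(n, p).

    Each node has n - 1 possible neighbours, each present independently with
    probability p, so the degree follows Binomial(n - 1, p).
    """
    mean = (n - 1) * p
    std = math.sqrt((n - 1) * p * (1 - p))
    return mean, std


mean, std = er_degree_stats(20000, 0.5)
# Width of the +/- 3 standard deviation band, in decades on a log10 axis.
decades = math.log10((mean + 3 * std) / (mean - 3 * std))
print(f"mean degree = {mean:.1f}, std = {std:.1f}, +/-3 sigma spans {decades:.3f} decades")
```

With these parameters the mean degree is close to 10^4 and the standard deviation is about 71, so almost all degrees fall within roughly 0.02 decades on a logarithmic axis.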
**There is also a problem with inconsistencies in the labels. They are comparing their model to generate HMN with the TWITT dataset; in the plot, they refer to the TWITT dataset as "user-user", the network they generate is named "synthetic 20000" in the legend, and "synthetic" in the main body of the manuscript, the Erdos-Reny network is called "erdos-reyni", and "Internet as graph" in the main text became "random internet" in the figure legend.**
**Answer:** We thank the reviewer for pointing out the inconsistencies in the labels. We have updated the manuscript to correct these inconsistencies.
### Tables
**In Tables 2-5, the authors compared the network generated with their algorithm with several real-life networks. (I like the idea) But they fail even at generating networks with the same number of nodes. In table 2, for example, they generated both a HMN network and a BINBALL network to mimic the EATN network. The original network had 55 nodes and 97 edges, while the one they generated with their algorithm had 67 nodes and 208 edges, and the one generated with the classical model BINBALL had 106 nodes and 22 edges. Not only the BINBALL air network is not connected, but more than half of the airports (nodes) they generated have 0 connections with other airports.**
**Answer:** The objective was to generate networks that are realistic and that reproduce several properties of the original network. When we generate a synthetic network from the statistics of a real-world network, matching the number of nodes is arguably the least important statistic, because our target was a network with comparable global properties rather than the same number of nodes, as shown in \[10\]. The same can be understood from Barabási's study of ER networks \[1\]; that is also why ER graphs were not able to explain many properties of real-world networks \[2,11\]. The reviewer can refer to the following literature on synthetic network dataset generators for similar approaches \[3-9\]. In our experiments, we consider properties such as centrality, clustering coefficient, and triangles, among others, so we did not match the exact number of nodes but tried to keep the node counts comparable. This is in line with unknown node-correspondence (UNC) methods, which compare two networks based on global structural properties \[10\]. The reviewer has correctly noted that we did not try to replicate the same network exactly, which would correspond to a known node-correspondence (KNC) comparison; instead, we modeled it using our algorithm. Exact replication arguably makes little sense, since simply rewiring the edges of the original network \[11\] would then give much better results. Our final objective is to generate networks that can be used to develop algorithms applicable to HMNs. In the experiments, we tried to show that we can generate heterogeneous, homogeneous, and multilayer networks that follow real-world network properties. By analogy, the algorithm can be used to generate HMNs as well.
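To make the UNC-style comparison concrete, here is a minimal sketch that summarizes two networks of different sizes by global statistics (density, average degree, average clustering) instead of matching node counts. The function name and the toy edge lists are illustrative assumptions; they are not the EATN data or the code used for Tables 2-5.

```python
from itertools import combinations


def global_stats(edges):
    """Global statistics of an undirected simple graph given as an edge list.

    Only nodes that appear in at least one edge are counted.
    """
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2

    def local_clustering(node):
        nbrs = adj[node]
        k = len(nbrs)
        if k < 2:
            return 0.0
        # Count neighbour pairs that are themselves connected.
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        return 2 * links / (k * (k - 1))

    return {
        "nodes": n,
        "edges": m,
        "density": 2 * m / (n * (n - 1)) if n > 1 else 0.0,
        "avg_degree": 2 * m / n if n else 0.0,
        "avg_clustering": sum(local_clustering(v) for v in adj) / n if n else 0.0,
    }


# Two toy graphs with different node counts.
triangle_with_tail = [(1, 2), (2, 3), (1, 3), (3, 4)]
two_triangles = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5), (3, 5)]

stats_a = global_stats(triangle_with_tail)
stats_b = global_stats(two_triangles)
for key in ("density", "avg_degree", "avg_clustering"):
    print(f"{key}: {stats_a[key]:.3f} vs {stats_b[key]:.3f}")
```

No pairing between the nodes of the two graphs is needed, which is what distinguishes this kind of comparison from a KNC one.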
Following the reviewer's suggestion, we have updated all the tables in our manuscript with the parameters to make the results reproducible.
Furthermore, for BINBALL, we implemented the algorithm ourselves because the authors did not make their code available. We rechecked our implementation in light of the concern raised by the reviewer and found the results to be the same. We therefore believe this is a problem with the BINBALL algorithm itself.
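The reviewer's observation about unconnected airports follows from the reported counts alone (106 nodes, 22 edges), independent of any implementation detail, because each edge touches at most two nodes. A quick check, with a helper name of our own choosing:

```python
def min_isolated_nodes(num_nodes, num_edges):
    """Lower bound on the number of degree-0 nodes in a simple graph.

    Every edge has two endpoints, so at most 2 * num_edges nodes can have
    a non-zero degree.
    """
    return max(0, num_nodes - 2 * num_edges)


isolated = min_isolated_nodes(106, 22)
print(f"at least {isolated} of 106 nodes are isolated ({isolated / 106:.0%})")
```

Any generator that outputs these counts must leave more than half of the nodes without connections.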
1. A.-L. Barabási and R. Albert, Emergence of Scaling in Random Networks, Science 286(5439) (1999), 509–512. doi:10.1126/science.286.5439.509.
2. P. Erdős and A. Rényi. On random graphs, i. Publicationes Mathematicae Debrecen, 6:290, 1959\.
3. Paul W. Holland, Kathryn Blackmond Laskey, and Samuel Leinhardt. Stochastic blockmodels: First steps. Social Networks, 5(2):109–137, June 1983\. doi:10.1016/0378-8733(83)90021-7.
4. Stephen J. Young and Edward R. Scheinerman. Random Dot Product Graph Models for Social Networks. In *Algorithms and Models for the Web-Graph*, pages 138–149. Springer, Berlin, Germany, 2007\. [doi:10.1007/978-3-540-77004-6\_11](https://doi.org/10.1007/978-3-540-77004-6_11).
5. “Benchmark graphs for testing community detection algorithms”, Andrea Lancichinetti, Santo Fortunato, and Filippo Radicchi, Phys. Rev. E 78, 046110 (2008).
6. Watts, D. J. ‘Networks, Dynamics, and the Small-World Phenomenon.’ Amer. J. Soc. 105, 493-527, 1999\.
7. Hakimi S., On Realizability of a Set of Integers as Degrees of the Vertices of a Linear Graph. I, Journal of SIAM, 10(3), pp. 496-506 (1962)
8. Albert, R., & Barabási, A. L. (2000) Topology of evolving networks: local events and universality Physical review letters, 85(24), 5234\.
9. M. Bazzi, L.G.S. Jeub, A. Arenas, S.D. Howison and M.A. Porter, A framework for the construction of generative models for mesoscale structure in multilayer networks, Phys. Rev. Research 2(2) (2020), 023100\. doi:10.1103/PhysRevResearch.2.023100.
10. Tantardini, M., Ieva, F., Tajoli, L. *et al.* Comparing methods for comparing networks. *Sci Rep* **9**, 17557 (2019). https://doi.org/10.1038/s41598-019-53708-y
11. The Structure and Function of Complex Networks, M. E. J. Newman, SIAM Review 2003 45:2, 167-256
2 Comments
meta-review by editor
Submitted by Tobias Kuhn on
Both reviewers acknowledge improvements but also express that they consider the contribution of your work to be either unclear or limited. Considering the very skeptical evaluation of the reviewers, I expect the process of turning the manuscript into a paper publishable in Data Science to be very long. Accordingly, I decided to reject your submission.
Michael Maes (https://orcid.org/0000-0001-9416-3211)
Attachment not accessible
Submitted by Shraban Chatterjee on
Kindly allow access to the attachment; we are not able to access it.