Reviewer has chosen not to be Anonymous
Overall Impression:
Undecided
Technical Quality of the paper:
Limited novelty
Data availability:
All used and produced data (if any) are FAIR and openly available in established data repositories
Length of the manuscript:
The length of this manuscript is about right
Summary of paper in a few sentences (summary of changes and improvements for second round reviews):
This manuscript describes an approach to estimate the similarity of networks based on a Deep Learning model. The goal is to predict the best generative model to approximate a given real network.
Reasons to accept:
I am focusing my review here on how well the comments by the previous reviewers have been taken into account.
The authors have, in my view, sufficiently addressed the issue of assortativity and provided more information on the case study.
Reasons to reject:
The problem of variability in the network properties / parameters is in my view not sufficiently taken care of. Table 1 still looks very suspicious and the mentioned range of network sizes seems to confirm that suspicion. Moreover, Figure 4 is highly confusing as it includes results obtained from different settings. It seems likely that TripleFit had an unfair advantage. See my points below with some more details.
- "The size of the network is randomly chosen from 1000 to 5000 nodes": This seems to be a quite narrow range. Why was this range not made larger? (like from 10 to 1 million nodes)
- This is related to the issue of variability raised by one of the previous reviewers. Covering larger, and in particular also smaller, networks would increase this variability and probably significantly change Table 1. (For example, networks of just 10 nodes are probably often indiscernible, and so no perfect accuracy will be possible.)
- It seems the dataset and process used by the authors of  were not identical to the ones presented here, and therefore Figure 4 is highly confusing (the top performance of TripleFit was achieved under different settings than the other methods).
- The data points in the plot in Figure 3 are drawn as numbers, but they mostly overlap so heavily that the numbers cannot be read. The data points should therefore be small dots (or crosses) instead of numbers (possibly even semi-transparent, to convey the density visually as well).
- It's unclear how the triplets are generated (one per dataset?). It seems this can't be exhaustive, so I assume some elements are picked randomly, but it's not clear how.
- Are the equations (10)-(14) different from what other such approaches normally use? This should be stated, and if they are the same, possibly not all these equations need to be shown here.
- Abstract: "network's" > "networks"
- Introduction: "Lots of literature" should be phrased better (e.g. "Many existing works" or "A large amount of existing literature")
- Introduction: missing "and" in front of "classical graph similarity approaches"
- Materials and Methods: "network structural similarity" should probably be "structural network similarity"
- page 7: "Deep Learning algorithm" > "The Deep Learning algorithm" or just "Deep Learning"
- Figure 3: a legend with the color codes would be helpful.
- page 11: "iteartion" > "iteration"
- Equation (15) is unnecessary, as the Euclidean distance is so well known and simple.
- Figure 6: also show numerical values in addition to color shades
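To make the question about triplet generation above more concrete, the following is a minimal sketch of one possible random sampling scheme. It is an assumption for illustration only and not the procedure used in the manuscript: the function name `sample_triplets`, the class labels "A", "B", "C", and the flat list of labels are all hypothetical. Each triplet consists of an anchor and a positive drawn from the same generative-model class and a negative drawn from a different class.

```python
import random
from collections import defaultdict


def sample_triplets(labels, n_triplets, seed=0):
    """Randomly sample (anchor, positive, negative) index triplets.

    labels[i] is the generative-model class of network i
    (a hypothetical data layout, assumed for this sketch).
    """
    rng = random.Random(seed)

    # Group network indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Only classes with at least two members can supply an anchor/positive pair.
    anchor_classes = [c for c, members in by_class.items() if len(members) >= 2]
    if not anchor_classes or len(by_class) < 2:
        raise ValueError("need at least 2 classes and one class with 2 networks")

    triplets = []
    for _ in range(n_triplets):
        pos_class = rng.choice(anchor_classes)
        neg_class = rng.choice([c for c in by_class if c != pos_class])
        # Anchor and positive are two distinct networks of the same class.
        anchor, positive = rng.sample(by_class[pos_class], 2)
        # The negative comes from a different class.
        negative = rng.choice(by_class[neg_class])
        triplets.append((anchor, positive, negative))
    return triplets


# Hypothetical example: six networks from three generative-model classes.
labels = ["A", "A", "B", "B", "C", "C"]
triplets = sample_triplets(labels, n_triplets=5)
```

A description at this level of detail (how many triplets are drawn, whether sampling is uniform over classes, and whether a fixed seed is used) would resolve the ambiguity in the manuscript.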
Meta-Review by Editor
Submitted by Tobias Kuhn on
We ask you to carefully address all points raised by the reviewers, focusing on the size of the studied networks and on improving Figure 4 and Table 1. I also echo Reviewer 1's recommendation to extend the discussion section and to carefully proofread your paper.
Michael Maes (https://orcid.org/0000-0001-9416-3211)