Reviewer has chosen not to be Anonymous
Overall Impression: Bad
Suggested Decision: Reject
Technical Quality of the paper: Unable to judge
Presentation: Bad
Reviewer's confidence: High
Significance: Low significance
Background: Incomplete or inappropriate
Novelty: Unable to judge
Data availability: Not all used and produced data are FAIR and openly available in established data repositories; authors need to fix this
Length of the manuscript: The authors need to elaborate more on certain aspects and the manuscript should therefore be extended (if the general length limit is already reached, I urge the editor to allow for an exception)
Summary of paper in a few sentences:
The paper describes a technique for predicting software bugs based on fuzzy logic, to be used in situations where the training dataset contains a small number of positive labels.
Although the work is interesting and relevant, the paper raises many questions about the soundness of its methodology and results, so I recommend rejection.
Specifically, there are too many unjustified statements and modeling choices. Without proper justification, i.e. citing previous work or showing convincing evidence, it is impossible to determine whether the claims are true and whether the results are novel.
Reasons to accept:
None
Reasons to reject:
A large body of literature is neither discussed nor compared against the proposed method, which weakens the paper significantly (see the survey paper D’Ambros et al. 2012. Evaluating defect prediction approaches: a benchmark and an extensive comparison).
The technique is evaluated with a classic metric: the ROC curve. Choosing it over a more sophisticated metric, e.g. an effort-aware one, is not justified.
The authors state that some attributes of the dataset have been removed because they were weakly relevant, yet no evidence of this is presented.
Furthermore, the results of the simulations are not compared to any other existing technique, thus it is impossible to evaluate the quality of the proposed method.
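To make the point about evaluation metrics concrete, the following minimal Python sketch contrasts the area under the ROC curve with one possible effort-aware measure. It is only an illustration of the kind of analysis I have in mind, not the authors' method and not a specific metric from the literature: the function names, the 20% inspection budget, the ranking by predicted risk per line of code, and the toy data are all my own assumptions.

```python
def auc(scores, labels):
    """Area under the ROC curve, computed as the fraction of
    (defective, clean) pairs that the scores rank correctly (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def recall_at_effort(scores, labels, loc, budget=0.2):
    """Hypothetical effort-aware measure: fraction of defective modules found
    when modules are inspected in decreasing order of predicted risk per line
    of code, stopping before the inspected lines exceed `budget` of the total."""
    order = sorted(range(len(scores)), key=lambda i: scores[i] / loc[i], reverse=True)
    limit = budget * sum(loc)
    spent = 0
    found = 0
    for i in order:
        if spent + loc[i] > limit:
            break
        spent += loc[i]
        found += labels[i]
    return found / sum(labels)


# Toy data (invented for illustration): four modules with predicted
# defect scores, true labels, and sizes in lines of code.
scores = [0.9, 0.8, 0.3, 0.2]
labels = [1, 0, 1, 0]
loc = [1000, 100, 50, 50]

print(auc(scores, labels))                    # ranking quality, ignores module size
print(recall_at_effort(scores, labels, loc))  # defects found within the review budget
```

In this toy example the highest-scored module is also by far the largest, so a model that looks reasonable under the ROC curve finds only part of the defects within a fixed inspection budget. A discussion along these lines would help justify the choice of metric.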
Nanopublication comments:
Further comments:
There are several grammatical errors in the text, so thorough proofreading of the paper by a native English speaker is recommended.
The introduction talks about how other authors used a specific dataset and lists its attributes.
Although this information is useful, it is not presented appropriately: it should be moved to either the related work section or the methods section; the dataset is not named; the specific works relying on this dataset should be explicitly cited; and the attribute names are not informative enough by themselves.
Figure 2 is a screenshot showing several plots, which are not readable and lack labels and captions. This figure should be rethought in order to convey more information with fewer charts.
Figure 1 and the description of the layers can be improved.
Table 2 is not explained.
Figures 3 to 6 need titles, labels, legends and a baseline for comparison.
Overall, the paper does not look carefully prepared and gives the impression of having been put together in a hurry.