Review Details
Reviewer has chosen to be Anonymous
Overall Impression: Weak
Suggested Decision: Reject
Technical Quality of the paper: Average
Presentation: Average
Reviewer's confidence: High
Significance: Low significance
Background: Reasonable
Novelty: Limited novelty
Data availability: All used and produced data (if any) are FAIR and openly available in established data repositories
Length of the manuscript: The authors need to elaborate more on certain aspects and the manuscript should therefore be extended (if the general length limit is already reached, I urge the editor to allow for an exception)
Summary of paper in a few sentences:
The authors introduce an outlier detection approach that decomposes the data using k-means clustering to reduce the data space, and then applies existing outlier detection algorithms to the reduced space. The approach has two stages. In stage 1, the authors perform data decomposition, using space partitioning to split the input data into sub-groups. In stage 2, they assign outlier detectors to the sub-groups according to the chosen outlier detection algorithm. The authors evaluate the approach by combining it with a range of outlier detection techniques and analysing it across a range of datasets.
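For readers unfamiliar with this kind of pipeline, the two stages summarised above might be sketched roughly as follows. This is an illustration only, not the authors' implementation: the function names, the evenly spaced centroid initialisation, the distance-to-median score used as a stand-in detector, and the threshold of 3.0 are all assumptions introduced here.

```python
import math
import statistics


def partition_kmeans(points, k, iters=50):
    """Stage 1 (sketch): plain Lloyd's k-means, returns one label per point.

    Assumption: centroids start at evenly spaced input points so the
    result is deterministic; a real pipeline would likely use k-means++.
    """
    n = len(points)
    centroids = [points[i * n // k] for i in range(k)]
    labels = [0] * n
    for _ in range(iters):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: mean of each sub-group (keep old centroid if empty).
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels


def subgroup_scores(members):
    """Stage 2 stand-in detector (hypothetical): distance to the sub-group
    mean divided by the median of those distances within the sub-group."""
    centroid = tuple(sum(x) / len(members) for x in zip(*members))
    dists = [math.dist(p, centroid) for p in members]
    med = statistics.median(dists)
    if med == 0:
        return [0.0 if d == 0 else math.inf for d in dists]
    return [d / med for d in dists]


def two_stage_outliers(points, k, threshold=3.0):
    """Partition the data, score each sub-group independently, and return
    the sorted indices of points whose score exceeds the threshold."""
    labels = partition_kmeans(points, k)
    flagged = []
    for c in range(k):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        if not idx:
            continue
        scores = subgroup_scores([points[i] for i in idx])
        flagged.extend(i for i, s in zip(idx, scores) if s > threshold)
    return sorted(flagged)


# Toy data: two tight groups plus one stray point near the first group.
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (-0.1, 0.0), (0.0, -0.1),
    (1.5, 1.5),
    (10.0, 10.0), (10.1, 10.0), (10.0, 10.1), (10.1, 10.1), (9.9, 10.0),
    (10.0, 9.9),
]
print(two_stage_outliers(points, k=2))
```

Note that the sketch takes k as an input, so it leaves open the question of how k would be chosen when no labels are available.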
Reasons to accept:
If the authors can revise the paper so that it motivates the need for a new clustering approach, e.g., by identifying gaps in the literature, then the authors' approach would be justified.
Reasons to reject:
The paper does not motivate the authors' approach well. It does not identify gaps in the literature and then explain how the authors are filling those gaps.
It is not clear what is novel here. Outlier clustering was described in 2004 by Hodge & Austin and in 2012 by Han, Kamber & Pei. Why is the method proposed here novel, and what is different from previous approaches? The authors need to discuss previous approaches from the literature and identify the novelty more clearly.
Han, J., Kamber, M., & Pei, J. (2012). Outlier Detection. In Data Mining (pp. 543–584). Elsevier. https://doi.org/10.1016/b978-0-12-381479-1.00012-5
Hodge, V., & Austin, J. (2004). A survey of outlier detection methodologies. Artificial Intelligence Review, 22, 85–126.
Section 3 is a literature review that offers no motivation and does not compare or contrast the existing approaches. What are the gaps in the literature? Why is a new approach necessary? What gap or gaps is the authors' technique filling?
On page 18, how do the authors select k? In the results in the tables and figures, the optimal k value varies across data sets. Does this method work solely with labelled data? On page 2, line 8, the authors state that unsupervised outlier detection is the exciting research area and then introduce their approach, which leaves the reader to assume that it is unsupervised. How would a user of the authors' approach select the optimal k value without labelled data, which is often not available in outlier detection?
See Han, Kamber & Pei (2012).
The paper needs more explanation of ideas and concepts.
Page 1, line 42, the authors introduce supervised, semi-supervised, and unsupervised outlier detection but do not describe them or provide a citation, e.g., Han, Kamber & Pei (2012).
Page 2, line 13, how is data decomposition different to clustering?
Page 7, what does figure 3 show? This needs to be explained better. What do panels 3a, 3b, 3c and 3d show?
Page 8, the caption for figure 3 should explain what the figure shows.
Page 11, table 2, are these outliers prescribed in the UCI repository or have the authors chosen which class to choose as outliers? If the authors have chosen then they need to motivate why they have chosen each class in each data set.
Page 16, figure 5, please explain what this shows. What are the numbers on the bars?
Page 18, line 8, "Table ??" is an unresolved table reference.
The layout needs improving: figures should be placed near where they are first cited, but some appear 2 or 3 pages later.
Page 2, line 15, I assume "2 - d" should be "2-d"?
Nanopublication comments:
Further comments:
1 Comment
meta-review by editor
Submitted by Tobias Kuhn on
The paper is rejected due to a lack of novelty, as it fails to present new or groundbreaking findings in its field. Additionally, weaknesses in the methodology and analysis are identified. Furthermore, the paper does not clearly articulate its relation to existing literature, and it requires more thorough clarification on its theories and principles, to enhance understanding and context.
Francesca D. Faraci (https://orcid.org/0000-0002-8720-1256)