Review Details
Reviewer has chosen to be Anonymous
Overall Impression: Weak
Suggested Decision: Reject
Technical Quality of the paper: Weak
Presentation: Good
Reviewer's confidence: High
Significance: Moderate significance
Background: Reasonable
Novelty: Lack of novelty
Data availability: Not all used and produced data are FAIR and openly available in established data repositories; authors need to fix this
Length of the manuscript: The authors need to elaborate more on certain aspects and the manuscript should therefore be extended (if the general length limit is already reached, I urge the editor to allow for an exception)
Summary of paper in a few sentences:
The authors use a deep 1D Convolutional Neural Network to detect whether threats are present in mobile phone voice calls.
They collected a voice call dataset with more than 9 hours of audio and annotated it with three labels: crime, normal, and sarcastic.
A CNN with an MLP classifies each audio sample into one of the three labels. Two variations, the WeightedRandomSampler and ClassWeight methods, were tried on the dataset.
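For readers unfamiliar with the two imbalance-handling variations named in the summary, the sketch below shows the usual idea behind each. It is an illustration under assumptions, not the authors' implementation: the function names and the toy label counts are hypothetical, and the class-weight formula is the common inverse-frequency heuristic, which the paper may or may not use.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency class weights: n_samples / (n_classes * class_count).

    Rare classes get larger weights, so their errors count more in the loss.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


def per_sample_weights(labels):
    """Per-sample sampling weights of the kind a weighted random sampler consumes.

    Each sample is weighted by 1 / (size of its class), so every class has the
    same total probability mass when samples are drawn.
    """
    counts = Counter(labels)
    return [1.0 / counts[label] for label in labels]


# Hypothetical, imbalanced toy label list (not the paper's data).
toy_labels = ["normal"] * 6 + ["crime"] * 3 + ["sarcastic"] * 1
class_weights = balanced_class_weights(toy_labels)
sample_weights = per_sample_weights(toy_labels)
```

The first approach reweights the loss and the second reweights how often each sample is drawn. Which of the two works better on this dataset is an empirical question for the authors to report.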
Reasons to accept:
Data set contribution.
The paper is well written.
Reasons to reject:
Lack of contributions in the methodology
Lack of analysis
Nanopublication comments:
Further comments:
The major contribution of this paper is the data set. However, more details on the data set collection should be provided.
1. How many annotators were involved in labeling?
2. What is the inter-rater agreement?
3. What guidelines were used for annotating calls as crime, normal, or sarcastic in the Bengali language?
The Introduction section could conclude with the open challenges and the contributions of this research.
The Related Work section could be summarized with a table highlighting the research gaps.
What is the reason for choosing the M11 architecture for this problem? What would be the impact of using a recurrent neural network, which may be a better choice for voice data because it captures context better than CNN architectures?
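As an illustration of how the inter-rater agreement asked about above could be reported for two annotators, here is a minimal sketch of Cohen's kappa. The function name and the toy annotations are hypothetical. With more than two annotators, a measure such as Fleiss' kappa or Krippendorff's alpha would be needed instead.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labeling the same items.

    kappa = (p_observed - p_expected) / (1 - p_expected)
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must label the same non-empty set of items.")
    n = len(rater_a)
    # Fraction of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's label distribution.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    all_labels = set(count_a) | set(count_b)
    expected = sum(count_a[label] * count_b[label] for label in all_labels) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label for every item.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical annotations for six calls (not the paper's data).
annotator_1 = ["crime", "normal", "normal", "sarcastic", "crime", "normal"]
annotator_2 = ["crime", "normal", "sarcastic", "sarcastic", "crime", "normal"]
kappa = cohens_kappa(annotator_1, annotator_2)
```

On this toy example the raters agree on 5 of 6 calls and chance agreement is 1/3, so kappa is 0.75.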
The data set used for evaluation is not a gold standard. The authors selected a few instances as the test set and claim 91% accuracy, which may depend on the test data they chose. It would be better to perform k-fold cross-validation when evaluating on their own data set.
A more detailed empirical analysis could be done with more variations of the methodology.
Error analysis and statistical analysis should be included.
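A minimal sketch of the stratified k-fold split suggested here is given below, assuming only a list of per-sample labels. The function name, the fold count, and the toy label counts are hypothetical. In practice the authors may also need to group recordings by speaker or call so that segments of the same call do not appear in both the train and test folds.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_indices, test_indices) for k folds.

    Indices of each class are shuffled and dealt round-robin into the folds,
    so every fold keeps roughly the same class proportions.
    """
    rng = random.Random(seed)
    indices_by_class = defaultdict(list)
    for index, label in enumerate(labels):
        indices_by_class[label].append(index)

    folds = [[] for _ in range(k)]
    for class_indices in indices_by_class.values():
        rng.shuffle(class_indices)
        for position, index in enumerate(class_indices):
            folds[position % k].append(index)

    for i in range(k):
        test_indices = sorted(folds[i])
        train_indices = sorted(
            index for j, fold in enumerate(folds) if j != i for index in fold
        )
        yield train_indices, test_indices


# Hypothetical label list (not the paper's data).
toy_labels = ["crime"] * 10 + ["normal"] * 20 + ["sarcastic"] * 5
splits = list(stratified_k_fold(toy_labels, k=5))
```

Reporting the mean and standard deviation of accuracy, precision, recall, and F1 over the folds would make the 91% figure less dependent on one hand-picked test set.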
2 Comments
Review the paper and comment.
Submitted by Malik Jawarneh on
Positive Comments:
-The research paper proposes a unique approach to detect potential threats in phone calls using deep learning.
-The proposed system uses a Deep 1D Convolutional Neural Network to analyze the calls and a Multi-Layer Perceptron to decide whether any threats exist or not.
-The proposed simple baseline solution is able to achieve 91% precision, recall and F1-score in detecting the crime calls.
-The recorded calls are freely available for use by future researchers.
Negative Comments:
-The research paper does not provide any specifics on the Multi-Layer Perceptron used in the system.
-The research paper does not provide any information on the challenges faced while collecting the dataset.
-The research paper does not provide any information on the potential applications of the system.
Meta-Review by Editor
Submitted by Tobias Kuhn on
There are many questions from the reviewers that are not answered in the paper. I encourage the authors to consider the reviewers' suggestions carefully to improve the paper in future work.
Important questions to address when updating the paper are:
1) Annotation process, including the guidelines and annotators (questions from reviewer 1 and reviewer 2), since the paper's main contribution is the dataset
2) Situation in code-mixed, real-world scenarios (reviewer 1 question)
3) Lack of analysis (reviewer 2 question)
4) Details about the models used
Bharathi Raja Chakravarthi (https://orcid.org/0000-0002-4575-7934)