--------------------------
Response to reviewer #1
## Abstract
The findings are now reported as averages, because a new dataset from the CSV on the Web Working Group (https://github.com/w3c/csvw) has been added to the experiments.
## Introduction
A brief explanation has been added of why CSV file headers do not require additional configuration, together with a short review of how the methodology of the state-of-the-art tool differs from that of the new proposal and of the advantages of adopting the proposed approach.
The missing reference to the RFC 4180 specification has been added as a footnote.
## Related Work
The "ad-dressed" typo has been fixed.
## Problem formulation
The section has been renamed to "Preliminaries".
## Table uniformity
The section has been moved to a subsection of the new Approach section.
## Type detection
A concise note on how data type detection is implemented has been added. It should be noted that numerical data can be inferred in a very practical way with the facilities that programming languages already provide.
Regarding IPv6: data type detection is necessarily an incomplete process, since it favors some data types over others. This does not mean that new data types cannot be incorporated; doing so could significantly increase the accuracy of dialect detection.
Regarding empty data: the decision to favor fields with data over empty ones is based on studies showing that only a low percentage of the files available in large web repositories contain empty columns. This clarification has been duly added to the paper.
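The points above on numerical inference, extensibility to types such as IPv6, and the treatment of empty fields can be illustrated with a minimal Python sketch. It is not the implementation described in the paper: the function names (`is_numeric`, `is_ipv6`, `detect_type`, `known_type_ratio`), the detector list, and the scoring ratio are all hypothetical and serve only to show the general idea of a type detector that can be extended with new types.

```python
import ipaddress


def is_numeric(field: str) -> bool:
    """Rely on the language's own parser to recognise numbers.

    Note: float() also accepts values such as "inf", "nan" and "1_000".
    """
    try:
        float(field)
        return True
    except ValueError:
        return False


def is_ipv6(field: str) -> bool:
    """Recognise IPv6 addresses with the standard library."""
    try:
        ipaddress.IPv6Address(field)
        return True
    except ValueError:  # AddressValueError is a subclass of ValueError
        return False


# Hypothetical registry: supporting a new type means appending one entry.
DETECTORS = [
    ("numeric", is_numeric),
    ("ipv6", is_ipv6),
]


def detect_type(field: str) -> str:
    """Return the first matching type name, "empty", or "unknown"."""
    if field.strip() == "":
        return "empty"
    for name, check in DETECTORS:
        if check(field):
            return name
    return "unknown"


def known_type_ratio(fields: list[str]) -> float:
    """Share of fields with a recognised, non-empty type (illustrative score)."""
    if not fields:
        return 0.0
    known = sum(1 for f in fields if detect_type(f) not in ("empty", "unknown"))
    return known / len(fields)
```

In this sketch, empty fields do not count toward the ratio, which mirrors the decision to favor fields with data over empty ones. A field that no detector recognises is reported as "unknown", reflecting that detection is necessarily incomplete.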
## Determining CSV file dialects
The numerical example has been moved to the Experiments section.
## Evaluation setup
The section is now covered in the paper.
Regarding CSVW test cases: this dataset is now part of experiments.
## Experiments
Regarding FAIR principles: the Zenodo record is available at https://zenodo.org/records/11331538.
The section now describes how the experiments were performed and contains the example moved from the "Determining CSV file dialects" section.
## Conclusion
This section has been added.
--------------------------
Response to reviewer #2
- Regarding tool and data files published on GitHub: Zenodo record at https://zenodo.org/records/11331538
- Regarding the file attached to a GitHub issue: the sentence now reads "File was accessed from the CleverCSV repository on GitHub".
## Further comments
The conclusion section has been added. It discusses the practical impact of the methodology in a data mining environment, performance considerations relating to CleverCSV, and future work tending towards a hybrid system in which an LLM post-processes the data loaded with the inferred dialects.
meta-review by editor
Submitted by Tobias Kuhn on
The reviewers agree that the remaining shortcomings have been resolved, and the paper can therefore be accepted for publication.
Tobias Kuhn (https://orcid.org/0000-0002-1267-0234)