Intuitiveness as the next stage of open data: dataset design and complexity

Tracking #: 944-1924

Authors:


Submission Type: 

Research Paper

Abstract: 

Purpose: Open data platforms face a critical challenge: datasets designed for expert users remain inaccessible to citizens and professionals with varying data literacy levels. This paper addresses the need for intuitive datasets whose complexity can adapt to user needs and capabilities, enabling broader value extraction from open data.
Methods: We develop a conceptual meta-design framework grounded in design science research methodology and hierarchical design patterns. The framework defines five levels of data abstraction (L0–L4), from atomic datums to unlinkable multi-level datasets. We formalize dataset complexity mathematically and demonstrate that transitions between abstraction levels achieve 75–100% complexity reduction. The framework is implemented as the open-source intuitiveness Python package, featuring AI-assisted entity discovery, semantic domain matching, and interactive navigation. We validate the approach through a case study with a major international logistics operator managing 8,368 indicators across multiple data sources.
Results: The descent phase (L4→L0) transformed chaotic metadata into a clear atomic metric, revealing 40,279 relationships and enabling systematic identification of redundancy clusters. The ascent phase (L0→L3) reconstructed intuitive multi-level tables that directly answered business questions about indicator consolidation. The accompanying Streamlit interface integrates with France’s data.gouv.fr platform, making 45,000+ datasets accessible through guided workflows supporting both descent and ascent operations.
Conclusions: The framework provides a principled method for designing intuitive datasets that serve diverse data publics, from citizens seeking single facts to data scientists building complex products. By formalizing complexity levels and reduction mechanisms, we enable open data platforms to move beyond one-size-fits-all presentations toward adaptive, user-responsive dataset designs. Future work includes formal user studies across literacy levels and extension to streaming data contexts.

Manuscript: 

Tags: 

  • Reviewed

Data repository URLs: 

Date of Submission: 

Friday, January 9, 2026

Date of Decision: 

Wednesday, January 14, 2026


Nanopublication URLs:

Decision: 

Reject (Pre-Screening)