Abstract:
This research introduces an enhanced SS-DBSCAN, a scalable and robust density-based clustering algorithm designed to tackle challenges in high-dimensional and complex data analysis. The algorithm integrates parameter optimization techniques to improve clustering accuracy and interpretability. A key innovation is a Fast Grid Search (FGS) method that searches for the optimal MinPts while holding the previously obtained epsilon parameter constant. Notably, this study emphasizes the often-overlooked MinPts parameter, introducing a dynamic approach that first calculates density metrics within a specified epsilon distance and then adjusts the MinPts search range based on the standard deviation of these metrics. The approach then identifies optimal MinPts values based on the maximum allowed range. Comprehensive experiments on five real-world datasets demonstrate SS-DBSCAN's superior performance compared to DBSCAN, HDBSCAN, and OPTICS, as evidenced by higher silhouette and Davies-Bouldin Index scores. The results highlight SS-DBSCAN's ability to capture intrinsic clustering structures accurately, providing deeper insights across various research domains. SS-DBSCAN's scalability and adaptability to diverse data densities make it a valuable tool for analyzing large, complex datasets.
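The MinPts search summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact rule, so the sketch assumes that the density metric is the number of neighbors within epsilon, that the candidate range is the mean count plus or minus one standard deviation, and that candidates are ranked by silhouette score. It also uses plain DBSCAN rather than SS-DBSCAN, and all function names (`neighbor_counts`, `minpts_range`, `grid_search_minpts`) are hypothetical.

```python
import math
import statistics


def neighbor_counts(points, eps):
    """Density metric (assumed): points within distance eps, self included."""
    return [sum(1 for q in points if math.dist(p, q) <= eps) for p in points]


def minpts_range(points, eps, floor=2):
    """Assumed rule: candidate MinPts span mean count +/- one standard deviation."""
    counts = neighbor_counts(points, eps)
    mean, std = statistics.mean(counts), statistics.pstdev(counts)
    lo = max(floor, math.floor(mean - std))
    hi = max(lo, math.ceil(mean + std))
    return range(lo, hi + 1)


def dbscan(points, eps, min_pts):
    """Plain DBSCAN; returns one label per point, with -1 marking noise."""
    n = len(points)
    neighbors = [[j for j in range(n) if math.dist(points[i], points[j]) <= eps]
                 for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        frontier = list(neighbors[i])
        while frontier:
            j = frontier.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point is a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:
                frontier.extend(neighbors[j])  # only core points expand the cluster
    return labels


def silhouette(points, labels):
    """Mean silhouette over clustered points; -1.0 when fewer than two clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        if lab != -1:
            clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return -1.0
    scores = []
    for lab, members in clusters.items():
        for p in members:
            if len(members) == 1:
                scores.append(0.0)
                continue
            a = sum(math.dist(p, q) for q in members) / (len(members) - 1)
            b = min(sum(math.dist(p, q) for q in other) / len(other)
                    for other_lab, other in clusters.items() if other_lab != lab)
            scores.append((b - a) / max(a, b))
    return statistics.mean(scores)


def grid_search_minpts(points, eps):
    """Hold eps fixed and keep the MinPts candidate with the best silhouette."""
    best = None
    for min_pts in minpts_range(points, eps):
        labels = dbscan(points, eps, min_pts)
        score = silhouette(points, labels)
        if best is None or score > best[1]:
            best = (min_pts, score, labels)
    return best


# Toy data: two tight 3x3 blobs and one isolated outlier.
blob_a = [(0.1 * i, 0.1 * j) for i in range(3) for j in range(3)]
blob_b = [(5 + 0.1 * i, 5 + 0.1 * j) for i in range(3) for j in range(3)]
points = blob_a + blob_b + [(2.5, 10.0)]
eps = 0.3
best_min_pts, best_score, best_labels = grid_search_minpts(points, eps)
print(best_min_pts, round(best_score, 3), best_labels)
```

Restricting the search to a range derived from the observed neighbor counts keeps the number of DBSCAN runs small, which is presumably what makes the grid search "fast". The brute-force distance computations here are quadratic in the number of points and are only suitable for illustration.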