Abstract:
Current ocean data quality control methods are limited in accuracy and do not fully account for the interdependencies among data factors. To address these limitations, this paper proposes an ocean data anomaly detection method based on curve fitting and dynamic thresholding. The method improves anomaly detection in ocean temperature profile data by combining a density clustering algorithm, curve fitting, and an adaptive threshold adjustment strategy. First, the ocean temperature data are partitioned with a density clustering algorithm, and the largest cluster is selected as an approximation of the normal data. Second, the relationship between temperature and depth within this cluster is modeled by curve fitting with the corresponding curve functions, and the residuals between the original data and the fitted values are calculated. Finally, the maximum temperature differences at different depths are calculated, and a model relating temperature difference to depth is established to adjust the discrimination threshold dynamically, so that anomalous data points can be detected precisely. Experimental results on ocean temperature profile observations from three regions in the West Pacific show that the proposed method has significant advantages over current operational quality control methods and machine learning techniques, achieving an F1 score of up to 99.53%. The method improves the accuracy of anomaly detection in ocean temperature data and provides robust technical support for marine scientific research.
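The three stages described above can be illustrated with a minimal, standard-library Python sketch. It is not the paper's implementation, and every concrete choice in it is an assumption made for illustration: the brute-force DBSCAN-style clustering, the scaling of both axes to [0, 1], the quadratic fitting curve, the per-bin threshold table (standing in for the fitted temperature-difference/depth model), the parameters `eps`, `min_pts`, `n_bins`, and `margin`, and the synthetic profile itself. All function names are hypothetical.

```python
import math

def largest_dbscan_cluster(points, eps, min_pts):
    """Indices of the largest cluster found by a brute-force DBSCAN."""
    n = len(points)
    nbrs = [[j for j in range(n) if math.dist(points[i], points[j]) <= eps]
            for i in range(n)]
    labels = [None] * n            # None: unvisited, -1: noise, >= 0: cluster id
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(nbrs[i]) < min_pts:
            labels[i] = -1
            continue
        cluster += 1
        labels[i] = cluster
        stack = list(nbrs[i])
        while stack:
            j = stack.pop()
            if labels[j] == -1:
                labels[j] = cluster            # noise becomes a border point
            elif labels[j] is None:
                labels[j] = cluster
                if len(nbrs[j]) >= min_pts:
                    stack.extend(nbrs[j])      # expand only from core points
    sizes = {}
    for lab in labels:
        if lab >= 0:
            sizes[lab] = sizes.get(lab, 0) + 1
    best = max(sizes, key=sizes.get)           # raises ValueError if no cluster exists
    return [i for i, lab in enumerate(labels) if lab == best]

def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients, lowest order first."""
    m = degree + 1
    # Augmented normal equations, solved by Gauss-Jordan elimination with pivoting.
    a = [[sum(x ** (r + c) for x in xs) for c in range(m)]
         + [sum(y * x ** r for x, y in zip(xs, ys))] for r in range(m)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(m):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * p for v, p in zip(a[r], a[col])]
    return [a[r][m] / a[r][r] for r in range(m)]

def polyval(coeffs, x):
    return sum(c * x ** k for k, c in enumerate(coeffs))

def detect_anomalies(depths, temps, eps=0.1, min_pts=3, degree=2,
                     n_bins=5, margin=1.5):
    # Assumption: scale both axes to [0, 1] so one eps applies to both.
    d0, d1, t0, t1 = min(depths), max(depths), min(temps), max(temps)
    x = [(d - d0) / (d1 - d0) for d in depths]
    y = [(t - t0) / (t1 - t0) for t in temps]
    # Stage 1: the largest density cluster approximates the normal data.
    normal = largest_dbscan_cluster(list(zip(x, y)), eps, min_pts)
    # Stage 2: fit temperature against scaled depth on that cluster only.
    coeffs = polyfit([x[i] for i in normal], [temps[i] for i in normal], degree)
    resid = [abs(t - polyval(coeffs, xi)) for xi, t in zip(x, temps)]
    # Stage 3 (assumed form): threshold per depth bin, taken as the largest
    # residual of the normal points in that bin times a safety margin.
    def bin_of(xi):
        return min(int(xi * n_bins), n_bins - 1)
    bin_max = [0.0] * n_bins
    for i in normal:
        b = bin_of(x[i])
        bin_max[b] = max(bin_max[b], resid[i])
    fallback = max(bin_max)                    # used for bins with no normal points
    thresholds = [margin * (bm if bm > 0 else fallback) for bm in bin_max]
    return [i for i, r in enumerate(resid) if r > thresholds[bin_of(x[i])]]

# Synthetic profile (not real data): quadratic cooling with depth, a small
# deterministic perturbation, and three injected anomalies.
depths = [10.0 * i for i in range(51)]
temps = [28 - 0.09 * z + 0.00009 * z * z + 0.1 * math.sin(i)
         for i, z in enumerate(depths)]
for i, offset in ((10, 6.0), (30, -5.0), (45, 4.0)):
    temps[i] += offset

print(detect_anomalies(depths, temps))  # → [10, 30, 45]
```

On this synthetic profile the three injected points are isolated by the clustering stage and then exceed the depth-dependent threshold, while every point in the largest cluster stays below it by construction. Real profiles would likely need a different fitting curve and a tuned threshold model.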