Optimizing Machine Learning through Data–Algorithm Matching: An Empirical Study

Ditong Jin; Shenwei Sun

Published: 2025-10-09

Updated: 2025-10-09

Versions:

2025-10-09 (5)

2025-10-09 (4)

2025-10-09 (3)

2025-10-09 (2)

2025-09-29 (1)

Keywords:

Artificial Intelligence, Machine Learning, Data Quality, Algorithm Selection, Model Efficacy

Abstract

With the rapid development of artificial intelligence, machine learning models are now widely applied in fields such as computer vision, natural language processing, and intelligent recommendation. Their efficacy, however, is often constrained by two key factors: data quality and algorithm selection. This empirical study quantifies how data quality indicators (including completeness, accuracy, and consistency) and common machine learning algorithms (such as Random Forest, Support Vector Machine, and Convolutional Neural Network) affect model performance. The experimental results indicate that data completeness and the fit between algorithm and task scenario are the primary determinants of model efficacy. When data completeness reaches 95% and the algorithm matches the task characteristics, the model's average F1-score improves by up to 32% relative to the combination of low-quality data and a mismatched algorithm. These findings provide practical guidance for deploying machine learning models in real-world applications.
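
To make the two quantities named in the abstract concrete, the following is a minimal sketch of how data completeness and a binary F1-score might be computed. The abstract does not give the study's exact definitions, so everything here is an assumption for illustration: completeness is taken as the fraction of non-missing cells, missing values are represented as `None`, and the function names `completeness` and `f1_score` are hypothetical.

```python
from typing import Optional, Sequence


def completeness(rows: Sequence[Sequence[Optional[object]]]) -> float:
    """Fraction of cells that are present (assumed definition: not None)."""
    total = sum(len(row) for row in rows)
    if total == 0:
        return 0.0
    present = sum(1 for row in rows for cell in row if cell is not None)
    return present / total


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Binary F1 = 2*TP / (2*TP + FP + FN) for the chosen positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    # With no positive labels and no positive predictions, report 0.0.
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    # Toy table with one missing cell out of four.
    table = [[1.0, None], [3.0, 4.0]]
    print(completeness(table))  # 0.75

    # One true positive, one false positive, one false negative.
    print(f1_score([1, 1, 0, 0], [1, 0, 1, 0]))  # 0.5
```

Under this assumed definition, the 95% completeness level mentioned in the abstract would correspond to `completeness(table) >= 0.95`. The paper may weight columns or records differently.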

Issue

Vol. 1 No. 1 (2025): September 2025

Section

Articles
