Blind label ratio estimation

Kazemitabar, Javad; Rahimi, Soheila; Rezaei-Ghadim, Amir

doi:10.22034/jsmta.2026.22764.1173

Journal of Statistical Modelling: Theory and Applications
Volume 6, Issue 1, Farvardin (March-April) 2025, Pages 105-114
Article type: Original Scientific Paper
Authors
Javad Kazemitabar^* ¹; Soheila Rahimi²; Amir Rezaei-Ghadim²
¹Department of Electrical and Computer Engineering, Babol Noshirvani University of Technology, Tehran, Iran
²Son Corporate Group, Tehran, Iran
Abstract
Many anomaly detection algorithms need to know the ratio of the two labels in order to operate. In practice, however, this value is often unavailable, so anomaly detection packages are run with default values that may differ significantly from the actual ratio. Experiments on multiple datasets show that determining this ratio correctly, or at least obtaining a close estimate, can make a significant difference in the final performance of the anomaly detection algorithm. In this paper, we address the problem of estimating this ratio using both theoretical and heuristic techniques. In the theoretical method, we maximize the mutual information between features and labels to find the exact ratio. In the heuristic method, we sweep the [0,1] range in steps of 0.01 to search for the ratio. On each iteration, we run the anomaly detection algorithm with the ratio for that iteration and record the correlation coefficient between the features and the labels generated by the algorithm. After the 100th iteration, we declare the ratio that yields the maximum correlation coefficient to be our estimate of the label ratio. Our experiments on multiple datasets and several anomaly detection algorithms show that maximizing the correlation coefficient leads to the best results.
Keywords
Anomaly detection; Correlation coefficient; Mutual information; One-class support vector machine; Spectral ranking of anomalies
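
The heuristic sweep described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_detector` is a hypothetical stand-in for a real anomaly detection algorithm, the data has a single feature, the Pearson coefficient is assumed as the correlation measure, and ratios that produce constant labels are scored as zero. All function names and the toy data are assumptions made for illustration.

```python
import statistics


def toy_detector(values, ratio):
    """Hypothetical stand-in detector (not from the paper): label as
    anomalous (1) the `ratio` fraction of points farthest from the median."""
    n_flag = round(ratio * len(values))
    center = statistics.median(values)
    order = sorted(range(len(values)),
                   key=lambda i: abs(values[i] - center), reverse=True)
    labels = [0] * len(values)
    for i in order[:n_flag]:
        labels[i] = 1
    return labels


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is
    constant (an assumption, since the coefficient is undefined there)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / (sxx * syy) ** 0.5


def estimate_label_ratio(values, detector=toy_detector, steps=100):
    """Sweep candidate ratios over [0, 1] in increments of 1/steps and keep
    the ratio whose generated labels correlate most with the feature."""
    best_ratio, best_score = 0.0, float("-inf")
    for k in range(steps + 1):
        ratio = k / steps
        labels = detector(values, ratio)
        score = pearson(values, labels)
        if score > best_score:
            best_ratio, best_score = ratio, score
    return best_ratio, best_score


def demo_values():
    """Toy data: 90 'normal' points in [-1, 1] and 10 anomalies in [7, 9]."""
    normals = [-1 + 2 * i / 89 for i in range(90)]
    anomalies = [7 + 2 * j / 9 for j in range(10)]
    return normals + anomalies


ratio, score = estimate_label_ratio(demo_values())
print(f"estimated ratio = {ratio:.2f}, correlation = {score:.3f}")
```

The toy data contains 10 anomalies among 100 points, so an estimate of 0.10 is the value the sweep should recover here. With a real detector, the `detector` argument would wrap whatever parameter that detector uses for the expected anomaly fraction, and how the correlation is aggregated across several features is left open in this sketch.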
