Predictive Ethereum Fraud Detection with Few-Shot and Active Learning on Scarce Labels


Ayaz T. B., Özara M. F., Çelik A. E., AKBULUT A.

9th International Artificial Intelligence and Data Processing Symposium, IDAP 2025, Malatya, Türkiye, 6-7 September 2025 (Full-Text Paper)

  • Publication Type: Conference Paper / Full-Text Paper
  • DOI: 10.1109/idap68205.2025.11222326
  • City of Publication: Malatya
  • Country of Publication: Türkiye
  • Keywords: Active Learning, Blockchain, Ethereum, Few-Shot Learning, Fraud Detection, Machine Learning, Security
  • Affiliated with İstanbul Kültür University: Yes

Abstract

Blockchain systems promote transparency, decentralization, and reliability. Nevertheless, they remain vulnerable to increasingly sophisticated fraudulent actors, particularly on newly launched platforms with little or no transaction history. This paper introduces a hybrid learning model that integrates few-shot learning with active learning to address two fundamental issues in fraud detection within financial systems: the limited availability of annotated fraud data and the ongoing evolution of illegal behavior. The study evaluates the approach on two complementary datasets: a public dataset of real-life transactions obtained from Kaggle, used to establish a generalizable benchmark (9,841 transactions, 22.1% fraudulent), and a custom synthetic dataset designed for the PointXchange platform (197,458 transactions, 0.16% fraudulent). The approach generates balanced training sets and minimizes annotation costs by sampling between 8 and 128 examples per class and iteratively querying an oracle for informative labels. Benchmarks conducted across four families of algorithms, namely gradient boosting machines (XGBoost, LightGBM, CatBoost), boosting (AdaBoost), ensemble learners (Random Forest, Extra Trees), and neural networks (MLP, XNet), demonstrate the effectiveness of the proposed approach, with recall scores reaching up to 0.9906.
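
The loop described in the abstract (a balanced few-shot seed set, then repeated oracle queries) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the data are toy 2-D Gaussian points, the classifier is a simple nearest-centroid stand-in rather than any of the benchmarked models, the query strategy is margin-based uncertainty sampling (the abstract does not name one), and the function names, batch size, and number of rounds are hypothetical.

```python
import random

def few_shot_seed(labels, k, rng):
    """Pick k indices per class so the initial training set is balanced."""
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    seed = []
    for y in sorted(by_class):
        seed.extend(rng.sample(by_class[y], k))
    return seed

def fit_centroids(X, y):
    """Stand-in classifier (assumption): one mean vector per class."""
    sums, counts = {}, {}
    for x, c in zip(X, y):
        s = sums.setdefault(c, [0.0] * len(x))
        for j, v in enumerate(x):
            s[j] += v
        counts[c] = counts.get(c, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}

def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def predict(centroids, x):
    """Assign x to the class with the nearest centroid."""
    return min(centroids, key=lambda c: sq_dist(x, centroids[c]))

def margin(centroids, x):
    """Gap between distances to the two centroids; small means uncertain."""
    return abs(sq_dist(x, centroids[0]) - sq_dist(x, centroids[1]))

def active_learning(X, true_labels, k=8, rounds=5, batch=4, seed=0):
    """Few-shot seed, then query the oracle for the most uncertain points."""
    rng = random.Random(seed)
    oracle = lambda i: true_labels[i]  # simulated annotator (assumption)
    known = {i: oracle(i) for i in few_shot_seed(true_labels, k, rng)}
    for _ in range(rounds):
        model = fit_centroids([X[i] for i in known], [known[i] for i in known])
        pool = [i for i in range(len(X)) if i not in known]
        pool.sort(key=lambda i: margin(model, X[i]))
        for i in pool[:batch]:  # label only the most ambiguous examples
            known[i] = oracle(i)
    model = fit_centroids([X[i] for i in known], [known[i] for i in known])
    return model, known

# Toy imbalanced data (assumption): 500 legitimate points, 20 fraudulent.
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(500)]
X += [[data_rng.gauss(3, 1), data_rng.gauss(3, 1)] for _ in range(20)]
y = [0] * 500 + [1] * 20

model, known = active_learning(X, y)
true_pos = sum(1 for x, t in zip(X, y) if t == 1 and predict(model, x) == 1)
recall = true_pos / sum(y)  # share of fraudulent points that were caught
print(f"labels used: {len(known)}, recall on fraud class: {recall:.3f}")
```

With these settings the sketch requests 36 labels in total (16 for the seed set plus 4 per round over 5 rounds) out of 520 points, which is the kind of annotation saving the abstract refers to. Recall is reported because, under heavy class imbalance, it measures how many fraudulent cases are actually caught.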