This repository is an extended implementation of GraphPart, adapted specifically for Graph Anomaly Detection (GAD) tasks using Active Learning.
We integrate the partition-based active learning strategy into transductive GAD scenarios, addressing key challenges such as extreme class imbalance and cold-start labeling. This framework supports comprehensive benchmarking across multiple GAD datasets using standard Graph Neural Networks (GCN, GraphSAGE, GAT).
Original Paper: Partition-Based Active Learning for Graph Neural Networks (TMLR 2023)
Compared to the original implementation (focused on node classification), this repository introduces the following adaptations for Anomaly Detection:
-
GAD Benchmarks Integration: Full support for 6 standard GAD datasets:
Weibo,Reddit,Books,Enron,Disney, andInj_Cora. - Imbalance-Aware Evaluation: Switched evaluation metrics from Accuracy/Macro-F1 to ROC-AUC and Average Precision (AP) to correctly assess performance on highly imbalanced data.
-
Algorithmic Robustness:
- Implemented "Crash Protection" for K-Means clustering when partition sizes are smaller than the query budget.
- Unified label formatting (Binary: 0 for Normal, 1 for Anomaly).
-
Automated Experiment Pipeline:
-
10-Seed Stability: Automated execution over 10 random seeds with mean
$\pm$ std reporting. - Resume Capability: Smart checkpointing to skip completed experimental configurations.
-
Visualization Tools: Includes scripts for generating Latex tables (
latex.py) and plotting learning curves (plot.py).
-
10-Seed Stability: Automated execution over 10 random seeds with mean
Note: This codebase is designed for Linux systems. Due to the discontinuation of Windows support for recent versions of DGL (Deep Graph Library), we strongly recommend running this framework on Ubuntu or WSL2.
Clone the repository and install the dependencies:
git clone [https://github.com/wajason/GAD-GraphPart-Active.git](https://github.com/wajason/GAD-GraphPart-Active.git)
cd GAD-GraphPart-Active
# It is recommended to create a conda environment
conda create -n graphpart python=3.9
conda activate graphpart
# Install PyTorch and DGL (Linux) matching your CUDA version
# Example:
pip install torch torchvision torchaudio
pip install dgl -f [https://data.dgl.ai/wheels/cu118/repo.html](https://data.dgl.ai/wheels/cu118/repo.html)
pip install torch-geometric ogb scikit-learn matplotlib networkxYou can run the full benchmark (all datasets, models, and budgets) using the main script. The script automatically handles data partitioning and result logging.
python main.pyAfter the experiments are finished, you can generate reports using the included tools:
- Generate LaTeX Tables: Prints IEEE/ACM standard tables for your paper.
python latex.py
- Plot Learning Curves: Generates
.pngfigures for AUC trends across budgets.python plot.py
We evaluate our framework on benchmark datasets. The following learning curves demonstrate that GraphPart (Ours) consistently outperforms baselines, especially in low-budget scenarios.
If you use the core GraphPart algorithm in your research, please cite the original authors:
@article{ma2022partition,
title={Partition-based active learning for graph neural networks},
author={Ma, Jiaqi and Ma, Ziqiao and Chai, Joyce and Mei, Qiaozhu},
journal={arXiv preprint arXiv:2201.09391},
year={2022}
}If you find this GAD adaptation useful, please star this repository!! Thanks!



