
【Hackathon 10th Spring No.13】GDI-NN Model Reproduction #252

Open
megemini wants to merge 14 commits into PaddlePaddle:develop from megemini:gdinn

Conversation


@megemini commented Mar 20, 2026

Since I am not a materials researcher myself, I do not dare to speak with authority on much of the terminology. This PR only attempts to integrate the GDI-NN model into PaddleMaterials. A few notes:

  1. The model_GNN.py and model_MCM.py models from GDI-NN have been integrated into PaddleMaterials.
  2. Not every function in the GDI-NN repo was migrated; only the necessary classes and functions were converted.
  3. Training can be run with python property_prediction/train.py -c property_prediction/configs/gdinn/solvgnn_binary_gamma.yaml (remember to adjust the data path).
  4. A follow-up RFC will describe each file in detail, as well as the state of the model accuracy alignment.
  5. pre-commit: the default line length was changed; the original 88 was too short, and many formulas easily exceeded it.

About the use of AI coding:

The project used AI coding: the model was converted from torch to paddle mostly through interaction with an AI. However, every file went through manual review and many rounds of revision to reach its current state. For reference, see the tmp_gdinn branch in my repo.

REVIEW suggestion:

Given my limited domain expertise, and since this is my first time working with PaddleMaterials, please start by checking whether anything is wrong in the overall direction. Thanks!

Related: https://github.com/PaddlePaddle/community/blob/master/hackathon/hackathon_10th/%E3%80%90Hackathon_10th%E3%80%91%E5%BC%80%E6%BA%90%E8%B4%A1%E7%8C%AE%E4%B8%AA%E4%BA%BA%E6%8C%91%E6%88%98%E8%B5%9B%E6%98%A5%E8%8A%82%E7%89%B9%E5%88%AB%E5%AD%A3%E2%80%94%E4%BB%BB%E5%8A%A1%E5%90%88%E9%9B%86.md#no6---no19-paddlemateirals%E6%A8%A1%E5%9E%8B%E5%A4%8D%E7%8E%B0

#194


paddle-bot Bot commented Mar 20, 2026

Thanks for your contribution!

paddle-bot added the contributor label (External developers) Mar 20, 2026
Collaborator

@leeleolay left a comment


#258 Please refer to the guidelines there and submit a README for the model.

Collaborator


The functionality this file implements should already exist in the toolkit; see the build part under dataset.

Author


done~

@megemini
Author

#258 Please refer to the guidelines there and submit a README for the model.

Update 20260421

The readme has been added.

Here is a separate note on the reproduction; please download the reproduced models and logs from the cloud drive:

Shared file: gdinn_output_20260421.zip
Link: https://pan.baidu.com/s/1T_u7oWU5A3A5qXQO9IWr8Q  Extraction code: m8xq

The following four model config files were tested:

  • solvgnn_binary_gamma.yaml

    Command:

    python property_prediction/train.py \
        -c property_prediction/configs/gdinn/solvgnn_binary_gamma.yaml

    Output directory:

    output/solvgnn_binary_gamma_t_20260421_143837_s_42

    Also ran a separate eval-only experiment with

    Global:
      do_train: false
      do_eval: true

    python property_prediction/train.py \
        -c property_prediction/configs/gdinn/solvgnn_binary_gamma.yaml \
        Trainer.pretrained_model_path=/home/aistudio/PaddleMaterials/output/solvgnn_binary_gamma_t_20260421_143837_s_42/checkpoints/best.pdparams

    Output directory:

    output/solvgnn_binary_gamma_t_20260421_145631_s_42
  • solvgnn_xmlp_binary_gamma.yaml

    Command:

    python property_prediction/train.py \
        -c property_prediction/configs/gdinn/solvgnn_xmlp_binary_gamma.yaml

    Output directory:

    output/solvgnn_xmlp_binary_gamma_t_20260421_153252_s_42
  • gegnn_binary_gamma.yaml

    Command:

    python property_prediction/train.py \
        -c property_prediction/configs/gdinn/gegnn_binary_gamma.yaml

    Output directory:

    output/gegnn_binary_gamma_t_20260421_154206_s_42
  • mcm_multimlp_binary_gamma.yaml

    Command:

    python property_prediction/train.py \
        -c property_prediction/configs/gdinn/mcm_multimlp_binary_gamma.yaml

    Output directory:

    output/mcm_multimlp_binary_gamma_t_20260421_175552_s_42

I pasted part of the solvgnn training log in the readme; the loss decreases steadily, and the same is true for the other models.

Regarding the **dataset** note in the readme — output_binary_with_inf_all.csv (35374 training samples drawn from 280000): the full dataset is fairly large, and training on all of it for all epochs was too slow, so I randomly sampled a subset and reduced the number of training epochs for verification purposes.
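
The subsampling step described above can be sketched roughly as follows; the column names and the synthetic stand-in data are illustrative only (the actual PR samples from output_binary_with_inf_all.csv, whose schema is not shown here):

```python
import numpy as np
import pandas as pd

# Sketch of the subsampling step: draw a fixed-size random subset with a
# fixed seed so the verification run is reproducible. A synthetic frame
# stands in for output_binary_with_inf_all.csv (~280000 rows).
rng = np.random.default_rng(0)
full = pd.DataFrame({
    "x1": rng.uniform(0.0, 1.0, size=280000),  # hypothetical composition column
    "gamma1": rng.normal(size=280000),         # hypothetical target column
})

subset = full.sample(n=35374, random_state=42).reset_index(drop=True)
print(len(subset))
```

With `random_state` fixed, the same 35374-row subset is drawn on every run, so the reduced-data verification is repeatable.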

Additionally, I added two test files in the RFC PaddlePaddle/community#1254:

  • quick_test.py — quick tests for the newly added models and interfaces
  • test_alignment.py — accuracy alignment tests against the original GDI-NN repo

Both pass locally.
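
The actual contents of test_alignment.py live in the RFC; its core check can be sketched as below (the tolerance and array names here are assumptions, not the script's real code):

```python
import numpy as np

def max_abs_diff(ref_out: np.ndarray, port_out: np.ndarray, atol: float = 1e-6) -> float:
    """Compare a reference (torch GDI-NN) output with the Paddle port's
    output; fail if they diverge beyond the tolerance."""
    diff = float(np.max(np.abs(ref_out - port_out)))
    assert diff < atol, f"outputs diverge: max|diff| = {diff:.2e}"
    return diff

# toy stand-ins for the two implementations' outputs on the same input
ref = np.array([0.5321, -1.2077, 0.0143])
port = ref + 5e-7  # within the < 1e-06 range reported in this thread
print(max_abs_diff(ref, port))
```

The real script would feed identical inputs and weights to both models and run this check per output tensor.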

Therefore, given that:

  • training from the yaml configs shows a steadily decreasing loss
  • test_alignment.py shows accuracy alignment with GDI-NN

my preliminary judgment is that the reproduction is in good shape.

Please review, thanks! @leeleolay

@megemini megemini requested a review from leeleolay April 21, 2026 10:36
@leeleolay
Collaborator

For the readme, please follow the format of the other models' readme files and adjust accordingly.

@leeleolay
Collaborator

Please provide a link to the original dataset.

@leeleolay
Collaborator

The provided solvgnn_binary_gamma_t_20260421_145631_s_42 directory contains no checkpoint.

@megemini
Author

The provided solvgnn_binary_gamma_t_20260421_145631_s_42 directory contains no checkpoint.

That run is my separate eval-only test, not a training run; the earlier notes mention it:

Global:
  do_train: false
  do_eval: true

@megemini
Author

For the readme, please follow the format of the other models' readme files and adjust accordingly.

So you mean: put it under the configs directory, write it in English, and include the results and the train command, right?

@megemini
Author

Update 20260421 21:19

The readme has been updated and placed in the config directory. It includes the dataset link: https://git.rwth-aachen.de/avt-svt/public/GDI-NN/-/tree/6383142feb3b926fd279ae676a211fd8b3f1dac3/data

@leeleolay please review ~ :)

@leeleolay
Collaborator

leeleolay commented Apr 22, 2026

Please make the dataset follow the toolkit's existing conventions and support automatic download during training. Do all 4 files at the link need to be downloaded? I need to upload them to the cloud.

@leeleolay
Collaborator

Please provide the accuracy comparison for the reproduction.

@megemini
Author

Do all 4 files at the link need to be downloaded?

As the readme says, only these two are needed: output_binary_with_inf_all.csv and solvent_list.csv

@megemini
Author

Please provide the accuracy comparison for the reproduction.

Below is the accuracy comparison between GDI-NN trained via its own train.py and the earlier PaddleMaterials training logs, covering all four models. Each model's logs are organized into three parts:

  1. the GDI-NN log (including the command used)
  2. GDI-NN's loss after each training epoch
  3. PaddleMaterials' loss after each training epoch (extracted from the logs on the cloud drive above)

The logs below are long, so a few conclusions up front:

  • Only the final train Loss is compared; the sub-losses all have fixed conversion relationships to it, so they are not listed separately.
  • Ignore the eval loss; the eval data was only there to exercise the pipeline.
  • The per-epoch losses are not identical between the two runs; there is some fluctuation, but the trends match.
  • Since torch cannot run on aistudio, GDI-NN was run locally; my local environment is limited, so this is all I could run.
  • Because the two training environments are not identical, results cannot be expected to match exactly even with a fixed seed.
  • The RFC 【Hackathon 10th Spring No.13】GDI-NN Model Reproduction RFC community#1254 includes a test_alignment script; per-model accuracy can be checked there, and the differences are generally < 1e-06.
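
The per-epoch comparison below can be condensed into a quick check; the numbers here are the SolvGNN per-epoch losses taken from the logs in this thread (parsing of the actual log files is omitted):

```python
# Per-epoch train losses from the two implementations (SolvGNN run).
# Exact equality is not expected across frameworks and environments,
# so the check is that the curves stay close and both trend down.
gdinn  = [0.535, 0.381, 0.279, 0.203, 0.154, 0.147, 0.128, 0.098, 0.107, 0.097]
paddle = [0.534919, 0.399445, 0.327364, 0.262740, 0.197765,
          0.159561, 0.124270, 0.106535, 0.092665, 0.078702]

for epoch, (a, b) in enumerate(zip(gdinn, paddle), start=1):
    print(f"Epoch {epoch:2d}: GDI-NN={a:.3f}  Paddle={b:.6f}  |diff|={abs(a - b):.3f}")

# both curves fall to well under half of their epoch-1 value
print(gdinn[-1] / gdinn[0], paddle[-1] / paddle[0])
```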

@leeleolay The detailed logs follow.

Model SolvGNN:

(venv310)  shun@shun-B660M-Pro-RS  ~/workspace/Projects/github/GDI-NN   main ±  python train.py \
    --model_type SolvGNN \
    --batch_size 256 \
    --epochs 10 \
    --hidden_dim 256 \
    --lr 0.001 \
    --pinn_lambda 1.0 \
    --mlp_activation relu \
    --enc_activation relu \
    --seed 42
Namespace(model_type='SolvGNN', batch_size=256, mlp_dropout_rate=0.0, mlp_activation='relu', enc_activation='relu', mlp_num_hid_layers=2, pinn_lambda=1.0, pinn_start_epoch=0, hidden_dim=256, batch_adding=True, lr=0.001, use_lr_scheduler=False, epochs=10, seed=42, data='binaryGamma', data_split_mode='comp_inter', num_splits=5, comp_range=[0.0, 1.0], wandb_logs=False)
/home/shun/venv310/lib/python3.10/site-packages/dgl/heterograph.py:92: DGLWarning: Recommend creating graphs by `dgl.graph(data)` instead of `dgl.DGLGraph(data)`.
  dgl_warning(
dataset size: 35374
solvgnn_binary(
  (conv1): GraphConv(in=74, out=256, normalization=both, activation=None)
  (conv2): GraphConv(in=256, out=256, normalization=both, activation=None)
  (global_conv1): MPNNconv(
    (project_node_feats): Sequential(
      (0): Linear(in_features=257, out_features=256, bias=True)
      (1): ReLU()
    )
    (gnn_layer): NNConv(
      (edge_func): Sequential(
        (0): Linear(in_features=1, out_features=32, bias=True)
        (1): ReLU()
        (2): Linear(in_features=32, out_features=65536, bias=True)
      )
    )
    (gru): GRU(256, 256)
  )
  (classify1): Linear(in_features=256, out_features=256, bias=True)
  (classify2): Linear(in_features=256, out_features=256, bias=True)
  (classify3): Linear(in_features=256, out_features=1, bias=True)
)
Epoch [1][0/138]Time 0.568 (0.568)	Loss 0.652 (0.652)	Loss-Pred 0.652 (0.652)	Loss1 1.301 (1.301)	Loss2 0.003 (0.003)	Loss-GD 0.000003 (0.000003)	
[Stage train]: Epoch 1 finished with loss=0.535 lossPred=0.524 loss1=1.008 loss2=0.040 lossGD=0.010604
[Stage validate]: Epoch 1 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Epoch [2][0/138]Time 0.247 (0.247)	Loss 0.437 (0.437)	Loss-Pred 0.423 (0.423)	Loss1 0.810 (0.810)	Loss2 0.037 (0.037)	Loss-GD 0.013930 (0.013930)	
[Stage train]: Epoch 2 finished with loss=0.381 lossPred=0.359 loss1=0.687 loss2=0.030 lossGD=0.022315
[Stage validate]: Epoch 2 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Epoch [3][0/138]Time 0.244 (0.244)	Loss 0.266 (0.266)	Loss-Pred 0.234 (0.234)	Loss1 0.434 (0.434)	Loss2 0.034 (0.034)	Loss-GD 0.032073 (0.032073)	
[Stage train]: Epoch 3 finished with loss=0.279 lossPred=0.255 loss1=0.487 loss2=0.022 lossGD=0.024307
[Stage validate]: Epoch 3 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Epoch [4][0/138]Time 0.244 (0.244)	Loss 0.188 (0.188)	Loss-Pred 0.158 (0.158)	Loss1 0.307 (0.307)	Loss2 0.009 (0.009)	Loss-GD 0.029788 (0.029788)	
[Stage train]: Epoch 4 finished with loss=0.203 lossPred=0.180 loss1=0.345 loss2=0.015 lossGD=0.023250
[Stage validate]: Epoch 4 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Epoch [5][0/138]Time 0.246 (0.246)	Loss 0.156 (0.156)	Loss-Pred 0.132 (0.132)	Loss1 0.261 (0.261)	Loss2 0.004 (0.004)	Loss-GD 0.023485 (0.023485)	
[Stage train]: Epoch 5 finished with loss=0.154 lossPred=0.135 loss1=0.260 loss2=0.009 lossGD=0.019134
[Stage validate]: Epoch 5 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Epoch [6][0/138]Time 0.245 (0.245)	Loss 0.112 (0.112)	Loss-Pred 0.090 (0.090)	Loss1 0.161 (0.161)	Loss2 0.020 (0.020)	Loss-GD 0.021825 (0.021825)	
[Stage train]: Epoch 6 finished with loss=0.147 lossPred=0.130 loss1=0.251 loss2=0.009 lossGD=0.016817
[Stage validate]: Epoch 6 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Epoch [7][0/138]Time 0.246 (0.246)	Loss 0.167 (0.167)	Loss-Pred 0.158 (0.158)	Loss1 0.313 (0.313)	Loss2 0.003 (0.003)	Loss-GD 0.008930 (0.008930)	
[Stage train]: Epoch 7 finished with loss=0.128 lossPred=0.113 loss1=0.220 loss2=0.007 lossGD=0.014428
[Stage validate]: Epoch 7 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Epoch [8][0/138]Time 0.248 (0.248)	Loss 0.147 (0.147)	Loss-Pred 0.130 (0.130)	Loss1 0.257 (0.257)	Loss2 0.003 (0.003)	Loss-GD 0.017360 (0.017360)	
[Stage train]: Epoch 8 finished with loss=0.098 lossPred=0.085 loss1=0.165 loss2=0.006 lossGD=0.012977
[Stage validate]: Epoch 8 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Epoch [9][0/138]Time 0.245 (0.245)	Loss 0.076 (0.076)	Loss-Pred 0.067 (0.067)	Loss1 0.130 (0.130)	Loss2 0.004 (0.004)	Loss-GD 0.009346 (0.009346)	
[Stage train]: Epoch 9 finished with loss=0.107 lossPred=0.095 loss1=0.184 loss2=0.006 lossGD=0.011871
[Stage validate]: Epoch 9 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Epoch [10][0/138]Time 0.245 (0.245)	Loss 0.062 (0.062)	Loss-Pred 0.050 (0.050)	Loss1 0.096 (0.096)	Loss2 0.004 (0.004)	Loss-GD 0.012476 (0.012476)	
[Stage train]: Epoch 10 finished with loss=0.097 lossPred=0.086 loss1=0.167 loss2=0.005 lossGD=0.011429
[Stage validate]: Epoch 10 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Total training time: 360.28 seconds

---

Per-epoch train loss comparison:

Epoch   GDI-NN   PaddleMaterials
1       0.535    0.534919
2       0.381    0.399445
3       0.279    0.327364
4       0.203    0.262740
5       0.154    0.197765
6       0.147    0.159561
7       0.128    0.124270
8       0.098    0.106535
9       0.107    0.092665
10      0.097    0.078702

Model SolvGNNxMLP:

(venv310)  shun@shun-B660M-Pro-RS  ~/workspace/Projects/github/GDI-NN   main ±  python train.py \
    --model_type SolvGNNxMLP \
    --batch_size 256 \
    --epochs 10 \
    --hidden_dim 256 \
    --lr 0.001 \
    --pinn_lambda 1.0 \
    --mlp_activation relu \
    --enc_activation relu \
    --seed 42
Namespace(model_type='SolvGNNxMLP', batch_size=256, mlp_dropout_rate=0.0, mlp_activation='relu', enc_activation='relu', mlp_num_hid_layers=2, pinn_lambda=1.0, pinn_start_epoch=0, hidden_dim=256, batch_adding=True, lr=0.001, use_lr_scheduler=False, epochs=10, seed=42, data='binaryGamma', data_split_mode='comp_inter', num_splits=5, comp_range=[0.0, 1.0], wandb_logs=False)
/home/shun/venv310/lib/python3.10/site-packages/dgl/heterograph.py:92: DGLWarning: Recommend creating graphs by `dgl.graph(data)` instead of `dgl.DGLGraph(data)`.
  dgl_warning(
dataset size: 35374
solvgnn_xMLP_binary(
  (conv1): GraphConv(in=74, out=256, normalization=both, activation=None)
  (conv2): GraphConv(in=256, out=256, normalization=both, activation=None)
  (global_conv1): MPNNconv(
    (project_node_feats): Sequential(
      (0): Linear(in_features=256, out_features=256, bias=True)
      (1): ReLU()
    )
    (gnn_layer): NNConv(
      (edge_func): Sequential(
        (0): Linear(in_features=1, out_features=32, bias=True)
        (1): ReLU()
        (2): Linear(in_features=32, out_features=65536, bias=True)
      )
    )
    (gru): GRU(256, 256)
  )
  (mlp_dropout): Dropout(p=0.0, inplace=False)
  (classify1): Linear(in_features=257, out_features=256, bias=True)
  (classify2): Linear(in_features=256, out_features=256, bias=True)
  (classify3): Linear(in_features=256, out_features=1, bias=True)
)
Epoch [1][0/138]Time 0.321 (0.321)	Loss 0.650 (0.650)	Loss-Pred 0.650 (0.650)	Loss1 1.297 (1.297)	Loss2 0.003 (0.003)	Loss-GD 0.000015 (0.000015)	
[Stage train]: Epoch 1 finished with loss=0.552 lossPred=0.543 loss1=1.022 loss2=0.064 lossGD=0.008773
[Stage validate]: Epoch 1 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Epoch [2][0/138]Time 0.086 (0.086)	Loss 0.443 (0.443)	Loss-Pred 0.430 (0.430)	Loss1 0.767 (0.767)	Loss2 0.093 (0.093)	Loss-GD 0.012569 (0.012569)	
[Stage train]: Epoch 2 finished with loss=0.424 lossPred=0.406 loss1=0.708 loss2=0.105 lossGD=0.017442
[Stage validate]: Epoch 2 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Epoch [3][0/138]Time 0.087 (0.087)	Loss 0.385 (0.385)	Loss-Pred 0.361 (0.361)	Loss1 0.548 (0.548)	Loss2 0.175 (0.175)	Loss-GD 0.023958 (0.023958)	
[Stage train]: Epoch 3 finished with loss=0.359 lossPred=0.337 loss1=0.542 loss2=0.131 lossGD=0.022313
[Stage validate]: Epoch 3 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Epoch [4][0/138]Time 0.087 (0.087)	Loss 0.257 (0.257)	Loss-Pred 0.233 (0.233)	Loss1 0.369 (0.369)	Loss2 0.098 (0.098)	Loss-GD 0.023411 (0.023411)	
[Stage train]: Epoch 4 finished with loss=0.313 lossPred=0.288 loss1=0.444 loss2=0.133 lossGD=0.024586
[Stage validate]: Epoch 4 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Epoch [5][0/138]Time 0.087 (0.087)	Loss 0.222 (0.222)	Loss-Pred 0.199 (0.199)	Loss1 0.268 (0.268)	Loss2 0.131 (0.131)	Loss-GD 0.022940 (0.022940)	
[Stage train]: Epoch 5 finished with loss=0.284 lossPred=0.260 loss1=0.381 loss2=0.139 lossGD=0.023567
[Stage validate]: Epoch 5 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Epoch [6][0/138]Time 0.087 (0.087)	Loss 0.237 (0.237)	Loss-Pred 0.212 (0.212)	Loss1 0.287 (0.287)	Loss2 0.138 (0.138)	Loss-GD 0.024766 (0.024766)	
[Stage train]: Epoch 6 finished with loss=0.273 lossPred=0.250 loss1=0.364 loss2=0.137 lossGD=0.022893
[Stage validate]: Epoch 6 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Epoch [7][0/138]Time 0.088 (0.088)	Loss 0.254 (0.254)	Loss-Pred 0.231 (0.231)	Loss1 0.387 (0.387)	Loss2 0.076 (0.076)	Loss-GD 0.022668 (0.022668)	
[Stage train]: Epoch 7 finished with loss=0.265 lossPred=0.243 loss1=0.354 loss2=0.132 lossGD=0.021838
[Stage validate]: Epoch 7 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Epoch [8][0/138]Time 0.087 (0.087)	Loss 0.403 (0.403)	Loss-Pred 0.378 (0.378)	Loss1 0.555 (0.555)	Loss2 0.200 (0.200)	Loss-GD 0.025458 (0.025458)	
[Stage train]: Epoch 8 finished with loss=0.252 lossPred=0.231 loss1=0.333 loss2=0.128 lossGD=0.021537
[Stage validate]: Epoch 8 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Epoch [9][0/138]Time 0.087 (0.087)	Loss 0.286 (0.286)	Loss-Pred 0.258 (0.258)	Loss1 0.317 (0.317)	Loss2 0.198 (0.198)	Loss-GD 0.028066 (0.028066)	
[Stage train]: Epoch 9 finished with loss=0.237 lossPred=0.216 loss1=0.308 loss2=0.125 lossGD=0.020499
[Stage validate]: Epoch 9 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Epoch [10][0/138]Time 0.087 (0.087)	Loss 0.190 (0.190)	Loss-Pred 0.170 (0.170)	Loss1 0.234 (0.234)	Loss2 0.106 (0.106)	Loss-GD 0.020324 (0.020324)	
[Stage train]: Epoch 10 finished with loss=0.224 lossPred=0.205 loss1=0.286 loss2=0.123 lossGD=0.019550
[Stage validate]: Epoch 10 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Total training time: 143.25 seconds
Training completed successfully!

---

Per-epoch train loss comparison:

Epoch   GDI-NN   PaddleMaterials
1       0.552    0.559123
2       0.424    0.456567
3       0.359    0.398103
4       0.313    0.342948
5       0.284    0.301473
6       0.273    0.286764
7       0.265    0.264959
8       0.252    0.249806
9       0.237    0.232608
10      0.224    0.222790

Model GEGNN:

(venv310)  shun@shun-B660M-Pro-RS  ~/workspace/Projects/github/GDI-NN   main ±  python train.py \
    --model_type GEGNN \
    --batch_size 256 \
    --epochs 10 \
    --hidden_dim 256 \
    --lr 0.001 \
    --pinn_lambda 1.0 \
    --mlp_activation relu \
    --enc_activation relu \
    --seed 42
Namespace(model_type='GEGNN', batch_size=256, mlp_dropout_rate=0.0, mlp_activation='relu', enc_activation='relu', mlp_num_hid_layers=2, pinn_lambda=1.0, pinn_start_epoch=0, hidden_dim=256, batch_adding=True, lr=0.001, use_lr_scheduler=False, epochs=10, seed=42, data='binaryGamma', data_split_mode='comp_inter', num_splits=5, comp_range=[0.0, 1.0], wandb_logs=False)
/home/shun/venv310/lib/python3.10/site-packages/dgl/heterograph.py:92: DGLWarning: Recommend creating graphs by `dgl.graph(data)` instead of `dgl.DGLGraph(data)`.
  dgl_warning(
dataset size: 35374
gegnn_binary(
  (conv1): GraphConv(in=74, out=256, normalization=both, activation=None)
  (conv2): GraphConv(in=256, out=256, normalization=both, activation=None)
  (global_conv1): MPNNconv(
    (project_node_feats): Sequential(
      (0): Linear(in_features=256, out_features=256, bias=True)
      (1): ReLU()
    )
    (gnn_layer): NNConv(
      (edge_func): Sequential(
        (0): Linear(in_features=1, out_features=32, bias=True)
        (1): ReLU()
        (2): Linear(in_features=32, out_features=65536, bias=True)
      )
    )
    (gru): GRU(256, 256)
  )
  (mfp_trans): Linear(in_features=257, out_features=257, bias=True)
  (classify1): Linear(in_features=257, out_features=256, bias=True)
  (classify2): Linear(in_features=256, out_features=256, bias=True)
  (classify3): Linear(in_features=256, out_features=1, bias=True)
)
Epoch [1][0/138]Time 0.320 (0.320)	Loss 0.467 (0.467)	Loss-Pred 0.467 (0.467)	Loss1 0.933 (0.933)	Loss2 0.002 (0.002)	Loss-GD 0.000000 (0.000000)	
[Stage train]: Epoch 1 finished with loss=0.572 lossPred=0.572 loss1=1.062 loss2=0.083 lossGD=0.000000
[Stage validate]: Epoch 1 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Epoch [2][0/138]Time 0.090 (0.090)	Loss 0.461 (0.461)	Loss-Pred 0.461 (0.461)	Loss1 0.755 (0.755)	Loss2 0.167 (0.167)	Loss-GD 0.000000 (0.000000)	
[Stage train]: Epoch 2 finished with loss=0.444 lossPred=0.444 loss1=0.791 loss2=0.097 lossGD=0.000000
[Stage validate]: Epoch 2 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Epoch [3][0/138]Time 0.089 (0.089)	Loss 0.313 (0.313)	Loss-Pred 0.313 (0.313)	Loss1 0.580 (0.580)	Loss2 0.047 (0.047)	Loss-GD 0.000000 (0.000000)	
[Stage train]: Epoch 3 finished with loss=0.338 lossPred=0.338 loss1=0.633 loss2=0.042 lossGD=0.000000
[Stage validate]: Epoch 3 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Epoch [4][0/138]Time 0.089 (0.089)	Loss 0.198 (0.198)	Loss-Pred 0.198 (0.198)	Loss1 0.388 (0.388)	Loss2 0.007 (0.007)	Loss-GD 0.000000 (0.000000)	
[Stage train]: Epoch 4 finished with loss=0.267 lossPred=0.267 loss1=0.517 loss2=0.017 lossGD=0.000000
[Stage validate]: Epoch 4 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Epoch [5][0/138]Time 0.089 (0.089)	Loss 0.253 (0.253)	Loss-Pred 0.253 (0.253)	Loss1 0.500 (0.500)	Loss2 0.005 (0.005)	Loss-GD 0.000000 (0.000000)	
[Stage train]: Epoch 5 finished with loss=0.264 lossPred=0.264 loss1=0.506 loss2=0.021 lossGD=0.000000
[Stage validate]: Epoch 5 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Epoch [6][0/138]Time 0.089 (0.089)	Loss 0.336 (0.336)	Loss-Pred 0.336 (0.336)	Loss1 0.658 (0.658)	Loss2 0.014 (0.014)	Loss-GD 0.000000 (0.000000)	
[Stage train]: Epoch 6 finished with loss=0.214 lossPred=0.214 loss1=0.405 loss2=0.024 lossGD=0.000000
[Stage validate]: Epoch 6 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Epoch [7][0/138]Time 0.089 (0.089)	Loss 0.153 (0.153)	Loss-Pred 0.153 (0.153)	Loss1 0.294 (0.294)	Loss2 0.013 (0.013)	Loss-GD 0.000000 (0.000000)	
[Stage train]: Epoch 7 finished with loss=0.400 lossPred=0.400 loss1=0.789 loss2=0.012 lossGD=0.000000
[Stage validate]: Epoch 7 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Epoch [8][0/138]Time 0.089 (0.089)	Loss 0.227 (0.227)	Loss-Pred 0.227 (0.227)	Loss1 0.439 (0.439)	Loss2 0.015 (0.015)	Loss-GD 0.000000 (0.000000)	
[Stage train]: Epoch 8 finished with loss=0.368 lossPred=0.368 loss1=0.702 loss2=0.035 lossGD=0.000000
[Stage validate]: Epoch 8 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Epoch [9][0/138]Time 0.090 (0.090)	Loss 0.387 (0.387)	Loss-Pred 0.387 (0.387)	Loss1 0.673 (0.673)	Loss2 0.100 (0.100)	Loss-GD 0.000000 (0.000000)	
[Stage train]: Epoch 9 finished with loss=0.402 lossPred=0.402 loss1=0.718 loss2=0.087 lossGD=0.000000
[Stage validate]: Epoch 9 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Epoch [10][0/138]Time 0.089 (0.089)	Loss 0.301 (0.301)	Loss-Pred 0.301 (0.301)	Loss1 0.562 (0.562)	Loss2 0.039 (0.039)	Loss-GD 0.000000 (0.000000)	
[Stage train]: Epoch 10 finished with loss=0.365 lossPred=0.365 loss1=0.629 loss2=0.102 lossGD=0.000000
[Stage validate]: Epoch 10 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Total training time: 146.45 seconds
Training completed successfully!

---

Per-epoch train loss comparison:

Epoch   GDI-NN   PaddleMaterials
1       0.572    0.553056
2       0.444    0.467343
3       0.338    0.422078
4       0.267    0.449988
5       0.264    0.423352
6       0.214    0.393276
7       0.400    0.380240
8       0.368    0.394283
9       0.402    0.329073
10      0.365    0.336712

Model MCM_multiMLP:

(venv310)  shun@shun-B660M-Pro-RS  ~/workspace/Projects/github/GDI-NN   main ±  python train.py \
    --model_type MCM_multiMLP \
    --batch_size 256 \
    --epochs 10 \
    --hidden_dim 256 \
    --lr 0.001 \
    --pinn_lambda 1.0 \
    --mlp_activation relu \
    --enc_activation relu \
    --seed 42
Namespace(model_type='MCM_multiMLP', batch_size=256, mlp_dropout_rate=0.0, mlp_activation='relu', enc_activation='relu', mlp_num_hid_layers=2, pinn_lambda=1.0, pinn_start_epoch=0, hidden_dim=256, batch_adding=True, lr=0.001, use_lr_scheduler=False, epochs=10, seed=42, data='binaryGamma', data_split_mode='comp_inter', num_splits=5, comp_range=[0.0, 1.0], wandb_logs=False)
/home/shun/venv310/lib/python3.10/site-packages/dgl/heterograph.py:92: DGLWarning: Recommend creating graphs by `dgl.graph(data)` instead of `dgl.DGLGraph(data)`.
  dgl_warning(
dataset size: 35374
MCM_multiMLP(
  (solvent_emb): ModuleList(
    (0): Sequential(
      (0): Embedding(845, 256)
      (1): ReLU()
      (2): Dropout(p=0.05, inplace=False)
      (3): Linear(in_features=256, out_features=256, bias=True)
      (4): ReLU()
      (5): Dropout(p=0.05, inplace=False)
      (6): Linear(in_features=256, out_features=256, bias=True)
      (7): ReLU()
      (8): Dropout(p=0.05, inplace=False)
      (9): Linear(in_features=256, out_features=256, bias=True)
      (10): ReLU()
    )
  )
  (layers_end): ModuleList(
    (0-1): 2 x Sequential(
      (0): Linear(in_features=514, out_features=512, bias=True)
      (1): ReLU()
      (2): Linear(in_features=512, out_features=512, bias=True)
      (3): ReLU()
      (4): Linear(in_features=512, out_features=1, bias=True)
    )
  )
)
Epoch [1][0/138]Time 0.299 (0.299)	Loss 0.976 (0.976)	Loss-Pred 0.976 (0.976)	Loss1 1.948 (1.948)	Loss2 0.005 (0.005)	Loss-GD 0.000021 (0.000021)	
[Stage train]: Epoch 1 finished with loss=0.487 lossPred=0.487 loss1=0.973 loss2=0.001 lossGD=0.000041
[Stage validate]: Epoch 1 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Epoch [2][0/138]Time 0.009 (0.009)	Loss 0.254 (0.254)	Loss-Pred 0.254 (0.254)	Loss1 0.508 (0.508)	Loss2 0.000 (0.000)	Loss-GD 0.000009 (0.000009)	
[Stage train]: Epoch 2 finished with loss=0.385 lossPred=0.385 loss1=0.770 loss2=0.001 lossGD=0.000013
[Stage validate]: Epoch 2 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Epoch [3][0/138]Time 0.010 (0.010)	Loss 0.337 (0.337)	Loss-Pred 0.337 (0.337)	Loss1 0.674 (0.674)	Loss2 0.000 (0.000)	Loss-GD 0.000013 (0.000013)	
[Stage train]: Epoch 3 finished with loss=0.371 lossPred=0.371 loss1=0.742 loss2=0.001 lossGD=0.000006
[Stage validate]: Epoch 3 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Epoch [4][0/138]Time 0.009 (0.009)	Loss 0.287 (0.287)	Loss-Pred 0.287 (0.287)	Loss1 0.574 (0.574)	Loss2 0.000 (0.000)	Loss-GD 0.000004 (0.000004)	
[Stage train]: Epoch 4 finished with loss=0.365 lossPred=0.365 loss1=0.729 loss2=0.001 lossGD=0.000004
[Stage validate]: Epoch 4 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Epoch [5][0/138]Time 0.009 (0.009)	Loss 0.424 (0.424)	Loss-Pred 0.424 (0.424)	Loss1 0.848 (0.848)	Loss2 0.000 (0.000)	Loss-GD 0.000003 (0.000003)	
[Stage train]: Epoch 5 finished with loss=0.342 lossPred=0.342 loss1=0.684 loss2=0.000 lossGD=0.000003
[Stage validate]: Epoch 5 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Epoch [6][0/138]Time 0.010 (0.010)	Loss 0.230 (0.230)	Loss-Pred 0.230 (0.230)	Loss1 0.460 (0.460)	Loss2 0.000 (0.000)	Loss-GD 0.000002 (0.000002)	
[Stage train]: Epoch 6 finished with loss=0.281 lossPred=0.281 loss1=0.561 loss2=0.000 lossGD=0.000003
[Stage validate]: Epoch 6 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Epoch [7][0/138]Time 0.009 (0.009)	Loss 0.189 (0.189)	Loss-Pred 0.189 (0.189)	Loss1 0.378 (0.378)	Loss2 0.000 (0.000)	Loss-GD 0.000003 (0.000003)	
[Stage train]: Epoch 7 finished with loss=0.213 lossPred=0.213 loss1=0.425 loss2=0.000 lossGD=0.000004
[Stage validate]: Epoch 7 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Epoch [8][0/138]Time 0.009 (0.009)	Loss 0.206 (0.206)	Loss-Pred 0.206 (0.206)	Loss1 0.412 (0.412)	Loss2 0.000 (0.000)	Loss-GD 0.000003 (0.000003)	
[Stage train]: Epoch 8 finished with loss=0.183 lossPred=0.183 loss1=0.365 loss2=0.000 lossGD=0.000004
[Stage validate]: Epoch 8 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Epoch [9][0/138]Time 0.009 (0.009)	Loss 0.178 (0.178)	Loss-Pred 0.178 (0.178)	Loss1 0.355 (0.355)	Loss2 0.000 (0.000)	Loss-GD 0.000005 (0.000005)	
[Stage train]: Epoch 9 finished with loss=0.162 lossPred=0.162 loss1=0.324 loss2=0.000 lossGD=0.000003
[Stage validate]: Epoch 9 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Epoch [10][0/138]Time 0.009 (0.009)	Loss 0.159 (0.159)	Loss-Pred 0.159 (0.159)	Loss1 0.318 (0.318)	Loss2 0.000 (0.000)	Loss-GD 0.000003 (0.000003)	
[Stage train]: Epoch 10 finished with loss=0.145 lossPred=0.145 loss1=0.290 loss2=0.000 lossGD=0.000003
[Stage validate]: Epoch 10 finished with loss=0.000 lossPred=0.000 loss1=0.000 loss2=0.000 lossGD=0.000000
Total training time: 30.69 seconds
Training completed successfully!

---

Per-epoch train loss comparison:

Epoch   GDI-NN   PaddleMaterials
1       0.487    0.492260
2       0.385    0.383629
3       0.371    0.367106
4       0.365    0.362780
5       0.342    0.357738
6       0.281    0.351863
7       0.213    0.343914
8       0.183    0.323481
9       0.162    0.245997
10      0.145    0.175376

@leeleolay
Collaborator

ppmat follows a single-model-file policy: the model code should be merged into as few files as possible; only when models differ substantially should they be split into separate files.

@megemini
Author

ppmat follows a single-model-file policy: the model code should be merged into as few files as possible; only when models differ substantially should they be split into separate files.

So you mean: put ppmat/models/gdinn/gnn.py and ppmat/models/gdinn/mcm.py into the same file?

@megemini
Author

megemini commented Apr 24, 2026

Update 20260424

The mcm models have been moved into gdinn, so everything is now managed in a single model file. Re-tested locally; the tests pass, and the existing commands still work ~

@leeleolay does this look OK?

Also, once you have uploaded the dataset on your side, I will update the dataset part ~
