Hi! This benchmark is very meaningful! However, I am a bit confused by the secondary structure datasets. The datasets provided here are different from those on HuggingFace. Furthermore, the number of train/test samples (in the '.csv' train/test files) differs from the description in the paper (Table A10). Could you please clarify the SSP datasets in more detail? It would also help if you could give some guidance on how to reproduce the results presented in the paper, because I do not know which file should be used to reproduce the prediction score of RNA-FM. Thanks so much!!