Great job
What do you think about rebuilding the finetuning script to add extra fields, such as the programming language? What would an ideal data structure look like? I have a similar idea for building a code model, and I think some extra fields could help us.
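
To make the question concrete, here is a rough sketch of the kind of record I have in mind. Everything in it is an assumption on my side: the class name `CodeSample`, the fields `instruction`, `output`, `language`, and `tags`, and the JSONL output are hypothetical and not taken from the existing script.

```python
import json
from dataclasses import dataclass, asdict, field
from typing import List


@dataclass
class CodeSample:
    """One hypothetical training record; all field names are assumptions."""
    instruction: str                 # the prompt or task description
    output: str                      # the target code or answer
    language: str = "unknown"        # extra field: programming language
    tags: List[str] = field(default_factory=list)  # extra field: free-form labels


def to_jsonl_line(sample: CodeSample) -> str:
    """Serialize one record as a single JSON line."""
    return json.dumps(asdict(sample), ensure_ascii=False)


sample = CodeSample(
    instruction="Reverse a string.",
    output="def reverse(s):\n    return s[::-1]",
    language="python",
    tags=["strings"],
)
line = to_jsonl_line(sample)
```

Keeping the extra fields optional with defaults would let existing samples without a language still load, but that is only one possible design.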