I downloaded the original ImageNet dataset:

```
1d675b47d978889d74fa0da5fadfb00e  ILSVRC2012_img_train.tar
ccaf1013018ac1037801578038d370da  ILSVRC2012_img_train_t3.tar
29b22e2961454d5413ddabcf34fc5622  ILSVRC2012_img_val.tar
e1b8681fff3d63731c599df9b4b6fc02  ILSVRC2012_img_test_v10102019.tar
```

After unpacking, I ran `./run makeshards`. It produced only 147 shards for train and 6 for val. I wonder why `nshards` is set so high in this line:

https://github.com/webdataset/webdataset-lightning/blob/7b98a6a4e9e8735973f9de29151e6215380e5c9d/run#L3

and this line:

https://github.com/webdataset/webdataset-lightning/blob/7b98a6a4e9e8735973f9de29151e6215380e5c9d/train.py#L116

What dataset is it expecting, or what is the correct shard size?
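
For reference, here is a minimal sketch of how the shard count on disk could be checked and turned into a brace-expansion pattern that matches what was actually generated. The directory name `shards`, the prefix `imagenet-train`, the six-digit numbering, and the helper `shard_pattern` are all assumptions for illustration. They are not taken from the repository, so adjust them to whatever `./run makeshards` really writes.

```python
import re
from pathlib import Path


def shard_pattern(shard_dir, prefix):
    """Count shards named <prefix>-NNNNNN.tar and build a brace pattern.

    The naming scheme is an assumption for illustration; it is not
    confirmed against the webdataset-lightning scripts.
    """
    rx = re.compile(re.escape(prefix) + r"-(\d+)\.tar$")
    indices = sorted(
        int(m.group(1))
        for p in Path(shard_dir).iterdir()
        if (m := rx.match(p.name))
    )
    if not indices:
        raise FileNotFoundError(f"no shards matching {prefix}-*.tar in {shard_dir}")
    # Brace notation covering the first through the last shard index found.
    pattern = f"{prefix}-{{{indices[0]:06d}..{indices[-1]:06d}}}.tar"
    return len(indices), pattern


if __name__ == "__main__":
    if Path("shards").is_dir():  # hypothetical output directory
        count, pattern = shard_pattern("shards", "imagenet-train")
        print(count, pattern)
```

If the count printed here is 147, the upper bound in the pattern ends at `000146`, which is the value I would expect the training script's shard range to match.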