Number of shards for ImageNet dataset?

I downloaded the original ImageNet dataset (MD5 checksums below):
```
1d675b47d978889d74fa0da5fadfb00e ILSVRC2012_img_train.tar
ccaf1013018ac1037801578038d370da ILSVRC2012_img_train_t3.tar
29b22e2961454d5413ddabcf34fc5622 ILSVRC2012_img_val.tar
e1b8681fff3d63731c599df9b4b6fc02 ILSVRC2012_img_test_v10102019.tar
```
After unpacking, I ran `./run makeshards`.
This produced only 147 shards for train and 6 for val. Why is `nshards` set so much higher on this line:
https://github.com/webdataset/webdataset-lightning/blob/7b98a6a4e9e8735973f9de29151e6215380e5c9d/run#L3

and this line:
https://github.com/webdataset/webdataset-lightning/blob/7b98a6a4e9e8735973f9de29151e6215380e5c9d/train.py#L116

Which dataset does the code expect, and what is the intended shard size?
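
To make the mismatch concrete, here is a minimal sketch of how the shard count and a matching brace pattern could be derived from the files actually on disk rather than from a hard-coded `nshards`. The helper name `brace_pattern`, the `<prefix>-<number>.tar` naming scheme, and the directory layout are assumptions for illustration, not taken from the repository.

```python
import re
from pathlib import Path


def brace_pattern(shard_dir, prefix):
    """Count shards named <prefix>-<number>.tar and build a brace pattern.

    Hypothetical helper: the naming scheme is an assumption, not the repo's.
    Returns (count, pattern); pattern is None when no shard matches.
    """
    rx = re.compile(re.escape(prefix) + r"-(\d+)\.tar$")
    # Keep the zero-padded index strings so the pattern preserves the padding.
    indices = [m.group(1) for p in Path(shard_dir).iterdir()
               if (m := rx.match(p.name))]
    if not indices:
        return 0, None
    # Sort numerically so the first and last entries bound the range.
    indices.sort(key=int)
    pattern = f"{prefix}-{{{indices[0]}..{indices[-1]}}}.tar"
    return len(indices), pattern
```

With 147 shards numbered from `000000`, calling `brace_pattern("./shards", "imagenet-train")` would return `(147, "imagenet-train-{000000..000146}.tar")`, which can then be compared against the range the training script expects. The sketch assumes the shard numbering is contiguous; it does not check for gaps.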

