Hyperparameters for MDEQ-XL on ImageNet  #32

@cmohri

Hi,

I've been trying to reproduce the results reported in the paper and noticed that Table 4 in Appendix A does not include the hyperparameters used for training MDEQ-XL on ImageNet. In particular, I'm curious about the following:

  • In general, is the stop mode "rel" or "abs"?
  • What epsilon is used as the convergence threshold in the Broyden solver? Should I assume it was the default value of 1e-3?
  • What were the forward and backward quasi-Newton thresholds $T_f, T_b$?
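
For context, here is a minimal sketch of how I understand the two stop modes. It is my own illustration and not code from the repository: the function names, the `1e-9` guard in the denominator, and the choice of normalizing by the norm of f(z) are all assumptions on my part.

```python
import math


def residual_norms(z, fz, guard=1e-9):
    """Return (absolute, relative) residuals of g(z) = f(z) - z.

    Assumption: the relative residual is normalized by ||f(z)||,
    with a small guard to avoid division by zero.
    """
    abs_res = math.sqrt(sum((b - a) ** 2 for a, b in zip(z, fz)))
    scale = math.sqrt(sum(b * b for b in fz))
    return abs_res, abs_res / (scale + guard)


def has_converged(z, fz, eps=1e-3, stop_mode="rel"):
    """Check the stopping criterion for the chosen (hypothetical) stop mode."""
    abs_res, rel_res = residual_norms(z, fz)
    if stop_mode == "abs":
        return abs_res < eps
    if stop_mode == "rel":
        return rel_res < eps
    raise ValueError(f"unknown stop_mode: {stop_mode!r}")


# An iterate with large activations and a small update.
z = [100.0, 0.0]
fz = [100.05, 0.0]
print(has_converged(z, fz, stop_mode="rel"))
print(has_converged(z, fz, stop_mode="abs"))
```

Under these assumptions, the same iterate can satisfy the relative criterion while failing the absolute one at the same epsilon, so the two modes are not interchangeable and the choice affects how many solver iterations are run.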

Thanks so much!
