Thanks for sharing your work. I tried to run the DrQA notebook, which has excellent descriptions, by the way. I spun up an Azure ML Standard_NC6 instance (6 cores, 56 GB RAM, 380 GB disk, 1 x NVIDIA Tesla K80 GPU) to see if I could replicate the results you list after 5 epochs, but I get terrible performance. I suspect that your training used a different set of hyperparameters.
The notebook contains the following:
HIDDEN_DIM = 128
EMB_DIM = 300
NUM_LAYERS = 3
NUM_DIRECTIONS = 2
DROPOUT = 0.3
optimizer = torch.optim.Adamax(model.parameters())
I suspect you might have used a learning rate different from Adamax's default? Hope that you still remember something about the configuration :)
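For reference, PyTorch's Adamax defaults to lr=2e-3, so with the line above that is what the notebook trains with. If you used something else, I'd guess it was passed explicitly like this (the 1e-3 value here is just a hypothetical example, not from the notebook):

```python
import torch

# Stand-in model just to have parameters to optimize
model = torch.nn.Linear(300, 128)

# Adamax with an explicit learning rate instead of the 2e-3 default
# (1e-3 is a hypothetical value for illustration)
optimizer = torch.optim.Adamax(model.parameters(), lr=1e-3)

print(optimizer.defaults["lr"])  # 0.001
```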