Include the Sophia Optimizer #501

@bratao

This new paper, "Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training", proposes an optimizer that its authors say can improve LLM training by up to 2x.

https://arxiv.org/abs/2305.14342
