Implementation of Shampoo inconsistent with the paper #502

@siddharth9820

inv_precond.copy_(_matrix_power(precond, -1 / order))

Shouldn't the second argument be `-0.5 / order`? For example, with order 2 (a matrix parameter), the paper's authors raise the preconditioner matrices to the power -1/4, whereas `-1 / order` gives -1/2.
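To make the difference concrete, here is a minimal NumPy sketch that compares the two exponents on a single gradient. It is not the library's code: `sym_matrix_power` and `precondition` are hypothetical helpers written for this illustration, the statistics are built from one gradient only (`L = G Gᵀ`, `R = Gᵀ G`), and `eps` is an assumed small damping term. With the exponent `-1 / (2 * order)`, a diagonal gradient is normalized to unit magnitude, much like an AdaGrad-style `g / sqrt(g²)` step; with `-1 / order`, it is scaled down to `1 / g`.

```python
import numpy as np


def sym_matrix_power(mat, power):
    """Raise a symmetric positive-definite matrix to a real power
    using its eigendecomposition (hypothetical helper)."""
    eigvals, eigvecs = np.linalg.eigh(mat)
    return (eigvecs * eigvals ** power) @ eigvecs.T


def precondition(grad, exponent, eps=1e-12):
    """Return L^exponent @ G @ R^exponent, where L = G G^T and R = G^T G
    are built from a single gradient G (an assumption for illustration)."""
    left = grad @ grad.T + eps * np.eye(grad.shape[0])
    right = grad.T @ grad + eps * np.eye(grad.shape[1])
    return sym_matrix_power(left, exponent) @ grad @ sym_matrix_power(right, exponent)


order = 2  # a matrix-shaped parameter
grad = np.diag([4.0, 9.0])

# Exponent suggested in this issue: -1 / (2 * order) = -1/4
step_quarter = precondition(grad, -1 / (2 * order))

# Exponent passed in the line quoted above: -1 / order = -1/2
step_half = precondition(grad, -1 / order)

print(step_quarter)  # approximately the identity matrix
print(step_half)     # approximately diag(1/4, 1/9)
```

If the sketch reflects the intended update, the `-1 / order` version shrinks large gradients far more than the -1/4 rule does, which would change the effective step size rather than just rescale it by a constant.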
