https://github.com/jettify/pytorch-optimizer/blob/910b414565427f0a66e20040475e7e4385e066a5/torch_optimizer/shampoo.py#L130

Shouldn't the second argument be `-0.5/order`? For example, with order 2 (a matrix), the authors raise each preconditioner matrix to the -1/4 power, i.e. `-1/(2*order)`.
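To illustrate why the factor of 0.5 matters, here is a minimal sanity check in plain Python. It is only a sketch, not the library's code: `shampoo_exponent` and `scalar_update` are hypothetical helpers, and the tensor is collapsed to a single scalar `g`, so that after one step every preconditioner is just `g**2`. Under that assumption, an exponent of `-0.5/order` per preconditioner should give an update of unit magnitude (as in Adagrad), while `-1/order` would over-scale it to `1/g`.

```python
def shampoo_exponent(order: int) -> float:
    """Exponent applied to each preconditioner for a tensor of the given order.

    Hypothetical helper: -1/(2*order), i.e. -1/4 for a matrix (order 2).
    """
    return -0.5 / order


def scalar_update(g: float, exponent: float, order: int = 2) -> float:
    """Preconditioned update for a scalar stand-in of an order-`order` tensor.

    Assumption: after one step each of the `order` preconditioners equals g**2
    (the scalar analogue of G @ G.T and G.T @ G), with no epsilon term.
    """
    precond = g * g
    update = g
    for _ in range(order):
        update *= precond ** exponent
    return update


g = 4.0
proposed = scalar_update(g, shampoo_exponent(2))  # each factor is (g**2) ** -0.25
alternative = scalar_update(g, -1 / 2)            # each factor is (g**2) ** -0.5
print(proposed, alternative)
```

With the proposed exponent the two factors together cancel `|g|` exactly once, whereas `-1/order` cancels it twice.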