Joseph Zhong (josephzhong)

Highlights

open-r1 Public

An RL framework that supports the latest LLMs and VLMs (adds new RL features, new verifiers, and enhanced profiling)

Python
VLM-reward-hacking-detection Public

A framework to train soft tokens and a backbone VLM for detecting reward hacking in target VLMs.

Python
flash-attn-economical-gpu Public

A set of efficient flash attention implementations written in Triton, with performance comparable to or better than PyTorch's cuDNN-backed SDPA on Ampere and Turing GPUs.

Python
recsys-retailrocket Public

A RecSys model training framework for a challenging dataset (finished in 2 days with AI-assisted coding)

Python
cs762_project Public

LLM reward hacking detection

Python
FM Public

An efficient implementation of Factorization Machines that supports third-order (order=3) feature interactions in linear time.

Python
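
For readers wondering how a third-order term can be computed in linear time, here is a minimal sketch of one well-known approach. It is not taken from the FM repository, and the function names `fm_order3_term` and `fm_order3_term_naive` are hypothetical. It assumes the standard factorized form, where the order-3 term sums v[i][f] * v[j][f] * v[k][f] * x[i] * x[j] * x[k] over all i < j < k and all latent dimensions f. With z[i] = v[i][f] * x[i], Newton's identities give that sum as (p1^3 - 3*p1*p2 + 2*p3) / 6, where p1, p2, and p3 are the sums of z, z^2, and z^3, so one pass over the features per latent dimension is enough.

```python
from itertools import combinations


def fm_order3_term(x, V):
    """Third-order interaction term in O(n * k) time.

    x: list of n feature values.
    V: list of n rows, each holding k latent factors (hypothetical layout).
    """
    n, k = len(V), len(V[0])
    total = 0.0
    for f in range(k):
        p1 = p2 = p3 = 0.0  # power sums of z_i = V[i][f] * x[i]
        for i in range(n):
            z = V[i][f] * x[i]
            p1 += z
            p2 += z * z
            p3 += z * z * z
        # Newton's identity for the third elementary symmetric polynomial.
        total += (p1 ** 3 - 3.0 * p1 * p2 + 2.0 * p3) / 6.0
    return total


def fm_order3_term_naive(x, V):
    """Reference O(n^3 * k) version, used only to check the fast one."""
    n, k = len(V), len(V[0])
    total = 0.0
    for a, b, c in combinations(range(n), 3):
        for f in range(k):
            total += V[a][f] * V[b][f] * V[c][f] * x[a] * x[b] * x[c]
    return total


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    V = [[0.1, 0.2], [0.3, -0.1], [0.5, 0.4], [-0.2, 0.6]]
    print(fm_order3_term(x, V), fm_order3_term_naive(x, V))
```

The two functions should agree up to floating-point rounding. The fast version never enumerates triples, which is what keeps the cost linear in the number of features.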