josephzhong

Pinned

  1. open-r1 Public

    An RL framework that supports the latest LLMs and VLMs (adds new RL features, new verifiers, and enhanced profiling)

    Python

  2. VLM-reward-hacking-detection Public

    A framework to train soft tokens and a backbone VLM for detecting reward hacking in target VLMs.

    Python

  3. flash-attn-economical-gpu Public

    A set of efficient Triton-based flash attention implementations that match or outperform PyTorch's cuDNN-backed SDPA on Ampere and Turing GPUs.

    Python

  4. recsys-retailrocket Public

    A RecSys model training framework on a challenging dataset (finished in 2 days with AI-assisted coding)

    Python

  5. cs762_project Public

    LLM reward hacking detection

    Python

  6. FM Public

    An efficient implementation of Factorization Machines that supports order=3 features in linear time.

    Python
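
The linear-time claim for order-3 Factorization Machines in the last entry can be illustrated with a short sketch. This is not taken from the FM repository; it assumes the standard FM formulation, where the order-3 term sums products of three distinct scaled features per latent factor, and it uses Newton's identities to replace the triple loop with three power sums. The function names `fm_order3_naive` and `fm_order3_linear` are hypothetical.

```python
from itertools import combinations


def fm_order3_naive(x, V):
    """Reference O(n^3 * k) version: sum over all i < j < l and factors f
    of (V[i][f]*x[i]) * (V[j][f]*x[j]) * (V[l][f]*x[l])."""
    n, k = len(x), len(V[0])
    total = 0.0
    for i, j, l in combinations(range(n), 3):
        for f in range(k):
            total += V[i][f] * x[i] * V[j][f] * x[j] * V[l][f] * x[l]
    return total


def fm_order3_linear(x, V):
    """O(n * k) version (illustrative, assumed formulation).

    For a_i = V[i][f] * x[i], the third elementary symmetric polynomial is
        e3 = (p1^3 - 3*p1*p2 + 2*p3) / 6,
    where p_m is the sum of a_i^m over all features i.
    """
    n, k = len(x), len(V[0])
    total = 0.0
    for f in range(k):
        p1 = p2 = p3 = 0.0
        for i in range(n):
            a = V[i][f] * x[i]
            p1 += a
            p2 += a * a
            p3 += a * a * a
        total += (p1 ** 3 - 3.0 * p1 * p2 + 2.0 * p3) / 6.0
    return total


if __name__ == "__main__":
    x = [1.0, 2.0, 0.0, -1.0]
    V = [[0.5, 1.0], [-1.0, 0.25], [2.0, 3.0], [1.5, -0.5]]
    print(fm_order3_naive(x, V), fm_order3_linear(x, V))
```

Both functions should agree up to floating-point error; the second touches each feature once per latent factor, which is where the linear cost comes from.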