- 🔭 I’m currently working on speech synthesis (TTS), focusing on controllable emotional TTS, audio-visual joint generation, and MIDI-based singing voice synthesis.
- 🌱 I’m currently learning about multi-modal large language models, reinforcement learning for generation tasks (DPO/GRPO), and advanced speech tokenization techniques, and exploring how AI agents can be integrated into daily life.
- 👯 I’m looking to collaborate on projects related to speech/audio generation, multi-modal AI, and creative applications of AIGC.
- 🤔 I’m looking for help with building efficient data-cleaning pipelines, scaling up model training, and exploring novel evaluation metrics for generative models.
- 💬 Ask me about speech synthesis, controllable TTS, audio-visual generation, AI agents, or anything related to AI and technology.
- 📫 How to reach me: jayzen33@outlook.com
- 😄 Pronouns: He/Him
Popular repositories
- minimind (public, forked from jingyaogong/minimind): 🚀🚀 Train a 26M-parameter GPT from scratch in just 2 hours! Language: Python.
- llama3-from-scratch (public, forked from naklecha/llama3-from-scratch): llama3 implementation one matrix multiplication at a time. Language: Jupyter Notebook.
- minbpe (public, forked from karpathy/minbpe): Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization. Language: Python.
- InternVL (public, forked from OpenGVLab/InternVL): [CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance. Language: Python.
- vllm (public, forked from vllm-project/vllm): A high-throughput and memory-efficient inference and serving engine for LLMs. Language: Python.
- syncnet_python (public, forked from joonson/syncnet_python): Out of time: automated lip sync in the wild. Language: Python.

