jayzen33

Hi there 👋 I am jayzen33

🔭 I’m currently working on speech synthesis (TTS), focusing on controllable emotional TTS, audio-visual joint generation, and MIDI-based singing voice synthesis.
🌱 I’m currently learning about multi-modal large language models, reinforcement learning for generation tasks (DPO/GRPO), and advanced speech tokenization techniques, and exploring how AI agents can be integrated into daily life.
👯 I’m looking to collaborate on projects related to speech/audio generation, multi-modal AI, and creative applications of AIGC.
🤔 I’m looking for help with making data cleaning pipelines more efficient, scaling up model training, and exploring novel evaluation metrics for generative models.
💬 Ask me about speech synthesis, controllable TTS, audio-visual generation, AI agents, or anything related to AI and technology.
📫 How to reach me: jayzen33@outlook.com
😄 Pronouns: He/Him