I am a third-year Data Science student at the University of Science (HCMUS), passionate about bridging the gap between academic AI research and industry-grade software. I specialize in building complete data pipelines, from web scraping and high-performance ETL to semantic search architectures and business intelligence dashboards.

- 🔭 Currently building: Scalable AI systems integrating LLMs, RAG, and vector databases (FAISS/pgvector).
- 🌱 Deep diving into: Advanced ML architectures (ViT, SLMs, XAI) and high-speed data processing (Polars).
- 💼 Looking for: Internship opportunities in Data Science, AI Engineering, or Data Analytics.

| Project | Description | Tech Stack |
|---|---|---|
| Tiki E-Commerce Analytics | End-to-end data pipeline for analyzing price structures and customer satisfaction. Crawled 10K+ reviews, built a high-performance ETL pipeline, and applied Random Forest and LDA topic modeling to extract business insights. | Polars, PostgreSQL, Selenium, Power BI, Scikit-learn |
| Tourism Together Platform | AI-powered travel planner built on multilingual semantic search. Reduced retrieval latency to ~87 ms by moving vector computations into the database via Supabase RPCs. | FastAPI, pgvector, Supabase, Sentence-Transformers |
| Music Recommendation System | Content-based engine covering 57K+ songs. Generated embeddings with SBERT/Word2Vec and integrated vector search for fast similarity matching. | FAISS, NLP, Streamlit, Hugging Face |

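The similarity matching mentioned in the project table can be illustrated with a minimal sketch. It is not the code from any of the projects: it uses brute-force cosine similarity in plain Python, whereas the projects rely on FAISS and pgvector, and the names `cosine_similarity`, `top_k`, and `TOY_CATALOG` as well as the three-dimensional vectors are hypothetical stand-ins for real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, catalog, k=2):
    """Rank catalog items by cosine similarity to the query, best first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in catalog.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings; real ones would come from a model such as SBERT.
TOY_CATALOG = {
    "song_a": [1.0, 0.0, 0.0],
    "song_b": [0.9, 0.1, 0.0],
    "song_c": [0.0, 1.0, 0.0],
}

if __name__ == "__main__":
    for name, score in top_k([1.0, 0.0, 0.0], TOY_CATALOG):
        print(f"{name}: {score:.3f}")
```

A brute-force scan like this is linear in the catalog size, which is why a dedicated index or an in-database vector operator is typically preferred at larger scales.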

