My name is Min-Hung (Steve) Chen (陳敏弘 in Chinese). I am a Staff Research Scientist at NVIDIA Research Taiwan, working on Vision+X Multimodal AI. I received my Ph.D. from Georgia Tech, advised by Prof. Ghassan AlRegib and in collaboration with Prof. Zsolt Kira. Before joining NVIDIA, I worked on Biometric Research for Cognitive Services as a Research Engineer II at Microsoft Azure AI and on Edge-AI Research as a Senior AI Engineer at MediaTek.
My research interests center on Multimodal AI, including Vision-Language, 4D/Spatial Understanding, Efficient Deep Learning, VLA, and Transformers. I am also interested in Learning without Full Supervision, including domain adaptation, transfer learning, continual learning, X-supervised learning, etc.
[Update] I released a comprehensive paper list for Vision Transformer & Attention to facilitate related research. Feel free to check it out (I would appreciate it if you could ★STAR it)!
[Personal Website][LinkedIn][Twitter][Google Scholar][Resume]



