About
I am currently a postdoctoral scholar at the University of Pennsylvania, advised by Prof. Konrad Kording. Previously, I obtained my Ph.D. from the Multimedia Lab (MMLab) at the Chinese University of Hong Kong (CUHK), supervised by Prof. Xiaogang Wang and Prof. Hongsheng Li. I have also collaborated with researchers at Nvidia, SenseTime, the Chinese Academy of Sciences (CAS), and Shanghai AI Lab. Prior to CUHK, I received my B.Eng. degree from the College of Intelligence and Computing, Tianjin University (TJU), in 2017, where I also minored in Finance.
Before joining UPenn as a postdoc, I co-founded a startup specializing in AI-driven fitness health and exercise safety solutions. There, I established and led an AI research and development team. Our key project was a scalable, quasi-real-time commercial system for motion capture and human-object interaction analysis, built on a multi-view sparse 3D reconstruction approach. I introduced several optimizations to tackle practical challenges and improve system performance, including an Asynchronous-Tolerant Multi-View Reconstruction method, a Multiview-guided Non-Maximum Suppression (NMS) method, a Quasi-real-time Scalable AI Heterogeneous Computing System, and improvements to data and AI infrastructure.
My research focuses on scalable AI perception systems that can address diverse machine learning problems in naturalistic settings. I am keen on exploring robust representation learning pipelines that are both flexible and powerful. This entails a comprehensive study of general representations, employing methods from optimization, statistics, and HPC systems to overcome the challenges posed by defective data (large-scale, unlabeled, and ill-posed). My current interests span several areas of representation learning, including generative data augmentation, self-supervised pretraining, modality alignment of foundation models, and vision learning with parameter-efficient transfer learning (PETL). I am also interested in multi-view fused perception and high-performance real-time AI systems.