Siming Fan (樊思明)

Personal Statement
Anime and single-player games are my biggest hobbies. My favorite anime is さくら荘のペットな彼女, for which I created a MAD. This anime inspired me to become a programmer. Currently (since 2024.9), I am most interested in AI-accelerated anime/game creation. Feel free to contact me via WeChat or Gmail!
Work Experience
AVAR Aiuni 10.2024~now
SenseTime Mobile Intelligence Group 05.2023~10.2024 supervised by Chen Qian and Wentao Liu.
SenseTime Research/3DAR 08.2020~04.2023 supervised by Kwan-Yee Lin and mentored by Jingtan Piao.
Education
Bachelor’s in Information and Computational Science (Computer Science Track)
School of Mathematical Sciences
University of Electronic Science and Technology of China (UESTC) Undergraduate 08.2017~07.2021

Link

Homepage (Chinese)

Google Scholar

Github

Social
Email
WeChat
QQ
Bilibili
CNBlog
Zhihu
Luogu OJ


Open-source Projects in SenseTime Research

2024 MultiModal Frame Retrieval in Video and Editing

  • VideoLLM Retrieves Video Frames

    Example BestMoment annotation produced with our pipeline and the GPT-4o API, which performs best at action retrieval.

  • Large-scale (18.6M instances) Synthetic Pose Text Annotation

    This dataset addresses two problems in pose description: the high cost of manual annotation (¥0.03/character) and the low accuracy of GPT-4o annotation (accuracy, manual : GPT-4o : ours = 95% : 70% : 95%). It comes in two versions: (a) single-frame pose description and (b) dual-frame pose-change description, used to train text-to-frame models and image-editing models respectively.

    (a) Single-frame pose + tracking visualization, an image version of PoseScript. The text on each image describes the pose of the person in the current bounding box. Detailed annotations include the text, MPJPE (mean per-joint position error), and Y-axis orientation (±180 degrees for front view).

    (b) Dual-frame pose-change description + second-frame pose description visualization (non-tracking version), an image version of PoseFix.

  • Fine-grained (Action/Pose) Text Description to Locate Video Frames



2021-2023 Rendering & Animation

  • 3D Animation with Secondary Motion

  • DNA-Rendering: A Diverse Neural Actor Repository for High-Fidelity Human-centric Rendering. Wei Cheng, Ruixiang Chen, Wanqi Yin, Siming Fan, Keyu Chen, Honglin He, Huiwen Luo, Zhongang Cai, Jingbo Wang, Yang Gao, Zhengming Yu, Zhengyu Lin, Daxuan Ren, Lei Yang, Ziwei Liu, Chen Change Loy, Chen Qian, Wayne Wu, Dahua Lin, Bo Dai, Kwan-Yee Lin.
    ICCV 2023 [arxiv] [project page] [code]
  • RenderMe-360: A Large Digital Asset Library and Benchmarks Towards High-fidelity Head Avatars. Dongwei Pan, Long Zhuo, Jingtan Piao, Huiwen Luo, Wei Cheng, Yuxin Wang, Siming Fan, Shengqi Liu, Lei Yang, Bo Dai, Ziwei Liu, Chen Change Loy, Chen Qian, Wayne Wu, Dahua Lin, Kwan-Yee Lin.
    NeurIPS 2023 [arxiv] [project page] [code]

  • Simulating Fluids in Real-World Still Images. Siming Fan, Jingtan Piao, Chen Qian, Kwan-Yee Lin, Hongsheng Li.
    ICCV 2023 Oral [arxiv] [project page] [github]

2020 RGB-LiDAR 3D Detection & Unsupervised Domain Adaptation

  • PyTorch version of frustum-pointnets. [code]