Fan Siming (樊思明)

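Personal Statement
Anime and single-player games are my biggest hobbies. My favorite anime is The Pet Girl of Sakurasou (樱花庄的宠物女孩); I made a MAD for it, and the series inspired me to become a programmer. At present (2024.9-now) I am most interested in using AIGC/LLMs to accelerate anime and game creation. Feel free to add me on WeChat to exchange ideas and learn together, or contact me via Gmail!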
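Work Experience
AVAR Aiuni 10.2024~now
SenseTime MIG (Mobile Intelligence) 05.2023~10.2024, reporting to Chen Qian (钱晨) and Wentao Liu (刘文韬)
SenseTime Research / 3DAR department 08.2020~04.2023, reporting to Kwan-Yee Lin (林君仪) and mentor Jingtan Piao (朴镜潭)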
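Education
B.S. in Information and Computing Science (Computer Science track)
School of Mathematical Sciences
University of Electronic Science and Technology of China (UESTC), undergraduate, 08.2017~07.2021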

Link

Homepage(English)

Google Scholar

GitHub

Social
Email
WeChat
QQ
Bilibili
CNBlog
Zhihu
Luogu OJ


Open-source Projects in SenseTime Research

2024 Multimodal Video Frame Retrieval and Editing

  • Video frame retrieval with VideoLLM
    Frame Localization Image

    BestMoment annotation example produced with our pipeline and the GPT4o API, which is the best at action retrieval.

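  • Large-scale (18.6M instances) synthetic pose-text annotations

    This dataset addresses two problems in pose-description annotation: human annotation is too expensive (¥0.03 per character), and GPT4o annotation is not accurate enough (accuracy, human : GPT : ours = 95% : 70% : 95%). It comes in two versions, (a) single-frame pose descriptions and (b) two-frame pose-change descriptions, and is used to train text-based video frame localization models and image editing models.

    (a) Single-frame pose + tracking visualization, an image version of PoseScript. The text on the image is the pose description of the person in the current bbox. The detailed annotations include the text, MPJPE (mean per-joint position error), and Y-axis orientation (±180 degrees means facing front).

    (b) Visualization of two-frame pose-change descriptions plus the second-frame pose description (version without tracking), an image version of PoseFix.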

  • Large model for localizing video frames from fine-grained (action/pose) text descriptions




2021-2023 Rendering & Animation

  • 3D Animation with Secondary Motion


  • DNA-Rendering: A Diverse Neural Actor Repository for High-Fidelity Human-centric Rendering. Wei Cheng, Ruixiang Chen, Wanqi Yin, Siming Fan, Keyu Chen, Honglin He, Huiwen Luo, Zhongang Cai, Jingbo Wang, Yang Gao, Zhengming Yu, Zhengyu Lin, Daxuan Ren, Lei Yang, Ziwei Liu, Chen Change Loy, Chen Qian, Wayne Wu, Dahua Lin, Bo Dai, Kwan-Yee Lin.
    ICCV 2023 [arxiv] [project page] [code]
  • RenderMe-360: A Large Digital Asset Library and Benchmarks Towards High-fidelity Head Avatars. Dongwei Pan, Long Zhuo, Jingtan Piao, Huiwen Luo, Wei Cheng, Yuxin Wang, Siming Fan, Shengqi Liu, Lei Yang, Bo Dai, Ziwei Liu, Chen Change Loy, Chen Qian, Wayne Wu, Dahua Lin, Kwan-Yee Lin.
    NeurIPS 2023 [arxiv] [project page] [code]


  • Simulating Fluids in Real-World Still Images. Siming Fan, Jingtan Piao, Chen Qian, Kwan-Yee Lin, Hongsheng Li.
    ICCV 2023 Oral [arxiv] [project page] [github]


2020 RGB-LiDAR 3D Detection & Unsupervised Domain Adaptation

  • PyTorch version of frustum-pointnets. [code]