Wei Yang (杨卫)

Associate Professor
School of Computer Science & Technology, HUST
Room 306, Southern Building #6

I am an Associate Professor at the School of Computer Science, Huazhong University of Science and Technology, where I co-lead the HUST Media Lab. My research interests primarily lie in the areas of imaging, graphics, computer vision, and artificial intelligence.

Before joining HUST, I worked in the Advanced Technology and Projects (ATAP) division at Google in Mountain View, USA. As a proud member of the Te'veren team, I collaborated with Rick Marks on advanced sensing and on-device intelligence using computer vision. Prior to Google, I served as a Principal Scientist at DGene, US, where I conducted research on real-time volumetric human capture systems.

I graduated from the University of Delaware in 2017, where I majored in Computer Science. At UDel, I worked with Professor Jingyi Yu on research problems in computational photography and scene understanding. During my PhD, I interned at Adobe, hosted by the ACR team, in 2015.

I am actively seeking creative and highly motivated MS and PhD students who are passionate about research.

Email  /  Google Scholar  

Services
  • Area Chair: NeurIPS 23, CVPR 23, CVPR 24
  • Program Committee: AAAI 20, AAAI 21, WACV 21, BMVC 18
  • Reviewer: CVPR, ICCV, NeurIPS, ICML, ICLR, ECCV, TPAMI, TIP, TVCG...
Research and Publications

* denotes equal contribution or co-corresponding author

AMD: Anatomical Motion Diffusion with Interpretable Motion Decomposition and Fusion
Beibei Jing, Youjia Zhang, Zikai Song, Junqing Yu, Wei Yang
AAAI, 2024  
Paper / Code Coming Soon

We propose the Adaptable Motion Diffusion (AMD) model, which leverages a Large Language Model (LLM) to parse the input text into a sequence of concise and interpretable anatomical scripts that correspond to the target motion.

Progressive Text-to-Image Diffusion with Soft Latent Direction
Yuteng Ye, Jiale Cai, Hang Zhou, Guanwen Li, Youjia Zhang, Zikai Song, Chenxing Gao, Junqing Yu, Wei Yang
AAAI, 2024  
Paper / Code

We propose to harness the capabilities of a Large Language Model (LLM) to decompose text descriptions into coherent directives adhering to stringent formats and progressively generate the target image.

Attacking Transformers with Feature Diversity Adversarial Perturbation
Chenxing Gao, Hang Zhou, Junqing Yu, YuTeng Ye, Jiale Cai, Junle Wang, Wei Yang
AAAI, 2024  
Paper Coming Soon

We present a label-free white-box attack approach for ViT-based models that exhibits strong transferability to various black-box models by accelerating the feature collapse.

Dynamic Feature Pruning and Consolidation for Occluded Person Re-Identification
Yuteng Ye, Hang Zhou, Junqing Yu, Qiang Hu, Wei Yang
AAAI, 2024  
Paper / Code

We propose a Feature Pruning and Consolidation (FPC) framework to circumvent explicit human structure parsing, which consists of a sparse encoder, a global and local feature ranking module, and a feature consolidation decoder.

DiffusionTrack: Diffusion Model For Multi-Object Tracking
Run Luo, Zikai Song, Lintao Ma, Jinlin Wei, Wei Yang, Min Yang
AAAI, 2024  
Paper / Code

We formulate object detection and association jointly as a consistent denoising diffusion process from paired noise boxes to paired ground-truth boxes.

NeMF: Inverse Volume Rendering with Neural Microflake Field
Youjia Zhang, Teng Xu, Junqing Yu, Yuteng Ye, Yanqing Jing, Junle Wang, Jingyi Yu, Wei Yang
ICCV, 2023 
arXiv / Project Page

We propose to conduct inverse volume rendering by representing a scene with a microflake volume, which assumes the space is filled with infinitely small flakes, and light reflects or scatters at each spatial location according to microflake distributions.

C2F2NeUS: Cascade Cost Frustum Fusion for High Fidelity and Generalizable Neural Surface Reconstruction
Luoyuan Xu, Tao Guan, Yuesong Wang, Wenkai Liu, Zhaojie Zeng, Junle Wang, Wei Yang
ICCV, 2023  
Paper

We propose to construct per-view cost frustum and fuse cross-view frustums for finer geometry estimation.

DreamFace: Progressive Generation of Animatable 3D Faces under Text Guidance
Longwen Zhang, Qiwei Qiu, Hongyang Lin, Qixuan Zhang, Cheng Shi, Wei Yang, Ye Shi, Sibei Yang, Lan Xu, Jingyi Yu
SIGGRAPH, 2023 
arXiv / Project Page / Video / Web Demo / Huggingface Space

DreamFace is a progressive scheme to generate personalized 3D faces under text guidance.

HACK: Learning a Parametric Head and Neck Model for High-fidelity Animation
Longwen Zhang, Zijun Zhao, Xinzhou Cong, Qixuan Zhang, Shuqi Gu, Yuchong Gao, Rui Zheng, Wei Yang, Lan Xu, Jingyi Yu
SIGGRAPH, 2023 
arXiv / Project Page / Video

We introduce HACK (Head-And-neCK), a novel parametric model for constructing the head and cervical region of digital humans.

Adaptive Patch Deformation for Textureless-Resilient Multi-View Stereo
Yuesong Wang, Zhaojie Zeng, Tao Guan, Wei Yang, Zhuo Chen, Wenkai Liu, Luoyuan Xu, Yawei Luo
CVPR, 2023  
Paper / Code

We transplant the spirit of deformable convolution into the PatchMatch-based method for both memory-friendly and textureless-resilient MVS.

Compact Transformer Tracker with Correlative Masked Modeling
Zikai Song, Run Luo, Junqing Yu, Yi-Ping Phoebe Chen, Wei Yang
AAAI, 2023  (Oral Presentation)
arXiv / Code

We demonstrate that the basic vision transformer (ViT) architecture is sufficient for visual tracking when paired with correlative masked modeling for enhanced information aggregation.

Dual Memory Units with Uncertainty Regulation for Weakly Supervised Video Anomaly Detection
Hang Zhou, Junqing Yu, Wei Yang
AAAI, 2023  (Oral Presentation)
arXiv / Code

We propose an Uncertainty Regulated Dual Memory Units (UR-DMU) model to learn both the representations of normal data and discriminative features of abnormal data.

HumanNeRF: Efficiently Generated Human Radiance Field from Sparse Inputs
Fuqiang Zhao, Wei Yang, Jiakai Zhang, Pei Lin, Yingliang Zhang, Jingyi Yu, Lan Xu
CVPR, 2022  
Project Page / arXiv / Code / Video

We present a neural representation with efficient generalization ability for high-fidelity free-view synthesis of dynamic humans.

Transformer Tracking With Cyclic Shifting Window Attention
Zikai Song, Junqing Yu, Yi-Ping Phoebe Chen, Wei Yang
CVPR, 2022  
arXiv / Code

CSWinTT is a new transformer architecture with multi-scale cyclic shifting window attention for visual object tracking, elevating the attention from pixel to window level.

Artemis: Articulated Neural Pets with Appearance and Motion Synthesis
Haimin Luo, Teng Xu, Yuheng Jiang, Chenglin Zhou, Qiwei Qiu, Yingliang Zhang, Wei Yang, Lan Xu, Jingyi Yu
SIGGRAPH, 2022  
Project Page / arXiv / Code / Video

ARTEMIS, the core of which is a neural-generated imagery (NGI) animal engine, enables interactive motion control, real-time animation, and photo-realistic rendering of furry animals.

NeuVV: Neural Volumetric Videos with Immersive Rendering and Editing
Jiakai Zhang, Liao Wang, Xinhang Liu, Fuqiang Zhao, Minzhang Li, Haizhao Dai, Boyuan Zhang, Wei Yang, Lan Xu, Jingyi Yu
arXiv preprint, arXiv:2202.06088  
arXiv

NeuVV introduces a hyper-spherical harmonics (HH) decomposition for modeling smooth color variations over space and time.

Self-Supervised Multi-view Stereo via Adjacent Geometry Guided Volume Completion
Luoyuan Xu, Tao Guan, Yuesong Wang, Yawei Luo, Zhuo Chen, Wenkai Liu, Wei Yang
ACM Multimedia, 2022  
Paper

We propose the AGG-CVCNet to learn complete geometry inference from partial observations with high confidence.

Video-driven Neural Physically-based Facial Asset for Production
Longwen Zhang, Chuxiao Zeng, Qixuan Zhang, Hongyang Lin, Ruixiang Cao, Wei Yang, Lan Xu, Jingyi Yu
SIGGRAPH Asia, 2022  
Project Page / arXiv / Video

We present a learning-based, video-driven approach for generating dynamic facial geometries with high-quality physically-based assets.

Tightcap: 3D human shape capture with clothing tightness field
Xin Chen, Anqi Pang, Wei Yang, Peihao Wang, Lan Xu, Jingyi Yu
SIGGRAPH, 2022 (TOG) 
Project Page / arXiv / Code / Video

TightCap is a data-driven scheme to capture both the human shape and dressed garments accurately with only a single 3D human scan.

SportsCap: Monocular 3D Human Motion Capture and Fine-grained Understanding in Challenging Sports Videos
Xin Chen, Anqi Pang, Wei Yang, Yuexin Ma, Lan Xu, Jingyi Yu
IJCV, 2022  
Project Page / arXiv / Code / Video

SportsCap is the first approach for simultaneously capturing 3D human motions and understanding fine-grained actions from challenging monocular sports video input.

Structure From Motion on XSlit Cameras
Wei Yang, Yingliang Zhang, Jinwei Ye, Yu Ji, Zhong Li, Mingyuan Zhou, Jingyi Yu
TPAMI, 2019  
Paper

We present a structure-from-motion (SfM) framework based on a special type of multi-perspective camera called the cross-slit or XSlit camera.

Robust 3D Human Motion Reconstruction via Dynamic Template Construction
Zhong Li, Yu Ji, Wei Yang, Jinwei Ye, Jingyi Yu
3DV, 2017  (Spotlight Oral Presentation)
arXiv / Video / Data

We generate a global full-body template by registering all poses in the acquired motion sequence, and then construct a deformable graph by utilizing the rigid components in the global template.

The Light Field 3D Scanner
Yingliang Zhang, Zhong Li, Wei Yang, Peihong Yu, Haiting Lin, Jingyi Yu
ICCP, 2017  
Paper

We use light field (LF) cameras such as Lytro and Raytrix as virtual 3D scanners.

Ray Space Features for Plenoptic Structure-from-Motion
Yingliang Zhang, Peihong Yu, Wei Yang, Yuanxi Ma, Jingyi Yu
ICCV, 2017  
Paper

We present a comprehensive theory on ray geometry transforms under light field pose variations, and derive the transforms of three typical ray manifolds.

Resolving Scale Ambiguity via XSlit Aspect Ratio Analysis
Wei Yang, Haiting Lin, Sing Bing Kang, Jingyi Yu
ICCV, 2015  
Paper

We present the depth-dependent aspect ratio (DDAR) property, which can be used for 3D recovery.

Ambient Occlusion via Compressive Visibility Estimation
Wei Yang, Yu Ji, Haiting Lin, Yang Yang, Sing Bing Kang, Jingyi Yu
CVPR, 2015  
Paper

We present a novel computational imaging solution for recovering AO by adopting a compressive sensing framework.

Depth-of-field and Coded Aperture Imaging on Xslit Lens
Jinwei Ye, Yu Ji, Wei Yang, Jingyi Yu
ECCV, 2014  (Oral Presentation)
Paper / Video

We explore coded aperture solutions on a special non-centric lens called the crossed-slit (XSlit) lens.

Coplanar Common Points in Non-centric Cameras
Wei Yang, Yu Ji, Jinwei Ye, S. Susan Young, Jingyi Yu
ECCV, 2014  
Paper

We address the problem of determining CCP existence in general non-centric cameras.


I borrowed this website template from Jon Barron, thanks!