Dan Xu's Research Page!

Selected Publications



Pre-Prints:
PyGS: Large-scale Scene Representation with Pyramidal 3D Gaussian Splatting
Zipeng Wang, Dan Xu

arXiv preprint arXiv:2404.13679

X-VILA: Cross-Modality Alignment for Large Language Model
Hanrong Ye, De-An Huang, Yao Lu, Zhiding Yu, Wei Ping, Andrew Tao, Jan Kautz, Song Han, Dan Xu, Pavlo Molchanov, Hongxu Yin

arXiv preprint arXiv:2405.19335

You Only Train Once: Multi-Identity Free-Viewpoint Neural Human Rendering from Monocular Videos
Jaehyeok Kim, Dongyoon Wee, Dan Xu

arXiv preprint arXiv:2303.05835

Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation
Fa-Ting Hong, Zunnan Xu, Zixiang Zhou, Jun Zhou, Xiu Li, Qin Lin, Qinglin Lu, Dan Xu

arXiv preprint arXiv:2504.02542

(A unified talking head framework supporting both single-modal and multi-modal driving signals)

@article{hong2025audiovisual, title={Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation}, author={Fa-Ting Hong and Zunnan Xu and Zixiang Zhou and Jun Zhou and Xiu Li and Qin Lin and Qinglin Lu and Dan Xu}, journal={arXiv preprint arXiv:2504.02542}, year={2025} }


Conference:


Taming LLMs by Scaling Learning Rates with Gradient Grouping
Siyuan Li, Juanxi Tian, Zedong Wang, Xin Jin, Zicheng Liu, Wentao Zhang, Dan Xu

The 63rd Annual Meeting of the Association for Computational Linguistics (ACL Main), 2025, Vienna, Austria

@InProceedings{tamingacl25, title={Taming LLMs by Scaling Learning Rates with Gradient Grouping}, author={Siyuan Li and Juanxi Tian and Zedong Wang and Xin Jin and Zicheng Liu and Wentao Zhang and Dan Xu}, booktitle={ACL}, year={2025} }

I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models
Zhenxing Mi, Kuan-Chieh Wang, Guocheng Qian, Hanrong Ye, Runtao Liu, Sergey Tulyakov, Kfir Aberman, Dan Xu

International Conference on Machine Learning (ICML), 2025, Vancouver, Canada

@InProceedings{thinkdiff25icml, title={I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models}, author={Zhenxing Mi and Kuan-Chieh Wang and Guocheng Qian and Hanrong Ye and Runtao Liu and Sergey Tulyakov and Kfir Aberman and Dan Xu}, booktitle={ICML}, year={2025} }

UniMC: Taming Diffusion Transformer for Unified Keypoint-guided Multi-class Image Generation
Qin Guo, Ailing Zeng, Dongxu Yue, Ceyuan Yang, Yang Cao, Hanzhong Guo, Wei Liu, Xihui Liu, Dan Xu

International Conference on Machine Learning (ICML), 2025, Vancouver, Canada

(Project Details Coming Soon)

@InProceedings{unimc25icml, title={UniMC: Taming Diffusion Transformer for Unified Keypoint-guided Multi-class Image Generation}, author={Qin Guo and Ailing Zeng and Dongxu Yue and Ceyuan Yang and Yang Cao and Hanzhong Guo and Wei Liu and Xihui Liu and Dan Xu}, booktitle={ICML}, year={2025} }

Human-centric Foundation Models: Perception, Generation and Agentic Modeling
Shixiang Tang, Yizhou Wang, Lu Chen, Yuan Wang, Sida Peng, Dan Xu, Wanli Ouyang

The 34th International Joint Conference on Artificial Intelligence (IJCAI), 2025, Montreal, Canada

@InProceedings{ag25ijcai, title={Human-centric Foundation Models: Perception, Generation and Agentic Modeling}, author={Shixiang Tang and Yizhou Wang and Lu Chen and Yuan Wang and Sida Peng and Dan Xu and Wanli Ouyang}, booktitle={IJCAI}, year={2025} }

Free-Viewpoint Human Animation with Pose-Correlated Reference Selection
Fating Hong, Zhan Xu, Haiyang Liu, Qinjie Lin, Luchuan Song, Zhixin Shu, Yang Zhou, Duygu Ceylan, Dan Xu

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025, Nashville, USA

(Selected as Highlight Paper)

@InProceedings{freeviewpoint25, title={Free-Viewpoint Human Animation with Pose-Correlated Reference Selection}, author={Fating Hong and Zhan Xu and Haiyang Liu and Qinjie Lin and Luchuan Song and Zhixin Shu and Yang Zhou and Duygu Ceylan and Dan Xu}, booktitle={CVPR}, year={2025} }

Flow-NeRF: Joint Learning of Geometry, Poses, and Dense Flow within Unified Neural Representations
Xunzhi Zheng, Dan Xu

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025, Nashville, USA

@InProceedings{flownerf25, title={Flow-NeRF: Joint Learning of Geometry, Poses, and Dense Flow within Unified Neural Representations}, author={Xunzhi Zheng and Dan Xu}, booktitle={CVPR}, year={2025} }

Taming Video Diffusion Prior with Scene-Grounding Guidance for 3D Gaussian Splatting from Sparse Inputs
Yingji Zhong, Zhihao Li, Zhenyu Chen, Lanqing Hong, Dan Xu

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025, Nashville, USA

(Selected as Highlight Paper)

@InProceedings{tamingvideo25, title={Taming Video Diffusion Prior with Scene-Grounding Guidance for 3D Gaussian Splatting from Sparse Inputs}, author={Yingji Zhong and Zhihao Li and Zhenyu Chen and Lanqing Hong and Dan Xu}, booktitle={CVPR}, year={2025} }

Synergizing Motion and Appearance: Multi-Scale Compensatory Codebooks for Talking Head Video Generation
Shuling Zhao, Fating Hong, Xiaoshui Huang, Dan Xu

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025, Nashville, USA

@InProceedings{synergizing25, title={Synergizing Motion and Appearance: Multi-Scale Compensatory Codebooks for Talking Head Video Generation}, author={Shuling Zhao and Fating Hong and Xiaoshui Huang and Dan Xu}, booktitle={CVPR}, year={2025} }

GaussHDR: High Dynamic Range Gaussian Splatting via Learning Unified 3D and 2D Local Tone Mapping
Jinfeng Liu, Lingtong Kong, Bo Li, Dan Xu

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025, Nashville, USA

@InProceedings{GSHDR25, title={GaussHDR: High Dynamic Range Gaussian Splatting via Learning Unified 3D and 2D Local Tone Mapping}, author={Jinfeng Liu and Lingtong Kong and Bo Li and Dan Xu}, booktitle={CVPR}, year={2025} }

MM-Ego: Towards Building Egocentric Multimodal LLMs
Hanrong Ye, Haotian Zhang, Erik Daxberger, Lin Chen, Zongyu Lin, Yanghao Li, B. Zhang, Haoxuan You, Dan Xu, Zhe Gan, Jiasen Lu, Yinfei Yang

The International Conference on Learning Representations (ICLR), 2025, Singapore.

@InProceedings{mmego25, title={MM-Ego: Towards Building Egocentric Multimodal LLMs}, author={Hanrong Ye and Haotian Zhang and Erik Daxberger and Lin Chen and Zongyu Lin and Yanghao Li and B. Zhang and Haoxuan You and Dan Xu and Zhe Gan and Jiasen Lu and Yinfei Yang}, booktitle={ICLR}, year={2025} }

GScream: Learning 3D Geometry and Feature Consistent Gaussian Splatting for Object Removal
Yuxin Wang, Qianyi Wu, Guofeng Zhang, Dan Xu

European Conference on Computer Vision (ECCV), 2024, Milan, Italy

@InProceedings{GScreameccv24, title={GScream: Learning 3D Geometry and Feature Consistent Gaussian Splatting for Object Removal}, author={Wang, Yuxin and Wu, Qianyi and Zhang, Guofeng and Xu, Dan}, booktitle={ECCV}, year={2024} }

Motion-Oriented Compositional Neural Radiance Fields for Monocular Dynamic Human Modeling
Jaehyeok Kim, Dongyoon Wee, Dan Xu

European Conference on Computer Vision (ECCV), 2024, Milan, Italy

@InProceedings{moconerfeccv24, title={Motion-Oriented Compositional Neural Radiance Fields for Monocular Dynamic Human Modeling}, author={Kim, Jaehyeok and Wee, Dongyoon and Xu, Dan}, booktitle={ECCV}, year={2024} }

SegGen: Supercharging Segmentation Models with Text2Mask and Mask2Img Synthesis
Hanrong Ye, Jason Kuen, Qing Liu, Zhe Lin, Brian Price, Dan Xu

European Conference on Computer Vision (ECCV), 2024, Milan, Italy

@InProceedings{seggeneccv24, title={SegGen: Supercharging Segmentation Models with Text2Mask and Mask2Img Synthesis}, author={Ye, Hanrong and Kuen, Jason and Liu, Qing and Lin, Zhe and Price, Brian and Xu, Dan}, booktitle={ECCV}, year={2024} }

RoomTex: Texturing Compositional Indoor Scenes via Iterative Inpainting
Qi Wang, Ruijie Lu, Xudong Xu, Jingbo Wang, Michael Yu Wang, Bo Dai, Gang Zeng, Dan Xu

European Conference on Computer Vision (ECCV), 2024, Milan, Italy

@InProceedings{roomtexeccv24, title={RoomTex: Texturing Compositional Indoor Scenes via Iterative Inpainting}, author={Wang, Qi and Lu, Ruijie and Xu, Xudong and Wang, Jingbo and Wang, Michael Yu and Dai, Bo and Zeng, Gang and Xu, Dan}, booktitle={ECCV}, year={2024} }

DiffusionMTL: Learning Multi-Task Denoising Diffusion Model from Partially Annotated Data
Hanrong Ye, Dan Xu

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, Seattle, USA

@InProceedings{diffusionmtl, title={DiffusionMTL: Learning Multi-Task Denoising Diffusion Model from Partially Annotated Data}, author={Ye, Hanrong and Xu, Dan}, booktitle={CVPR}, year={2024} }

CVT-xRF: Contrastive In-Voxel Transformer for 3D Consistent Radiance Fields from Sparse Inputs
Yingji Zhong, Lanqing Hong, Zhenguo Li, Dan Xu

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, Seattle, USA

@inproceedings{zhong2024cvt, title={CVT-xRF: Contrastive In-Voxel Transformer for 3D Consistent Radiance Fields from Sparse Inputs}, author={Zhong, Yingji and Hong, Lanqing and Li, Zhenguo and Xu, Dan}, booktitle={CVPR}, year={2024} }

Interactive3D: Create What You Want by Interactive 3D Generation
Shaocong Dong, Lihe Ding, Zhanpeng Huang, Tianfan Xue, Dan Xu

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, Seattle, USA

@inproceedings{dong2024interactive3d, title={Interactive3D: Create What You Want by Interactive 3D Generation}, author={Dong, Shaocong and Ding, Lihe and Huang, Zhanpeng and Wang, Zibin and Xue, Tianfan and Xu, Dan}, booktitle={CVPR}, year={2024} }

DetCLIPv3: Towards Versatile Generative Open-Vocabulary Object Detection
Lewei Yao, Renjie Pi, Jianhua Han, Xiaodan Liang, Hang Xu, Zhenguo Li, Dan Xu

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, Seattle, USA

@inproceedings{lewei2024detclipv3, title={DetCLIPv3: Towards Versatile Generative Open-Vocabulary Object Detection}, author={Lewei Yao and Renjie Pi and Jianhua Han and Xiaodan Liang and Hang Xu and Zhenguo Li and Dan Xu}, booktitle={CVPR}, year={2024} }

Text-to-3D Generation with Bidirectional Diffusion using Both 2D and 3D Priors
Lihe Ding, Shaocong Dong, Zhanpeng Huang, Zibin Wang, Yiyuan Zhang, Kaixiong Gong, Dan Xu, Tianfan Xue

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, Seattle, USA

@inproceedings{ding2023text, title={Text-to-3D Generation with Bidirectional Diffusion using Both 2D and 3D Priors}, author={Ding, Lihe and Dong, Shaocong and Huang, Zhanpeng and Wang, Zibin and Zhang, Yiyuan and Gong, Kaixiong and Xu, Dan and Xue, Tianfan}, booktitle={CVPR}, year={2024} }

GS-SLAM: Dense Visual SLAM with 3D Gaussian Splatting
Chi Yan*, Delin Qu*, Dan Xu, Zhigang Wang, Bin Zhao, Dong Wang, Xuelong Li

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, Seattle, USA

(Highlight Paper, 2.8% acceptance rate)

@inproceedings{yan2023gs, author={Yan, Chi and Qu, Delin and Xu, Dan and Zhao, Bin and Wang, Zhigang and Wang, Dong and Li, Xuelong}, title={GS-SLAM: Dense Visual SLAM with 3D Gaussian Splatting}, booktitle={CVPR}, year={2024} }

Implicit Event-RGBD Neural SLAM
Delin Qu, Chi Yan, Dong Wang, Jie Yin, Dan Xu, Bin Zhao, Xuelong Li

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, Seattle, USA

(Highlight Paper, 2.8% acceptance rate)

@inproceedings{quimplicitslamcvpr24, author={Delin Qu and Chi Yan and Dong Wang and Jie Yin and Dan Xu and Bin Zhao and Xuelong Li}, title={Implicit Event-RGBD Neural SLAM}, booktitle={CVPR}, year={2024} }

Efficient Multitask Dense Predictor via Binarization
Yuzhang Shang, Dan Xu, Gaowen Liu, Ramana Rao Kompella, Yan Yan

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, Seattle, USA

@inproceedings{Shang2024efficient, title={Efficient Multitask Dense Predictor via Binarization}, author={Yuzhang Shang and Dan Xu and Gaowen Liu and Ramana Rao Kompella and Yan Yan}, booktitle={CVPR}, year={2024} }

CoDA: Collaborative Novel Box Discovery and Cross-Modal Alignment for Open-Vocabulary 3D Object Detection
Yang Cao, Yihan Zeng, Hang Xu, Dan Xu

Advances in Neural Information Processing Systems (NeurIPS), 2023, New Orleans, USA

@InProceedings{CoDaneurips2023, title={CoDA: Collaborative Novel Box Discovery and Cross-Modal Alignment for Open-Vocabulary 3D Object Detection}, author={Cao, Yang and Zeng, Yihan and Xu, Hang and Xu, Dan}, booktitle={NeurIPS}, year={2023} }

Implicit Identity Representation Conditioned Memory Compensation Network for Talking Head Video Generation
Fating Hong, Dan Xu

International Conference on Computer Vision (ICCV), 2023, Paris, France

(Project and code are released!)

@InProceedings{impliciticcv2023, title={Implicit Identity Representation Conditioned Memory Compensation Network for Talking Head Video Generation}, author={Hong, Fating and Xu, Dan}, booktitle={ICCV}, year={2023} }

Learning Unified Decompositional and Compositional NeRF for Editable Novel View Synthesis
Yuxin Wang, Wayne Wu, Dan Xu

International Conference on Computer Vision (ICCV), 2023, Paris, France

(Project and code are released!)

@InProceedings{editingiccv2023, title={Learning Unified Decompositional and Compositional NeRF for Editable Novel View Synthesis}, author={Wang, Yuxin and Wu, Wayne and Xu, Dan}, booktitle={ICCV}, year={2023} }

TaskExpert: Dynamically Assembling Multi-Task Representations with Memorial Mixture-of-Experts
Hanrong Ye, Dan Xu

International Conference on Computer Vision (ICCV), 2023, Paris, France

(Project and code will be released soon!)

@InProceedings{taskmoeiccv2023, title={TaskExpert: Dynamically Assembling Multi-Task Representations with Memorial Mixture-of-Experts}, author={Ye, Hanrong and Xu, Dan}, booktitle={ICCV}, year={2023} }

Learning Multi-Modal Class-Specific Tokens for Weakly Supervised Dense Object Localization
Lian Xu, Wanli Ouyang, Mohammed Bennamoun, Farid Boussaid, Dan Xu

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023, Vancouver, Canada

@InProceedings{multimodalcvpr2022, title={Learning Multi-Modal Class-Specific Tokens for Weakly Supervised Dense Object Localization}, author={Xu, Lian and Ouyang, Wanli and Bennamoun, Mohammed and Boussaid, Farid and Xu, Dan}, booktitle={CVPR}, year={2023} }

DetCLIPv2: Scalable Open-Vocabulary Object Detection Pre-training via Word-Region Alignment
Lewei Yao, Jianhua Han, Xiaodan Liang, Dan Xu, Wei Zhang, Zhenguo Li, Hang Xu

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023, Vancouver, Canada

(Uses 13M image-text pairs for pretraining; achieves 40.4% zero-shot AP on the LVIS benchmark, outperforming GLIP/GLIPv2 by 14.4/11.4% AP!)

@InProceedings{multimodalcvpr2023, title={DetCLIPv2: Scalable Open-Vocabulary Object Detection Pre-training via Word-Region Alignment}, author={Yao, Lewei and Han, Jianhua and Liang, Xiaodan and Xu, Dan and Zhang, Wei and Li, Zhenguo and Xu, Hang}, booktitle={CVPR}, year={2023} }

Switch-NeRF: Learning Scene Decomposition with Mixture of Experts for Large-scale Neural Radiance Fields
Zhenxing Mi, Dan Xu

The International Conference on Learning Representations (ICLR), 2023, Kigali, Rwanda
​
(First to use MoE for unsupervised large-scale scene decomposition, achieving better performance than BlockNeRF and MegaNeRF!)

@InProceedings{switchnerf2023, title={Switch-NeRF: Learning Scene Decomposition with Mixture of Experts for Large-scale Neural Radiance Fields}, author={Mi, Zhenxing and Xu, Dan}, booktitle={ICLR}, year={2023} }

TaskPrompter: Spatial-Channel Multi-Task Prompting for Dense Scene Understanding
Hanrong Ye, Dan Xu

The International Conference on Learning Representations (ICLR), 2023, Kigali, Rwanda
​
(Fully-supervised joint 2D and 3D multi-task transformer framework, 3D detection, semantic segmentation and depth estimation!)

@InProceedings{taskprompter2023, title={TaskPrompter: Spatial-Channel Multi-Task Prompting for Dense Scene Understanding}, author={Ye, Hanrong and Xu, Dan}, booktitle={ICLR}, year={2023} }

Edge Guided GANs with Contrastive Learning for Semantic Image Synthesis
Hao Tang, Xiaojuan Qi, Guolei Sun, Dan Xu, Nicu Sebe, Radu Timofte, Luc Van Gool

The International Conference on Learning Representations (ICLR), 2023, Kigali, Rwanda
​
@InProceedings{edgeguided2023, title={Edge Guided GANs with Contrastive Learning for Semantic Image Synthesis}, author={Hao Tang and Xiaojuan Qi and Guolei Sun and Dan Xu and Nicu Sebe and Radu Timofte and Luc Van Gool}, booktitle={ICLR}, year={2023} }

Contrastive Multi-Task Dense Prediction
Siwei Yang, Hanrong Ye, Dan Xu

The Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI), 2023, Washington DC, USA
​
(The first contrastive deep framework for multi-task dense prediction!)

@InProceedings{contrastiveaaai23, title={Contrastive Multi-Task Dense Prediction}, author={Yang, Siwei and Ye, Hanrong and Xu, Dan}, booktitle={AAAI}, year={2023} }

DetCLIP: Dictionary-Enriched Visual-Concept Paralleled Pre-training for Open-world Detection
Lewei Yao, Jianhua Han, Youpeng Wen, Xiaodan Liang, Dan Xu, Zhenguo Li, Chunjing Xu, Hang Xu

Neural Information Processing Systems (NeurIPS), 2022, New Orleans, USA
​
@InProceedings{detclip2022, title={DetCLIP: Dictionary-Enriched Visual-Concept Paralleled Pre-training for Open-world Detection}, author={Yao, Lewei and Han, Jianhua and Wen, Youpeng and Liang, Xiaodan and Xu, Dan and Zhang, Wei and Li, Zhenguo and Xu, Chunjing and Xu, Hang}, booktitle={NeurIPS}, year={2022} }

Inverted Pyramid Multi-task Transformer for Dense Scene Understanding
Hanrong Ye, Dan Xu

European Conference on Computer Vision (ECCV), 2022, Tel Aviv, Israel
​
(Largely outperforms the previous SOTA (e.g., ATRC) on multiple challenging multi-task dense prediction benchmarks.)

@InProceedings{invpt2022, title={Inverted Pyramid Multi-task Transformer for Dense Scene Understanding}, author={Ye, Hanrong and Xu, Dan}, booktitle={ECCV}, year={2022} }

Network Binarization via Contrastive Learning
Yuzhang Shang, Dan Xu, Ziliang Zong, Yan Yan

European Conference on Computer Vision (ECCV), 2022, Tel Aviv, Israel
​
@inproceedings{shang2022network, title={Network Binarization via Contrastive Learning}, author={Shang, Yuzhang and Xu, Dan and Zong, Ziliang and Yan, Yan}, booktitle={ECCV}, year={2022} }

Lipschitz Continuity Retained Binary Neural Network
Yuzhang Shang, Dan Xu, Ziliang Zong, Bin Duan, Liqiang Nie, Yan Yan

European Conference on Computer Vision (ECCV), 2022, Tel Aviv, Israel
​
@inproceedings{shang2022lipschitz, title={Lipschitz Continuity Retained Binary Neural Network}, author={Shang, Yuzhang and Xu, Dan and Duan, Bin and Zong, Ziliang and Nie, Liqiang and Yan, Yan}, booktitle={ECCV}, year={2022} }

Generalized Binary Search Network for Highly-Efficient Multi-View Stereo
Zhenxing Mi, Di Chang, Dan Xu

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022, New Orleans, USA
​
@inproceedings{mi2021generalized, title={Generalized Binary Search Network for Highly-Efficient Multi-View Stereo}, author={Zhenxing Mi and Di Chang and Dan Xu}, booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition}, year={2022} }

Multi-class Token Transformer for Weakly Supervised Semantic Segmentation
Lian Xu, Wanli Ouyang, Mohammed Bennamoun, Farid Boussaid, Dan Xu

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022, New Orleans, USA
​
@inproceedings{xu2022multi, title={Multi-class Token Transformer for Weakly Supervised Semantic Segmentation}, author={Xu, Lian and Ouyang, Wanli and Bennamoun, Mohammed and Boussaid, Farid and Xu, Dan}, booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, year={2022} }

Depth-Aware Generative Adversarial Network for Talking Head Video Generation
Fa-Ting Hong, Longhao Zhang, Li Shen, Dan Xu

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022, New Orleans, USA
​
@inproceedings{hong2022depth, title={Depth-Aware Generative Adversarial Network for Talking Head Video Generation}, author={Hong, Fa-Ting and Zhang, Longhao and Shen, Li and Xu, Dan}, booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, year={2022} }

Leveraging Auxiliary Tasks with Affinity Learning for Weakly Supervised Semantic Segmentation
Lian Xu, Wanli Ouyang, Mohammed Bennamoun, Farid Boussaid, Ferdous Sohel, Dan Xu

IEEE International Conference on Computer Vision (ICCV), 2021, Montréal, Québec
​
(Significantly outperforms previous SOTA on PASCAL VOC 2012 and MS COCO benchmarks.)

@inproceedings{iccvleveraging, title={Leveraging Auxiliary Tasks with Affinity Learning for Weakly Supervised Semantic Segmentation}, author={Xu, Lian and Ouyang, Wanli and Bennamoun, Mohammed and Boussaid, Farid and Sohel, Ferdous and Xu, Dan}, booktitle={ICCV}, year={2021} }

Sign-agnostic CONet: Learning Implicit Surface Reconstructions by Sign-agnostic Optimization of Convolutional Occupancy Networks
Jiapeng Tang, Jiabao Lei, Dan Xu*, Feiying Ma, Kui Jia*, Lei Zhang

IEEE International Conference on Computer Vision (ICCV), 2021, Montréal, Québec
​
(Oral Presentation, 3% acceptance rate; *Corresponding author)

@inproceedings{tang2021sign, title={Sign-agnostic CONet: Learning Implicit Surface Reconstructions by Sign-agnostic Optimization of Convolutional Occupancy Networks}, author={Tang, Jiapeng and Lei, Jiabao and Xu, Dan and Ma, Feiying and Jia, Kui and Zhang, Lei}, booktitle={ICCV}, year={2021} }

Moving SLAM: Fully Unsupervised Deep Learning in Non-Rigid Scenes
Dan Xu, Andrea Vedaldi, João F. Henriques

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2021, Prague, Czech Republic

@inproceedings{xu2021moving, title={Moving SLAM: Fully Unsupervised Deep Learning in Non-Rigid Scenes}, author={Xu, Dan and Vedaldi, Andrea and Henriques, João F.}, booktitle={IROS}, year={2021} }

Delving into Localization Errors for Monocular 3D Object Detection
Xinzhu Ma, Yinming Zhang, Dan Xu, Dongzhan Zhou, Shuai Yi, Wanli Ouyang

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021, Tennessee, USA

@inproceedings{Madelving, title={Delving into Localization Errors for Monocular 3D Object Detection}, author={Ma, Xinzhu and Zhang, Yinming and Xu, Dan and Zhou, Dongzhan and Yi, Shuai and Ouyang, Wanli}, booktitle={CVPR}, year={2021} }

Learning Parallel Dense Correspondence from Spatio-Temporal Descriptors for Efficient and Robust 4D Reconstruction
Jiapeng Tang, Dan Xu, Kui Jia, Lei Zhang

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021, Tennessee, USA

@inproceedings{TangLearning, title={Learning Parallel Dense Correspondence from Spatio-Temporal Descriptors for Efficient and Robust 4D Reconstruction}, author={Tang, Jiapeng and Xu, Dan and Jia, Kui and Zhang, Lei}, booktitle={CVPR}, year={2021} }

Dynamic Graph Message Passing Networks
Li Zhang, Dan Xu, Anurag Arnab, Philip H.S. Torr

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020, Seattle, USA
​
(Oral Presentation. 81.8% mIoU for semantic segmentation on the Cityscapes test set; boosts Mask R-CNN by 3.0% APbox for object detection and 2.5% APmask for instance segmentation on COCO.)

@inproceedings{zhang2019Dynamic, title={Dynamic Graph Message Passing Networks}, author={Zhang, Li and Xu, Dan and Arnab, Anurag and Torr, Philip H.S.}, booktitle={CVPR}, year={2020} }

Local Class-Specific and Global Image-Level Generative Adversarial Networks for Semantic-Guided Scene Generation
Hao Tang, Dan Xu, Yan Yan, Philip H.S. Torr, Nicu Sebe

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020, Seattle, USA

@inproceedings{tang2020local, title={Local Class-Specific and Global Image-Level Generative Adversarial Networks for Semantic-Guided Scene Generation}, author={Tang, Hao and Xu, Dan and Yan, Yan and Torr, Philip H.S. and Sebe, Nicu}, booktitle={CVPR}, year={2020} }

Geometry-Aware Video Object Detection for Static Cameras
Dan Xu, Weidi Xie, Andrew Zisserman

British Machine Vision Conference (BMVC), 2019, Cardiff, UK
​
(Oral Presentation, 4.66% acceptance rate)

@inproceedings{xu2019geometry, title={Geometry-Aware Video Object Detection for Static Cameras}, author={Xu, Dan and Xie, Weidi and Zisserman, Andrew}, booktitle={BMVC}, year={2019} }

Unsupervised Collaborative Learning of Keyframe Detection and Visual Odometry Towards Monocular Deep SLAM
Lu Sheng*, Dan Xu*, Wanli Ouyang, Xiaogang Wang

International Conference on Computer Vision (ICCV) 2019, Seoul, Korea
​
(*denotes equal contribution)

@inproceedings{Sheng2019Unsupervised, title={Unsupervised Collaborative Learning of Keyframe Detection and Visual Odometry Towards Monocular Deep SLAM}, author={Sheng, Lu and Xu, Dan and Ouyang, Wanli and Wang, Xiaogang}, booktitle={ICCV}, year={2019} }

Structured Modeling of Joint Deep Feature and Prediction Refinement for Salient Object Detection
Yingyue Xu, Dan Xu, Xiaopeng Hong, Wanli Ouyang, Rongrong Ji, Guoying Zhao

International Conference on Computer Vision (ICCV) 2019, Seoul, Korea

@inproceedings{Xu2019Structured, title={Structured Modeling of Joint Deep Feature and Prediction Refinement for Salient Object Detection}, author={Xu, Yingyue and Xu, Dan and Hong, Xiaopeng and Ouyang, Wanli and Ji, Rongrong and Zhao, Guoying}, booktitle={ICCV}, year={2019} }

Cycle In Cycle Generative Adversarial Networks for Keypoint-Guided Image Generation
Hao Tang, Dan Xu, Yan Yan, Wei Wang, Nicu Sebe

The 27th ACM Conference on Multimedia (ACM MM), 2019, Nice, France
​
(Oral Presentation, 4.96% acceptance rate)

@inproceedings{hao2019cycleincycle, title={Cycle in Cycle Generative Adversarial Networks for Keypoint-Guided Image Generation}, author={Tang, Hao and Xu, Dan and Yan, Yan and Wang, Wei and Sebe, Nicu}, booktitle={ACM MM}, year={2019} }

Structured Coupled Generative Adversarial Networks for Unsupervised Monocular Depth Estimation
Mihai Puscas, Dan Xu, Andrea Pilzer, Nicu Sebe

International Conference on 3D Vision (3DV), 2019, Quebec City, Canada
​
(Oral Presentation, 5.76% acceptance rate)

@inproceedings{Pilzer2019structured, title={Structured Coupled Generative Adversarial Networks for Unsupervised Monocular Depth Estimation}, author={Puscas, Mihai and Xu, Dan and Pilzer, Andrea and Sebe, Nicu}, booktitle={3DV}, year={2019} }

Multi-Channel Attention Selection GAN with Cascaded Semantic Guidance for Cross-View Image Translation
Hao Tang*, Dan Xu*, Yan Yan, Jason Corso, Nicu Sebe

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019, Long Beach, CA, USA
​
(Oral Presentation, 5.5% acceptance rate) (*denotes equal contribution)

@inproceedings{hao2019multichannel, title={Multi-Channel Attention Selection GAN with Cascaded Semantic Guidance for Cross-View Image Translation}, author={Tang, Hao and Xu, Dan and Yan, Yan and Corso, Jason and Sebe, Nicu}, booktitle={CVPR}, year={2019} }

GestureGAN for Hand Gesture-to-Gesture Translation in the Wild
Hao Tang, Wei Wang, Dan Xu, Yan Yan, Nicu Sebe

ACM International Conference on Multimedia (ACM MM), 2018, Seoul, Korea
​
(Oral Presentation, 4.85% acceptance rate, Best Paper Candidate (4/757)!!!)

@inproceedings{hao2018Geaturegan, title={GestureGAN for Hand Gesture-to-Gesture Translation in the Wild}, author={Tang, Hao and Wang, Wei and Xu, Dan and Yan, Yan and Sebe, Nicu}, booktitle={ACM MM}, year={2018} }

Unsupervised Adversarial Depth Estimation using Cycled Generative Networks
Andrea Pilzer*, Dan Xu*, Mihai Puscas*, Elisa Ricci, Nicu Sebe

International Conference on 3D Vision (3DV), 2018, Verona, Italy. (*denotes equal contribution)
​
@inproceedings{xu2018unsupervised, title={Unsupervised Adversarial Depth Estimation using Cycled Generative Networks}, author={Pilzer, Andrea and Xu, Dan and Puscas, Mihai and Ricci, Elisa and Sebe, Nicu}, booktitle={3DV}, year={2018} }

PAD-Net: Multi-Tasks Guided Prediction-and-Distillation Network for Simultaneous Depth Estimation and Scene Parsing
Dan Xu, Wanli Ouyang, Xiaogang Wang, Nicu Sebe

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, Salt Lake City, USA
​
@inproceedings{xu2018PAD-Net, title={PAD-Net: Multi-Tasks Guided Prediction-and-Distillation Network for Simultaneous Depth Estimation and Scene Parsing}, author={Xu, Dan and Ouyang, Wanli and Wang, Xiaogang and Sebe, Nicu}, booktitle={CVPR}, year={2018} }

Structured Attention Guided Convolutional Neural Fields for Monocular Depth Estimation
Dan Xu, Wei Wang, Hao Tang, Hong Liu, Nicu Sebe, Elisa Ricci

​IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, Salt Lake City, USA
(Spotlight Presentation, 6.7% acceptance rate)

@inproceedings{xu2018structured, title={Structured Attention Guided Convolutional Neural Fields for Monocular Depth Estimation}, author={Xu, Dan and Wang, Wei and Tang, Hao and Liu, Hong and Sebe, Nicu and Ricci, Elisa}, booktitle={CVPR}, year={2018} }

Group Consistent Similarity Learning via Deep CRFs for Person Re-Identification
Dapeng Chen, Dan Xu, Hongsheng Li, Nicu Sebe, Xiaogang Wang
​IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, Salt Lake City, USA
(Oral Presentation, 2.1% acceptance rate)

@inproceedings{xu2018group, title={Group Consistent Similarity Learning via Deep CRFs for Person Re-Identification}, author={Chen, Dapeng and Xu, Dan and Li, Hongsheng and Sebe, Nicu and Wang, Xiaogang}, booktitle={CVPR}, year={2018} }

Every Smile is Unique: Landmark-Guided Diverse Smile Generation
Wei Wang, Xavier Alameda-Pineda, Dan Xu, Pascal Fua, Elisa Ricci, Nicu Sebe
​
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, Salt Lake City, USA

@inproceedings{wang2018Every, title={Every Smile is Unique: Landmark-Guided Diverse Smile Generation}, author={Wang, Wei and Alameda-Pineda, Xavier and Xu, Dan and Fua, Pascal and Ricci, Elisa and Sebe, Nicu}, booktitle={CVPR}, year={2018} }

Learning Deep Structured Multi-Scale Features using Attention-Gated CRFs for Contour Prediction
Dan Xu, Wanli Ouyang, Xavier Alameda-Pineda, Elisa Ricci, Xiaogang Wang, Nicu Sebe
The Thirty-first Annual Conference on Neural Information Processing Systems (NIPS), 2017, Long Beach, USA

@inproceedings{xu2017learning, title={Learning Deep Structured Multi-Scale Features using Attention-Gated CRFs for Contour Prediction}, author={Xu, Dan and Ouyang, Wanli and Alameda-Pineda, Xavier and Ricci, Elisa and Wang, Xiaogang and Sebe, Nicu}, booktitle={NIPS}, year={2017} }

Multi-Scale Continuous CRFs as Sequential Deep Networks for Monocular Depth Estimation
Dan Xu, Elisa Ricci, Wanli Ouyang, Xiaogang Wang, Nicu Sebe
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, Hawaii, USA
(Spotlight Presentation, 8% acceptance rate)

@inproceedings{xu2017multi, title={Multi-Scale Continuous CRFs as Sequential Deep Networks for Monocular Depth Estimation}, author={Xu, Dan and Ricci, Elisa and Ouyang, Wanli and Wang, Xiaogang and Sebe, Nicu}, booktitle={CVPR}, year={2017} }

Learning Cross-Modal Deep Representations for Robust Pedestrian Detection
Dan Xu, Wanli Ouyang, Elisa Ricci, Xiaogang Wang, Nicu Sebe
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, Hawaii, USA

@inproceedings{xu2017crossmodal, title={Learning Cross-Modal Deep Representations for Robust Pedestrian Detection}, author={Xu, Dan and Ouyang, Wanli and Ricci, Elisa and Wang, Xiaogang and Sebe, Nicu}, booktitle={CVPR}, year={2017} }

Viraliency: Pooling Local Virality 
Xavier Alameda-Pineda, Andrea Pilzer, Dan Xu, Elisa Ricci, Nicu Sebe
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, Hawaii, USA

@inproceedings{alameda2017viraliency, title={Viraliency: Pooling Local Virality}, author={Alameda-Pineda, Xavier and Pilzer, Andrea and Xu, Dan and Ricci, Elisa and Sebe, Nicu}, booktitle={CVPR}, year={2017} }

Multi-Paced Dictionary Learning for Cross-Domain Retrieval and Recognition
Dan Xu, Jingkuan Song, Xavier Alameda-Pineda, Elisa Ricci, Nicu Sebe
23rd International Conference on Pattern Recognition (ICPR), 2016, Cancun, Mexico
(Oral Presentation, Best Scientific Paper Award!!!)

@inproceedings{xu2016multipaced, title={Multi-Paced Dictionary Learning for Cross-Domain Retrieval and Recognition}, author={Xu, Dan and Song, Jingkuan and Alameda-Pineda, Xavier and Ricci, Elisa and Sebe, Nicu}, booktitle={23rd International Conference on Pattern Recognition (ICPR)}, year={2016} }

Academic Coupled Dictionary Learning for Sketch-based Image Retrieval
Dan Xu, Xavier Alameda-Pineda, Jingkuan Song, Elisa Ricci, Nicu Sebe
ACM International Conference on Multimedia (ACM MM), 2016, Amsterdam, The Netherlands
(Oral Presentation)

@inproceedings{xu2016academic, title={Academic Coupled Dictionary Learning for Sketch-based Image Retrieval}, author={Xu, Dan and Alameda-Pineda, Xavier and Song, Jingkuan and Ricci, Elisa and Sebe, Nicu}, booktitle={Proceedings of the 2016 ACM on Multimedia Conference}, pages={1326--1335}, year={2016}, organization={ACM} }

Learning Deep Representations of Appearance and Motion for Anomalous Event Detection
Dan Xu, Elisa Ricci, Yan Yan, Jingkuan Song, Nicu Sebe
British Machine Vision Conference (BMVC), 2015, Swansea, UK
(Oral Presentation, 7% acceptance rate)

@inproceedings{xu2015learning, title={Learning deep representations of appearance and motion for anomalous event detection}, author={Xu, Dan and Ricci, Elisa and Yan, Yan and Song, Jingkuan and Sebe, Nicu}, booktitle={British Machine Vision Conference (BMVC)}, year={2015} }

A Novel Hand Posture Recognition System Based on Sparse Representation Using Color and Depth Images
Dan Xu, Yen-Lun Chen, Xinyu Wu, Wei Feng, Huihuan Qian and Yangsheng Xu
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013, Tokyo, Japan
(Oral Presentation)

@inproceedings{xu2013novel, title={A novel hand posture recognition system based on sparse representation using color and depth images}, author={Xu, Dan and Chen, Yen-Lun and Wu, Xinyu and Feng, Wei and Qian, Huihuan and Xu, Yangsheng}, booktitle={2013 IEEE/RSJ International Conference on Intelligent Robots and Systems}, pages={3765--3770}, year={2013}, organization={IEEE} }


Journal:

[2024]-[2023]-[2022]-[2021]-[2020]-[2019]-[2018]-[2017]

MCTformer: Multi-Class Token Transformer for Weakly Supervised Semantic Segmentation
Lian Xu, Mohammed Bennamoun, Farid Boussaid, Hamid Laga, Wanli Ouyang, Dan Xu
IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), 2024. (in press) [PDF]

InvPT++: Inverted Pyramid Multi-Task Transformer for Visual Scene Understanding
Hanrong Ye, Dan Xu
IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), 2024. (in press) [PDF]

DaGAN++: Depth-Aware Generative Adversarial Network for Talking Head Video Generation
Fa-Ting Hong, Li Shen, Dan Xu
IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), 2023. (in press) [PDF]

Reducing Spatial Labeling Redundancy for Semi-supervised Crowd Counting
Yongtuo Liu, Sucheng Ren, Liangyu Chai, Hanjie Wu, Jing Qin, Dan Xu, Shengfeng He
IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), 2023. (in press) [PDF]

Uncertainty-aware Contrastive Distillation for Incremental Semantic Segmentation
Guanglei Yang, Enrico Fini, Dan Xu, Paolo Rota, Xavier Alameda-Pineda, Elisa Ricci
IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), 2022. (in press) [PDF]

Probabilistic Graph Attention Networks for Pixel-Wise Dense Prediction
Dan Xu, Xavier Alameda-Pineda, Wanli Ouyang, Elisa Ricci, Xiaogang Wang, Nicu Sebe
IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), 2021. (in press) [PDF]

Progressive Fusion for Unsupervised Binocular Depth Estimation using Cycled Networks
Andrea Pilzer, Stephane Lathuiliere, Dan Xu, Mihai Puscas, Elisa Ricci, Nicu Sebe
IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), 2019. (in press) [PDF]

Monocular Depth Estimation using Multi-Scale Continuous CRFs as Sequential Deep Networks
Dan Xu, Wanli Ouyang, Elisa Ricci, Xiaogang Wang, Nicu Sebe
IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), 2018. [PDF]

Learning How to Smile: Expression Video Generation with Conditional Adversarial Recurrent Nets
Wei Wang, Xavier Alameda-Pineda, Dan Xu, Elisa Ricci, Nicu Sebe
IEEE Transactions on Multimedia (T-MM), 2019. [PDF]

Cross-Paced Representation Learning with Partial Curricula for Sketch-based Image Retrieval
Dan Xu, Xavier Alameda-Pineda, Jingkuan Song, Elisa Ricci, Nicu Sebe
IEEE Transactions on Image Processing (T-IP), 2017. [PDF]

Supervised Local Descriptor Learning for Human Action Recognition
Xiantong Zhen, Feng Zheng, Ling Shao, Xianbin Cao, Dan Xu
IEEE Transactions on Multimedia (T-MM), 2017. [PDF]

Detecting Anomalous Events in Videos by Learning Deep Representations of Appearance and Motion
Dan Xu, Yan Yan, Elisa Ricci, Nicu Sebe
Computer Vision and Image Understanding (CVIU), 2017. [PDF]

© 2015 by Dan Xu. All Rights Reserved. Last Modified: 08/07/2015