Ligong Han   韩立功

I'm currently a Ph.D. student at Rutgers University, supervised by Prof. Dimitris Metaxas. Prior to this, I obtained my master's at Carnegie Mellon University and my bachelor's from Chien-Shiung Wu College, Southeast University. I was a research intern at MIT-IBM Watson AI Lab, Google Research, Snap Research, NEC Labs America, and Tencent, and a visitor at the Robotics Institute.

I'm generally interested in generative models, large language models, self-supervised learning, machine learning, and medical image analysis.

Email: lastnamefirstname [at] gmail [dot] com or firstname.lastname [at] rutgers [dot] edu

Email  /  CV  /  Google Scholar  /  Github  /  LinkedIn  /  Twitter

profile photo
News
02-2024 One paper accepted to CVPR-2024!
10-2023 Two papers accepted to WACV-2024!
09-2023 One paper accepted to NeurIPS-2023!
07-2023 One paper accepted to ICCV-2023!
06-2023 One paper accepted to MICCAI-2023!
06-2023 One paper accepted to TMLR!
06-2023 Preprint of our new work, ProxEdit, is out on arXiv.
03-2023 Our paper Constructive Assimilation is accepted at GCV-2023.
03-2023 Preprint of our new work, SVDiff, is out on arXiv.
02-2023 Two papers accepted to CVPR-2023!
Research

Selected publications are highlighted.

Score-Guided Diffusion for 3D Human Recovery
Anastasis Stathopoulos, Ligong Han, Dimitris Metaxas.
Accepted to Conference on Computer Vision and Pattern Recognition (CVPR), 2024
[arXiv]  [Project Page]  [Github]  [bibtex]

TLDR: Solving inverse problems for 3D human pose and shape reconstruction with score guidance in the latent space of a diffusion model.

ProxEdit: Improving Tuning-Free Real Image Editing with Proximal Guidance
Ligong Han, Song Wen, Qi Chen, Zhixing Zhang, Kunpeng Song, Mengwei Ren, Ruijiang Gao, Yuxiao Chen, Di Liu, Qilong Zhangli, Anastasis Stathopoulos, Jindong Jiang, Zhaoyang Xia, Akash Srivastava, Dimitris Metaxas.
Accepted at Winter Conference on Applications of Computer Vision (WACV), 2024
[arXiv]  [poster]  [Github]  [bibtex]

TLDR: We introduce proximal guidance to enhance diffusion-based, tuning-free real image editing in two frameworks: Negative Prompt Inversion and Mutual Self-Attention Control. The resulting algorithms, ProxNPI and ProxMasaCtrl, overcome limitations of each framework and achieve high-quality editing with computational efficiency.

On the Stability-Plasticity Dilemma in Continual Meta-Learning: Theory and Algorithm
Qi Chen, Changjian Shui, Ligong Han, Mario Marchand.
Accepted at Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS), 2023
[arXiv]  [poster]  [Github]  [bibtex]

TLDR: This paper presents a theoretical framework and a novel algorithm for Continual Meta-Learning (CML) that effectively balances stability to prevent forgetting previous tasks and plasticity for learning from new tasks.

SVDiff: Compact Parameter Space for Diffusion Fine-Tuning
Ligong Han, Yinxiao Li, Han Zhang, Peyman Milanfar, Dimitris Metaxas, Feng Yang.
Accepted at International Conference on Computer Vision (ICCV), 2023
[arXiv]  [Unofficial Code]  [PEFT-SVD]  [Project Page]  [poster]  [bibtex]

TLDR: This paper presents an approach for customizing text-to-image (T2I) diffusion models by fine-tuning the singular values of weight matrices, reducing overfitting and model storage, and introduces a text-based single-image editing framework and a data-augmentation technique for multi-subject generation.

DMCVR: Morphology-Guided Diffusion Model for 3D Cardiac Volume Reconstruction
Xiaoxiao He, Chaowei Tan, Ligong Han, Bo Liu, Leon Axel, Kang Li, Dimitris Metaxas.
Accepted at International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), 2023
[arXiv]  [Github]  [bibtex]

TLDR: We propose a morphology-guided diffusion model for 3D cardiac volume reconstruction that reconstructs volumes via interpolation in its latent space. The model outperforms strong baselines, including GAN- and DiffAE-based methods.

Constructive Assimilation: Boosting Contrastive Learning Performance through View Generation Strategies
Ligong Han, Seungwook Han, Shivchander Sudalairaj, Charlotte Loh, Rumen Dangovski, Fei Deng, Pulkit Agrawal, Dimitris Metaxas, Leonid Karlinsky, Tsui-Wei Weng, Akash Srivastava.
Accepted to Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2023
[arXiv]  [poster]  [Github]  [bibtex]

TLDR: This study proposes a method to assimilate generated views with expert transformations in contrastive learning, improving the state-of-the-art by up to 3.6% on three datasets and providing a comprehensive analysis of various view generation and assimilation methods.

SINE: SINgle Image Editing with Text-to-Image Diffusion Models
Zhixing Zhang, Ligong Han, Arnab Ghosh, Dimitris Metaxas, Jian Ren.
Accepted to Conference on Computer Vision and Pattern Recognition (CVPR), 2023
[arXiv]  [Github]  [Project Page]  [bibtex]

TLDR: This work proposes a model-based guidance technique for single-image editing using pre-trained diffusion models, addressing overfitting issues and enabling content creation with only one given image, while also introducing a patch-based fine-tuning method for generating images of arbitrary resolution.

Learning Articulated Shape with Keypoint Pseudo-labels from Web Images
Anastasis Stathopoulos, Georgios Pavlakos, Ligong Han, Dimitris Metaxas.
Accepted to Conference on Computer Vision and Pattern Recognition (CVPR), 2023
[arXiv]  [code & data]  [Project Page]  [bibtex]

TLDR: The paper introduces a method for monocular 3D reconstruction of articulated objects with minimal labeled data, using category-specific keypoint estimators and data selection to improve performance.

StyleGAN-Fusion: Diffusion Guided Domain Adaptation of Style-based Generators
Kunpeng Song, Ligong Han, Bingchen Liu, Dimitris Metaxas, Ahmed Elgammal.
Accepted at Winter Conference on Applications of Computer Vision (WACV), 2024
[arXiv]  [Github]  [Project Page]  [bibtex]

TLDR: This paper demonstrates the use of score distillation sampling as a critic to adapt GAN generators to new domains using text prompts, leveraging large-scale text-to-image diffusion models and achieving high quality and controllability in domain adaptation for both 2D and 3D image generation.

Show Me What and Tell Me How: Video Synthesis via Multimodal Conditioning
Ligong Han, Jian Ren, Hsin-Ying Lee, Francesco Barbieri, Kyle Olszewski, Shervin Minaee, Dimitris Metaxas, Sergey Tulyakov.
Accepted at Conference on Computer Vision and Pattern Recognition (CVPR), 2022
[arXiv]  [poster]  [Github]  [Project Page]  [bibtex]

TLDR: The paper presents a multimodal video generation framework using a bidirectional transformer and improved techniques to generate high-quality, diverse video sequences, achieving state-of-the-art results on four datasets.

AE-StyleGAN: Improved Training of Style-Based Auto-Encoders
Ligong Han*, Sri Harsha Musunuri*, Martin Renqiang Min, Ruijiang Gao, Yu Tian, Dimitris Metaxas (* equal contribution).
Accepted at Winter Conference on Applications of Computer Vision (WACV), 2022
[arXiv]  [poster]  [Github]  [bibtex]

TLDR: Training a style-based autoencoder end-to-end, resulting in a more disentangled latent space and improved image inversion and generation quality.

Enhancing Counterfactual Classification via Self-Training
Ruijiang Gao, Max Biggs, Wei Sun, Ligong Han.
Accepted to AAAI Conference on Artificial Intelligence (AAAI), 2022
[arXiv]  [Github]  [bibtex]

TLDR: The paper proposes a Counterfactual Self-Training (CST) algorithm that uses pseudolabeling to address the challenge of partial feedback in settings like pricing, online marketing, and precision medicine, treating it as a domain adaptation problem, and demonstrates its effectiveness on both synthetic and real datasets.

Hierarchically Self-supervised Transformer for Human Skeleton Representation Learning
Yuxiao Chen, Long Zhao, Jianbo Yuan, Yu Tian, Zhaoyang Xia, Shijie Geng, Ligong Han, Dimitris Metaxas.
Accepted to European Conference on Computer Vision (ECCV), 2022
[arXiv]  [Github]  [bibtex]

TLDR: The paper introduces a self-supervised hierarchical pre-training scheme with Hi-TRS, capturing multi-level dependencies and achieving state-of-the-art performance in skeleton-based tasks.

Disentangled Recurrent Wasserstein Autoencoder
Jun Han*, Martin Renqiang Min*, Ligong Han*, Li Erran Li, Xuan Zhang (* equal contribution).
Accepted to International Conference on Learning Representations (ICLR, Spotlight, scored among top 4%), 2021
[arXiv]  [Code]  [bibtex]

TLDR: R-WAE is a framework for unsupervised disentangled sequential representation learning, outperforming baselines in disentanglement and video generation by optimizing Wasserstein distance and mutual information.

Dual Projection Generative Adversarial Networks for Conditional Image Generation
Ligong Han, Martin Renqiang Min, Anastasis Stathopoulos, Yu Tian, Ruijiang Gao, Asim Kadav, Dimitris Metaxas.
Accepted to International Conference on Computer Vision (ICCV), 2021
[arXiv]  [poster]  [Github]  [bibtex]

TLDR: Dual Projection GAN (P2GAN) balances data matching and label matching in cGANs, improving class separability and sample quality.

Human-AI Collaboration with Bandit Feedback
Ruijiang Gao, Maytal Saar-Tsechansky, Maria De-Arteaga, Ligong Han, Min Kyung Lee, Matthew Lease.
Accepted to International Joint Conference on Artificial Intelligence (IJCAI), 2021
[arXiv]  [Github]  [bibtex]

TLDR: The paper introduces a novel human-machine collaboration approach in a bandit feedback setting that improves decision-making performance by exploiting human-machine complementarity and personalizing routing for multiple human decision-makers.

Robust Conditional GAN from Uncertainty-Aware Pairwise Comparisons
Ligong Han, Ruijiang Gao, Mun Kim, Xin Tao, Bo Liu, Dimitris Metaxas.
Accepted to AAAI Conference on Artificial Intelligence (AAAI), 2020
[arXiv]  [poster]  [Github]  [bibtex]

TLDR: PC-GAN is a novel generative adversarial network using weak supervision (via the proposed Elo rating network) from pairwise comparisons for image attribute editing, achieving robust performance and comparable results to fully-supervised methods.

Unbiased Auxiliary Classifier GANs with MINE
Ligong Han, Anastasis Stathopoulos, Tao Xue, Dimitris Metaxas.
Accepted to Conference on Computer Vision and Pattern Recognition Workshops (CVPRW, DeepMind Travel Award), 2020
[arXiv]  [Github]  [bibtex]

TLDR: We propose Unbiased Auxiliary GANs (UAC-GAN) that leverage the Mutual Information Neural Estimator (MINE) and a novel projection-based statistics network architecture to address the biased distribution issue in AC-GANs, resulting in improved performance on three datasets.

Unsupervised Domain Adaptation via Calibrating Uncertainties
Ligong Han, Yang Zou, Ruijiang Gao, Lezi Wang, Dimitris Metaxas.
Accepted to Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2019
[arXiv]  [Github]  [bibtex]

TLDR: We propose a Renyi entropy regularization (RER) framework for unsupervised domain adaptation, which adapts from source to target domain by calibrating predictive uncertainties using variational Bayes learning, and demonstrate its effectiveness on three domain-adaptation tasks.

Learning Generative Models of Tissue Organization with Supervised GANs
Ligong Han, Robert F. Murphy, Deva Ramanan.
Accepted to Winter Conference on Applications of Computer Vision (WACV), 2018
[arXiv]  [Github]  [bibtex]

TLDR: We propose two-stage and end-to-end supervised GAN approaches for generating realistic electron microscope images with densely annotated sub-cellular structures.

Misc
[2015]    MATLAB Code for Axis Label Alignment in 3D Plots
[File Exchange]  [Github]

Aligns axis labels parallel with their axes in MATLAB 3-D plots. This file was selected as a MATLAB Central Pick of the Week.

Website source from Jon Barron.