Ligong Han 韩立功
I am a Research Scientist at the MIT-IBM Watson AI Lab, working on Generative AI with a focus on controllable and precise generation, particularly in diffusion models and LLMs. I obtained my PhD in Computer Science from Rutgers University in 2024, advised by Prof. Dimitris Metaxas. During my PhD, I spent time as a research intern at Google Research, the MIT-IBM Watson AI Lab, Snap Research, NEC Labs America, Tencent, and the Robotics Institute.
Previously, I earned my master's degree from Carnegie Mellon University and my bachelor's from Chien-Shiung Wu College, Southeast University.
Email: lastnamefirstname [at] gmail [dot] com or firstname.lastname [at] rutgers [dot] edu
Email / CV / Google Scholar / Github / LinkedIn / Twitter
10-2024: One paper accepted to WACV-2025!
10-2024: One paper accepted to NeurIPS-2024!
02-2024: One paper accepted to CVPR-2024!
10-2023: Two papers accepted to WACV-2024!
09-2023: One paper accepted to NeurIPS-2023!
07-2023: One paper accepted to ICCV-2023!
06-2023: One paper accepted to MICCAI-2023!
06-2023: One paper accepted to TMLR!
03-2023: Our paper Constructive Assimilation was accepted at GCV-2023.
02-2023: Two papers accepted to CVPR-2023!
Research
Selected publications are highlighted. (* equal contribution, † corresponding author)
🎲 DICE: Discrete Inversion Enabling Controllable Editing for Multinomial Diffusion and Masked Generative Models
Xiaoxiao He, Ligong Han†, Quan Dao, Song Wen, Minhao Bai, Di Liu, Han Zhang, Martin Renqiang Min, Juefei Xu, Chaowei Tan, Bo Liu, Kang Li, Hongdong Li, Junzhou Huang, Faez Ahmed, Akash Srivastava, Dimitris Metaxas.
arXiv, 2024
[arXiv] 
[Project Page] 
[bibtex]
TLDR: Discrete diffusion inversion for precise and flexible content editing by recording noise sequences and masking patterns during the reverse process.
APEER: Automatic Prompt Engineering Enhances Large Language Model Reranking
Can Jin*, Hongwu Peng*, Shiyu Zhao, Zhenting Wang, Wujiang Xu, Ligong Han, Jiahui Zhao, Kai Zhong, Sanguthevar Rajasekaran, Dimitris Metaxas.
arXiv, 2024
[arXiv] 
TLDR: An automatic prompt engineering algorithm for LLM-based relevance ranking in IR, significantly reducing human effort and outperforming existing manual prompts.
🥤 Spectrum-Aware Parameter Efficient Fine-Tuning
Xinxi Zhang*, Song Wen*, Ligong Han*†, Juefei Xu, Akash Srivastava, Junzhou Huang, Hao Wang, Molei Tao, Dimitris Metaxas.
Accepted at Winter Conference on Applications of Computer Vision (WACV), 2025
[arXiv] 
[Github] 
[bibtex]
TLDR: A framework for parameter-efficient fine-tuning by adjusting both singular values and their basis vectors, balancing computational efficiency and representation capacity.
ProxEdit: Improving Tuning-Free Real Image Editing with Proximal Guidance
Ligong Han†, Song Wen, Qi Chen, Zhixing Zhang, Kunpeng Song, Mengwei Ren, Ruijiang Gao, Yuxiao Chen, Di Liu, Qilong Zhangli, Anastasis Stathopoulos, Jindong Jiang, Zhaoyang Xia, Akash Srivastava, Dimitris Metaxas.
Accepted at Winter Conference on Applications of Computer Vision (WACV), 2024
[arXiv] 
[poster] 
[Github] 
[bibtex]
TLDR: We introduce proximal guidance to enhance diffusion-based, tuning-free real image editing in two frameworks: Negative Prompt Inversion and Mutual Self-Attention Control. Our algorithms, ProxNPI and ProxMasaCtrl, overcome the limitations of each framework and achieve high-quality editing with computational efficiency.
Constructive Assimilation: Boosting Contrastive Learning Performance through View Generation Strategies
Ligong Han†, Seungwook Han, Shivchander Sudalairaj, Charlotte Loh, Rumen Dangovski, Fei Deng, Pulkit Agrawal, Dimitris Metaxas, Leonid Karlinsky, Tsui-Wei Weng, Akash Srivastava.
Accepted to Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2023
[arXiv] 
[poster] 
[Github] 
[bibtex]
TLDR: This study proposes a method to assimilate generated views with expert transformations in contrastive learning, improving the state of the art by up to 3.6% on three datasets, and provides a comprehensive analysis of various view generation and assimilation methods.
Show Me What and Tell Me How: Video Synthesis via Multimodal Conditioning
Ligong Han†, Jian Ren, Hsin-Ying Lee, Francesco Barbieri, Kyle Olszewski, Shervin Minaee, Dimitris Metaxas, Sergey Tulyakov.
Accepted at Conference on Computer Vision and Pattern Recognition (CVPR), 2022
[arXiv] 
[poster] 
[Github] 
[Project Page] 
[bibtex]
TLDR: The paper presents a multimodal video generation framework using a bidirectional transformer and improved techniques to generate high-quality, diverse video sequences, achieving state-of-the-art results on four datasets.
Enhancing Counterfactual Classification via Self-Training
Ruijiang Gao, Max Biggs, Wei Sun, Ligong Han.
Accepted to AAAI Conference on Artificial Intelligence (AAAI), 2022
[arXiv] 
[Github] 
[bibtex]
TLDR: The paper treats partial feedback in settings such as pricing, online marketing, and precision medicine as a domain adaptation problem, proposes a Counterfactual Self-Training (CST) algorithm based on pseudolabeling, and demonstrates its effectiveness on both synthetic and real datasets.
Unbiased Auxiliary Classifier GANs with MINE
Ligong Han†, Anastasis Stathopoulos, Tao Xue, Dimitris Metaxas.
Accepted to Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2020. DeepMind Travel Award.
[arXiv] 
[Github] 
[bibtex]
TLDR: We propose Unbiased Auxiliary GANs (UAC-GAN) that leverage the Mutual Information Neural Estimator (MINE) and a novel projection-based statistics network architecture to address the biased distribution issue in AC-GANs, resulting in improved performance on three datasets.
[2015]
MATLAB Code for Axis Label Alignment in 3D Plots
[File Exchange] 
[Github]
Aligns axis labels in parallel with their axes in MATLAB 3-D plots. This file was selected as a MATLAB Central Pick of the Week.