Publications

DECILE research team members indicated in boldface

2026

UniPROT: Uniform Prototype Selection via Partial Optimal Transport with Submodular Guarantees

Prateek Chanda, Prayas Agrawal, Karthik S. Gurumoorthy, Ganesh Ramakrishnan, Bamdev Mishra, Pratik Jawanpuria (AISTATS 2026)

Sculpting Subspaces: Constrained Full Fine-Tuning in LLMs for Continual Learning (https://openreview.net/forum?id=vQcyqsGJDw)

Nikhil Shivakumar Nayak, Krishnateja Killamsetty, Ligong Han, Abhishek Bhandwaldar, Prateek Chanda, Kai Xu, Oleg Silkin, Mustafa Eyceoz, Hao Wang, Aldo Pareja, Akash Srivastava (ICLR 2026)

2025

FairPO: Fair Preference Optimization for Multi-Label Learning

Soumen Kumar Mondal, Akshit Varmora, Prateek Chanda, Ganesh Ramakrishnan

In the OPT 2025 Workshop at the Thirty-Ninth Annual Conference on Neural Information Processing Systems (NeurIPS 2025)

Unified Wisdom: Harnessing Collaborative Learning to Improve Efficacy of Knowledge Distillation

Durga S, Atharva Abhijit Tambat, Ganesh Ramakrishnan, Pradeep Shenoy

In Transactions on Machine Learning Research (TMLR 2025)

Bandit Guided Submodular Curriculum for Adaptive Subset Selection

Prateek Chanda, Prayas Agrawal, Saral Sureka, Lokesh Reddy Polu, Atharv Kshirsagar, Ganesh Ramakrishnan

In Proceedings of the Thirty-Ninth Annual Conference on Neural Information Processing Systems (NeurIPS 2025)

2024

Bayesian Coreset Optimization for Personalized Federated Learning

Prateek Chanda, Shrey Modi, Ganesh Ramakrishnan

International Conference on Learning Representations (ICLR) 2024

SMART: Submodular Data Mixture Strategy for Instruction Tuning

H S V N S Kowndinya Renduchintala, Sumit Bhatia, Ganesh Ramakrishnan

In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024 - Findings)

DictDis: Dictionary Constrained Disambiguation for Improved NMT

Ayush Maheshwari, Preethi Jyothi, Ganesh Ramakrishnan

In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP 2024 - Findings)

Speeding up NAS with Adaptive Subset Selection

Vishak Prasad C, Colin White, Paarth Jain, Sibasis Nayak, Ganesh Ramakrishnan

In Proceedings of the 2024 International Conference on Automated Machine Learning (AutoML 2024)

2023

AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient Hyper-parameter Tuning

Krishnateja Killamsetty, Guttu Sai Abhishek, Aakriti, Ganesh Ramakrishnan, Alexandre V. Evfimievski, Lucian Popa, Rishabh K. Iyer

In Proceedings of the Thirty-sixth Conference on Neural Information Processing Systems (NeurIPS 2022)

INGENIOUS: Using Informative Data Subsets for Efficient Pre-Training of Large Language Models

HSVNS Kowndinya Renduchintala, Krishnateja Killamsetty, Sumit Bhatia, Milan Aggarwal, Ganesh Ramakrishnan, Rishabh Iyer, Balaji Krishnamurthy

In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023 - Findings), long paper

2021

GLISTER: Generalization based Data Subset Selection for Efficient and Robust Learning

Krishnateja Killamsetty, Durga Sivasubramanian, Ganesh Ramakrishnan, Rishabh Iyer

35th AAAI Conference on Artificial Intelligence, AAAI 2021

2020

Data Programming using Continuous and Quality-Guided Labeling Functions

Oishik Chatterjee, Ganesh Ramakrishnan, Sunita Sarawagi

In Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI 2020), New York, USA

Coresets for Robust Training of Deep Neural Networks against Noisy Labels

Baharan Mirzasoleiman, Kaidi Cao, Jure Leskovec

In Proc. Advances in Neural Information Processing Systems (NeurIPS), 2020

Coresets for Data-efficient Training of Machine Learning Models

Baharan Mirzasoleiman, Jeff Bilmes, Jure Leskovec

International Conference on Machine Learning (ICML), July 2020

2019

Learning From Less Data: A Unified Data Subset Selection and Active Learning Framework for Computer Vision

Vishal Kaushal, Rishabh Iyer, Suraj Kothawade, Rohan Mahadev, Khoshrav Doctor, Ganesh Ramakrishnan

7th IEEE Winter Conference on Applications of Computer Vision (WACV 2019), Hawaii, USA

2015

Submodularity in data subset selection and active learning

Kai Wei, Rishabh Iyer, Jeff Bilmes

International Conference on Machine Learning (ICML) 2015

2014

Fast multi-stage submodular maximization

Kai Wei, Rishabh K. Iyer, Jeff A. Bilmes

International Conference on Machine Learning (ICML 2014)

Submodular subset selection for large-scale speech training data

Kai Wei et al.

IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2014)

For a complete list of DECILE publications, please visit: