Distributional Preference Alignment of Large Language Models via Optimal Transport

Abstract

Current LLM alignment techniques use pairwise human preferences at the sample level, and as such they do not imply alignment at the distributional level. In this paper we propose Alignment via Optimal Transport (AOT), a novel method for distributional preference alignment of LLMs. AOT aligns LLMs on unpaired preference data by making the reward distribution of the positive samples first-order stochastically dominant over the reward distribution of the negative samples. We introduce a convex relaxation of this first-order stochastic dominance and cast it as an optimal transport problem with a smooth and convex cost. Thanks to the one-dimensional nature of the resulting optimal transport problem and the convexity of the cost, it has a closed-form solution via sorting on empirical measures. We fine-tune LLMs with this AOT objective, which enables alignment by penalizing violations of the stochastic dominance of the positive samples' reward distribution over that of the negative samples. We analyze the sample complexity of AOT by considering the dual of the OT problem and show that it converges at the parametric rate. Empirically, we show on a diverse set of alignment datasets and LLMs that AOT leads to state-of-the-art models in the 7B family of models when evaluated with Open LLM Benchmarks and AlpacaEval. We will also cover how these ideas extend to multivariate stochastic dominance, which is crucial for the multi-reward setting in the context of LLMs.
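
To make the "closed-form solution via sorting" concrete, here is a minimal NumPy sketch of the idea, not the speaker's implementation. It assumes equal numbers of positive and negative samples, scalar rewards that have already been computed, and a logistic surrogate as the smooth convex cost. The function name `aot_unpaired_loss` and the parameter `beta` are hypothetical and chosen only for illustration.

```python
import numpy as np

def aot_unpaired_loss(pos_rewards, neg_rewards, beta=1.0):
    """Sketch of a sorting-based penalty on stochastic dominance violations.

    Assumptions (not from the abstract): equal sample sizes, a logistic
    surrogate cost, and a hypothetical sharpness parameter `beta`.
    """
    # Sorting yields the empirical quantiles of each reward distribution.
    pos = np.sort(np.asarray(pos_rewards, dtype=float))
    neg = np.sort(np.asarray(neg_rewards, dtype=float))
    if pos.shape != neg.shape:
        raise ValueError("this sketch assumes equally many positive and negative samples")

    # In one dimension with a convex cost, matching the i-th smallest positive
    # reward to the i-th smallest negative reward is the optimal coupling.
    gaps = pos - neg

    # Logistic cost log(1 + exp(-beta * gap)), evaluated stably. It is small
    # when a positive quantile exceeds the matching negative quantile and
    # grows when the dominance is violated.
    return float(np.mean(np.logaddexp(0.0, -beta * gaps)))

# The samples are unpaired: only the sorted values matter, not their order.
dominant = aot_unpaired_loss([3.0, 1.0, 2.0], [0.0, -1.0, 1.0])
violated = aot_unpaired_loss([0.0, -1.0, 1.0], [3.0, 1.0, 2.0])
print(dominant < violated)
```

In actual fine-tuning the rewards would depend on the model being trained, and the loss would be written in an autodiff framework so that gradients flow through the sorted values; the sketch only shows the quantile-matching step.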

Date
2024, Dec 13 9:00 AM PST
Event
KI Seminar
Location
ESB4133 and online (Zoom)
Registration
Sign up for the mailing list to receive the connection details

Speaker Biography

Youssef Mroueh is a Principal Research Scientist at IBM Research in the Human Centered Trustworthy AI department. He received his PhD in computer science from MIT CSAIL in February 2015, where he was advised by Professor Tomaso Poggio. In 2011, he obtained his engineering diploma from Ecole Polytechnique, Paris, France, and a Master of Science in Applied Mathematics from Ecole des Mines de Paris. His research interests include optimal transport, trustworthy ML, LLMs, statistical learning theory, scientific ML, and AI for social good.