An important problem in machine learning and computational statistics is to sample from an intractable target distribution, for instance in order to compute functionals (expectations, normalizing constants) of this distribution. This sampling problem can be cast as the optimization of a dissimilarity functional, seen as a loss, over the space of probability measures. In particular, one can leverage the geometry of optimal transport and consider Wasserstein gradient flows of the loss functional, which define continuous paths of probability distributions decreasing this loss. Different algorithms to approximate the target distribution result from the choice of the loss and of a time and space discretization; in practice, they amount to simulating interacting particle systems. Motivated in particular by two machine learning applications, namely Bayesian inference and the optimization of shallow neural networks, we will present recent convergence results obtained for algorithms derived from Wasserstein gradient flows.
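
As an illustration (not part of the abstract itself), the sketch below shows one well-known interacting particle scheme of this family: Stein variational gradient descent (SVGD), which can be interpreted as a kernelized discretization of the Wasserstein gradient flow of the KL divergence to the target. The toy target (a standard Gaussian through `grad_log_target`), the RBF kernel, the step size, and the bandwidth are illustrative assumptions, not choices made in the abstract.

```python
import numpy as np

def grad_log_target(x):
    # Toy target: standard Gaussian, so grad log pi(x) = -x.
    return -x

def svgd_step(particles, step_size=0.1, bandwidth=1.0):
    """One step of SVGD: each particle is moved by a kernel-weighted average
    of the target's score (attraction) plus a kernel-gradient term (repulsion)."""
    n, _ = particles.shape
    diffs = particles[:, None, :] - particles[None, :, :]   # (n, n, d), x_i - x_j
    sq_dists = np.sum(diffs ** 2, axis=-1)                  # (n, n)
    kernel = np.exp(-sq_dists / (2.0 * bandwidth))          # RBF kernel k(x_j, x_i)
    # Attractive term: kernel-weighted scores drive particles toward high density.
    drift = kernel @ grad_log_target(particles)             # (n, d)
    # Repulsive term: sum over j of grad_{x_j} k(x_j, x_i), keeps particles spread out.
    repulsion = np.sum(kernel[:, :, None] * diffs, axis=1) / bandwidth
    return particles + step_size * (drift + repulsion) / n

# Usage: particles initialized far from the target progressively approximate it.
rng = np.random.default_rng(0)
particles = rng.normal(loc=5.0, scale=1.0, size=(200, 2))
for _ in range(500):
    particles = svgd_step(particles)
```

Replacing the loss (e.g. KL divergence vs. maximum mean discrepancy) or the discretization (e.g. adding Gaussian noise to recover Langevin-type schemes) yields different particle algorithms of the same general form.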