J. Baker, P. Fearnhead, E. B. Fox, and C. Nemeth, sgmcmc: An R package for stochastic gradient Markov chain Monte Carlo, 2017.

T. Birdal, Probabilistic permutation synchronization using the Riemannian structure of the Birkhoff polytope, CVPR, 2019.

T. Birdal, U. Şimşekli, M. O. Eken, and S. Ilic, Bayesian pose graph optimization via Bingham distributions and tempered geodesic MCMC, NeurIPS, pp.308-319, 2018.

C. Çelik and M. Duman, Crank-Nicolson method for the fractional diffusion equation with the Riesz fractional derivative, Journal of Computational Physics, vol.231, issue.4, pp.1743-1750, 2012.

J. M. Chambers, C. L. Mallows, and B. W. Stuck, A method for simulating stable random variables, Journal of the American Statistical Association, vol.71, issue.354, pp.340-344, 1976.

C. Chen, N. Ding, and L. Carin, On the convergence of stochastic gradient MCMC algorithms with high-order integrators, Advances in Neural Information Processing Systems, pp.2269-2277, 2015.

U. Şimşekli, R. Badeau, A. T. Cemgil, and G. Richard, Stochastic quasi-Newton Langevin Monte Carlo, ICML, 2016.

U. Şimşekli, C. Yildiz, T. H. Nguyen, A. T. Cemgil, and G. Richard, Asynchronous stochastic quasi-Newton MCMC for non-convex optimization, ICML, pp.4674-4683, 2018.

U. Şimşekli, L. Sagun, and M. Gürbüzbalaban, A tail-index analysis of stochastic gradient noise in deep neural networks, ICML, 2019.

A. S. Dalalyan, Further and stronger analogy between sampling and optimization: Langevin Monte Carlo and gradient descent, Proceedings of the 2017 Conference on Learning Theory, 2017.

A. S. Dalalyan, Theoretical guarantees for approximate sampling from smooth and log-concave densities, Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol.79, issue.3, pp.651-676, 2017.

A. Durmus and E. Moulines, Non-asymptotic convergence analysis for the unadjusted Langevin algorithm, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01176132

A. Durmus and E. Moulines, High-dimensional Bayesian inference via the unadjusted Langevin algorithm, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01304430

A. Durmus, U. Şimşekli, E. Moulines, R. Badeau, and G. Richard, Stochastic gradient Richardson-Romberg Markov Chain Monte Carlo, NIPS, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01354064

M. A. Erdogdu, L. Mackey, and O. Shamir, Global nonconvex optimization with discretized diffusions, Advances in Neural Information Processing Systems, pp.9693-9702, 2018.

J. Gairing, M. Högele, and T. Kosenkova, Transportation distances and noise sensitivity of multiplicative Lévy SDE with applications, Stochastic Processes and their Applications, vol.128, pp.2153-2178, 2018.

X. Gao, M. Gürbüzbalaban, and L. Zhu, Breaking reversibility accelerates Langevin dynamics for global nonconvex optimization, 2018.

X. Gao, M. Gürbüzbalaban, and L. Zhu, Global convergence of stochastic gradient Hamiltonian Monte Carlo for non-convex stochastic optimization: Non-asymptotic performance bounds and momentum-based acceleration, 2018.

S. B. Gelfand and S. K. Mitter, Recursive stochastic algorithms for global optimization in R^d, SIAM Journal on Control and Optimization, vol.29, issue.5, pp.999-1018, 1991.

C. Hwang, Laplace's method revisited: weak convergence of probability measures, The Annals of Probability, pp.1177-1182, 1980.

S. Jastrzebski, Z. Kenton, D. Arpit, N. Ballas, A. Fischer et al., Three factors influencing minima in SGD, 2017.

D. Lamberton and G. Pagès, Recursive computation of the invariant distribution of a diffusion: the case of a weakly mean reverting drift, Stochastics and Dynamics, vol.3, issue.04, pp.435-451, 2003.
URL : https://hal.archives-ouvertes.fr/hal-00104799

P. Lévy, Théorie de l'addition des variables aléatoires, Gauthier-Villars, 1937.

Y. A. Ma, T. Chen, and E. Fox, A complete recipe for stochastic gradient MCMC, Advances in Neural Information Processing Systems, pp.2899-2907, 2015.

H. Masuda, Ergodicity and exponential β-mixing bounds for multidimensional diffusions with jumps, Stochastic Processes and their Applications, vol.117, pp.35-56, 2007.

R. Mikulevičius and C. Zhang, On the rate of convergence of weak Euler approximation for nondegenerate SDEs driven by Lévy processes, Stochastic Processes and their Applications, vol.121, pp.1720-1748, 2011.

M. D. Ortigueira, Riesz potential operators and inverses via fractional centred derivatives, International Journal of Mathematics and Mathematical Sciences, 2006.

M. D. Ortigueira, T. M. Laleg-Kirati, and J. A. Machado, Riesz potential versus fractional Laplacian, Journal of Statistical Mechanics, issue.09, 2014.

F. Panloup, Recursive computation of the invariant measure of a stochastic differential equation driven by a Lévy process, The Annals of Applied Probability, vol.18, issue.2, pp.379-426, 2008.

I. Pavlyukevich, Cooling down Lévy flights, Journal of Physics A: Mathematical and Theoretical, vol.40, issue.41, p.12299, 2007.

Y. Polyanskiy and Y. Wu, Wasserstein continuity of entropy and outer bounds for interference channels, IEEE Transactions on Information Theory, vol.62, issue.7, pp.3992-4002, 2016.

M. Raginsky, A. Rakhlin, and M. Telgarsky, Non-convex learning via stochastic gradient Langevin dynamics: a nonasymptotic analysis, Proceedings of the 2017 Conference on Learning Theory, vol.65, pp.1674-1703, 2017.

G. O. Roberts and O. Stramer, Langevin Diffusions and Metropolis-Hastings Algorithms, Methodology and Computing in Applied Probability, vol.4, issue.4, pp.337-357, 2002.

G. Samorodnitsky and M. S. Taqqu, Stable non-Gaussian random processes: stochastic models with infinite variance, vol.1, 1994.

U. Şimşekli, Fractional Langevin Monte Carlo: Exploring Lévy driven stochastic differential equations for Markov chain Monte Carlo, ICML, pp.3200-3209, 2017.

B. Tzen, T. Liang, and M. Raginsky, Local optimality and generalization guarantees for the Langevin algorithm via empirical metastability, Proceedings of the 2018 Conference on Learning Theory, 2018.

M. Welling and Y. W. Teh, Bayesian learning via stochastic gradient Langevin dynamics, International Conference on Machine Learning, pp.681-688, 2011.

L. Xie and X. Zhang, Ergodicity of stochastic differential equations with jumps and singular coefficients, 2017.

P. Xu, J. Chen, D. Zou, and Q. Gu, Global convergence of Langevin dynamics based algorithms for nonconvex optimization, Advances in Neural Information Processing Systems, pp.3125-3136, 2018.

V. V. Yanovsky, A. V. Chechkin, D. Schertzer, and A. V. Tur, Lévy anomalous diffusion and fractional Fokker-Planck equation, Physica A: Statistical Mechanics and its Applications, vol.282, issue.1, pp.13-34, 2000.

N. Ye and Z. Zhu, Stochastic fractional Hamiltonian Monte Carlo, Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI-18, vol.7, pp.3019-3025, 2018.

Y. Zhang, P. Liang, and M. Charikar, A hitting time analysis of stochastic gradient Langevin dynamics, Proceedings of the 2017 Conference on Learning Theory, vol.65, pp.1980-2022, 2017.