Academic Profile

Luis A. Ortega

Postdoctoral Researcher · Aalborg University

Probabilistic Machine Learning for uncertainty, inference, and generalization.

Function-space variational inference, linearized Laplace approximations, deep ensembles, and Chernoff-style generalization bounds.

Bayesian deep learning · Variational inference · Generalization bounds · Deep ensembles
10 Publications
5 Ongoing projects
4 Research lines

Current Position

Postdoctoral research at Aalborg University.

I am a Postdoctoral Researcher in the Department of Computer Science at Aalborg University, working in the Section for Distributed, Embedded and Intelligent Systems. My work focuses on making modern neural networks more reliable by combining probabilistic inference, uncertainty quantification, and generalization theory.

Research Scope

Post-hoc Bayesian inference

Bayesian uncertainty estimation for pre-trained neural networks through linearized Laplace, neural kernels, and scalable approximations.

Function-space inference

Gaussian and implicit-process models that place uncertainty directly over functions, including variational and flow-transformed constructions.

Generalization theory

PAC-Chernoff and large-deviation analyses of interpolation, regularization, and the implicit bias of stochastic optimization.

Ensembles and calibration

Theoretical and empirical study of diversity, model combination, and calibrated prediction in neural network ensembles.

Selected Work

AISTATS 2022

Diversity and Generalization in Neural Network Ensembles

Ensembles are widely used in machine learning and usually provide state-of-the-art performance in many prediction tasks. From the very beginning, the diversity of an ensemble has been identified as a key factor for the superior performance of these models. But the exact role that diversity plays in ensemble models is poorly understood, especially in the context of neural networks. In this work, we combine and expand previously published results in a theoretically sound framework that describes the relationship between diversity and ensemble performance for a wide range of ensemble methods. More precisely, we provide sound answers to the following questions: how to measure diversity, how diversity relates to the generalization error of an ensemble, and how diversity is promoted by neural network ensemble algorithms. This analysis covers three widely used loss functions, namely, the squared loss, the cross-entropy loss, and the 0-1 loss; and two widely used model combination strategies, namely, model averaging and weighted majority vote. We empirically validate this theoretical analysis with neural network ensembles.
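
As a rough illustration of how diversity can enter the error of an averaged ensemble under the squared loss, the sketch below checks the classical ambiguity decomposition numerically: the loss of the averaged prediction equals the average member loss minus the spread of the members around their mean. This is a textbook identity and not necessarily the diversity measure used in the paper; the function name `ambiguity_decomposition` and the synthetic data are assumptions made for the example.

```python
import numpy as np

def ambiguity_decomposition(preds, y):
    """Hypothetical helper: split the squared loss of a model-averaging ensemble.

    preds: array of shape (n_models, n_samples) with member predictions.
    y:     array of shape (n_samples,) with regression targets.
    Returns (ensemble_loss, avg_member_loss, diversity).
    """
    preds = np.asarray(preds, dtype=float)
    y = np.asarray(y, dtype=float)
    ensemble = preds.mean(axis=0)                  # model averaging
    ensemble_loss = np.mean((ensemble - y) ** 2)   # loss of the averaged prediction
    avg_member_loss = np.mean((preds - y) ** 2)    # mean loss of individual members
    diversity = np.mean((preds - ensemble) ** 2)   # spread of members around the mean
    return ensemble_loss, avg_member_loss, diversity

# Synthetic data (assumed for illustration): five noisy predictors of y.
rng = np.random.default_rng(0)
y = rng.normal(size=50)
preds = y + rng.normal(scale=0.5, size=(5, 50))

ens, avg, div = ambiguity_decomposition(preds, y)
# The identity ens == avg - div holds up to floating-point error.
print(ens, avg - div)
```

Because the diversity term is non-negative, the averaged ensemble is never worse than the average member under this loss, and the gap grows with the disagreement among members.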

ICML 2024

Variational Linearized Laplace Approximation for Bayesian Deep Learning

The Linearized Laplace Approximation (LLA) has been recently used to perform uncertainty estimation on the predictions of pre-trained deep neural networks (DNNs). However, its widespread application is hindered by significant computational costs, particularly in scenarios with a large number of training points or DNN parameters. Consequently, additional approximations of LLA, such as Kronecker-factored or diagonal approximate GGN matrices, are utilized, potentially compromising the model's performance. To address these challenges, we propose a new method for approximating LLA using a variational sparse Gaussian Process (GP). Our method is based on the dual RKHS formulation of GPs and retains, as the predictive mean, the output of the original DNN. Furthermore, it allows for efficient stochastic optimization, which results in sub-linear training time in the size of the training dataset. Specifically, its training cost is independent of the number of training points. We compare our proposed method against accelerated LLA (ELLA), which relies on the Nyström approximation, as well as other LLA variants employing the sample-then-optimize principle. Experimental results, both on regression and classification datasets, show that our method outperforms these already existing efficient variants of LLA, both in terms of the quality of the predictive distribution and in terms of total computational time.
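
For orientation, the standard form of the linearized Laplace predictive can be written as follows. This is a sketch in assumed notation (network $f(x;\theta)$, trained weights $\hat\theta$, Jacobian $J_{\hat\theta}(x)$, isotropic Gaussian prior with precision $\alpha$, and $\Lambda_n$ the Hessian of the negative log-likelihood with respect to the network output at $x_n$); it describes generic LLA, not the variational method proposed in the paper.

```latex
% Assumed notation for illustration; not taken from the paper.
\[
f_{\mathrm{lin}}(x; \theta) = f(x; \hat\theta) + J_{\hat\theta}(x)\,(\theta - \hat\theta),
\qquad
\theta \mid \mathcal{D} \approx \mathcal{N}(\hat\theta, \Sigma),
\]
\[
\Sigma^{-1} = \sum_{n=1}^{N} J_{\hat\theta}(x_n)^{\top} \Lambda_n\, J_{\hat\theta}(x_n) + \alpha I,
\qquad
f_{\mathrm{lin}}(x) \mid \mathcal{D} \approx
\mathcal{N}\bigl(f(x; \hat\theta),\; J_{\hat\theta}(x)\, \Sigma\, J_{\hat\theta}(x)^{\top}\bigr).
\]
```

The predictive mean is the output of the original network, and the cost comes from $\Sigma$: it is a matrix over all parameters built from a sum over all $N$ training points, which is what the Kronecker-factored and diagonal approximations mentioned above try to avoid.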

NeurIPS 2024

PAC-Bayes-Chernoff Bounds for Unbounded Losses

We introduce a new PAC-Bayes oracle bound for unbounded losses that extends Cramér-Chernoff bounds to the PAC-Bayesian setting. The proof technique relies on controlling the tails of certain random variables involving the Cramér transform of the loss. Our approach naturally leverages properties of Cramér-Chernoff bounds, such as exact optimization of the free parameter in many PAC-Bayes bounds. We highlight several applications of the main theorem. Firstly, we show that our bound recovers and generalizes previous results. Additionally, our approach allows working with richer assumptions that result in more informative and potentially tighter bounds. In this direction, we provide a general bound under a new model-dependent assumption from which we obtain bounds based on parameter norms and log-Sobolev inequalities. Notably, many of these bounds can be minimized to obtain distributions beyond the Gibbs posterior and provide novel theoretical coverage to existing regularization techniques.
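
As background, the classical Cramér-Chernoff bound that the abstract refers to can be stated as follows. The notation is assumed for illustration ($X$ a real random variable whose cumulant generating function is finite on $[0, b)$); this is the textbook single-variable bound, not the PAC-Bayes theorem of the paper.

```latex
% Assumed notation for illustration; not taken from the paper.
\[
\Lambda_X(\lambda) = \log \mathbb{E}\!\left[e^{\lambda (X - \mathbb{E}[X])}\right],
\qquad
\Lambda_X^{*}(a) = \sup_{\lambda \in [0, b)} \bigl(\lambda a - \Lambda_X(\lambda)\bigr),
\]
\[
\mathbb{P}\bigl(X - \mathbb{E}[X] \ge a\bigr) \le e^{-\Lambda_X^{*}(a)}
\quad \text{for all } a \ge 0 .
\]
```

Here $\Lambda_X^{*}$ is the Cramér transform, and the supremum over $\lambda$ is the exact optimization of the free parameter: the bound holds with the best $\lambda$ for each deviation level $a$, rather than for a single $\lambda$ fixed in advance.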