Papers

Transformers Can Do Bayesian Inference

Samuel Müller, Noah Hollmann, Sebastian Pineda Arango, Josif Grabocka, Frank Hutter - arXiv preprint arXiv:2112.10510, 2021 - arxiv.org
Computer Science (cs.LG)

… Previous work in simulation-based inference is focused on simulations for which a specific … Our method is in so far similar to simulation-based inference as both solely use samples from …


BibTeX

@article{2112.10510v7,
Author = {Samuel Müller and Noah Hollmann and Sebastian Pineda Arango and Josif Grabocka and Frank Hutter},
Title = {Transformers Can Do Bayesian Inference},
Eprint = {2112.10510v7},
ArchivePrefix = {arXiv},
PrimaryClass = {cs.LG},
Abstract = {Currently, it is hard to reap the benefits of deep learning for Bayesian
methods, which allow the explicit specification of prior knowledge and
accurately capture model uncertainty. We present Prior-Data Fitted Networks
(PFNs). PFNs leverage in-context learning in large-scale machine learning
techniques to approximate a large set of posteriors. The only requirement for
PFNs to work is the ability to sample from a prior distribution over supervised
learning tasks (or functions). Our method restates the objective of posterior
approximation as a supervised classification problem with a set-valued input:
it repeatedly draws a task (or function) from the prior, draws a set of data
points and their labels from it, masks one of the labels and learns to make
probabilistic predictions for it based on the set-valued input of the rest of
the data points. Presented with a set of samples from a new supervised learning
task as input, PFNs make probabilistic predictions for arbitrary other data
points in a single forward propagation, having learned to approximate Bayesian
inference. We demonstrate that PFNs can near-perfectly mimic Gaussian processes
and also enable efficient Bayesian inference for intractable problems, with
over 200-fold speedups in multiple setups compared to current methods. We
obtain strong results in very diverse areas such as Gaussian process
regression, Bayesian neural networks, classification for small tabular data
sets, and few-shot image classification, demonstrating the generality of PFNs.
Code and trained PFNs are released at
https://github.com/automl/TransformersCanDoBayesianInference.},
Year = {2021},
Month = {Dec},
Url = {http://arxiv.org/abs/2112.10510v7},
File = {2112.10510v7.pdf}
}
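
The abstract describes the PFN training loop: repeatedly draw a function from a prior, sample a small labeled dataset from it, mask one label, and train a network to predict the masked label from the remaining points. Below is a minimal, illustrative sketch of that idea in PyTorch, assuming a toy sinusoid prior and a discretized predictive distribution; the prior, the TinyPFN architecture, and all hyperparameters are stand-ins for exposition, not the authors' released implementation (see the linked repository for that).

import torch
import torch.nn as nn

def sample_prior_dataset(n_points):
    """Draw one supervised task from a toy prior: y = a*sin(w*x + b) + noise."""
    a, w, b = torch.randn(3)
    x = torch.rand(n_points, 1) * 4 - 2
    y = a * torch.sin(w * x + b) + 0.1 * torch.randn_like(x)
    return x, y

class TinyPFN(nn.Module):
    def __init__(self, d_model=64, n_bins=32):
        super().__init__()
        self.enc_xy = nn.Linear(2, d_model)  # embed observed (x, y) context points
        self.enc_x = nn.Linear(1, d_model)   # embed the query x whose label is masked
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, n_bins)  # logits over a discretized y-range

    def forward(self, x_ctx, y_ctx, x_query):
        ctx = self.enc_xy(torch.cat([x_ctx, y_ctx], dim=-1))  # (B, N, d_model)
        qry = self.enc_x(x_query)                             # (B, 1, d_model)
        h = self.transformer(torch.cat([ctx, qry], dim=1))
        return self.head(h[:, -1])  # predictive logits for the masked label

def y_to_bin(y, n_bins=32, lo=-4.0, hi=4.0):
    """Map continuous targets to bin indices so prediction is classification."""
    return ((y.clamp(lo, hi - 1e-6) - lo) / (hi - lo) * n_bins).long().reshape(-1)

model = TinyPFN()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(1000):
    # Repeatedly draw tasks from the prior, mask one label per task, predict it.
    xs, ys = zip(*[sample_prior_dataset(17) for _ in range(16)])
    x, y = torch.stack(xs), torch.stack(ys)     # (16, 17, 1) each
    x_ctx, y_ctx = x[:, :-1], y[:, :-1]         # 16 observed (x, y) pairs per task
    x_q, y_q = x[:, -1:], y[:, -1:]             # 1 held-out, masked point per task
    loss = nn.functional.cross_entropy(model(x_ctx, y_ctx, x_q), y_to_bin(y_q))
    opt.zero_grad(); loss.backward(); opt.step()

At test time a trained model of this kind takes the full context of a new task plus a query point and returns a predictive distribution in a single forward pass, with no gradient steps on the new task, which is the in-context inference behavior the abstract describes.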
