Papers

Dimensionality Reduction Techniques for Statistical Inference in Cosmology

M Park, M Gatti, B Jain - arXiv preprint arXiv:2409.02102, 2024 - arxiv.org
Physics paper, astro-ph.CO

Link to paper: http://arxiv.org/abs/2409.02102v4

BibTeX

@article{2409.02102v4,
Author = {Minsu Park and Marco Gatti and Bhuvnesh Jain},
Title = {Dimensionality Reduction Techniques for Statistical Inference in
Cosmology},
Eprint = {2409.02102v4},
ArchivePrefix = {arXiv},
PrimaryClass = {astro-ph.CO},
Abstract = {We explore linear and non-linear dimensionality reduction techniques for
statistical inference of parameters in cosmology. Given the importance of
compressing the increasingly complex data vectors used in cosmology, we address
questions that impact the constraining power achieved, such as: Are currently
used methods effectively lossless? Under what conditions do nonlinear methods,
typically based on neural nets, outperform linear methods? Through theoretical
analysis and experiments with simulated weak lensing data vectors we compare
three standard linear methods and neural network based methods. We propose two
linear methods that outperform all others while using less computational
resources: a variation of the MOPED algorithm we call e-MOPED and an adaptation
of Canonical Correlation Analysis (CCA), which is a method new to cosmology but
well known in statistics. Both e-MOPED and CCA utilize simulations spanning the
full parameter space, and rely on the sensitivity of the data vector to the
parameters of interest. The gains we obtain are significant compared to
compression methods used in the literature: up to 30% in the Figure of Merit
for $\Omega_m$ and $S_8$ in a realistic Simulation Based Inference analysis
that includes statistical and systematic errors. We also recommend two
modifications that improve the performance of all methods: First, include
components in the compressed data vector that may not target the key parameters
but still enhance the constraints on them due to their correlations. The gain is
significant, above 20% in the Figure of Merit. Second, compress Gaussian and
non-Gaussian statistics separately -- we include two summary statistics of each
type in our analysis.},
Year = {2024},
Month = {Sep},
Url = {http://arxiv.org/abs/2409.02102v4},
File = {2409.02102v4.pdf}
}
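
Below is a minimal sketch of the two linear compression ideas named in the abstract, assuming NumPy and scikit-learn are available. It is a toy illustration, not the authors' e-MOPED or CCA pipeline: all array names, shapes, and the simulated "data" are invented for this example. The MOPED-style part estimates the sensitivity of the data vector to the parameters and a noise covariance from a simulation suite, then forms one compressed summary per parameter as $t = (\partial\mu/\partial\theta)\, C^{-1} d$; the CCA part uses scikit-learn's off-the-shelf implementation.

import numpy as np
from sklearn.cross_decomposition import CCA

# Toy simulation suite (all names and shapes are illustrative):
# theta : (n_sims, n_params) parameter values of the simulations
# d     : (n_sims, n_data)   simulated data vectors at those parameters
rng = np.random.default_rng(0)
n_sims, n_params, n_data = 2000, 2, 50
theta = rng.uniform(-1.0, 1.0, size=(n_sims, n_params))
response = rng.normal(size=(n_params, n_data))           # hidden linear response
d = theta @ response + 0.1 * rng.normal(size=(n_sims, n_data))

# --- MOPED-style linear compression (generic sketch, not the paper's e-MOPED;
#     the orthogonalisation step of full MOPED is omitted here) ---
# Estimate dmu/dtheta by regressing the data vectors on the parameters across
# the whole suite, estimate the noise covariance C from the residuals, and
# compress with t = (dmu/dtheta) C^{-1} d: one summary per parameter.
dmu_dtheta, *_ = np.linalg.lstsq(theta, d, rcond=None)   # (n_params, n_data)
resid = d - theta @ dmu_dtheta
C = np.cov(resid, rowvar=False)                          # (n_data, n_data)
weights = np.linalg.solve(C, dmu_dtheta.T)               # (n_data, n_params)
t_moped = d @ weights                                    # (n_sims, n_params)

# --- CCA-based compression ---
# Find the linear combinations of the data vector most correlated with the
# parameters; the projected data vectors are the compressed summaries.
cca = CCA(n_components=n_params)
cca.fit(d, theta)
t_cca = cca.transform(d)                                 # (n_sims, n_params)

Either set of summaries (t_moped or t_cca) would then replace the full data vector in the downstream simulation-based inference. The Figure of Merit quoted in the abstract is, in the usual convention (the paper may normalise it differently), $\mathrm{FoM} = 1/\sqrt{\det \mathrm{Cov}(\Omega_m, S_8)}$, i.e. inversely proportional to the area of the marginalised confidence ellipse, so a larger FoM means a smaller joint confidence region.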
