Input data#

Module that handles the import of nbody simulation data from various formats.

class beorn.load_input_data.ArtificialHaloLoader(parameters: Parameters, halo_count: int, seed: int = 12345, final_mass: float = 1000000000000.0, alpha: float = 0.79)[source]#

Bases: BaseLoader

Simple synthetic loader for quick testing.

The loader places a fixed number of halos at random positions in the box and returns masses that follow a simple exponential mass evolution. Intended for tests and examples.

__init__(parameters: Parameters, halo_count: int, seed: int = 12345, final_mass: float = 1000000000000.0, alpha: float = 0.79)[source]#

Create an artificial loader.

Parameters:
  • parameters (Parameters) – Simulation parameters.

  • halo_count (int) – Number of synthetic halos to create.

  • seed (int, optional) – RNG seed for reproducible positions.

  • final_mass (float, optional) – Final mass at the reference redshift used as M0 in the accretion model.

  • alpha (float, optional) – Exponential accretion parameter.

property redshifts#

Return the redshift grid used by this loader.

The grid is returned in ascending order (current -> past).

load_density_field(redshift_index)[source]#

Return a zero density field (placeholder for tests).

Returns an array of zeros with shape (Ncell, Ncell, Ncell).

load_rsd_fields(redshift_index)[source]#

Not implemented for the artificial loader.

Raises:

NotImplementedError – RSD fields are not provided.

load_halo_catalog(redshift_index)[source]#

Return a HaloCatalog with synthetic positions and masses.

Parameters:

redshift_index (int) – Index of the snapshot to return.

Returns:

Catalog containing positions and masses.

Return type:

HaloCatalog

class beorn.load_input_data.BaseLoader(parameters: Parameters)[source]#

Bases: ABC

Abstract base class for data loaders.

Subclasses must implement methods to provide the following data: - halo catalogs containing all relevant halo properties, - baryonic density fields, - redshift-space-distortion (RSD) (optional)

Implementations are expected to expose a redshifts property describing available snapshots.

__init__(parameters: Parameters)[source]#

Initialize the loader with simulation parameters.

Parameters:

parameters (Parameters) – Parameters object containing simulation settings.

abstractmethod load_halo_catalog(redshift_index: int) HaloCatalog[source]#

Load the halo catalog for a given snapshot index.

Parameters:

redshift_index (int) – Snapshot index to load.

Returns:

Loaded halo catalog for the snapshot.

Return type:

HaloCatalog

abstractmethod load_density_field(redshift_index: int) numpy.ndarray[source]#

Load the baryonic density field for a given snapshot.

Parameters:

redshift_index (int) – Snapshot index to load.

Returns:

3D density field array (shape Ncell^3).

Return type:

numpy.ndarray

abstractmethod load_rsd_fields(redshift_index: int) tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray][source]#

Load the three RSD (velocity-weighted) fields for the snapshot.

Parameters:

redshift_index (int) – Snapshot index to load.

Returns:

The three velocity-component meshes (vx, vy, vz) mapped to the grid.

Return type:

tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray]

abstract property redshifts: numpy.ndarray#

Array of available redshifts for this loader.

Returns:

1D array of redshift values (ascending order: current->past).

Return type:

numpy.ndarray

redshift_index(redshift: float) int[source]#

Return the index of redshift in the loader’s grid.

Parameters:

redshift (float) – Redshift value to look up.

Returns:

Index into :pyattr:`redshifts` corresponding to redshift.

Return type:

int

Raises:

ValueError – If the requested redshift is not available.

class beorn.load_input_data.PKDGravLoader(parameters: Parameters, density_subpath: str = 'nc256', halo_subpath: str = 'haloes', halo_glob: str = '*.fof.txt', density_glob: str = '*.den.256.0', log_filename: str = 'CDM_200Mpc_2048.log', catalog_yml: Path | None = None, halo_finder: str | None = None)[source]#

Bases: NBodyLoader

Loader for PKDGrav3 N-body simulation data.

Reads the PKDGrav3 log file to map step numbers to redshifts, then discovers density and halo files by globbing the configured directories. The halo-file glob pattern can be changed to support different halo finders (PKDGrav3 built-in FOF, AHF, Rockstar, …).

All snapshot-catalog caching, partial-dataset handling, and informative logging is provided by the parent NBodyLoader.

Parameters:
  • parameters (Parameters) – BEoRN parameter container. Must have cosmo_sim.file_root pointing at the root of the N-body data.

  • density_subpath (str) – Subdirectory (relative to file_root) containing the binary density-grid files. Default: "nc256".

  • halo_subpath (str) – Subdirectory (relative to file_root) containing halo catalog files. Default: "haloes".

  • halo_glob (str) – Glob pattern used to discover halo catalog files inside halo_subpath. Change this to match the output format of AHF ("*.AHF_halos"), Rockstar ("halos_*.ascii"), etc. Default: "*.fof.txt" (PKDGrav3 built-in FOF finder).

  • density_glob (str) – Glob pattern for density files. Default: "*.den.256.0".

  • log_filename (str) – Name of the PKDGrav3 log file. Default: "CDM_200Mpc_2048.log".

  • catalog_yml (Path | str | None) – Path to an existing snapshot catalog YML file. If supplied, BEoRN reads that file directly and skips the directory scan. See NBodyLoader for details. Default: None (auto-find or auto-create).

  • halo_finder (str | None) – Override the halo-finder name stored in the YML. If None (default), the name is inferred from halo_glob.

__init__(parameters: Parameters, density_subpath: str = 'nc256', halo_subpath: str = 'haloes', halo_glob: str = '*.fof.txt', density_glob: str = '*.den.256.0', log_filename: str = 'CDM_200Mpc_2048.log', catalog_yml: Path | None = None, halo_finder: str | None = None)[source]#

Initialize the loader with simulation parameters.

Parameters:

parameters (Parameters) – Parameters object containing simulation settings.

discover_snapshots() list[dict][source]#

Scan the PKDGrav3 data directories and return the snapshot inventory.

Reads the PKDGrav3 log file (column 1 = redshift at each step, with row_index = step_number 1) and matches density and halo files by their embedded step number.

Returns:

One entry per density file found, with keys "redshift", "density_file", and "halo_file" (the latter is None when no matching catalog exists for that step). File paths are relative to self.file_root.

Return type:

list[dict]

load_rsd_fields(redshift_index: int)[source]#

RSD fields are not available for PKDGrav3 in this implementation.

Raises:

NotImplementedError – Always.

class beorn.load_input_data.ThesanLoader(parameters, cache_file: Path | str, *, is_high_res: bool = False)[source]#

Bases: MergerTreeLoader

Loader for THESAN-DARK simulation data with LHaloTree merger trees.

This is the production-ready loader shipped with BEoRN. If you are writing a custom loader for a different simulation see MergerTreeLoader for the minimal interface you need to implement.

Parameters:
  • parameters – BEoRN parameter container. cosmo_sim.file_root must point to the THESAN data root (the directory that contains output/, postprocessing/, etc.).

  • cache_file (Path | str) – Path to the flat HDF5 tree cache produced by the cache extraction cell in the notebook. Snapshot redshifts and other provenance metadata are read from its root attributes and the snapshot_redshifts dataset.

  • is_high_res (bool) – True for THESAN-DARK 1 (2100³ particles), False (default) for THESAN-DARK 2 (1050³ particles).

__init__(parameters, cache_file: Path | str, *, is_high_res: bool = False)[source]#

Initialize the loader with simulation parameters.

Parameters:

parameters (Parameters) – Parameters object containing simulation settings.

property redshifts: numpy.ndarray#

Array of available redshifts for this loader.

Returns:

1D array of redshift values (ascending order: current->past).

Return type:

numpy.ndarray

load_tree_cache() tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray, numpy.ndarray][source]#

Load the simplified merger tree cache.

The cache HDF5 file is produced by extract_simplified_tree.ipynb. It must contain four datasets: tree_halo_ids, tree_snap_num, tree_mass, tree_main_progenitor.

get_halo_information_from_catalog(redshift_index: int) tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray][source]#

Read group positions, masses, and subhalo→group mapping from THESAN group catalogs.

load_density_field(redshift_index: int) numpy.ndarray[source]#

Paint DM particles onto a mesh and return the overdensity field δ = ρ/⟨ρ⟩ − 1.

load_rsd_fields(redshift_index: int)[source]#

Return per-axis velocity meshes for redshift-space distortions.