Sampling Validation using MC

We can also validate our SBI model by drawing samples from the posterior using MCMC methods. This allows us to assess how well our model captures the underlying distribution of parameters given the observed data.

We can do this using the SBI_Fitter.fit_observation_using_sampler method, which lets us choose from a few nested sampling or MCMC samplers. Here, we’ll use the dynesty sampler to draw samples from the posterior.

First we’ll load a trained model and choose a mock observation to fit.

[1]:
from synference import SBI_Fitter, load_unc_model_from_hdf5, test_data_dir

library_path = (
    f"{test_data_dir}/grid_BPASS_Chab_DenseBasis_SFH_0.01_z_14_logN_2.7_Calzetti_v3_multinode.hdf5"  # noqa: E501
)

fitter = SBI_Fitter.load_saved_model(
    model_file=f"{test_data_dir}", library_path=library_path, device="cpu"
)

nm_path = f"{test_data_dir}/BPASS_DenseBasis_v4_final_nsf_0_params_empirical_noise_models.h5"
noise_models = load_unc_model_from_hdf5(nm_path)

fitter.feature_array_flags["empirical_noise_models"] = noise_models
/opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html
  from .autonotebook import tqdm as notebook_tqdm
2025-11-13 20:06:52,442 | synference | INFO     | Loaded model from /home/runner/.local/share/Synthesizer/data/synference/BPASS_DenseBasis_v4_final_nsf_0_posterior.pkl.
2025-11-13 20:06:52,443 | synference | INFO     | Device: cpu
2025-11-13 20:06:52,486 | synference | WARNING  | IndexError when trying to set train/test arrays.

First we can recreate the simulator from the information stored in the library. We will use the simulator in the log-likelihood calculation during sampling.

[2]:
fitter.recreate_simulator_from_library(
    override_library_path=library_path, override_grid_path="test_grid.hdf5"
);
2025-11-13 20:06:52,523 | synference | INFO     | Overriding internal library name from provided file path.
2025-11-13 20:06:52,804 | synference | WARNING  | Failed to load cosmology from HDF5. Using Planck18 instead.
2025-11-13 20:07:02,426 | synference | INFO     | Updated filters: ['HST/ACS_WFC.F435W', 'HST/ACS_WFC.F606W', 'HST/ACS_WFC.F775W', 'HST/ACS_WFC.F814W', 'HST/ACS_WFC.F850LP', 'JWST/NIRCam.F090W', 'JWST/NIRCam.F115W', 'JWST/NIRCam.F150W', 'JWST/NIRCam.F200W', 'JWST/NIRCam.F277W', 'JWST/NIRCam.F335M', 'JWST/NIRCam.F356W', 'JWST/NIRCam.F410M', 'JWST/NIRCam.F444W']
2025-11-13 20:07:02,430 | synference | INFO     | Simulator recreated from library at /home/runner/.local/share/Synthesizer/data/synference/grid_BPASS_Chab_DenseBasis_SFH_0.01_z_14_logN_2.7_Calzetti_v3_multinode.hdf5.
2025-11-13 20:07:02,431 | synference | INFO     | Auto applying inverse log10 transform for log10_Av.
2025-11-13 20:07:02,432 | synference | INFO     | Auto applying inverse log10 transform for log10_mass_weighted_age.
2025-11-13 20:07:02,432 | synference | INFO     | Auto applying inverse log10 transform for log10_floor_sfr_10.
2025-11-13 20:07:02,433 | synference | INFO     | Adding Av to tau_v transform.

Then we can proceed to fit the observation using the sampler.

The code will attempt to remove parameters which don’t affect the fit (e.g. supplemental parameters) before fitting, but you can also specify to remove specific parameters using the remove_params argument.

Let’s choose an observation from the validation set and fit it using the dynesty sampler.

[3]:
index = 2

data = f"{test_data_dir}/sbi_test_data_BPASS_DenseBasis_v4_final.npz"
import numpy as np

loaded = np.load(data)
X_test = loaded["X"][:, index]
y_test = loaded["y"][:, index]

print(X_test)
[24.33915  21.76936  28.048773 28.16009  28.237339 28.1432   28.001326
 28.250029]

We’ll just recreate our prior for the model.

[4]:
fitter.create_priors(set_self=True);
2025-11-13 20:07:02,451 | synference | INFO     | ---------------------------------------------
2025-11-13 20:07:02,453 | synference | INFO     | Prior ranges:
2025-11-13 20:07:02,454 | synference | INFO     | ---------------------------------------------
2025-11-13 20:07:02,456 | synference | INFO     | log_mass: 4.43 - 11.83 [log10_Msun]
2025-11-13 20:07:02,456 | synference | INFO     | log10metallicity: -3.96 - -1.84 [log10(Zmet)]
2025-11-13 20:07:02,458 | synference | INFO     | log10_Av: -2.88 - 0.59 [log10(mag)]
2025-11-13 20:07:02,458 | synference | INFO     | log_sfr: -6.10 - 2.98 [log10(Msun/yr)]
2025-11-13 20:07:02,459 | synference | INFO     | sfh_quantile_25: 0.16 - 0.77 [dimensionless]
2025-11-13 20:07:02,460 | synference | INFO     | sfh_quantile_50: 0.23 - 0.88 [dimensionless]
2025-11-13 20:07:02,460 | synference | INFO     | sfh_quantile_75: 0.39 - 0.98 [dimensionless]
2025-11-13 20:07:02,461 | synference | INFO     | log10_mass_weighted_age: 2.16 - 2.91 [log10(Myr)]
2025-11-13 20:07:02,461 | synference | INFO     | log10_floor_sfr_10: -6.00 - 2.95 [log10_floor(Msun/yr)]
2025-11-13 20:07:02,462 | synference | INFO     | log_surviving_mass: 4.23 - 11.67 [log10_Msun]
2025-11-13 20:07:02,462 | synference | INFO     | beta: -2.38 - 2.34 [dimensionless]
2025-11-13 20:07:02,463 | synference | INFO     | ---------------------------------------------
2025-11-13 20:07:02,464 | synference | INFO     | Processing prior...

No we can run the sampler to fit the observation. This is too computationally expensive to run here, so we’ll just show the code you would use.

result = fitter.fit_observation_using_sampler(
    observation=X_test,
    sampler="dynesty",
    sampler_kwargs={"bound": "multi", "sample": "rwalk", "run_kwargs": {"n_effective": 2000}},
    plot_name=f"example_{index}",
    min_flux_pc_error=0.05,
)
samples = result["samples"]
log_l = result["logl"]
log_w = result["logwt"]

We can sample using our trained SBI model as well for comparison.

[5]:
predicted_params = fitter.sample_posterior(X_test, num_samples=1000)
Sampling from posterior:   0%|          | 0/1 [00:00<?, ?it/s]
2025-11-13 20:07:02,478 | synference | ERROR    | Error occurred while sampling for sample 0: The trailing dimensions of `theta_or_x` do not match the `event_shape`.
Sampling from posterior: 100%|██████████| 1/1 [00:00<00:00, 802.43it/s]