rail.estimation.algos.train_z module

Implementation of the ‘pathological’ photo-z PDF estimator, as used in arXiv:2001.03621 (see section 3.3). It assigns each test set galaxy a photo-z PDF equal to the normalized redshift distribution N(z) of the training set.
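The idea can be illustrated with a short NumPy sketch. This is not the RAIL implementation: the function names train_nz_pdf and assign_global_pdf are hypothetical, the training redshifts are synthetic, and the choice to treat nzbins as a number of histogram bins evaluated at bin centers is an assumption made only for this example.

```python
import numpy as np


def train_nz_pdf(train_z, zmin=0.0, zmax=3.0, nzbins=301):
    """Return (zgrid, pdf): the training-set N(z) normalized to unit area.

    Hypothetical helper; binning convention is an assumption of this sketch.
    """
    # Histogram edges: nzbins bins spanning [zmin, zmax].
    edges = np.linspace(zmin, zmax, nzbins + 1)
    counts, _ = np.histogram(train_z, bins=edges)
    zgrid = 0.5 * (edges[:-1] + edges[1:])  # bin centers
    widths = np.diff(edges)
    # Normalize so that sum(pdf * widths) == 1.
    pdf = counts / (counts.sum() * widths)
    return zgrid, pdf


def assign_global_pdf(pdf, n_test):
    """Give every test galaxy the same PDF, one row per galaxy."""
    return np.tile(pdf, (n_test, 1))


rng = np.random.default_rng(0)
train_z = rng.uniform(0.0, 3.0, size=5000)  # synthetic training redshifts
zgrid, pdf = train_nz_pdf(train_z)
test_pdfs = assign_global_pdf(pdf, n_test=4)
```

Every row of test_pdfs is identical by construction, which is what makes the estimator ‘pathological’: it ignores the photometry of the individual test galaxies entirely.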

class rail.estimation.algos.train_z.TrainZEstimator

Bases: CatEstimator

CatEstimator which returns a global PDF for all galaxies

Parameters:
  • output_mode ([str] default=default) – What to do with the outputs. The options are ‘default’, where outputs will be written to files and some returned, and ‘return’, where outputs will only be returned and not written.

  • chunk_size ([int] default=10000) – Number of objects per chunk for parallel processing, or to evaluate per loop in single-node processing

  • hdf5_groupname ([str] default=photometry) – name of hdf5 group for data, if None, then set to ‘’

  • zmin ([float] default=0.0) – The minimum redshift of the z grid or sample

  • zmax ([float] default=3.0) – The maximum redshift of the z grid or sample

  • nzbins ([int] default=301) – The number of gridpoints in the z grid

  • id_col ([str] default=object_id) – name of the object ID column

  • redshift_col ([str] default=redshift) – name of redshift column

  • calc_summary_stats ([bool] default=False) – Compute summary statistics

  • calculated_point_estimates ([list] default=[]) – List of strings defining which point estimates to automatically calculate using qp.Ensemble. Options include ‘mean’, ‘mode’, ‘median’.

  • recompute_point_estimates ([bool] default=False) – Force recomputation of point estimates

  • model (ModelHandle (INPUT))

  • input (TableHandle (INPUT))

  • output (QPHandle (OUTPUT))

__init__(args, **kwargs)

Initialize Estimator

Parameters:
  • args (Any)

  • kwargs (Any)

Return type:

None

entrypoint_function: str | None = 'estimate'
interactive_function: str | None = 'train_z_estimator'
name = 'TrainZEstimator'
open_model(**kwargs)

Load the model and/or attach it to this Stage

Parameters:
  • tag – Input tag associated to the model

  • **kwargs (Any) – Should include ‘model’, see notes

Return type:

None

Notes

The keyword argument ‘model’ should be one of:

  1. an object with a trained model,

  2. a path pointing to a file that can be read to obtain the trained model,

  3. or a ModelHandle providing access to the trained model.

Returns:

The object encapsulating the trained model.

Return type:

Any

Parameters:

kwargs (Any)
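The three accepted forms of ‘model’ described in the notes can be pictured as a simple dispatch. The sketch below is illustrative only: resolve_model and HypotheticalHandle are made-up names standing in for the stage logic and for ModelHandle, and the use of pickle as the on-disk format is an assumption of this example.

```python
import pickle
import tempfile
from pathlib import Path


class HypotheticalHandle:
    """Stand-in for a handle object; this is not RAIL's ModelHandle."""

    def __init__(self, data):
        self.data = data


def resolve_model(model):
    """Return the trained-model object from any of the three accepted forms."""
    if isinstance(model, HypotheticalHandle):
        return model.data  # case 3: a handle providing access to the model
    if isinstance(model, (str, Path)):
        with open(model, "rb") as fh:  # case 2: a path to a serialized model
            return pickle.load(fh)  # pickle format is assumed here
    return model  # case 1: already an object with a trained model


trained = {"zmode": 0.5}  # placeholder for a trained model
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "model.pkl"
    path.write_bytes(pickle.dumps(trained))
    from_path = resolve_model(path)
from_object = resolve_model(trained)
from_handle = resolve_model(HypotheticalHandle(trained))
```

All three calls yield an equivalent model object, matching the single return value described above.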

class rail.estimation.algos.train_z.TrainZInformer

Bases: CatInformer

Train an Estimator which returns a global PDF for all galaxies

Parameters:
  • output_mode ([str] default=default) – What to do with the outputs. The options are ‘default’, where outputs will be written to files and some returned, and ‘return’, where outputs will only be returned and not written.

  • hdf5_groupname ([str] default=photometry) – name of hdf5 group for data, if None, then set to ‘’

  • zmin ([float] default=0.0) – The minimum redshift of the z grid or sample

  • zmax ([float] default=3.0) – The maximum redshift of the z grid or sample

  • nzbins ([int] default=301) – The number of gridpoints in the z grid

  • redshift_col ([str] default=redshift) – name of redshift column

  • input (TableHandle (INPUT))

  • model (ModelHandle (OUTPUT))

entrypoint_function: str | None = 'inform'
interactive_function: str | None = 'train_z_informer'
name = 'TrainZInformer'
run()

Run the stage and return the execution status.

Subclasses must implement this method.

Return type:

None

validate()

Validate that the column names required by the stage exist in the data

Return type:

None
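A column check of this kind might look like the following minimal sketch. The helpers missing_columns and validate_columns are hypothetical, and the data are represented as a plain dictionary of columns; which exception the real stage raises, if any, is not stated here, so KeyError is an assumption.

```python
def missing_columns(data, required):
    """Return the required column names that are absent from `data`."""
    return [name for name in required if name not in data]


def validate_columns(data, required):
    """Raise KeyError (an assumed choice) if any required column is missing."""
    missing = missing_columns(data, required)
    if missing:
        raise KeyError(f"Missing required columns: {missing}")


# A toy table lacking the default redshift_col, 'redshift'.
table = {"object_id": [1, 2], "mag_i": [22.1, 23.4]}
absent = missing_columns(table, ["object_id", "redshift"])
```

Checking before training lets a misconfigured redshift_col fail early with a clear message instead of partway through the run.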

class rail.estimation.algos.train_z.trainZmodel

Bases: object

Temporary class to store the single trainZ PDF for the trained model. Given how simple this is to compute, this seems like overkill.

__init__(zgrid, pdf, zmode)
Parameters:
  • zgrid (ndarray)

  • pdf (ndarray)

  • zmode (ndarray)

Return type:

None