rail.estimation.algos.train_z module

Implementation of the ‘pathological’ photo-z PDF estimator, as used in arXiv:2001.03621 (see section 3.3). It assigns each test set galaxy a photo-z PDF equal to the normalized redshift distribution N(z) of the training set.
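
The assignment described above can be sketched in plain Python. This is a minimal illustration, not the RAIL implementation: the names train_z_pdf and assign_to_test_set, the use of bin centers as the grid, and the choice to drop redshifts outside [zmin, zmax) are all assumptions made for the example.

```python
def train_z_pdf(train_redshifts, zmin=0.0, zmax=3.0, nzbins=301):
    """Histogram the training redshifts and normalize to unit area.

    Hypothetical helper: returns (bin centers, pdf values).
    """
    width = (zmax - zmin) / nzbins
    counts = [0] * nzbins
    for z in train_redshifts:
        if zmin <= z < zmax:
            # Clamp the index to guard against floating-point round-up.
            index = min(int((z - zmin) / width), nzbins - 1)
            counts[index] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no training redshifts fall inside [zmin, zmax)")
    # Divide by (total * width) so that sum(pdf) * width == 1.
    pdf = [c / (total * width) for c in counts]
    centers = [zmin + (i + 0.5) * width for i in range(nzbins)]
    return centers, pdf


def assign_to_test_set(pdf, n_test):
    """Give every test galaxy a copy of the same training-set PDF."""
    return [list(pdf) for _ in range(n_test)]
```

Because the PDF depends only on the training set, no test set photometry enters the calculation; the test set contributes only its number of objects.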

class rail.estimation.algos.train_z.TrainZEstimator

Bases: CatEstimator

CatEstimator which returns a global PDF for all galaxies

Parameters:

output_mode ([str] default=default) – What to do with the outputs. The options are ‘default’, where outputs will be written to files and some returned, and ‘return’, where outputs will only be returned and not written.
chunk_size ([int] default=10000) – Number of objects per chunk for parallel processing, or to evaluate per loop in single-node processing
hdf5_groupname ([str] default=photometry) – Name of the HDF5 group containing the data; if None, it is set to ‘’
zmin ([float] default=0.0) – The minimum redshift of the z grid or sample
zmax ([float] default=3.0) – The maximum redshift of the z grid or sample
nzbins ([int] default=301) – The number of gridpoints in the z grid
id_col ([str] default=object_id) – name of the object ID column
redshift_col ([str] default=redshift) – name of redshift column
calc_summary_stats ([bool] default=False) – Compute summary statistics
calculated_point_estimates ([list] default=[]) – List of strings defining which point estimates to calculate automatically using qp.Ensemble. Options include ‘mean’, ‘mode’, and ‘median’.
recompute_point_estimates ([bool] default=False) – Force recomputation of point estimates
model (ModelHandle (INPUT))
input (TableHandle (INPUT))
output (QPHandle (OUTPUT))
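
To illustrate what the ‘mean’, ‘mode’, and ‘median’ options of calculated_point_estimates refer to, the sketch below computes them for a PDF sampled on a uniform grid. It is not the qp.Ensemble implementation; the function name point_estimates and the simple cumulative-sum median are assumptions for the example.

```python
def point_estimates(zgrid, pdf):
    """Return the mean, mode and median of a PDF on a uniform grid.

    Hypothetical helper, written for illustration only.
    """
    total = sum(pdf)
    if total <= 0:
        raise ValueError("pdf must have positive total weight")
    # Mean: PDF-weighted average of the grid points.
    mean = sum(z * p for z, p in zip(zgrid, pdf)) / total
    # Mode: grid point where the PDF is largest (first one on ties).
    mode = zgrid[max(range(len(pdf)), key=pdf.__getitem__)]
    # Median: first grid point where the cumulative weight reaches half.
    running = 0.0
    median = zgrid[-1]
    for z, p in zip(zgrid, pdf):
        running += p
        if running >= 0.5 * total:
            median = z
            break
    return {"mean": mean, "mode": mode, "median": median}
```

Since this estimator returns the same PDF for every galaxy, any point estimate derived from it is also the same for every galaxy.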

__init__(args, **kwargs)

Initialize Estimator

Parameters:

args (Any)
kwargs (Any)

Return type:

None

entrypoint_function: str | None = 'estimate'

interactive_function: str | None = 'train_z_estimator'

name = 'TrainZEstimator'

open_model(**kwargs)

Load the model and/or attach it to this Stage

Parameters:

tag – Input tag associated to the model
**kwargs (Any) – Should include ‘model’, see notes

Return type:

None

Notes

The keyword argument ‘model’ should be one of:

an object with a trained model,
a path pointing to a file that can be read to obtain the trained model,
or a ModelHandle providing access to the trained model.

Returns: The object encapsulating the trained model.
Return type: Any
Parameters: kwargs (Any)
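
The three accepted forms of ‘model’ listed in the notes can be pictured with a small dispatch function. This is a hedged sketch and not how open_model is written: HandleSketch is a hypothetical stand-in for ModelHandle, resolve_model is an invented name, and reading the file with pickle is an assumption about the on-disk format.

```python
import os
import pickle
import tempfile


class HandleSketch:
    """Hypothetical stand-in for a ModelHandle; not the RAIL class."""

    def __init__(self, data):
        self.data = data


def resolve_model(model):
    """Return the trained model from a handle, a file path, or an object."""
    if isinstance(model, HandleSketch):
        return model.data
    if isinstance(model, (str, os.PathLike)):
        # Assumption: the file holds a pickled model.
        with open(model, "rb") as fh:
            return pickle.load(fh)
    # Anything else is treated as the trained model itself.
    return model


# Usage: the same model reached through each of the three routes.
trained = {"zgrid": [0.25, 0.75], "pdf": [1.5, 0.5]}
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.pkl")
    with open(path, "wb") as fh:
        pickle.dump(trained, fh)
    from_path = resolve_model(path)

from_object = resolve_model(trained)
from_handle = resolve_model(HandleSketch(trained))
```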

class rail.estimation.algos.train_z.TrainZInformer

Bases: CatInformer

Train an Estimator which returns a global PDF for all galaxies

Parameters:

output_mode ([str] default=default) – What to do with the outputs. The options are ‘default’, where outputs will be written to files and some returned, and ‘return’, where outputs will only be returned and not written.
hdf5_groupname ([str] default=photometry) – Name of the HDF5 group containing the data; if None, it is set to ‘’
zmin ([float] default=0.0) – The minimum redshift of the z grid or sample
zmax ([float] default=3.0) – The maximum redshift of the z grid or sample
nzbins ([int] default=301) – The number of gridpoints in the z grid
redshift_col ([str] default=redshift) – name of redshift column
input (TableHandle (INPUT))
model (ModelHandle (OUTPUT))

entrypoint_function: str | None = 'inform'

interactive_function: str | None = 'train_z_informer'

name = 'TrainZInformer'

run()

Run the stage and return the execution status.

Subclasses must implement this method.

Return type: None

validate()

Check that the column names required by the stage exist in the data

Return type: None

class rail.estimation.algos.train_z.trainZmodel

Bases: object

Temporary class to store the single trainZ PDF for the trained model. Given how simple this is to compute, this seems like overkill.

__init__(zgrid, pdf, zmode)

Parameters:

zgrid (ndarray)
pdf (ndarray)
zmode (ndarray)

Return type:

None
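
As a rough picture of how such a container might be used, the sketch below holds the same three fields and repeats the stored PDF for a test set processed in chunks, in the spirit of the estimator's chunk_size parameter. Everything here is hypothetical: TrainZModelSketch and estimate_in_chunks are invented names, plain lists stand in for the ndarray arguments, and zmode is simplified to a single float.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TrainZModelSketch:
    """Hypothetical stand-in for trainZmodel with the same three fields."""

    zgrid: List[float]
    pdf: List[float]
    zmode: float


def estimate_in_chunks(model, n_objects, chunk_size=10000):
    """Yield (start, end, pdfs), repeating the stored PDF for each object."""
    for start in range(0, n_objects, chunk_size):
        end = min(start + chunk_size, n_objects)
        # Every object in the chunk receives the same stored PDF.
        yield start, end, [model.pdf] * (end - start)
```

For example, five objects with chunk_size=2 would be processed as the index ranges (0, 2), (2, 4), and (4, 5).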