rail.estimation.estimator module

Abstract base classes defining Estimators of individual galaxy redshift uncertainties.

class rail.estimation.estimator.CatEstimator

Bases: RailStage, PointEstimationMixin

The base class for making photo-z posterior estimates from catalog-like inputs (i.e., tables that include fluxes in photometric bands among their columns).

Estimators use a generic “model”, the details of which depend on the sub-class.

Estimators take tabular data as “input”, apply the photo-z estimation, and provide a QPEnsemble with per-object p(z) as “output”.

Parameters:
  • output_mode ([str] default=default) – What to do with the outputs. The options are ‘default’, where outputs will be written to files and some returned, and ‘return’, where outputs will only be returned and not written.

  • chunk_size ([int] default=10000) – Number of objects per chunk for parallel processing, or to evaluate per loop in single-node processing

  • hdf5_groupname ([str] default=photometry) – Name of the HDF5 group for the data; if None, then set to ‘’

  • zmin ([float] default=0.0) – The minimum redshift of the z grid or sample

  • zmax ([float] default=3.0) – The maximum redshift of the z grid or sample

  • nzbins ([int] default=301) – The number of gridpoints in the z grid

  • id_col ([str] default=object_id) – name of the object ID column

  • redshift_col ([str] default=redshift) – name of redshift column

  • calc_summary_stats ([bool] default=False) – Compute summary statistics

  • calculated_point_estimates ([list] default=[]) – List of strings defining which point estimates to automatically calculate using qp.Ensemble. Options include ‘mean’, ‘mode’, and ‘median’.

  • recompute_point_estimates ([bool] default=False) – Force recomputation of point estimates

  • model (ModelHandle (INPUT))

  • input (TableHandle (INPUT))

  • output (QPHandle (OUTPUT))
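
As a rough illustration of how zmin, zmax, and nzbins relate to each other and to the ‘mean’, ‘mode’, and ‘median’ point estimates, the sketch below builds an evenly spaced grid and summarizes a toy Gaussian p(z) tabulated on it. It assumes the grid includes both endpoints (301 points over [0, 3] then gives a spacing of 0.01). The helper names make_zgrid and point_estimates are hypothetical; in RAIL the calculation is delegated to qp and may differ in detail.

```python
import math


def make_zgrid(zmin=0.0, zmax=3.0, nzbins=301):
    """Evenly spaced grid that includes both endpoints (an assumption)."""
    step = (zmax - zmin) / (nzbins - 1)
    return [zmin + i * step for i in range(nzbins)]


def point_estimates(zgrid, pdf):
    """Toy mean, mode, and median of a p(z) tabulated on zgrid."""
    total = sum(pdf)
    weights = [p / total for p in pdf]  # normalize to unit sum
    mean = sum(z * w for z, w in zip(zgrid, weights))
    mode = zgrid[max(range(len(pdf)), key=pdf.__getitem__)]
    # Median: first grid point where the cumulative weight reaches 0.5
    cumulative = 0.0
    median = zgrid[-1]
    for z, w in zip(zgrid, weights):
        cumulative += w
        if cumulative >= 0.5:
            median = z
            break
    return {"mean": mean, "mode": mode, "median": median}


zgrid = make_zgrid()
# Hypothetical Gaussian p(z) centred at z = 1.0 with width 0.1
pdf = [math.exp(-0.5 * ((z - 1.0) / 0.1) ** 2) for z in zgrid]
estimates = point_estimates(zgrid, pdf)
```

For this symmetric toy distribution all three estimates land on the grid point at z = 1.0; for skewed or multi-modal p(z) they would generally disagree, which is why the choice is left configurable.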

__init__(args, **kwargs)

Initialize Estimator

Parameters:
  • args (Any)

  • kwargs (Any)

Return type:

None

classmethod default_distribution_type()

Return the type of distribution that this estimator creates

By default this is DistributionType.ad_hoc, but sub-classes can override it to return DistributionType.posterior or DistributionType.likelihood if appropriate.

Return type:

DistributionType
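
The override pattern described above can be sketched with stand-in classes. The DistributionType enum here is a local placeholder with made-up values, and BaseEstimator and PosteriorEstimator are hypothetical names; only the classmethod name and the three member names come from the description above.

```python
from enum import Enum


class DistributionType(Enum):
    """Local stand-in for RAIL's DistributionType; values are placeholders."""
    ad_hoc = 0
    posterior = 1
    likelihood = 2


class BaseEstimator:
    """Stand-in for the estimator base class, showing only the classmethod."""

    @classmethod
    def default_distribution_type(cls):
        return DistributionType.ad_hoc


class PosteriorEstimator(BaseEstimator):
    """Hypothetical sub-class whose p(z) is meant as a posterior."""

    @classmethod
    def default_distribution_type(cls):
        return DistributionType.posterior
```

Because this is a classmethod, downstream code can query the distribution type from the class itself, without constructing a stage.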

entrypoint_function: str | None = 'estimate'

estimate(input_data, **kwargs)

The main interface method for the photo-z estimation

This will attach the input data (defined in inputs as “input”) to this Estimator (for introspection and provenance tracking). It will then call the run(), validate(), and finalize() methods.

The run method will call _process_chunk(), which needs to be implemented in the subclass, to process input data in batches. See RandomGaussEstimator for a simple example.

Finally, this will return a QPHandle for access to that output data.

Parameters:

input_data (TableLike) – A dictionary of all input data

Returns:

Handle providing access to QP ensemble with output data

Return type:

QPHandle
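
The control flow described for estimate() can be sketched with a toy class that has no RAIL dependency. Everything here is a stand-in: ToyCatEstimator and ColorToyEstimator are hypothetical names, the argument list of _process_chunk is an assumption, and a plain list takes the place of the QPEnsemble and its handle. The call order follows the description above.

```python
class ToyCatEstimator:
    """Stand-in for an estimator base class; not the RAIL implementation."""

    def __init__(self, chunk_size=3):
        self.chunk_size = chunk_size
        self.data = {}    # tag -> attached data
        self.calls = []   # record of stage steps, for illustration only

    def estimate(self, input_data):
        self.data["input"] = input_data   # attach the input to the stage
        self.run()
        self.validate()
        self.finalize()
        return self.data["output"]        # RAIL would return a QPHandle

    def run(self):
        self.calls.append("run")
        table = self.data["input"]
        n_obj = len(next(iter(table.values())))
        self.data["output"] = []
        # Walk through the table in batches of at most chunk_size objects
        for start in range(0, n_obj, self.chunk_size):
            end = min(start + self.chunk_size, n_obj)
            chunk = {col: vals[start:end] for col, vals in table.items()}
            self._process_chunk(start, end, chunk, first=(start == 0))

    def _process_chunk(self, start, end, data, first):
        raise NotImplementedError("sub-classes supply the per-chunk work")

    def validate(self):
        self.calls.append("validate")

    def finalize(self):
        self.calls.append("finalize")


class ColorToyEstimator(ToyCatEstimator):
    """Hypothetical sub-class: stores one placeholder number per object."""

    def _process_chunk(self, start, end, data, first):
        colors = [g - r for g, r in zip(data["mag_g"], data["mag_r"])]
        self.data["output"].extend(colors)


catalog = {
    "mag_g": [24.0, 23.5, 25.0, 22.0, 24.5, 23.0, 26.0],
    "mag_r": [23.0, 23.0, 24.0, 21.5, 24.0, 22.0, 25.0],
}
stage = ColorToyEstimator(chunk_size=3)
result = stage.estimate(catalog)
```

With seven objects and a chunk size of three, the toy run() makes three calls to _process_chunk, covering rows 0-3, 3-6, and 6-7. The sub-class only has to supply the per-chunk work; the base class owns the loop.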

inputs = [('model', <class 'rail.core.data.ModelHandle'>), ('input', <class 'rail.core.data.TableHandle'>)]
name = 'CatEstimator'
outputs = [('output', <class 'rail.core.data.QPHandle'>)]

run()

Run the stage and return the execution status.

Subclasses must implement this method.

Return type:

None

class rail.estimation.estimator.PzEstimator

Bases: RailStage, PointEstimationMixin

The base class for making photo-z posterior estimates from other pz inputs

Estimators use a generic “model”, the details of which depend on the sub-class.

Estimators take a QPEnsemble containing other estimates as “input” and provide a QPEnsemble with per-object p(z) as “output”.
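
To make the ensemble-in, ensemble-out shape concrete, the toy sketch below treats an ensemble as a list of p(z) arrays on a shared grid and returns a new, renormalized p(z) for each object. Plain lists stand in for the QPEnsemble, and the function names and the weighting function are invented for illustration; a real sub-class would work with qp objects and its own model.

```python
def normalize(pdf, dz):
    """Scale a gridded p(z) so its rectangle-rule integral is 1."""
    area = sum(pdf) * dz
    return [p / area for p in pdf]


def reweight_ensemble(zgrid, pdfs, weight):
    """Multiply each input p(z) by weight(z), then renormalize."""
    dz = zgrid[1] - zgrid[0]  # assumes an evenly spaced grid
    return [
        normalize([p * weight(z) for z, p in zip(zgrid, pdf)], dz)
        for pdf in pdfs
    ]


zgrid = [0.0, 0.5, 1.0, 1.5, 2.0]
input_pdfs = [
    [0.0, 1.0, 2.0, 1.0, 0.0],  # object 0: peaked near z = 1
    [1.0, 1.0, 1.0, 1.0, 1.0],  # object 1: flat
]
# Hypothetical weighting that favours higher redshift
output_pdfs = reweight_ensemble(zgrid, input_pdfs, weight=lambda z: 1.0 + z)
```

The output has one p(z) per input object on the same grid, which matches the shape of the input/output contract described above.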

Parameters:
  • output_mode ([str] default=default) – What to do with the outputs. The options are ‘default’, where outputs will be written to files and some returned, and ‘return’, where outputs will only be returned and not written.

  • chunk_size ([int] default=10000) – Number of objects per chunk for parallel processing, or to evaluate per loop in single-node processing

  • hdf5_groupname ([str] default=photometry) – Name of the HDF5 group for the data; if None, then set to ‘’

  • calculated_point_estimates ([list] default=[]) – List of strings defining which point estimates to automatically calculate using qp.Ensemble. Options include ‘mean’, ‘mode’, and ‘median’.

  • recompute_point_estimates ([bool] default=False) – Force recomputation of point estimates

  • model (ModelHandle (INPUT))

  • input (QPHandle (INPUT))

  • output (QPHandle (OUTPUT))

__init__(args, **kwargs)

Initialize Estimator

Parameters:
  • args (Any)

  • kwargs (Any)

Return type:

None

entrypoint_function: str | None = 'estimate'

estimate(input_data, **kwargs)

The main interface method for the photo-z estimation

This will attach the input data (defined in inputs as “input”) to this Estimator (for introspection and provenance tracking). It will then call the run(), validate(), and finalize() methods.

The run method will call _process_chunk(), which needs to be implemented in the subclass, to process input data in batches. See RandomGaussEstimator for a simple example.

Finally, this will return a QPHandle for access to that output data.

Parameters:

input_data (QPHandle) – A dictionary of all input data

Returns:

Handle providing access to QP ensemble with output data

Return type:

QPHandle

inputs = [('model', <class 'rail.core.data.ModelHandle'>), ('input', <class 'rail.core.data.QPHandle'>)]
name = 'PzEstimator'
outputs = [('output', <class 'rail.core.data.QPHandle'>)]

run()

Run the stage and return the execution status.

Subclasses must implement this method.

Return type:

None