rail.estimation.algos.gpz module

RAIL wrapping of Peter Hatfield’s version of GPz, which can be found at: https://github.com/pwhatfield/GPz_py3

class rail.estimation.algos.gpz.GPzEstimator

Bases: CatEstimator

Estimate stage for GPz_v1

Parameters:
  • output_mode ([str] default=default) – What to do with the outputs. The options are ‘default’, where outputs will be written to files and some returned, and ‘return’, where outputs will only be returned and not written.

  • chunk_size ([int] default=10000) – Number of objects per chunk for parallel processing, or to evaluate per loop in single-node processing

  • hdf5_groupname ([str] default=photometry) – name of the HDF5 group for data; if None, it is set to ‘’

  • zmin ([float] default=0.0)

  • zmax ([float] default=3.0)

  • nzbins ([int] default=301)

  • id_col ([str] default=object_id) – name of the object ID column

  • redshift_col ([str] default=redshift) – name of redshift column

  • calc_summary_stats ([bool] default=False) – Compute summary statistics

  • calculated_point_estimates ([list] default=[]) – List of strings defining which point estimates to automatically calculate using qp.Ensemble. Options include ‘mean’, ‘mode’, and ‘median’.

  • recompute_point_estimates ([bool] default=False) – Force recomputation of point estimates

  • nondetect_val ([float] default=99.0)

  • mag_limits ([dict] default={'mag_u_lsst': 27.79, 'mag_g_lsst': 29.04, 'mag_r_lsst': 29.06, 'mag_i_lsst': 28.62, 'mag_z_lsst': 27.98, 'mag_y_lsst': 27.05})

  • bands ([list] default=['mag_u_lsst', 'mag_g_lsst', 'mag_r_lsst', 'mag_i_lsst', 'mag_z_lsst', 'mag_y_lsst'])

  • err_bands ([list] default=['mag_err_u_lsst', 'mag_err_g_lsst', 'mag_err_r_lsst', 'mag_err_i_lsst', 'mag_err_z_lsst', 'mag_err_y_lsst'])

  • ref_band ([str] default=mag_i_lsst)

  • log_errors ([bool] default=True) – if True, take the log of the magnitude errors

  • replace_error_vals ([list] default=[0.1, 0.1, 0.1, 0.1, 0.1, 0.1])

  • model (ModelHandle (INPUT))

  • input (TableHandle (INPUT))

  • output (QPHandle (OUTPUT))
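The zmin, zmax, and nzbins parameters together describe a redshift grid, and calculated_point_estimates names statistics of a PDF evaluated on such a grid. The sketch below illustrates what those quantities mean using NumPy only. It assumes a linearly spaced grid that includes both endpoints and uses a toy Gaussian PDF; it does not call RAIL or qp, and the stage itself may construct its grid and point estimates differently.

```python
import numpy as np

# Default values taken from the parameter list above.
zmin, zmax, nzbins = 0.0, 3.0, 301

# Assumption: a linear grid that includes both endpoints.
zgrid = np.linspace(zmin, zmax, nzbins)
dz = zgrid[1] - zgrid[0]

# Toy Gaussian p(z) centred at z = 1.0 with width 0.1 (illustrative only).
pdf = np.exp(-0.5 * ((zgrid - 1.0) / 0.1) ** 2)
pdf /= pdf.sum() * dz  # normalise so the Riemann sum over the grid is 1

# Grid-based versions of the 'mode', 'mean', and 'median' point estimates.
z_mode = zgrid[np.argmax(pdf)]
z_mean = np.sum(zgrid * pdf) * dz
cdf = np.cumsum(pdf) * dz
z_median = zgrid[np.searchsorted(cdf, 0.5)]

print(z_mode, z_mean, z_median)
```

With the defaults, the grid spacing works out to 0.01 in redshift, which bounds the resolution of any grid-based mode or median.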

__init__(args, **kwargs)

Constructor: Do CatEstimator specific initialization

entrypoint_function: str | None = 'estimate'
interactive_function: str | None = 'gpz_estimator'
name = 'GPzEstimator'

class rail.estimation.algos.gpz.GPzInformer

Bases: CatInformer

Inform stage for GPz_v1

Parameters:
  • output_mode ([str] default=default) – What to do with the outputs. The options are ‘default’, where outputs will be written to files and some returned, and ‘return’, where outputs will only be returned and not written.

  • hdf5_groupname ([str] default=photometry) – name of the HDF5 group for data; if None, it is set to ‘’

  • nondetect_val ([float] default=99.0)

  • mag_limits ([dict] default={'mag_u_lsst': 27.79, 'mag_g_lsst': 29.04, 'mag_r_lsst': 29.06, 'mag_i_lsst': 28.62, 'mag_z_lsst': 27.98, 'mag_y_lsst': 27.05})

  • train_frac ([float] default=0.75) – fraction of the training data used to train the model; the rest is used for validation

  • seed ([int] default=87) – random seed

  • bands ([list] default=['mag_u_lsst', 'mag_g_lsst', 'mag_r_lsst', 'mag_i_lsst', 'mag_z_lsst', 'mag_y_lsst'])

  • err_bands ([list] default=['mag_err_u_lsst', 'mag_err_g_lsst', 'mag_err_r_lsst', 'mag_err_i_lsst', 'mag_err_z_lsst', 'mag_err_y_lsst'])

  • redshift_col ([str] default=redshift)

  • gpz_method ([str] default=VC) – method to be used in GPz, options are ‘GL’, ‘VL’, ‘GD’, ‘VD’, ‘GC’, and ‘VC’

  • n_basis ([int] default=50) – number of basis functions used

  • learn_jointly ([bool] default=True) – if True, jointly learns prior linear mean function

  • hetero_noise ([bool] default=True) – if True, learns a heteroscedastic noise process; set to False for point estimates

  • csl_method ([str] default=normal) – cost sensitive learning type, ‘balanced’, ‘normalized’, or ‘normal’

  • csl_binwidth ([float] default=0.1) – width of bin for ‘balanced’ cost sensitive learning

  • pca_decorrelate ([bool] default=True) – if True, decorrelate data using PCA as preprocessing stage

  • max_iter ([int] default=200) – max number of iterations

  • max_attempt ([int] default=100) – maximum number of iterations allowed without progress on validation

  • log_errors ([bool] default=True) – if True, take the log of the magnitude errors

  • replace_error_vals ([list] default=[0.1, 0.1, 0.1, 0.1, 0.1, 0.1])

  • input (TableHandle (INPUT))

  • model (ModelHandle (OUTPUT))
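The nondetect_val, mag_limits, replace_error_vals, and log_errors parameters together suggest a preprocessing step applied to the photometry before GPz sees it. The following is a minimal sketch of how such a step could look, written with NumPy only. The function name preprocess, the two-band toy catalogue, and the rule of treating both NaN and nondetect_val as non-detections are assumptions made for illustration; the actual RAIL implementation may flag and replace values differently.

```python
import numpy as np

# Values mirroring the documented defaults, trimmed to two bands for brevity.
nondetect_val = 99.0
mag_limits = {"mag_u_lsst": 27.79, "mag_g_lsst": 29.04}
bands = ["mag_u_lsst", "mag_g_lsst"]
err_bands = ["mag_err_u_lsst", "mag_err_g_lsst"]
replace_error_vals = [0.1, 0.1]
log_errors = True

# Toy catalogue (hypothetical values): one 99.0 non-detection and one NaN.
data = {
    "mag_u_lsst": np.array([24.1, 99.0, 25.3]),
    "mag_err_u_lsst": np.array([0.05, 99.0, 0.10]),
    "mag_g_lsst": np.array([23.8, 24.9, np.nan]),
    "mag_err_g_lsst": np.array([0.02, 0.04, np.nan]),
}


def preprocess(table):
    """Hypothetical helper: replace non-detections and optionally log the errors."""
    out = {}
    for band, err_band, err_val in zip(bands, err_bands, replace_error_vals):
        mag = table[band].copy()
        err = table[err_band].copy()
        # Assumption: NaN or nondetect_val in the magnitude marks a non-detection.
        bad = np.isnan(mag) | np.isclose(mag, nondetect_val)
        mag[bad] = mag_limits[band]  # substitute the band's limiting magnitude
        err[bad] = err_val           # substitute the band's replacement error
        out[band] = mag
        out[err_band] = np.log(err) if log_errors else err
    return out


clean = preprocess(data)
print(clean["mag_u_lsst"], clean["mag_err_u_lsst"])
```

Replacing the errors before taking the log keeps the log finite for every object, which is one reason to order the two operations this way in the sketch.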

__init__(args, **kwargs)

Constructor: Do CatInformer specific initialization

entrypoint_function: str | None = 'inform'
interactive_function: str | None = 'gpz_informer'
name = 'GPzInformer'

run()

Train the GPz model after splitting the training data into training and validation sets.
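The split performed by run() is governed by the train_frac and seed parameters. As a rough illustration, a seeded random split could be sketched as below. The use of NumPy's default_rng, the rounding rule, and the catalogue size n_obj are assumptions for this example; the stage may use a different random number generator and therefore select different objects for the same seed.

```python
import numpy as np

# Documented defaults for the informer stage.
train_frac = 0.75
seed = 87

# Hypothetical catalogue size, chosen only for illustration.
n_obj = 1000

# Shuffle the object indices reproducibly, then cut at the training fraction.
rng = np.random.default_rng(seed)
perm = rng.permutation(n_obj)
n_train = int(round(train_frac * n_obj))

train_idx = perm[:n_train]  # objects used to fit the model
val_idx = perm[n_train:]    # held-out objects used for validation

print(len(train_idx), len(val_idx))
```

Fixing the seed makes the split reproducible across runs, so that changes in validation performance reflect changes in the other parameters rather than a different partition of the data.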