rail.core.data module

Rail-specific data management

rail.core.data.DATA_STORE()[source]

Return the factory instance

class rail.core.data.DataHandle(tag, data=None, path=None, creator=None)[source]

Bases: object

Class to act as a handle for a bit of data, associating it with a file and providing tools to read and write it to that file

Parameters:
  • tag (str) – The tag under which this data handle can be found in the store

  • data (any or None) – The associated data

  • path (str or None) – The path to the associated file

  • creator (str or None) – The name of the stage that created this data handle

close(**kwargs)[source]

Close the associated file

data_handle_type_dict = {'FitsHandle': <class 'rail.core.data.FitsHandle'>, 'Hdf5Handle': <class 'rail.core.data.Hdf5Handle'>, 'ModelHandle': <class 'rail.core.data.ModelHandle'>, 'PqHandle': <class 'rail.core.data.PqHandle'>, 'QPDictHandle': <class 'rail.core.data.QPDictHandle'>, 'QPHandle': <class 'rail.core.data.QPHandle'>, 'QPOrTableHandle': <class 'rail.core.data.QPOrTableHandle'>, 'TableHandle': <class 'rail.core.data.TableHandle'>}
data_size(**kwargs)[source]

Return the size of the in-memory data

finalize_write(**kwargs)[source]

Finalize and close file written by chunks

classmethod get_sub_class(class_name)[source]

Get a particular subclass by name

property has_data

Return true if the data for this handle are loaded

property has_path

Return true if the path for the associated file is defined

initialize_write(data_length, **kwargs)[source]

Initialize file to be written by chunks

property is_written

Return true if the associated file has been written

iterator(**kwargs)[source]

Iterator over the data

classmethod make_name(tag)[source]

Construct and return file name for a particular data tag

open(**kwargs)[source]

Open and return the associated file

Notes

This will simply open the file and return a file-like object to the caller. It will not read or cache the data

classmethod print_sub_classes()[source]

Print the list of all the subclasses
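The name-based lookup offered by get_sub_class, together with make_name, can be illustrated with a small self-contained sketch. Everything here is a hypothetical stand-in (ToyHandle, ToyHdf5Handle, ToyPqHandle, type_dict); it does not import RAIL and is not RAIL's implementation. It assumes that subclasses register themselves when defined and that a file name is the tag plus the class suffix.

```python
class ToyHandle:
    """Hypothetical stand-in for a handle base class (not RAIL's code)."""

    # Maps class name -> class, similar in spirit to data_handle_type_dict
    type_dict = {}
    suffix = ""

    def __init_subclass__(cls, **kwargs):
        # Assumption: each subclass registers itself when it is defined
        super().__init_subclass__(**kwargs)
        ToyHandle.type_dict[cls.__name__] = cls

    @classmethod
    def get_sub_class(cls, class_name):
        """Look up a registered subclass by name."""
        try:
            return cls.type_dict[class_name]
        except KeyError as err:
            known = sorted(cls.type_dict)
            raise KeyError(f"Unknown class {class_name!r}; known: {known}") from err

    @classmethod
    def make_name(cls, tag):
        """Assumption: the file name is the tag plus the class suffix, if any."""
        return f"{tag}.{cls.suffix}" if cls.suffix else tag


class ToyHdf5Handle(ToyHandle):
    suffix = "hdf5"


class ToyPqHandle(ToyHandle):
    suffix = "pq"


handle_class = ToyHandle.get_sub_class("ToyPqHandle")
file_name = handle_class.make_name("photometry")
```

Looking classes up by name in this way lets a configuration file refer to a handle type with a plain string.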

read(force=False, **kwargs)[source]

Read and return the data from the associated file

set_data(data, partial=False)[source]

Set the data for a chunk, and set the partial flag to true

size(**kwargs)[source]

Return the size of the data associated to this handle

suffix = ''
write(**kwargs)[source]

Write the data to the associated file

write_chunk(start, end, **kwargs)[source]

Write one chunk of the data, covering the range from start to end, to the associated file
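The chunked-write methods (initialize_write, set_data with partial=True, write_chunk, finalize_write) are presumably meant to be called in sequence. The sketch below shows one plausible calling order using a hypothetical in-memory stand-in, ToyChunkedWriter; it writes to a Python list rather than a file and makes no claim about how RAIL implements these methods.

```python
class ToyChunkedWriter:
    """Hypothetical stand-in showing an initialize / write_chunk / finalize sequence."""

    def __init__(self):
        self.buffer = None    # stands in for the output file
        self.data = None      # the chunk currently held in memory
        self.partial = False
        self.closed = False

    def initialize_write(self, data_length):
        # Reserve room for the full output before any chunk arrives
        self.buffer = [None] * data_length

    def set_data(self, data, partial=False):
        # Hold one chunk and record that it is only part of the data
        self.data = data
        self.partial = partial

    def write_chunk(self, start, end):
        # Copy the current chunk into rows [start, end) of the output
        if end - start != len(self.data):
            raise ValueError("chunk length does not match [start, end)")
        self.buffer[start:end] = self.data

    def finalize_write(self):
        # Refuse to close while some rows are still unwritten
        if any(item is None for item in self.buffer):
            raise RuntimeError("some rows were never written")
        self.closed = True


data = list(range(10))
chunk_size = 4

writer = ToyChunkedWriter()
writer.initialize_write(len(data))
for start in range(0, len(data), chunk_size):
    end = min(start + chunk_size, len(data))
    writer.set_data(data[start:end], partial=True)
    writer.write_chunk(start, end)
writer.finalize_write()
```

Declaring the total length up front is what allows each chunk to be placed at its final position without holding the full data set in memory.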

class rail.core.data.DataStore(**kwargs)[source]

Bases: dict

Class to provide a transient data store

This class:

  1. associates data products with keys

  2. provides functions to read and write the various data products to associated files

add_data(key, data, handle_class, path=None, creator='DataStore')[source]

Create a handle for some data, and insert it into the DataStore

add_handle(key, handle_class, path, creator='DataStore')[source]

Create a handle for the file at the given path, and insert it into the DataStore

allow_overwrite = False
open(key, mode='r', **kwargs)[source]

Open and return the file associated to a particular key

read(key, force=False, **kwargs)[source]

Read the data associated to a particular key

read_file(key, handle_class, path, creator='DataStore', **kwargs)[source]

Create a handle, use it to read a file, and insert it into the DataStore

write(key, **kwargs)[source]

Write the data associated to a particular key

write_all(force=False, **kwargs)[source]

Write all the data in this DataStore
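As a rough illustration of a dict-based store with an allow_overwrite flag, the sketch below uses a hypothetical ToyStore class. It assumes that the flag guards existing keys against replacement; that reading of allow_overwrite is an assumption, and the code is not RAIL's implementation.

```python
class ToyStore(dict):
    """Hypothetical stand-in for a keyed store with an overwrite guard."""

    allow_overwrite = False

    def __setitem__(self, key, value):
        # Assumption: an existing key is protected unless allow_overwrite is set
        if key in self and not self.allow_overwrite:
            raise ValueError(f"key {key!r} is already in the store")
        super().__setitem__(key, value)

    def add_data(self, key, data):
        # Insert the data under the key and hand it back to the caller
        self[key] = data
        return data


store = ToyStore()
store.add_data("input", [1, 2, 3])

try:
    store.add_data("input", [4, 5, 6])
    overwrite_blocked = False
except ValueError:
    overwrite_blocked = True

# Opt in to overwriting on this one instance only
store.allow_overwrite = True
store.add_data("input", [4, 5, 6])
```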

class rail.core.data.FitsHandle(tag, data=None, path=None, creator=None)[source]

Bases: TableHandle

DataHandle for a table written to fits

suffix = 'fits'
class rail.core.data.Hdf5Handle(tag, data=None, path=None, creator=None)[source]

Bases: TableHandle

DataHandle for a table written to HDF5

suffix = 'hdf5'
class rail.core.data.ModelDict[source]

Bases: dict

A specialized dict to keep track of individual estimation model objects. This is just a dict with these additional features:

  1. Keys are paths

  2. There is a read(path, force=False) method that reads a model object and inserts it into the dictionary

  3. There is a single static instance of this class
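The path-keyed caching described above can be sketched with a hypothetical ToyModelDict and a fake reader function that records each call. The reader argument and the exact caching rule are assumptions made for illustration; no file is touched and RAIL is not imported.

```python
class ToyModelDict(dict):
    """Hypothetical stand-in: a dict keyed by path that caches loaded models."""

    def read(self, path, reader, force=False):
        # Reuse the cached object unless the caller forces a re-read
        if force or path not in self:
            self[path] = reader(path)
        return self[path]


calls = []


def fake_reader(path):
    # Record every real "read" so the caching behaviour is visible
    calls.append(path)
    return {"loaded_from": path}


models = ToyModelDict()
first = models.read("model_a.pkl", fake_reader)
second = models.read("model_a.pkl", fake_reader)             # served from the cache
third = models.read("model_a.pkl", fake_reader, force=True)  # reader is called again
```

With a single shared instance, every stage that asks for the same path would receive the same model object rather than loading its own copy.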

open(path, mode, **kwargs)[source]

Open the file and return the file handle

read(path, force=False, reader=None, **kwargs)[source]

Read a model into this dict

write(model, path, force=False, writer=None, **kwargs)[source]

Write the model. This default implementation uses pickle.

class rail.core.data.ModelHandle(tag, data=None, path=None, creator=None)[source]

Bases: DataHandle

DataHandle for machine learning models

model_factory = {}
suffix = 'pkl'
class rail.core.data.PqHandle(tag, data=None, path=None, creator=None)[source]

Bases: TableHandle

DataHandle for a parquet table

suffix = 'pq'
class rail.core.data.QPDictHandle(tag, data=None, path=None, creator=None)[source]

Bases: DataHandle

DataHandle for dictionaries of qp ensembles

suffix = 'hdf5'
class rail.core.data.QPHandle(tag, data=None, path=None, creator=None)[source]

Bases: DataHandle

DataHandle for qp ensembles

suffix = 'hdf5'
class rail.core.data.QPOrTableHandle(tag, data=None, path=None, creator=None)[source]

Bases: QPHandle, Hdf5Handle

DataHandle that should work with either qp.ensembles or tables

class PdfOrValue(value)[source]

Bases: Enum

An enumeration.

both = 2
distribution = 0
has_dist()[source]
has_point()[source]
point_estimate = 1
unknown = -1
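The enumeration values above can be reproduced in a small stand-in to show how has_dist() and has_point() might behave. ToyPdfOrValue is hypothetical, and the meaning given to the two methods (true for the matching member and for both) is an assumption, since the listing does not document them.

```python
from enum import Enum


class ToyPdfOrValue(Enum):
    """Hypothetical stand-in using the member values listed above."""

    unknown = -1
    distribution = 0
    point_estimate = 1
    both = 2

    def has_dist(self):
        # Assumed meaning: a full distribution is available
        return self in (ToyPdfOrValue.distribution, ToyPdfOrValue.both)

    def has_point(self):
        # Assumed meaning: a point estimate is available
        return self in (ToyPdfOrValue.point_estimate, ToyPdfOrValue.both)


kind = ToyPdfOrValue(2)  # look up a member by its value
```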
check_pdf_or_point()[source]

Check the associated file to see if it is a QP pdf, point estimate or both

is_qp()[source]

Check if the associated data or file is a QP ensemble

suffix = 'hdf5'
class rail.core.data.TableHandle(tag, data=None, path=None, creator=None)[source]

Bases: DataHandle

DataHandle for single tables of data

set_data(data, partial=False)[source]

Set the data for a chunk, and set the partial flag to true

suffix = None
rail.core.data.default_model_read(modelfile)[source]

Default function to read model files; it simply uses pickle.load

rail.core.data.default_model_write(model, path)[source]

Write the model. This default implementation uses pickle.
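Since both default functions are described as relying on pickle, a round trip can be sketched with the standard library alone. The functions toy_model_write and toy_model_read are hypothetical stand-ins written for this example; they are not the RAIL functions and may differ from them in detail.

```python
import pickle
import tempfile
from pathlib import Path


def toy_model_write(model, path):
    """Hypothetical stand-in: serialize a model object with pickle."""
    with open(path, "wb") as fout:
        pickle.dump(model, fout)


def toy_model_read(modelfile):
    """Hypothetical stand-in: load a pickled model object."""
    with open(modelfile, "rb") as fin:
        return pickle.load(fin)


model = {"name": "toy", "coefficients": [1.0, -0.5]}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "toy_model.pkl"
    toy_model_write(model, path)
    restored = toy_model_read(path)
```

Pickle can execute arbitrary code while loading, so model files should only be read from trusted sources.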