tkp.db.orm – Object Relational Model interface

This module contains lightweight container objects that correspond to a dataset, image or extracted source in the database; it is effectively a mini Object Relational Mapper (ORM). Each object is matched to its table row through its private _id attribute.

Each dataset contains several database Images; each Image contains a number of ExtractedSources. The database Images correspond to the images table in the database, not to sourcefinder images or actual image data files on disk. This distinction is important: the three have parts in common, but several parts differ.

The current setup is done in large part to keep the database and sourcefinder (and other parts of the TKP package) separate; tightly integrated database tables/sourcefinder images/disk files make it more difficult to improve the code or distribute parts separately.

Usage

In practice, a DataSet object is created, and separate Images are created referencing that DataSet() instance; ids are automatically assigned where necessary (i.e., on creation of a new entry (row) in the database).

Objects can also be created using an existing id; data is then taken from the corresponding table row in the database.

Creating new objects

The following code is a usage example, but should not be used as a doctest (since the database values can differ, and thus the test would fail):

# imports used by the examples below
>>> from datetime import datetime
>>> import tkp.db.database
>>> from tkp.db.orm import DataSet, Image

# database sets up and holds the connection to the actual database
>>> database = tkp.db.database.Database()

# Each object type takes a data dictionary on creation, which for new objects
# has some required keys (& values). For a DataSet, this is only 'description';
# for an Image, the keys are 'freq_eff', 'freq_bw', 'taustart_ts',
# 'tau_time' & 'url'.
# The required keys are listed in the REQUIRED attribute
>>> dataset = DataSet(data={'description': 'a dataset'}, database=database)

# Here, dataset indirectly holds the database connection:
>>> dataset.database
DataBase(host=heastro1, name=trap, user=trap, ...)
>>> image1 = Image(data={'freq_eff': '80e6', 'freq_bw': 1e6,
...                      'taustart_ts': datetime(2011, 5, 1, 0, 0, 0),
...                      'tau_time': 1800., 'url': '/'},
...                dataset=dataset)  # initialize with defaults
# note the dataset kwarg, which holds the database connection
>>> image1.tau_time
1800.0
>>> image1.taustart_ts
datetime.datetime(2011, 5, 1, 0, 0)
>>> image2 = Image(data={'freq_eff': '80e6', 'freq_bw': 1e6,
...                      'taustart_ts': datetime(2011, 5, 1, 0, 1, 0),
...                      'tau_time': 1500., 'url': '/'},
...                dataset=dataset)
>>> image2.tau_time
1500.0
>>> image2.taustart_ts
datetime.datetime(2011, 5, 1, 0, 1)
# Images created with a dataset object are automatically added to that dataset:
>>> dataset.images
set([<tkp.database.dataset.Image object at 0x26fb6d0>, <tkp.database.dataset.Image object at 0x26fb790>])

Updating objects

To update objects, use the update() method.

This method does two things, in the following order:

1. it updates from the database to the object: if there have been changes in the database, the object will reflect that after executing update()

2. then, it updates the object (and the database) with values supplied by the user. The latter values are optional; no supplied values simply means there aren’t any updates.

>>> image2.update(tau_time=2500)    # updates the database as well
>>> image2.tau_time
2500
>>> database.cursor.execute("SELECT tau_time FROM images WHERE imageid=%s" %
...                         (image2.id,))
>>> database.cursor.fetchone()[0]
2500
# Manually update the database
>>> database.cursor.execute("UPDATE images SET tau_time=2000.0 WHERE imageid=%s" %
...                         (image2.id,))
>>> image2.tau_time   # not updated yet!
2500
>>> image2.update()
>>> image2.tau_time
2000

Assigning objects to a table row on creation

It is also possible to create a DataSet, Image or ExtractedSource instance from the database, using the id in the initializer:

>>> dataset2 = DataSet(id=dataset.id, database=database)
>>> image3 = Image(id=image2.id, database=database)
>>> image3.tau_time
2000

If an id is supplied, data is ignored.

class tkp.db.orm.DBObject(data=None, database=None, id=None)

Generic mini-ORM object

Derived objects will need to implement __init__, which for practical reasons is split up into __init__ and _init_data: the latter is called at the end of __init__, so a derived __init__ would have super(Derived, self).__init__() at the start and super(Derived, self)._init_data() at the end.

__init__ takes care of setting the id, the supplied data dictionary and the connection to the database.

_init_data sets the actual data either from the database (in case of a supplied id) or from the data dictionary.

Basic initialization.

Inherited classes need to implement any actual database action, by calling self._init_data() at the end of their __init__ method.
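The split between __init__ and _init_data can be sketched as follows. This is a minimal illustration of the pattern described above, not the tkp code: the class names SketchBase and SketchImage are hypothetical, and a plain dict stands in for the database.

```python
class SketchBase(object):
    """Hypothetical stand-in for DBObject; not the tkp implementation."""

    def __init__(self, data=None, database=None, id=None):
        # Step 1: only store the id, the data dictionary and the connection.
        self._id = id
        self._data = {} if data is None else dict(data)
        self.database = database

    def _init_data(self):
        # Step 2: fill attributes from the "database" when an id was given,
        # otherwise from the supplied data dictionary.
        if self._id is not None:
            source = self.database[self._id]
        else:
            source = self._data
        for key, value in source.items():
            setattr(self, key, value)


class SketchImage(SketchBase):
    def __init__(self, data=None, dataset=None, database=None, id=None):
        super(SketchImage, self).__init__(data=data, database=database, id=id)
        # Derived-class setup goes between the two calls.
        self.dataset = dataset
        super(SketchImage, self)._init_data()


# A plain dict plays the role of the database table (assumption for the sketch).
fake_table = {7: {"tau_time": 2000.0, "url": "/"}}
from_row = SketchImage(id=7, database=fake_table, data={"tau_time": 1.0})
from_data = SketchImage(data={"tau_time": 1800.0, "url": "/"})
```

Here from_row takes its values from the stored row and ignores the data argument, while from_data takes them from the dictionary.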

id

Add or obtain an id to/from the table

The id is generated if self._id does not exist, effectively creating a new row in the database.

Several containers have their own specific SQL function to create a new object, so this property will need to be overridden.
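The lazy behaviour of the id property might look roughly like the sketch below. It assumes an in-memory SQLite database and a hypothetical SketchDataSet class with a plain INSERT statement; the real containers call their own SQL functions instead.

```python
import sqlite3


class SketchDataSet(object):
    """Hypothetical stand-in; not the tkp DataSet class."""

    def __init__(self, connection, description, id=None):
        self.connection = connection
        self.description = description
        self._id = id

    @property
    def id(self):
        # Insert a row only on first access, and only if no id is known yet.
        if self._id is None:
            cursor = self.connection.execute(
                "INSERT INTO dataset (description) VALUES (?)",
                (self.description,),
            )
            self.connection.commit()
            self._id = cursor.lastrowid
        return self._id


connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE dataset (id INTEGER PRIMARY KEY, description TEXT)"
)
dataset = SketchDataSet(connection, "a dataset")
first = dataset.id    # creates the row
second = dataset.id   # reuses the stored id; no second row is inserted
row_count = connection.execute("SELECT COUNT(*) FROM dataset").fetchone()[0]
```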

update(**kwargs)

Update attributes from database, and set database values to kwargs when provided

This method performs two functions, the first always and the second optionally after the first:

  • it updates the attributes from the database. That is, it makes sure the Python instance is synchronized with the database.
  • (optional): it sets the column values in the database to the values provided through kwargs, for the associated database row. Attributes for the instance are of course also set to these values. Any kwargs that do not correspond to a column name are simply ignored.

This function therefore first updates the instance from the database, and then optionally the database from the instance (with the provided keyword arguments).
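The two-step behaviour can be illustrated with a small self-contained sketch. It assumes an in-memory SQLite table and a hypothetical SketchRow class with a fixed COLUMNS tuple; none of these names come from tkp.

```python
import sqlite3


class SketchRow(object):
    """Hypothetical stand-in for an image row wrapper; not the tkp code."""

    COLUMNS = ("tau_time", "url")  # assumed column subset for the sketch

    def __init__(self, connection, id):
        self.connection = connection
        self._id = id
        self.update()

    def update(self, **kwargs):
        # Step 1 (always): refresh the attributes from the database row.
        row = self.connection.execute(
            "SELECT tau_time, url FROM images WHERE id = ?", (self._id,)
        ).fetchone()
        for name, value in zip(self.COLUMNS, row):
            setattr(self, name, value)
        # Step 2 (optional): write known columns back; ignore other kwargs.
        for name, value in kwargs.items():
            if name not in self.COLUMNS:
                continue
            # name has been checked against the fixed COLUMNS tuple, so
            # formatting it into the statement is safe here.
            self.connection.execute(
                "UPDATE images SET %s = ? WHERE id = ?" % name,
                (value, self._id),
            )
            setattr(self, name, value)
        self.connection.commit()


connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE images (id INTEGER PRIMARY KEY, tau_time REAL, url TEXT)"
)
connection.execute("INSERT INTO images VALUES (1, 1500.0, '/')")
image = SketchRow(connection, 1)

image.update(tau_time=2500.0, not_a_column="ignored")
after_user_update = image.tau_time

# Change the row behind the object's back; the attribute is stale until update().
connection.execute("UPDATE images SET tau_time = 2000.0 WHERE id = 1")
stale = image.tau_time
image.update()
refreshed = image.tau_time
```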

class tkp.db.orm.DataSet(data=None, database=None, id=None)

Class corresponding to the dataset table in the database

If id is supplied, the data argument is ignored.

frequency_bands()

Return a list of distinct bands present in the dataset.

id

Add or obtain an id to/from the table

This uses the SQL function insertDataset().

runcat_entries()

Returns: a list of dictionaries representing rows in runningcatalog, for all sources belonging to this dataset

Column ‘id’ is returned with the key ‘runcat’

Currently only returns 3 columns: [{‘runcat’, ‘xtrsrc’, ‘datapoints’}]

Return type: list

update_images()

Renew the set of images by getting the images for this dataset from the database. Implemented separately from update(), since normally this would be too much overhead

class tkp.db.orm.ExtractedSource(data=None, image=None, database=None, id=None)

Class corresponding to the extractedsource table in the database

If id is supplied, the data and image arguments are ignored.

lightcurve()

Obtain the complete light curve (within the current dataset) for this source.

Returns: list of 5-tuples, each tuple being:
  • observation start time as a datetime.datetime object
  • integration time (float)
  • integrated flux (float)
  • integrated flux error (float)
  • database ID of this particular source

Return type: list
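
A light curve in this layout can be unpacked by position. The values below are made up for illustration, and the variable names are not part of the tkp API.

```python
from datetime import datetime

# Made-up light curve in the documented layout:
# (start time, integration time, integrated flux, flux error, source id)
lightcurve = [
    (datetime(2011, 5, 1, 0, 0, 0), 1800.0, 1.25, 0.05, 101),
    (datetime(2011, 5, 1, 0, 30, 0), 1800.0, 1.5, 0.05, 102),
]

# Unpack each 5-tuple by position.
times = [start for start, tau, flux, flux_err, xtrsrc_id in lightcurve]
fluxes = [flux for start, tau, flux, flux_err, xtrsrc_id in lightcurve]
peak_flux = max(fluxes)
```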

class tkp.db.orm.Image(data=None, dataset=None, database=None, id=None)

Class corresponding to the images table in the database

If id is supplied, the data and dataset arguments are ignored.

associate_extracted_sources(deRuiter_r, new_source_sigma_margin)

Associate sources from the last images with previously extracted sources within the same dataset

Parameters: deRuiter_r (float) – The De Ruiter radius for source association. The default value is set through the tkp.config module

id

Add or obtain an id to/from the table

This uses the SQL function insertImage()

insert_extracted_sources(results, extract='blind')

Insert a list of sources

Parameters:
  • results (list) – list of utility.containers.ExtractionResult objects (as returned from sourcefinder.image.ImageData().extract()), or a list of data tuples with the source information as follows: (ra, dec, ra_fit_err, dec_fit_err, peak, peak_err, flux, flux_err, significance level, beam major width (as), beam minor width (as), beam parallactic angle, ew_sys_err, ns_sys_err, error_radius).
  • extract (str) – ‘blind’, ‘ff_nd’ or ‘ff_ms’ (see db.general.insert_extracted_sources)
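
One way to keep the 15 positional fields of a data tuple straight is to build it through a namedtuple and convert it back to a plain tuple. This is only a sketch: the field names for the items described in words (significance, beam_major, beam_minor, beam_angle) and all values are made up, and the result is converted to a plain tuple because the documentation does not say whether namedtuples are accepted.

```python
from collections import namedtuple

# Field names follow the documented order; names for fields the text
# describes in words (significance, beam_*) are made up for this sketch.
SourceTuple = namedtuple("SourceTuple", [
    "ra", "dec", "ra_fit_err", "dec_fit_err",
    "peak", "peak_err", "flux", "flux_err",
    "significance",
    "beam_major", "beam_minor", "beam_angle",
    "ew_sys_err", "ns_sys_err", "error_radius",
])

# Made-up values, for illustration only.
source = SourceTuple(
    ra=123.4, dec=45.6, ra_fit_err=0.001, dec_fit_err=0.001,
    peak=1.2, peak_err=0.1, flux=1.5, flux_err=0.1,
    significance=12.0,
    beam_major=30.0, beam_minor=20.0, beam_angle=45.0,
    ew_sys_err=10.0, ns_sys_err=10.0, error_radius=5.0,
)
results = [tuple(source)]  # plain tuples in the documented order
```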

update_rejected()

Update self.rejected with the rejected status: False if the image is not rejected, otherwise a list of rejection descriptions.

update_sources()

Renew the set of sources by getting the sources for this image from the database

This method is separately implemented, because it’s not always necessary and potentially (for an image with dozens or more sources) time & memory consuming.