tkp.main – Top-level pipeline logic flow

The main pipeline logic, from where all other components are called.

tkp.main.assocate_and_get_force_fits(db_image, job_config)[source]
tkp.main.close_database(dataset_id)[source]
tkp.main.consistency_check()[source]
tkp.main.do_forced_fits(runner, all_forced_fits)[source]
tkp.main.extract_fits_from_files(runner, paths)[source]
tkp.main.extract_metadata(job_config, accessors, runner)[source]
Parameters:
  • job_config – a TKP config object
  • accessors (tuple) – list of tkp.Accessor objects
  • runner (tkp.distribute.Runner) – the runner to use
Returns:

a list of metadata dicts

Return type:

tuple

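A minimal chaining sketch, assuming the accessors come from get_accessors and the runner from get_runner (both documented further down this page); the exact metadata keys are not specified here:

    from tkp.main import extract_metadata, get_accessors

    accessors = get_accessors(runner, all_images)
    metadatas = extract_metadata(job_config, accessors, runner)
    # metadatas is a list of metadata dicts, presumably one per accessor
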
tkp.main.get_accessors(runner, all_images)[source]
tkp.main.get_metadata_for_sorting(runner, image_paths)[source]

Group images by timestamp. All images are opened in parallel using the runner.

Parameters:
  • runner (tkp.distribute.Runner) – Runner to use for distribution
  • image_paths (tuple) – list of paths to the images
Returns:

a list of tuples, (timestamp, [list_of_images])

Return type:

tuple

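An illustrative sketch of the return shape (the file names are placeholders, and representing the timestamps as datetime objects is an assumption):

    groups = get_metadata_for_sorting(runner, image_paths)
    # e.g. [(datetime(2016, 1, 1, 12, 0), ['obs1_band1.fits', 'obs1_band2.fits']),
    #       (datetime(2016, 1, 1, 12, 1), ['obs2_band1.fits', 'obs2_band2.fits'])]
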
tkp.main.get_pipe_config(job_name)[source]
tkp.main.get_runner(pipe_config)[source]

Determine the parallelisation properties and initialise the distributor. Defaults to multiprocessing, with the number of cores autodetected.

One should not mix threads and multiprocessing, but AstroPy, for example, uses threads internally. Best practice is therefore to do multiprocessing first, and then threading within each process. This is why this function should be called as one of the first functions in the TraP pipeline's lifespan.
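
A minimal ordering sketch using only functions documented on this page ('my_job' is a placeholder job name):

    from tkp.main import get_pipe_config, get_runner

    pipe_config = get_pipe_config('my_job')
    # Initialise the distributor before anything (e.g. AstroPy) has spawned
    # threads, so that worker processes fork from a thread-free parent.
    runner = get_runner(pipe_config)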

tkp.main.initialise_dataset(job_config, supplied_mon_coords)[source]

Sets up a dataset in the database.

If the dataset already exists, it will return the job_config from the previous run of that dataset.

Parameters:
  • job_config – a job configuration object
  • supplied_mon_coords (tuple) – a list of monitoring positions
Returns:

job_config and dataset ID

Return type:

tuple

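An illustrative call, following the documented return order (job_config first, then the dataset ID):

    job_config, dataset_id = initialise_dataset(job_config, supplied_mon_coords)
    # On a re-run of an existing dataset, job_config is the configuration
    # stored by the previous run rather than the one passed in.
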
tkp.main.load_images(job_name, job_dir)[source]

Load all the images for a specific TraP job.

Returns:

a list of paths

Return type:

tuple

tkp.main.quality_check(db_images, accessors, job_config, runner)[source]

Returns:

a list of (db_image, accessor) tuples

Return type:

tuple

tkp.main.run(job_name, supplied_mon_coords=None)[source]

TKP pipeline main loop entry point.

Parameters:
  • job_name (str) – name of the job to run
  • supplied_mon_coords (tuple) – list of coordinates to monitor
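
A hypothetical invocation of the entry point; note that the (RA, Dec) decimal-degree format used for supplied_mon_coords below is an assumption, not something this page specifies:

    import tkp.main

    # Run the job 'my_job', monitoring one extra sky position.
    tkp.main.run('my_job', supplied_mon_coords=[(123.45, 67.89)])
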
tkp.main.run_batch(image_paths, job_config, runner, dataset_id, copy_images)[source]

Run the pipeline in batch mode.

Parameters:
  • image_paths (tuple) – list of paths to the images to process
  • job_config – a job configuration object
  • runner (tkp.distribute.Runner) – Runner to use for distribution
  • dataset_id (int) – The dataset ID to use
  • copy_images (bool) – whether to store copies of the image data in the database
tkp.main.run_stream(runner, job_config, dataset_id, copy_images)[source]

Run the pipeline in stream mode.

Daemon function, doesn’t return.

Parameters:
  • runner (tkp.distribute.Runner) – Runner to use for distribution
  • job_config – a job configuration object
  • dataset_id (int) – The dataset ID to use
tkp.main.setup(pipe_config, supplied_mon_coords=None)[source]

Initialises the pipeline run.

tkp.main.source_extraction(accessors, job_config, runner)[source]
tkp.main.store_extractions(images, extraction_results, job_config)[source]
tkp.main.store_image_data(db_images, fits_datas, fits_headers)[source]
tkp.main.store_image_metadata(metadatas, job_config, dataset_id)[source]
tkp.main.timestamp_step(runner, images, job_config, dataset_id, copy_images)[source]

Called from the main loop with all images belonging to a single timestep.

Parameters:
  • runner (tkp.distribute.Runner) – Runner to use for distribution
  • images (tuple) – list of things tkp.accessors can handle, like image paths or fits objects
  • job_config – a tkp job config object
  • dataset_id (int) – The tkp.db.model.Dataset id
Returns:

a list of tuples, (rms_qc, band)

Return type:

tuple

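A sketch of how a main loop might consume the per-timestep results, composed from functions on this page (variable names are placeholders):

    for timestamp, images in get_metadata_for_sorting(runner, image_paths):
        qc_results = timestamp_step(runner, images, job_config, dataset_id,
                                    copy_images)
        # qc_results is a list of (rms_qc, band) tuples
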
tkp.main.varmetric(dataset_id)[source]