staging¶
staging Package¶
Image processing preparation.
The staging package defines the functions used to prepare the study image files for import into XNAT, submission to the TCIA QIN collections and pipeline processing.
ctp_config
¶
-
qipipe.staging.ctp_config.
ctp_collection_for_name
(name)¶ Parameters: name – the QIN collection name Returns: the CTP collection name
fix_dicom
¶
-
qipipe.staging.fix_dicom.
COMMENT_PREFIX
= <_sre.SRE_Pattern object>¶ OHSU - the Image Comments tag value prefix.
-
qipipe.staging.fix_dicom.
DATE_FMT
= '%Y%m%d'¶ The DICOM date format is YYYYMMDD.
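As a brief illustration of the DATE_FMT value, the following sketch parses and formats a DICOM date with the Python 3 standard library. The helper names parse_dicom_date and format_dicom_date are hypothetical and are not part of qipipe.

```python
from datetime import date, datetime

# The DICOM date format documented above (YYYYMMDD).
DATE_FMT = '%Y%m%d'


def parse_dicom_date(value):
    """Parse a DICOM YYYYMMDD string into a date (hypothetical helper)."""
    return datetime.strptime(value, DATE_FMT).date()


def format_dicom_date(day):
    """Format a date as a DICOM YYYYMMDD string (hypothetical helper)."""
    return day.strftime(DATE_FMT)
```

A value which does not conform to the format, e.g. 2013-04-15, makes strptime raise a ValueError.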
-
qipipe.staging.fix_dicom.
fix_dicom_headers
(collection, subject, *in_files, **opts)¶ Fix the given input DICOM files as follows:
- Replace the Patient ID value with the subject number, e.g. Sarcoma001
- Add the Body Part Examined tag
- Anonymize the Patient's Birth Date tag
- Standardize the file name
OHSU - The Body Part Examined tag is set as follows:
- If the collection is Sarcoma, then the body part is the qipipe.staging.sarcoma_config.sarcoma_location().
- Otherwise, the body part is the upper-case collection name, e.g. BREAST.
OHSU - Remove extraneous Image Comments tag value content which might contain PHI.
The output file name is standardized as follows:
- The file name is lower-case
- The file extension is .dcm
- Each non-word character is replaced by an underscore
Parameters: - collection – the collection name
- subject – the input subject name
- opts – the following keyword arguments:
- dest – the location in which to write the modified files (default is the current directory)
Returns: the files which were created
Raises: StagingError – if the collection is not supported
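The file name standardization rules listed for fix_dicom_headers can be sketched as follows. This is a minimal Python 3 illustration, not the qipipe implementation: the function name standardize_file_name is hypothetical, and the sketch assumes that the existing extension is discarded before the non-word characters are replaced.

```python
import os
import re


def standardize_file_name(path):
    """Return a standardized DICOM file name (hypothetical helper).

    Assumption: the original extension, if any, is dropped before the
    remaining base name is lower-cased and sanitized.
    """
    base, _ = os.path.splitext(os.path.basename(path))
    # Lower-case the name, then replace each non-word character
    # with an underscore.
    return re.sub(r'\W', '_', base.lower()) + '.dcm'
```

Under these assumptions, an input such as /data/IM 001-A.DCM is written as im_001_a.dcm.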
image_collection
¶
-
class
qipipe.staging.image_collection.
Collection
(name, **opts)¶ Bases: object
The image collection.
Parameters: - name – the name
- opts – the following keyword options:
Option subject: the subject directory name match regular expression
Option session: the session directory name match regular expression
Option scan_types: the scan_types
Option scan: the {scan number: {dicom, roi}} dictionary
Option volume: the DICOM tag which identifies a scan volume
Option crop_posterior: the crop_posterior flag
-
__init__
(name, **opts)¶ Parameters: - name – the name
- opts – the following keyword options:
Option subject: the subject directory name match regular expression
Option session: the session directory name match regular expression
Option scan_types: the scan_types
Option scan: the {scan number: {dicom, roi}} dictionary
Option volume: the DICOM tag which identifies a scan volume
Option crop_posterior: the crop_posterior flag
-
crop_posterior
= None¶ A flag indicating whether to crop the image posterior in the mask, e.g. for a breast tumor (default False).
-
instances
= {'sarcoma': <qipipe.staging.image_collection.Collection object>, 'breast': <qipipe.staging.image_collection.Collection object>}¶ The collection {name: object} dictionary.
-
name
= None¶ The capitalized collection name.
-
patterns
= None¶ The DICOM and ROI meta-data patterns. This patterns attribute consists of the entries dicom and roi. Each of these entries has a mandatory glob entry and an optional regex entry. The glob entry matches the scan subdirectory containing the DICOM or ROI files. The regex entry matches the DICOM or ROI files in the subdirectory. The default in the absence of a regex entry is to include all files in the subdirectory.
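The glob and regex roles described for the patterns attribute can be sketched as below. This is an assumption-laden Python 3 illustration rather than the qipipe code: the function match_files is hypothetical, and each pattern is modeled as a plain dictionary with a glob string and an optional compiled regex.

```python
import glob
import os
import re


def match_files(session_dir, pattern):
    """List the files selected by a {glob, regex} pattern (hypothetical).

    pattern['glob'] selects the scan subdirectories of session_dir.
    pattern['regex'], if present, filters the file names in each
    subdirectory. Without a regex, every file is included.
    """
    regex = pattern.get('regex')
    matches = []
    for subdir in sorted(glob.glob(os.path.join(session_dir, pattern['glob']))):
        if not os.path.isdir(subdir):
            continue
        for name in sorted(os.listdir(subdir)):
            if regex is None or regex.match(name):
                matches.append(os.path.join(subdir, name))
    return matches
```

For example, a dicom entry of {'glob': '*concat*'} would select every file in each matching subdirectory, whereas adding a regex such as re.compile(r'.*\.dcm$') restricts the result to the .dcm files.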
-
scan_types
= None¶ The scan {number: type} dictionary.
-
qipipe.staging.image_collection.
with_name
(name)¶ Returns: the Collection whose name is a case-insensitive match for the given name, or None if no match is found
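A case-insensitive lookup of this kind could look like the following Python 3 sketch. The Collection class here is a stand-in which models only the name, and the lower-case dictionary keys follow the instances value shown above. The real class takes additional options which are omitted here.

```python
class Collection(object):
    """Stand-in for the documented class; only the name is modeled."""

    def __init__(self, name):
        # The documented name attribute is the capitalized collection name.
        self.name = name.capitalize()


# The {name: object} dictionary, keyed by the lower-case name.
instances = {name: Collection(name) for name in ('sarcoma', 'breast')}


def with_name(name):
    """Return the matching collection, or None if there is no match."""
    return instances.get(name.lower())
```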
iterator
¶
-
class
qipipe.staging.iterator.
VisitIterator
(project, collection, *session_dirs, **opts)¶ Bases: object
Scan DICOM generator class.
Parameters: - project – the XNAT project name
- collection – the image collection name
- session_dirs – the session directories over which to iterate
- opts – the iter_stage() options
-
__init__
(project, collection, *session_dirs, **opts)¶ Parameters: - project – the XNAT project name
- collection – the image collection name
- session_dirs – the session directories over which to iterate
- opts – the iter_stage() options
-
collection
= None¶ The iter_stage() collection name parameter.
-
project
= None¶ The iter_stage() project name parameter.
-
scan
= None¶ The iter_stage() scan number option.
-
session_dirs
= None¶ The input directories.
-
skip_existing
= None¶ The iter_stage() skip_existing flag option.
-
qipipe.staging.iterator.
iter_stage
(project, collection, *inputs, **opts)¶ Iterates over the scans in the given input directories. This method is a staging generator which yields a tuple consisting of the {subject, session, scan, dicom, roi} object.
The input directories conform to the qipipe.staging.image_collection.Collection.patterns subject regular expression.
Each iteration {subject, session, scan, dicom, roi} object is formed as follows:
- The subject is the XNAT subject name formatted by SUBJECT_FMT.
- The session is the XNAT experiment name formatted by SESSION_FMT.
- The scan is the XNAT scan number.
- dicom is the DICOM directory.
- roi is the ROI directory.
Parameters: - project – the XNAT project name
- collection – the qipipe.staging.image_collection.Collection.name
- inputs – the source subject directories to stage
- opts – the following keyword options:
- scan – the scan number to stage (default stage all detected scans)
- skip_existing – flag indicating whether to ignore each existing session, or scan if the scan option is set (default True)
Yield: the {subject, session, scan, dicom, roi} objects
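The subject and session naming used by iter_stage can be illustrated with the sketch below. The SUBJECT_FMT and SESSION_FMT values are not shown in this page, so the formats here are assumptions inferred from the Sarcoma001 and Session03 examples elsewhere in this document. The function xnat_names is hypothetical.

```python
# Assumed formats, inferred from the examples Sarcoma001 and Session03.
# The actual qipipe values are not given in this page.
SUBJECT_FMT = '%s%03d'
SESSION_FMT = 'Session%02d'


def xnat_names(collection, subject_number, visit_number):
    """Return the (subject, session) XNAT names (hypothetical helper)."""
    subject = SUBJECT_FMT % (collection, subject_number)
    session = SESSION_FMT % visit_number
    return subject, session
```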
map_ctp
¶
TCIA CTP preparation utilities.
-
class
qipipe.staging.map_ctp.
CTPPatientIdMap
¶ Bases: dict
CTPPatientIdMap is a dictionary augmented with an add_subjects() input method to build the map and a write() output method to print the CTP map properties.
-
CTP_FMT
= '%s-%04d'¶ The CTP Patient ID format with arguments (CTP collection name, input Patient ID number).
-
MAP_FMT
= 'ptid/%s=%s'¶ The ID lookup entry format with arguments (input Patient ID, CTP patient id).
-
MSG_FMT
= 'Mapped the QIN patient id %s to the CTP subject id %s.'¶ The log message format with arguments (input Patient ID, CTP patient id).
-
SOURCE_PAT
= <_sre.SRE_Pattern object>¶ The input Patient ID pattern is the study name followed by a number, e.g. Breast10.
-
add_subjects
(collection, *patient_ids)¶ Adds the input => CTP Patient ID association for the given input DICOM patient ids.
Parameters: - collection – the image collection name
- patient_ids – the DICOM Patient IDs to map
Raises: StagingError – if an input patient id format is not the study followed by the patient number
-
write
(dest=<open file '<stdout>', mode 'w'>)¶ Writes this id map in the standard CTP format.
Parameters: dest – the IO stream on which to write this map (default stdout)
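How the CTP_FMT and MAP_FMT formats combine can be sketched with plain functions, as below. This Python 3 sketch is not the CTPPatientIdMap implementation: the SOURCE_PAT expression is a stand-in, since the actual pattern is not shown, the collection name EXAMPLE-COLLECTION is a placeholder rather than a real CTP collection, and ValueError stands in for StagingError.

```python
import re
import sys

# The formats documented above.
CTP_FMT = '%s-%04d'
MAP_FMT = 'ptid/%s=%s'

# Assumption: a stand-in for the undisplayed SOURCE_PAT, matching a
# study name followed by a number, e.g. Breast10.
SOURCE_PAT = re.compile(r'([A-Za-z]+)(\d+)$')


def add_subjects(id_map, ctp_collection, *patient_ids):
    """Add the input => CTP Patient ID associations to id_map."""
    for in_id in patient_ids:
        match = SOURCE_PAT.match(in_id)
        if not match:
            # The documented method raises StagingError instead.
            raise ValueError('Unsupported patient id format: %s' % in_id)
        id_map[in_id] = CTP_FMT % (ctp_collection, int(match.group(2)))
    return id_map


def write(id_map, dest=sys.stdout):
    """Write one lookup entry per line, sorted by input id."""
    for in_id in sorted(id_map):
        dest.write(MAP_FMT % (in_id, id_map[in_id]) + '\n')
```

With these assumptions, the input id Breast10 in the placeholder collection is written as ptid/Breast10=EXAMPLE-COLLECTION-0010.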
-
-
qipipe.staging.map_ctp.
PROP_FMT
= 'QIN-%s-OHSU.ID-LOOKUP.properties'¶ The format for the Patient ID map file name specified by CTP.
-
qipipe.staging.map_ctp.
map_ctp
(collection, *subjects, **opts)¶ Creates the TCIA patient id map. The map is written to a property file in the destination directory. The property file name is given by property_filename().
Parameters: - collection – the image collection
- subjects – the subject names
- opts – the following keyword option:
- dest – the destination directory
Returns: the subject map file path
-
qipipe.staging.map_ctp.
property_filename
(collection)¶ Returns the CTP id map property file name for the given collection. The Sarcoma collection is capitalized in the file name, Breast is not.
ohsu
¶
This module contains the OHSU-specific image collections.
- The following OHSU QIN scan numbers are captured:
- 1: T1
- 2: T2
- 4: DW
- 6: PD
These scans have DICOM files specified by the qipipe.staging.image_collection.Collection.patterns dicom attribute. The T1 scan has ROI files as well, specified by the patterns roi.glob and roi.regex attributes.
-
qipipe.staging.ohsu.
BREAST_DW_PAT
= '*sorted/*Diffusion'¶ The Breast DW DICOM directory match pattern.
-
qipipe.staging.ohsu.
BREAST_PD_PAT
= '*sorted/*PD*'¶ The Breast pseudo-proton density DICOM directory match pattern.
-
qipipe.staging.ohsu.
BREAST_ROI_PAT
= 'processing/R10_0.[456]*/slice*'¶ The Breast ROI glob filter. The .bqf ROI files are in the following session subdirectory: processing/<R10 directory>/slice<slice index>/
-
qipipe.staging.ohsu.
BREAST_ROI_REGEX
= <_sre.SRE_Pattern object>¶ The Breast .bqf ROI file match pattern.
-
qipipe.staging.ohsu.
BREAST_SESSION_REGEX
= <_sre.SRE_Pattern object>¶ The Breast session directory match pattern. The variations Visit_3, Visit3, visit3, BC4V3, BC4_V3 and B4V3 all match Breast Session03.
-
qipipe.staging.ohsu.
BREAST_SUBJECT_REGEX
= <_sre.SRE_Pattern object>¶ The Breast subject directory match pattern.
-
qipipe.staging.ohsu.
BREAST_T2_PAT
= '*sorted/2_tirm_tra_bilat'¶ The Breast T2 DICOM directory match pattern.
-
qipipe.staging.ohsu.
MULTI_VOLUME_SCAN_NUMBERS
= [1]¶ Only T1 scans can have more than one volume.
-
qipipe.staging.ohsu.
SARCOMA_DW_PAT
= '*Diffusion'¶ The Sarcoma DW DICOM directory match pattern.
-
qipipe.staging.ohsu.
SARCOMA_ROI_PAT
= 'Breast processing results/multi_slice/slice*'¶ The Sarcoma ROI glob filter. The .bqf ROI files are in the following session subdirectory: Breast processing results/<ROI directory>/slice<slice index>/ (Yes, the Sarcoma processing results are in the “Breast processing results” subdirectory!)
-
qipipe.staging.ohsu.
SARCOMA_ROI_REGEX
= <_sre.SRE_Pattern object>¶ The Sarcoma .bqf ROI file match pattern.
Note
The Sarcoma ROI directories are inconsistently named, with several alternatives and duplicates.
TODO - clarify which of the Sarcoma ROI naming variations should be used.
Note
There are no apparent lesion number indicators in the Sarcoma ROI input.
TODO - confirm that there is no Sarcoma lesion indicator.
-
qipipe.staging.ohsu.
SARCOMA_SESSION_REGEX
= <_sre.SRE_Pattern object>¶ The Sarcoma session directory match pattern. The variations Visit_3, Visit3, visit3, S4V3 and S4_V3 all match Sarcoma Session03.
-
qipipe.staging.ohsu.
SARCOMA_SUBJECT_REGEX
= <_sre.SRE_Pattern object>¶ The Sarcoma subject directory match pattern.
-
qipipe.staging.ohsu.
SARCOMA_T2_PAT
= '*T2*'¶ The Sarcoma T2 DICOM directory match pattern.
-
qipipe.staging.ohsu.
SESSION_REGEX_PAT
= "\n (?: # Don't capture the prefix\n [vV]isit # The Visit or visit prefix form\n _? # with an optional underscore delimiter\n | # ...or...\n %s\\d+_?V # The alternate prefix form, beginning with\n # a leading collection abbreviation\n # substituted into the pattern below\n ) # End of the prefix\n (\\d+)$ # The visit number\n"¶ The session directory match pattern. This pattern must be specialized for each collection by replacing the %s place-holder with a string.
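The specialization of SESSION_REGEX_PAT can be sketched as follows. The pattern text is the documented one, written as a raw string and compiled in verbose mode. The abbreviation strings BC? and S are assumptions chosen to agree with the Breast and Sarcoma variations listed in this page, and the helper visit_number is hypothetical.

```python
import re

SESSION_REGEX_PAT = r"""
    (?:                 # Don't capture the prefix
        [vV]isit        # The Visit or visit prefix form
        _?              # with an optional underscore delimiter
        |               # ...or...
        %s\d+_?V        # The alternate prefix form, beginning with
                        # a leading collection abbreviation
    )                   # End of the prefix
    (\d+)$              # The visit number
"""

# Assumed abbreviations; the actual substituted strings are not shown.
BREAST_SESSION_REGEX = re.compile(SESSION_REGEX_PAT % 'BC?', re.VERBOSE)
SARCOMA_SESSION_REGEX = re.compile(SESSION_REGEX_PAT % 'S', re.VERBOSE)


def visit_number(regex, dir_name):
    """Return the visit number for a session directory name, or None."""
    match = regex.match(dir_name)
    return int(match.group(1)) if match else None
```

Because the prefix group is non-capturing, the visit number is always the first captured group, regardless of which prefix form matched.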
-
qipipe.staging.ohsu.
T1_PAT
= '*concat*'¶ The T1 DICOM directory match pattern.
-
qipipe.staging.ohsu.
VOLUME_TAG
= 'AcquisitionNumber'¶ The DICOM tag which identifies the volume. The OHSU QIN collections are unusual in that the DICOM images which comprise a 3D volume have the same DICOM Series Number and Acquisition Number tag values. The series numbers are increasing but non-consecutive integers, e.g. 9, 11, 13, ..., whereas the acquisition numbers are consecutive integers starting at 1. The Acquisition Number tag is selected as the volume number identifier.
roi
¶
OHSU - ROI utility functions.
TODO - move this to ohsu-qipipe.
-
class
qipipe.staging.roi.
LesionROI
(lesion, volume_number, slice_sequence_number, location)¶ Bases: object
Aggregate with attributes lesion, volume, slice and location.
Parameters:
-
__init__
(lesion, volume_number, slice_sequence_number, location)¶ Parameters:
-
lesion
= None¶ The lesion number.
-
location
= None¶ The absolute BOLERO ROI .bqf file path.
-
slice
= None¶ The one-based slice sequence number.
-
volume
= None¶ The one-based volume number.
-
-
qipipe.staging.roi.
PARAM_REGEX
= <_sre.SRE_Pattern object>¶ The regex to parse a parameter file.
-
qipipe.staging.roi.
iter_roi
(regex, *in_dirs)¶ Iterates over the OHSU ROI .bqf mask files in the given input directories. This method is a LesionROI generator, e.g.:
>>> # Find .bqf files anywhere under /path/to/session/processing.
>>> next(iter_roi('.*/\.bqf', '/path/to/session'))
{lesion: 1, slice: 12, path: '/path/to/session/processing/rois/roi.bqf'}
Parameters: - regex – the file name match regular expression
- in_dirs – the ROI directories to search
Yield: the LesionROI objects
sort
¶
-
qipipe.staging.sort.
sort
(collection, scan, *in_dirs)¶ Groups the DICOM files in the given input directories by volume.
Parameters: - collection – the collection name
- scan – the scan number
- in_dirs – the input DICOM directories
Returns: the {volume: files} dictionary
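The grouping step behind sort can be sketched independently of DICOM parsing. In this Python 3 illustration, reading the volume tag (AcquisitionNumber for the OHSU collections) from each file is assumed to have been done already, so the hypothetical group_by_volume function takes (file path, tag value) pairs as input.

```python
from collections import defaultdict


def group_by_volume(tagged_files):
    """Group files by volume number (hypothetical helper).

    tagged_files is an iterable of (file path, volume tag value) pairs.
    Reading the tag from the DICOM header is outside this sketch.
    """
    volumes = defaultdict(list)
    for path, volume_number in tagged_files:
        volumes[int(volume_number)].append(path)
    # Return a plain {volume: files} dictionary.
    return dict(volumes)
```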
sarcoma_config
¶
-
qipipe.staging.sarcoma_config.
CFG_FILE
= '/home/docs/checkouts/readthedocs.org/user_builds/qipipe/checkouts/latest/qipipe/conf/sarcoma.cfg'¶ The Sarcoma Tumor Location configuration file. This file contains properties that associate the subject name to the location, e.g.:
Sarcoma004 = SHOULDER
The value is the SNOMED anatomy term.
-
qipipe.staging.sarcoma_config.
sarcoma_config
()¶ Returns: the sarcoma configuration Return type: ConfigParser
-
qipipe.staging.sarcoma_config.
sarcoma_location
(subject)¶ Parameters: subject – the XNAT Subject ID Returns: the subject tumor location
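A subject-to-location lookup along these lines can be sketched with the Python 3 configparser module. The section name Tumor Location is an assumption, since the layout of sarcoma.cfg is not shown beyond the single property example. The functions below take the configuration explicitly, unlike the documented sarcoma_location(subject).

```python
import configparser

# Assumption: a hypothetical section name; the real file layout is unknown.
SECTION = 'Tumor Location'

# A sample configuration built around the documented property example.
SAMPLE_CFG = """
[Tumor Location]
Sarcoma004 = SHOULDER
"""


def load_config(text):
    """Parse configuration text into a ConfigParser."""
    cfg = configparser.ConfigParser()
    cfg.read_string(text)
    return cfg


def sarcoma_location(cfg, subject):
    """Return the tumor location for the given subject name."""
    # ConfigParser option names are case-insensitive by default.
    return cfg.get(SECTION, subject)
```

A subject without an entry makes cfg.get raise configparser.NoOptionError in this sketch.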