J Pathol Inform. 2018 Nov 2;9:37.
doi: 10.4103/jpi.jpi_42_18. eCollection 2018.

Implementing the DICOM Standard for Digital Pathology

Markus D Herrmann et al. J Pathol Inform. 2018.

Abstract

Background: Digital Imaging and Communications in Medicine (DICOM®) is the standard for the representation, storage, and communication of medical images and related information. A DICOM file format and communication protocol for pathology have been defined; however, adoption by vendors and in the field is pending. Here, we implemented the essential aspects of the standard and assessed its capabilities and limitations in a multisite, multivendor healthcare network.

Methods: We selected relevant DICOM attributes, developed a program that extracts pixel data and pixel-related metadata, integrated patient and specimen-related metadata, populated and encoded DICOM attributes, and stored DICOM files. We generated the files using image data from four vendor-specific image file formats and clinical metadata from two departments with different laboratory information systems. We validated the generated DICOM files using recognized DICOM validation tools and measured encoding, storage, and access efficiency for three image compression methods. Finally, we evaluated storing, querying, and retrieving data over the web using existing DICOM archive software.

Results: Whole slide image data can be encoded together with relevant patient and specimen-related metadata as DICOM objects. These objects can be accessed efficiently from files or through RESTful web services using existing software implementations. Performance measurements show that the choice of image compression method has a major impact on data access efficiency. For lossy compression, JPEG achieves the fastest compression/decompression rates. For lossless compression, JPEG-LS significantly outperforms JPEG 2000 with respect to data encoding and decoding speed.

Conclusion: Implementation of DICOM allows efficient access to image data as well as associated metadata. By leveraging a wealth of existing infrastructure solutions, the use of DICOM facilitates enterprise integration and data exchange for digital pathology.

Keywords: Computational pathology; DICOMweb; image compression; slide scanning; whole slide imaging.

Conflict of interest statement

There are no conflicts of interest.

Figures

Figure 1
Representation of digital pathology information in DICOM. (a) Extension of the DICOM model of the real world for specimens. Diagram showing different DICOM information entities and their hierarchical relationships. Entities are color-coded based on information source. Information about Patient entities (shown in blue) is obtained from the electronic medical record, information about Study, Container, and Specimen entities (shown in orange) is obtained from the laboratory information system, and information about Series and Image entities (shown in green) is provided by the microscope/slide scanning system. (b) DICOM information model. The VL Whole Slide Microscopy Image composite information object definition class provides attributes to describe real-world information entities. Related attributes are grouped into modules (visualized as boxes). The Pixel Data attribute, which represents the actual image pixel values, is highlighted in bold. (c) DICOM data set. Attributes of an instance of the VL Whole Slide Microscopy Image class are encoded as data elements. The pixel data element is generally the last element in the data set (shown in grey). The elements preceding the pixel data element encode metadata that are required for interpretation of the image. These metadata elements (blue, orange, green) are often collectively (but inaccurately) referred to as the "header".
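Panel (c) of Figure 1 can be illustrated with a minimal sketch of how a flat data set is laid out as a run of data elements, with metadata elements first and the pixel data element last. The sketch assumes the Explicit VR Little Endian encoding, uses a toy three-element data set, and ignores undefined lengths, nested sequence parsing, and the preamble and file meta information of a real PS3.10 file. The function names are illustrative and are not taken from the paper or from any library.

```python
import struct

# VRs that carry a 2-byte reserved field and a 4-byte length
# in Explicit VR Little Endian (subset, sufficient for this sketch).
LONG_VRS = {b"OB", b"OW", b"OF", b"SQ", b"UT", b"UN"}
PIXEL_DATA_TAG = (0x7FE0, 0x0010)


def encode_element(group, element, vr, value):
    """Encode one data element: tag, VR, value length, value."""
    header = struct.pack("<HH2s", group, element, vr)
    if vr in LONG_VRS:
        return header + struct.pack("<HI", 0, len(value)) + value
    return header + struct.pack("<H", len(value)) + value


def read_elements(buffer):
    """Yield (tag, vr, value) for each element of a flat data set."""
    pos = 0
    while pos < len(buffer):
        group, element, vr = struct.unpack_from("<HH2s", buffer, pos)
        pos += 6
        if vr in LONG_VRS:
            _, length = struct.unpack_from("<HI", buffer, pos)
            pos += 6
        else:
            (length,) = struct.unpack_from("<H", buffer, pos)
            pos += 2
        yield (group, element), vr, buffer[pos:pos + length]
        pos += length


# Toy data set: two metadata elements followed by the pixel data element.
dataset = b"".join([
    encode_element(0x0028, 0x0010, b"US", struct.pack("<H", 256)),  # Rows
    encode_element(0x0028, 0x0011, b"US", struct.pack("<H", 512)),  # Columns
    encode_element(0x7FE0, 0x0010, b"OB", b"\x01\x02\x03\x04"),     # Pixel Data
])

elements = list(read_elements(dataset))
# Everything before the pixel data element is what is loosely called the "header".
metadata = [e for e in elements if e[0] != PIXEL_DATA_TAG]
rows = struct.unpack("<H", metadata[0][2])[0]
```

Because the metadata elements precede the pixel data, a reader can stop parsing at tag (7FE0,0010) and interpret the image description without loading any pixel values.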
Figure 2
Encoding of whole slide imaging information in DICOM data set. (a) Schematic of multiresolution whole slide image pyramid. The pyramid base level represents the original image that was acquired by the microscope/slide scanning system. Higher pyramid level images (=lower power) are derived through successive downsampling. The frame of reference for images is the slide coordinate system in millimeter units, where the origin is defined as the lower left corner of the upright standing slide. (b) Total pixel matrix. An image is defined as a continuous, rectangular area of pixels. Shown is the image at ×10 resolution. Note that it is rotated 90° (counterclockwise) relative to the slide coordinate system. (c) Tiled image grid. The pixel matrix may be tiled into smaller rectangular, equally sized regions. Tiles are organized such that the first dimension spans the matrix rows and the second dimension the matrix columns. (d) Image pyramid encoded as series of VL Whole Slide Microscopy Image instances. Each image instance (=pyramid layer =downsampled magnification) is encoded as a separate DICOM data set. (e) DICOM data set of a multiframe image instance with encapsulated Pixel Data element. Each tile is compressed and encoded as a separate Frame item. Frames are implicitly numbered, based on the order of encoding. The dimensional organization of frames in the real world is described by the dimension index sequence attribute (green), which contains a dimension index pointer and a functional group pointer attribute for each dimension. The values of these attributes point to other attributes in the per-frame functional groups sequence attribute (yellow), which encode the actual values for each frame. The frame content sequence attribute describes the relative position of each frame in the tiled image grid, whereas the plane position (slide) sequence attribute describes the absolute position of each frame in the slide coordinate system as well as in the total pixel matrix. The byte offset to individual frame items in the pixel data element (grey boxes labeled 1-6) is specified by the basic offset table item (purple), which is itself part of (the first item of) the encapsulated pixel data element.
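The geometry described in panels (a) to (c) of Figure 2 can be sketched with a few lines of arithmetic. The sketch below makes assumptions that the caption does not state: each pyramid level is downsampled by a factor of 2, tiles are encoded in row-major order (left to right, then top to bottom), and positions are 1-based. A file that orders frames differently would have to be resolved through its per-frame functional groups instead. All function names are illustrative.

```python
def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


def pyramid_levels(base_cols, base_rows, n_levels, factor=2):
    """Total pixel matrix size (columns, rows) of each pyramid level.

    Assumes every level is downsampled by the same factor.
    """
    return [
        (ceil_div(base_cols, factor ** i), ceil_div(base_rows, factor ** i))
        for i in range(n_levels)
    ]


def tile_grid(total_cols, total_rows, tile_cols, tile_rows):
    """Number of tiles per row and per column of the tiled image grid."""
    return ceil_div(total_cols, tile_cols), ceil_div(total_rows, tile_rows)


def frame_number(tile_col, tile_row, tiles_per_row):
    """1-based frame number of a tile, assuming row-major encoding order."""
    return (tile_row - 1) * tiles_per_row + tile_col


def frame_pixel_offset(tile_col, tile_row, tile_cols, tile_rows):
    """1-based (column, row) of the tile's top-left pixel in the total pixel matrix."""
    return (tile_col - 1) * tile_cols + 1, (tile_row - 1) * tile_rows + 1


levels = pyramid_levels(3000, 2000, n_levels=3)
# levels == [(3000, 2000), (1500, 1000), (750, 500)]
grid = tile_grid(3000, 2000, tile_cols=512, tile_rows=512)
# grid == (6, 4): tiles at the right and bottom edges are only partly filled
number = frame_number(tile_col=3, tile_row=2, tiles_per_row=grid[0])
# number == 9
offset = frame_pixel_offset(3, 2, 512, 512)
# offset == (1025, 513)
```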
Figure 3
Encoding of specimen information in DICOM data set. (a) DICOM model of the real world for pathology laboratory workflow. Top: Hierarchical relationships between the patient (blue), study =case (purple), specimen (yellow), and container (green) information entities. Bottom: Sequence of specimen preparation steps (1-6) for a typical surgical pathology encounter. (b) DICOM data set of image instance with encoded laboratory information. Patient and Study entities are described by attributes of the Patient module (blue) and general study and patient study modules (purple), respectively. Specimen and container entities are described by attributes of the Specimen module (yellow). The container used for imaging (glass slide) is described at the root data set level, whereas imaged specimens (tissue sections) are described by the nested specimen description sequence attribute. Laboratory procedures that were performed to prepare a specimen for image acquisition are described by the specimen preparation sequence attribute, which includes a specimen preparation step content item sequence attribute for each performed procedure. The description of a content item is based on the Specimen Preparation template, which provides concepts to encode a procedure as name-value pairs using a specified coding scheme, such as SNOMED CT. Mandatory concepts are the specimen identifier, which describes the identifier of the processed specimen, and processing type, which describes the kind of procedure that was performed. Additional concepts may be included depending on the value of the Processing Type concept. In case of a sampling preparation step (highlighted in red; compare sequence step 3 in a), Sampling Method, Parent Specimen Identifier, and Parent Specimen Type concepts are included. (c) Schematic of a current pathology report. Top: Identifiable information about patient (blue) and accession (purple), which map to attributes of the patient and general study modules. Bottom: Final diagnosis and gross description for each specimen of type "part." Identifiable information about specimens of type "section" is often not included in the report (i.e., final diagnoses generally lack complete slide-level annotations). SNOMED CT: Systematized Nomenclature of Medicine Clinical Terms.
Figure 4
Measurements for encoding and storing DICOM data sets in PS3.10 files. Relative file sizes and encoding times for different vendor-specific formats and image compression methods. A set of 16 vendor-specific whole slide image files (4 files per vendor) was converted to DICOM files using lossy (JPEG with quality set to 95) and lossless (JPEG-LS and JPEG 2000) image compression methods. Left: Size of generated DICOM files relative to the size of the corresponding original image files. The lossy JPEG size is shown to exemplify the difference between lossy and lossless schemes. Right: Program execution times for reading original files, encoding the contained information as DICOM data sets, and writing the generated data sets to DICOM files on disk. Execution times were normalized with respect to original file sizes for comparison between digital slides. Shown are averages (bars) and standard deviations (error bars) of measurements.
Figure 5
Retrieving pixel data from DICOM data sets stored in PS3.10 files on disk. (a) Sequence of steps required for loading an individual frame of a multiframe DICOM image from a file on disk into a pixel matrix in memory. First, the file that contains the frame of interest is identified by reading and interpreting metadata ("DICOM header") of each file. Second, the relative position of the frame within the image is identified using the available metadata. Third, the absolute position of the frame is determined within the encapsulated pixel data element. Fourth, the binary content of the frame item is read into memory. Lastly, the pixel data are decompressed. (b) Software library support for reading images from DICOM files. Selected software libraries available for the Python programming language and their level of support for reading multiframe DICOM images. Only a subset of libraries provides the functionality (check icon) for selectively loading an individual frame into memory. (c) Frame access efficiency for different image compression methods. Shown is the average time (±standard deviation) it takes to identify, read, and decompress an individual frame of a multiframe DICOM image using an implementation of the algorithm (shown in a) in the pydicom software library.
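The third and fourth steps of panel (a) in Figure 5 can be sketched as follows. This is not the pydicom implementation measured in the paper. It is a standalone illustration that assumes the value of an encapsulated pixel data element is already in memory, that the basic offset table is populated (it may legally be empty), and that each frame is stored as exactly one fragment. Decompression, the final step, is left out because it requires an image codec. The helper names are hypothetical.

```python
import struct

ITEM_TAG = struct.pack("<HH", 0xFFFE, 0xE000)       # Item
SEQ_DELIM_TAG = struct.pack("<HH", 0xFFFE, 0xE0DD)  # Sequence Delimitation Item


def encapsulate(frames):
    """Build the value of an encapsulated Pixel Data element, one fragment per frame."""
    offsets, body = [], b""
    for frame in frames:
        if len(frame) % 2:
            frame += b"\x00"  # item values must have even length
        offsets.append(len(body))  # offset of this frame's item tag
        body += ITEM_TAG + struct.pack("<I", len(frame)) + frame
    table = struct.pack(f"<{len(offsets)}I", *offsets)
    basic_offset_table = ITEM_TAG + struct.pack("<I", len(table)) + table
    return basic_offset_table + body + SEQ_DELIM_TAG + struct.pack("<I", 0)


def read_frame(value, frame_number):
    """Return the compressed bytes of one frame (1-based) via the basic offset table."""
    (table_length,) = struct.unpack_from("<I", value, 4)
    if value[:4] != ITEM_TAG or table_length == 0:
        raise ValueError("basic offset table is missing or empty")
    if not 1 <= frame_number <= table_length // 4:
        raise IndexError("frame number out of range")
    # Step 3: look up the byte offset of the frame item. Offsets are counted
    # from the first byte of the first item that follows the offset table.
    (offset,) = struct.unpack_from("<I", value, 8 + 4 * (frame_number - 1))
    start = 8 + table_length + offset
    if value[start:start + 4] != ITEM_TAG:
        raise ValueError("offset does not point at an item")
    # Step 4: read only this frame's bytes.
    (length,) = struct.unpack_from("<I", value, start + 4)
    return value[start + 8:start + 8 + length]


# Placeholder byte strings stand in for compressed tiles.
pixel_data = encapsulate([b"tile-one", b"tile-two!!", b"t3"])
second = read_frame(pixel_data, 2)
```

With a populated offset table the lookup cost does not depend on the frame number. Without one, a reader has to walk the item headers from the start of the element to find a given frame.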
Figure 6
Querying and retrieving DICOM data sets from an archive through DICOMweb RESTful services. (a) Component diagram of DICOMweb client and server components as well as their respective interfaces. Client and server communicate over the network using HTTP. The client may be an interactive viewer application running in a web browser or a machine learning (ML) program running in a Unix shell. The server provides RESTful services for query and retrieval of DICOM objects: Query based on ID for DICOM Objects (QIDO-RS) and Web Access to DICOM Objects (WADO-RS). (b) Sequence diagram of DICOMweb client-server interactions for query and retrieval of DICOM objects. The client searches for study (case) or series (digital slide) resources via a SearchForStudies or SearchForSeries request, respectively. The client may provide query parameters to filter DICOM objects based on given attribute values. The server responds with resource representations for each matched object in JSON format according to the DICOM JSON model. The client retrieves image metadata resources from the server for each image (resolution level) belonging to the matched study or series through a RetrieveMetadata request. The server responds with metadata resource representations in DICOM JSON format for each image instance, which provide the client with the information necessary to interpret the images and identify relevant frames (e.g., based on their position in the total pixel matrix or the slide coordinate system). The client requests a subset of frames from the server through a RetrieveFrames request. The server responds with a message containing the requested frames in the requested image format (e.g., JPEG, JPEG-LS, JPEG 2000). JSON: JavaScript Object Notation, HTTP: Hypertext Transfer Protocol.
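The request sequence in panel (b) of Figure 6 can be sketched by building the resource URLs that a client would send. The sketch performs no network I/O. The base URL, the UIDs, and the patient identifier are placeholders, the helper names are illustrative, and a real client would also set an Accept header on each request and parse a multipart response for frames.

```python
from urllib.parse import urlencode

# Media types a client would typically request (sent as the Accept header).
ACCEPT_JSON = "application/dicom+json"
ACCEPT_JPEG_FRAMES = 'multipart/related; type="image/jpeg"'


def search_for_studies_url(base, **filters):
    """QIDO-RS SearchForStudies, with optional attribute filters."""
    url = f"{base}/studies"
    return f"{url}?{urlencode(filters)}" if filters else url


def search_for_series_url(base, study_uid, **filters):
    """QIDO-RS SearchForSeries within one study."""
    url = f"{base}/studies/{study_uid}/series"
    return f"{url}?{urlencode(filters)}" if filters else url


def retrieve_metadata_url(base, study_uid, series_uid):
    """WADO-RS RetrieveMetadata for all instances of a series."""
    return f"{base}/studies/{study_uid}/series/{series_uid}/metadata"


def retrieve_frames_url(base, study_uid, series_uid, instance_uid, frame_numbers):
    """WADO-RS RetrieveFrames for a list of 1-based frame numbers."""
    frames = ",".join(str(n) for n in frame_numbers)
    return (f"{base}/studies/{study_uid}/series/{series_uid}"
            f"/instances/{instance_uid}/frames/{frames}")


def json_value(instance_metadata, tag):
    """First value of an attribute in a DICOM JSON model object."""
    return instance_metadata[tag]["Value"][0]


BASE = "https://example.org/dicomweb"  # placeholder archive URL
studies_url = search_for_studies_url(BASE, PatientID="12345")
metadata_url = retrieve_metadata_url(BASE, "1.2.3", "1.2.3.4")
frames_url = retrieve_frames_url(BASE, "1.2.3", "1.2.3.4", "1.2.3.4.5", [9, 10])

# Hand-written stand-in for one instance returned by RetrieveMetadata.
instance = {
    "00480006": {"vr": "UL", "Value": [3000]},  # Total Pixel Matrix Columns
    "00480007": {"vr": "UL", "Value": [2000]},  # Total Pixel Matrix Rows
}
total_cols = json_value(instance, "00480006")
```

Attributes in the DICOM JSON model are keyed by their eight-digit hexadecimal tag, which is why the client looks up "00480006" rather than an attribute name.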
Figure 7
Interactive DICOMweb viewer for whole slide images and related metadata. Screenshot of browser-based graphical user interface for display of DICOM VL Whole Slide Microscopy Image instances. The viewer is a single-page application that uses the DICOMweb JavaScript client implementation to query and retrieve DICOM objects from an archive using RESTful web services. The viewer searches for studies using QIDO-RS and displays the list of matched study resources (1). Each list item shows relevant metadata (e.g., patient name, medical record number, accession number). Users can select a study by clicking on a list item (2), which triggers a search for all series that belong to the selected study (3). When the user selects a series, i.e., a digital slide (4), the viewer retrieves the metadata for all images that belong to the series using WADO-RS and interprets the metadata to determine the image pyramid structure (5). The viewer then automatically retrieves the frames for the resolution level being viewed using WADO-RS and displays them to the user (6). When the user navigates through the image pyramid, the viewer automatically retrieves the frames for the given slide coordinate position using WADO-RS. In addition, the viewer displays the image-level metadata (7). Standardized metadata enable interactive query and retrieval of images for visualization, human interpretation, and data validation.
