Iris: A Next Generation Digital Pathology Rendering Engine

Ryan Erik Landvater et al.

J Pathol Inform. 2024 Dec 5:16:100414. doi: 10.1016/j.jpi.2024.100414. eCollection 2025 Jan.

Abstract

Digital pathology is a tool of rapidly evolving importance within the discipline of pathology. Whole slide imaging promises numerous advantages; however, adoption is limited by challenges in ease of use and the speed of high-quality image rendering relative to the simplicity and visual quality of glass slides. Herein, we introduce Iris, a new high-performance digital pathology rendering system. Specifically, we outline and detail the performance metrics of Iris Core, the core rendering engine technology. Iris Core comprises machine-code modules written from the ground up in C++ using Vulkan, a low-level, low-overhead, cross-platform graphical processing unit (GPU) application programming interface, together with our novel rapid tile buffering algorithms. We provide a detailed explanation of Iris Core's system architecture, including the stateless isolation of core processes, interprocess communication paradigms, and explicit synchronization paradigms that provide powerful control over the GPU. Iris Core achieves slide rendering at the sustained maximum frame rate on all tested platforms (120 FPS) and buffers an entire new slide field of view, without overlapping pixels, in 10 ms, with enhanced detail in 30 ms. Further, it is able to buffer and compute high-fidelity reduction-enhancements for viewing low-power cytology with increased visual quality at a rate of 100-160 μs per slide tile, with a cumulative median buffering rate of 1.36 GB of decompressed image data per second. This buffering rate allows an entirely new field of view to be fully buffered and rendered in less than a single monitor refresh on a standard display, and high-detail features within 2-3 monitor refresh frames. These metrics far exceed previously published specifications, by more than an order of magnitude in some contexts. The system shows no slowing under heavy use loads; rather, performance increases due to GPU cache control mechanisms, and the system is "future-proof" owing to near-unlimited parallel scalability.
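To make the refresh-rate comparison concrete, the arithmetic behind the last claim can be checked directly: at a 60 Hz refresh (our assumption for a "standard display"), one frame lasts about 16.7 ms, so the 10 ms field-of-view buffering completes within a single refresh and the 30 ms enhanced-detail buffering within the stated 2-3 refreshes. A trivial sketch using the timings quoted above:

// Arithmetic check of the refresh-interval claim; the 60 Hz "standard
// display" figure is our assumption, the millisecond timings are from the
// abstract above.
#include <cstdio>

int main() {
    const double refresh_hz       = 60.0;                 // assumed standard display
    const double refresh_ms       = 1000.0 / refresh_hz;  // ~16.7 ms per refresh
    const double lr_fov_buffer_ms = 10.0;                 // full new field of view
    const double hr_detail_ms     = 30.0;                 // enhanced high-detail buffering
    std::printf("LR FOV: %.1f refreshes, HR detail: %.1f refreshes\n",
                lr_fov_buffer_ms / refresh_ms,            // ~0.6 refreshes
                hr_detail_ms / refresh_ms);               // ~1.8 refreshes
}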

Keywords: Digital pathology; Digital scope render engine; Performance digital pathology; Technologies for improved whole slide imaging; Time field of view; Time per tile; Vulkan.


Conflict of interest statement

The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: Ryan Landvater has patent #US20230334621A1 pending to REGENTS OF THE UNIVERSITY OF MICHIGAN. If there are other authors, they declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this article.

Figures

Fig. 1
Iris implementation and application programming interface overview. The Iris system is composed of multiple modules, which can optionally be configured to interact with Iris Core, the WSI-specific and optimized rendering engine. Iris Core is compiled for Windows, macOS, iOS, and Linux from portable C++ source code and can bind the draw-surface of a generic WSV application's graphical window to begin drawing slide data. Iris's API (blue arrows) is accessed through the Iris Viewer instance, which coordinates the active modules and renders slide data to the bound window. The system automatically configures runtime parameters (red arrows) based upon the identified hardware capabilities and the calling application's operating-system window visual surface, a feature known as a "plug and play" configuration. The Iris Core API is lightweight, comprising only a few header files. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
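To make the "plug and play" binding concrete, the following is a minimal C++ sketch of how a host application might create a viewer instance against its window surface. All namespace, type, and function names here (iris::Viewer, create_viewer, open_slide, set_view) are illustrative assumptions, not the published Iris Core headers.

// Hypothetical usage sketch: names and signatures are assumptions, not the
// actual Iris Core API.
#include <iostream>
#include <memory>
#include <string>

namespace iris {

// Opaque engine handle; the real engine binds a Vulkan surface created
// from the host window and configures itself against detected hardware.
class Viewer {
public:
    virtual ~Viewer() = default;
    virtual void open_slide(const std::string& path) = 0;        // load a WSI file
    virtual void set_view(double x, double y, double zoom) = 0;  // move the field of view
};

namespace detail {
// Stand-in implementation so the sketch compiles; a real Viewer would
// drive the render and buffer pipelines shown in Fig. 3.
class StubViewer final : public Viewer {
    void open_slide(const std::string& path) override {
        std::cout << "open " << path << "\n";
    }
    void set_view(double x, double y, double zoom) override {
        std::cout << "view (" << x << ", " << y << ") @ " << zoom << "x\n";
    }
};
} // namespace detail

// Factory: in a real API this would receive the host window's native
// surface handle and perform the "plug and play" runtime configuration.
inline std::unique_ptr<Viewer> create_viewer(void* /*native_window*/) {
    return std::make_unique<detail::StubViewer>();
}

} // namespace iris

int main() {
    auto viewer = iris::create_viewer(nullptr);   // bind the (hypothetical) window
    viewer->open_slide("example_slide.svs");
    viewer->set_view(0.5, 0.5, 20.0);
}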
Fig. 2
Whole slide image (WSI) to visible draw-space tile mapping scheme. The WSI layer pyramid (left), comprising multiple sub-sampled image layers at decreasing resolution (zoom level), is mapped to visible tiles drawn to the screen as part of Iris' rendering pipelines (right). These visible tiles represent 256 × 256-pixel regions of the WSI pyramid layers, and the number of tiles rendered is equal to the screen dimension divided by 256 pixels and multiplied by the current zoom factor for that layer. The high-resolution (HR) layer represents ≤1.0 WSI pixel per monitor pixel (i.e., these tiles are shrunken to less than their native size), whereas the low-resolution (LR) layer represents >1.0 WSI pixel per monitor pixel (i.e., these tiles are enlarged to greater than their native size). Any unbuffered tile space is transparent, such that aspects of the LR layer remain visible during high-resolution buffering.
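The tile-count relationship described above can be expressed directly in code. The sketch below follows one reading of the caption (tiles per screen axis = screen extent / 256 × layer zoom factor, rounded up so partially visible tiles are counted); the function and variable names are illustrative, not taken from the Iris source.

// Worked sketch of the tile-count arithmetic; names are illustrative.
#include <cmath>
#include <cstdint>
#include <iostream>

constexpr uint32_t kTileExtent = 256; // tiles are 256 x 256 pixels

// Number of visible tiles along one screen axis for a given pyramid layer.
uint32_t tiles_along_axis(uint32_t screen_extent_px, double layer_zoom_factor) {
    return static_cast<uint32_t>(
        std::ceil(screen_extent_px / static_cast<double>(kTileExtent) * layer_zoom_factor));
}

int main() {
    // Example: a 2560 x 1440 viewport sampling a layer at its native scale.
    uint32_t columns = tiles_along_axis(2560, 1.0);
    uint32_t rows    = tiles_along_axis(1440, 1.0);
    std::cout << columns << " x " << rows << " = " << columns * rows << " tiles\n";
}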
Fig. 3
Iris system architecture detailing the CPU and GPU pipelines. The CPU engine architecture (top) involves coordination between multiple concurrent thread executions with simultaneous GPU submissions (arrows that cross to the bottom). The user interface (UI) threads signal the rendering, buffering, and loading threads of region updates. The rendering thread pulls GPU resources (blue arrow) to render via an atomically updated VRAM-to-slide-location map (Fig. 4), updated in real time by the buffer thread (red arrow). The buffer thread executes lock-less high- and low-priority queues and stacks by pulling (blue arrow) raw image data from a short-term RAM cache and initiating GPU transfer and compute pipelines. Additionally, the buffering thread queues and prioritizes concurrent loader threads that read slide data from a slide file or a locally cached server slide file (blue arrow). Multiple concurrent GPU queue families (bottom) execute render and buffer commands, all of which may be submitted simultaneously. Transfer commands and the enhance/downsample pipeline (Fig. 5) are executed in series, with synchronization of the L1/L2 GPU cache between queues. Actively buffering tiles (*) may be rendered during a render pass if the transfer completes before fragment shader execution and control of that location within the L1/L2 cache is transferred to the render queue. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
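The signaling between the UI thread and the buffer thread can be reduced to a small conceptual sketch: the UI thread publishes a new view region and sets an atomic dirty flag, and the buffer thread "pulls" work only when that flag is set. This is an illustrative reduction under assumed names, not the Iris implementation.

// Conceptual sketch of UI-thread -> buffer-thread region signaling.
#include <atomic>
#include <chrono>
#include <cstdio>
#include <thread>

struct ViewRegion {            // region of the slide currently on screen
    double x = 0, y = 0, zoom = 1.0;
};

std::atomic<bool> region_dirty{false};   // set by the UI thread, cleared by the buffer thread
std::atomic<bool> running{true};
ViewRegion latest_region;                // written by the UI thread before the flag is set

// Buffer thread: pulls only when a new region has been signaled, then would
// enqueue GPU transfer/compute microtransactions for any unbuffered tiles.
void buffer_thread_main() {
    while (running.load(std::memory_order_acquire)) {
        if (region_dirty.exchange(false, std::memory_order_acq_rel)) {
            std::printf("buffering region around (%.2f, %.2f) @ %.1fx\n",
                        latest_region.x, latest_region.y, latest_region.zoom);
            // ... identify missing tiles, submit transfer + compute queues ...
        }
        std::this_thread::sleep_for(std::chrono::microseconds(250));
    }
}

int main() {
    std::thread buffer(buffer_thread_main);
    // UI thread: the user pans the slide, so publish the new region and signal.
    latest_region = {0.42, 0.61, 20.0};
    region_dirty.store(true, std::memory_order_release);
    std::this_thread::sleep_for(std::chrono::milliseconds(5));
    running.store(false, std::memory_order_release);
    buffer.join();
}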
Fig. 4
Simultaneous buffer-render coordination schema. Physical tile locations with index values (left) are hash-mapped to the respective tile image data residing within associated GPU VRAM allocations. VRAM image allocations are also indexed within an Iris VRAM Allocation Wrapper array. Allocations are recycled and their image data are overwritten to avoid GPU allocation overhead. Atomic reference counting and atomic status flags are associated with each Iris Wrapper Instance to track use across concurrent threads. Read commands, illustrated in blue, show how a tile location is mapped to the proper index within the Iris Wrapper Instance array for drawing the image allocation associated with active tiles (1485, for example). An indeterminate status may also exist for outstanding tiles based upon GPU buffering progression (1490). Image data are read from a dynamic short-term cache, and write commands, denoted in red, transfer image data into available (status TILE_FREE) allocations during microtransactions. These allocations are identified when the thread queries the Iris Wrapper Instance array for available allocation wrapper instances or purges the state of mapped tiles away from the rendering view. Purging an allocation simply requires flipping the atomic flag from TILE_ACTIVE to TILE_FREE, as buffering overwrites the prior image data. This simple schema makes image deallocation or clearing unnecessary; however, it does require proper GPU L2 cache control mechanisms to avoid rendering stale data. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
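A compact C++ sketch of this allocation-wrapper bookkeeping follows: recycled VRAM allocations carry an atomic status flag and an atomic reference count, a free wrapper is claimed with a lock-free compare-exchange, and purging is a single flag flip. The structure, the intermediate TILE_PENDING state, and all names are assumptions for illustration, not Iris source code.

// Illustrative sketch of atomic allocation-wrapper bookkeeping.
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

enum class TileStatus : uint8_t { TILE_FREE, TILE_PENDING, TILE_ACTIVE };

// Wraps one recycled VRAM image allocation. The image memory itself is never
// freed; its contents are simply overwritten on reuse.
struct AllocationWrapper {
    std::atomic<TileStatus> status{TileStatus::TILE_FREE};
    std::atomic<uint32_t>   readers{0};   // atomic reference count across threads
    // VkImage / VkDeviceMemory handles would live here in the real engine.
};

class WrapperArray {
public:
    explicit WrapperArray(std::size_t count) : wrappers_(count) {}

    // Buffer thread: claim a free wrapper for an incoming tile. The
    // compare-exchange makes the claim safe without locks.
    std::optional<std::size_t> claim_free() {
        for (std::size_t i = 0; i < wrappers_.size(); ++i) {
            TileStatus expected = TileStatus::TILE_FREE;
            if (wrappers_[i].status.compare_exchange_strong(
                    expected, TileStatus::TILE_PENDING, std::memory_order_acq_rel))
                return i;
        }
        return std::nullopt;   // none free; purge off-screen tiles instead
    }

    // Buffer thread: mark the transfer complete so the render thread may draw it.
    void publish(std::size_t i) {
        wrappers_[i].status.store(TileStatus::TILE_ACTIVE, std::memory_order_release);
    }

    // Purging a tile that left the view is just a flag flip; no deallocation
    // or image clear is required (stale contents are overwritten on reuse).
    void purge(std::size_t i) {
        wrappers_[i].status.store(TileStatus::TILE_FREE, std::memory_order_release);
    }

private:
    std::vector<AllocationWrapper> wrappers_;
};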
Fig. 5
Single-pass mipmap generation compute pipeline schema. During a buffering microtransaction, all tiles (left) within the microtransaction are enqueued for mipmap generation. Each mipmap for a given tile is generated simultaneously from the reference tile (1×) that was buffered to the GPU using a single pipeline pass, a process known as single-pass downsampling. Variably sized, normalized 2D Laplacian sharpening kernels (larger kernels for greater reduction), comprising floating-point values, are convolved over the reference image for each mipmap level (2×, 4×, 8×, etc.) to generate images of the highest possible visual quality. The resultant tiles demonstrate enhanced cytological detail when sampled for rendering at lower magnification.
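As a rough CPU-side analogue of this sharpened reduction, the sketch below produces one mip level directly from the reference tile and adds back a Laplacian term sampled at the reduction stride. The real pipeline runs as a single GPU compute pass; the kernel construction and sharpening weight here are illustrative assumptions, not the shader used by Iris.

// CPU-side sketch of a "sharpened reduction" of a greyscale tile.
#include <algorithm>
#include <cstddef>
#include <vector>

// Greyscale tile stored row-major, dim x dim pixels with values in [0, 1].
struct Tile {
    int dim = 0;
    std::vector<float> px;
    float at(int x, int y) const {
        x = std::clamp(x, 0, dim - 1);
        y = std::clamp(y, 0, dim - 1);
        return px[static_cast<std::size_t>(y) * dim + x];
    }
};

// Produce one mip level (factor = 2, 4, 8, ...) directly from the reference.
Tile sharpened_reduce(const Tile& ref, int factor, float sharpen_weight = 0.25f) {
    Tile out;
    out.dim = ref.dim / factor;
    out.px.resize(static_cast<std::size_t>(out.dim) * out.dim);
    for (int y = 0; y < out.dim; ++y) {
        for (int x = 0; x < out.dim; ++x) {
            // Box average over the factor x factor footprint in the reference.
            float sum = 0.f;
            for (int j = 0; j < factor; ++j)
                for (int i = 0; i < factor; ++i)
                    sum += ref.at(x * factor + i, y * factor + j);
            float mean = sum / static_cast<float>(factor * factor);

            // Laplacian sampled at the reduction stride (a larger effective
            // kernel for greater reduction, loosely mirroring the caption).
            int cx = x * factor + factor / 2, cy = y * factor + factor / 2;
            float lap = 4.f * ref.at(cx, cy)
                      - ref.at(cx - factor, cy) - ref.at(cx + factor, cy)
                      - ref.at(cx, cy - factor) - ref.at(cx, cy + factor);

            out.px[static_cast<std::size_t>(y) * out.dim + x] =
                std::clamp(mean + sharpen_weight * lap, 0.f, 1.f);
        }
    }
    return out;
}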
Fig. 6
Rapid tile buffering sequence (RTBS) atomic microtransactions. The RTBS breaks pending buffering transactions into numerous microtransactions (which use atomic state signals for concurrency and GPU device synchronization) and coordinates GPU resource recycling to avoid allocation overhead. Using the described "pull" design principles, the transactions buffer only data immediately available in regions pending buffering. This results in multiple microtransactions per frame, with the GPU scheduler staggering transactions to saturate GPU cores. Unavailable data are prioritized by the loader threads, and unbuffered tiles are made transparent so that only the low-resolution (LR) regions are rendered while buffering completes the high-detail sections.
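The "pull" splitting of a pending buffering pass into microtransactions can be sketched as follows: tiles already decoded into the short-term RAM cache are batched for immediate GPU transfer, while misses are deferred to the loader threads and picked up by a later microtransaction. The cache interface and all names are assumptions for illustration.

// Conceptual sketch of building one buffering microtransaction.
#include <cstdint>
#include <unordered_map>
#include <vector>

using TileIndex  = uint32_t;
using TilePixels = std::vector<uint8_t>;   // decompressed 256 x 256 tile data

struct Microtransaction {
    std::vector<TileIndex> tiles;   // tiles whose data is immediately available
};

// One pass over the tiles pending for the new field of view: tiles present in
// the short-term cache are batched into a microtransaction that can be
// submitted to the GPU transfer queue this frame; misses are queued for the
// loader threads and buffered on a later pass.
Microtransaction build_microtransaction(
    const std::vector<TileIndex>& pending,
    const std::unordered_map<TileIndex, TilePixels>& ram_cache,
    std::vector<TileIndex>& loader_queue)
{
    Microtransaction txn;
    for (TileIndex tile : pending) {
        if (ram_cache.count(tile))
            txn.tiles.push_back(tile);     // buffer now (pull)
        else
            loader_queue.push_back(tile);  // decode first, buffer later
    }
    return txn;
}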
Fig. 7
Iris buffering rate expressed as time for field of view (TeFOV) and time per tile (TPT). Scatter plots of individual buffering events and boxplots show aggregate TeFOV (top) and TPT (bottom), plotted on a base-10 logarithmic scale to show order-of-magnitude performance differences. Median TeFOV times when decoding with Iris Codec were 10 ms (LR-FOV) and 25 ms (HR-FOV); when decoding with OpenSlide, 56 ms (LR-FOV) and 124 ms (HR-FOV). These TeFOV times are shown in the context of the previously published high-performance JavaScript implementation by Schüffler et al., denoted by an asterisk (*). TPT values when decoding with Iris Codec were 160 μs (LR-TPT) and 100 μs (HR-TPT); when decoding with OpenSlide, 1320 μs (LR-TPT) and 600 μs (HR-TPT).
Fig. 8
Iris system performance trace for an example iPadOS implementation. Use-traces (left) over a 12 s period (approximately 7–9 ms frames smoothed over 48 ms) and the corresponding performance distributions for the trace (right) showed a sustained 120 FPS (119.4–121.9 FPS) rendering rate and ≤2 ms shader execution times with rapid slide navigation during buffering. The rapid tile buffer rate (bottom left) during the median 25 ms TeFOVs (Fig. 5) occurred at a median rate of 1.39 GB/s (1.01–1.69 GB/s). Shortened shader execution times were noted coincident with large data boluses (red arrows) and reflect high-speed access to slide image data residing within the L1/L2 cache as a direct result of the preceding buffering transaction, without the overhead of pulling the image data from VRAM; this corresponds with increased FPS. Differences between the trace (left) and the performance distributions (right) are due to averaged smoothing in the FPS/buffering trace. A drop in FPS was noted during Iris' interaction with iOS PencilKit (external drawing calls) during an annotation event (a brief reduction to 80 FPS on the right side of the trace at approximately 10.5 s). A video recording of use during the capture of this trace is provided as Video 1 in the online supplements. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
