Computational Infrastructure

Last Updated: January 24, 2012


High Performance Computation Middleware

Software that extracts and interprets information from large imaging datasets must handle expensive data processing, thousands of multi-gigabyte images, and trillions of microscopic objects and their features. Detailed characterization of morphology in a large image dataset requires the coordinated use of many interrelated analysis pipelines, as well as comparison of results across pipelines and analysis runs. A single analysis run involves pipelines of cascaded methods, including: 1) data transformation tasks such as thresholding, tessellation, and color and illumination normalization; 2) segmentation of structures such as cells and nuclei; 3) characterization of shape and texture features of segmented structures; and 4) machine-learning methods that integrate information from features to accomplish classification tasks.
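The four cascaded stages can be illustrated with a toy sketch. This is not our production pipeline: the function names, the tiny list-of-lists image, the breadth-first labeling, and the fixed area rule standing in for a trained classifier are all assumptions chosen to keep the example self-contained.

```python
from collections import deque


def threshold(image, cutoff):
    """Stage 1 (data transformation): binarize a grayscale image."""
    return [[1 if px >= cutoff else 0 for px in row] for row in image]


def segment(mask):
    """Stage 2 (segmentation): 4-connected component labeling via BFS."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    objects = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                pixels, queue = [], deque([(r, c)])
                seen[r][c] = True
                while queue:
                    y, x = queue.popleft()
                    pixels.append((y, x))
                    for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        inside = 0 <= ny < rows and 0 <= nx < cols
                        if inside and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                objects.append(pixels)
    return objects


def features(pixels):
    """Stage 3 (feature characterization): area and bounding-box size."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    return {
        "area": len(pixels),
        "height": max(ys) - min(ys) + 1,
        "width": max(xs) - min(xs) + 1,
    }


def classify(feat, min_area=3):
    """Stage 4 (classification): a fixed rule standing in for a trained model."""
    return "nucleus" if feat["area"] >= min_area else "debris"


def run_pipeline(image, cutoff=128):
    """Chain the four stages; each stage consumes the previous stage's output."""
    mask = threshold(image, cutoff)
    return [(f, classify(f)) for f in map(features, segment(mask))]


# Hypothetical 3x5 grayscale tile with one 2x2 blob and one isolated pixel.
example_image = [
    [200, 210, 0, 0, 0],
    [220, 205, 0, 0, 150],
    [0, 0, 0, 0, 0],
]
example_result = run_pipeline(example_image)
```

At real scale each stage operates on image tiles and its output is stored for later comparison across runs, which is what drives the computational cost described above.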

Emerging trends in high performance computing (HPC) architectures offer viable platforms that can be exploited to address computational challenges associated with these pipelines.

We leverage ongoing research in large scale data processing middleware and high performance computing techniques to speed the execution of analysis computations. We develop runtime methods and tools to manage complex processing and data interactions under heterogeneous computing environments consisting of multi-core CPUs and multiple GPUs.

Our work focuses on data-centric, performance-aware dynamic scheduling that maps data and operations across heterogeneous computing devices. The approach attempts to fully exploit both types of computing devices (GPUs and CPUs) on a computation node to execute the multiple tasks assigned to that node.
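One simple way to picture performance-aware scheduling on a single node is a greedy rule: place each task on the device that is estimated to finish it soonest. The sketch below is only an illustration of that idea, not the scheduler described in the references. The `schedule` function, the device identifiers, and the per-device time estimates are hypothetical, and data transfer costs are ignored.

```python
def schedule(tasks, devices):
    """Greedy earliest-finish-time assignment of tasks to devices.

    tasks:   list of (task_name, {device_kind: estimated_seconds})
    devices: list of (device_id, device_kind)
    Returns a list of (task_name, device_id, estimated_finish_time).
    """
    free_at = {dev_id: 0.0 for dev_id, _ in devices}  # when each device is idle
    plan = []
    for name, estimates in tasks:
        best = None
        for dev_id, kind in devices:
            if kind not in estimates:
                continue  # this task has no implementation for this device kind
            finish = free_at[dev_id] + estimates[kind]
            if best is None or finish < best[0]:
                best = (finish, dev_id)
        if best is None:
            raise ValueError(f"no device can run task {name!r}")
        finish, dev_id = best
        free_at[dev_id] = finish
        plan.append((name, dev_id, finish))
    return plan


# Hypothetical node with one CPU core and one GPU, and made-up time estimates.
example_devices = [("cpu0", "cpu"), ("gpu0", "gpu")]
example_tasks = [
    ("segment_tile_1", {"cpu": 8.0, "gpu": 1.0}),
    ("segment_tile_2", {"cpu": 8.0, "gpu": 1.0}),
    ("normalize_tile_1", {"cpu": 2.0, "gpu": 1.5}),
    ("features_tile_1", {"cpu": 3.0, "gpu": 2.5}),
]
example_plan = schedule(example_tasks, example_devices)
```

In this example the normalization task runs on the CPU even though the GPU is nominally faster for it, because the GPU is already busy with segmentation. A data-centric scheduler would additionally account for where each task's input data currently resides.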

References

1.    Tahsin Kurc, Patrick Widener, Wenjin Chen, Fusheng Wang, Lin Yang, Jun Hu, Vijay Kumar, Vicky Chu, Lee Cooper, Jun Kong, Ashish Sharma, Tony Pan, Joel Saltz, and David Foran, “Grid-Enabled, High performance Microscopy Image Analysis”, The 2nd International Workshop on High-Performance Medical Image Computing for Image-Assisted Clinical Intervention and Decision-Making, Beijing, China, HPMICCAI 2010 International Workshop. pp. 70-79, 2010.

2.    David J Foran, Lin Yang, Wenjin Chen, Jun Hu, Lauri A Goodell, Michael Reiss, Fusheng Wang, Tahsin Kurc, Tony Pan, Ashish Sharma, Joel H Saltz. "ImageMiner: a software system for comparative analysis of tissue microarrays using content-based image retrieval, high-performance computing, and grid technology." Journal of the American Medical Informatics Association 18(4): 403-415. 2011.

3.    George Teodoro, Tahsin M. Kurc, Tony Pan, Lee A.D. Cooper, Jun Kong, Patrick Widener, Joel H. Saltz, "Accelerating Large Scale Image Analyses on Parallel, CPU-GPU Equipped Systems", 26th IEEE International Parallel & Distributed Processing Symposium, IPDPS 2012.
 

Information Models and Data Management Systems

Scientific in silico studies of brain tumors undertaken by our group and collaborators generate large heterogeneous collections of quantitative features from pathology slides. A typical whole slide pathology image contains 20 billion pixels (with digitization at 40X objective magnification). An 8-bit color uncompressed representation of this image is about 56 GB in size. Image analysis algorithms segment and classify 10^5 to 10^7 cells in each virtual slide of size 10^5 by 10^5 pixels. When multiple interrelated analysis pipelines are executed, a systematic analysis of large-scale image data involves classification of roughly 10^9 to 10^12 micro-anatomic structures. Each cell is classified using roughly 10 to 100 shape, texture and (when appropriate) stain quantification features. As a result, a thorough data analysis limited to classifying cells could encompass 10^10 to 10^13 features.
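The image size quoted above can be checked with a few lines of arithmetic. The sketch assumes that "8-bit color" means three channels at 8 bits each and that the size is expressed in binary gigabytes (2^30 bytes); both are our reading of the figures, and the function name is hypothetical.

```python
def uncompressed_size_gib(pixels, channels=3, bits_per_channel=8):
    """Uncompressed image size in binary gigabytes (GiB, 2**30 bytes)."""
    total_bytes = pixels * channels * bits_per_channel // 8
    return total_bytes / 2**30


# 20 billion pixels, as stated for a typical whole slide image at 40X.
whole_slide_gib = uncompressed_size_gib(20 * 10**9)
print(f"{whole_slide_gib:.1f} GiB")
```

Under these assumptions 20 billion pixels occupy 60 billion bytes, or just under 56 binary gigabytes, which matches the size given in the text.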

These large numbers of objects, features and classifications obtained from analysis runs need to be indexed and efficiently managed for data sharing purposes and to support various types of data selection, comparison and statistical operations.

To support these data modeling and management requirements, we have developed an object-oriented, extensible information model, referred to as Pathology Analytical Imaging Standards (PAIS), together with a database implementation that stores computerized image analysis results and human observations. The PAIS model is designed to represent results from whole slide and tissue microarray image data analyses. The whole slide image analysis requirements come from the In Silico Brain Tumor Research Center project (ISBTRC; one of the six In Silico Research Centers of Excellence funded by the NCI caBIG program), and the tissue microarray requirements come from the comparative analysis of tissue microarrays (TMAs), a project funded by the NIH. We have also implemented a database with a spatial database extension running on a parallel database architecture. The PAIS database efficiently supports expressive queries for integrative correlation studies.
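To give a feel for the kind of spatial query such a database answers, the sketch below stores segmented objects as bounding boxes in an in-memory SQLite table and retrieves the objects that one algorithm found inside a window of a slide. It is not the PAIS schema or its spatial extension: the `markup` table, its columns, and both helper functions are hypothetical, and plain SQLite has no spatial index, so the window test is written as simple range comparisons.

```python
import sqlite3


def build_db(rows):
    """Create an in-memory table of object bounding boxes (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE markup (
               id INTEGER PRIMARY KEY,
               image_id TEXT, algorithm TEXT,
               min_x REAL, min_y REAL, max_x REAL, max_y REAL,
               area REAL)"""
    )
    conn.executemany(
        """INSERT INTO markup
               (image_id, algorithm, min_x, min_y, max_x, max_y, area)
           VALUES (?, ?, ?, ?, ?, ?, ?)""",
        rows,
    )
    return conn


def objects_in_window(conn, image_id, algorithm, x0, y0, x1, y1):
    """Return ids of objects whose bounding box intersects the query window."""
    cur = conn.execute(
        """SELECT id FROM markup
           WHERE image_id = ? AND algorithm = ?
             AND max_x >= ? AND min_x <= ?
             AND max_y >= ? AND min_y <= ?
           ORDER BY id""",
        (image_id, algorithm, x0, x1, y0, y1),
    )
    return [row[0] for row in cur]


# Made-up results from two hypothetical segmentation algorithms on one slide.
example_conn = build_db([
    ("slide1", "algA", 0, 0, 10, 10, 80),
    ("slide1", "algA", 100, 100, 110, 110, 90),
    ("slide1", "algB", 5, 5, 12, 12, 40),
])
alg_a_hits = objects_in_window(example_conn, "slide1", "algA", 0, 0, 20, 20)
alg_b_hits = objects_in_window(example_conn, "slide1", "algB", 0, 0, 20, 20)
```

Running the same window query per algorithm is a simple starting point for comparing analysis results; at the data volumes described above, a spatial index and a parallel database are what make such queries practical.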

References

1.   Fusheng Wang, Tae W. Oh, Cristobal Vergara-Niedermayr, Tahsin Kurc, Joel Saltz, "Managing and Querying Whole Slide Images." In Proc. of SPIE Medical Imaging, San Diego, CA. Feb 4-9, 2012.

2.    Fusheng Wang, Jun Kong, Lee Cooper, Tony Pan, Tahsin Kurc, Wenjin Chen, Ashish Sharma, Cristobal Niedermayr, Tae W. Oh, Daniel Brat, Alton B. Farris, David Foran, Joel Saltz, “A Data Model and Database for High-resolution Pathology Analytical Image Informatics,” Journal of Pathology Informatics, Vol. 2, Issue 1, pp. 32-40, 2011.

3.    Fusheng Wang, Jun Kong, Jingjing Gao, Cristobal Vergara-Niedermayr, David Adler, Lee Cooper, Weian Chen, Tahsin Kurc, Joel Saltz, “High Performance Analytical Pathology Imaging Database for Algorithm Evaluation,” Workshop on High Performance and Distributed Computing for Medical Imaging HP-MICCAI/MICCAI-DCI, September 2011.

4.    F. Wang, R. Lee, X. Zhang and J. Saltz, "Towards Building High Performance Medical Image Management System for Clinical Trials." In Proc. of SPIE Medical Imaging, Feb 12-17, 2011.

5.    Fusheng Wang, Tahsin Kurc, Patrick Widener, Tony Pan, Jun Kong, Lee Cooper, David Gutman, Ashish Sharma, Sharath Cholleti, Vijay Kumar and Joel Saltz,  “High-performance Systems for In Silico Microscopy Imaging Studies," The 7th International Conference on Data Integration in the Life Sciences, August 2010.

6.    David J Foran, Lin Yang, Wenjin Chen, Jun Hu, Lauri A Goodell, Michael Reiss, Fusheng Wang, Tahsin Kurc, Tony Pan, Ashish Sharma, Joel H Saltz. "ImageMiner: a software system for comparative analysis of tissue microarrays using content-based image retrieval, high-performance computing, and grid technology." Journal of the American Medical Informatics Association 18(4): 403-415. 2011.




 

