Big Data Approaches to Cancer Immunotherapy

Daphne E. Schlesinger1,3,4, Tricia Cottrell5, Peter Nguyen6,7, Sneha Berry6,7,
Benjamin Green6,7, Nicolas Giraldo-Castillo5, Janice M. Taube5,6,7, and Alex
1Johns Hopkins Department of Physics & Astronomy; 2Johns Hopkins Department of Computer Science; 3The Institute for Data Intensive Engineering and Science; 4Johns Hopkins Department of Biomedical Engineering; 5Johns Hopkins Department of Pathology; 6Johns Hopkins Department of Dermatology; 7Bloomberg-Kimmel Institute for Cancer Immunotherapy


Targeted immunotherapy is an increasingly promising area of innovation in cancer treatment. In particular, researchers have identified a correlation between patient survival and the spatial relationship between T cells and the tumor margin [1]. Furthermore, the expression of transmembrane proteins on tumor cells, most notably PD-L1, reflects the immunogenicity of a given cancer and can inform personalized treatment options for patients [2]. These tissue parameters can be studied with various staining and microscopy modalities, including multiplexed immunofluorescence and multispectral imaging. Some microscope software provides limited methods for segmenting tissue and phenotyping individual cells. At present, however, any analysis of cells and tissue remains highly labor-intensive, time-consuming, and not particularly precise.

Given both the immense treatment opportunities and the evident need for “big data” analysis in cancer immunotherapy, this work seeks to develop an efficient workflow for acquiring, processing, and providing open access to histological scans. This involves developing computational methods for spatially mapping different cell types with respect to tumor margins, using image data from multispectral microscopy. In particular, the high-dimensional data is decomposed into 8 principal components and viewed in a custom-designed user interface alongside more conventional hematoxylin and eosin (H&E) stains, allowing pathologists to identify associations between each principal component and tissue features. Images have also been assembled in a database for efficient access. In the longer term, the intention is to apply advanced deep learning algorithms to the images for segmentation and precise cell phenotype classification.
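The decomposition step can be illustrated with a minimal sketch. It assumes the multispectral scan is available as an array of shape (height, width, bands) and uses a plain eigendecomposition of the band covariance matrix. The function name `pca_decompose`, the 35-band synthetic cube, and the image size are all hypothetical and chosen only for illustration; this is not the pipeline used in this work.

```python
import numpy as np


def pca_decompose(cube, n_components=8):
    """Project an (H, W, C) multispectral image onto its top principal components.

    Returns the component images (H, W, n_components), the principal axes
    (n_components, C), and the fraction of variance each component explains.
    """
    h, w, c = cube.shape
    # Treat every pixel as one observation with C spectral features.
    pixels = cube.reshape(-1, c).astype(np.float64)
    centered = pixels - pixels.mean(axis=0)

    # Covariance between spectral bands (C x C), small even for large images.
    cov = centered.T @ centered / (centered.shape[0] - 1)

    # eigh returns eigenvalues in ascending order, so reverse to descending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    axes = eigvecs[:, :n_components].T           # (n_components, C)
    scores = centered @ axes.T                   # (H*W, n_components)
    explained = eigvals[:n_components] / eigvals.sum()
    return scores.reshape(h, w, n_components), axes, explained


# Synthetic stand-in for a multispectral tile (assumed 35 spectral bands).
rng = np.random.default_rng(0)
cube = rng.random((32, 32, 35))
component_images, axes, explained = pca_decompose(cube, n_components=8)
```

Each slice `component_images[:, :, k]` is a grayscale image that could be displayed next to the corresponding H&E field, which is the kind of side-by-side comparison the interface described above is meant to support.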

[1] Carstens, J.L. et al. Spatial computation of intratumoral T cells correlates with survival of patients with pancreatic cancer. Nature Communications 8, 15095 (2017).
[2] Nghiem, P.T. et al. PD-1 Blockade with Pembrolizumab in Advanced Merkel-Cell Carcinoma. The New England Journal of Medicine 374, 2542-2552 (2016).