Enabling a More Rapid and Accurate Diagnosis
Every patient’s journey with cancer begins with a diagnosis, and pathologists are expertly trained to make that diagnosis through the examination of tissues, body fluids, and cells. Since the dawn of pathology as a recognized medical field, the primary tool used to examine and detect disease has been the microscope, a technology that was first developed anywhere from 300 to 800 years ago. Although modern microscopes have advanced through numerous levels of sophistication and have long since proven to be quite reliable in the routine diagnosis of cancer and other disease states, the fact remains that pathologists and laboratorians have been using essentially the same tools and diagnostic criteria for decades. These include microscopes for magnification, stains (made popular by the textile industry in the mid-1800s) to add contrast to thin tissue sections, and qualitative or semi-quantitative grading systems, which provide diagnostic and therapy decision-making value, yet likewise have not been revised for several decades.
Digital technologies have seen significant growth in the last three decades and in that time, the concept of ‘digital pathology’ has grown in its scope and clinical ability. As a tool for pathology, digital imaging is enabling new waves of innovation that signal a number of advancements to the practice, including these key capabilities, the last of which is perhaps most compelling:
Following current diagnostic procedures, a pathologic report of a cancer diagnosis usually comprises key information including cancer type and grade. This information assists the clinician in determining the prognosis and probable outcome of a malignancy, and provides guidance on the course of treatment that will give the patient the best chance of recovery. However, ensuring accurate and precise cancer typing and grading is an ongoing challenge. There can be considerable subjectivity in cancer interpretation among multiple pathologists, or even with a single pathologist working in a different time or place. The interpretation of breast cancer is a perfect example: over 20 histological subtypes exist, and the grade of the cancer is assigned through a combination of multiple, visually determined criteria that are essentially boiled down into three broad categories (Grades I, II, and III) through the calculation of Nottingham scores. Furthermore, breast cancer is a morphologically and molecularly heterogeneous disease; grading is generally assessed by observing specific, characteristic traits, yet those traits tend to vary across the tumor. Regardless, this grading system is currently our best indicator for overall patient survival. The higher the Nottingham score and resulting tumor grade, the worse the prognosis will be for the patient.
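The way a Nottingham score collapses into a grade can be sketched in a few lines. This is a minimal illustration that assumes the standard scheme: three components (tubule formation, nuclear pleomorphism, and mitotic count), each scored from 1 to 3, with totals of 3 to 5 mapped to Grade I, 6 to 7 to Grade II, and 8 to 9 to Grade III. The function name is hypothetical and is not taken from any pathology software.

```python
def nottingham_grade(tubule_formation, nuclear_pleomorphism, mitotic_count):
    """Return (total score, grade) from three component scores, each 1 to 3.

    Assumes the commonly used cut-offs: 3-5 -> I, 6-7 -> II, 8-9 -> III.
    """
    components = (tubule_formation, nuclear_pleomorphism, mitotic_count)
    if any(score not in (1, 2, 3) for score in components):
        raise ValueError("each component score must be 1, 2, or 3")

    total = sum(components)  # always between 3 and 9
    if total <= 5:
        grade = "I"
    elif total <= 7:
        grade = "II"
    else:
        grade = "III"
    return total, grade


# Example: moderate tubule formation and pleomorphism, high mitotic count.
example_total, example_grade = nottingham_grade(2, 2, 3)
print(example_total, example_grade)
```

Note how much information is discarded: 27 possible combinations of visually estimated scores are reduced to three grades, which is part of why borderline cases are hard to reproduce.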
To elaborate, the Nottingham grading system is based on the morphological features of tubule formation, nuclear pleomorphism, and mitotic count.
These are all vital aspects, but it is worth considering the scope of nuclear pleomorphism in particular. Nuclear pleomorphism assesses the shape of cell nuclei relative to one another. Abnormal shape changes due to activation of oncogenes and spatial reorganization of chromatin in cancer nuclei lead to abnormal cell production and tumor malignancy. These shape changes in a tumor’s nuclei indicate that important changes are occurring; we currently know that irregular changes in shape indicate a poor prognosis, whereas regular, round nuclei indicate a relatively positive outcome for most patients. Many scientists believe a great deal of information is being revealed by these changes in each individual nucleus. So the question is: If pathologists have information about the exact shape of all of the cells in the tumor, can they offer a diagnosis with additional accuracy, thus allowing for a more patient-specific treatment? Furthermore, if a computer can recognize multiple features including shape, size, color, and texture, can it help pathologists establish a grading system superior to what is possible with the human eye alone?
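To make "exact shape" concrete, one simple descriptor a computer could calculate for every nucleus is circularity, 4πA/P², which equals 1 for a perfect circle and falls toward 0 as the outline becomes elongated or irregular. The sketch below is only an illustration: circularity is an assumed example of a shape feature, not a measure named in this article, and the elliptical outlines are synthetic stand-ins for segmented nuclei.

```python
import math


def circularity(contour):
    """4*pi*area / perimeter**2 for a closed polygon of (x, y) vertices."""
    twice_area = 0.0  # shoelace formula accumulates twice the signed area
    perimeter = 0.0
    n = len(contour)
    for i in range(n):
        x0, y0 = contour[i]
        x1, y1 = contour[(i + 1) % n]  # wrap around to close the outline
        twice_area += x0 * y1 - x1 * y0
        perimeter += math.hypot(x1 - x0, y1 - y0)
    area = abs(twice_area) / 2.0
    return 4.0 * math.pi * area / perimeter ** 2


def ellipse_contour(a, b, points=360):
    """Synthetic outline: an ellipse with semi-axes a and b (hypothetical data)."""
    return [
        (a * math.cos(2 * math.pi * k / points), b * math.sin(2 * math.pi * k / points))
        for k in range(points)
    ]


round_nucleus = circularity(ellipse_contour(5.0, 5.0))
elongated_nucleus = circularity(ellipse_contour(10.0, 2.5))
print(round(round_nucleus, 3), round(elongated_nucleus, 3))
```

A value like this can be computed identically for thousands of nuclei on a slide, which is the kind of consistency that visual estimation cannot offer.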
Computationally Enhanced Visualization
Given that the Nottingham grading assessment is based, in part, on nuclear pleomorphic features such as the shape, size, color, and intensity of individual cells, computationally differentiated and classified nuclei may enhance diagnosis and grading. Computational analysis is fast emerging as a reliable second-opinion companion to assist the pathologist. With this in mind, we are exploring the continual learning ability of an Artificial Neural Network (ANN) classifier to determine how it could impact the accuracy of cancer detection and prognosis prediction. An ANN learns from its past experiences and errors in a non-linear, parallel processing manner, much like neurons in the human brain; hence the name (see SIDEBAR 1). Neural network classifiers can be tested with different combinations of training, validation, and out-of-sample testing data for breast cancer patients. This computational approach is proposed to support rapid diagnostic pathology with fewer errors. For next-generation health care solutions, highly automated digital pathology tools could not only deliver accurate and timely decisions, but could also serve pathologists as a diagnostic companion offering a second opinion in an environment that requires rapid decision-making.
Pathologists may visually inspect hundreds of slides per day under a microscope, which is clearly a challenging task. Exacerbating this issue is the lack of agreement among pathologists, as well as poor reproducibility in the most challenging grade II malignancies. To address this, we propose that a series of algorithms, such as artificial neural networks, K-means, Gaussian mixture models, and fuzzy C-means, may someday reduce the risk of diagnostic errors by giving the pathologist more information than is currently available.
The most popular algorithm used in ANNs today is feed-forward back-propagation (FFBP, see SIDEBAR 2). The foundational idea is that breast cancer prognostic tools can be designed around the powerful learning and processing features of an ANN in a probabilistic and noisy environment. The neuron is the basic calculating entity: it computes a value from a number of inputs and delivers one output by comparison with an established threshold value. The computational processing is performed by an internal arrangement of hidden layers, which are trained with the FFBP mechanism to deliver outputs with high accuracy rates. Learning can be supervised (a target output is provided) or unsupervised (no target); the latter mimics the learning pattern of biological neurons. Supervised learning with back-propagation allows the user to influence the output because the correct result is known: when the actual output does not match the desired output, the network is adjusted. The concept is akin to training your brain to learn by example, rather than coming to a novel conclusion without knowing the right answer.
To test this, we designed and performed experiments on two commercially available neural network development platforms, evaluating pattern recognition and classification tools for breast cancer data analysis in order to distinguish benign cells from malignant cells. The data set included 699 patients, each described by nine attributes such as uniformity of cell size and clump thickness. The data are categorical in nature, with a dependent variable indicating the benign or malignant tumor class.
Multiple experiments were performed to classify the data and to test various numbers of hidden neurons while varying the percentages of training, validation, and out-of-sample testing data. Our results indicated a mean classification accuracy of 97.25% with a standard deviation (SD) of 0.478486 at a fixed split of 70% training, 15% validation, and 15% out-of-sample independent testing, using 15 hidden neurons. Each test configuration was run 51 times.
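The experimental protocol described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used on the commercial platforms: the random shuffling, the seed, the rounding of subset sizes, and the use of the sample (rather than population) standard deviation are all assumptions, and the function names are hypothetical.

```python
import random
import statistics


def split_indices(n_samples, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle record indices and split them into train / validation / test."""
    rng = random.Random(seed)  # fixed seed so a run can be repeated exactly
    indices = list(range(n_samples))
    rng.shuffle(indices)

    n_train = round(n_samples * train_frac)
    n_val = round(n_samples * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains is held out
    return train, val, test


def summarize(accuracies):
    """Mean and sample standard deviation of accuracy over repeated runs."""
    return statistics.mean(accuracies), statistics.stdev(accuracies)


# 699 records, as in the data set described in the text.
train, val, test = split_indices(699)
print(len(train), len(val), len(test))
```

Repeating the split with a different seed for each run, training a fresh network each time, and passing the resulting accuracies to `summarize` would yield a mean and SD of the kind reported here.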
With the classifier run for 1000 epochs at an elapsed time of 16 seconds, the results were very encouraging: on the training (active) confusion matrix, it correctly classified 99.1% of benign cases and 100% of malignant cases, agreeing very closely with gold-standard pathologist scoring. On cross-validation, 98.1% of the benign tumor data was correctly classified. The ANN analysis reported here treats the pathologist as the gold standard; if the pathologist is wrong (or two pathologists disagree), this adds a significant amount of complexity to the problem.
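Per-class percentages of this kind are read off a confusion matrix, which tabulates gold-standard labels against predicted labels. The sketch below shows the calculation on a handful of made-up labels; the labels are purely hypothetical and do not reproduce the study's data, and the function names are assumptions.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, labels=("benign", "malignant")):
    """Nested dict: matrix[true_label][predicted_label] -> number of cases."""
    counts = Counter(zip(y_true, y_pred))  # missing pairs count as 0
    return {t: {p: counts[(t, p)] for p in labels} for t in labels}


def per_class_accuracy(matrix):
    """Percentage of each true class that was predicted correctly."""
    result = {}
    for true_label, row in matrix.items():
        total = sum(row.values())
        if total:  # skip classes with no cases to avoid dividing by zero
            result[true_label] = 100.0 * row[true_label] / total
    return result


# Hypothetical toy labels: the first list plays the role of the pathologist.
pathologist = ["benign", "benign", "benign", "benign", "malignant", "malignant"]
classifier = ["benign", "benign", "benign", "malignant", "malignant", "malignant"]

matrix = confusion_counts(pathologist, classifier)
print(per_class_accuracy(matrix))
```

Because the rows are defined by the pathologist's labels, every percentage inherits any error or disagreement in that gold standard.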
Benefits of Innovative Tools for Pathology
Together, these experiments indicate that digital network classifiers may be useful for enhancing breast cancer grading. In the course of conventional breast cancer diagnostics and treatment, the current approach is highly serial in nature, as well as time consuming. Augmenting or replacing currently used glass slides with a fully automated digital solution with electronic file sharing could be advantageous because it not only makes sharing more rapid and efficient, but also facilitates rapid consults with pathology sub-specialty experts, which in turn can lead to more accurate diagnoses. An additional benefit of this digitization is the simple fact that a digital image can be quantified with consistent reproducibility. With advances in diagnostic testing and technology such as the application of ANN algorithms, pathologists are empowered to deliver an even more rapid and accurate grading and diagnosis of cancer.
Throughout health care practice, the goal is to ensure every patient receives the most rapid and accurate diagnosis possible. Ideally, computers would be used as a complement to pathologists, with each contributing what they do best. The ANN will look for clues in the tissue that are difficult for the pathologist to assess, including accurate and reliable quantification and multivariate decision making. As pathologists are currently readers and interpreters of patient tissues, they can be empowered by novel technologies to become curators of complex data (both human and machine generated), compiling information in order to offer a more comprehensive final diagnosis. Armed with this knowledge, the patient can then take the first step of their treatment journey on the correct path.
Munish Puri, PhD, MSc, is an intern researcher in analytic microscopy core at the Moffitt Cancer Center in Tampa, Florida. He received his PhD in electrical engineering from the University of South Florida (USF). His research includes biomedical imaging, bioengineering, digital pathology, flexible sensors for brain/machine interface, and novel neuroprosthetics.
Srinivas Tipparaju, MPharm, PhD, is an assistant professor of molecular medicine, pharmacology, and physiology at the USF College of Pharmacy. He received his PhD from Jamia Hamdard University in Delhi, India, and performed his postdoctoral research at Emory University.
Wilfrido Moreno, MSc Eng, PhD, received his degrees from USF. He is a professor in the department of electrical engineering at USF and is the director of the Ibero-American Science and Technology Educational Consortium (ISTEC). Dr. Moreno also is a founding member of the Nanotechnology Research & Education Center (NREC).
Marilyn M. Bui, MD, PhD, is an associate member of the anatomic pathology department at the Moffitt Cancer Center, as well as the scientific director of analytic microscopy core, and associate professor at the USF Morsani College of Medicine. She received her MD from the Capital Institute of Medicine in Beijing, China, and her PhD in immunology and molecular pathology from the University of Florida. Her pathology residency and cytopathology fellowship training were from the University of Florida Health Science Center, Jacksonville.
Mark C. Lloyd, MS, is a core staff scientist in the analytic microscopy lab at the Moffitt Cancer Center. He also is a PhD candidate in the ecology and evolution group, biological sciences department at the University of Illinois at Chicago.
Artificial Neural Network (ANN)
An Artificial Neural Network is a computational model inspired by networks of living neurons, wherein the neurons compute values from inputs; the process is actually quite straightforward. All signals can be assigned as either 1 or -1 (the binary case). The neuron calculates a weighted sum of inputs and compares it to a threshold of 0. If the sum is higher than the threshold, the output is set to 1, otherwise to -1.
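The rule just described fits in a few lines of code. This is a minimal sketch of a single binary threshold neuron; the function name and the example weights are illustrative assumptions.

```python
def neuron_output(inputs, weights, threshold=0.0):
    """Binary threshold neuron: +1 if the weighted sum exceeds the threshold, else -1."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return 1 if weighted_sum > threshold else -1


signals = [1, -1, 1]  # every signal is either 1 or -1

# 0.5 - 0.2 + 0.4 = 0.7, which is above the threshold of 0
fires = neuron_output(signals, [0.5, 0.2, 0.4])

# 0.1 - 0.9 + 0.3 = -0.5, which is below the threshold of 0
stays_quiet = neuron_output(signals, [0.1, 0.9, 0.3])

print(fires, stays_quiet)
```

The same inputs produce opposite outputs under different weights, which is why learning in a network amounts to adjusting the weights.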
The power of the neuron results from its collective behavior in a network where all neurons are interconnected. The network starts evolving: neurons continuously evaluate their output by looking at their inputs, calculating the weighted sum, and then comparing to a threshold to decide if they should fire. This is a highly complex parallel process whose features cannot be reduced to phenomena taking place with individual neurons.
One observation is that an evolving ANN eventually reaches a state where all neurons continue working but no further changes in their state occur. A network may have more than one stable state, and these states are determined by the choice of synaptic weights and thresholds for the neurons.
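The settling behavior described above can be demonstrated with a small, fully interconnected network of threshold neurons. The sketch assumes a Hopfield-style arrangement: symmetric weights set by a Hebbian rule from one stored pattern, no self-connections, a threshold of 0, and neurons updated one at a time. Those choices, and the function names, are illustrative assumptions rather than details given in this sidebar.

```python
def hebbian_weights(patterns):
    """Symmetric weights that make each stored +1/-1 pattern a stable state."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for pattern in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no neuron feeds back onto itself
                    weights[i][j] += pattern[i] * pattern[j] / n
    return weights


def evolve(state, weights, max_sweeps=100):
    """Update neurons one at a time until a full sweep changes nothing."""
    state = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            weighted_sum = sum(weights[i][j] * state[j] for j in range(len(state)))
            new_value = 1 if weighted_sum > 0 else -1  # threshold of 0
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:  # stable: neurons keep evaluating, but nothing flips
            break
    return state


stored = [1, -1, 1, -1, 1, -1]
weights = hebbian_weights([stored])

noisy = [1, -1, -1, -1, 1, -1]  # the stored pattern with one signal flipped
settled = evolve(noisy, weights)
print(settled)
```

Starting from the noisy input, the network settles back into the stored pattern. The mirror image of that pattern (every sign reversed) is also stable under the same weights, a small example of a network with more than one stable state.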
Feed-Forward Back-Propagation (FFBP)
The first term, feedforward, describes how the neural network processes and recalls patterns. In a feedforward neural network, neurons are only connected forward. Each layer of the neural network contains connections to the next layer (for example, from the input to the hidden layer), but there are no connections back.
The second term, backpropagation, describes how this type of neural network is trained. Backpropagation is a form of supervised training. When using a supervised training method, the network must be provided with both sample inputs and anticipated outputs. The anticipated outputs are compared against the actual outputs for a given input. Using the anticipated outputs, the backpropagation training algorithm then takes the calculated error and adjusts the weights of the various layers backwards, from the output layer to the input layer, to reduce the value of the error.
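Both terms can be seen at work in a very small network. The sketch below is a generic teaching example, not the configuration used in the experiments described in this article: the sigmoid activation, squared-error measure, learning rate, three hidden neurons, random seed, and the XOR toy problem are all assumptions chosen for brevity, and the class name is hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyNetwork:
    """One hidden layer, one output neuron; each weight row ends with a bias."""

    def __init__(self, n_in, n_hidden, seed=1):
        rng = random.Random(seed)
        self.hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)]
                       for _ in range(n_hidden)]
        self.output = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]

    def forward(self, x):
        # Feedforward: input -> hidden -> output, with no connections back.
        h = [sigmoid(sum(w * v for w, v in zip(row, x)) + row[-1])
             for row in self.hidden]
        y = sigmoid(sum(w * v for w, v in zip(self.output, h)) + self.output[-1])
        return h, y

    def train_step(self, x, target, rate):
        h, y = self.forward(x)
        # Error signal at the output neuron (squared error, sigmoid slope).
        delta_out = (y - target) * y * (1.0 - y)
        # Pass the error backwards to each hidden neuron before any update.
        delta_hidden = [delta_out * self.output[j] * h[j] * (1.0 - h[j])
                        for j in range(len(h))]
        # Adjust output-layer weights, then hidden-layer weights.
        for j in range(len(h)):
            self.output[j] -= rate * delta_out * h[j]
        self.output[-1] -= rate * delta_out
        for j, row in enumerate(self.hidden):
            for i in range(len(x)):
                row[i] -= rate * delta_hidden[j] * x[i]
            row[-1] -= rate * delta_hidden[j]


def mean_squared_error(net, samples):
    return sum((net.forward(x)[1] - t) ** 2 for x, t in samples) / len(samples)


# Sample inputs paired with anticipated outputs (the XOR toy problem).
samples = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]
net = TinyNetwork(n_in=2, n_hidden=3)

error_before = mean_squared_error(net, samples)
for _ in range(5000):  # one pass over all samples per epoch
    for x, t in samples:
        net.train_step(x, t, rate=0.5)
error_after = mean_squared_error(net, samples)
print(error_before, error_after)
```

How far the error falls depends on the starting weights and the learning rate, which is one reason experiments of this kind are repeated many times and reported as a mean with a standard deviation.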