Reading time: 3 minutes
Can robots do the work of doctors? Some aspects of medicine may be better left to technology; one example is the reading of mammograms to diagnose breast cancer. Thus far in the history of Oncobites, we have examined various aspects of diagnostics, such as molecular vibrations, gold nanoparticles, biomarkers, and at-home cancer tests. Zooming in on a small sliver of radiographic cancer diagnostics, we will look at how artificial intelligence compares with trained physicians in diagnosing breast cancer, based on a study by McKinney et al. from Google Health.
In most cancers, including breast cancer, early detection is vital to improving outcomes for patients. Women in the US and UK typically begin getting mammograms between the ages of 40 and 50, in the hope of catching any signs of cancer early, when treatment outcomes and prognosis are better. Mammograms are one of several radiology tools designed to collect information from within our bodies. They work by sending small bursts of X-rays through the breast and recording how much passes through, which reflects the density of the soft tissue. However, collecting the image is not sufficient. We need reliable reading and interpretation of these images in order to catch cancer early. An entire field of medicine called radiology is devoted to reading these images.
The scientists at Google Health built an artificial intelligence (AI) system and compared it to trained radiologists in reading mammograms. Their goal was to reduce the high rates of false positives (when it looks like you have cancer, but you really don’t) and false negatives (when it looks like you don’t have any masses, but you really do). Using large datasets from both the UK and the US, they showed that their AI system reduced both false positives and false negatives in each country. They also compared the system to six radiologists who read the same mammograms and found that it did better than all of the human readers. They evaluated this using a metric called the “Area under the receiver operating characteristic curve” or AUC-ROC, which measures how well a test separates people who have a disease from people who do not. A perfect test would have a score of 1, whereas a test that generates completely random results would have a score of 0.5. The AUC-ROC when the AI system read the mammograms was greater than the AUC-ROC for radiologists reading the scans by almost 12%. In the UK, radiologists usually work in pairs, meaning two people read each scan to make sure the diagnosis is correct. When the researchers used the AI system in this double reading process, they found that performance was maintained and the second reader’s work was reduced by 88%.
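To make these metrics concrete, here is a minimal Python sketch of how a false positive rate, a false negative rate, and an AUC-ROC can be computed. It is only an illustration and not the method used in the study: the labels, scores, 0.5 cutoff, and the function names `confusion_rates` and `auc_roc` are all made up for this example. It relies on one standard way of reading the AUC-ROC, namely the chance that a randomly chosen cancer case receives a higher score than a randomly chosen cancer-free case.

```python
def confusion_rates(labels, calls):
    """Return (false_positive_rate, false_negative_rate) for yes/no calls."""
    fp = sum(1 for y, c in zip(labels, calls) if y == 0 and c == 1)
    fn = sum(1 for y, c in zip(labels, calls) if y == 1 and c == 0)
    negatives = labels.count(0)  # people without cancer
    positives = labels.count(1)  # people with cancer
    return fp / negatives, fn / positives


def auc_roc(labels, scores):
    """AUC-ROC as the fraction of (cancer, no-cancer) pairs in which the
    cancer case gets the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = cancer confirmed, 0 = no cancer.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
# Hypothetical suspicion scores from a reader or model (higher = more suspicious).
scores = [0.9, 0.8, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1]
# Assumed cutoff: call a scan "cancer" when the score is at least 0.5.
calls = [1 if s >= 0.5 else 0 for s in scores]

fpr, fnr = confusion_rates(labels, calls)
print(f"false positive rate: {fpr:.2f}")
print(f"false negative rate: {fnr:.2f}")
print(f"AUC-ROC: {auc_roc(labels, scores):.2f}")
```

On this toy data, one of the five cancer-free scans is flagged (a false positive rate of 0.20), one of the three cancers is missed (a false negative rate of about 0.33), and the cancer case outscores the cancer-free case in 13 of the 15 possible pairs, giving an AUC-ROC of about 0.87. Unlike the two error rates, the AUC-ROC does not depend on the cutoff chosen.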
Increasing our dependence on technology has always been a sensitive and scary step, especially in a medical field like oncology, where patients want to be sure no mistakes are made. Studies are emerging to show, however, that relying on technology for these reads is not a shortcut or a way to make life easier for our health care providers; instead, it may be the smartest direction for the field to take in order to provide the safest and most thorough care possible.
Edited by Daniel Zhong
McKinney, S.M., Sieniek, M., Godbole, V. et al. International evaluation of an AI system for breast cancer screening. Nature 577, 89–94 (2020). https://doi.org/10.1038/s41586-019-1799-6