In the analysis of images for detecting potential pathology, artificial intelligence (AI) is showing enormous promise across multiple fields of medicine. But the technology in dermatology is bound to fail in skin of color if training does not specifically address these skin types, according to Adewole S. Adamson, MD, who outlined this issue at the American Academy of Dermatology Virtual Meeting Experience.
“Machine learning algorithms are only as good as the inputs through which they learn. Without representation from individuals with skin of color, we are at risk of creating a new source of racial disparity in patient care,” Adamson, assistant professor in the division of dermatology, department of internal medicine, University of Texas at Austin, said at the meeting.
Diagnostic algorithms using AI are typically based on deep learning, a subset of machine learning that depends on artificial neural networks. In the case of image processing, neural networks can “learn” to recognize objects, faces, or, in the realm of health care, disease, from exposure to multiple images.
Many variables affect the accuracy of deep learning for diagnostic algorithms, including the depth of the layering through which the process distills multiple inputs of information, but the number of inputs is critical. In the case of skin lesions, machines cannot learn to recognize features of different skin types without exposure to them.
“There are studies demonstrating that dermatologists can be outperformed for detection of skin cancers by AI, so this is going to be an increasingly powerful tool,” Adamson said. The problem is that “there has been very little representation in darker skin types” in the algorithms developed so far.
The risk is that AI will exacerbate an existing problem. Skin cancer in darker skin is less common but already underdiagnosed, independent of AI. Per 100,000 males in the United States, the rate of melanoma is about 30-fold greater in White men than in Black men (33.0 vs. 1.0). Among females, the racial difference is smaller but still enormous (20.2 vs. 1.2 per 100,000 females), according to U.S. data.
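As a quick check on the "fold" comparisons above, the rate ratios can be recomputed from the quoted incidence figures. This is only an arithmetic sketch; the dictionary layout and the `rate_ratio` helper are illustrative names, not part of any cited analysis.

```python
# Melanoma incidence per 100,000, exactly as quoted in the paragraph above.
rates = {
    "male": {"White": 33.0, "Black": 1.0},
    "female": {"White": 20.2, "Black": 1.2},
}

def rate_ratio(white_rate, black_rate):
    """Ratio of the White incidence rate to the Black incidence rate."""
    return white_rate / black_rate

ratios = {sex: rate_ratio(r["White"], r["Black"]) for sex, r in rates.items()}

for sex, ratio in ratios.items():
    print(f"{sex}: {ratio:.1f}-fold")
```

On these figures the ratio is 33-fold among males and roughly 17-fold among females, which is the sense in which the difference among females is smaller but still large.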
As for the low representation of darker skin in AI studies so far, “one of the arguments is that skin cancer is not a big deal in darker skin types,” Adamson said.
It might be the other way around. The relative infrequency with which skin cancer occurs in the Black population in the United States might explain a low level of suspicion and, ultimately, delays in diagnosis, which in turn lead to worse outcomes. According to one analysis drawn from the Surveillance, Epidemiology, and End Results (SEER) database (1998-2011), the proportion of patients with regionally advanced or distant disease was nearly twice as great (11.6% vs. 6.0%; P < .05) in Black patients, relative to White patients.
Not surprisingly, given the importance of early diagnosis of cancers overall and skin cancer specifically, mean survival in Black patients with nodular melanoma was almost 4 years shorter than in White patients (10.8 vs. 14.6 years; P < .001), the same study found.
In humans, bias is reasonably attributed in many cases to judgments made on a small sample size. The problem in AI is analogous. Adamson, who has published research on the potential for machine learning to contribute to health care disparities in dermatology, cited work done by Joy Buolamwini, a graduate researcher in the Media Lab at the Massachusetts Institute of Technology. In one study she conducted, the rate of AI facial recognition failure was 1% in White males, 7% in White females, 12% in skin-of-color males, and 35% in skin-of-color females. Fewer skin-of-color inputs are the likely explanation, Adamson said.
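One way to surface a gap like this is to report error rates per subgroup rather than as a single aggregate figure. The sketch below is a minimal illustration, not the method used in the cited study: the `error_rate_by_group` helper is a hypothetical name, and the toy records assume 100 cases per group (an assumption made purely for illustration) with error counts chosen to mirror the percentages quoted above.

```python
from collections import defaultdict

def error_rate_by_group(records):
    """Return {group: fraction of records whose prediction was wrong}.

    `records` is an iterable of (group, is_correct) pairs.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, is_correct in records:
        totals[group] += 1
        if not is_correct:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}

# Toy data, NOT real study data: 100 cases per group (assumed), with
# error counts set to match the failure rates quoted in the text.
assumed_errors_per_100 = {
    "White male": 1,
    "White female": 7,
    "skin-of-color male": 12,
    "skin-of-color female": 35,
}
records = []
for group, n_errors in assumed_errors_per_100.items():
    records += [(group, False)] * n_errors + [(group, True)] * (100 - n_errors)

per_group = error_rate_by_group(records)
overall = sum(1 for _, ok in records if not ok) / len(records)

for group, rate in per_group.items():
    print(f"{group}: {rate:.0%}")
print(f"overall: {overall:.2%}")
```

Under these assumed equal group sizes, the aggregate error rate comes to 13.75%, which masks both the 1% rate in the best-served group and the 35% rate in the worst-served one. With unequal group sizes, as in a data set where darker skin is underrepresented, the aggregate would sit even closer to the majority group's rate.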
The potential for racial bias from AI in the diagnosis of disease increases and becomes more complex when inputs beyond imaging, such as past medical history, are included. Adamson warned of the potential for “bias to creep in” when there is failure to account for societal, cultural, or other differences that distinguish one patient group from another. However, for skin cancer or other diseases diagnosed from images alone, he said there are solutions.
“We are in the early days, and there is time to change this,” Adamson said, referring to the low representation of skin of color in AI training sets. In addition to including more skin types to train recognition, creating AI algorithms specifically for dark skin is another potential approach.
However, his key point was the importance of recognizing the need for solutions.
“AI is the future, but we must apply the same rigor to AI as to other medical interventions to ensure that the technology is not applied in a biased fashion,” he said.
Susan M. Swetter, MD, professor of dermatology and director of the pigmented lesion and melanoma program at Stanford (Calif.) University Medical Center and Cancer Institute, agreed. As someone who has been following the progress of AI in the diagnosis of skin cancer, Swetter recognizes the potential for this technology to increase diagnostic efficiency and accuracy, but she also called for studies specific to skin of color.
The algorithms “have not yet been adequately evaluated in people of color, particularly Black patients in whom dermoscopic criteria for benign versus malignant melanocytic neoplasms differ from those with lighter skin types,” Swetter said in an interview.
She sees the same fix as that proposed by Adamson.
“Efforts to include skin of color in AI algorithms for validation and further training are needed to prevent potential harms of over- or underdiagnosis in darker skin patients,” she pointed out.
Adamson reports no potential conflicts of interest relevant to this topic. Swetter had no relevant disclosures.
This article originally appeared on MDedge.com, part of the Medscape Professional Network.