We are currently in the midst of what some have called the "wild west" of AI. Though healthcare is one of the most heavily regulated sectors, the regulation of AI in this space is still in its infancy. The rules are being written as we speak. We are playing catch-up by learning how to reap the benefits these technologies offer while minimizing any potential harms once they've already been deployed.
AI systems in healthcare exacerbate existing inequities. We've seen this play out into real-world consequences from racial bias in the American justice system and credit scoring, to gender bias in resume screening applications. Programs that are designed to bring machine "objectivity" and ease to our systems end up reproducing and upholding biases with no means of accountability.
The algorithm itself is seldom the problem. It is often the data used to program the technology that merits concern. But this is about far more than ethics and fairness. Building AI tools that take account of the whole picture of healthcare is fundamental to creating solutions that work.
The Algorithm is Only as Good as the Data
By nature of our own human systems, datasets are almost always partial and rarely ever fair. As Linda Nordling comments in a Nature article, A fairer way forward for AI in healthcare, "this revolution hinges on the data that are available for these tools to learn from, and those data mirror the unequal health system we see today."
Take, for example, the finding that Black people in US emergency rooms are 40 percent less likely to receive pain medication than are white people, and Hispanic patients are 25 percent less likely. Now, imagine the dataset these findings are based on is used to train an algorithm for an AI tool that would be used to help nurses determine if they should administer pain relief medication. These racial disparities would be reproduced and the implicit biases that uphold them would remain unquestioned, and worse, become automated.We can attempt to improve these biases by removing the data we believe causes the bias in training, but there will still be hidden patterns that correlate with demographic data. An algorithm cannot take in the nuances of the full picture, it can only learn from patterns in the data it is presented with.
Data bias creeps into healthcare in unexpected ways. Consider the fact that animal models used in laboratories across the world to discover and test new pain medications are almost entirely male. As a result, many medications, including pain medication, are not optimized for females. So, it makes sense that even common pain medications like ibuprofen and naproxen have been proven to be more effective in men than women and that women tend to experience worse side effects from pain medication than men do.
In reality, male rodents aren't perfect test subjects either. Studies have also shown that both female and male rodents' responses to pain levels differ depending on the sex of the human researcher present. The stress response elicited in rodents to the olfactory presence of a sole male researcher is enough to alter their responses to pain.
While this example may seem to be a departure from AI, it is in fact deeply connected — the current treatment choices we have access to were implicitly biased before the treatments ever made it to clinical trials. The challenge of AI equity is not a purely technical problem, but a very human one that begins with the choices that we make as scientists.
Unequal Data Leads to Unequal Benefits
In order for all of society to enjoy the many benefits that AI systems can bring to healthcare, all of society must be equally represented in the data used to train these systems. While this may sound straightforward, it's a tall order to fill.
Data from some populations don't always make it into training datasets. This can happen for a number of reasons. Some data may not be as accessible or it may not even be collected at all due to existing systemic challenges, such as a lack of access to digital technology or simply being deemed unimportant. Predictive models are created by categorizing data in a meaningful way. But because there's generally less of it, "minority" data tends to be an outlier in datasets and is often wiped out as spurious in order to create a cleaner model.
Data source matters because this detail unquestionably affects the outcome and interpretation of healthcare models. In sub-Saharan Africa, young women are diagnosed with breast cancer at a significantly higher rate. This reveals the need for AI tools and healthcare models tailored to this demographic group, as opposed to AI tools used to detect breast cancer that are only trained on mammograms from the Global North. Likewise, a growing body of work suggests that algorithms used to detect skin cancer tend to be less accurate for Black patients because they are trained mostly on images of light-skinned patients. The list goes on.
We are creating tools and systems that have the potential to revolutionize the healthcare sector, but the benefits of these developments will only reach those represented in the data.
So, what can be done?
Part of the challenge in getting bias out of data is that high volume, diverse and representative datasets are not easy to access. Training datasets that are publicly available tend to be extremely narrow, low-volume, and homogenous—they only capture a partial picture of society. At the same time, a wealth of diverse health data is captured every day in many healthcare settings, but data privacy laws make accessing these more voluminous and diverse datasets difficult.
Data protection is of course vital. Big Tech and governments do not have the best track record when it comes to the responsible use of data. However, if transparency, education, and consent for the sharing of medical data was more purposefully regulated, far more diverse and high-volume data sets could contribute to fairer representation across AI systems and result in better, more accurate results for AI-driven healthcare tools.
But data sharing and access is not a complete fix to healthcare's AI problem. Better and personalized healthcare through AI is still a hugely challenging problem that will take an army of scientists and engineers. At the end of the day, we want to teach our algorithms to make good choices but we are still figuring out what good choices should look like for ourselves.
AI presents the opportunity to bring greater personalization to healthcare, but it equally presents the risk of entrenching existing inequalities. We have the opportunity in front of us to take a considered approach to data collection, regulation, and use that will provide a fuller and fairer picture and enable the next steps for AI in healthcare.
Angela Wilkins is the executive director of the Ken Kennedy Institute at Rice University.