Healthcare is complex. We like simple. So we try to make healthcare simple, which has worked so far but will not work for much longer. The data driving the complexity is reaching an inflection point.
Soon the value of data will diminish, marking the beginning of a paradox that will define healthcare for years: the more data inputs we add, the lower the value of the data outcomes. This is sometimes called the law of diminishing returns.
This may be hard to grasp right now. We naturally believe the more inputs we add to a dataset, the stronger it becomes. Through this belief, we have created massive datasets that predict and correlate anything in healthcare.
We correlate credit scores to patient compliance and the cost of care to the likelihood of long-term follow-up. But eventually the inputs will be of limited value, and possibly even counterproductive. This is because healthcare is now complex.
Complex systems are defined by differences between the component parts and the whole – or more simply, the sum of the parts does not equal the whole and what happens in part of the system does not equate to what happens across the entire system.
We sense this at a certain level already. We know healthcare is different in New York City and in rural Montana. But we use the same datasets and predictive tools to measure healthcare behavior and cost of care.
If we only measure diabetes compliance and cost of care, then it would make sense to use the same metrics in the two regions. But as we add inputs, we inevitably incorporate socioeconomic conditions into the dataset that create different interpretations – and lead to errors.
For example, if we measure the distance traveled for clinical care in New York City compared to rural Montana, we would likely find that people travel farther for care in Montana. But travel distance also depends upon residential density, which varies between urban and rural developments. Folding travel distance into the same dataset as a predictor of patient outcomes, without accounting for development density, will lead to misinterpretations.
The data may show that shorter travel distances in New York City increase the need for telemedicine services – because of higher development density. But it may also show that greater travel distances require more telemedicine services in rural Montana – because of lower development density.
Travel distance, therefore, is not an input that should be integrated into clinical datasets without context. It requires additional inputs. But eventually, the inputs overwhelm the datasets with complexity.
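The travel-distance problem can be sketched with a handful of numbers. The figures below are invented purely for illustration (they are not real utilization data), and the variable names and the `pearson` helper are assumptions made for this sketch. It shows how a relationship that points one way inside New York City and the other way inside rural Montana can produce a pooled result that misdescribes one of the regions.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical figures, invented for illustration only.
# distance = miles traveled for care
# demand   = telemedicine visits per 100 patients
nyc_distance = [1, 2, 3, 4, 5]
nyc_demand = [50, 45, 40, 35, 30]          # demand falls as distance grows
montana_distance = [20, 30, 40, 50, 60]
montana_demand = [10, 15, 20, 25, 30]      # demand rises as distance grows

# Correlation inside each region, then with both regions pooled together.
r_nyc = pearson(nyc_distance, nyc_demand)
r_montana = pearson(montana_distance, montana_demand)
r_pooled = pearson(
    nyc_distance + montana_distance,
    nyc_demand + montana_demand,
)

print(f"NYC only:     {r_nyc:+.2f}")
print(f"Montana only: {r_montana:+.2f}")
print(f"Pooled:       {r_pooled:+.2f}")
```

With these made-up numbers, the pooled correlation comes out negative even though the relationship inside Montana is positive, so a single combined model would point the wrong way for rural patients unless density, or at least region, is carried along as context.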
At that point the datasets produce misleading outcomes and counterintuitive interpretations, including some that are overtly biased against certain ethnicities or demographics. And in the process of applying conclusions from such data, we compound the error.
This is called the ecological fallacy, a common logical error that arises when interpretations are made about individuals based on aggregate data.
It is particularly problematic in healthcare because individual patients vary more widely than broad datasets suggest. These variations are often the result of individual patient decisions made over the course of patient care and not reflected in the patient data, which is more dependent on the outcome of those decisions.
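A small sketch makes the ecological fallacy concrete. The adherence scores and region labels below are hypothetical, invented only to illustrate the point: a difference between group averages says little about any one patient when the spread inside each group is much larger than the gap between them.

```python
from statistics import mean

# Hypothetical adherence scores (0-100), invented for illustration only.
region_a = [30, 55, 70, 90, 95]
region_b = [20, 45, 60, 85, 90]

mean_a = mean(region_a)
mean_b = mean(region_b)

# Group-level claim: region B is less adherent than region A on average.
gap_between_groups = mean_a - mean_b

# Individual-level check: which region B patients beat region A's average?
b_above_a_mean = [score for score in region_b if score > mean_a]

# Spread inside each group, to compare with the gap between groups.
range_a = max(region_a) - min(region_a)
range_b = max(region_b) - min(region_b)

print(f"Gap between group means: {gap_between_groups}")
print(f"Range within region A:   {range_a}")
print(f"Range within region B:   {range_b}")
print(f"Region B patients above region A's mean: {len(b_above_a_mean)} of {len(region_b)}")
```

In this invented example, labeling every region B patient as "less adherent" would be wrong for the patients who score above region A's average, which is the error described above: a conclusion that holds for the group is applied to individuals it does not fit.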
Simple datasets can get away with these errors because the errors are often small and easy to spot. Complex datasets use an overabundance of inputs that produce unintended interpretations, which are then compounded by errors made in applying the data to individual patients.
This is the problem with complexity in healthcare. We try to make it simple when it is anything but that. We have become too reliant on healthcare data, expanding it until it has itself become complex. And complexity changes data in ways we have yet to fully understand.
But we will begin to see the effects sooner than we think.