Conversations about quantifying the robustness of inferences.


Welcome to our discussion about sensitivity analysis. The assumptions of statistical analysis rarely all hold, so the challenge for the pragmatist is to understand when evidence is strong enough to support action. That’s where sensitivity analysis comes in: it helps us understand how robust our inferences are to challenges to our assumptions. One example is a statement such as “XX% of the estimated effect would have to be due to bias to change your inference about the effect.”

Why is robustness important for emerging COVID-19 studies?

As those who work in public policy and health, we seek to help a broad range of people, with a broad range of statistical backgrounds, interpret uncertainty about public health findings regarding COVID-19. We observe that currently, there is little common language for expressing uncertainty. Consider Anthony Fauci’s quote (as in Healio on April 29): “The trial, which began Feb. 21 this year, compared remdesivir with placebo in more than \(1,000\) patients.

First example: Pneumonia finding from initial hydroxychloroquine RCT only requires 1 ‘switch’ to invalidate

The first report of a randomized trial regarding hydroxychloroquine (HCQ) came from a study conducted at the Renmin Hospital of Wuhan University. The table below shows the association between treatment (HCQ vs conventional) and condition (improved vs exacerbated/unchanged). For Table 1, \(\chi^2 = 4.7\), \(p = 0.03\), and the authors concluded that HCQ was efficacious.

Table 1. Association between hydroxychloroquine (HCQ) vs Conventional Treatments and Pneumonia on Chest CT

To quantify the robustness of the inference, we calculate the number of treatment cases that would need to be switched from “improved” to “exacerbated or unchanged” to change the inference, a quantity we refer to as the Robustness of the Inference to Switches (RIS).
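The RIS idea can be sketched in a few lines of Python. This is a minimal illustration, not necessarily the procedure used for the published figures. The cell counts are an assumption: the table is not reproduced in this text, so the values below (25 of 31 improved under HCQ, 17 of 31 under conventional treatment) are illustrative counts chosen to be consistent with the reported \(\chi^2\) of about 4.7 and \(p\) of about 0.03. The function names, the uncorrected Pearson chi-square test, and the 0.05 threshold are also assumptions.

```python
import math


def chi2_test(a, b, c, d):
    """Pearson chi-square (no continuity correction) for a 2x2 table.

    Rows are treatment (a, b) and control (c, d).
    Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0, 1.0  # degenerate table: no evidence of association
    chi2 = n * (a * d - b * c) ** 2 / denom
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2))


def ris(improved_t, other_t, improved_c, other_c, alpha=0.05):
    """Count treatment cases switched from 'improved' to 'not improved'
    until the test is no longer significant at alpha."""
    for switches in range(improved_t + 1):
        _, p = chi2_test(improved_t - switches, other_t + switches,
                         improved_c, other_c)
        if p >= alpha:
            return switches
    return None  # never loses significance within the available cases


# Assumed, illustrative counts (see the note above).
chi2, p = chi2_test(25, 6, 17, 14)
n_switches = ris(25, 6, 17, 14)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}, RIS = {n_switches}")
```

With these assumed counts the sketch gives an RIS of 1: moving a single HCQ patient from “improved” to “exacerbated or unchanged” raises the p-value above 0.05.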

From early RCT: effect of Remdesivir on mortality could go either way

Consider the recent randomized, double-blind, placebo-controlled trial of Remdesivir for patients with severe COVID-19. The study found no discernible difference in mortality: \(22\) of \(158\) (\(14\%\)) Remdesivir patients died within \(28\) days, while \(10\) of \(78\) (\(13\%\)) in the placebo group died (Table 1). How different would the results of the current study have to be to change the statistical inference about Remdesivir?

Table 1. Comparison of Remdesivir vs Placebo Control on Mortality.
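One way to make the question concrete is to count how many Remdesivir patients would have to switch outcome before the comparison becomes statistically significant, in either direction. The Python sketch below uses the mortality counts quoted above (22 of 158 vs 10 of 78). The function names, the uncorrected Pearson chi-square test, and the 0.05 threshold are assumptions made for illustration and may differ from the method behind the published analysis.

```python
import math


def chi2_test(a, b, c, d):
    """Pearson chi-square (no continuity correction) for a 2x2 table.

    Rows are treatment (a, b) and control (c, d).
    Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0, 1.0  # degenerate table: no evidence of association
    chi2 = n * (a * d - b * c) ** 2 / denom
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2))


def switches_to_significance(died_t, survived_t, died_c, survived_c,
                             direction, alpha=0.05):
    """Smallest number of treatment-group switches giving p < alpha.

    direction = -1 moves treatment patients from 'died' to 'survived';
    direction = +1 moves them from 'survived' to 'died'.
    """
    limit = died_t if direction < 0 else survived_t
    for k in range(limit + 1):
        _, p = chi2_test(died_t + direction * k, survived_t - direction * k,
                         died_c, survived_c)
        if p < alpha:
            return k
    return None  # significance is never reached


# Counts from the text: 22 of 158 Remdesivir deaths, 10 of 78 placebo deaths.
chi2, p = chi2_test(22, 136, 10, 68)
fewer_deaths = switches_to_significance(22, 136, 10, 68, direction=-1)
more_deaths = switches_to_significance(22, 136, 10, 68, direction=+1)
print(f"observed p = {p:.2f}")
print(f"switches toward survival needed: {fewer_deaths}")
print(f"switches toward death needed: {more_deaths}")
```

Under these assumptions the search returns 14 switches toward survival or 16 toward death, which illustrates why the mortality result could go either way as more data arrive.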

From early RCT: robust inference of Remdesivir effect on time to clinical improvement

Now consider the previous example of Remdesivir, but this time we are interested in the continuous outcome variable of time to clinical improvement.

Figure 1. Time to clinical improvement in the Remdesivir example. Data from Table 3, Wang et al (2020).

We can interpret the robustness of inference regarding Remdesivir’s effect on time to clinical improvement in the following ways.

Percent bias necessary to invalidate the inference

To invalidate an inference, \(70.

What do we do as more studies come out?

Online dashboards, such as the COVID-19 clinical trials registry, provide near real-time tracking and categorization of findings accumulating across emerging research. As a result, one can update the robustness of the cumulative findings for a given COVID-19 treatment as each new study appears. We illustrate using a historical non-COVID example: the study-by-study accumulation of \(16\) estimated effects presented in a meta-analysis of randomized trials examining the impact of hypertension treatments on the probability of suffering a stroke (data from Collins et al.

A 15-min talk

I summarized our current thinking on how robustness analysis can be applied to emerging COVID research in a brief presentation at an online conference entitled “COVID-19 and Public Policy and Management.” The conference was hosted by the Center on Technology, Data, and Society at Arizona State University, and a recording of the 15-minute talk, entitled “Communicating the Robustness of Inferences as COVID-19 Evidence Accumulates”, is available here.