Hydroxychloroquine and COVID-19: Meta-analysis of 197 Studies (February 2021)
- Hydroxychloroquine (HCQ) is not effective when used very late with high dosages over a long period (RECOVERY/SOLIDARITY); effectiveness improves with earlier usage and improved dosing.
- Early treatment consistently shows positive effects.
- Negative evaluations typically ignore treatment time, often focusing on a subset of late-stage studies.
We analyze all significant studies concerning the use of HCQ (or CQ) for COVID-19. Search methods, inclusion criteria, effect extraction criteria (more serious outcomes have priority), all individual study data, PRISMA answers, and statistical methods are detailed in Appendix 1. We present random-effects meta-analysis results for all studies, for studies within each treatment stage, for mortality results only, after exclusion of studies with critical bias, and for Randomized Controlled Trials (RCTs) only. Typical meta-analyses involve subjective selection criteria and bias evaluation, so interpreting them requires an understanding of the criteria and of the accuracy of the evaluations. However, the volume of studies presents an opportunity for an additional simple and transparent analysis aimed at detecting efficacy.
If treatment was not effective, the observed effects would be randomly distributed (or more likely to be negative if treatment is harmful). We can compute the probability that the observed percentage of positive results (or higher) could occur due to chance with an ineffective treatment (the probability of >= k heads in n coin tosses, or the one-sided sign test / binomial test). Analysis of publication bias is important and adjustments may be needed if there is a bias toward publishing positive results. For HCQ, we find evidence of a bias toward publishing negative results.
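The coin-toss calculation described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not necessarily the authors' own code; the function name `sign_test_p` is hypothetical. The example counts (153 positive results among 197 studies) are taken from figures quoted elsewhere in this text, and the result need not reproduce the reported probability exactly if the authors' counts or handling of ties differ.

```python
from math import comb


def sign_test_p(k: int, n: int) -> float:
    """One-sided sign test: probability of at least k positive results
    in n studies if each study were equally likely to be positive or
    negative (a fair coin toss)."""
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    # Count the outcomes with k, k+1, ..., n heads out of 2**n equally
    # likely sequences. Integer arithmetic keeps the tail sum exact.
    tail = sum(comb(n, i) for i in range(k, n + 1))
    return tail / 2 ** n


# Counts quoted in the text: 153 positive results out of 197 studies.
print(sign_test_p(153, 197))
```

Exact integer arithmetic is used for the tail sum because the probabilities involved are far smaller than typical floating-point rounding error in a naive summation.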
Figure 2 shows stages of possible treatment for COVID-19. Pre-Exposure Prophylaxis (PrEP) refers to regularly taking medication before being infected, in order to prevent or minimize infection. In Post-Exposure Prophylaxis (PEP), medication is taken after exposure but before symptoms appear. Early Treatment refers to treatment immediately or soon after symptoms appear, while Late Treatment refers to more delayed treatment.
Early treatment. 100% of early treatment studies report a positive effect, with an estimated reduction of 66% in the effect measured (death, hospitalization, etc.) from the random effects meta-analysis, RR 0.34 [0.27-0.44].
Late treatment. Late treatment studies are mixed, with 73% showing positive effects, and an estimated reduction of 25% in the random effects meta-analysis. Negative studies mostly fall into the following categories: they show evidence of significant unadjusted confounding, including confounding by indication; usage is extremely late; or they use an excessively high dosage.
Pre-Exposure Prophylaxis. 77% of PrEP studies show positive effects, with an estimated reduction of 36% in the random effects meta-analysis. Negative studies are all studies of systemic autoimmune disease patients which either do not adjust for the different baseline risk of these patients at all, or do not adjust for the highly variable risk within these patients.
Post-Exposure Prophylaxis. 83% of PEP studies report positive effects, with an estimated reduction of 33% in the random effects meta-analysis.
[Table: Results by treatment stage]
Randomized Controlled Trials (RCTs)
Randomized Controlled Trials (RCTs) minimize one source of bias and can provide a higher level of evidence. Results restricted to RCTs are shown in Figure 7, Figure 8, and Table 2. Even with the small number of RCTs to date, they confirm efficacy for early treatment. Prophylaxis and early treatment studies show 29% improvement in random effects meta-analysis, RR 0.71 [0.54‑0.94], p = 0.015. Early treatment RCTs show 49% improvement, RR 0.51 [0.30‑0.88], p = 0.015.
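A reader can roughly cross-check a reported p-value against its relative risk and 95% confidence interval. The sketch below assumes a normal approximation on the log scale, which is a standard textbook relationship and not necessarily the exact procedure used for the figures above; the function name `p_from_rr_ci` is hypothetical.

```python
from math import erfc, log, sqrt


def p_from_rr_ci(rr: float, lower: float, upper: float,
                 z_crit: float = 1.959964) -> tuple[float, float]:
    """Approximate z statistic and two-sided p-value from a relative
    risk and its confidence interval (95% by default), assuming
    log(RR) is normally distributed."""
    # The CI spans 2 * z_crit standard errors on the log scale.
    se = (log(upper) - log(lower)) / (2 * z_crit)
    z = log(rr) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)).
    p = erfc(abs(z) / sqrt(2))
    return z, p


# Values quoted above for prophylaxis and early treatment RCTs.
z, p = p_from_rr_ci(0.71, 0.54, 0.94)
print(f"z = {z:.2f}, p = {p:.3f}")
```

Because published intervals are rounded to two decimals, the recovered p-value is only expected to agree approximately with the reported one.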
Evidence supports incorporating non-RCT studies. [Concato] found that well-designed observational studies do not systematically overestimate the magnitude of the effects of treatment compared to RCTs. [Anglemyer] summarized reviews comparing RCTs to observational studies and found little evidence for significant differences in effect estimates. [Lee] showed that only 14% of the guidelines of the Infectious Diseases Society of America were based on RCTs. Limitations in an RCT can easily outweigh the benefits; for example, excessive dosages, excessive treatment delays, or Internet survey bias could have a greater effect on results than the lack of randomization. Ethical issues may prevent running RCTs for known effective treatments. For more on the problems with RCTs see [Deaton, Nichol].
Publication bias. Publishing is often biased towards positive results, which we would need to adjust for when analyzing the percentage of positive results. Studies that require less effort are considered to be more susceptible to publication bias. Prospective trials that involve significant effort are likely to be published regardless of the result, while retrospective studies are more likely to exhibit bias. For example, researchers may perform preliminary analysis with minimal effort and the results may influence their decision to continue. Retrospective studies also provide more opportunities for the specifics of data extraction and adjustments to influence results.
For HCQ, 87.8% of prospective studies report positive effects, compared to 75.0% of retrospective studies, indicating a bias toward publishing negative results. Figure 10 shows a scatter plot of results for prospective and retrospective studies.
Figure 11 shows the results by region of the world, for all regions that have > 5 studies. Studies from North America are 3.8 times more likely to report negative results than studies from the rest of the world combined, 52.3% vs. 13.7%, two-tailed z test -5.36, p = 0.00000008. [Berry] performed an independent analysis which also showed bias toward negative results for US-based research.
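A comparison of two percentages like the one above is typically done with a pooled two-proportion z test. The following is a minimal sketch of that test under the usual normal approximation. The underlying study counts are not given in this text, so the counts in the example are purely hypothetical and the function name `two_proportion_z` is an assumption for illustration.

```python
from math import erfc, sqrt


def two_proportion_z(x1: int, n1: int, x2: int, n2: int) -> tuple[float, float]:
    """Pooled two-proportion z test.

    x1/n1 and x2/n2 are the event counts and group sizes. Returns the
    z statistic (sign follows p1 - p2) and the two-tailed p-value."""
    p1, p2 = x1 / n1, x2 / n2
    # Under the null hypothesis both groups share one proportion.
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


# Hypothetical counts for illustration only (not the article's data).
z, p_value = two_proportion_z(50, 100, 30, 100)
print(f"z = {z:.2f}, two-tailed p = {p_value:.4f}")
```

The sign of z depends only on which group is listed first, so a negative statistic such as the one quoted above reflects the ordering of the groups and not the direction of any bias.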
The lack of bias towards positive results is not very surprising. Both negative and positive results are very important given the current use of HCQ for COVID-19 around the world, evidence of which can be found in the studies analyzed here, government protocols, and news reports, for example [AFP, AfricaFeeds, Africanews, Afrik.com, Al Arabia, Al-bab, Anadolu Agency, Anadolu Agency (B), Archyde, Barron's, Barron's (B), BBC, Belayneh, A., Bianet, CBS News, Challenge, Dr. Goldin, Efecto Cocuyo, Expats.cz, Face 2 Face Africa, Filipova, France 24, France 24 (B), Franceinfo, Global Times, Government of China, Government of India, Government of Venezuela, GulfInsider, Le Nouvel Afrik, LifeSiteNews, Medical World Nigeria, Medical Xpress, Medical Xpress (B), Middle East Eye, Ministerstva Zdravotnictví, Ministry of Health of Ukraine, Ministry of Health of Ukraine (B), Morocco World News, Mosaique Guinee, Nigeria News World, NPR News, Oneindia, Pan African Medical Journal, Parola, Pilot News, PledgeTimes, Pleno.News, Q Costa Rica, Rathi, Russian Government, Russian Government (B), Teller Report, The Africa Report, The Australian, The BL, The East African, The Guardian, The Indian Express, The Moscow Times, The North Africa Post, The Tico Times, Ukrinform, Vanguard, Voice of America].
We also note a bias towards publishing negative results by certain journals and press organizations, with scientists reporting difficulty publishing positive results [Boulware, Meneguesso]. Although 153 studies show positive results, The New York Times, for example, has only written articles for studies that claim HCQ is not effective [The New York Times, The New York Times (B), The New York Times (C)]. As of September 10, 2020, The New York Times still claims that there is clear evidence that HCQ is not effective for COVID-19 [The New York Times (D)]. As of October 9, 2020, the United States National Institutes of Health recommends against HCQ for both hospitalized and non-hospitalized patients [United States National Institutes of Health].
Treatment details. We focus here on the question of whether HCQ is effective or not for COVID-19. Studies vary significantly in terms of treatment delay, treatment regimen, patient characteristics, and (for the pooled effects analysis) outcomes, as reflected in the high degree of heterogeneity. However, early treatment consistently shows benefits. 100% of early treatment studies report a positive effect, with an estimated reduction of 66% in the effect measured (death, hospitalization, etc.) in the random effects meta-analysis, RR 0.34 [0.27-0.44].
HCQ is an effective treatment for COVID-19. The probability that an ineffective treatment generated results as positive as the 197 studies to date is estimated to be 1 in 768 trillion (p = 0.0000000000000013).
100% of early treatment studies report a positive effect, with an estimated reduction of 66% in the effect measured (death, hospitalization, etc.) using a random effects meta-analysis, RR 0.34 [0.27-0.44].
We performed ongoing searches of PubMed, medRxiv, ClinicalTrials.gov, The Cochrane Library, Google Scholar, Collabovid, Research Square, ScienceDirect, Oxford University Press, the reference lists of other studies and meta-analyses, and submissions to the site c19study.com, which regularly receives submissions of both positive and negative studies upon publication. Search terms were hydroxychloroquine or chloroquine and COVID-19 or SARS-CoV-2, or simply hydroxychloroquine or chloroquine. Automated searches are performed every hour with notifications of new matches. All studies regarding the use of HCQ or CQ for COVID-19 that report an effect compared to a control group are included in the main analysis. This is a living analysis and is updated regularly.
We extracted effect sizes and associated data from all studies. If a study reports multiple kinds of effects, the most serious outcome is used in calculations for that study. For example, if effects for mortality and cases are both reported, the effect for mortality is used; this may differ from the effect that the study focused on. If symptomatic results are reported at multiple times, we used the latest time; for example, if mortality results are provided at 14 days and 28 days, the results at 28 days are used. Mortality alone is preferred over combined outcomes. Outcomes with zero events in both arms were not used. Clinical outcome is considered more important than PCR testing status. For PCR results reported at multiple times, where a majority of patients recover in both groups, preference is given to results mid-recovery (after most or all patients have recovered there is no room for an effective treatment to do better). When results provide an odds ratio, we computed the relative risk when possible, or converted to a relative risk according to [Zhang]. Reported confidence intervals and p-values were used when available, using adjusted values when provided. If multiple types of adjustments are reported including propensity score matching (PSM), the PSM results are used. When needed, conversion between reported p-values and confidence intervals followed [Altman, Altman (B)], and Fisher's exact test was used to calculate p-values for event data. If continuity correction for zero values is required, we use the reciprocal of the opposite arm with the sum of the correction factors equal to 1 [Sweeting]. If a study separates HCQ and HCQ+AZ, we use the combined results where possible, or the results for the larger group. Results are all expressed with RR < 1.0 suggesting effectiveness. Most results are the relative risk of something negative.
If a study reports relative times, the results are expressed as the ratio of the time for the HCQ group versus the time for the control group. If a study reports the rate of reduction of viral load, the results are based on the percentage change in the rate. Calculations are done in Python (3.9.1) with scipy (1.5.4), pythonmeta (1.11), numpy (1.19.4), statsmodels (0.12.1), and plotly (4.14.1).
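Two of the extraction steps described above lend themselves to a short illustration: converting an odds ratio to a relative risk, and choosing continuity correction factors for zero-event arms. The sketch below assumes the commonly cited conversion RR = OR / (1 - P0 + P0 * OR), where P0 is the event rate in the control group, and reads the continuity correction as factors proportional to the reciprocal of the opposite arm's size that sum to 1. Both function names are hypothetical and this is one interpretation of the text, not the authors' code.

```python
def odds_ratio_to_relative_risk(odds_ratio: float, control_risk: float) -> float:
    """Convert an odds ratio to a relative risk, given the event rate
    in the control group (assumed formula: OR / (1 - P0 + P0 * OR))."""
    if not 0.0 <= control_risk <= 1.0:
        raise ValueError("control_risk must be between 0 and 1")
    return odds_ratio / (1.0 - control_risk + control_risk * odds_ratio)


def continuity_corrections(n_treatment: int, n_control: int) -> tuple[float, float]:
    """Correction factors (treatment, control), each proportional to
    the reciprocal of the opposite arm's size and summing to 1."""
    inv_control = 1.0 / n_control      # drives the treatment-arm factor
    inv_treatment = 1.0 / n_treatment  # drives the control-arm factor
    total = inv_control + inv_treatment
    return inv_control / total, inv_treatment / total


# With a rare outcome the odds ratio and relative risk nearly coincide;
# with a common outcome the odds ratio overstates the relative risk.
print(odds_ratio_to_relative_risk(2.0, 0.01))
print(odds_ratio_to_relative_risk(2.0, 0.50))
print(continuity_corrections(100, 300))
```

With balanced arms both correction factors reduce to the familiar 0.5.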
The forest plots are computed using PythonMeta [Deng] with the DerSimonian and Laird random effects model (the fixed effect assumption is not plausible in this case).
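For readers unfamiliar with the DerSimonian and Laird model, the following is a minimal standard-library sketch of the pooling step on log relative risks. It does not use or reproduce the PythonMeta API; the function names and the example studies are hypothetical, and standard errors are recovered from 95% confidence intervals under a normal approximation.

```python
from math import exp, log, sqrt


def dersimonian_laird(effects: list[float], variances: list[float]):
    """Pool effect sizes with the DerSimonian-Laird random effects
    model. Returns (pooled effect, standard error, tau squared)."""
    k = len(effects)
    if k == 0 or k != len(variances):
        raise ValueError("need equally sized, non-empty inputs")
    # Fixed-effect (inverse variance) weights and estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q and the method-of-moments between-study variance.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0
    # Random-effects weights add tau squared to each study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_ws = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum_ws
    return pooled, sqrt(1.0 / sum_ws), tau2


def pool_relative_risks(studies: list[tuple[float, float, float]]):
    """studies: (RR, CI lower, CI upper) with 95% intervals.
    Returns (pooled RR, CI lower, CI upper, tau squared)."""
    z = 1.959964
    y = [log(rr) for rr, _, _ in studies]
    v = [((log(hi) - log(lo)) / (2 * z)) ** 2 for _, lo, hi in studies]
    pooled, se, tau2 = dersimonian_laird(y, v)
    return exp(pooled), exp(pooled - z * se), exp(pooled + z * se), tau2


# Hypothetical studies for illustration only (not the article's data).
print(pool_relative_risks([(0.40, 0.20, 0.80), (0.70, 0.50, 0.98), (1.10, 0.80, 1.51)]))
```

When the studies agree closely, the estimated between-study variance is zero and the result coincides with the fixed effect estimate; larger disagreement widens the pooled interval.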
We received no funding; this research is done in our spare time. We have no affiliations with any pharmaceutical companies or political parties.
We have classified studies as early treatment if most patients are not already at a severe stage at the time of treatment, and treatment started within 5 days after the onset of symptoms, although a shorter time may be preferable. Antivirals are typically only considered effective when used within a shorter timeframe, for example 0-36 or 0-48 hours for oseltamivir, with longer delays not being effective [McLean, Treanor].