Particulate matter (PM2.5) is a mixture of small particles and droplets less than 2.5 micrometers in diameter. It is regulated in outdoor air in the United States by the U.S. Environmental Protection Agency (EPA) for the protection of public health and the environment.REF The regulatory value for controlling air quality, including PM2.5, is based upon a claim that poor or diminished air quality is fatal and harmful to people.
This is no trivial matter as the EPA uses costs of poor air quality as a basis to regulate emissions of PM2.5 and other substances, including greenhouse gases. The greater the claimed harm from poor air quality, the stricter the emission regulations, and the more costs imposed on Americans.
As for a climate change connection, a claim of adverse health effects from poor air quality is asserted in the Fourth National Climate Assessment for the United States, which acknowledges that large “uncertainties exist with respect to the climate impacts on PM2.5.”REF Yet based on prediction modeling of future climate change, it is alleged that “more frequent and severe wildfires due to climate change would further diminish air quality,”REF including PM2.5, and that “exposure to high concentrations [of PM2.5] can result in…premature death, nonfatal heart attacks, and adverse birth outcomes.”REF
The science and observational data behind this claim need to be understood to establish whether there is even a link between future climate change and adverse health from PM2.5. As we will show, this claim is not supported. Thus, any causal chain linking man-released greenhouse gases, measurable climate change, and health effects from PM2.5 is entirely unproven.
Firstly, a recent expert review of climate change prediction models compared to observations noted that the observed rate of surface−air temperature increases over the past 50 years has been unremarkable and much weaker than that predicted by almost all the climate models.REF Thus, the link between climate change and air quality is tenuous at best. Secondly, most wildfires are caused by humans—but not through their greenhouse gas emissions.REF Fire management experts attribute increases in forest fires to forest management practices.REF
Thirdly, the PM2.5−adverse health link is far less certain than what the EPA would have us believe. To support this position, we show three persistent, hidden problems of PM2.5 health effects research that people, in general, and air quality researchers and EPA policymakers, in particular, are not aware of—or, if they are, they (intentionally?) ignore.
These include the following: (1) use of questionable practices in academic research; (2) multiple testing (statistical) bias; and (3) irreproducibility (falseness) of PM2.5−health effect research claims. Where possible, we focus our discussion on the two key health endpoints that the EPA claims result from PM2.5 exposureREF—nonfatal heart attacks and premature deaths.
Questionable Research Practices
A search of the terms “health,” “air quality,” or “air pollution” anywhere in journal articles listed in the publicly available National Institutes of Health PubMed database returns over 58,800 results for the period 2000 to July 2024.REF Why are there so many published articles about air quality and health? The answer is simple: Academic researchers must continually publish to remain in their university positions and acquire funding. A recent National Association of Scholars (NAS) Shifting Sands Project report highlighted this problem:
University researchers earn tenure, promotion, lateral moves to more prestigious universities, salary increases, grants, professional reputation, and public esteem—above all, from publishing exciting, new, positive results…. [T]he same incentives affect journal editors, who receive acclaim for their journal, and personal reputational awards, by publishing exciting new research—even if the research has not been vetted thoroughly…. [A]ll these incentives reward published research with new positive claims—but not reproducible research.REF
The academic community’s incentives for publication quantity rather than quality encourage poor or questionable research practices to get their research published.REF These poor research practices can lead to false-positive findings in literature, and the persistence of these poor methods can lead to the natural selection of bad science in literature.REF
Questionable research practices are deceptive practices that do not constitute “research misconduct” but fail to align with the principles of scientific integrity.REF Some examples of deceptive practices that academic researchers use include:
- Inaccurate referencing of ideas and concepts;
- Failing to keep accurate records of the research process;
- P-hacking (repeatedly running statistical tests on a set of data until some statistically significant results arise);
- Incomplete reporting of relevant aspects of the study design;
- Selectively reporting studies that “worked” and ignoring those that did not;
- Claiming to have predicted an unexpected finding;
- Failing to report or discuss relevant contrary (i.e., nonsignificant) evidence; and
- Failing to share data or relevant information on the research with peers who would like to verify research results.REF
Research on air quality, including PM2.5, is by no means free of these deceptive practices. For example, research claiming that PM2.5 causes asthma attacks often involves academics selecting and reporting only some of their findings when they perform many statistical tests with a data set. Specifically, these are findings with positive associations—that is, results showing that higher levels of PM2.5 are associated with more disease or death.REF These researchers then only need to selectively describe the research designs and methods they used that are consistent with their reported positive associations (and point of view). Other findings that support different conclusions are ignored and not reported in their study, nor are negative (null) studies cited.
All of this can be written up in a professional manner and submitted to a scientific journal. Journal editors can overlook this deceptive practice given the professional, tight presentation of a scientific manuscript and send it off for peer review. Likewise for peer reviewers. In the end, what gets published is based on only a portion of the statistical comparisons they performed (i.e., their selectively reported findings). The problem is that it is unknown whether these selectively reported findings are true or false-positive findings. Others have noted that studies such as these are more likely to present false-positive findings.REF
Another example of deceptive practices in research relates to false reporting of research findings in scientific reviews. A scientific review is intended to be a comprehensive and focused review of the scientific literature on a particular topic, but it can be misleading.
For example, a review of studies on gas stoves, indoor air quality, and respiratory healthREF referred to a Paulin et al. randomized control trial on cooking behaviors, nitrogen dioxide (NO2), and asthma in 30 children ages 5−12 years in East Baltimore.REF The review article stated, “Paulin and colleagues demonstrated that daily changes in household NO2 exposure were associated with gas stove/oven use and led to worsened asthma symptoms and nighttime inhaler use among children with asthma.” Yet Paulin et al. clearly stated in their abstract: “There were no associations between NO2 and lung function or asthma symptoms.”
Richard Smith, former editor of the British Medical Journal, best summarized how we should treat research today given the deceptive practices being used: “It may be time to move from assuming that research has been honestly conducted and reported to assuming it to be untrustworthy until there is some evidence to the contrary.”REF
Smith was a cofounder of the Committee on Publication Ethics, was for many years the chair of the Cochrane Library Oversight Committee, and is a member of the board of the UK Research Integrity Office.
Multiple Testing Bias
Environmental epidemiology studies of populations (also called observational studies) examine associations between air quality factors and diseases or deaths. In the case of PM2.5, these studies are not founded on proven biological plausibility of PM2.5 causing diseases or deaths.REF They are founded on an assumption of what may be a cause of disease or death—for example, PM2.5.
These studies tend to analyze large, complex data sets that are far from homogeneous. Bias occurs when an air quality and disease (or death) data set is used to test multiple predictors, multiple outcomes, different population subgroups, multiple statistical cause−effect models, or multiple confounders to cause−effect associations.
Multiple testing without a statistical correction tends to produce more false-positive findings.REF Environmental epidemiologists typically do not correct for multiple testing, so any published study that does not make corrections cannot be reliable. Two papers published in 1988 alerted epidemiologists to the multiple testing problem.REF The epidemiologists did not take heed, and they have ignored the problem ever since.
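As an illustration of what a statistical correction for multiple testing can look like, the following minimal Python sketch applies two widely used procedures, the Bonferroni correction and the Benjamini-Hochberg procedure, to a list of p-values. The p-values are hypothetical, and the choice of these two procedures is an assumption made for illustration; it is not a description of any particular study's method.

```python
def bonferroni(p_values, alpha=0.05):
    """Return booleans: True where a test stays significant after the
    Bonferroni correction (p < alpha / number of tests)."""
    m = len(p_values)
    threshold = alpha / m
    return [p < threshold for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Return booleans marking discoveries under the Benjamini-Hochberg
    step-up procedure, which controls the false discovery rate."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose ordered p-value satisfies
    # p_(k) <= (k / m) * alpha.
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            largest_k = rank
    # Every test ranked at or below largest_k counts as a discovery.
    keep = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= largest_k:
            keep[idx] = True
    return keep


# Hypothetical p-values from ten comparisons on one data set
# (invented for illustration only).
example = [0.001, 0.008, 0.012, 0.041, 0.049, 0.20, 0.35, 0.51, 0.74, 0.93]
uncorrected = [p < 0.05 for p in example]
print(sum(uncorrected), sum(bonferroni(example)), sum(benjamini_hochberg(example)))
# prints: 5 1 3
```

In this invented example, five of ten comparisons look significant without any correction, but only one survives the Bonferroni correction and three survive the Benjamini-Hochberg procedure.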
Observational studies are known to have a bias toward highlighting statistically significant findings (i.e., those with a p-value less than 0.05) and avoiding highlighting nonsignificant findings. This bias is known as selective reporting.REF
Furthermore, a recent National Association of Scholars report estimated numbers of statistical comparisons performed in 70 randomly selected observational studies of PM2.5 and heart attacks, asthma attacks, and development of asthma. The estimated median number of statistical comparisons performed in these 70 studies was 13,056.REF
The null hypothesis in statistics is the assumption that there is no effect or association between two variables. According to hypothesis testing theory, when 0.05 is used as the threshold for statistical significance, about one in 20 test results (5 percent) is expected to be statistically significant (a p-value less than 0.05) by chance alone even when the null hypothesis is true.
Given a typical study with 13,000 statistical comparisons, one can expect as many as 650 “statistically significant” results due to chance alone. That is to say, 650 false-positive results may be expected given 13,000 statistical comparisons performed. Among these 70 studies, large numbers of nonsignificant test results (presumably those with p-values greater than 0.05) may have gone unreported.REF
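The arithmetic above can be checked with a few lines of Python. This is a minimal sketch that assumes every comparison is independent and that the null hypothesis is true for all of them, so that each p-value behaves like a uniform random number between 0 and 1. The function names and the random seed are illustrative choices.

```python
import random


def expected_false_positives(n_tests, alpha=0.05):
    """Expected count of chance 'significant' results when every null
    hypothesis is true: number of tests times the threshold."""
    return n_tests * alpha


def prob_at_least_one_false_positive(n_tests, alpha=0.05):
    """Chance of one or more false positives, assuming independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def simulate_null_study(n_tests, alpha=0.05, seed=0):
    """Draw one uniform p-value per test (the null case) and count how
    many fall below alpha."""
    rng = random.Random(seed)
    return sum(1 for _ in range(n_tests) if rng.random() < alpha)


print(expected_false_positives(13000))           # about 650
print(prob_at_least_one_false_positive(13000))   # effectively 1
print(simulate_null_study(13000))                # a count near 650
```

The simulated count varies with the seed but stays close to the expected 650, which is the point of the one-in-20 rule when it is applied to thousands of comparisons.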
These high numbers of statistical comparisons also encourage “p-hacking,” a search for significance during statistical analysis of data.REF There is a high risk of p-hacking with numerous hypothesis tests performed on a data set,REF particularly if there is no statistical correction for multiple testing and multiple modeling. Researchers have shown that employing a few common forms of p-hacking may cause the false-positive error rate for a single study to increase from the expected 5 percent to over 60 percent.REF
Multiple testing and p-hacking in scholarly research are no trivial matter. Large amounts of false-positive findings can be mistaken as true and be published in scholarly journals, which, in turn, can give rise to a false claim being taken as fact.REF EPA policymakers can then unknowingly use false-positive findings in PM2.5 research to develop policy goals and emission regulations for PM2.5. To sustain a narrative that poor air quality is a killer, any study where no effect is found can simply not be reported.
Irreproducibility (Falseness) of Claims
Far too many published claims in scholarly research are irreproducible or false.REF Academic research, both observational and experimental, possesses astonishingly high error rates, and peer and editorial review of university research no longer effectively provides quality control.REF This irreproducibility of PM2.5 health research claims lies at the center of the climate change, air quality–nonfatal heart attack/premature death narrative.
The NAS Shifting Sands Project explores irreproducible research and how it affects public policy. In an introduction to the Shifting Sands Project, NAS President Peter Wood stated:
Science has always had a layer of untrustworthy results published in respectable places and “experts” who are eventually shown to have been sloppy, mistaken, or untruthful in their reported findings. Irreproducibility [of research] itself is nothing new. Science advances, in part, by learning how to discard false hypotheses, which sometimes means dismissing reported data that does not stand the test of independent reproduction.REF
Findings of statistical hypothesis tests in PM2.5 health research are normally presented as relative risks, odds ratios, effect sizes, or percent increases with 95 percent confidence intervals. Researchers conduct statistical tests on a data set to determine whether a significant correlation exists between two variables—for example, assumed PM2.5 exposure and disease (or death). This allows a researcher to make a claim if a significant association is found. But are these claims reproducible? One way to answer this is to use p-value plots.
The p-value is a number that describes how likely it would be to find a result at least as extreme as the one observed if there were truly no association between the two variables (that is, if the null hypothesis were true). Relative risks, odds ratios, effect sizes, percent increases with 95 percent confidence intervals, and p-values are calculated from the same data set. They are interchangeable in the sense that one can be calculated from another.REF
The p-values for a set of hypothesis tests can be displayed in a p-value plot.REF The p-values are rank-ordered from smallest to largest and plotted against the integers 1, 2, 3, and so on. The plot is used to visually check the heterogeneity (dissimilarity) of test statistics addressing the same research question or claim (for example, that assumed PM2.5 exposure causes disease or death). The p-value plot can be used to test the reproducibility of a research claim. The plot is well-regarded, being cited more than 500 times in scientific literature.REF
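One common way to recover a p-value from a reported relative risk and its 95 percent confidence interval is sketched below in Python. It assumes that the logarithm of the ratio is approximately normally distributed and that the interval is symmetric on the log scale; these are assumptions of this illustration, and the numbers in the example are hypothetical rather than taken from any study discussed here.

```python
import math


def p_value_from_ratio_ci(ratio, lower, upper, z_crit=1.96):
    """Two-sided p-value for a ratio measure (relative risk or odds
    ratio) given its confidence interval, under a normal approximation
    on the log scale. z_crit=1.96 corresponds to a 95 percent interval."""
    # The interval spans 2 * z_crit standard errors on the log scale.
    standard_error = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(ratio) / standard_error
    # Two-sided tail area of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical relative risk of 1.05 with a 95 percent CI of 1.01 to 1.09.
print(round(p_value_from_ratio_ci(1.05, 1.01, 1.09), 3))  # about 0.012

# Hypothetical relative risk whose interval includes 1.0: not significant.
print(p_value_from_ratio_ci(1.02, 0.97, 1.07) > 0.05)     # True
```

A confidence interval that excludes 1.0 gives a p-value below 0.05, and one that includes 1.0 gives a p-value above 0.05, which is why the two kinds of summary carry the same information.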
There are several ways to interpret p-value plots depending on their appearance:REF
- P-values falling approximately on a 45 degree line in the plot suggest a good fit with the theoretical (uniform) distribution. Such a trend represents a distinct sample distribution for a null association between the tested variables.
- If p-values are mostly less than 0.05 and fall on a line with a shallow slope in the plot, there could be a real, non-random association between tested variables. Such a trend represents a distinct sample distribution for a true association between two variables.
- In the absence of biases, deviations from a near-45 degree line for the p-values may also indicate departures from the uniform distribution and a real, non-random association between two variables. In the presence of biases, the p-values can resemble the shape of a hockey stick—with some small (on the blade) and some large (on the handle). Such a p-value plot is ambiguous, and it represents an unproven research claim.
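The construction of a p-value plot described above can be sketched in a few lines of Python. This is only an illustration: the p-values are invented, the function names are hypothetical, and the rough text rendering stands in for a proper chart.

```python
def p_value_plot_points(p_values):
    """Rank-order p-values from smallest to largest and pair each with
    its rank (1, 2, 3, ...), giving the (x, y) points of a p-value plot."""
    return list(enumerate(sorted(p_values), start=1))


def render_text_plot(points, width=40):
    """Return a rough text rendering: one row per rank, with the marker
    placed further right for larger p-values."""
    rows = []
    for rank, p in points:
        offset = round(p * width)
        rows.append(f"{rank:>3} {p:6.3f} |" + " " * offset + "*")
    return "\n".join(rows)


# Hypothetical p-values, invented for illustration only.
hypothetical = [0.62, 0.001, 0.33, 0.0004, 0.81, 0.02, 0.47, 0.95, 0.15, 0.003]
points = p_value_plot_points(hypothetical)
print(render_text_plot(points))
```

In this invented set, a few very small p-values sit near the left edge while the rest climb steadily, which is the hockey-stick shape described in the last bullet. The same points could be passed to any plotting library to draw the chart.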
Three Examples. The reproducibility of three PM2.5 health research claims was tested with data sets from meta-analysis studies using p-value plots:REF
- PM2.5 exposure leads to more adult hospital admissions and emergency room visits due to nonfatal heart attacks.
- PM2.5 exposure leads to more all-cause and cause-specific mortality (premature deaths).
- PM2.5 exposure leads to more cases of lung cancer incidence and mortality.
The first claim was tested on a meta-analysis of 13 observational studies that examined the association between assumed PM2.5 exposure and adult hospital admissions and emergency room visits due to nonfatal heart attacks. The meta-analysis calculated p-values from relative risks and confidence intervals. The p-values are presented in Chart 1.
The second claim was tested on a meta-analysis of 29 observational studies that examined the association between assumed PM2.5 exposure and all-cause and cause-specific mortality. The p-values were calculated from relative risks and confidence intervals used in the meta-analysis and are presented in a p-value plot. (See Chart 2.)
The third claim was tested on a meta-analysis of 17 observational studies that examined the association between assumed PM2.5 exposure and lung cancer incidence and mortality. The p-values were calculated from relative risks and confidence intervals used in the meta-analysis and are presented in a p-value plot. (See Chart 3.)
The p-value trends in all the charts clearly depart from the uniform distribution—p-values falling approximately on a 45 degree line. All p-value trends present as two-component mixtures. These trends do not support real exposure–disease (or death) associations. Specifically, they do not show evidence of distinct sample distributions for true effects between two variables—p-values mostly less than 0.05 and falling on a line with a shallow slope in the plot.
These trends show that the test statistics are dissimilar. Keep in mind that each of the data sets in the three charts is supposed to be addressing whether PM2.5 causes nonfatal heart attacks or premature deaths. What these charts do show is that the PM2.5 research claims cannot be reproduced.
How can such dissimilar test statistics combined in meta-analysis represent true effects? The answer is that questionable research practices and multiple testing without statistical corrections cannot be ruled out as explanations for small p-values in studies combined in a meta-analysis. It is possible and likely that false-positive results are being mistakenly claimed as true results in these studies and are being carried forward into meta-analysis.
Conflicting PM2.5 Health Research
Another important question is whether scholarly research exists showing that PM2.5 is not associated with nonfatal heart attacks or premature deaths (i.e., so-called null association studies).
Nonfatal Heart Attacks. Large, well-conducted scholarly research has reported no association of PM2.5 in outdoor air with nonfatal heart attacks. A 2009 study was conducted on nearly 400,000 emergency room visits for heart attack at 14 hospitals in seven Canadian cities during the 1990s and early 2000s.REF Statistical comparisons with daily (24-hour) average levels and three-hour averages (12 a.m.–3 a.m., 3 a.m.–6 a.m., 6 a.m.–9 a.m., and so on) for numerous air quality parameters, including PM2.5, were assessed.
The researchers found that none of the statistical comparisons for PM2.5 and heart attack emergency room visits were significant using combined data for the seven cities. This was also the case for comparisons made at the individual city level as observed by the fact that they did not present any city-level results for PM2.5−heart attack emergency room visits. Also, these researchers made no mention of the lack of an association between PM2.5 and heart attacks in their study—an example of selective reporting.
A large 2014 study in England and Wales examined air quality and 452,343 emergency hospitalization events for nonfatal and fatal heart attacks and 16 other cardiovascular endpoints (eight nonfatal and eight fatal).REF The study’s data were from three databases from 2003 to 2008. Statistical comparisons with daily average levels of numerous air quality parameters, including PM2.5, were assessed for all cardiovascular endpoints.
All the nonfatal heart attack statistics they presented in their study were nonsignificant with no corrections for multiple testing. The researchers concluded, “This study found no clear evidence for pollution effects on STEMIs [a particular nonfatal heart attack diagnosis].”REF
A 2015 study combined two independent analyses of five air quality parameters, including PM2.5, over the period 1999−2010 in Calgary and Edmonton, two geographically close and demographically similar cities with over 10,000 first-time heart attack hospitalization events in each city.REF
The study emphasized reproducibility of results by comparing statistical claims made in one city to the other city as a way of exploring the possible role of air quality parameters on heart attack hospitalization events. Researchers performed the same 600 statistical comparisons of potential air quality parameter−heart attack hospitalization events for each urban population, including 120 comparisons of PM2.5−heart attack hospitalization events.
None of the findings observed in one city was reproduced in the other city for PM2.5 and other air quality parameters. In short, claims made in one city were not replicated in the other. The researchers concluded that “none of the air pollutants investigated showed consistent positive associations with increased risk of [heart attack] hospitalisation.”REF
Premature Deaths. Similarly, well-conducted scholarly research has reported no association of PM2.5 in outdoor air with premature deaths. In the mid-1970s, the EPA forced 276 U.S. counties that did not meet Total Suspended Particulate (TSP) standards to improve their air quality.REF These are called nonattainment counties. Researchers compared changes in adult mortality rates and TSP levels (which includes PM2.5) for these nonattainment counties with 257 attainment counties serving as controls.
The data set they analyzed was for six consecutive years (1969–1974), and it included over 31 million people. Based on their findings, the researchers concluded, “We find that regulatory status is associated with large reductions in TSP pollution but has little association with reductions in either adult or elderly mortality.”REF
Researchers later conducted a reanalysis of the same data set using a different, independent statistical method.REF These researchers concluded, “Our reanalysis reveals subgroup heterogeneity in the effects of air quality regulation on elderly longevity (one size does not fit all), and we show that this heterogeneity is largely explained by socioeconomic and environmental confounders other than air quality.” In essence, these researchers reproduced the previous results that changes in TSP levels in outdoor air did not explain changes in adult mortalities.
A 2016 study examined forest fires in the Canadian province of Quebec in the summer of 2002 that generated smoke plumes that migrated as far as New York.REF Researchers analyzed PM2.5 levels and mortality data for a four-week period in July 2002 for Greater Boston (over 1.7 million people) and New York City (over 8 million people).
Daily average PM2.5 levels increased in both cities for three days during this period. These researchers concluded that “substantial short-term elevation in PM2.5 concentrations from forest fire smoke were not followed by increased daily mortality in Greater Boston or New York City.”REF
In a 2017 study, researchers examined the relationship between air quality and acute deaths in California.REF They looked at daily deaths, daily average air quality levels for PM2.5 and ozone, daily temperature levels (minimum and maximum), and daily maximum relative humidity levels for the eight most populous California air basins. The data set covered the years 2000−2012. There were more than 1 million deaths used in the analysis representing over 37,000 exposure days.
The study found little evidence for an association between air quality and deaths. Within the data set, there were several forest fire smoke/PM2.5 events, and even these events did not exhibit associations with daily deaths. These researchers concluded, “Our analysis finds little evidence for an association between air quality [PM2.5 and ozone] and acute deaths.”REF
More recently, in 2021, another researcher examined PM2.5 and acute deaths in the United States using Medicare data from 1999−2013. Of particular interest, this study also involved a separate analysis of the California study mentioned above.REF
This researcher found “no significant association” between PM2.5 levels and adult mortality. In essence, this researcher reproduced the previous California results that changes in PM2.5 levels did not explain changes in adult mortalities.
Implications
The EPA regulations for PM2.5 rely on environmental epidemiological literature of nonfatal heart attacks, premature deaths, and other harms. The EPA does not take proper accounting of false-positive findings from multiple testing and other biases in studies they rely on, nor does the agency apply rigorous tests for reproducibility of these studies.REF A good case can be made that many studies with nonsignificant findings were simply not published.REF
Given the evidence presented here, PM2.5 health research claims of nonfatal heart attacks and premature deaths are untrustworthy and cannot be relied upon to make public policy. Questionable research practices and multiple testing biases cannot be ruled out as explanations for these research claims. Further, independent testing of the PM2.5 research claims using p-value plots shows that they are irreproducible. Also, numerous null association or effect studies in scholarly literature show that PM2.5 does not cause nonfatal heart attacks and premature death.
As a matter of scientific logic, if an alleged risk factor such as PM2.5 causes nonfatal heart attacks or premature deaths, then any adequate study should be able to show this. The research examples described above are adequate, and they show that particulates, including PM2.5 in outdoor air, do not cause nonfatal heart attacks and premature deaths, and these null studies are reproducible results.
The book Scare PollutionREF and the NAS reportREF provide extensive background on these topics, and they find that PM2.5 at current levels in outdoor air is not harming anyone. Both acknowledge that rare combinations of events can turn air deadly. We know of three events (Meuse River Valley, Belgium, in 1930; Donora, Pennsylvania, in 1948; and the London fog in 1952, and again in 1956 and 1962)REF where a combination of a temperature inversion, acid in the air, and particulate matter resulted in deaths. But these air quality conditions do not exist today.
In the end, it is up to the interpretation of the scientific method to answer the PM2.5−nonfatal heart attack/premature death claims: If the method is flawed, so is the evidence. Studies that engage in questionable research practices should be treated as untrustworthy until proven otherwise.REF Studies that perform many statistical comparisons tend to produce more errors of false-positive associations in the absence of statistical corrections.REF
Also, PM2.5 health research ought to survive a battery of independent, passable tests, such as p-value plots. Researchers of PM2.5 health always have the burden of proof to defend statistically significant associations. Evidence presented here from p-value plots shows that PM2.5 research claims for nonfatal heart attacks and premature deaths are irreproducible. This is consistent with the broader claim that false-positive results are common features of the scholarly literature today.REF
Independent studies that use sound research practices—good study technique, randomization, blocking, blinding, and unbiased peer review, among othersREF—should be able to provide unbiased risk statistics that differ one from another only by chance. These practices are not common features of PM2.5 health research claiming health effects.
Non-randomized designs are typical of the PM2.5 health research referred to here. These types of designs are unable to address biases and confounding that lurk, often unmeasured and unobserved.REF Spurious risk statistics that resemble genuine effects can easily occur in PM2.5 health research studies when, in fact, they are nothing more than artifacts of these hidden biases and confounding.
Conclusion
This report attempts to show that, although assumed associations between PM2.5 and health effects are common, causation is not proven. With no detected causal effects and given persistent, hidden problems in PM2.5 health research, any purported material link between PM2.5 and public health is entirely unsupported and should not be taken seriously.
About the Authors
S. Stanley Young, PhD, is a former Visiting Fellow for the Science Advisory Committee in the Center for Energy, Climate, and Environment at The Heritage Foundation. He is currently CEO of CGStat in Raleigh, NC, and director of the National Association of Scholars’ Shifting Sands Project.
Warren B. Kindzierski, PhD, now retired, was a professor of public health at the University of Alberta, Canada, and a contributor to the Shifting Sands Project.