Air temperature model and areal interpolation
Air temperature predictions were drawn from a geostatistical model detailed previously (9). Briefly, we trained a machine learning model (XGBoost) on ground station data aggregated by the National Oceanic and Atmospheric Administration (NOAA) Meteorological Assimilation Data Ingest System (MADIS). The specific MADIS dataset we used was the National Mesonet, with >4,000 weather stations across the region. Predictors in the model included land surface temperature from the MODIS instrument on the Aqua and Terra satellites, an inverse-distance weighting interpolation of air temperature, enhanced vegetation index, landcover and landform characteristics, elevation, a topographic position index, and temporal terms for seasonality. We used spatial cross-validation to guard against overfitting, and the model was assessed and validated against an independent external dataset. The resulting predictions have a spatial resolution of approximately 1 km and an hourly timestep. For comparison, air temperature predictions from this model have a root mean square error of 1.6 Kelvin when compared to ground observations, whereas the North American Land Data Assimilation System-2 (NLDAS-2) model, used in the Centers for Disease Control and Prevention Heat and Health Tracker, has a root mean square error of 2.5 Kelvin.
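To illustrate the inverse-distance weighting predictor mentioned above, the following is a minimal sketch rather than the model's actual implementation. The function name `idw_estimate`, the planar (x, y) coordinates, and the distance exponent of 2 are all assumptions made for illustration; the exponent and neighbor selection used in the published model are not stated here.

```python
import math


def idw_estimate(stations, target, power=2.0):
    """Estimate temperature at `target` from nearby station readings.

    stations: list of (x, y, temperature) tuples in planar coordinates
              (hypothetical layout, chosen for this example).
    target:   (x, y) location to estimate.
    power:    distance exponent; 2.0 is an assumed, commonly used default.
    """
    if not stations:
        raise ValueError("at least one station is required")

    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, temperature in stations:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            # The target coincides with a station: use its reading directly.
            return temperature
        weight = 1.0 / distance ** power
        weighted_sum += weight * temperature
        weight_total += weight
    return weighted_sum / weight_total


# Two stations equidistant from the target contribute equally.
example = idw_estimate([(0.0, 0.0, 290.0), (2.0, 0.0, 294.0)], (1.0, 0.0))
```

Closer stations receive larger weights, so the estimate tends toward the nearest observation as the target approaches a station.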
An areal interpolation procedure was designed to align temperature data with census tracts. First, 1-km prediction cells were reprojected to align with NASA’s Gridded Population of the World version 4.11 data product for 2000 and 2010 (25). We then used the exactextractr package in R to calculate area-weighted mean population density values in each cell (26). Finally, we used the population density and coverage area of each prediction cell to weight the temperature predictions and compute a weighted average for each tract. The result was a population-weighted areal interpolation of temperature within each census tract.
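The final weighting step can be sketched as follows. This is an illustrative Python version, not the R code used in the study; it assumes that each cell's weight is the product of its population density and the area of the cell covered by the tract (that is, an estimate of the population in the overlap). The function name `tract_weighted_temperature` and the choice to return `None` for tracts with zero total weight are hypothetical.

```python
def tract_weighted_temperature(cells):
    """Population-weighted mean temperature for one census tract.

    cells: iterable of (temperature, population_density, coverage_area)
           tuples, one per prediction cell intersecting the tract.
    Returns None when the total weight is zero (an assumed convention).
    """
    numerator = 0.0
    denominator = 0.0
    for temperature, population_density, coverage_area in cells:
        # Assumed weight: density x covered area, i.e. people in the overlap.
        weight = population_density * coverage_area
        numerator += weight * temperature
        denominator += weight
    if denominator == 0.0:
        return None
    return numerator / denominator


# Two cells each half-covered by the tract; the denser cell dominates.
tract_temp = tract_weighted_temperature([
    (80.0, 1000.0, 0.5),
    (90.0, 3000.0, 0.5),
])
```

In this example the weights are 500 and 1,500, so the tract value is 87.5 rather than the unweighted mean of 85.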
Air temperature data were transformed to create CDDs, restricted to predictions in May–September of each year. We used Fahrenheit instead of Celsius to align with energy policies in the U.S. CDD calculations were adapted based on NOAA methods:
\(\text{Cooling degree days} = \sum_{i=1}^{n} \max\left(\frac{\text{Tmax}_{i} + \text{Tmin}_{i}}{2} - 65,\ 0\right)\) (1)
where Tmax was the maximum hourly temperature and Tmin was the minimum on day i, summed over n days in May–September.
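As a worked illustration of the calculation, the sketch below follows the NOAA convention of taking the daily mean as the average of the daily maximum and minimum, subtracting a 65 °F base, and counting only positive differences. It assumes hourly temperatures are already in Fahrenheit and grouped by day; the function name `cooling_degree_days` and the input layout are hypothetical.

```python
def cooling_degree_days(hourly_temps_by_day, base=65.0):
    """Sum cooling degree days over a sequence of days.

    hourly_temps_by_day: list of lists, each holding one day's hourly
                         temperatures in degrees Fahrenheit (assumed).
    base:                base temperature in degrees Fahrenheit.
    """
    total = 0.0
    for hourly_temps in hourly_temps_by_day:
        # Daily mean from the maximum and minimum hourly values.
        daily_mean = (max(hourly_temps) + min(hourly_temps)) / 2.0
        # Days at or below the base temperature contribute zero.
        total += max(daily_mean - base, 0.0)
    return total


# Daily means of 80, 60, and 66 degrees F contribute 15, 0, and 1.
season_cdd = cooling_degree_days([
    [70.0, 80.0, 90.0],
    [50.0, 60.0, 70.0],
    [60.0, 66.0, 72.0],
])
```

Here the three days sum to 16 cooling degree days; the second day adds nothing because its mean falls below the base.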