Data and code from: Multivariate bayesian regression model for predicting disposed ash composition at U.S. coal fired power stations

Public

  • This dataset contains the code and data files needed to implement the Multivariate Bayesian Regression model described in Jin et al. (2025). The model predicts the historical chemical composition of disposed coal ash at U.S. coal-fired power plants as a function of annualized coal purchase data.

    The integrated coal supply data file (CoalSupplyDataset.csv) is a compilation of monthly fuel purchase records for the period 1973-2022 at major U.S. power stations. These records were obtained from the U.S. Energy Information Administration (EIA). For each coal purchase record, the CSV file also lists the coal region of the mine as defined by the U.S. Geological Survey. Data entry errors and data gaps in the EIA records were corrected as described in Jin et al. (2025), and the CSV file contains the integrated coal supply data after those corrections.
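
    As a rough illustration of how monthly purchase records might be rolled up into annualized inputs, the sketch below sums purchases per plant, year, and coal region, then converts them to fractional shares. The column names (plant_id, year, month, coal_region, quantity_tons) and the sample rows are hypothetical and synthetic; they are not taken from CoalSupplyDataset.csv, and this is not necessarily the aggregation used by Jin et al. (2025).

```python
import csv
import io
from collections import defaultdict

# Synthetic rows for illustration only. The column names are hypothetical
# and are NOT taken from CoalSupplyDataset.csv.
SAMPLE_CSV = """plant_id,year,month,coal_region,quantity_tons
P1,2000,1,Region A,100
P1,2000,2,Region A,300
P1,2000,2,Region B,100
P1,2001,1,Region B,50
"""


def annual_region_shares(rows):
    """Sum monthly purchases per (plant, year, region), then return each
    region's fraction of the plant's total purchases in that year."""
    totals = defaultdict(float)
    for row in rows:
        key = (row["plant_id"], int(row["year"]), row["coal_region"])
        totals[key] += float(row["quantity_tons"])

    # Total tonnage per (plant, year), used as the denominator.
    plant_year_totals = defaultdict(float)
    for (plant, year, _region), tons in totals.items():
        plant_year_totals[(plant, year)] += tons

    return {
        key: tons / plant_year_totals[key[:2]]
        for key, tons in totals.items()
        if plant_year_totals[key[:2]] > 0
    }


shares = annual_region_shares(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for key in sorted(shares):
    print(key, round(shares[key], 3))
```

    To try this on the real file, replace the in-memory sample with an opened file handle and adjust the column names to match the actual header row.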

    The model structure and fitting parameters are stored in a pickle file (Bayesian.pkl). The model was developed from the coal supply data and coal ash composition data, which were divided into training and testing subsets by Stratified Shuffle Split. The model was built in Python with the PyMC library.
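
    A minimal sketch of reading a pickle file with Python's standard library follows. The helper name load_pickled_object is hypothetical, and the round-trip demonstration uses a placeholder object so that it runs without the dataset; it makes no claim about what Bayesian.pkl contains. Loading the real file presumably requires PyMC and its dependencies to be installed in a compatible version, which is an assumption rather than something stated in this record.

```python
import pickle
import tempfile
from pathlib import Path


def load_pickled_object(path):
    """Return the object stored in a pickle file.

    Unpickling can execute arbitrary code, so only open files obtained
    from a source you trust.
    """
    with Path(path).open("rb") as fh:
        return pickle.load(fh)


# Round-trip demonstration with a stand-in object. The dictionary is a
# placeholder and says nothing about the contents of Bayesian.pkl.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo.pkl"
    with demo_path.open("wb") as fh:
        pickle.dump({"note": "placeholder"}, fh)
    loaded = load_pickled_object(demo_path)

# For the real file (assumes it has been downloaded locally and that the
# libraries used to create it are importable):
# model = load_pickled_object("Bayesian.pkl")
```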

    Reference Publication: Jin, Z.; Huang, J.; Hower, J.C.; Hsu-Kim, H. (2025). Predictive Assessment of the Chemical Composition of Coal Ash in Reserve at U.S. Disposal Sites. Environmental Science & Technology.

Total Size
4 files (329 MB)
Data Citation
  • Jin, Z., Huang, J., Hower, J. C., & Hsu-Kim, H. (2025). Data and code from: Multivariate bayesian regression model for predicting disposed ash composition at U.S. coal fired power stations. Duke Research Data Repository. https://doi.org/10.7924/r4vm4h42h
DOI
  • 10.7924/r4vm4h42h
Publication Date
ARK
  • ark:/87924/r4vm4h42h
Type
Format
Related Materials
Funding Agency
  • Alfred P. Sloan Foundation
  • Office of Fossil Energy
Grant Number
  • DE-FE0031748
  • G-2020-13922