This dataset contains the code and data files needed to implement the multivariate Bayesian regression model described in Jin et al. (2025). The model predicts the historical chemical composition of coal ash disposed at U.S. coal-fired power plants as a function of annualized coal purchase data.
The integrated coal supply data file (CoalSupplyDataset.csv) is a compilation of monthly fuel purchase records for the period 1973-2022 at major U.S. power stations. These records were obtained from the U.S. Energy Information Administration (EIA). For each coal purchase record, the CSV file also lists the coal region of the mine as defined by the U.S. Geological Survey. Data entry errors and data gaps in the EIA records were corrected as described in Jin et al. (2025), and the CSV file contains the integrated coal supply data after those corrections.
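Because the model takes annualized purchase data as input, the monthly records need to be summed by year before use. The following is a minimal sketch of that aggregation using only the Python standard library. The column names (`plant_id`, `year`, `coal_region`, `quantity_tons`) and the sample values are hypothetical placeholders, not the actual headers or contents of CoalSupplyDataset.csv; check the file's header row and adjust the arguments accordingly.

```python
import csv
import io
from collections import defaultdict


def annualize_purchases(rows, plant_col="plant_id", year_col="year",
                        region_col="coal_region", qty_col="quantity_tons"):
    """Sum purchase quantities per (plant, year, coal region).

    `rows` is any iterable of dict-like records, e.g. a csv.DictReader.
    All column names are hypothetical defaults, not the real CSV headers.
    """
    totals = defaultdict(float)
    for row in rows:
        key = (row[plant_col], int(row[year_col]), row[region_col])
        totals[key] += float(row[qty_col])
    return dict(totals)


# Tiny in-memory stand-in for the CSV file (invented values and headers).
sample_csv = io.StringIO(
    "plant_id,year,month,coal_region,quantity_tons\n"
    "P1,1990,1,Region A,100\n"
    "P1,1990,2,Region A,150\n"
    "P1,1990,2,Region B,50\n"
    "P1,1991,1,Region A,75\n"
)
annual = annualize_purchases(csv.DictReader(sample_csv))
```

To run the same aggregation on the real file, open it with `open("CoalSupplyDataset.csv", newline="")` and pass a `csv.DictReader` over that handle, supplying the actual column names as keyword arguments.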
The model structure and fitted parameters are stored as a Python pickle file (Bayesian.pkl). The model was developed with the coal supply data and coal ash composition data, which were divided into training and testing subsets using a stratified shuffle split. The model was built using Python and the PyMC library.
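A pickle file can be read back with Python's standard `pickle` module, as sketched below. This is an illustration under assumptions, not the authors' loading procedure: the type of object stored in Bayesian.pkl is not stated here, so the sketch round-trips a stand-in dictionary instead. Unpickling the real file will most likely require an environment with PyMC installed in a version compatible with the one used to create it. Pickle files can execute arbitrary code when loaded, so only open files from a trusted source.

```python
import pickle
import tempfile
from pathlib import Path


def load_pickled_object(path):
    """Unpickle and return whatever object is stored at `path`."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


# Round trip with a stand-in object; the real Bayesian.pkl contents are
# not known here, so this dictionary is purely a placeholder.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo.pkl"
    with open(demo_path, "wb") as handle:
        pickle.dump({"placeholder": [1.0, 2.0]}, handle)
    restored = load_pickled_object(demo_path)
```

For the dataset itself, the equivalent call would be `load_pickled_object("Bayesian.pkl")`, followed by inspecting `type(...)` of the result to see what was saved.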
Reference Publication: Jin Z.; Huang J.; Hower J.C.; Hsu-Kim H. (2025). Predictive Assessment of the Chemical Composition of Coal Ash in Reserve at U.S. Disposal Sites. Environmental Science & Technology.