    Prediction of coal volatile matter based on machine learning and terahertz time-domain spectroscopy

    Abstract: A random forest regression model was built on terahertz time-domain spectroscopy data collected from coal samples to predict volatile matter content efficiently and accurately. Principal component analysis (PCA) and three of its variants, kernel PCA (KPCA), sequential PCA (SPCA), and incremental PCA (IPCA), were used to reduce the dimensionality of the spectral data and to refine feature selection. Four random forest regression models were then built on the reduced features: RF-PCA, RF-KPCA, RF-SPCA, and RF-IPCA. Ten-fold cross-validation and hyperparameter optimization were applied to ensure the accuracy and precision of the models. RF-SPCA gave the best predictive accuracy, with R² = 0.985, RMSE = 1.949, and MAE = 0.913. Learning curves showed that the model stabilized as the number of training samples increased, and residual plots showed prediction errors distributed evenly on both sides of zero, which further supports the model's generalization ability. The study offers an effective analytical route for intelligent coal mine analysis.
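The abstract reports three evaluation metrics (R², RMSE, MAE) obtained under ten-fold cross-validation. As a minimal sketch of what those quantities measure, the following uses only the Python standard library. The function names, the fold-assignment scheme, and the sample values are illustrative assumptions; they are not the authors' code or data, and the model fitting step itself (PCA variant plus random forest) is not reproduced here.

```python
import math
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error; penalizes large errors more than MAE."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs so each sample is tested exactly once.

    Hypothetical helper: shuffles the indices, then deals them into k folds.
    """
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


if __name__ == "__main__":
    # Made-up measured and predicted volatile matter values, not study data.
    measured = [10.0, 20.0, 30.0, 40.0]
    predicted = [12.0, 18.0, 33.0, 39.0]
    print("R2:", r2_score(measured, predicted))
    print("RMSE:", rmse(measured, predicted))
    print("MAE:", mae(measured, predicted))
    for train_idx, test_idx in k_fold_indices(25, k=10):
        print(len(train_idx), len(test_idx))
```

In a ten-fold scheme like this, a model would be fitted on each training split and scored on the matching held-out split, with the metrics then aggregated across folds. RMSE can never be smaller than MAE on the same predictions, which is consistent with the reported 1.949 and 0.913.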

       
