Fig. 3 | Respiratory Research

Fig. 3

From: An integrated machine learning model of transcriptomic genes in multi-center chronic obstructive pulmonary disease reveals the causal role of TIMP4 in airway epithelial cell

A model constructed with 13 genes that accurately identifies COPD. (A-B) ROC curves show the performance of 20 different machine learning methods, based on the selected 13-gene model, in identifying COPD in GSE47460 (A) and GSE76925 (B). (C) ROC results for cross-validation of the random forest and extra tree models between the GSE47460 and GSE76925 datasets. (D) ROC curves and AUC values for validation of the random forest and extra tree models on two external lung tissue sequencing datasets from patients with COPD (GSE103174 and GSE239897). (E) Expression changes of the 13 genes used to construct the model (ANGPTL1, DUSP26, FGG, GAS2, VEGFD, BHLHE22, SYNGR1, TIMP4, CXCL12, GEMIN5, SV2B, HTR2B, and TMEM117) between the control and COPD groups in the GSE47460 and GSE76925 datasets. (F) Scatter plots of predicted versus observed FEV1% predicted values for each of the 13 regression models (AdaBoost, Decision Tree, ElasticNet, GLM, LASSO, Least.Angle, Linear, NeuralNet, RandomForest, Ridge, SGD, SVR, and voting) in GSE47460. Each point represents a sample in the test set. R-squared values and p-values are displayed for each model. Data are presented as mean ± SD. P-values shown in the charts were determined by a two-tailed Student's t-test (E)
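
For readers unfamiliar with the AUC values reported in panels A-D, the following is a minimal sketch of how an area under the ROC curve can be computed from class labels and model scores. It is not the authors' pipeline: the function name `roc_auc` and the toy labels and scores are hypothetical, chosen only for illustration. It uses the rank interpretation of AUC, namely the probability that a randomly chosen COPD sample receives a higher score than a randomly chosen control sample.

```python
def roc_auc(labels, scores):
    """Return the ROC AUC for binary labels (1 = COPD, 0 = control).

    Computed as the fraction of (positive, negative) pairs in which the
    positive sample has the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from any GEO dataset.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 1.0 means every COPD sample is scored above every control, while 0.5 corresponds to chance-level ranking. The pairwise loop is quadratic in the number of samples, which is acceptable for a sketch but would normally be replaced by a sort-based computation on larger cohorts.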
