
Random Forest Regression Model Application for Prediction of China’s Railway Freight Volume

Collaborative Logistics and Intermodality

Abstract

Purpose: China’s railways play an important role in transporting domestic energy products. The Chinese Prime Minister regards railway freight as a barometer of the Chinese economy, so the study of China’s railway freight is meaningful. Over the five years from 2012 to 2016, China’s railway freight volume declined continually, creating a serious situation. Predicting rail freight volume is important because it reflects the development of the Chinese economy. Traditional regression models predict China’s railway freight poorly because they are too sensitive to changes in the statistical data, and the large economic changes now under way in China have caused significant swings in railway freight volume. In this chapter, we use a machine learning model to predict China’s railway freight volume and examine whether the random forest regression model is more effective than the conventional forecasting method.

Design/methodology/approach: In this chapter, random forest regression is applied to quantitatively predict railway freight volume. Monthly data on six independent variables related to China’s railway freight were collected from Jan 2001 to Dec 2016. After data analysis, a random forest regression model of China’s railway freight volume was built using the R language. To obtain the most suitable regression model, the error of the random forest regression is compared with that of a multiple linear regression model. The results show that the random forest regression model performed better than linear regression.
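As an illustration of the technique only, the following is a minimal pure-Python sketch of random forest regression: bootstrap resampling, regression trees that try a random subset of features at each split, and averaging over trees. It is not the authors’ model, which was built in R. The function names, the hyperparameters (50 trees, depth 4, minimum leaf size 2, one third of the features per split) and the choice of the twelve 2001 rows from the appendix table are all assumptions made for this example.

```python
import random
from statistics import mean

# Jan-Dec 2001 rows from the appendix table (illustrative subset only).
# Predictors: steel, coal, refined oil, thermal power,
# fixed asset investment, growth rate of industrial added value.
X = [
    [10.91, 63.45, 16.27, 88.022, 3.5930, 2.3],
    [11.31, 67.97, 15.76, 100.431, 1.7965, 19.0],
    [13.23, 82.49, 18.12, 88.722, 4.4913, 12.1],
    [12.85, 79.42, 18.65, 95.111, 5.3895, 11.5],
    [13.28, 80.63, 19.03, 93.663, 7.1860, 10.2],
    [13.61, 80.94, 18.33, 94.658, 7.1860, 10.1],
    [12.96, 74.14, 16.60, 106.032, 6.2878, 8.1],
    [13.41, 78.10, 16.91, 100.992, 6.2878, 8.1],
    [13.17, 81.18, 17.79, 97.313, 8.0843, 9.5],
    [13.54, 83.89, 18.51, 96.623, 8.9825, 8.8],
    [15.34, 88.06, 18.76, 138.632, 9.8808, 7.9],
    [13.86, 100.19, 16.71, 76.552, 20.6598, 8.7],
]
# Response: railway freight volume (Mt) for the same months.
y = [137.0, 131.0, 152.0, 147.0, 155.0, 149.0,
     152.0, 154.0, 151.0, 158.0, 149.0, 151.0]


def sse(values):
    """Sum of squared deviations from the mean (the split criterion)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def build_tree(X, y, rng, n_features, min_leaf=2, max_depth=4):
    """Grow one regression tree, trying a random feature subset at each split."""
    if max_depth == 0 or len(y) < 2 * min_leaf:
        return mean(y)  # leaf: predict the average response
    best = None
    for j in rng.sample(range(len(X[0])), n_features):
        for t in sorted({row[j] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, j, t)
    if best is None:
        return mean(y)  # no admissible split: make a leaf
    _, j, t = best
    lo = [i for i, row in enumerate(X) if row[j] <= t]
    hi = [i for i, row in enumerate(X) if row[j] > t]

    def grow(idx):
        return build_tree([X[i] for i in idx], [y[i] for i in idx],
                          rng, n_features, min_leaf, max_depth - 1)

    return (j, t, grow(lo), grow(hi))


def predict_tree(node, row):
    """Follow splits until a leaf (a plain float) is reached."""
    while isinstance(node, tuple):
        j, t, left, right = node
        node = left if row[j] <= t else right
    return node


def fit_forest(X, y, n_trees=50, seed=0):
    """Fit each tree on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    n_features = max(1, len(X[0]) // 3)  # assumed: one third of the predictors
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx],
                                 rng, n_features))
    return forest


def predict_forest(forest, row):
    """The forest prediction is the average of the tree predictions."""
    return mean(predict_tree(tree, row) for tree in forest)


forest = fit_forest(X, y, n_trees=50, seed=0)
fitted = [predict_forest(forest, row) for row in X]
```

Because every leaf is an average of observed freight volumes, each fitted value lies between the smallest and largest response in the sample. This is one reason a forest reacts less sharply than a linear model to outlying predictor values, and also why it cannot extrapolate beyond the observed range.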

Findings: The results of this study indicate the following: (1) The random forest regression model is able to predict railway freight volume using the selected variables. (2) Comparing the variance and the normalized mean square error (NMSE) of the candidate models yields the best random forest regression model, which predicts accurately. (3) Compared with the multiple linear regression model, the random forest regression model is superior in prediction accuracy, robustness and goodness of fit. (4) Although coal makes up the largest proportion of railway freight, refined oil production also has a large impact.
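The abstract does not state the NMSE formula. A minimal sketch, assuming the common definition in which the sum of squared prediction errors is divided by the sum of squared deviations of the observations from their mean, could look like this. The function name `nmse` and the sample values (the first four freight figures in the appendix table) are chosen for illustration.

```python
def nmse(actual, predicted):
    """Normalized mean square error under the assumed definition:
    sum((y - y_hat)^2) / sum((y - mean(y))^2).
    0 means a perfect fit; 1 means no better than predicting the mean."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    mean_actual = sum(actual) / len(actual)
    squared_error = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    squared_deviation = sum((a - mean_actual) ** 2 for a in actual)
    return squared_error / squared_deviation


# Railway freight (Mt), Jan-Apr 2001, from the appendix table.
actual = [137.0, 131.0, 152.0, 147.0]
baseline = [sum(actual) / len(actual)] * len(actual)  # always predict the mean

perfect_score = nmse(actual, actual)     # perfect predictions
baseline_score = nmse(actual, baseline)  # mean-only predictions
```

Under this definition, a model is useful only if its NMSE is below 1, and the model with the smaller NMSE on the same data is preferred.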

Acknowledgements

This chapter is funded by the National Natural Science Foundation (Grant No. 71390334), the EC-China Research Network on Integrated Container Supply Chain Project (No. 612546), and the Beijing Logistics Informatics Research Base.

Author information

Corresponding author

Correspondence to Xiaochun Lu.

Appendix Table. Chinese Railway Freight and Other Industry Data

Month | Railway freight (Mt) | Steel output (Mt) | Coal output (Mt) | Refined oil output (Mt) | Thermal power output (billion kWh) | Fixed asset investment (billion yuan) | Growth rate of industrial added value (%)
Jan-2001 | 137.00 | 10.91 | 63.45 | 16.27 | 88.022 | 3.5930 | 2.3
Feb-2001 | 131.00 | 11.31 | 67.97 | 15.76 | 100.431 | 1.7965 | 19
Mar-2001 | 152.00 | 13.23 | 82.49 | 18.12 | 88.722 | 4.4913 | 12.1
Apr-2001 | 147.00 | 12.85 | 79.42 | 18.65 | 95.111 | 5.3895 | 11.5
May-2001 | 155.00 | 13.28 | 80.63 | 19.03 | 93.663 | 7.1860 | 10.2
Jun-2001 | 149.00 | 13.61 | 80.94 | 18.33 | 94.658 | 7.1860 | 10.1
Jul-2001 | 152.00 | 12.96 | 74.14 | 16.6 | 106.032 | 6.2878 | 8.1
Aug-2001 | 154.00 | 13.41 | 78.1 | 16.91 | 100.992 | 6.2878 | 8.1
Sep-2001 | 151.00 | 13.17 | 81.18 | 17.79 | 97.313 | 8.0843 | 9.5
Oct-2001 | 158.00 | 13.54 | 83.89 | 18.51 | 96.623 | 8.9825 | 8.8
Nov-2001 | 149.00 | 15.34 | 88.06 | 18.76 | 138.632 | 9.8808 | 7.9
Dec-2001 | 151.00 | 13.86 | 100.19 | 16.71 | 76.552 | 20.6598 | 8.7
Jan-2002 | 161.00 | 14.01 | 86.33 | 17.17 | 116.264 | 3.8509 | 18.6
Feb-2002 | 141.00 | 13.27 | 68.49 | 16.12 | 80.532 | 1.9255 | 2.7
Mar-2002 | 157.00 | 16.02 | 93.78 | 17.79 | 109.261 | 4.8137 | 10.9
Apr-2002 | 153.00 | 16.11 | 92.81 | 18.99 | 106.414 | 5.7764 | 12.1
May-2002 | 161.00 | 15.89 | 91.61 | 19.33 | 104.102 | 7.7018 | 12.9
Jun-2002 | 154.00 | 16.63 | 93.63 | 17.95 | 106.328 | 7.7018 | 12.4
Jul-2002 | 158.00 | 15.67 | 87.16 | 17.16 | 118.668 | 6.7391 | 12.8
Aug-2002 | 158.00 | 16.52 | 90.9 | 17.73 | 112.877 | 6.7391 | 12.7
Sep-2002 | 155.00 | 16.60 | 93.51 | 18.99 | 110.027 | 8.6646 | 13.8
Oct-2002 | 161.00 | 16.55 | 94.88 | 19.6 | 113.483 | 9.6273 | 14.2
Nov-2002 | 155.00 | 16.27 | 100.35 | 18.99 | 116.555 | 10.5900 | 14.5
Dec-2002 | 155.00 | 18.64 | 114.63 | 19.83 | 134.253 | 22.1428 | 14.9
Jan-2003 | 156.00 | 16.48 | 96.31 | 19.31 | 119.035 | 3.4397 | 14.8
Feb-2003 | 144.00 | 15.82 | 85.39 | 18.45 | 109.467 | 1.7198 | 19.8
Mar-2003 | 168.00 | 18.26 | 106.87 | 19.66 | 126.905 | 4.2996 | 16.9
Apr-2003 | 161.00 | 18.71 | 107.33 | 19.81 | 122.338 | 5.1595 | 14.9
May-2003 | 173.00 | 19.98 | 109.11 | 18.32 | 118.852 | 6.8794 | 13.7
Jun-2003 | 170.00 | 19.71 | 114.44 | 18.81 | 124.23 | 6.8794 | 16.9
Jul-2003 | 171.00 | 19.80 | 109.93 | 20.3 | 137.808 | 6.0194 | 16.5
Aug-2003 | 172.00 | 21.10 | 107.19 | 20.82 | 139.595 | 6.0194 | 17.1
Sep-2003 | 165.00 | 20.43 | 105.61 | 21.19 | 127.588 | 7.7393 | 16.3
Oct-2003 | 171.00 | 21.59 | 114.71 | 21.75 | 130.042 | 8.5992 | 17.2
Nov-2003 | 165.00 | 21.88 | 122.38 | 21 | 139.692 | 9.4591 | 17.9
Dec-2003 | 175.00 | 22.06 | 135.97 | 22.32 | 146.574 | 19.7782 | 18.1
Jan-2004 | 176.00 | 21.18 | 96.95 | 22.52 | 128.997 | 3.6055 | 7.2
Feb-2004 | 164.00 | 22.25 | 110.02 | 22.03 | 140.763 | 1.8028 | 23.2
Mar-2004 | 183.00 | 23.83 | 128.13 | 22.31 | 150.094 | 4.5069 | 19.4
Apr-2004 | 173.00 | 22.18 | 131.29 | 22.07 | 141.276 | 5.4083 | 19.1
May-2004 | 184.00 | 24.01 | 128.72 | 23.09 | 138.066 | 7.2110 | 17.5
Jun-2004 | 179.00 | 23.76 | 135.41 | 22.51 | 136.882 | 7.2110 | 16.2
Jul-2004 | 186.00 | 24.57 | 131.5 | 22.92 | 158.449 | 6.3097 | 15.5
Aug-2004 | 190.00 | 25.48 | 132.55 | 23.48 | 153.315 | 6.3097 | 15.9
Sep-2004 | 184.00 | 26.65 | 134.06 | 22.51 | 144.397 | 8.1124 | 16.1
Oct-2004 | 191.00 | 27.59 | 140.18 | 23.17 | 150.663 | 9.0138 | 15.7
Nov-2004 | 184.00 | 28.12 | 143.7 | 23.54 | 153.755 | 9.9152 | 14.8
Dec-2004 | 184.00 | 27.76 | 152.7 | 24.47 | 173.514 | 20.7317 | 14.4
Jan-2005 | 221.00 | 26.11 | 128.79 | 24.26 | 165.049 | 5.4572 | 20.9
Feb-2005 | 198.00 | 25.78 | 111.38 | 22.28 | 137.827 | 2.7286 | 7.6
Mar-2005 | 223.00 | 30.65 | 140.02 | 24.68 | 170.209 | 6.8216 | 15.1
Apr-2005 | 222.00 | 29.50 | 143.87 | 23.94 | 154.817 | 8.1859 | 16
May-2005 | 225.00 | 30.79 | 149.89 | 24.56 | 155.01 | 10.9145 | 16.6
Jun-2005 | 219.00 | 30.30 | 158.6 | 23.8 | 157.722 | 10.9145 | 16.8
Jul-2005 | 225.00 | 30.55 | 151.76 | 24.62 | 174.186 | 9.5502 | 16.1
Aug-2005 | 225.00 | 32.54 | 153.96 | 24.23 | 172.639 | 9.5502 | 16
Sep-2005 | 222.00 | 32.61 | 157.64 | 24.62 | 163.134 | 12.2788 | 16.5
Oct-2005 | 234.00 | 33.10 | 163.55 | 24.25 | 160.383 | 13.6431 | 16.1
Nov-2005 | 232.00 | 33.33 | 170.43 | 24.21 | 170.241 | 15.0074 | 16.6
Dec-2005 | 249.00 | 35.92 | 180.02 | 24.68 | 204.506 | 31.3791 | 16.5
Jan-2006 | 222.00 | 32.47 | 129.23 | 24.82 | 175.483 | 6.5712 | 12.6
Feb-2006 | 211.00 | 31.82 | 136.97 | 23.66 | 166.949 | 5.9510 | 20.1
Mar-2006 | 240.00 | 37.74 | 166.22 | 25.2 | 184.531 | 14.1421 | 17.8
Apr-2006 | 236.00 | 38.28 | 174.12 | 25.02 | 177.99 | 12.3059 | 16.6
May-2006 | 246.00 | 40.29 | 173.28 | 25.21 | 176.842 | 19.4986 | 17.9
Jun-2006 | 241.00 | 41.35 | 179.67 | 25.22 | 182.893 | 17.8245 | 19.5
Jul-2006 | 243.00 | 37.94 | 168.42 | 25 | 202.797 | 13.7011 | 16.7
Aug-2006 | 250.00 | 39.06 | 174.37 | 25.14 | 220.153 | 17.1325 | 15.7
Sep-2006 | 246.00 | 40.08 | 181.28 | 25.05 | 191.91 | 17.0605 | 16.1
Oct-2006 | 256.00 | 41.58 | 186.55 | 25.65 | 195.287 | 18.0716 | 14.7
Nov-2006 | 250.00 | 44.32 | 189.67 | 26.1 | 210.564 | 22.6269 | 14.9
Dec-2006 | 239.00 | 41.92 | 203.89 | 26.95 | 233.479 | 40.4718 | 14.7
Jan-2007 | 261.00 | 40.44 | 174.79 | 26.44 | 223.904 | 7.8542 | 12.6
Feb-2007 | 232.00 | 38.57 | 142.03 | 24.94 | 171.976 | 6.5746 | 12.6
Mar-2007 | 257.00 | 47.34 | 177.76 | 27.11 | 223.65 | 12.2157 | 17.6
Apr-2007 | 259.00 | 45.86 | 188.47 | 26.66 | 213.56 | 14.6766 | 17.4
May-2007 | 266.00 | 46.89 | 191.15 | 27.96 | 221.12 | 17.7775 | 18.1
Jun-2007 | 262.00 | 51.15 | 202.4 | 27.97 | 223.52 | 18.3120 | 19.4
Jul-2007 | 268.00 | 48.44 | 195.61 | 27.51 | 245.32 | 19.8683 | 18
Aug-2007 | 265.00 | 48.28 | 193.97 | 27.51 | 240.81 | 15.4582 | 17.5
Sep-2007 | 257.00 | 50.57 | 203.55 | 26.95 | 223.51 | 24.7323 | 18.9
Oct-2007 | 272.00 | 49.03 | 198.77 | 27.62 | 222.28 | 21.1956 | 17.9
Nov-2007 | 261.00 | 48.23 | 211.92 | 27.67 | 237.49 | 25.1356 | 17.3
Dec-2007 | 271.00 | 49.81 | 218.5 | 29.06 | 254.12 | 67.7690 | 17.4
Jan-2008 | 224.01 | 45.92 | 187.36 | 28.44 | 249.743 | 6.6502 | 15.4
Feb-2008 | 213.11 | 43.13 | 170.01 | 27.37 | 202.987 | 8.8449 | 15.4
Mar-2008 | 239.10 | 51.89 | 211.29 | 28.89 | 254.61 | 15.9165 | 17.8
Apr-2008 | 232.13 | 51.41 | 213.34 | 27.38 | 242.47 | 18.4833 | 15.7
May-2008 | 237.91 | 53.69 | 227 | 27.78 | 242.25 | 26.8180 | 16
Jun-2008 | 232.28 | 53.93 | 239.36 | 29.61 | 222.923 | 27.1270 | 16
Jul-2008 | 237.18 | 51.83 | 220.29 | 30.31 | 265.051 | 29.9385 | 14.7
Aug-2008 | 244.14 | 48.03 | 222.14 | 29.19 | 240.126 | 36.0258 | 12.8
Sep-2008 | 234.35 | 45.33 | 228.7 | 28.25 | 222.43 | 41.9183 | 11.4
Oct-2008 | 233.41 | 42.76 | 219.28 | 29.79 | 212.16 | 43.9156 | 8.2
Nov-2008 | 206.24 | 42.84 | 226.97 | 27.27 | 203.506 | 49.1718 | 5.4
Dec-2008 | 205.46 | 51.02 | 219.94 | 27.17 | 227.484 | 109.6393 | 5.7
Jan-2009 | 212.01 | 44.22 | 172.34 | 25.77 | 204.6 | 42.1759 | 11
Feb-2009 | 199.93 | 46.13 | 196.56 | 25.8 | 203.661 | 33.2695 | 11
Mar-2009 | 225.16 | 54.38 | 233.43 | 29.37 | 246.185 | 33.2695 | 8.3
Apr-2009 | 220.21 | 52.25 | 229.8 | 29.43 | 218.44 | 42.5924 | 7.3
May-2009 | 230.72 | 57.36 | 248.41 | 31.19 | 220.014 | 50.9313 | 8.9
Jun-2009 | 228.23 | 62.14 | 279.09 | 31.92 | 246.316 | 62.9944 | 10.7
Jul-2009 | 238.74 | 61.36 | 257.82 | 33.11 | 263.955 | 49.7445 | 10.8
Aug-2009 | 242.01 | 62.24 | 260.75 | 32.56 | 271.687 | 50.7265 | 12.3
Sep-2009 | 236.24 | 61.72 | 263.16 | 32.83 | 255.387 | 55.4840 | 13.9
Oct-2009 | 247.24 | 63.60 | 273.07 | 33.29 | 255.14 | 54.9122 | 16.1
Nov-2009 | 236.25 | 62.93 | 288.94 | 33.36 | 282.943 | 64.4570 | 19.2
Dec-2009 | 246.01 | 64.11 | 280.62 | 34.6 | 313.094 | 194.0332 | 18.5
Jan-2010 | 257.68 | 61.77 | 256.1 | 33.7 | 312.8 | 32.1840 | 12.8
Feb-2010 | 234.50 | 55.59 | 212.98 | 31.91 | 210.29 | 21.7840 | 12.8
Mar-2010 | 264.87 | 68.39 | 279.81 | 34.56 | 294.77 | 35.3916 | 18.1
Apr-2010 | 252.61 | 68.68 | 269.29 | 34.41 | 280.98 | 53.0815 | 17.8
May-2010 | 263.56 | 71.86 | 283.86 | 35.79 | 275.78 | 57.2350 | 16.5
Jun-2010 | 254.61 | 72.40 | 296.92 | 35.35 | 257.97 | 71.6981 | 13.7
Jul-2010 | 261.96 | 67.19 | 291.91 | 35.28 | 280.51 | 56.7347 | 13.4
Aug-2010 | 260.80 | 69.71 | 301.38 | 34.73 | 308.03 | 77.2319 | 13.9
Sep-2010 | 252.87 | 64.26 | 276.02 | 34.91 | 261.24 | 85.9048 | 13.3
Oct-2010 | 263.80 | 64.44 | 300.66 | 37.04 | 255.3 | 87.7507 | 13.1
Nov-2010 | 255.24 | 66.01 | 337.58 | 36.66 | 282.08 | 108.7950 | 13.3
Dec-2010 | 259.60 | 65.98 | 327.57 | 38.72 | 305.59 | 145.1310 | 13.5
Jan-2011 | 276.30 | 67.33 | 268.26 | 37.19 | 311.33 | 40.4357 | 14.9
Feb-2011 | 250.88 | 63.53 | 247.49 | 35.21 | 262.39 | 30.3833 | 14.9
Mar-2011 | 281.10 | 77.17 | 307.36 | 37.66 | 326.93 | 50.7976 | 14.8
Apr-2011 | 266.76 | 72.63 | 318.08 | 37.19 | 309.3 | 51.6567 | 13.4
May-2011 | 281.07 | 77.99 | 327.3 | 38.47 | 317.83 | 46.7197 | 13.3
Jun-2011 | 273.68 | 78.75 | 340.2 | 35.56 | 315.47 | 58.1232 | 15.1
Jul-2011 | 278.91 | 76.60 | 327.83 | 37.49 | 342.07 | 44.3029 | 14
Aug-2011 | 277.60 | 77.19 | 327.4 | 36.79 | 354.82 | 35.3043 | 13.5
Sep-2011 | 269.90 | 76.09 | 318 | 36.1 | 313.08 | 37.7091 | 13.8
Oct-2011 | 278.88 | 72.88 | 330 | 37.11 | 299.31 | 33.5598 | 13.2
Nov-2011 | 269.71 | 69.97 | 321 | 37.87 | 308.67 | 62.6549 | 12.4
Dec-2011 | 276.56 | 71.17 | 311 | 39.23 | 352.53 | 139.3327 | 12.8
Jan-2012 | 278.31 | 68.02 | 256 | 39.68 | 294.31 | 12.2822 | 21.3
Feb-2012 | 259.08 | 71.27 | 292 | 36.86 | 314.57 | 17.6494 | 21.3
Mar-2012 | 286.28 | 83.17 | 295 | 38.37 | 351.91 | 29.7075 | 11.9
Apr-2012 | 276.83 | 79.98 | 310 | 36.95 | 305.15 | 29.9584 | 9.3
May-2012 | 287.59 | 81.56 | 315 | 38.33 | 311.69 | 40.0564 | 9.6
Jun-2012 | 263.37 | 83.44 | 315 | 35.98 | 293.61 | 48.0970 | 9.5
Jul-2012 | 253.33 | 81.52 | 310 | 37.6 | 329.11 | 47.9632 | 9.2
Aug-2012 | 251.95 | 78.74 | 307 | 37.74 | 327.4 | 45.7842 | 8.9
Sep-2012 | 253.77 | 80.58 | 315 | 38.76 | 287.65 | 72.6580 | 9.2
Oct-2012 | 267.68 | 81.17 | 315 | 39.91 | 293.9 | 81.0127 | 9.6
Nov-2012 | 268.56 | 80.96 | 329.8 | 41.61 | 320.44 | 81.8002 | 10.1
Dec-2012 | 276.72 | 81.45 | 319.8 | 43.12 | 356.96 | 145.0460 | 10.3
Jan-2013 | 281.18 | 81.60 | 310 | 42.89 | 371.46 | 20.9936 | 8.9
Feb-2013 | 250.98 | 76.67 | 240 | 37.76 | 271.52 | 16.6321 | 8.9
Mar-2013 | 276.51 | 89.61 | 290 | 40.83 | 352.7 | 29.3815 | 8.9
Apr-2013 | 254.96 | 87.61 | 305 | 38.32 | 330.76 | 43.8956 | 9.3
May-2013 | 262.20 | 91.19 | 307 | 39.06 | 325.64 | 46.7074 | 9.2
Jun-2013 | 252.65 | 90.83 | 304 | 39.6 | 324.009 | 58.3211 | 8.9
Jul-2013 | 259.72 | 90.75 | 300 | 40.3 | 375.095 | 45.8139 | 9.7
Aug-2013 | 265.93 | 91.94 | 300 | 39.74 | 395.108 | 51.5394 | 10.4
Sep-2013 | 269.87 | 93.55 | 315 | 38.65 | 332.389 | 56.4273 | 10.2
Oct-2013 | 284.04 | 92.81 | 320 | 41.08 | 339.731 | 59.5572 | 10.3
Nov-2013 | 273.06 | 90.32 | 320a | 40.17 | 358.533 | 82.1137 | 10
Dec-2013 | 282.81 | 90.41 | 320a | 42.02 | 400.846 | 154.3621 | 9.7
Jan-2014 | 276.61 | 87.08 | 290 | 38.61 | 361.393 | 20.0205 | 9
Feb-2014 | 233.74 | 78.65 | 245 | 40.17 | 322.516 | 20.0205 | 8.6
Mar-2014 | 262.20 | 95.07 | 305 | 41.92 | 375.48 | 24.9900 | 8.7
Apr-2014 | 245.63 | 92.50 | 301 | 39.58 | 341.19 | 42.8600 | 8.7
May-2014 | 258.23 | 96.82 | 300 | 40.33 | 340.27 | 37.9710 | 8.7
Jun-2014 | 247.86 | 98.04 | 298 | 41.83 | 346.06 | 58.6730 | 9.2
Jul-2014 | 252.46 | 94.76 | 301 | 41.08 | 366.4 | 62.0220 | 9
Aug-2014 | 261.05 | 94.98 | 302 | 41.39 | 352.8 | 68.9660 | 6.9
Sep-2014 | 253.87 | 95.75 | 292 | 42.02 | 314.646 | 89.6480 | 8
Oct-2014 | 264.37 | 95.25 | 291 | 43.51 | 320.6 | 86.7260 | 7.7
Nov-2014 | 253.85 | 92.05 | 330 | 42.25 | 345.5 | 80.1580 | 7.2
Dec-2014 | 254.71 | 98.22 | 330b | 44.58 | 398 | 188.0780 | 7.9
Jan-2015 | 254.17 | 88.31 | 290 | 42.94 | 403.05 | 30.1200 | 6.8
Feb-2015 | 212.13 | 79.76 | 240 | 39.7 | 291.43 | 16.4760 | 6.8
Mar-2015 | 236.34 | 97.56 | 305 | 44.69 | 349.72 | 42.6590 | 5.6
Apr-2015 | 222.74 | 96.41 | 298.02 | 43.13 | 340.9 | 50.6390 | 5.9
May-2015 | 229.46 | 98.48 | 309.39 | 43.92 | 344.38 | 54.6410 | 6.1
Jun-2015 | 220.03 | 98.43 | 326.72 | 43.35 | 336.35 | 63.9650 | 6.8
Jul-2015 | 225.23 | 92.30 | 306.66 | 43.54 | 365.77 | 78.8580 | 6
Aug-2015 | 224.22 | 94.50 | 308.63 | 44.34 | 377.79 | 69.4280 | 6.1
Sep-2015 | 216.40 | 94.69 | 312.68 | 44.43 | 314.59 | 73.7710 | 5.7
Oct-2015 | 225.12 | 94.27 | 316.95 | 44.25 | 310.72 | 74.8830 | 5.6
Nov-2015 | 215.36 | 93.96 | 320.23 | 43.92 | 353.18 | 90.6180 | 6.2
Dec-2015 | 227.04 | 95.38 | 316.59 | 45.83 | 385.59 | 189.4740 | 5.9
Jan-2016 | 281.26 | 81.14 | 256.73 | 43.54 | 157.18 | 20.67 | 5.9
Feb-2016 | 235.47 | 81.14 | 256.73 | 43.54 | 157.18 | 19.23 | 6.8
Mar-2016 | 274.33 | 99.23 | 293.80 | 44.91 | 364.19 | 42.70 | 6.8
Apr-2016 | 261.00 | 96.68 | 268.03 | 44.75 | 328.94 | 61.90 | 6
May-2016 | 265.75 | 99.46 | 263.75 | 44.23 | 330.00 | 73.00 | 6
Jun-2016 | 257.00 | 100.72 | 277.54 | 45.08 | 345.67 | 86.50 | 6.2
Jul-2016 | 263.49 | 95.94 | 270.01 | 45.32 | 388.85 | 65.73 | 6
Aug-2016 | 279.31 | 97.91 | 278.09 | 44.28 | 413.77 | 63.60 | 6.3
Sep-2016 | 285.60 | 98.09 | 276.96 | 43.80 | 361.23 | 109.01 | 6.1
Oct-2016 | 307.21 | 97.68 | 281.85 | 47.05 | 355.53 | 81.06 | 6.1
Nov-2016 | 304.93 | 95.40 | 308.01 | 45.77 | 379.7 | 76.51 | 6.2
Dec-2016 | 315.79 | 95.711 | 31097.7 | 47.822 | 423.58 | 101.591 | 6

  a. Coal outputs for Nov and Dec 2013 were not published, and the missing data are replaced with data from Oct 2013.
  b. Coal output for Dec 2014 was not published, and the missing data are replaced with data from Nov 2014.
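The rule in these footnotes, replacing an unpublished month with the most recent published value, is often called carrying the last observation forward. A minimal sketch follows; the function name `carry_forward` and the use of `None` to mark an unpublished month are assumptions for this example. The sample series is the coal output column for Sep to Dec 2013.

```python
def carry_forward(values):
    """Replace each missing entry (None) with the most recent known value."""
    filled = []
    last_known = None
    for value in values:
        if value is None:
            if last_known is None:
                raise ValueError("series starts with a missing value")
            value = last_known
        filled.append(value)
        last_known = value
    return filled


# Coal output (Mt) for Sep-Dec 2013; Nov and Dec were not published.
coal_2013 = [315.0, 320.0, None, None]
filled_2013 = carry_forward(coal_2013)  # [315.0, 320.0, 320.0, 320.0]
```

The filled values match the entries marked a in the table. Carrying a value forward flattens the series over the imputed months, which is worth keeping in mind when interpreting the model’s sensitivity to coal output in late 2013 and in Dec 2014.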

Copyright information

© 2021 Springer Nature Switzerland AG

Cite this chapter

Wang, Y., Lu, X. (2021). Random Forest Regression Model Application for Prediction of China’s Railway Freight Volume. In: Hernández, J.E., Li, D., Jimenez-Sanchez, J.E., Cedillo-Campos, M.G., Wenping, L. (eds) Collaborative Logistics and Intermodality. Springer, Cham. https://doi.org/10.1007/978-3-030-50958-3_6
