Data mining for unemployment rate prediction using search engine query data

  • Special Issue Paper
Service Oriented Computing and Applications

Abstract

Unemployment rate prediction has become critically important because it helps governments make decisions and design policies. Previous studies of unemployment rate prediction have relied mainly on traditional univariate time series models and econometric methods, which have attracted much attention from governments, organizations, research institutes, and scholars. Recently, novel methods using search engine query data have been proposed to forecast the unemployment rate. This paper presents a data mining framework that uses search engine query data for unemployment rate prediction. Under the framework, a set of data mining tools, including neural networks (NNs) and support vector regressions (SVRs), is developed to forecast the unemployment trend. The proposed method proceeds in five steps. First, search engine query data related to employment activities are extracted. Second, a feature selection model is applied to reduce the dimension of the query data. Third, various NNs and SVRs are employed to model the relationship between unemployment rate data and query data, and a genetic algorithm is used to optimize the parameters and refine the features simultaneously. Fourth, an appropriate data mining method is chosen as the selective predictor by cross-validation. Finally, the selective predictor with the best feature subset and proper parameters is used to forecast the unemployment trend. The empirical results show that the proposed framework clearly outperforms the traditional forecasting approaches, and that support vector regression with a radial basis function (RBF) kernel performs best for unemployment rate prediction. These findings imply that the data mining framework is effective for unemployment rate prediction and can strengthen the government’s capacity for quick response and service delivery.
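The third and fourth steps of the abstract (a genetic algorithm that tunes parameters and refines the feature subset at the same time, scored by cross-validation) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes synthetic data in place of query volumes, and it swaps in a simple RBF kernel smoother (Nadaraya-Watson) for the SVR so that the example needs only the Python standard library. All function names (`make_data`, `rbf_predict`, `cv_error`, `evolve`) and all numeric settings are hypothetical choices made for illustration.

```python
import math
import random


def make_data(n=40, n_features=6, seed=1):
    """Synthetic stand-in for query data: only features 0 and 2 drive the target."""
    rng = random.Random(seed)
    xs = [[rng.uniform(-1, 1) for _ in range(n_features)] for _ in range(n)]
    ys = [math.sin(2 * x[0]) + 0.5 * x[2] + rng.gauss(0, 0.05) for x in xs]
    return xs, ys


def rbf_predict(train_x, train_y, x, mask, gamma):
    """RBF-weighted average of training targets (a stand-in for SVR, not SVR itself)."""
    weights = []
    for row in train_x:
        # Squared distance over the selected features only.
        d2 = sum((a - b) ** 2 for a, b, m in zip(row, x, mask) if m)
        weights.append(math.exp(-gamma * d2))  # K(x, x') = exp(-gamma * ||x - x'||^2)
    total = sum(weights)
    if total == 0.0:  # all weights underflowed: fall back to the mean
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total


def cv_error(xs, ys, mask, gamma, folds=4):
    """Mean squared error over interleaved folds (fine for i.i.d. toy data;
    real time series would call for time-ordered splits)."""
    if not any(mask):
        return float("inf")  # an empty feature subset is never acceptable
    sq_err = 0.0
    for f in range(folds):
        train = [i for i in range(len(xs)) if i % folds != f]
        test = [i for i in range(len(xs)) if i % folds == f]
        tx = [xs[i] for i in train]
        ty = [ys[i] for i in train]
        for i in test:
            sq_err += (rbf_predict(tx, ty, xs[i], mask, gamma) - ys[i]) ** 2
    return sq_err / len(xs)


def evolve(xs, ys, n_features, generations=15, pop_size=12, seed=0):
    """Genetic search over (feature mask, log10 gamma); returns (error, mask, log_gamma)."""
    rng = random.Random(seed)

    def score(mask, log_gamma):
        return (cv_error(xs, ys, mask, 10.0 ** log_gamma), mask, log_gamma)

    def by_error(ind):
        return ind[0]

    pop = [score([rng.random() < 0.5 for _ in range(n_features)],
                 rng.uniform(-2.0, 2.0)) for _ in range(pop_size)]
    for _ in range(generations):
        children = [min(pop, key=by_error)]  # elitism: keep the current best
        while len(children) < pop_size:
            p1 = min(rng.sample(pop, 3), key=by_error)  # tournament selection
            p2 = min(rng.sample(pop, 3), key=by_error)
            # Uniform crossover on the mask, then bit-flip mutation.
            mask = [a if rng.random() < 0.5 else b for a, b in zip(p1[1], p2[1])]
            mask = [(not m) if rng.random() < 0.1 else m for m in mask]
            # Blend the parents' kernel widths and perturb.
            log_gamma = (p1[2] + p2[2]) / 2 + rng.gauss(0.0, 0.3)
            children.append(score(mask, log_gamma))
        pop = children
    return min(pop, key=by_error)


if __name__ == "__main__":
    xs, ys = make_data()
    err, mask, log_gamma = evolve(xs, ys, n_features=6)
    print("CV MSE:", round(err, 4), "mask:", mask, "gamma:", 10.0 ** log_gamma)
```

Encoding the feature mask and the kernel width in one chromosome is what lets a single search handle both at once. The cross-validated error then serves as a common yardstick, so the same loop could in principle be run once per candidate model and the lowest-error model kept as the selective predictor.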

Acknowledgments

This research work was partly supported by 973 Project (Grant No. 2012CB316205), National Natural Science Foundation of China (Grant No. 71001103) and Beijing Natural Science Foundation (No. 9122013).

Author information

Corresponding author

Correspondence to Wei Xu.

Appendix: The top 100 search engine query data

No.  Key words                           No.  Key words
1    filing unemployment                 51   ohio unemployment rate
2    unemployment filing for             52   unemployment ny
3    unemployment office                 53   unemployment compensation
4    file for unemployment               54   unemployment in az
5    unemployment file for               55   to apply for unemployment
6    unemployment state                  56   unemployment insurance claim
7    state of unemployment               57   unemployment department of labor
8    insurance unemployment              58   department of labor unemployment
9    washington unemployment             59   labor department unemployment
10   unemployment file                   60   unemployment check
11   unemployment insurance              61   unemployment for mn
12   unemployment apply                  62   unemployment in indiana
13   department of unemployment          63   unemployment in california
14   unemployment website                64   snag a job
15   unemployment application            65   unemployment grants
16   unemployment new york               66   unemployment in pennsylvania
17   washington state unemployment       67   unemployment benefit insurance
18   Wisconsin unemployment benefits     68   claim unemployment benefit
19   insurance for unemployment          69   part time unemployment
20   apply for unemployment              70   security jobs
21   unemployment claims                 71   new york unemployment benefit
22   unemployment apply for              72   unemployment insurance benefit
23   apply for unemployment              73   unemployment dol
24   unemployment ca                     74   unemployment info
25   unemployment services               75   unemployment commission
26   unemployment security               76   michigan unemployment benefits
27   unemployment                        77   weekly unemployment insurance
28   to file unemployment                78   weekly unemployment benefits
29   unemployment benefits               79   nyc unemployment benefits
30   file for unemployment online        80   green jobs
31   ohio unemployment benefits          81   how to claim unemployment
32   unemployment file claims            82   unemployment rate
33   to file for unemployment            83   unemployment insurance benefits
34   unemployment benefits pa            84   unemployment weekly benefits
35   unemployment benefit                85   online unemployment application
36   nys dept labor                      86   unemployment rate ny
37   state unemployment benefit          87   jobs in usa
38   connecticut unemployment benefits   88   new york unemployment benefits
39   dept of unemployment                89   benefits for unemployment
40   nys dept of labor                   90   police jobs
41   for unemployment benefits           91   dc unemployment
42   uimn.org                            92   unemployment in kansas
43   unemployment in michigan            93   mass unemployment benefits
44   unemployment benefit claim          94   unemployment online
45   unemployment payment                95   unemployment in florida
46   unemployment in colorado            96   eligible for unemployment
47   apply for unemployment online       97   benefits of unemployment insurance
48   unemployment benefits insurance     98   unemployment eligibility
49   application for unemployment        99   construction jobs
50   benefits unemployment insurance     100  unemployment rate recession

About this article

Cite this article

Xu, W., Li, Z., Cheng, C. et al. Data mining for unemployment rate prediction using search engine query data. SOCA 7, 33–42 (2013). https://doi.org/10.1007/s11761-012-0122-2

  • DOI: https://doi.org/10.1007/s11761-012-0122-2
