An integrated space–time framework for linkage discovery of big survey data

Spatial Information Research

Abstract

In the realm of survey research, establishing connections within large datasets remains a challenge. This study aims to unveil underlying connections within extensive survey data, emphasizing the need for a more integrated approach to decipher intricate relationships among survey elements. Utilizing computational semantics, machine learning, and advanced spatiotemporal models, we developed an all-encompassing database. This novel database is adept at extracting and characterizing features from a multitude of survey studies, spotlighting relationships among metadata elements such as terms, variables, and topics. The derived relationships are systematically stored as connectivity matrices. These matrices not only quantify the degree of interconnectedness among features but also provide insights into their complex interplay. As a result, our system functions akin to a digital geographical data librarian. Beyond merely serving as a storage tool, this system facilitates interdisciplinary research. It equips researchers with the capability to discern connections between survey elements, enabling them to identify the most influential paths among features based on diverse criteria. Such a tool fosters cross-disciplinary integration and unveils potential ties between seemingly unrelated survey attributes, paving the way for breakthroughs in understanding and application.
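
The abstract does not specify how the connectivity matrices are computed or how influential paths are ranked, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes a toy, hypothetical mapping from surveys to metadata terms, uses Jaccard overlap of the surveys in which two terms appear as the connectivity weight, and treats the "most influential path" as the path that maximizes the product of edge weights (found with Dijkstra's algorithm on negative log weights). All survey names, terms, and function names are illustrative.

```python
import heapq
import math
from itertools import combinations

# Hypothetical toy metadata: survey name -> set of terms it contains.
surveys = {
    "housing_2010": {"income", "rent", "segregation"},
    "health_2012": {"income", "stress", "asthma"},
    "school_2014": {"segregation", "stress", "grades"},
    "climate_2016": {"asthma", "heat"},
}


def connectivity_matrix(surveys):
    """Return (terms, matrix); matrix[i][j] is the Jaccard overlap of the
    sets of surveys in which term i and term j occur (assumed measure)."""
    terms = sorted(set().union(*surveys.values()))
    occurs = {t: {s for s, ts in surveys.items() if t in ts} for t in terms}
    n = len(terms)
    matrix = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        a, b = occurs[terms[i]], occurs[terms[j]]
        weight = len(a & b) / len(a | b)
        matrix[i][j] = matrix[j][i] = weight  # symmetric by construction
    return terms, matrix


def strongest_path(terms, matrix, source, target):
    """Find the path maximizing the product of edge weights, using
    Dijkstra's algorithm on -log(weight). Returns (path, strength)."""
    index = {t: i for i, t in enumerate(terms)}
    start, goal = index[source], index[target]
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == goal:
            break
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in enumerate(matrix[u]):
            if v == u or w <= 0.0:
                continue  # no connection between these terms
            nd = d - math.log(w)
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if goal not in dist:
        return None, 0.0
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    path.reverse()
    return [terms[i] for i in path], math.exp(-dist[goal])


terms, matrix = connectivity_matrix(surveys)
path, strength = strongest_path(terms, matrix, "rent", "heat")
print(path, round(strength, 4))
```

In this toy data, "rent" and "heat" never appear in the same survey, so their direct connectivity is zero; the sketch nonetheless links them through the intermediate terms "income" and "asthma". This is the kind of indirect tie between seemingly unrelated attributes that the abstract describes. Other weighting schemes or path criteria could be substituted without changing the overall structure.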

Data availability

Data sharing is not applicable to this article as no new data were created or analyzed in this study.

Funding

National Science Foundation, 2112356, 2122054, and 2232533, Xinyue Ye.

Author information

Corresponding author

Correspondence to Xinyue Ye.

Ethics declarations

Conflict of interest

The authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest (such as honoraria; educational grants; participation in speakers’ bureaus; membership, employment, consultancies, stock ownership, or other equity interest; and expert testimony or patent-licensing arrangements), or non-financial interest (such as personal or professional relationships, affiliations, knowledge or beliefs) in the subject matter or materials discussed in this manuscript.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ye, X., Lian, X., Xu, H. et al. An integrated space–time framework for linkage discovery of big survey data. Spat. Inf. Res. 32, 195–206 (2024). https://doi.org/10.1007/s41324-023-00553-x

  • DOI: https://doi.org/10.1007/s41324-023-00553-x
