Creating Automatic Connections for Personal Knowledge Management

  • Original Research
  • SN Computer Science

Abstract

The field of Personal Knowledge Management (PKM) has seen a surge in popularity in recent years. Natural Language Processing (NLP) and Large Language Models (LLMs) are also becoming mainstream, yet PKM has seen little integration with NLP. With this motivation, this article first introduces a methodology to automatically interconnect isolated text collections using NLP techniques combined with Knowledge Graphs. The text connections are generated by exploring the semantic relatedness of the texts and the concepts they share. The article then describes PKM Assistants that incorporate the methodology to help users understand and explore the knowledge contained in text collections using a Knowledge Management tool called Tana. It continues with an assessment of the methodology using a text collection composed of several books and the passages collected for each book. Finally, the article concludes with a discussion of the proposed methodology, with special attention to potential use cases.
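
As a rough illustration of the idea summarized above (connecting texts through their relatedness and the concepts they share), the following minimal sketch scores pairs of notes. It is not the article's implementation: the bag-of-words cosine is only a crude stand-in for a semantic relatedness measure, the concept sets are hand-written where a real pipeline would link text to Knowledge Graph entities, and every name and parameter (`connect`, `alpha`, `threshold`, the sample notes) is a hypothetical choice made for this example.

```python
import math
import re
from collections import Counter
from itertools import combinations


def term_vector(text):
    """Lowercased word counts; an assumed stand-in for a semantic embedding."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def jaccard(x, y):
    """Share of concepts two texts have in common."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def connect(notes, alpha=0.5, threshold=0.2):
    """Link note pairs whose blended score reaches the (assumed) threshold.

    notes maps a note id to (text, set of concept labels).
    alpha weights text relatedness against concept overlap.
    """
    vectors = {key: term_vector(text) for key, (text, _) in notes.items()}
    links = []
    for a, b in combinations(sorted(notes), 2):
        score = alpha * cosine(vectors[a], vectors[b]) + (1 - alpha) * jaccard(
            notes[a][1], notes[b][1]
        )
        if score >= threshold:
            links.append((a, b, round(score, 3)))
    return sorted(links, key=lambda link: -link[2])


# Hypothetical notes with hand-assigned concepts.
notes = {
    "n1": ("Notes are linked by ideas", {"Note-taking", "Zettelkasten"}),
    "n2": ("Linked notes form a second brain", {"Note-taking", "Memory"}),
    "n3": ("Bake bread at high heat", {"Baking"}),
}
links = connect(notes)
```

With these sample notes, only the pair `n1` and `n2` is connected, since the third note shares neither words nor concepts with the others. In practice the cosine over word counts would likely be replaced by a sentence-embedding similarity, and the weighting between the two signals would need tuning on real data.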

Data Availability Statement

Data is available.

Notes

  1. https://obsidian.md/.

  2. https://roamresearch.com/.

  3. https://tana.inc/.

  4. https://docs.deeppavlov.ai/en/0.9.0/features/models/ner.html.

  5. https://spacy.io/usage/large-language-models.

  6. https://spacy.io/usage/large-language-models.

  7. https://obsidian.md/.

  8. https://tana.inc/.

  9. https://readwise.io/.

  10. https://github.com/fisfraga/Automatic-Connections-PKM.

  11. https://juggl.io/.

  12. https://platform.openai.com/docs/.

  13. https://platform.openai.com/docs/.

Acknowledgements

This work was partly funded by FAPERJ under grant E-26/202.818/2017; by CAPES under grants 88881.310592-2018/01, 88881.134081/2016-01, and 88882.164913/2010-01; and by CNPq under grant 302303/2017-0.

Funding

The funding information is provided in the Acknowledgements above.

Author information

Contributions

The authors confirm contribution to the paper as follows: study conception and design: Felipe Poggi A. Fraga; analysis and interpretation of results: Felipe Poggi A. Fraga, Marcus Poggi; draft manuscript preparation: Marco A. Casanova, Luiz André P. Paes Leme. All authors reviewed the results and approved the final version of the manuscript.

Corresponding author

Correspondence to Felipe Poggi A. Fraga.

Ethics declarations

Conflict of Interest

All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.

Research Involving Humans and/or Animals

The article does not involve experiments with humans and/or animals.

Informed Consent

The article does not involve experiments with humans.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the topical collection “Recent Trends on Enterprise Information Systems” guest edited by Joaquim Filipe, Michał Śmiałek, Alexander Brodsky and Slimane Hammoudi.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Fraga, F.P.A., Poggi, M., Casanova, M.A. et al. Creating Automatic Connections for Personal Knowledge Management. SN COMPUT. SCI. 5, 525 (2024). https://doi.org/10.1007/s42979-024-02876-4

  • DOI: https://doi.org/10.1007/s42979-024-02876-4
