
Inter-Generational Family Reconstitution with Enriched Ontologies

  • Conference paper
Advances in Conceptual Modeling (ER 2021)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 13012)

Abstract

Enriching ontologies can measurably enhance research in digital curation. We support this claim by using an enriched ontology to address a well-known, challenging problem: record linkage of historical records for intergenerational family reconstitution. An enriched ontology enables extraction of birth, death, and marriage records via linguistic grounding, curation of record-comprising information with pragmatic constraints and cultural normatives, and record linkage by evidential reasoning. The result is an automatic and highly accurate reconstruction of family trees. Empirical evidence shows that conceptual modeling theory can be applied to important real-world problems and yield excellent results.


Notes

  1. In [8], 880 personas of 9,279 were determined to have matches. From this training set, weights were estimated (e.g. \(4.6_{0908}\) for Birth Year, \(4.8_{9474}\) for Father’s Surname, \(0.0_{0176}\) for Birth Town). Lawson et al. argue that these weights should be universal, depending only on the chosen set of attributes. The technique for computing the weights is described by White [14].

References

  1. Abramitzky, R., Mill, R., Perez, S.: Linking individuals across historical sources: a fully automated approach (2018). Working Paper No. 1031


  2. Bailey, M.J., Cole, C., Henderson, M., Massey, C.: How well do automated linking methods perform? Lessons from U.S. historical data. J. Econ. Lit. 58(4), 997–1044 (2020). https://doi.org/10.1257/jel.20191526. https://www.aeaweb.org/articles?id=10.1257/jel.20191526

  3. Embley, D., Liddle, S., Park, J.: Increasing the quality of extracted information by reading between the lines. In: Comyn-Wattiau, I., du Mouza, C., Prat, N. (eds.) Ingénierie et management des systèmes d’information–Mélanges en l’honneur de Jacky Akoka. Éditions Cépaduès, Toulouse (2016)


  4. Embley, D., Nagy, G.: Green interaction for extracting family information from OCR’d books. In: Proceedings of the 13th IAPR International Workshop on Document Analysis Systems, DAS 2018, pp. 127–132. IEEE Computer Society, Vienna, March 2018


  5. Feigenbaum, J.: A machine learning approach to census record linking (2016). http://scholar.harvard.edu/files/jfeigenbaum/files/feigenbaumcensuslink

  6. Friedrichs, E., Pech, A.: Familienbuch des Kirchspiels Flögeln: bestehend aus den Dörfern Flögeln und Fickmühlen; vom Beginn der Kirchenbücher 1700 bis 1900. Deutsche Ortssippenbücher. Reihe A, E. Friedrichs, Bremerhaven (2000)


  7. Grant, F.: Index to The Register of Marriages and Baptisms in the Parish of Kilbarchan, 1649–1772. J. Skinner & Company LTD., Edinburgh (1912)


  8. Lawson, J., White, D., Price, B., Yamagata, R.: Probabilistic record linkage for genealogical research. Brigham Young Univ. Stud. 41(2), 161–174 (2002)


  9. Miller Funeral Home Records, 1917–1950, Greenville, Ohio. Darke County Ohio Genealogical Society, Greenville, Ohio (1990)


  10. Nagy, G.: Green information extraction from family books. SN Comput. Sci. 1(23), 1–23 (2019). https://doi.org/10.1007/s42979-019-0024-x


  11. Newcombe, H., Kennedy, J., Axford, S., James, A.: Automatic linkage of vital records. Science 130, 954–959 (1959)


  12. Packer, T.L., Embley, D.W.: Cost effective ontology population with data from lists in OCRed historical documents. In: Frinken, V., Barrett, B., Manmatha, R., Märgner, V. (eds.) HIP2013 Proceedings, pp. 44–52. ACM (2013)


  13. Vanderpoel, G.: The Ely Ancestry: Lineage of RICHARD ELY of Plymouth, England. The Calumet Press, New York (1902)


  14. White, D.: A review of the statistics of record linkage for genealogical research. In: Record Linkage Techniques–1997: Proceedings of an International Workshop and Exposition, pp. 362–373. National Academy Press, Washington DC, USA (1999)


  15. Wilkinson, M.D., Dumontier, M., et al.: The FAIR guiding principles for scientific data management and stewardship. Sci. Data 3, 1–9 (2016)


  16. Woodfield, S.N., Seeger, S., Litster, S., Liddle, S.W., Grace, B., Embley, D.W.: Ontological deep data cleaning. In: Trujillo, J.C., et al. (eds.) ER 2018. LNCS, vol. 11157, pp. 100–108. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-00847-5_9


Acknowledgements

We thank Emeritus Professor George Nagy, Rensselaer Polytechnic Institute, for the development of GreenQQ and gratefully acknowledge the work of Gary James (Jim) Norris, who created a complete extraction ground truth for Flögeln and developed GreenQQ templates to attain near 100% extraction accuracy.

Author information

Corresponding author

Correspondence to Stephen W. Liddle.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Embley, D.W., Liddle, S.W., Lonsdale, D.W., Woodfield, S.N. (2021). Inter-Generational Family Reconstitution with Enriched Ontologies. In: Reinhartz-Berger, I., Sadiq, S. (eds) Advances in Conceptual Modeling. ER 2021. Lecture Notes in Computer Science, vol 13012. Springer, Cham. https://doi.org/10.1007/978-3-030-88358-4_6

  • DOI: https://doi.org/10.1007/978-3-030-88358-4_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-88357-7

  • Online ISBN: 978-3-030-88358-4

  • eBook Packages: Computer Science, Computer Science (R0)
