Secure and Trustworthy Provenance Collection for Digital Forensics

  • Adam Bates
  • Devin J. Pohly
  • Kevin R. B. Butler


Data provenance refers to the establishment of a chain of custody for information that can describe its generation and all subsequent modifications that have led to its current state. Such information can be invaluable for a forensics investigator. The first step toward using provenance for forensic purposes is ensuring that it is collected in a secure and trustworthy fashion. However, the collection process alone raises several significant challenges. In this chapter, we discuss past approaches to provenance collection, from the application level to the operating system level, and promote the notion of a provenance monitor to assure the complete collection of data. We examine two instantiations of the provenance monitor concept, the Hi-Fi and Linux Provenance Module systems, discussing the details of their design and implementation to demonstrate the complexity of collecting full provenance information. We consider the security of these schemes and raise challenges that future provenance systems must address to be maximally useful for practical forensic use.


Keywords: User Space; Trusted Platform Module; Data Provenance; Provenance System; Digital Signature Algorithm



This work draws in part from [5, 38, 48]. We would like to thank our co-authors of those works, including Patrick McDaniel, Thomas Moyer, Stephen McLaughlin, Erez Zadok, Marianne Winslett, and Radu Sion, as well as reviewers of those original papers who provided us with valuable feedback. This work is supported in part by the U.S. National Science Foundation under grants CNS-1540216, CNS-1540217, and CNS-1540128.


  1. Aldeco-Pérez, R., Moreau, L.: Provenance-based auditing of private data use. In: Proceedings of the 2008 International Conference on Visions of Computer Science: BCS International Academic Conference, VoCS’08, pp. 141–152. British Computer Society, Swinton, UK (2008)
  2. Bates, A., Mood, B., Valafar, M., Butler, K.: Towards secure provenance-based access control in cloud environments. In: Proceedings of the 3rd ACM Conference on Data and Application Security and Privacy, CODASPY ’13, pp. 277–284. ACM, New York, NY, USA (2013). doi: 10.1145/2435349.2435389
  3. Bates, A., Butler, K., Haeberlen, A., Sherr, M., Zhou, W.: Let SDN be your eyes: secure forensics in data center networks. In: NDSS Workshop on Security of Emerging Network Technologies, SENT (2014)
  4. Bates, A., Butler, K.R.B., Moyer, T.: Take only what you need: leveraging mandatory access control policy to reduce provenance storage costs. In: Proceedings of the 7th International Workshop on Theory and Practice of Provenance, TaPP’15 (2015)
  5. Bates, A., Tian, D., Butler, K.R.B., Moyer, T.: Trustworthy whole-system provenance for the Linux kernel. In: Proceedings of the 2015 USENIX Security Symposium (Security’15). Washington, DC, USA (2015)
  6. Bellare, M., Canetti, R., Krawczyk, H.: Keyed hash functions and message authentication. In: Proceedings of Crypto’96, LNCS, vol. 1109, pp. 1–15 (1996)
  7. Boneh, D., Lynn, B., Shacham, H.: Short signatures from the Weil pairing. In: Boyd, C. (ed.) Advances in Cryptology, ASIACRYPT (2001)
  8. Carata, L., Akoush, S., Balakrishnan, N., Bytheway, T., Sohan, R., Seltzer, M., Hopper, A.: A primer on provenance. Commun. ACM 57(5), 52–60 (2014). doi: 10.1145/2596628
  9. Catalano, D., Di Raimondo, M., Fiore, D., Gennaro, R.: Off-line/On-line signatures: theoretical aspects and experimental results. In: PKC’08: Proceedings of the Practice and Theory in Public Key Cryptography. 11th International Conference on Public Key Cryptography, pp. 101–120. Springer, Berlin, Heidelberg (2008)
  10. Centers for Medicare & Medicaid Services: The health insurance portability and accountability act of 1996 (HIPAA) (1996)
  11. Chapman, A., Jagadish, H., Ramanan, P.: Efficient provenance storage. In: Proceedings of the 2008 ACM Special Interest Group on Management of Data Conference, SIGMOD’08 (2008)
  12. Chiticariu, L., Tan, W.C., Vijayvargiya, G.: DBNotes: a post-it system for relational databases based on provenance. In: Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data, SIGMOD’05 (2005)
  13. Clark, D.D., Wilson, D.R.: A comparison of commercial and military computer security policies. In: Proceedings of the IEEE Symposium on Security and Privacy. Oakland, CA, USA (1987)
  14. Department of Homeland Security: A Roadmap for Cybersecurity Research (2009)
  15. Edwards, A., Jaeger, T., Zhang, X.: Runtime verification of authorization hook placement for the Linux security modules framework. In: Proceedings of the 9th ACM Conference on Computer and Communications Security, CCS’02 (2002)
  16. Even, S., Goldreich, O., Micali, S.: On-line/off-line digital signatures. In: Proceedings on Advances in Cryptology, CRYPTO ’89, pp. 263–275. Springer, New York, USA (1989)
  17. Foster, I.T., Vöckler, J.S., Wilde, M., Zhao, Y.: Chimera: a virtual data system for representing, querying, and automating data derivation. In: Proceedings of the 14th Conference on Scientific and Statistical Database Management, SSDBM’02 (2002)
  18. Frew, J., Bose, R.: Earth system science workbench: a data management infrastructure for earth science products. In: Proceedings of the 13th International Conference on Scientific and Statistical Database Management, pp. 180–189. IEEE Computer Society (2001)
  19. Ganapathy, V., Jaeger, T., Jha, S.: Automatic placement of authorization hooks in the Linux security modules framework. In: Proceedings of the 12th ACM Conference on Computer and Communications Security, CCS ’05, pp. 330–339. ACM, New York, USA (2005). doi: 10.1145/1102120.1102164
  20. Gao, C.Z., Yao, Z.A.: A further improved online/offline signature scheme. Fundam. Inf. 91, 523–532 (2009)
  21. Gehani, A., Tariq, D.: SPADE: support for provenance auditing in distributed environments. In: Proceedings of the 13th International Middleware Conference, Middleware ’12 (2012)
  22. Glavic, B., Alonso, G.: Perm: processing provenance and data on the same data model through query rewriting. In: Proceedings of the 25th IEEE International Conference on Data Engineering, ICDE ’09 (2009)
  23. Hall, E.: The Arnolfini Betrothal: Medieval Marriage and the Enigma of Van Eyck’s Double Portrait. University of California Press, Berkeley, CA (1994)
  24. Hasan, R., Sion, R., Winslett, M.: The case of the fake Picasso: preventing history forgery with secure provenance. In: Proceedings of the 7th USENIX Conference on File and Storage Technologies, FAST’09. San Francisco, CA, USA (2009)
  25. Hicks, B., Rueda, S., St.Clair, L., Jaeger, T., McDaniel, P.: A logical specification and analysis for SELinux MLS policy. ACM Trans. Inf. Syst. Secur. 13(3), 26:1–26:31 (2010). doi: 10.1145/1805874.1805982
  26. Holland, D.A., Braun, U., Maclean, D., Muniswamy-Reddy, K.K., Seltzer, M.I.: Choosing a data model and query language for provenance. In: Proceedings of the 2nd International Provenance and Annotation Workshop, IPAW’08 (2008)
  27. Jaeger, T., Edwards, A., Zhang, X.: Consistency analysis of authorization hook placement in the Linux security modules framework. ACM Trans. Inf. Syst. Secur. 7(2), 175–205 (2004). doi: 10.1145/996943.996944
  28. Jones, S.N., Strong, C.R., Long, D.D.E., Miller, E.L.: Tracking emigrant data via transient provenance. In: 3rd Workshop on the Theory and Practice of Provenance, TAPP’11 (2011)
  29. Kent, S., Atkinson, R.: RFC 2406: IP Encapsulating Security Payload (ESP) (1998)
  30. Lamport, L.: Time, clocks, and the ordering of events in a distributed system. Commun. ACM 21(7), 558–565 (1978). doi: 10.1145/359545.359563
  31. Lampson, B.W.: A note on the confinement problem. Commun. ACM 16(10), 613–615 (1973)
  32. Lee, K.H., Zhang, X., Xu, D.: High accuracy attack provenance via binary-based execution partition. In: Proceedings of the 20th ISOC Network and Distributed System Security Symposium, NDSS (2013)
  33. Lee, K.H., Zhang, X., Xu, D.: LogGC: garbage collecting audit log. In: Proceedings of the 2013 ACM Conference on Computer and Communications Security, CCS (2013)
  34. Lyle, J., Martin, A.: Trusted computing and provenance: better together. In: 2nd Workshop on the Theory and Practice of Provenance, TaPP’10 (2010)
  35. Ma, S., Lee, K.H., Kim, C.H., Rhee, J., Zhang, X., Xu, D.: Accurate, low cost and instrumentation-free security audit logging for Windows. In: Proceedings of the 31st Annual Computer Security Applications Conference, ACSAC 2015, pp. 401–410. ACM (2015). doi: 10.1145/2818000.2818039
  36. Ma, S., Zhang, X., Xu, D.: ProTracer: towards practical provenance tracing by alternating between logging and tainting. In: Proceedings of the 23rd ISOC Network and Distributed System Security Symposium, NDSS (2016)
  37. Macko, P., Seltzer, M.: A general-purpose provenance library. In: 4th Workshop on the Theory and Practice of Provenance, TaPP’12 (2012)
  38. McDaniel, P., Butler, K., McLaughlin, S., Sion, R., Zadok, E., Winslett, M.: Towards a secure and efficient system for end-to-end provenance. In: Proceedings of the 2nd Conference on Theory and Practice of Provenance. USENIX Association, San Jose, CA, USA (2010)
  39. Metasploit Project.
  40. Moreau, L., Groth, P., Miles, S., Vazquez-Salceda, J., Ibbotson, J., Jiang, S., Munroe, S., Rana, O., Schreiber, A., Tan, V., Varga, L.: The provenance of electronic data. Commun. ACM 51(4), 52–58 (2008)
  41. Mouallem, P., Barreto, R., Klasky, S., Podhorszki, N., Vouk, M.: Tracking files in the Kepler provenance framework. In: SSDBM 2009: Proceedings of the 21st International Conference on Scientific and Statistical Database Management (2009)
  42. Muniswamy-Reddy, K.K., Holland, D.A., Braun, U., Seltzer, M.: Provenance-aware storage systems. In: Proceedings of the 2006 USENIX Annual Technical Conference (2006)
  43. Muniswamy-Reddy, K.K., Braun, U., Holland, D.A., Macko, P., Maclean, D., Margo, D., Seltzer, M., Smogor, R.: Layering in provenance systems. In: Proceedings of the 2009 Conference on USENIX Annual Technical Conference, ATC’09 (2009)
  44. Nguyen, D., Park, J., Sandhu, R.: Dependency path patterns as the foundation of access control in provenance-aware systems. In: Proceedings of the 4th USENIX Conference on Theory and Practice of Provenance, TaPP’12, p. 4. USENIX Association, Berkeley, CA, USA (2012)
  45. Ni, Q., Xu, S., Bertino, E., Sandhu, R., Han, W.: An access control language for a general provenance model. In: Secure Data Management (2009)
  46. Pancerella, C., Hewson, J., Koegler, W., Leahy, D., Lee, M., Rahn, L., Yang, C., Myers, J.D., Didier, B., McCoy, R., Schuchardt, K., Stephan, E., Windus, T., Amin, K., Bittner, S., Lansing, C., Minkoff, M., Nijsure, S., von Laszewski, G., Pinzon, R., Ruscic, B., Wagner, A., Wang, B., Pitz, W., Ho, Y.L., Montoya, D., Xu, L., Allison, T.C., Green Jr., W.H., Frenklach, M.: Metadata in the collaboratory for multi-scale chemical science. In: Proceedings of the 2003 International Conference on Dublin Core and Metadata Applications: Supporting Communities of Discourse and Practice - Metadata Research & Applications, pp. 13:1–13:9. Dublin Core Metadata Initiative (2003)
  47. Park, J., Nguyen, D., Sandhu, R.: A provenance-based access control model. In: Proceedings of the 10th Annual International Conference on Privacy, Security and Trust (PST), pp. 137–144 (2012). doi: 10.1109/PST.2012.6297930
  48. Pohly, D.J., McLaughlin, S., McDaniel, P., Butler, K.: Hi-Fi: collecting high-fidelity whole-system provenance. In: Proceedings of the 2012 Annual Computer Security Applications Conference, ACSAC ’12. Orlando, FL, USA (2012)
  49. Postel, J.: RFC 791: Internet Protocol (1981)
  50. Revkin, A.C.: Hacked E-mail is new fodder for climate dispute. New York Times 20 (2009)
  51. Sailer, R., Zhang, X., Jaeger, T., van Doorn, L.: Design and implementation of a TCG-based integrity measurement architecture. In: Proceedings of the 13th USENIX Security Symposium. San Diego, CA, USA (2004)
  52. Sar, C., Cao, P.: Lineage file system (2005)
  53. Shamir, A., Tauman, Y.: Improved online/offline signature schemes. In: Advances in Cryptology, CRYPTO 2001 (2001)
  54. Silva, C.T., Anderson, E.W., Santos, E., Freire, J.: Using VisTrails and provenance for teaching scientific visualization. Comput. Graph. Forum 30(1), 75–84 (2011)
  55. Sion, R.: Strong WORM. In: Proceedings of the 28th International Conference on Distributed Computing Systems (2008)
  56. Spillane, R.P., Sears, R., Yalamanchili, C., Gaikwad, S., Chinni, M., Zadok, E.: Story book: an efficient extensible provenance framework. In: First Workshop on the Theory and Practice of Provenance. USENIX (2009)
  57. Sundararaman, S., Sivathanu, G., Zadok, E.: Selective versioning in a secure disk system. In: Proceedings of the 17th USENIX Security Symposium (2008)
  58. Symantec: Symantec security response (2015)
  59. The Netfilter Core Team: The netfilter project: packet mangling for Linux 2.4 (1999)
  60. U.S. Code: 22 U.S. Code §2778: control of arms exports and imports (1976)
  61. Xie, Y., Muniswamy-Reddy, K.K., Long, D.D.E., Amer, A., Feng, D., Tan, Z.: Compressing provenance graphs. In: Proceedings of the 3rd USENIX Workshop on the Theory and Practice of Provenance (2011)
  62. Xie, Y., Feng, D., Tan, Z., Chen, L., Muniswamy-Reddy, K.K., Li, Y., Long, D.D.: A hybrid approach for efficient provenance storage. In: Proceedings of the 21st ACM International Conference on Information and Knowledge Management, CIKM ’12 (2012)
  63. Zanussi, T., Yaghmour, K., Wisniewski, R., Moore, R., Dagenais, M.: Relayfs: an efficient unified approach for transmitting data from kernel to user space. In: Proceedings of the 2003 Linux Symposium, pp. 494–506. Ottawa, ON, Canada (2003)
  64. Zhang, X., Edwards, A., Jaeger, T.: Using CQUAL for static analysis of authorization hook placement. In: Proceedings of the 11th USENIX Security Symposium (2002)
  65. Zhou, W., Sherr, M., Tao, T., Li, X., Loo, B.T., Mao, Y.: Efficient querying and maintenance of network provenance at internet-scale. In: Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data (2010)
  66. Zhou, W., Fei, Q., Narayan, A., Haeberlen, A., Loo, B.T., Sherr, M.: Secure network provenance. In: ACM Symposium on Operating Systems Principles (SOSP) (2011)
  67. Zhou, W., Mapara, S., Ren, Y., Haeberlen, A., Ives, Z., Loo, B.T., Sherr, M.: Distributed time-aware provenance. In: Proceedings of VLDB (2013)

Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  • Adam Bates (1), corresponding author
  • Devin J. Pohly (2)
  • Kevin R. B. Butler (3)

  1. University of Illinois at Urbana-Champaign, Urbana, USA
  2. Pennsylvania State University, University Park, State College, USA
  3. University of Florida, Gainesville, USA
