A Term Distribution Visualization Approach to Digital Forensic String Search

  • Moses Schwartz
  • L. M. Liebrock
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5210)


Digital forensic string search is vital to the forensic discovery process, but there has been little research on improving tools or methods for this task. We propose the use of term distribution visualizations to aid digital forensic string search tasks. Our visualization model enables an analyst to quickly identify relevant sections of a text and provides brushing and drilling-down capabilities to support analysis of large datasets. Initial user study results suggest that the visualizations are useful for information retrieval tasks, but further studies must be performed to obtain statistically significant results and to determine specific utility in digital forensic investigations.
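
As a rough illustration of what a term distribution is, the sketch below counts how often a search term occurs in each equal-width slice of a text and renders the counts as a one-line intensity bar. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `term_distribution` and `render_bar`, the fixed number of bins, and the character shading scale are all hypothetical choices made for this example.

```python
import re


def term_distribution(text, term, num_bins=10):
    """Count case-insensitive hits of `term` in each of `num_bins` equal-width slices of `text`.

    Hypothetical helper for illustration; the binning scheme is an assumption.
    """
    if num_bins <= 0:
        raise ValueError("num_bins must be positive")
    counts = [0] * num_bins
    if not text or not term:
        return counts
    for match in re.finditer(re.escape(term), text, flags=re.IGNORECASE):
        # Map the character offset of the hit to a bin index.
        bin_index = min(match.start() * num_bins // len(text), num_bins - 1)
        counts[bin_index] += 1
    return counts


def render_bar(counts, shades=" .:*#"):
    """Render counts as one character per bin; denser characters mean more hits."""
    peak = max(counts) if counts else 0
    if peak == 0:
        return shades[0] * len(counts)
    top = len(shades) - 1
    # Ceiling division so that any non-zero count gets a visible character.
    return "".join(shades[(c * top + peak - 1) // peak] for c in counts)


if __name__ == "__main__":
    sample = "x" * 40 + "key" + "x" * 57
    print("[" + render_bar(term_distribution(sample, "key")) + "]")
```

Calling `term_distribution` again on the slice of text behind a single bin gives a finer-grained view of that region, which is one simple way to approximate drilling down.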


Keywords: term distribution visualizations, digital forensics, text string search





Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Moses Schwartz (1)
  • L. M. Liebrock (1)
  1. New Mexico Institute of Mining and Technology, Socorro, USA
