Constructing computer virus phylogenies

  • Conference paper
Combinatorial Pattern Matching (CPM 1996)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 1075))

Abstract

There has been much recent algorithmic work on the problem of reconstructing the evolutionary history of biological species. Computer virus specialists are interested in finding the evolutionary history of computer viruses — a virus is often written using code fragments from one or more other viruses, which are its immediate ancestors. A phylogeny for a collection of computer viruses is a directed acyclic graph whose nodes are the viruses and whose edges map ancestors to descendants and satisfy the property that each code fragment is “invented” only once. To provide a simple explanation for the data, we consider the problem of constructing such a phylogeny with a minimum number of edges. This optimization problem is NP-hard, and we present positive and negative results for associated approximation problems. When tree solutions exist, they can be constructed and randomly sampled in polynomial time.
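The phylogeny condition described above can be illustrated with a small checker. This is a minimal sketch under one assumed formalization, not the authors' algorithm: each virus is modeled as a set of code fragments, and a virus is taken to "invent" a fragment when it contains the fragment but none of its immediate ancestors does. The function name `is_phylogeny` and the input encoding are hypothetical. The sketch only verifies a candidate graph; it does not attempt the NP-hard problem of minimizing the number of edges.

```python
from collections import deque


def is_phylogeny(viruses, edges):
    """Check a candidate phylogeny (assumed formalization, see lead-in).

    viruses: dict mapping virus name -> set of code fragments it contains.
    edges:   iterable of (ancestor, descendant) pairs over those names.
    """
    children = {v: set() for v in viruses}
    parents = {v: set() for v in viruses}
    for ancestor, descendant in edges:
        children[ancestor].add(descendant)
        parents[descendant].add(ancestor)

    # Acyclicity via Kahn's algorithm: every node must be removable.
    indegree = {v: len(parents[v]) for v in viruses}
    queue = deque(v for v in viruses if indegree[v] == 0)
    removed = 0
    while queue:
        v = queue.popleft()
        removed += 1
        for c in children[v]:
            indegree[c] -= 1
            if indegree[c] == 0:
                queue.append(c)
    if removed != len(viruses):
        return False

    # Each fragment must have exactly one inventor: a virus that holds the
    # fragment while none of its immediate ancestors holds it. In a DAG,
    # a unique inventor reaches every other holder through holders only.
    fragments = set().union(*viruses.values())
    for fragment in fragments:
        holders = {v for v, frags in viruses.items() if fragment in frags}
        inventors = [v for v in holders if not (parents[v] & holders)]
        if len(inventors) != 1:
            return False
    return True


# Hypothetical example: three viruses sharing fragments x, y, z.
viruses = {"A": {"x"}, "B": {"x", "y"}, "C": {"x", "y", "z"}}
chain = [("A", "B"), ("B", "C")]
star = [("A", "B"), ("A", "C")]
```

With this encoding, `is_phylogeny(viruses, chain)` accepts the chain A → B → C, while `is_phylogeny(viruses, star)` rejects the star because fragment `y` would be invented independently by both B and C.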

Part of this work was performed at Sandia National Laboratories and was supported by the U.S. Department of Energy under contract DE-AC04-76AL85000. Part of this work was supported by the ESPRIT Basic Research Action Programme of the EC under contract 7141 (project ALCOM-IT).

This work was performed under U.S. Department of Energy contract DE-AC04-76AL85000.

Editor information

Dan Hirschberg, Gene Myers

Copyright information

© 1996 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Goldberg, L.A., Goldberg, P.W., Phillips, C.A., Sorkin, G.B. (1996). Constructing computer virus phylogenies. In: Hirschberg, D., Myers, G. (eds) Combinatorial Pattern Matching. CPM 1996. Lecture Notes in Computer Science, vol 1075. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-61258-0_19

  • DOI: https://doi.org/10.1007/3-540-61258-0_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-61258-2

  • Online ISBN: 978-3-540-68390-2

  • eBook Packages: Springer Book Archive
