Exploring the wild birds’ migration data for the disease spread study of H5N1: a clustering and association approach

  • Regular Paper
  • Knowledge and Information Systems

Abstract

Knowledge about how migratory bird species use wetlands during their annual life cycle is of great interest to biologists, as it is critically important in many decision-making processes, such as conservation site construction and avian influenza control. The raw data on habitat areas and migration routes, when collected by GPS satellite telemetry, are usually large in scale and highly complex. In this paper, we convert these biological problems into computational studies and introduce efficient algorithms for the data analysis. Our key ideas are hierarchical clustering for localizing migration habitats and association rules for discovering migration routes from scattered location points in the GIS. One of our clustering results is a tree structure, called a spatial-tree, which is an illustrative map depicting the breeding and wintering home ranges of bar-headed geese. A related result is an association pattern indicating that the autumn migration routes of bar-headed geese likely run between the breeding sites at Qinghai Lake, China, and the wintering sites in the Tibet river valley. Given the susceptibility of geese to H5N1, and on the basis of the chronology and rates of bar-headed goose migration movements, we conjecture that bar-headed geese play an important role in the spread of the H5N1 virus at a regional scale on the Qinghai-Tibetan Plateau.
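To make the two ideas in the abstract concrete, the following is a minimal sketch, not the authors' algorithm. It assumes plain single-linkage agglomerative clustering over great-circle distances as a stand-in for the paper's hierarchical clustering, and a basic support/confidence calculation as a stand-in for its association analysis. The function names, the distance cutoff, the coordinates, and the site labels are all hypothetical and chosen only for illustration; they are not taken from the paper's data.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def single_linkage(points, cut_km):
    """Merge the two closest clusters repeatedly until the closest pair
    is farther apart than cut_km. Returns clusters as lists of point indices."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > 1:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Single linkage: cluster distance is the closest pair of members.
            d = min(haversine_km(points[i], points[j])
                    for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        if d > cut_km:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe because a < b
    return clusters


def rule_stats(visits, src, dst):
    """Support and confidence of the rule src -> dst, where each element of
    visits is the set of sites one tracked bird was observed at."""
    n_src = sum(1 for v in visits if src in v)
    n_both = sum(1 for v in visits if src in v and dst in v)
    support = n_both / len(visits)
    confidence = n_both / n_src if n_src else 0.0
    return support, confidence


if __name__ == "__main__":
    # Hypothetical GPS fixes: three near one lake, two near a river valley.
    fixes = [(36.90, 100.10), (36.95, 100.20), (37.00, 100.15),
             (29.60, 91.10), (29.65, 91.20)]
    print(single_linkage(fixes, cut_km=50.0))

    # Hypothetical per-bird site sets after clustering.
    visits = [{"breeding", "wintering"}, {"breeding", "wintering"},
              {"breeding"}, {"breeding", "wintering"}, {"wintering"}]
    print(rule_stats(visits, "breeding", "wintering"))
```

Recording the order in which clusters merge would give a tree over the location points, which is the general shape of structure the abstract calls a spatial-tree. A high confidence for a rule such as breeding site to wintering site is the kind of signal an association pattern for a migration route would rest on.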

References

  1. Agrawal R, Srikant R (1994) Fast algorithms for mining association rules. In: Proceedings of the 20th international conference on Very Large Data Bases (VLDB’94), Santiago, Chile, pp 487–499

  2. Agrawal R, Srikant R (1995) Mining sequential patterns. In: Proceedings of the 11th international conference on Data Engineering (ICDE’95) Taipei, Taiwan, pp 3–14

  3. Ankerst M, Breunig MM, Kriegel H-P, Sander J (1999) OPTICS: ordering points to identify the clustering structure. ACM SIGMOD Rec 28(2): 49–60

  4. Ball GH, Hall DJ (1965) ISODATA: a novel method of data analysis and pattern classification. In: Technical report of Stanford Research Institute, Menlo Park, CA, Stanford Research Institute, pp 66

  5. Bekkerman R, Scholz M, Viswanathan K (2009) Improving clustering stability with combinatorial MRFs. In: The 15th ACM conference on knowledge discovery and data mining (SIGKDD09), pp 99–107

  6. Berthold P, Terrill SB (1991) Recent advances in studies of bird migration. Ann Rev Ecol Syst 22: 357–378

  7. Brecheisen S, Kriegel H-P, Pfeifle M (2006) Multi-step density-based clustering. Knowl Inf Syst 9(3): 284–308

  8. Brown JD, Stallknecht DE, Swayne DE (2008) Experimental infection of swans and geese with highly pathogenic avian influenza virus (H5N1) of Asian lineage. Emerg Infect Dis 14: 136–142

  9. Chen K, Liu L (2009) “Best K”: critical clustering structure in categorical datasets. Knowl Inf Syst 20(1): 1–33

  10. Chen H, Smith GJ, Zhang SY, Qin K, Wang J, Li KS, Webster RG, Peiris JS, Guan Y (2005) Avian flu: H5N1 virus outbreak in migratory waterfowl. Nature 436: 191–192

  11. Dong G, Li J (2005) Mining border descriptions of emerging patterns from dataset pairs. Knowl Inf Syst 8: 178–202

  12. Ertöz L, Steinbach M, Kumar V (2001) Finding topics in collections of documents: a shared nearest neighbor approach. In: Proceedings of Text Mine’01, First SIAM international conference on Data Mining (SDM’01), Chicago, IL, USA

  13. Ester M, Kriegel H-P, Sander J, Xu X (1996) A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proceedings of the 2nd international conference on knowledge discovery and data mining. Portland, OR, pp 226–231

  14. Ester M, Kriegel H-P, Sander J, Wimmer M, Xu X (1998) Incremental clustering for mining in a data warehousing environment. In: Proceedings of the 24th international conference on Very Large Data Bases (VLDB’98), New York, USA

  15. Guha S, Rastogi R, Shim K (1998) CURE: an efficient clustering algorithm for large databases. In: Proceedings of the 1998 ACM SIGMOD international conference on Management of Data (SIGMOD’98), New York, NY, USA, pp 73–84

  16. Han J, Pei J, Yin Y, Mao R (2004) Mining frequent patterns without candidate generation: a frequent-pattern tree approach. Data Min Knowl Discov 8(1): 53–87

  17. Higuchi H (1991) Cooperative work on crane migration from Japan to the U.S.S.R. through Korea and China. In: Salathé T (ed) Conserving migratory birds. International Council for Bird Preservation, Cambridge, pp 189–201

  18. Hiroto S, Tamura M, Higuchi H (2004) Migration routes and important stopover sites of endangered oriental white storks Ciconia boyciana as revealed by satellite tracking. Mem Natl Inst Polar Res (special issue) 58: 162–178

  19. Kamal RP, Tosh C, Pattnaik B, Behera P, Nagarajan S, Gounalan S, Shrivastava N, Shankar BP, Pradhan HK (2007) Analysis of the PB2 gene reveals that Indian H5N1 influenza virus belongs to a mixed-migratory bird sub-lineage possessing the amino acid lysine at position 627 of the PB2 protein. Arch Virol 152: 1637–1644

  20. Kandylas V, Upham SP, Ungar LH (2008) Finding cohesive clusters for analyzing knowledge communities. Knowl Inf Syst 17(3): 335–354

  21. Kaufman L, Rousseeuw PJ (1990) Finding groups in data: an introduction to cluster analysis. John Wiley, New York

  22. Koperski K, Han J (1995) Discovery of spatial association rules in geographic information databases. In: Proceedings of the 4th international symposium on large spatial databases (SSD95), pp 47–66

  23. Kou Z, Li Y, Yin Z, Guo S, Wang M et al (2009) The survey of H5N1 flu virus in wild birds in 14 provinces of China from 2004 to 2007. PLoS ONE 4(9): e6926. doi:10.1371/journal.pone.0006926

  24. Lei F, Tang S, Zhao D, Zhang X, Kou Z, Li Y, Zhang Z, Yin Z, Chen S, Li S, Zhang D, Yan B, Li T (2007) Characterization of H5N1 influenza viruses isolated from migratory birds in Qinghai province of China in 2006. Avian Dis 51: 568–572

  25. Liu J et al (2005) Highly pathogenic H5N1 influenza virus infection in migratory birds. Science 309(5738): 1206

  26. Halkidi M, Batistakis Y, Vazirgiannis M (2001) On clustering validation techniques. J Intell Inf Syst 17: 107–145

  27. Mathevet R, Tamisier A (2002) Creation of a nature reserve, its effects on hunting management and waterfowl distribution in the Camargue, southern France. Biodiv Conserv 11: 509–519

  28. Miyabayashi Y, Mundkur T (1999) Atlas of key sites for Anatidae in the East Asian Flyway. Tokyo, Japan, and Kuala Lumpur, Malaysia: Wetlands International Asia Pacific. Available at http://www.jawgp.org/anet/aaa1999/aaaendx.htm. Accessed 11 March 2008

  29. Muzaffar SB, Johny T (2008) Seasonal movements and migration of Pallas’s Gulls Larus ichthyaetus from Qinghai Lake, China. Forktail 24: 100–107

  30. Newman SH, Iverson SA et al (2009) Migration of whooper swans and outbreaks of highly pathogenic avian influenza H5N1 virus in Eastern Asia. PLoS ONE 4(5)

  31. Ng RT, Han J (1994) Efficient and effective clustering methods for spatial data mining. In: Proceedings of the 20th international conference on Very Large Data Bases (VLDB’94), Santiago, Chile, pp 144–155

  32. Pei J, Han J, Mortazavi-Asl B, Pinto H (2001) PrefixSpan: mining sequential patterns efficiently by prefix-projected pattern growth. In: Proceedings of the 2001 international conference on Data Engineering (ICDE’01), pp 214–224

  33. Pei J, Dong G, Zou W, Han J (2004) Mining condensed frequent-pattern bases. Knowl Inf Syst 6: 570–594

  34. Sabirovic M, Wilesmith J et al (2006) Defra. Situation analysis—outbreaks of HPAI H5N1 virus in Europe during 2005/2006— an overview and commentary. In: International Animal Health Division, 1A Page Street, London, SW1P 4PQ, United Kingdom. Version 1, Released 30 June 2006, p 40

  35. Shan G, Li J, Lin Q (2005) Introduction to ACM international collegiate programming contest, 2nd edn. pp 100–102 (in Chinese)

  36. Sturm-Ramirez KM, Hulse-Post DJ, Govorkova EA, Humberd J, Seiler P et al (2005) Are ducks contributing to the endemicity of highly pathogenic H5N1 influenza virus in Asia? J Virol 79: 11269–11279

  37. Tang M, Zhou Y, Cui P, Wang W, Li J, Hou Y-S, Yan B (2009) Discovery of migration habitats and routes of wild bird species by clustering and association analysis. In: The 5th international conference on advanced data mining and applications. LNAI 5678, pp 288–301

  38. Tang M, Wang W, Jiang Y, Zhou Y, Li J, Cui P, Liu Y, Yan B (2010) Birds bring flues? Mining frequent and high weighted cliques from birds migration networks. In: The 15th international conference on Database Systems for Advanced Applications (DASFAA2010), LNCS 5982, pp 359–370

  39. Wang G, Zhan D, Li L, Lei F, Liu B, Liu D, Xiao H, Feng Y, Li J, Yang B, Yin Z, Song X, Zhu X, Cong Y, Pu J, Wang J, Liu J, Gao GF, Zhu Q (2008) H5N1 avian influenza re-emergence of Lake Qinghai: phylogenetic and antigenic analyses of the newly isolated viruses and roles of migratory birds in virus circulation. J Gen Virol 89: 697–702

  40. Webster RG, Peiris M, Chen H, Guan Y (2006) H5N1 outbreaks and enzootic influenza. Emerg Infect Dis 12: 3–8

  41. Weisstein EW (1994) “Spherical Polygon.” From MathWorld, a Wolfram Web Resource. http://mathworld.wolfram.com/SphericalPolygon.html

  42. Worton BJ (1989) Kernel methods for estimating the utilization distribution in home-range studies. Ecology 70: 164–168

  43. Zaki M (1998) Efficient enumeration of frequent sequences. In: The 7th international conference on information and knowledge management, Washington DC, pp 68–75

  44. Zeng X, Pei J, Wang K, Li J (2009) PADS: a simple yet effective pattern-aware dynamic search method for fast maximal frequent pattern mining. Knowl Inf Syst 20(3): 375–391

  45. Zhang FY, Yang RL (1997) Bird migration research of China. China Forestry Publishing House, Beijing

  46. Zhang T, Ramakrishnan R, Livny M (1996) BIRCH: an efficient data clustering method for very large databases. ACM SIGMOD Rec 25(2): 103–114

  47. Zhou JY, Shen HG, Chen HX, Tong GZ, Liao M, Yang HC, Liu JX (2006) Characterization of a highly pathogenic H5N1 influenza virus derived from bar-headed geese in China. J Gen Virol 87: 1823–1833

  48. Zhou A, Cao F, Qian W, Jin C (2008) Tracking clusters in evolving data streams over sliding windows. Knowl Inf Syst 15(2): 181–214

Author information

Corresponding author

Correspondence to Yuanchun Zhou.

Additional information

Parts of this paper appeared in the Proceedings of the 2009 ADMA Conference [37].

About this article

Cite this article

Tang, M., Zhou, Y., Li, J. et al. Exploring the wild birds’ migration data for the disease spread study of H5N1: a clustering and association approach. Knowl Inf Syst 27, 227–251 (2011). https://doi.org/10.1007/s10115-010-0308-x

  • DOI: https://doi.org/10.1007/s10115-010-0308-x
