Abstract
Knowledge of how migratory bird species use wetlands during their annual life cycle is of great interest to biologists, because it informs decisions such as where to establish conservation sites and how to control avian influenza. The raw data on habitat areas and migration routes, collected by GPS satellite telemetry, are typically large and complex. In this paper, we recast these biological problems as computational ones and introduce efficient algorithms for the data analysis. Our key ideas are hierarchical clustering to localize migration habitats and association rules to discover migration routes from scattered location points in the GIS. One of our clustering results is a tree structure, which we call a spatial-tree, that serves as an illustrative map of the breeding and wintering home ranges of bar-headed geese. A related result is an association pattern indicating that the autumn migration routes of bar-headed geese most likely run between the breeding sites at Qinghai Lake, China, and the wintering sites in the Tibet river valley. Given the susceptibility of geese to H5N1, and on the basis of the chronology and rates of bar-headed goose migration movements, we conjecture that bar-headed geese play an important role in spreading the H5N1 virus at a regional scale on the Qinghai-Tibetan Plateau.
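The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes single-linkage agglomerative clustering on great-circle distance and a plain support/confidence count over per-bird site sequences. The function names (haversine_km, single_linkage, rule_stats), the 50 km merge threshold, and the coordinates are all hypothetical and chosen only for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def single_linkage(points, max_gap_km):
    """Agglomerative clustering: repeatedly merge the two closest clusters
    until the closest pair is farther apart than max_gap_km (an assumed cutoff).
    Returns one integer cluster label per input point."""
    clusters = [[i] for i in range(len(points))]

    def gap(a, b):
        # Single linkage: distance between the nearest members of a and b.
        return min(haversine_km(points[i], points[j]) for i in a for j in b)

    while len(clusters) > 1:
        d, i, j = min((gap(clusters[i], clusters[j]), i, j)
                      for i in range(len(clusters))
                      for j in range(i + 1, len(clusters)))
        if d > max_gap_km:
            break
        clusters[i].extend(clusters.pop(j))  # j > i, so index i stays valid

    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for idx in members:
            labels[idx] = label
    return labels


def rule_stats(sequences, origin, destination):
    """Support and confidence of the rule 'origin -> destination', where a
    sequence satisfies the rule if destination occurs after origin."""
    n_origin = n_both = 0
    for seq in sequences:
        if origin in seq:
            n_origin += 1
            if destination in seq[seq.index(origin) + 1:]:
                n_both += 1
    support = n_both / len(sequences)
    confidence = n_both / n_origin if n_origin else 0.0
    return support, confidence


# Toy GPS fixes (made-up values, not data from the study): three points near
# one lake and two points roughly a thousand kilometres away.
points = [(36.9, 100.1), (37.0, 100.2), (36.8, 100.0), (29.7, 91.1), (29.6, 91.0)]
labels = single_linkage(points, max_gap_km=50.0)

# Hypothetical per-bird sequences of visited clusters, in time order.
sequences = [[0, 1], [0, 1], [0, 1], [0], [1]]
support, confidence = rule_stats(sequences, origin=0, destination=1)
```

Here the first three fixes fall into one cluster and the last two into another, and the rule "cluster 0 then cluster 1" is held by three of the five toy birds. A real analysis would need a faster clustering method than this cubic-time loop and thresholds chosen from the telemetry data.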
References
Agrawal R, Srikant R (1994) Fast algorithms for mining association rules. In: Proceedings of the 20th international conference on Very Large Data Bases (VLDB’94), Santiago, Chile, pp 487–499
Agrawal R, Srikant R (1995) Mining sequential patterns. In: Proceedings of the 11th international conference on Data Engineering (ICDE’95), Taipei, Taiwan, pp 3–14
Ankerst M, Breunig MM, Kriegel H-P, Sander J (1999) OPTICS: ordering points to identify the clustering structure. ACM SIGMOD Rec 28(2): 49–60
Ball GH, Hall DJ (1965) ISODATA: a novel method of data analysis and pattern classification. Technical report, Stanford Research Institute, Menlo Park, CA, p 66
Bekkerman R, Scholz M, Viswanathan K (2009) Improving clustering stability with combinatorial MRFs. In: The 15th ACM conference on knowledge discovery and data mining (SIGKDD09), pp 99–107
Berthold P, Terrill SB (1991) Recent advances in studies of bird migration. Ann Rev Ecol Syst 22: 357–378
Brecheisen S, Kriegel H-P, Pfeifle M (2006) Multi-step density-based clustering. Knowl Inf Syst 9(3): 284–308
Brown JD, Stallknecht DE, Swayne DE (2008) Experimental infection of swans and geese with highly pathogenic avian influenza virus (H5N1) of Asian lineage. Emerg Infect Dis 14: 136–142
Chen K, Liu L (2009) “Best K”: critical clustering structure in categorical datasets. Knowl Inf Syst 20(1): 1–33
Chen H, Smith GJ, Zhang SY, Qin K, Wang J, Li KS, Webster RG, Peiris JS, Guan Y (2005) Avian flu: H5N1 virus outbreak in migratory waterfowl. Nature 436: 191–192
Dong G, Li J (2005) Mining border descriptions of emerging patterns from dataset pairs. Knowl Inf Syst 8: 178–202
Ertöz L, Steinbach M, Kumar V (2001) Finding topics in collections of documents: a shared nearest neighbor approach. In: Proceedings of Text Mine’01, First SIAM international conference on Data Mining (SDM’01), Chicago, IL, USA
Ester M, Kriegel H-P, Sander J, Xu X (1996) A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proceedings of the 2nd international conference on knowledge discovery and data mining. Portland, OR, pp 226–231
Ester M, Kriegel H-P, Sander J, Wimmer M, Xu X (1998) Incremental clustering for mining in a data warehousing environment. In: Proceedings of the 24th international conference on Very Large Data Bases (VLDB’98), New York, USA, pp 323–333
Guha S, Rastogi R, Shim K (1998) CURE: an efficient clustering algorithm for large databases. In: Proceedings of the 1998 ACM SIGMOD international conference on Management of Data (SIGMOD’98), New York, NY, USA, pp 73–84
Han J, Pei J, Yin Y, Mao R (2004) Mining frequent patterns without candidate generation: a frequent-pattern tree approach. Data Min Knowl Discov 8(1): 53–87
Higuchi H (1991) Cooperative work on crane migration from Japan to the U.S.S.R. through Korea and China. In: Salathé T (ed) Conserving migratory birds. International Council for Bird Preservation, Cambridge, pp 189–201
Hiroto S, Tamura M, Higuchi H (2004) Migration routes and important stopover sites of endangered oriental white storks Ciconia boyciana as revealed by satellite tracking. Mem Natl Inst Polar Res (special issue) 58: 162–178
Kamal RP, Tosh C, Pattnaik B, Behera P, Nagarajan S, Gounalan S, Shrivastava N, Shankar BP, Pradhan HK (2007) Analysis of the PB2 gene reveals that Indian H5N1 influenza virus belongs to a mixed-migratory bird sub-lineage possessing the amino acid lysine at position 627 of the PB2 protein. Arch Virol 152: 1637–1644
Kandylas V, Upham SP, Ungar LH (2008) Finding cohesive clusters for analyzing knowledge communities. Knowl Inf Syst 17(3): 335–354
Kaufman L, Rousseeuw PJ (1990) Finding groups in data: an introduction to cluster analysis. John Wiley, New York
Koperski K, Han J (1995) Discovery of spatial association rules in geographic information databases. In: Proceedings of the 4th international symposium on large spatial databases (SSD95), pp 47–66
Kou Z, Li Y, Yin Z, Guo S, Wang M et al (2009) The survey of H5N1 flu virus in wild birds in 14 provinces of China from 2004 to 2007. PLoS ONE 4(9): e6926. doi:10.1371/journal.pone.0006926
Lei F, Tang S, Zhao D, Zhang X, Kou Z, Li Y, Zhang Z, Yin Z, Chen S, Li S, Zhang D, Yan B, Li T (2007) Characterization of H5N1 influenza viruses isolated from migratory birds in Qinghai province of China in 2006. Avian Dis 51: 568–572
Liu J et al (2005) Highly pathogenic H5N1 influenza virus infection in migratory birds. Science 309(5738): 1206
Halkidi M, Batistakis Y, Vazirgiannis M (2001) On clustering validation techniques. J Intell Inf Syst 17: 107–145
Mathevet R, Tamisier A (2002) Creation of a nature reserve, its effects on hunting management and waterfowl distribution in the Camargue southern France. Biodiv Conserv 11: 509–519
Miyabayashi Y, Mundkur T (1999) Atlas of key sites for Anatidae in the East Asian Flyway. Wetlands International - Asia Pacific, Tokyo, Japan, and Kuala Lumpur, Malaysia. Available at http://www.jawgp.org/anet/aaa1999/aaaendx.htm. Accessed 11 March 2008
Muzaffar SB, Johny T (2008) Seasonal movements and migration of Pallas’s Gulls Larus ichthyaetus from Qinghai Lake, China. Forktail 24: 100–107
Newman SH, Iverson SA et al (2009) Migration of whooper swans and outbreaks of highly pathogenic avian influenza H5N1 virus in Eastern Asia. PLoS ONE 4(5)
Ng RT, Han J (1994) Efficient and effective clustering methods for spatial data mining. In: Proceedings of the 20th international conference on Very Large Data Bases (VLDB’94), Santiago, Chile, pp 144–155
Pei J, Han J, Mortazavi-Asl B, Pinto H (2001) PrefixSpan: mining sequential patterns efficiently by prefix-projected pattern growth. In: Proceedings of the 2001 international conference on Data Engineering (ICDE’01), pp 214–224
Pei J, Dong G, Zou W, Han J (2004) Mining condensed frequent-pattern bases. Knowl Inf Syst 6: 570–594
Sabirovic M, Wilesmith J et al (2006) Defra. Situation analysis—outbreaks of HPAI H5N1 virus in Europe during 2005/2006— an overview and commentary. In: International Animal Health Division, 1A Page Street, London, SW1P 4PQ, United Kingdom. Version 1, Released 30 June 2006, p 40
Shan G, Li J, Lin Q (2005) Introduction to ACM international collegiate programming contest, 2nd edn. pp 100–102 (in Chinese)
Sturm-Ramirez KM, Hulse-Post DJ, Govorkova EA, Humberd J, Seiler P et al (2005) Are ducks contributing to the endemicity of highly pathogenic H5N1 influenza virus in Asia? J Virol 79: 11269–11279
Tang M, Zhou Y, Cui P, Wang W, Li J, Hou Y-S, Yan B (2009) Discovery of migration habitats and routes of wild bird species by clustering and association analysis. In: The 5th international conference on advanced data mining and applications. LNAI 5678, pp 288–301
Tang M, Wang W, Jiang Y, Zhou Y, Li J, Cui P, Liu Y, Yan B (2010) Birds bring flues? Mining frequent and high weighted cliques from birds migration networks. In: The 15th international conference on Database Systems for Advanced Applications (DASFAA2010), LNCS 5982, pp 359–370
Wang G, Zhan D, Li L, Lei F, Liu B, Liu D, Xiao H, Feng Y, Li J, Yang B, Yin Z, Song X, Zhu X, Cong Y, Pu J, Wang J, Liu J, Gao GF, Zhu Q (2008) H5N1 avian influenza re-emergence of Lake Qinghai: phylogenetic and antigenic analyses of the newly isolated viruses and roles of migratory birds in virus circulation. J Gen Virol 89: 697–702
Webster RG, Peiris M, Chen H, Guan Y (2006) H5N1 outbreaks and enzootic influenza. Emerg Infect Dis 12: 3–8
Weisstein EW (1994) Spherical polygon. From MathWorld, a Wolfram Web Resource. http://mathworld.wolfram.com/SphericalPolygon.html
Worton BJ (1989) Kernel methods for estimating the utilization distribution in home-range studies. Ecology 70: 164–168
Zaki M (1998) Efficient enumeration of frequent sequences. In: The 7th international conference on information and knowledge management, Washington DC, pp 68–75
Zeng X, Pei J, Wang K, Li J (2009) PADS: a simple yet effective pattern-aware dynamic search method for fast maximal frequent pattern mining. Knowl Inf Syst 20(3): 375–391
Zhang FY, Yang RL (1997) Bird migration research of China. China Forestry Publishing House, Beijing
Zhang T, Ramakrishnan R, Livny M (1996) BIRCH: an efficient data clustering method for very large databases. ACM SIGMOD Rec 25(2): 103–114
Zhou JY, Shen HG, Chen HX, Tong GZ, Liao M, Yang HC, Liu JX (2006) Characterization of a highly pathogenic H5N1 influenza virus derived from bar-headed geese in China. J Gen Virol 87: 1823–1833
Zhou A, Cao F, Qian W, Jin C (2008) Tracking clusters in evolving data streams over sliding windows. Knowl Inf Syst 15(2): 181–214
Additional information
Parts of this paper appeared in the Proceedings of the 2009 ADMA Conference [37].
Cite this article
Tang, M., Zhou, Y., Li, J. et al. Exploring the wild birds’ migration data for the disease spread study of H5N1: a clustering and association approach. Knowl Inf Syst 27, 227–251 (2011). https://doi.org/10.1007/s10115-010-0308-x