Abstract
This paper proposes a Coloured Petri Net model that captures the behaviour of vertical search engines. In such systems, a query submitted by a user passes through several stages and can be handled by three different kinds of nodes. The model has a modular design that accommodates alternative or additional search engine components. A performance evaluation study illustrates the use of the model and shows that it is suitable for rapidly exploring different scenarios and determining feasible search engine configurations.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gil-Costa, V., Lobos, J., Inostrosa-Psijas, A., Marin, M. (2012). Capacity Planning for Vertical Search Engines: An Approach Based on Coloured Petri Nets. In: Haddad, S., Pomello, L. (eds) Application and Theory of Petri Nets. PETRI NETS 2012. Lecture Notes in Computer Science, vol 7347. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31131-4_16
DOI: https://doi.org/10.1007/978-3-642-31131-4_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31130-7
Online ISBN: 978-3-642-31131-4
eBook Packages: Computer Science (R0)