Abstract
In this paper, we describe the development of a search engine and information retrieval system for Bangla. Existing work in this area assumes that a particular encoding is used or that particular facilities are available to the user. We wanted an implementation that requires no special features or optimizations on the user's end and performs equally well in all situations. To this end, we chose two case studies to work on in our effort to find a suitable solution. While working on these cases, we encountered several problems and had to work around them. We had to select, from a set of software packages, the one that would best serve our needs. We also had to consider user convenience, keeping in mind the diverse demographics of the people who might need such a system. Finally, we arrived at a system with all the desired features. Some possible future developments that came to mind in the course of our work are also mentioned in this paper.
Copyright information
© 2007 Springer
About this paper
Cite this paper
Haque, N., Ali, M.H., Abdullah, M.S., Khan, M. (2007). Infrastructure for Bangla Information retrieval in the context of ICT for Development. In: Elleithy, K. (eds) Advances and Innovations in Systems, Computing Sciences and Software Engineering. Springer, Dordrecht. https://doi.org/10.1007/978-1-4020-6264-3_57
DOI: https://doi.org/10.1007/978-1-4020-6264-3_57
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-6263-6
Online ISBN: 978-1-4020-6264-3
eBook Packages: Engineering, Engineering (R0)