Statistical estimation of access frequencies in data broadcasting environments
In a data publishing environment, the server periodically broadcasts data to users according to a broadcast program. The program is constructed from knowledge of the access frequencies of the broadcast data, which is assumed to be available and accurate. For example, the program may broadcast frequently accessed data more often within a broadcast cycle. However, how to obtain such access frequencies remains an open question. The difficulty is that, in such an environment, mobile users only listen to the channel they are interested in and do not request data items from the server. A promising approach in the literature is to use broadcast misses to understand the access patterns in a data publishing environment. In this case, mobile users may decide whether to wait for the required item to arrive or to make an explicit request for it even though it will be published. However, estimates of access frequencies based on broadcast misses may be inaccurate, because the number of broadcast misses for a data item depends on how frequently that item is broadcast: if an item is broadcast more frequently than the others, it will incur few broadcast misses because its average waiting time is low. In this paper, we propose a statistical estimation model based on maximum likelihood estimation to estimate the access frequencies. Our approach is novel in that it exploits knowledge that is already available, namely broadcast misses and broadcast frequencies, to refine the program to better meet the needs of the user population. We report a simulation study that demonstrates the effectiveness of our approach.
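To illustrate why raw broadcast-miss counts are biased by broadcast frequency, the following Python sketch corrects miss counts under a deliberately simple toy model. It is not the estimation model of the paper. The assumptions are ours: accesses to each item are Poisson, a user arrives at a uniformly random point in the gap between two broadcasts of the item, and the user issues an explicit request (a miss) only if the remaining wait exceeds a fixed `patience`. The names `miss_probability`, `estimate_access_shares`, `cycle_length`, and `patience` are hypothetical. Under these assumptions the miss count for an item is Poisson with mean equal to the access rate times a known miss probability, so the maximum likelihood estimate of the access rate is the miss count divided by that probability.

```python
def miss_probability(broadcast_freq, cycle_length, patience):
    """Toy-model probability that an access becomes a broadcast miss.

    Assumes the item is broadcast `broadcast_freq` times, evenly spaced,
    in a cycle of `cycle_length` time units, and that a user who arrives
    uniformly within a gap requests the item explicitly only if the
    remaining wait exceeds `patience`.
    """
    gap = cycle_length / broadcast_freq
    return max(0.0, 1.0 - patience / gap)


def estimate_access_shares(misses, broadcast_freqs, cycle_length, patience):
    """Return estimated relative access frequencies (they sum to 1).

    With misses ~ Poisson(rate * p) and p known, the maximum likelihood
    estimate of each rate is misses / p; the rates are then normalized.
    """
    corrected = []
    for m, f in zip(misses, broadcast_freqs):
        p = miss_probability(f, cycle_length, patience)
        if p <= 0.0:
            # The item never misses under the toy model, so its access
            # rate cannot be recovered from miss counts alone.
            raise ValueError("access rate not identifiable from misses")
        corrected.append(m / p)
    total = sum(corrected)
    return [c / total for c in corrected]


# Item A is broadcast 4 times per cycle, item B once.
shares = estimate_access_shares(
    misses=[60, 90], broadcast_freqs=[4, 1], cycle_length=100, patience=10
)
```

In this example the raw miss counts suggest that item B is accessed 1.5 times as often as item A, but after dividing by the miss probabilities (0.6 and 0.9) the two items have equal estimated access shares. The frequently broadcast item simply produced fewer misses.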