# Estimating Sum by Weighted Sampling

• Rajeev Motwani
• Rina Panigrahy
• Ying Xu
Conference paper

DOI: 10.1007/978-3-540-73420-8_7

Volume 4596 of the book series Lecture Notes in Computer Science (LNCS)
Cite this paper as:
Motwani R., Panigrahy R., Xu Y. (2007) Estimating Sum by Weighted Sampling. In: Arge L., Cachin C., Jurdziński T., Tarlecki A. (eds) Automata, Languages and Programming. ICALP 2007. Lecture Notes in Computer Science, vol 4596. Springer, Berlin, Heidelberg

## Abstract

We study the classic problem of estimating the sum of $n$ variables. The traditional uniform sampling approach requires a linear number of samples to provide any non-trivial guarantee on the estimated sum. In this paper, we consider sampling methods other than uniform sampling, in particular sampling a variable with probability proportional to its value, which we refer to as linear weighted sampling. If only linear weighted sampling is allowed, we give an algorithm that estimates the sum with $\tilde{O}(\sqrt{n})$ samples; it is almost optimal in the sense that $\Omega(\sqrt{n})$ samples are necessary for any reasonable sum estimator. If both uniform sampling and linear weighted sampling are allowed, we give a sum estimator that uses $\tilde{O}(\sqrt[3]{n})$ samples. More generally, we may allow general weighted sampling, where the probability of sampling a variable is proportional to any function of its value. We prove a lower bound of $\Omega(\sqrt[3]{n})$ samples for any reasonable sum estimator using general weighted sampling, which implies that our algorithm combining uniform and linear weighted sampling is an almost optimal sum estimator.
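
To make the two sampling primitives in the abstract concrete, the following Python sketch contrasts them on a hard instance for uniform sampling: a single large value hidden among zeros. It is an illustration only and does not reproduce the paper's estimators. The function names, the instance, and the assumption that a weighted sample reveals both the index and the value of the chosen variable are hypothetical choices made for this example.

```python
import random

def uniform_sample(values, rng):
    """Return one value chosen uniformly at random."""
    return values[rng.randrange(len(values))]

def linear_weighted_sample(values, rng):
    """Return (index, value), chosen with probability value / sum(values).

    Assumption for illustration: the sampler reveals the index and the value.
    """
    i = rng.choices(range(len(values)), weights=values, k=1)[0]
    return i, values[i]

def uniform_sum_estimate(values, num_samples, rng):
    """Classic estimator: n times the mean of uniformly drawn samples."""
    n = len(values)
    draws = [uniform_sample(values, rng) for _ in range(num_samples)]
    return n * sum(draws) / num_samples

# Hypothetical hard instance: one heavy variable among n - 1 zeros.
n = 10_000
values = [0] * n
values[1234] = 1_000_000

rng = random.Random(0)

# With 100 uniform samples the heavy variable is usually missed, so the
# estimate is either 0 or a large multiple of the true sum.
estimate = uniform_sum_estimate(values, 100, rng)

# Linear weighted sampling finds the heavy variable on every draw here,
# because all other variables have weight zero.
heavy_index, heavy_value = linear_weighted_sample(values, rng)

print("true sum:", sum(values))
print("uniform estimate from 100 samples:", estimate)
print("weighted sample:", heavy_index, heavy_value)
```

On this instance the uniform estimator is unbiased but has very high variance unless the number of samples is comparable to $n$, which is the behavior that motivates weighted sampling. A single weighted sample locates the heavy variable, but turning weighted samples into a sum estimate on general inputs requires the more careful algorithms developed in the paper.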