Acquiring High Quality Customer Data with Low Cost

  • Conference paper
E-Life: Web-Enabled Convergence of Commerce, Work, and Social Life (WEB 2015)

Part of the book series: Lecture Notes in Business Information Processing (LNBIP, volume 258)

Abstract

This work concerns the problem of optimal customer data acquisition and selection. We first propose a divide-and-conquer technique to find the empirical distribution of the customer data. We then formulate data acquisition as an optimization problem that maximizes the quality of the acquired data while keeping the acquisition cost as low as possible. We propose using a generalized second-price (GSP) auction to acquire customer data and show that, when the number of bidders is large, GSP is a truth-telling mechanism. We derive the analytical solution to the optimization problem, which finds a set of data that best represents the probability distribution of the target population relative to the acquisition cost. An experimental study demonstrates the effectiveness of the proposed approach.

References

  1. Brand, R., Domingo-Ferrer, J., Mateo-Sanz, J.M.: Reference data sets to test and compare SDC methods for protection of numerical microdata. European Project IST-2000-25069 CASC (2002)

  2. Chaudhuri, A., Stenger, H.: Survey Sampling: Theory and Methods. Marcel Dekker Inc., Florence (1992)

  3. Edelman, B., Ostrovsky, M., Schwarz, M.: Internet advertising and the generalized second-price auction: selling billions of dollars worth of keywords. Am. Econ. Rev. 97(1), 242–259 (2007)

  4. Friedman, J.H., Bentley, J.L., Finkel, R.A.: An algorithm for finding best matches in logarithmic expected time. ACM Trans. Math. Softw. 3(3), 209–226 (1977)

  5. Gomes, R., Sweeney, S.: Bayes-Nash equilibria of the generalized second price auction. http://ssrn.com/abstract=1429585

  6. Groves, R.M.: Nonresponse rates and nonresponse bias in household surveys. Public Opin. Q. 70(5), 646–675 (2006)

  7. Krishna, V.: Auction Theory. Elsevier, Amsterdam (2010)

  8. Mookerjee, V., Dos Santos, B.: Inductive expert system design: maximizing system value. Inf. Syst. Res. 4(4), 111–131 (1993)

  9. Mookerjee, V., Mannino, M.: Redesigning case retrieval to reduce information acquisition costs. Inf. Syst. Res. 8(1), 51–69 (1997)

  10. Moore, J., Whinston, A.: A model of decision-making with sequential information-acquisition (Part 1). Decis. Support Syst. 2(4), 285–307 (1986)

  11. Moore, J., Whinston, A.: A model of decision-making with sequential information-acquisition (Part 2). Decis. Support Syst. 3(1), 47–73 (1987)

  12. Saar-Tsechansky, M., Melville, P., Provost, F.: Active feature-value acquisition. Manag. Sci. 55(4), 664–684 (2009)

  13. Varian, H.R.: Position auctions. Int. J. Ind. Organ. 25(6), 1163–1178 (2007)

  14. Zheng, Z., Padmanabhan, B.: Selectively acquiring customer information: a new data acquisition problem and an active learning-based solution. Manag. Sci. 52(5), 697–712 (2006)


Acknowledgments

Xiao-Bai Li’s research was supported in part by the National Library of Medicine of the National Institutes of Health (NIH) under Grant Number R01LM010942. The content is solely the responsibility of the authors and does not necessarily represent the official views of NIH.

Author information

Corresponding author

Correspondence to Xiao-Bai Li.

Appendix

Proof of Theorem 1 (GSP is asymptotically truth-telling).

Let v be the true value of a bidder and \( \beta (v) \) be the bidder’s bid. For a GSP with N bidders (large) and S slots (fixed), Gomes and Sweeney [5] show the following relationship:

$$ \beta (v) = v - \sum\nolimits_{s = 2}^{S} {\gamma_{s} (v)\int_{0}^{v} {(v - \beta (x))F^{N - s - 1} (x)f(x)dx} } , $$

where F (with density f) is the distribution of bidders’ values and \( \gamma_{s} (v) \) is a function of v with a complicated expression. We obtain the limiting result \( \beta (v) = v \) by showing that

$$ \mathop {\lim }\limits_{N \to \infty } \sum\nolimits_{s = 2}^{S} {\gamma_{s} (v)\int_{0}^{v} {(v - \beta (x))F^{N - s - 1} (x)f(x)dx} } = 0. $$
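A partial illustration of why the integral factor vanishes, offered as a sketch rather than the authors’ omitted argument: assume \( F(0) = 0 \) and that \( |v - \beta (x)| \le M \) on \( [0,v] \) for some constant M (both are assumptions made for this illustration and are not stated in the source). Then, for each fixed s,

$$ \left| \int_{0}^{v} (v - \beta (x)) F^{N - s - 1} (x) f(x)\,dx \right| \le M \int_{0}^{v} F^{N - s - 1} (x) f(x)\,dx = \frac{M\,F(v)^{N - s}}{N - s} \le \frac{M}{N - s} \to 0 \quad (N \to \infty ). $$

Whether the full sum vanishes also depends on how \( \gamma_{s} (v) \) scales with N, which is not specified here.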

The details are omitted due to space limitations. Instead, we sketch the proof here in a Nash equilibrium context. First, suppose that the bidder underbids with \( \beta^{ - } (v) < v \) and wins. He is paid the bid next to his own, \( \beta^{ * } > \beta^{ - } (v) \). If \( \beta^{ * } > v \), the bidder would have won with the bid v anyway. If \( \beta^{ * } \le v \), the bidder’s payoff is \( \beta^{ * } - v \le 0 \). So underbidding does not improve the bidder’s payoff. Now suppose that the bidder overbids with \( \beta^{ + } (v) > v \). Since N is large and S is fixed, the number of bids falling within \( [v,\beta^{ + } (v)] \) will be large, and in particular larger than S. Consequently, the bidder will not win by overbidding.
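The sketch above can be explored numerically. The following is a minimal simulation under stated assumptions, not the authors’ experimental setup: the `slots` lowest bids win, the winner ranked k is paid the (k+1)-th lowest bid, costs are drawn from Uniform(0, 1), and all other sellers bid truthfully. The function names `reverse_gsp_payoffs` and `mean_gain_from_deviation` are hypothetical.

```python
import random


def reverse_gsp_payoffs(bids, costs, slots):
    """Payoff of each seller in a simplified reverse GSP auction.

    Assumption (for illustration only): the `slots` lowest bids win and the
    winner ranked k is paid the (k+1)-th lowest bid. Losers receive 0.
    A seller with no higher bid after it cannot be priced, so it is skipped.
    """
    order = sorted(range(len(bids)), key=lambda i: bids[i])
    payoffs = [0.0] * len(bids)
    for rank in range(min(slots, len(bids) - 1)):
        winner, next_bidder = order[rank], order[rank + 1]
        payoffs[winner] = bids[next_bidder] - costs[winner]
    return payoffs


def mean_gain_from_deviation(n_bidders, slots, shade, trials=2000, seed=0):
    """Average payoff change for seller 0 when it bids cost + shade
    while every other seller bids its true cost (costs ~ Uniform(0, 1))."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        costs = [rng.random() for _ in range(n_bidders)]
        truthful = reverse_gsp_payoffs(costs, costs, slots)[0]
        bids = list(costs)
        bids[0] = costs[0] + shade  # shade < 0: underbid, shade > 0: overbid
        deviated = reverse_gsp_payoffs(bids, costs, slots)[0]
        total += deviated - truthful
    return total / trials
```

Calling `mean_gain_from_deviation(n, 3, 0.05)` for increasing `n` gives a rough sense of how the incentive to overbid changes as the number of bidders grows with the number of slots held fixed.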

Proof of Theorem 2 (Solutions to the optimization problem (1)).

We write the minimization problem (1) in standard form:

$$ \min \; \sum\limits_{i = 1}^{m} \frac{\left( x_{i} - np_{i} \right)^{2}}{np_{i}} + \frac{1}{2}w\sum\limits_{i = 1}^{m} d_{i} x_{i}^{2} , $$
(A.1)
$$ \text{s.t.} \quad 1 - x_{i} \le 0, \quad i = 1, \ldots ,m, $$
(A.2)
$$ \sum\limits_{i = 1}^{m} x_{i} - n = 0. $$
(A.3)

Applying the KKT conditions to this problem, with Lagrange multipliers \( \lambda_{i} , \, i = 1, \ldots ,m \), for the inequality constraints (A.2) and \( \mu \) for the equality constraint (A.3), we have

$$ \begin{aligned} &\lambda_{i} \ge 0, \quad i = 1, \ldots ,m, \\ &\lambda_{i} (1 - x_{i} ) = 0, \quad i = 1, \ldots ,m, \\ &\sum\limits_{i = 1}^{m} x_{i} - n = 0. \end{aligned} $$
(A.4)

Setting the derivative of the Lagrangian formed from (A.1), (A.2) and (A.3) with respect to each \( x_{i} \) to zero gives

$$ \frac{\partial }{\partial x_{i}}\left( \sum\limits_{j = 1}^{m} \frac{\left( x_{j} - np_{j} \right)^{2}}{np_{j}} + \frac{1}{2}w\sum\limits_{j = 1}^{m} d_{j} x_{j}^{2} + \sum\limits_{j = 1}^{m} \lambda_{j} (1 - x_{j} ) + \mu \left( \sum\limits_{j = 1}^{m} x_{j} - n \right) \right) = 0, \quad i = 1, \ldots ,m, $$

that is,

$$ \frac{{2\left( {x_{i} - np_{i} } \right)}}{{np_{i} }} + wd_{i} x_{i} - \lambda_{i} + \mu = 0, \, i = 1, \ldots ,m. $$

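For a non-boundary (interior) solution, \( x_{i} > 1 \) for all i. The complementary slackness condition in (A.4) then gives \( \lambda_{i} = 0, \, i = 1, \ldots ,m \). Multiplying the previous equation by \( np_{i} \) and collecting the terms in \( x_{i} \), we obtain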

$$ \begin{aligned} 2(x_{i} - np_{i} ) + wnp_{i} d_{i} x_{i} + np_{i} \mu &= 0, \quad i = 1, \ldots ,m, \\ x_{i} (2 + wnp_{i} d_{i} ) - 2np_{i} + np_{i} \mu &= 0, \quad i = 1, \ldots ,m, \end{aligned} $$

i.e.,

$$ x_{i} - \frac{{2np_{i} }}{{(2 + wnp_{i} d_{i} )}} + \mu \frac{{np_{i} }}{{(2 + wnp_{i} d_{i} )}} = 0, \, i = 1, \ldots ,m, $$
(A.5)

Summing (A.5) over i gives

$$ \sum\limits_{i = 1}^{m} {x_{i} } - \sum\limits_{i = 1}^{m} {\frac{{2np_{i} }}{{(2 + wnp_{i} d_{i} )}}} + \mu \sum\limits_{i = 1}^{m} {\frac{{np_{i} }}{{(2 + wnp_{i} d_{i} )}}} = 0. $$

Substituting (A.3) into the above equation, we have

$$ n - \sum\limits_{i = 1}^{m} {\frac{{2np_{i} }}{{(2 + wnp_{i} d_{i} )}}} + \mu \sum\limits_{i = 1}^{m} {\frac{{np_{i} }}{{(2 + wnp_{i} d_{i} )}}} = 0. $$

Dividing by n and solving for μ, we get \( \mu = 2 - \frac{1}{\sum\limits_{i = 1}^{m} \frac{p_{i}}{2 + wnp_{i} d_{i}}} \). Substituting this into (A.5), we have

$$ x_{i} = \frac{\dfrac{np_{i}}{2 + wnp_{i} d_{i}}}{\sum\limits_{j = 1}^{m} \dfrac{p_{j}}{2 + wnp_{j} d_{j}}}, \quad i = 1, \ldots ,m. $$

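This solution satisfies the KKT conditions. Because the constraints are linear and the objective function is strictly convex (the first term is strictly convex when every \( np_{i} > 0 \), and the cost term is convex when \( wd_{i} \ge 0 \)), it is the unique optimal solution.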
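The closed-form solution can be checked numerically. The sketch below evaluates \( x_{i} \) for made-up inputs and exposes the quantities needed to verify the equality constraint (A.3) and the stationarity condition. The function names and the sample values of p, d, n and w are illustrative assumptions, not data from the paper, and the code assumes an interior solution (all \( x_{i} > 1 \)).

```python
def optimal_allocation(p, d, n, w):
    """Closed-form interior solution x_i from the end of the proof.

    p: target proportions p_i (assumed positive, summing to 1)
    d: per-cell cost coefficients d_i, n: total sample size, w: cost weight
    """
    weights = [p_i / (2 + w * n * p_i * d_i) for p_i, d_i in zip(p, d)]
    total = sum(weights)
    return [n * a / total for a in weights]


def lagrange_mu(p, d, n, w):
    """Multiplier mu = 2 - 1 / sum_i p_i / (2 + w n p_i d_i)."""
    total = sum(p_i / (2 + w * n * p_i * d_i) for p_i, d_i in zip(p, d))
    return 2 - 1 / total


def objective(x, p, d, n, w):
    """Objective (A.1): chi-square-type fit term plus weighted cost term."""
    fit = sum((x_i - n * p_i) ** 2 / (n * p_i) for x_i, p_i in zip(x, p))
    cost = 0.5 * w * sum(d_i * x_i ** 2 for x_i, d_i in zip(x, d))
    return fit + cost


def stationarity_values(x, p, d, n, w):
    """2(x_i - n p_i)/(n p_i) + w d_i x_i; at an interior optimum every
    entry equals -mu, since lambda_i = 0."""
    return [2 * (x_i - n * p_i) / (n * p_i) + w * d_i * x_i
            for x_i, p_i, d_i in zip(x, p, d)]


if __name__ == "__main__":
    # Illustrative inputs only.
    p, d, n, w = [0.5, 0.3, 0.2], [1.0, 2.0, 4.0], 100, 0.01
    x = optimal_allocation(p, d, n, w)
    print("x =", x)
    print("sum(x) =", sum(x))
    print("stationarity =", stationarity_values(x, p, d, n, w))
    print("-mu =", -lagrange_mu(p, d, n, w))
```

With these inputs, cells with a higher cost coefficient receive somewhat fewer records than \( np_{i} \) and cheaper cells somewhat more, which is the trade-off the objective encodes.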

Copyright information

© 2016 Springer International Publishing Switzerland

Cite this paper

Liu, X., Li, XB. (2016). Acquiring High Quality Customer Data with Low Cost. In: Sugumaran, V., Yoon, V., Shaw, M. (eds) E-Life: Web-Enabled Convergence of Commerce, Work, and Social Life. WEB 2015. Lecture Notes in Business Information Processing, vol 258. Springer, Cham. https://doi.org/10.1007/978-3-319-45408-5_5
