Acquiring High Quality Customer Data with Low Cost

Liu, Xiaoping; Li, Xiao-Bai

doi:10.1007/978-3-319-45408-5_5

Xiaoping Liu &
Xiao-Bai Li

Part of the book series: Lecture Notes in Business Information Processing (LNBIP, volume 258)

Included in the following conference series:

Workshop on E-Business

Abstract

This work concerns the problem of optimal customer data acquisition and selection. We first propose using a divide-and-conquer technique to find the empirical distribution of the customer data. We then formulate data acquisition as an optimization problem that maximizes the quality of the acquired data while keeping the cost of acquisition as low as possible. We propose using a generalized second-price (GSP) auction to acquire customer data and show that when the number of bidders is large, GSP is a truth-telling mechanism. We derive the analytical solution to the optimization problem, which finds a set of data that best represents the probability distribution of the target population relative to the acquisition cost. An experimental study is conducted to demonstrate the effectiveness of the proposed approach.

References

1. Brand, R., Domingo-Ferrer, J., Mateo-Sanz, J.M.: Reference data sets to test and compare SDC methods for protection of numerical microdata. European Project IST-2000-25069 CASC (2002)
2. Chaudhuri, A., Stenger, H.: Survey Sampling: Theory and Methods. Marcel Dekker Inc., Florence (1992)
3. Edelman, B., Ostrovsky, M., Schwarz, M.: Internet advertising and the generalized second-price auction: selling billions of dollars worth of keywords. Am. Econ. Rev. 97(1), 242–259 (2007)
4. Friedman, J.H., Bentley, J.L., Finkel, R.A.: An algorithm for finding best matches in logarithmic expected time. ACM Trans. Math. Softw. 3(3), 209–226 (1977)
5. Gomes, R., Sweeney, S.: Bayes-Nash equilibria of the generalized second price auction. http://ssrn.com/abstract=1429585
6. Groves, R.M.: Nonresponse rates and nonresponse bias in household surveys. Public Opin. Q. 70(5), 646–675 (2006)
7. Krishna, V.: Auction Theory. Elsevier, Amsterdam (2010)
8. Mookerjee, V., Dos Santos, B.: Inductive expert system design: maximizing system value. Inf. Syst. Res. 4(4), 111–131 (1993)
9. Mookerjee, V., Mannino, M.: Redesigning case retrieval to reduce information acquisition costs. Inf. Syst. Res. 8(1), 51–69 (1997)
10. Moore, J., Whinston, A.: A model of decision-making with sequential information-acquisition (Part 1). Decis. Support Syst. 2(4), 285–307 (1986)
11. Moore, J., Whinston, A.: A model of decision-making with sequential information-acquisition (Part 2). Decis. Support Syst. 3(1), 47–73 (1987)
12. Saar-Tsechansky, M., Melville, P., Provost, F.: Active feature-value acquisition. Manag. Sci. 55(4), 664–684 (2009)
13. Varian, H.R.: Position auctions. Int. J. Ind. Organ. 25(6), 1163–1178 (2007)
14. Zheng, Z., Padmanabhan, B.: Selectively acquiring customer information: a new data acquisition problem and an active learning-based solution. Manag. Sci. 52(5), 697–712 (2006)

Acknowledgments

Xiao-Bai Li’s research was supported in part by the National Library of Medicine of the National Institutes of Health (NIH) under Grant Number R01LM010942. The content is solely the responsibility of the authors and does not necessarily represent the official views of NIH.

Author information

Authors and Affiliations

Department of Operations and Information Systems, Manning School of Business, University of Massachusetts Lowell, Lowell, MA, 01854, USA
Xiaoping Liu & Xiao-Bai Li

Corresponding author

Correspondence to Xiao-Bai Li.

Editor information

Editors and Affiliations

Department of Decision and Information Sciences, Oakland University, Rochester, Michigan, USA
Vijayan Sugumaran
Virginia Commonwealth University, Richmond, Virginia, USA
Victoria Yoon
Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana–Champaign, Urbana, Illinois, USA
Michael J. Shaw

Appendix

Proof of Theorem 1 (GSP is asymptotically truth-telling).

Let v be the true value of a bidder and $ \beta (v) $ be the bidder’s bid. For a GSP with N bidders (large) and S slots (fixed), Gomes and Sweeney [5] show the following relationship:

$$ \beta (v) = v - \sum\nolimits_{s = 2}^{S} {\gamma_{s} (v)\int_{0}^{v} {(v - \beta (x))F^{N - s - 1} (x)f(x)dx} } , $$

where F (with the density f) represents the distribution of bidders’ values and $ \gamma_{s} (v) $ is a function of v with a complicated expression. We have derived the result $ \beta (v) = v $ by showing that

$$ \mathop {\lim }\limits_{N \to \infty } \sum\nolimits_{s = 2}^{S} {\gamma_{s} (v)\int_{0}^{v} {(v - \beta (x))F^{N - s - 1} (x)f(x)dx} } = 0. $$

The details are omitted due to space limitations. Instead, we provide a sketch of the proof here in a Nash equilibrium context. First, suppose that the bidder underbids with $ \beta^{-}(v) < v $ and wins. He is then paid the next bid above his own, $ \beta^{*} > \beta^{-}(v) $. If $ \beta^{*} > v $, then the bidder would have won with bid v anyway. If $ \beta^{*} \le v $, then the payoff to the bidder is $ \beta^{*} - v \le 0 $. So, underbidding does not improve the bidder’s payoff. Now, suppose that the bidder overbids with $ \beta^{+}(v) > v $. Since N is large and S is fixed, there will be a large number of bids within $ [v,\beta^{+}(v)] $, and this number will be larger than S. Consequently, the bidder will not win by overbidding.
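
The underbidding and overbidding cases in the sketch can be illustrated with a few lines of code. The following is a minimal sketch, assuming a simplified procurement reading of the GSP auction in which the S lowest bids win and each winner is paid the next-higher bid. The function names `reverse_gsp` and `payoff` and the sample bids are hypothetical and do not come from the paper.

```python
def reverse_gsp(bids, slots):
    """Simplified reverse GSP (an assumption, not the paper's exact mechanism).

    The `slots` lowest bids win, and each winner is paid the next-higher bid.
    Returns a dict mapping the index of each winning bid to its payment.
    """
    order = sorted(range(len(bids)), key=lambda i: bids[i])
    payments = {}
    # A winner needs a next-higher bid to set the payment, hence len(order) - 1.
    for rank in range(min(slots, len(order) - 1)):
        winner = order[rank]
        payments[winner] = bids[order[rank + 1]]
    return payments


def payoff(value, bid, rival_bids, slots):
    """Payoff of a bidder with true value `value` who submits `bid`."""
    bids = [bid] + list(rival_bids)  # the bidder of interest has index 0
    payments = reverse_gsp(bids, slots)
    return payments[0] - value if 0 in payments else 0.0


if __name__ == "__main__":
    rivals = [0.30, 0.50, 0.55, 0.70]  # illustrative rival bids
    # Underbidding when the next bid exceeds the value: same payoff as truth-telling.
    print(payoff(0.45, 0.45, rivals, slots=2), payoff(0.45, 0.40, rivals, slots=2))
    # Underbidding when the next bid is below the value: the payoff is negative.
    print(payoff(0.52, 0.52, rivals, slots=2), payoff(0.52, 0.40, rivals, slots=2))
    # Overbidding: enough rival bids fall below the bid that the bidder loses.
    print(payoff(0.45, 0.60, rivals, slots=2))
```

With only four rivals the overbidding case depends on the sample bids chosen here; the argument in the sketch relies on N being large relative to S.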

Proof of Theorem 2 (Solutions to the optimization problem (1)).

We standardize the minimization problem (1) to

$$ { \hbox{min} }\,\sum\limits_{i = 1}^{m} {\frac{{\left( {x_{i} - np_{i} } \right)^{2} }}{{np_{i} }}} + \frac{1}{2}w\sum\limits_{i = 1}^{m} {d_{i} x_{i}^{2} } , $$

(A.1)

$$ \text{s.t.}\quad 1 - x_{i} \le 0, \quad i = 1, \ldots ,m, $$

(A.2)

$$ \sum\limits_{i = 1}^{m} {x_{i} } - n = 0. $$

(A.3)

Applying the KKT conditions to this problem, with Lagrange multipliers $ \lambda_{i} , \, i = 1, \ldots ,m $, for the inequality constraints (A.2) and $ \mu $ for the equality constraint (A.3), we have

$$ \begin{aligned} \lambda_{i} \ge 0, \, i = 1, \ldots ,m, \hfill \\ \lambda_{i} ( - x_{i} + 1) = 0, \, i = 1, \ldots ,m, \hfill \\ \sum\limits_{i = 1}^{m} {x_{i} } - n = 0. \hfill \\ \end{aligned} $$

(A.4)

Stationarity of the Lagrangian formed from (A.1), (A.2) and (A.3) requires

$$ \frac{\partial}{{\partial x_{i} }}\left( {\sum\limits_{j = 1}^{m} {\frac{{\left( {x_{j} - np_{j} } \right)^{2} }}{{np_{j} }}} + \frac{1}{2}w\sum\limits_{j = 1}^{m} {d_{j} x_{j}^{2} } + \sum\limits_{j = 1}^{m} {\lambda_{j} ( - x_{j} + 1)} + \mu \left( \sum\limits_{j = 1}^{m} {x_{j} } - n \right)} \right) = 0, \, i = 1, \ldots ,m, $$

or

$$ \frac{{2\left( {x_{i} - np_{i} } \right)}}{{np_{i} }} + wd_{i} x_{i} - \lambda_{i} + \mu = 0, \, i = 1, \ldots ,m. $$

For non-boundary solutions, $ x_{i} > 1 $. It then follows from (A.4) that $ \lambda_{i} = 0, \, i = 1, \ldots ,m $. Thus,

$$ \begin{aligned} 2(x_{i} - np_{i} ) + wnp_{i} d_{i} x_{i} + np_{i} \mu = 0, \, i = 1, \ldots ,m, \hfill \\ x_{i} (2 + wnp_{i} d_{i} ) - 2np_{i} + np_{i} \mu = 0, \, i = 1, \ldots ,m, \hfill \\ \end{aligned} $$

i.e.,

$$ x_{i} - \frac{{2np_{i} }}{{(2 + wnp_{i} d_{i} )}} + \mu \frac{{np_{i} }}{{(2 + wnp_{i} d_{i} )}} = 0, \, i = 1, \ldots ,m, $$

(A.5)

Summing (A.5) over $ i = 1, \ldots ,m $ gives

$$ \sum\limits_{i = 1}^{m} {x_{i} } - \sum\limits_{i = 1}^{m} {\frac{{2np_{i} }}{{(2 + wnp_{i} d_{i} )}}} + \mu \sum\limits_{i = 1}^{m} {\frac{{np_{i} }}{{(2 + wnp_{i} d_{i} )}}} = 0. $$

Substituting (A.3) into the above equation, we have

$$ n - \sum\limits_{i = 1}^{m} {\frac{{2np_{i} }}{{(2 + wnp_{i} d_{i} )}}} + \mu \sum\limits_{i = 1}^{m} {\frac{{np_{i} }}{{(2 + wnp_{i} d_{i} )}}} = 0. $$

Solving it for μ, we get $ \mu = 2 - \frac{1}{{\sum\limits_{i = 1}^{m} {\frac{{p_{i} }}{{(2 + wnp_{i} d_{i} )}}} }} $. Substituting it into (A.5), we have

$$ x_{i} = \frac{ \frac{np_{i}}{2 + wnp_{i} d_{i}} }{ \sum\limits_{j = 1}^{m} \frac{p_{j}}{2 + wnp_{j} d_{j}} }, \quad i = 1, \ldots ,m. $$
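
As a check on the last two steps, they can be written out with the shorthand $ A = \sum_{j = 1}^{m} p_{j} /(2 + wnp_{j} d_{j}) $, a symbol introduced here only for brevity and not part of the original derivation. Dividing the summed equation by n gives

$$ 1 - 2A + \mu A = 0 \quad \Longrightarrow \quad \mu = 2 - \frac{1}{A}, \qquad 2 - \mu = \frac{1}{A}. $$

Rearranging (A.5) as $ x_{i} = (2 - \mu )\,np_{i} /(2 + wnp_{i} d_{i}) $ and inserting $ 2 - \mu = 1/A $ then yields

$$ x_{i} = \frac{1}{A} \cdot \frac{np_{i}}{2 + wnp_{i} d_{i}}, \qquad \sum\limits_{i = 1}^{m} x_{i} = \frac{nA}{A} = n, $$

so the expression for $ x_{i} $ is consistent with constraint (A.3).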

This solution satisfies the KKT conditions, and the objective function is strictly convex with linear constraints, so it is the unique optimal solution.
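
The closed-form solution can also be checked numerically. The code below is a minimal sketch, assuming the interior case (every $ x_{i} > 1 $) and probabilities $ p_{i} $ that sum to one. The function names `optimal_allocation` and `marginal_terms` and the sample values of p, d, n and w are illustrative and are not taken from the paper.

```python
def optimal_allocation(p, d, n, w):
    """Closed-form x_i from the proof of Theorem 2 (interior case only).

    p: probabilities p_i, d: coefficients d_i from (A.1),
    n: total number of records, w: weight on the second term of (A.1).
    """
    weights = [p_i / (2.0 + w * n * p_i * d_i) for p_i, d_i in zip(p, d)]
    total = sum(weights)
    return [n * weight / total for weight in weights]


def marginal_terms(x, p, d, n, w):
    """Left-hand side of the stationarity condition without the multipliers.

    With lambda_i = 0, every entry should equal -mu at the optimum,
    so all entries should coincide.
    """
    return [
        2.0 * (x_i - n * p_i) / (n * p_i) + w * d_i * x_i
        for x_i, p_i, d_i in zip(x, p, d)
    ]


if __name__ == "__main__":
    p = [0.5, 0.3, 0.2]   # illustrative target distribution
    d = [1.0, 2.0, 4.0]   # illustrative coefficients d_i
    x = optimal_allocation(p, d, n=100, w=0.01)
    print("x =", x)
    print("sum(x) =", sum(x))
    print("marginal terms =", marginal_terms(x, p, d, n=100, w=0.01))
```

If any computed $ x_{i} $ falls below 1, the interior assumption fails and this formula no longer applies; the boundary case is not handled in this sketch.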

About this paper

Cite this paper

Liu, X., Li, XB. (2016). Acquiring High Quality Customer Data with Low Cost. In: Sugumaran, V., Yoon, V., Shaw, M. (eds) E-Life: Web-Enabled Convergence of Commerce, Work, and Social Life. WEB 2015. Lecture Notes in Business Information Processing, vol 258. Springer, Cham. https://doi.org/10.1007/978-3-319-45408-5_5

DOI: https://doi.org/10.1007/978-3-319-45408-5_5
Published: 01 September 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-45407-8
Online ISBN: 978-3-319-45408-5
eBook Packages: Business and Management, Business and Management (R0)
