Date: 03 Jun 2012
How to reverse-engineer quality rankings
A good or bad product quality rating can make or break an organization. However, the notion of “quality” is often defined by an independent rating company that does not make its formula for ranking products publicly available. To invest wisely in product development, organizations are starting to use intelligent approaches for deciding how product-development funding should be allocated. A critical step in this process is to “reverse-engineer” the rating company’s proprietary model as closely as possible. In this work, we provide a machine learning approach for this task, which optimizes a rank statistic that encodes preference information specific to quality rating data. We present experiments on data from a major quality rating company, and provide new methods for evaluating the solution. In addition, we show how the reverse-engineered model can be used to achieve a top-ranked product in a cost-effective way.
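The paper's actual objective is a specialized rank statistic tailored to quality rating data; as a rough illustration of the general setup — fitting a scoring function to the observed ordering of products — the following sketch learns a linear model from pairwise preferences using a standard logistic surrogate loss. All data, names, and the choice of loss/optimizer here are hypothetical stand-ins, not the authors' method:

```python
import numpy as np

# Minimal sketch: recover a hidden linear rating formula w_true from
# pairwise orderings ("product i is rated above product j") by fitting
# a linear score w·x with a logistic pairwise-ranking loss.

rng = np.random.default_rng(0)
n_products, n_features = 60, 5
X = rng.normal(size=(n_products, n_features))   # product feature vectors
w_true = np.array([2.0, -1.0, 0.5, 0.0, 1.5])   # hidden rating formula
scores = X @ w_true

# Observed preferences: every ordered pair (i, j) with i rated above j.
pairs = [(i, j) for i in range(n_products) for j in range(n_products)
         if scores[i] > scores[j]]

def pairwise_loss_grad(w):
    """Gradient of mean log(1 + exp(-(x_i - x_j)·w)) over preference pairs."""
    g = np.zeros_like(w)
    for i, j in pairs:
        d = X[i] - X[j]
        g += -d / (1.0 + np.exp(d @ w))
    return g / len(pairs)

w = np.zeros(n_features)
for _ in range(300):            # plain gradient descent
    w -= 1.0 * pairwise_loss_grad(w)

# Evaluate: fraction of observed pairs the learned model orders correctly.
learned = X @ w
accuracy = np.mean([learned[i] > learned[j] for i, j in pairs])
print(f"pairwise agreement: {accuracy:.3f}")
```

Because the synthetic ranking is generated by a linear formula, the learned scores reproduce the observed pairwise orderings almost perfectly; on real rating data the proprietary model need not be linear, which is part of what makes the reverse-engineering problem hard.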
Editor: Thorsten Joachims.
Volume 88, Issue 3, pp. 369–398
- Springer US
- Supervised ranking
- Quality ratings
- Discrete optimization
- Applications of machine learning
- Author Affiliations
- 1. Operations Research Center, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
- 2. MIT Sloan School of Management, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
- 3. Ford Motor Company, Dearborn, MI, 48124, USA