Exploiting Randomness for Feature Selection in Multinomial Logit: A CRM Cross-Sell Application

  • Anita Prinzie
  • Dirk Van den Poel
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4065)


Data mining applications that address classification problems must master two key tasks: feature selection and model selection. This paper proposes a random feature selection procedure integrated within the multinomial logit (MNL) classifier to perform both tasks simultaneously. We assess the potential of the random feature selection procedure (exploiting randomness) against an expert feature selection method (exploiting domain knowledge) on a CRM cross-sell application. The results show great promise: the predictive accuracy of the MNL algorithm with integrated random feature selection is substantially higher than that of the expert feature selection method.
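The abstract does not spell out the procedure, so the following is only a minimal sketch of one way random feature selection could be combined with an MNL classifier. It assumes a random-subspace ensemble: each member MNL is fitted on a randomly drawn feature subset and the class probabilities are averaged. The function names (`fit_mnl`, `fit_random_mnl`, `predict_ensemble`), the gradient-descent fitting, and all parameter defaults are hypothetical choices for illustration, not the authors' implementation.

```python
import math
import random


def softmax(scores):
    """Turn raw class scores into probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fit_mnl(X, y, n_classes, epochs=200, lr=0.1):
    """Fit MNL weights by batch gradient descent on the log-loss.

    Assumption: plain gradient descent stands in for whatever
    estimator the paper actually uses.
    """
    n_features = len(X[0])
    # One weight vector per class; index 0 holds the intercept.
    W = [[0.0] * (n_features + 1) for _ in range(n_classes)]
    for _ in range(epochs):
        grad = [[0.0] * (n_features + 1) for _ in range(n_classes)]
        for xi, yi in zip(X, y):
            row = [1.0] + list(xi)
            p = softmax([sum(w * v for w, v in zip(Wk, row)) for Wk in W])
            for k in range(n_classes):
                err = p[k] - (1.0 if k == yi else 0.0)
                for j, v in enumerate(row):
                    grad[k][j] += err * v
        for k in range(n_classes):
            for j in range(n_features + 1):
                W[k][j] -= lr * grad[k][j] / len(X)
    return W


def predict_proba(W, x):
    """Class probabilities of one fitted MNL for a single observation."""
    row = [1.0] + list(x)
    return softmax([sum(w * v for w, v in zip(Wk, row)) for Wk in W])


def fit_random_mnl(X, y, n_classes, n_models=10, subset_size=2, seed=0):
    """Fit n_models MNLs, each on a random subset of subset_size features."""
    rng = random.Random(seed)
    n_features = len(X[0])
    ensemble = []
    for _ in range(n_models):
        # Draw feature indices without replacement for this member.
        feats = sorted(rng.sample(range(n_features), subset_size))
        Xs = [[xi[j] for j in feats] for xi in X]
        ensemble.append((feats, fit_mnl(Xs, y, n_classes)))
    return ensemble


def predict_ensemble(ensemble, x, n_classes):
    """Average the class probabilities over all ensemble members."""
    avg = [0.0] * n_classes
    for feats, W in ensemble:
        p = predict_proba(W, [x[j] for j in feats])
        for k in range(n_classes):
            avg[k] += p[k] / len(ensemble)
    return avg
```

In this sketch the feature subsets are drawn blindly, so no domain knowledge enters the selection; an expert-selection baseline would instead fit a single `fit_mnl` on a hand-picked list of columns.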


Keywords: Feature Selection, Predictive Accuracy, Product Category, Feature Subset, Feature Selection Method
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Anita Prinzie (1)
  • Dirk Van den Poel (1)
  1. Department of Marketing, Ghent University, Ghent, Belgium
