Abstract
With the steep growth in Internet use, Big Data techniques emerged to handle enormous amounts of data. However, data generated through the Internet is high-dimensional, so feature engineering is performed to extract the best feature subset from it. Assets are what a community wants to keep, build upon, and sustain for future generations. Asset mapping is a positive and engaging way to learn about a community. It helps us think about where people live and work, and it challenges us to recognize how other people see the same community. In this paper, a model is introduced that finds the required assets based on the population of an area and identifies whether the available assets are tangible or not, using features extracted from data gathered from the Government of Andhra Pradesh. The data is pre-processed and the best features are extracted using feature engineering methods and classifiers such as XGBoost, Random Forest, and ExtraTreeClassifier. The experimental results show that XGBoost provides the most accurate results for the specified target.
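The general idea of using tree-based classifiers to rank and select features can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the data is synthetic (the Andhra Pradesh dataset is not reproduced here), the helper `select_and_score` and the cutoff `TOP_K` are hypothetical, and scikit-learn's `ExtraTreesClassifier` is assumed to correspond to the ExtraTreeClassifier named in the abstract.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): 20 features, only 5 of them informative.
X, y = make_classification(
    n_samples=400, n_features=20, n_informative=5, n_redundant=0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Candidate models; each exposes feature_importances_ after fitting.
models = {
    "RandomForest": RandomForestClassifier(n_estimators=100, random_state=0),
    "ExtraTrees": ExtraTreesClassifier(n_estimators=100, random_state=0),
}

TOP_K = 5  # hypothetical number of features to keep


def select_and_score(model, k=TOP_K):
    """Rank features with a fitted model, keep the top k, refit, and score."""
    model.fit(X_train, y_train)
    # Indices of the k features with the highest importance scores.
    top = np.sort(np.argsort(model.feature_importances_)[::-1][:k])
    # Refit a fresh copy of the same model on the reduced feature set.
    reduced = clone(model).fit(X_train[:, top], y_train)
    accuracy = reduced.score(X_test[:, top], y_test)
    return top.tolist(), accuracy


results = {name: select_and_score(model) for name, model in models.items()}
for name, (features, accuracy) in results.items():
    print(f"{name}: kept features {features}, held-out accuracy {accuracy:.3f}")
```

If the `xgboost` package is installed, its `XGBClassifier` also exposes `feature_importances_` after fitting and could be added to the `models` dictionary for the same comparison.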
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Sree Divya, K., Bhargavi, P., Jyothi, S. (2020). XGBoost Classifier to Extract Asset Mapping Features. In: Jyothi, S., Mamatha, D., Satapathy, S., Raju, K., Favorskaya, M. (eds) Advances in Computational and Bio-Engineering. CBE 2019. Learning and Analytics in Intelligent Systems, vol 15. Springer, Cham. https://doi.org/10.1007/978-3-030-46939-9_18
DOI: https://doi.org/10.1007/978-3-030-46939-9_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-46938-2
Online ISBN: 978-3-030-46939-9
eBook Packages: Mathematics and Statistics (R0)