Imputation Method Based on Collaborative Filtering and Clustering for the Missing Data of the Squeeze Casting Process Parameters

  • Technical Article
Integrating Materials and Manufacturing Innovation

Abstract

Developing an efficient methodology for establishing squeeze casting process parameters from past data is essential. However, designing these parameters from past data is difficult when many values are missing. Conventional missing-data approaches face additional computational challenges when applied to high-dimensional multivariable data with missing values, especially correlated material process data. Because the relationship between material composition and process parameters resembles that between users and information of interest, this paper proposes a missing data imputation method based on a clustering-based collaborative filtering (ClubCF) algorithm. Data samples were divided into two groups: those with missing values and those without. K-means clustering based on a canopy algorithm was applied to the samples without missing values to obtain k subclasses, whose values were then selected to fill the samples with missing values through user-based collaborative filtering with Pearson similarity. Squeeze casting process parameter data of aluminum alloys with missing values were used to evaluate the method, and further comparative experiments were carried out to characterize its performance and features. Two indicators, the mean absolute error and the standard deviation, were used to quantify imputation performance, which was compared with that of three conventional methods (mean interpolation, regression interpolation, and the expectation maximization algorithm). The results indicate that the proposed approach is effective and outperforms the conventional methods in processing high-dimensional correlated data.
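The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the single-threshold canopy step, the threshold `t2`, the `top_n` neighbor count, the similarity-weighted average used for filling, and the toy data are all assumptions chosen for illustration, and the features are assumed to be on comparable scales.

```python
import numpy as np

def canopy_centers(X, t2):
    """Simplified canopy pass (assumption: one threshold t2 only).
    Each point farther than t2 from every chosen center starts a new canopy."""
    remaining = list(range(len(X)))
    centers = []
    while remaining:
        c = X[remaining[0]]
        centers.append(c)
        remaining = [i for i in remaining if np.linalg.norm(X[i] - c) > t2]
    return np.array(centers)

def kmeans(X, centers, n_iter=50):
    """Plain Lloyd iterations starting from the canopy centers."""
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(len(centers))])
        if np.allclose(new, centers):
            break
        centers = new
    return centers, labels

def pearson(a, b):
    """Pearson correlation of two vectors; 0.0 if either is constant."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return 0.0 if denom == 0 else float((a * b).sum() / denom)

def impute_row(row, X_complete, centers, labels, top_n=3):
    """Fill NaNs in `row` from the most similar complete samples
    of the nearest cluster (similarity measured on observed features)."""
    obs = ~np.isnan(row)
    j = np.linalg.norm(centers[:, obs] - row[obs], axis=1).argmin()
    members = X_complete[labels == j]
    if len(members) == 0:          # guard: fall back to all complete samples
        members = X_complete
    sims = np.array([pearson(row[obs], m[obs]) for m in members])
    order = np.argsort(-sims)[:top_n]
    w = np.clip(sims[order], 0.0, None)   # ignore negatively correlated samples
    filled = row.copy()
    if w.sum() == 0:
        filled[~obs] = members[:, ~obs].mean(axis=0)
    else:
        filled[~obs] = (w[:, None] * members[order][:, ~obs]).sum(axis=0) / w.sum()
    return filled

# Toy data (hypothetical): two well-separated groups of complete samples.
X_complete = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.1, 2.9, 4.2],
    [0.9, 1.9, 3.1, 3.9],
    [8.0, 6.0, 4.0, 2.0],
    [8.2, 6.1, 3.9, 2.1],
    [7.9, 5.8, 4.1, 1.9],
])
row = np.array([1.0, 2.0, np.nan, 4.1])   # sample with one missing value

centers = canopy_centers(X_complete, t2=3.0)
centers, labels = kmeans(X_complete, centers)
filled = impute_row(row, X_complete, centers, labels)
abs_error = abs(filled[2] - 3.0)          # error against an assumed true value of 3.0
print(len(centers), filled, abs_error)
```

In this toy case the incomplete sample is matched to the first group, so the filled value is drawn only from that group's samples, whereas mean interpolation over all six samples would give 3.5 for the same entry.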

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant Nos. 51965006, 51875209), Guangxi Natural Science Foundation (Grant No. 2018GXNSFAA050111), and the Open Fund of the National Engineering Research Center of Near-Net-Shape Forming for Metallic Materials (Grant No. 2019001).

Author information

Corresponding author

Correspondence to Jianxin Deng.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Cite this article

Deng, J., Ye, Z., Shan, L. et al. Imputation Method Based on Collaborative Filtering and Clustering for the Missing Data of the Squeeze Casting Process Parameters. Integr Mater Manuf Innov 11, 95–108 (2022). https://doi.org/10.1007/s40192-021-00248-x

  • DOI: https://doi.org/10.1007/s40192-021-00248-x