Abstract
Previous studies have shown controversial results about the role of androgens in coronary artery disease (CAD). We performed this study to examine and compare the relationship between androgenic hormones and CAD using conventional linear statistical techniques as well as novel non-linear approaches. The study was conducted on 502 consecutive men who were referred for selective coronary angiography at Tehran Heart Center for different indications. We studied the relationship between androgenic hormones and CAD using generalized linear models, generalized additive models, and neural networks. Free testosterone (fT), total testosterone (tT), and dehydroepiandrosterone sulfate levels in patients with significant CAD versus normal individuals were 6.69 ± 3.20 pg/ml, 16.60 ± 6.66 nmol/l, and 113.38 ± 72.9 μg/dl versus 7.12 ± 3.58 pg/ml, 15.82 ± 7.26 nmol/l, and 109.03 ± 68.19 μg/dl, respectively (P > 0.05). Generalized linear models were unable to show any significant relationship between androgenic hormones and CAD, while generalized additive models and neural networks supported a significant effect of androgenic hormones on CAD. This finding suggests a non-linear association of tT levels with CAD: lower levels have a preventive effect on CAD, whereas higher values increase the risk of CAD. Emphasizing the non-linearity of the variables may provide new insight into possible explanations of the effect of androgenic hormones on CAD.
Abbreviations
- AIC: Akaike information criterion
- ANOVA: Analysis of variance
- BMI: Body mass index
- BIC: Bayesian information criterion
- CAD: Coronary artery disease
- CHOL: Cholesterol
- CRP: C-reactive protein
- DHEAS: Dehydroepiandrosterone sulfate
- DM: Diabetes mellitus
- EF: Ejection fraction
- ELISA: Enzyme-linked immunosorbent assay
- fT: Free testosterone
- GLM: Generalized linear models
- HDL: High density lipoprotein
- HTN: Hypertension
- LDL: Low density lipoprotein
- Lp(a): Lipoprotein(a)
- MLP: Multi-layer perceptron
- MSE: Mean square error
- ROC: Receiver operating characteristic
- SCA: Selective coronary angiography
- SD: Standard deviation
- SLP: Single layer perceptron
- TC: Total cholesterol
- TGs: Triglycerides
- tT: Total testosterone
- edf: Equivalent degrees of freedom
References
Liu PY, Death AK, Handelsman DJ. Androgens and cardiovascular disease. Endocr Rev. 2003;24:313–40. doi:10.1210/er.2003-0005.
Callies F, Stromer H, Schwinger RH, et al. Administration of testosterone is associated with a reduced susceptibility to myocardial ischemia. Endocrinology. 2003;144:4478–83. doi:10.1210/en.2003-0058.
Channer KS, Jones TH. Cardiovascular effects of testosterone: implications of the “male menopause”? Heart. 2003;89:121–2. doi:10.1136/heart.89.2.121.
Dobs AS, Bachorik PS, Arver S, et al. Interrelationships among lipoprotein levels, sex hormones, anthropometric parameters, and age in hypogonadal men treated for 1 year with a permeation-enhanced testosterone transdermal system. J Clin Endocrinol Metab. 2001;86:1026–33. doi:10.1210/jc.86.3.1026.
Malkin CJ, Pugh PJ, Jones TH, Channer KS. Testosterone for secondary prevention in men with ischaemic heart disease? QJM. 2003;96:521–9. doi:10.1093/qjmed/hcg086.
Manson JE, Bassuk SS, Harman SM, et al. Postmenopausal hormone therapy: new questions and the case for new clinical trials. Menopause. 2006;13:139–47. doi:10.1097/01.gme.0000177906.94515.ff.
Costarella CE, Stallone JN, Rutecki GW, Whittier FC. Testosterone causes direct relaxation of rat thoracic aorta. J Pharmacol Exp Ther. 1996;277:34–9.
Deenadayalu VP, White RE, Stallone JN, Gao X, Garcia AJ. Testosterone relaxes coronary arteries by opening the large-conductance, calcium-activated potassium channel. Am J Physiol Heart Circ Physiol. 2001;281:H1720–7.
English KM, Jones RD, Jones TH, Morice AH, Channer KS. Testosterone acts as a coronary vasodilator by a calcium antagonistic action. J Endocrinol Invest. 2002;25:455–8.
Malkin CJ, Pugh PJ, Jones RD, Jones TH, Channer KS. Testosterone as a protective factor against atherosclerosis-immunomodulation and influence upon plaque development and stability. J Endocrinol. 2003;178:373–80. doi:10.1677/joe.0.1780373.
Wu FC, von Eckardstein A. Androgens and coronary artery disease. Endocr Rev. 2003;24:183–217. doi:10.1210/er.2001-0025.
Yue P, Chatterjee K, Beale C, Poole-Wilson PA, Collins P. Testosterone relaxes rabbit coronary arteries and aorta. Circulation. 1995;91:1154–60.
Kamischke A, Heuermann T, Kruger K, et al. An effective hormonal male contraceptive using testosterone undecanoate with oral or injectable norethisterone preparations. J Clin Endocrinol Metab. 2002;87:530–9. doi:10.1210/jc.87.2.530.
Zitzmann M, Nieschlag E. Testosterone levels in healthy men and the relation to behavioural and physical characteristics: facts and constructs. Eur J Endocrinol. 2001;144:183–97. doi:10.1530/eje.0.1440183.
Davoodi G, Amirezadegan A, Borumand MA, Dehkori MR, Kazemisaeid A, Yaminisharif A. The relationship between level of androgenic hormones and coronary artery disease in men. Cardiovasc J Afr. 2007;18:362–6.
Bishop CM. Pattern recognition and machine learning. New York: Springer; 2006.
Faraggi D, Simon R. The maximum likelihood neural network as a statistical classification model. J Stat Plan Inference. 1995;46:93–104. doi:10.1016/0378-3758(95)99068-2.
Hastie T, Tibshirani R, Friedman J. The elements of statistical learning: data mining, inference, and prediction. New York: Springer; 2001.
Ripley BD. Pattern recognition and neural networks. Cambridge: Cambridge University Press; 1996.
Ripley RM, Harris AL, Tarassenko L. Non-linear survival analysis using neural networks. Stat Med. 2004;23:825–42. doi:10.1002/sim.1655.
Friedewald WT, Levy RI, Fredrickson DS. Estimation of the concentration of low-density lipoprotein cholesterol in plasma, without use of the preparative ultracentrifuge. Clin Chem. 1972;18:499–502.
Judkins MP. Selective coronary arteriography. I. A percutaneous transfemoral technic. Radiology. 1967;89:815–24.
Gensini GG. A more meaningful scoring system for determining the severity of coronary heart disease. Am J Cardiol. 1983;51:606. doi:10.1016/S0002-9149(83)80105-2.
Pollak A, Rokach A, Blumenfeld A, Rosen LJ, Resnik L, Dresner Pollak R. Association of oestrogen receptor alpha gene polymorphism with the angiographic extent of coronary artery disease. Eur Heart J. 2004;25:240–5. doi:10.1016/j.ehj.2003.10.028.
Pastor R, Guallar E. Use of two-segmented logistic regression to estimate change-points in epidemiologic studies. Am J Epidemiol. 1998;148:631–42.
Funahashi K. On the approximate realization of continuous mapping by neural networks. Neural Netw. 1989;2:183–92. doi:10.1016/0893-6080(89)90003-8.
Hornik K, Stinchcombe M, White H. Multilayer feedforward networks are universal approximators. Neural Netw. 1989;2:359–66. doi:10.1016/0893-6080(89)90020-8.
Mathieson MJ. Ordinal models for neural networks. In: Refenes A-P, Abu-Mostafa Y, Moody J, Weigend A, editors. Neural networks in financial engineering: Proceedings of the Third International Conference on Neural Networks in the Capital Markets. Singapore: World Scientific; 1996. p. 523–36.
Nabney IT. Netlab: algorithms for pattern recognition. London: Springer; 2001.
Pearlmutter BA. Fast exact multiplication by the Hessian. Neural Comput. 1994;6:147–60. doi:10.1162/neco.1994.6.1.147.
Fallah N, Faghihzadeh S, Mahmoudi M. Comparing and Contrasting Fuzzy Min-Max Neural Network with the Classical Statistical Clustering Methods in classification of Rickets Disease. Bulletin of the 53rd session of the International Statistical Institute. 2001;2:445–6.
Baxt WG. Use of an artificial neural network for the diagnosis of myocardial infarction. Ann Intern Med. 1991;115:843–8.
Kemeny V, Droste DW, Hermes S, et al. Automatic embolus detection by a neural network. Stroke. 1999;30:807–10.
Das A, Ben-Menachem T, Cooper GS, et al. Prediction of outcome in acute lower-gastrointestinal haemorrhage based on an artificial neural network: internal and external validation of a predictive model. Lancet. 2003;362:1261–6. doi:10.1016/S0140-6736(03)14568-0.
Vijaya G, Kumar V, Verma HK. ANN-based QRS-complex analysis of ECG. J Med Eng Technol. 1998;22:160–7.
Song X, Mitnitski A, MacKnight C, Rockwood K. Assessment of individual risk of death using self-report data: an artificial neural network compared with a frailty index. J Am Geriatr Soc. 2004;52(7):1180–4. doi:10.1111/j.1532-5415.2004.52319.x.
Song X, Mitnitski A, Cox J, Rockwood K. Comparison of machine learning techniques with classical statistical models in predicting health outcomes. Medinfo. 2004;11:736–9.
Penedo MG, Carreira MJ, Mosquera A, Cabello D. Computer-aided diagnosis: a neural-network-based approach to lung nodule detection. IEEE Trans Med Imaging. 1998;17:872–80. doi:10.1109/42.746620.
Izenberg SD, Williams MD, Luterman A. Prediction of trauma mortality using a neural network. Am Surg. 1997;63:275–81.
Li YC, Liu L, Chiu WT, Jian WS. Neural network modeling for surgical decisions on traumatic brain injury patients. Int J Med Inform. 2000;57:1–9. doi:10.1016/S1386-5056(99)00054-4.
Grigsby J, Kooken R, Hershberger J. Simulated neural networks to predict outcomes, costs, and length of stay among orthopedic rehabilitation patients. Arch Phys Med Rehabil. 1994;75:1077–81. doi:10.1016/0003-9993(94)90081-7.
Tu JV, Guerriere MR. Use of a neural network as a predictive instrument for length of stay in the intensive care unit following cardiac surgery. Comput Biomed Res. 1993;26:220–9. doi:10.1006/cbmr.1993.1015.
Nguyen T, Malley R, Inkelis S, Kuppermann N. Comparison of prediction models for adverse outcome in pediatric meningococcal disease using artificial neural network and logistic regression analyses. J Clin Epidemiol. 2002;55:687–95. doi:10.1016/S0895-4356(02)00394-3.
Dorsey SG, Waltz CF, Brosch L, Connerney I, Schweitzer EJ, Bartlett ST. A neural network model for predicting pancreas transplant graft outcome. Diabetes Care. 1997;20:1128–33. doi:10.2337/diacare.20.7.1128.
Buscema M, Grossi E, Snowdon D, Antuono P. Auto-contractive maps: an artificial adaptive system for data mining, an application to Alzheimer disease. Curr Alzheimer Res. 2008;5:481–98. doi:10.2174/156720508785908928.
Rossini PM, Buscema M, Capriotti M, Grossi E, Rodriguez G, Del Percio C, et al. Is it possible to automatically distinguish resting EEG data of normal elderly vs. mild cognitive impairment subjects with high degree of accuracy? Clin Neurophysiol. 2008;119:1534–45. doi:10.1016/j.clinph.2008.03.026.
Allore H, Tinetti ME, Araujo KL, Hardy S, Peduzzi P. A case study found that a regression tree outperformed multiple linear regression in predicting the relationship between impairments and social and productive activities scores. J Clin Epidemiol. 2005;58:154–61. doi:10.1016/j.jclinepi.2004.09.001.
DiRusso SM, Chahine AA, Sullivan T, et al. Development of a model for prediction of survival in pediatric trauma patients: comparison of artificial neural networks and logistic regression. J Pediatr Surg. 2002;37:1098–104. discussion 1098–104. doi:10.1053/jpsu.2002.33885.
Eftekhar B, Mohammad K, Ardebili HE, Ghodsi M, Ketabchi E. Comparison of artificial neural network and logistic regression models for prediction of mortality in head trauma based on initial clinical data. BMC Med Inform Decis Mak. 2005;5:3. doi:10.1186/1472-6947-5-3.
Kattan MW, Hess KR, Beck JR. Experiments to determine whether recursive partitioning (CART) or an artificial neural network overcomes theoretical limitations of Cox proportional hazards regression. Comput Biomed Res. 1998;31:363–73. doi:10.1006/cbmr.1998.1488.
Costanza MC, Paccaud F. Binary classification of dyslipidemia from the waist-to-hip ratio and body mass index: a comparison of linear, logistic, and CART models. BMC Med Res Methodol. 2004;4:7. doi:10.1186/1471-2288-4-7.
Marble RP, Healy JC. A neural network approach to the diagnosis of morbidity outcomes in trauma care. Artif Intell Med. 1999;15:299–307. doi:10.1016/S0933-3657(98)00059-1.
Flouris AD, Duffy J. Application of artificial intelligence systems in the analysis of epidemiological data. Eur J Epidemiol. 2006;21:167–70. doi:10.1007/s10654-006-0005-y.
Grassi M, Villani S, Marinoni A. Classification methods for the identification of 'case' in epidemiological diagnosis of asthma. Eur J Epidemiol. 2001;17(1):19–29. doi:10.1023/A:1010987521885.
Wolfe R, McKenzie DP, Black J, Simpson P, Gabbe BJ, Cameron PA. Models developed by three techniques did not achieve acceptable prediction of binary trauma outcomes. J Clin Epidemiol. 2006;59:26–35. doi:10.1016/j.jclinepi.2005.05.007.
Acknowledgments
This study was sponsored by Tehran University of Medical Sciences. Part of this work was carried out during a sabbatical year (student visitor period) of Nader Fallah at the Department of Mathematics and Statistics of Dalhousie University and the University of British Columbia. The authors wish to thank R. Ripley for her help and I. Nabney for his Matlab code. The authors also thank Janet Brush, Parveer Pannu, and Catherine Pretty for editing this manuscript.
Appendices
Appendix 1
Generalized additive models
Generalized additive models and generalized linear models can be applied in similar situations, but they serve different analytic purposes. Generalized linear models emphasize estimation and inference for the parameters of the model, while generalized additive models focus on exploring the data non-parametrically. This class of models can therefore uncover non-linear relationships between predictor and response variables. Generalized additive models permit the response probability distribution to be any member of the exponential family of distributions. Many widely used statistical models belong to this general class, including additive models for Gaussian data, additive logistic models for binary data, and non-parametric log-linear models for Poisson data.
Suppose that Y is a response random variable and X1, …, Xp is a set of predictor variables. A regression procedure can be viewed as a method for estimating how the value of Y depends on the values of X1, …, Xp. Given a sample of values for Y and X, estimates of β0, β1, …, βp are often obtained by the least squares method. In regression models, the effects of prognostic factors xj are expressed in terms of a linear predictor of the form \( \sum {x_{j} \beta_{j} } \), where the βj are parameters. The generalized additive model replaces \( \sum {x_{j} \beta_{j} } \) with \( \sum {f_{j} (x_{j} )} \), where fj is an unspecified (non-parametric) function. This function is estimated in a flexible manner using a scatter plot smoother. The estimated function \( \hat{f}_{j} (x_{j} ) \) can reveal possible non-linearity in the effect of xj. Suppose y is a response or outcome variable and x is a prognostic factor. We wish to fit a smooth curve f(x) that summarizes the dependence of y on x. If we were to find the curve that simply minimizes \( \sum {(y_{i} - f(x_{i} ))^{2} } \), the result would be an interpolating curve that would not be smooth at all. The cubic spline smoother imposes smoothness on f(x). We seek the function f(x) that minimizes

\( \sum\limits_{i = 1}^{n} {(y_{i} - f(x_{i} ))^{2} } + \lambda \int {f^{\prime\prime}(x)^{2} \,dx} \)
Notice that \( \int {f^{\prime\prime}(x)^{2} } \) measures the "curvature" of the function f; λ is a non-negative smoothing parameter that must be chosen by the data analyst. The parameter λ is inversely related to the equivalent degrees of freedom: larger values of λ force f to be smoother. More details are presented elsewhere [18].
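As a concrete illustration of this penalized criterion, the following Python sketch (ours, not the authors' code) fits a discrete analogue of the cubic smoothing spline, replacing the curvature integral with squared second differences:

```python
import numpy as np

# Illustrative sketch (not the study code): a discrete analogue of the cubic
# smoothing spline.  The curvature integral of f''(x)^2 is approximated by
# squared second differences D f, giving the penalized least-squares problem
#     minimize ||y - f||^2 + lam * ||D f||^2,
# whose closed-form solution is f = (I + lam * D'D)^{-1} y.
def whittaker_smooth(y, lam):
    n = len(y)
    D = np.diff(np.eye(n), n=2, axis=0)  # second-difference operator, (n-2) x n
    return np.linalg.solve(np.eye(n) + lam * D.T @ D, y)

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 200)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.size)

f_rough = whittaker_smooth(y, lam=0.0)     # lam = 0 interpolates the data exactly
f_smooth = whittaker_smooth(y, lam=100.0)  # larger lam forces a smoother curve
```

Increasing `lam` trades fidelity to the data for smoothness, exactly as the smoothing parameter does in the spline criterion.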
Appendix 2
Artificial neural networks
Neural network
A neural network is a set of interconnected simple processing elements called neurons, where each connection has an associated weight. Each neuron or unit processes its inputs to create an output. The network consists of a number of input units representing the predictors, one or more output units corresponding to the predicted variables, and possibly some internal units to increase the model's complexity or flexibility. The weights associated with the interconnections between the units are optimized in fitting the model to the data. The most commonly used form of neural network is the multi-layer perceptron (MLP). An MLP consists of one input layer of units, one output layer of units, and possibly one or more layers of 'hidden' units. The input units pass their inputs to the units in the first hidden layer or directly to the output units. Each of the hidden-layer units adds a constant (termed a 'bias') to a weighted sum of its inputs and calculates an activation function \( \phi_{h} \) of the result. This is then passed to the hidden units in the next layer or to the output unit(s). The activation function is usually chosen in advance. Common choices include the logistic function, the hyperbolic tangent function, and other monotonic functions. In this paper we fix the activation function as the hyperbolic tangent function. The output units apply a linear, logistic, threshold, or other function \( \phi_{0} \) to the weighted sum of their inputs plus a 'bias'. In this paper we use an exponential function in the output layer.
Denote the inputs as \( x_{i} \)'s and the outputs as \( t_{k} \)'s. For an MLP with one hidden layer,

\( t_{k} = \phi_{0} \left( {b_{k} + \sum\limits_{h} {w_{hk} \,\phi_{h} \left( {b_{h} + \sum\limits_{i} {w_{ih} x_{i} } } \right)} } \right) \)
If we have only one output node, k is equal to one. The weights can be determined by optimizing some proper criterion function, such as minimizing the sum of squared errors of the predicted variable or maximizing the log-likelihood of the data in cases where a distribution of the response variable can be assumed. The structure of the MLP makes it possible to fit very general non-linear functional relationships between inputs and outputs. Research results have shown that neural networks with enough hidden units can approximate arbitrary functional relationships [26, 27]. However, over-fitting can be a serious problem in such a framework. This problem is usually overcome either by stopping the optimization early or, more often, by using regularization techniques to penalize the optimization criterion. By adding a penalty term to the optimization criterion, the estimates of the weights are shrunk; this is also termed a shrinkage method. The following smoothness penalty is often used in the shrinkage method:

\( \lambda \sum\limits_{j} {w_{j}^{2} } \)
This process is also known as weight decay in the neural network literature. The tuning parameter λ can be chosen by cross-validation. For a fixed number of hidden units, we minimize the penalized log-likelihood in (3) to estimate the weights. To control the complexity of the model due to the number of hidden units, criteria such as AIC and BIC are used.
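A minimal numeric sketch of such a network (our illustration; the weights are arbitrary random placeholders rather than fitted values) with a tanh hidden layer, an exponential output unit, and the weight-decay penalty:

```python
import numpy as np

# Minimal sketch of an MLP forward pass as described above: hyperbolic tangent
# activations in the hidden layer and an exponential output activation.
# Weights here are random placeholders, not fitted values.
rng = np.random.default_rng(1)

def mlp_forward(X, W1, b1, W2, b2):
    hidden = np.tanh(X @ W1 + b1)    # hidden-layer activations phi_h
    return np.exp(hidden @ W2 + b2)  # exponential output activation phi_0

n_inputs, n_hidden = 3, 4
W1 = rng.normal(size=(n_inputs, n_hidden))
b1 = rng.normal(size=n_hidden)
W2 = rng.normal(size=(n_hidden, 1))
b2 = rng.normal(size=1)

X = rng.normal(size=(5, n_inputs))
out = mlp_forward(X, W1, b1, W2, b2)

# Weight-decay (shrinkage) penalty: lambda times the sum of squared weights,
# added to the error criterion during fitting.
lam = 0.012
penalty = lam * sum(np.sum(w ** 2) for w in (W1, b1, W2, b2))
```

The exponential output guarantees strictly positive predictions, which is what makes this architecture suitable for the Poisson means discussed below in the appendix.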
Optimization criteria
Given a training set comprising a set of input vectors \( \left\{ {x_{n} } \right\} \), where n = 1, …, N, together with the corresponding target vectors \( \left\{ {y_{n} } \right\} \), if we assume that the data points \( y_{n} \) (n = 1, …, N) are independent conditional on \( x_{n} \), the likelihood function can be written as:

\( L(w) = \prod\limits_{n = 1}^{N} {p(y_{n} |x_{n} ;w)} \)

or

\( \ln L(w) = \sum\limits_{n = 1}^{N} {\ln p(y_{n} |x_{n} ;w)} \)

The error function can be defined as the negative log-likelihood:

\( E(w) = - \sum\limits_{n = 1}^{N} {\ln p(y_{n} |x_{n} ;w)} \)
Linear and logistic regression
For regression problems with the normality assumption, this can be reduced to the most commonly used squared error criterion:

\( E(w) = \frac{1}{2}\sum\limits_{n = 1}^{N} {(y_{n} - t(x_{n} ;w))^{2} } \)
For classification problems, it is often advantageous to associate the network outputs with the posterior probabilities of each class. For a problem with two classes (such as normal and CAD), the target variable \( \left\{ {y_{n} } \right\} \) is binary and can be assumed to follow a binomial distribution with probability \( t_{n} (x_{n} ;w) \). The error function in (5) then yields the cross-entropy error function:

\( E(w) = - \sum\limits_{n = 1}^{N} {\left\{ {y_{n} \ln t_{n} + (1 - y_{n} )\ln (1 - t_{n} )} \right\}} \)
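The cross-entropy criterion can be computed directly; this short sketch (ours, not the study code) uses made-up targets and predicted probabilities:

```python
import numpy as np

# Sketch: cross-entropy error for binary targets y_n and predicted
# probabilities t_n, as in the two-class (normal vs. CAD) problem.
# The target and probability values below are illustrative only.
def cross_entropy(y, t):
    return -np.sum(y * np.log(t) + (1 - y) * np.log(1 - t))

y = np.array([1.0, 0.0, 1.0, 0.0])
t_good = np.array([0.9, 0.1, 0.8, 0.2])  # probabilities close to the targets
t_poor = np.array([0.6, 0.5, 0.5, 0.4])  # less confident predictions

err_good = cross_entropy(y, t_good)
err_poor = cross_entropy(y, t_poor)
```

Better-calibrated probabilities yield a smaller error, which is why minimizing cross-entropy drives the network outputs toward the class posterior probabilities.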
This definition has been extended to other generalized linear models (GLM) by other researchers, for example to multinomial logistic regression, ordinal logistic regression, and Cox regression for survival models [16–20, 26–30]. We consider Poisson regression in the following.
Poisson regression
Suppose we have a single target variable with a count response. We consider non-linear Poisson regression for neural networks as an extension of generalized linear models. To our knowledge, this model has not been introduced in the literature before.
The Poisson probability distribution for count data is given by:

\( p(y_{n} ) = \frac{{e^{{ - \lambda_{n} }} \lambda_{n}^{{y_{n} }} }}{{y_{n} !}},\quad y_{n} = 0,1,2, \ldots \)
In linear Poisson regression, the most commonly used formulation is the log-linear link function: \( \ln \lambda_{n} = x^{\prime}_{n} \beta \). Thus the expected value for \( y_{n} \) is given by \( E\left[ {y_{n} |x_{n} } \right] = \lambda_{n} = e^{{x_{n}^{\prime } \beta }} . \)
Here we model \( \lambda_{n} \) as a function of \( x_{n} \) by an MLP neural network:

\( \lambda_{n} = \phi_{0} \left( {b + \sum\limits_{h} {w_{h} \,\phi_{h} \left( {b_{h} + \sum\limits_{i} {w_{ih} x_{ni} } } \right)} } \right) \)
where \( \phi_{0} \) is fixed as an exponential function.
Substituting the Poisson probability function in (5) and using (9) as the Poisson means, the negative log-likelihood criterion can be obtained as:

\( E(w) = \sum\limits_{n = 1}^{N} {\left[ {\lambda_{n} - y_{n} \ln \lambda_{n} + \ln (y_{n} !)} \right]} \)
Eliminating the last term, which is not related to the model fitting, we have:

\( E(w) = \sum\limits_{n = 1}^{N} {\left[ {\lambda_{n} - y_{n} \ln \lambda_{n} } \right]} \)
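The simplified criterion can be checked numerically; this sketch (ours, with illustrative counts) verifies that it is minimized when the fitted means equal the observed counts:

```python
import numpy as np

# Sketch: Poisson negative log-likelihood with the ln(y_n!) term dropped,
# as in the simplified criterion above.  The counts are illustrative only.
def poisson_nll(y, lam_hat):
    return np.sum(lam_hat - y * np.log(lam_hat))

y = np.array([2.0, 1.0, 3.0, 1.0])

# Setting the fitted means equal to the observed counts minimizes the
# criterion (the derivative 1 - y/lam vanishes at lam = y).
nll_at_truth = poisson_nll(y, y)
nll_off = poisson_nll(y, 2 * y)
```

Dropping the constant ln(y_n!) term shifts the criterion by a fixed amount, so it does not change which weights minimize it.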
Model fitting
We compare the performances of different models using simulations. Likelihood error criterion functions such as that in (11) are used to fit models with a fixed number of units in the hidden layer. To guard against over-fitting, a penalized version of (11), given below, is used in the non-linear model fitting:

\( E(w) + \lambda \sum\limits_{j} {w_{j}^{2} } \)
For each neural network model, to identify the number of units in the hidden layer, both the Akaike Information Criterion (AIC) and the Schwarz Bayesian Information Criterion (BIC) are calculated:

\( {\text{AIC}} = - 2\ln L + 2m \)

\( {\text{BIC}} = - 2\ln L + m\ln N \)
where m is the number of estimated parameters and N is the number of observations. The model with the smallest value of the information criterion is considered to be the best. However, it should be noted that in our neural network model fittings, for each fixed number of hidden units, the negative log-likelihood score we obtain is suboptimal, since the weights are optimized on a penalized version of (11). We can thus only get approximations of the AIC and BIC values. We also calculated the MSE for the testing set as a reference measure of accuracy, where the MSE is defined as

\( {\text{MSE}} = \frac{1}{N}\sum\limits_{n = 1}^{N} {(y_{n} - \hat{y}_{n} )^{2} } \)
The predictions by different models are ranked by MSE.
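The model-selection quantities above can be sketched as follows (our illustration; the log-likelihoods and parameter counts are hypothetical placeholder values):

```python
import numpy as np

# Sketch: information criteria and MSE used to compare fitted models.
# The negative log-likelihoods and parameter counts below are hypothetical.
def aic(neg_loglik, m):
    return 2 * neg_loglik + 2 * m

def bic(neg_loglik, m, N):
    return 2 * neg_loglik + m * np.log(N)

def mse(y, y_hat):
    return np.mean((y - y_hat) ** 2)

# A larger model fits slightly better but pays a complexity penalty;
# since ln(N) > 2 for N > 7, BIC penalizes the extra parameters more than AIC.
N = 502
small = {"nll": 300.0, "m": 10}
large = {"nll": 298.0, "m": 40}

aic_small, aic_large = aic(small["nll"], small["m"]), aic(large["nll"], large["m"])
bic_small, bic_large = bic(small["nll"], small["m"], N), bic(large["nll"], large["m"], N)
```

Here both criteria prefer the smaller model: the two-unit improvement in log-likelihood does not offset the 30 extra parameters.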
The models considered include 2, 3, 4, 5, 10, and 20 hidden units. To save computation time, the weight decay parameter is pre-fixed at 0.012 in our computations. This value was chosen based on an empirical study of different choices of the weight decay parameter.
Error gradient calculation
Back-propagation is a general computing technique to fit the parameters of an MLP. The computation involves the numerical evaluation of derivatives of the error function with respect to the weights and biases. The general form of back-propagation is described elsewhere [18]. Here we use a special algorithm based on the article by Pearlmutter [30] for computation of the Hessian matrix, similar to Nabney's approach [29]. The scaled conjugate gradient algorithm is used for optimization. The code is written in R 2.5 and Matlab 7.2.
Clinical application
Reports in the medical literature suggest that neural network models are applicable in diagnosing conditions such as rickets [31], myocardial infarction [32], pulmonary emboli [33], and gastrointestinal hemorrhage [34], in waveform analysis of EKGs [35], in prediction of health outcomes [36, 37], and in the analysis of radiographic images [38]. Neural networks have also been successfully applied to clinical outcome prediction for trauma mortality [39], surgical decision making on traumatic brain injury patients [40], recovery from surgery [41, 42], pediatric meningococcal disease [43], transplantation outcome [44], Alzheimer's disease [45], and dementia [46]. In addition, several technical comparisons between statistical methods and artificial intelligence techniques on medical data exist [45–55].
Cite this article
Fallah, N., Mohammad, K., Nourijelyani, K. et al. Nonlinear association between serum testosterone levels and coronary artery disease in Iranian men. Eur J Epidemiol 24, 297–306 (2009). https://doi.org/10.1007/s10654-009-9336-9