In this issue of Endocrine, Lawrence Hayes and colleagues [1] used immunoassays to evaluate whether salivary testosterone is suitable for identifying biochemical hypogonadism in aging men. The diagnosis of androgen deficiency in aging men depends on the combination of consistent symptoms and signs and unequivocally low serum testosterone levels, determined in a morning blood sample by an accurate assay [2]. Presently, well-validated mass spectrometry-based methods such as isotope dilution gas chromatography–mass spectrometry (ID-GCMS) or liquid chromatography–tandem mass spectrometry (LC-MSMS) are considered the state of the art for testosterone measurement [3]. Since the landmark paper of Taieb first exposed the appalling quality of testosterone immunoassays [4], not much has changed, and numerous papers have since illustrated problems with immunoassay (IA) accuracy and standardization, especially for low values. Data from the EMAS study illustrate that IA accuracy can be considered acceptable for general screening purposes but deteriorates in the hypogonadal range [5]. Considering that most certified laboratories use automated immunoassays, this is sobering knowledge for the correct analysis and interpretation of a low serum total testosterone result.

For some patients suspected of hypogonadism, the Endocrine Society advocates the use of free or bioavailable testosterone, determined using an accurate method [2]. Most testosterone in human blood is bound to SHBG and, more weakly, to albumin, and only a few percent is free [6]. SHBG levels are markedly altered with age, in patients with obesity or hyperthyroidism, and during corticoid therapy. Many authors believe free testosterone to be a more informative reflection of the testosterone available for biological action, particularly in situations characterized by altered SHBG levels. Not surprisingly, numerous studies have illustrated stronger associations of clinical parameters with free rather than total testosterone. Free testosterone can be measured using either ultrafiltration or equilibrium dialysis, coupled with a direct or indirect (after the addition of isotope-labeled testosterone) measurement of testosterone in the ultrafiltrate or dialysate. The gold standard for free testosterone measurement is direct equilibrium dialysis coupled to mass spectrometry. It remains, however, out of reach for most laboratories. Equilibrium dialysis is difficult to perform, and direct analysis of the dialysate requires a highly sensitive, and thus high-end, LC-MSMS instrument with a thoroughly validated method.

Over the years, alternative solutions have been explored, such as direct measurement of free testosterone using immunoassays (inaccurate and should no longer be used) or calculation from total testosterone and SHBG levels [6, 7]. The calculated value has been shown to correlate well with equilibrium dialysis, although some uncertainties remain. A number of authors have explored which calculation method works best in different patient cohorts or might more closely mimic the biological binding characteristics of SHBG [7]. Varying bias and scatter versus equilibrium dialysis results have been observed, dependent not only on the calculation method used but also on methodological issues related to the equilibrium dialysis itself (e.g., dilution, standardization, method variability, and indirect estimation using tracer may all influence results). The mere fact that calculated free testosterone is a mathematical estimate also undermines its usefulness in patients in whom normal steroid binding patterns are disturbed (e.g., competition from exogenous steroids, very low or very high SHBG).
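
To make concrete what "calculated" means here, the following is a minimal sketch of one mass-action approach: total testosterone is treated as the sum of the free, albumin-bound, and SHBG-bound fractions, and the resulting quadratic is solved for the free concentration. The association constants, the albumin molar mass, the default albumin concentration, and the function name are illustrative assumptions (values of this kind are commonly quoted in the literature) and are not taken from this editorial or from any specific cited method. The sketch also assumes one binding site per SHBG molecule and that albumin is in large excess.

```python
import math

# Illustrative constants; all three are assumptions made for this sketch.
K_SHBG = 1.0e9        # L/mol, assumed association constant testosterone-SHBG
K_ALB = 3.6e4         # L/mol, assumed association constant testosterone-albumin
ALBUMIN_MW = 69000.0  # g/mol, assumed approximate molar mass of albumin


def calculated_free_testosterone(total_t_nmol_l, shbg_nmol_l, albumin_g_l=43.0):
    """Estimate free testosterone (nmol/L) from total testosterone and SHBG.

    Mass balance: TT = FT + K_ALB*[Alb]*FT + K_SHBG*[SHBG]*FT / (1 + K_SHBG*FT),
    which rearranges to a quadratic a*FT^2 + b*FT - TT = 0.
    """
    if total_t_nmol_l < 0 or shbg_nmol_l < 0 or albumin_g_l < 0:
        raise ValueError("concentrations must be non-negative")

    tt = total_t_nmol_l * 1e-9       # mol/L
    shbg = shbg_nmol_l * 1e-9        # mol/L
    alb = albumin_g_l / ALBUMIN_MW   # mol/L, treated as unbound albumin

    n = 1.0 + K_ALB * alb            # free plus albumin-bound, per unit of free T
    a = n * K_SHBG
    b = n + K_SHBG * (shbg - tt)

    # Positive root of the quadratic.
    ft = (-b + math.sqrt(b * b + 4.0 * a * tt)) / (2.0 * a)
    return ft * 1e9                  # back to nmol/L


if __name__ == "__main__":
    ft = calculated_free_testosterone(15.0, 40.0)
    print(f"free T: {ft:.3f} nmol/L ({100.0 * ft / 15.0:.1f}% of total)")
```

Under these assumed constants, a total testosterone of 15 nmol/L with an SHBG of 40 nmol/L gives roughly 0.27 nmol/L of free testosterone, about 2% of the total, consistent with the statement that only a few percent is free. Because the result depends directly on the chosen binding constants and on the accuracy of the total testosterone and SHBG inputs, any bias in those inputs propagates into the estimate.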

For some decades now, salivary measurements have been put forward as a non-invasive, stress-free, and thus theoretically attractive alternative to steroid measurements in blood. For salivary cortisol in particular, an extensive literature has been published on its role in population studies, in the follow-up of therapy and, most promisingly, as a midnight cortisol diagnostic test for Cushing's syndrome [8]. Immunoassay-inherent problems regarding specificity and standardization have been shown to be an even bigger issue for measurements in saliva than in serum [9]. Although pre-analytical factors, accuracy, and choice of collection device have all been shown to be important potential confounders, there is now clearly an established place for salivary cortisol measurements.

For testosterone, the evidence is much less compelling. Serum and saliva testosterone concentrations are 100-fold lower than those of cortisol, and the free percentage in serum is also smaller for testosterone. Therefore, confounding factors such as blood contamination (e.g., related to oral hygiene) have a much bigger influence on salivary testosterone results than on cortisol results [10]. For testosterone, passive drooling is the collection method of choice, whereas for cortisol multiple alternatives offer acceptable results. In addition, because male saliva concentrations have been shown to approximate serum free testosterone values, accurate measurement requires state-of-the-art methodology [11, 12].

Up to now, there has been fairly limited clinical experience with salivary testosterone as a biological marker, in contrast to total or free serum testosterone. Diurnal variation in salivary testosterone, as compared with serum testosterone, has not been well researched. There are no broadly validated reference ranges, and the substantial differences in reported values are a stark reminder of the aforementioned (pre-)analytical difficulties in measuring low testosterone concentrations in such a challenging matrix. The risk of both over- and under-diagnosis of hypogonadism is therefore much higher.

In this context, the work performed by Lawrence Hayes and colleagues and by other investigators to explore the accuracy of salivary testosterone in determining androgenic status is timely and important. The study by Hayes is far from conclusive, and many questions remain unanswered. It may well be that a place for salivary testosterone in selected cases, such as in population studies or those involving children, can be identified in the future, but only if salivary assays are proven sufficiently sensitive and specific. Unfortunately, given the low concentrations and the above-mentioned specificity issues, it is unlikely that this can be achieved using immunoassays. Well-validated mass spectrometry-based methods are to be recommended. In addition to these high analytical requirements, there are also challenging biological and pre-analytical potential confounders such as blood contamination, acidification, chewing, and absorption. Conflicting reports must therefore be interpreted with caution. Regarding the failure of salivary testosterone to diagnose hypogonadism in the study of Lawrence Hayes and colleagues, it is likely that (pre-)analytical problems have substantially contributed to the negative findings.

Until more evidence is gathered in large patient cohorts using state-of-the-art analytical methods with impeccable pre-analytical care, the current recommendation stands: a repeated and accurate morning serum total testosterone measurement remains the first diagnostic step in men with suspected hypogonadism. In patients with borderline results and suspected alterations in SHBG concentration, assessment of free testosterone can contribute to the diagnosis.