# Nonlinear expression and visualization of nonmetric relationships in genetic diseases and microbiome data

## Abstract

### Background

The traditional methods of visualizing high-dimensional data objects in low-dimensional metric spaces are subject to the basic limitations of metric space. These limitations result in multidimensional scaling that fails to faithfully represent non-metric similarity data.

### Results

Multiple maps t-SNE (mm-tSNE) has drawn much attention because it constructs multiple maps in a low-dimensional space to visualize non-metric pairwise similarities, which removes the limitations of a single metric map. mm-tSNE regularization additionally incorporates the intrinsic geometry between data points in the high-dimensional space: the weights of data points on each map are used as the regularization parameter of the manifold, so that the weights of similar data points on the same map are as close as possible. However, these methods use the standard momentum method to update the parameters from the gradient at each iteration, which may lead to erroneous gradient search directions, so that the target loss function fails to reach a better local minimum. In this article, we use a Nesterov momentum method to optimize the target loss function and correct each gradient update by looking back at the previous gradient in the candidate search direction.

By using indirect second-order information, the algorithm obtains faster convergence than the original algorithm. To further evaluate our approach from a comparative perspective, we conducted experiments on several datasets including social network data, phenotype similarity data, and microbiomic data.

### Conclusions

The experimental results show that the proposed method achieves better results than several versions of mm-tSNE based on three evaluation indicators including the neighborhood preservation ratio (NPR), error rate and time complexity.

## Keywords

Multiple maps t-SNE; Data visualization; Non-metric similarities; Nesterov momentum

## Abbreviations

- AS: Apert syndrome
- EVC: Ellis-van Creveld syndrome
- HWS: Hay-Wells syndrome
- mm-tSNE regularization: Multiple maps t-SNE with Laplacian regularization
- mm-tSNE: Multiple maps t-SNE
- MOWS: Mowat-Wilson syndrome
- NPR: Neighborhood preservation ratio
- OMIM: Online Mendelian Inheritance in Man
- t-SNE: t-Distributed Stochastic Neighborhood Embedding

## Background

A large number of studies have shown that genetic diseases with overlapping phenotypes are closely related to mutations in functionally related genes [1, 2]. From another perspective, different clinical features and genetic diseases share similar pathophysiological mechanisms [3, 4]. In addition, classical methods of dimensionality reduction and data visualization have been applied to the analysis of microbial data [5]. Generally speaking, however, the integration and analysis of microbiome big data is still at a preliminary stage, and there are currently no effective integration techniques and visualization methods to exploit such data. Some studies have focused on mathematical models that exploit the complicated correlations between phenotypes and genotypes in heterogeneous genomic datasets such as gene expression data, gene ontology annotations [6], and protein-protein interaction networks [7, 8]. Other studies show that non-metric attributes are important features of microbial data [9]. Researching the associations between diseases not only helps us to discover their shared hereditary basis [10], but also provides new insights into molecular circadian mechanisms [11] and prospective drug target studies [12].

Each person’s gut microbiota has a dominant flora in the intestine, and individuals can be divided into three different “intestinal types” based on the characteristics of the human intestine. This finding can help us discover the relationship between drugs, diet, microbes and the body in different states of health and disease [13]. The microbes distributed in different parts of the body play a vital role in our health. By lowering the dimensionality of microbiome big data and extracting useful information from it with the help of statistics and pattern recognition, the structure and characteristics of the microbial community can be analyzed, and new biological hypotheses can be proposed and examined.

## Methods

### T-distributed stochastic neighborhood embedding (t-SNE)

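The pairwise distances between data points in the high-dimensional space are transformed into joint probabilities \( {p}_{ij} \) that represent the similarities between data points:

\[ {p}_{ij}=\frac{\exp \left(-{\left\Vert {x}_i-{x}_j\right\Vert}^2/2{\sigma}^2\right)}{\sum_k{\sum}_{l\ne k}\exp \left(-{\left\Vert {x}_k-{x}_l\right\Vert}^2/2{\sigma}^2\right)} \qquad (1) \]

In the low-dimensional space, t-SNE uses a heavy-tailed distribution \( {Q}_{ij} \) centered at each point in order to avoid the “crowding problem” [23]. The pairwise distances between data points in the low-dimensional space are transformed into probabilities \( {q}_{ij} \) by a t-distribution to represent the similarities between data points:

\[ {q}_{ij}=\frac{{\left(1+{\left\Vert {y}_i-{y}_j\right\Vert}^2\right)}^{-1}}{\sum_k{\sum}_{l\ne k}{\left(1+{\left\Vert {y}_k-{y}_l\right\Vert}^2\right)}^{-1}} \qquad (2) \]

The difference between the similarity \( {q}_{ij} \) in the low-dimensional space and the similarity \( {p}_{ij} \) in the high-dimensional space is measured by the KL divergence between the joint distributions *P* and *Q*:

\[ C(Y)= KL\left(P\Vert Q\right)=\sum_i{\sum}_{j\ne i}{p}_{ij}\log \frac{p_{ij}}{q_{ij}} \qquad (3) \]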
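The three quantities described in this section can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' implementation: the function names are hypothetical, and a single shared Gaussian bandwidth `sigma` is assumed for simplicity (t-SNE implementations usually fit a per-point bandwidth from a perplexity target).

```python
import numpy as np

def pairwise_sq_dists(Z):
    """Squared Euclidean distances between all rows of Z."""
    sq = np.sum(Z * Z, axis=1)
    D = sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T
    return np.maximum(D, 0.0)  # clip tiny negatives caused by round-off

def gaussian_joint(X, sigma=1.0):
    """High-dimensional similarities p_ij (one shared bandwidth: an assumption)."""
    K = np.exp(-pairwise_sq_dists(X) / (2.0 * sigma ** 2))
    np.fill_diagonal(K, 0.0)  # a point is not its own neighbor
    return K / K.sum()

def student_t_joint(Y):
    """Low-dimensional similarities q_ij from a heavy-tailed Student t kernel."""
    K = 1.0 / (1.0 + pairwise_sq_dists(Y))
    np.fill_diagonal(K, 0.0)
    return K / K.sum()

def kl_divergence(P, Q, eps=1e-12):
    """KL(P || Q), summed over the pairs where p_ij > 0."""
    mask = P > 0
    return float(np.sum(P[mask] * np.log(P[mask] / np.maximum(Q[mask], eps))))

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 10))   # toy high-dimensional data
Y = rng.normal(size=(20, 2))    # toy two-dimensional embedding
P = gaussian_joint(X, sigma=2.0)
Q = student_t_joint(Y)
cost = kl_divergence(P, Q)
```

The value of `cost` is what gradient descent on the map coordinates `Y` would reduce; the same number is used later as the error rate.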

### Multiple maps t-SNE

mm-tSNE is a variant of the t-SNE method that breaks down the traditional limitations of a single metric map by constructing multiple mappings *M* in a low-dimensional space to visualize pairwise similarities in non-metric spaces.

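mm-tSNE constructs *M* maps in the low-dimensional space, where each map contains *N* data points. In the map with index *m*, the data point with index *i* has an importance weight \( {\pi}_i^{(m)} \), which represents the importance of data point *i* in map *m*, and the weights of data point *i* over all maps sum to 1. Therefore, the pairwise similarity \( {q}_{ij} \) between data points in the low-dimensional space is measured by a weighted sum of the pairwise similarities between data points *i* and *j* in all the maps. Its mathematical definition is as follows:

\[ {q}_{ij}=\frac{\sum_m{\pi}_i^{(m)}{\pi}_j^{(m)}{\left(1+{\left\Vert {y}_i^{(m)}-{y}_j^{(m)}\right\Vert}^2\right)}^{-1}}{\sum_k{\sum}_{l\ne k}{\sum}_{m^{\prime }}{\pi}_k^{\left({m}^{\prime}\right)}{\pi}_l^{\left({m}^{\prime}\right)}{\left(1+{\left\Vert {y}_k^{\left({m}^{\prime}\right)}-{y}_l^{\left({m}^{\prime}\right)}\right\Vert}^2\right)}^{-1}} \qquad (4) \]

Here \( {\pi}_i^{(m)} \) is the weight with which data point *i* in the high-dimensional space is mapped to map *m* in the low-dimensional space. Because it is more difficult to calculate the constrained parameter \( {\pi}_i^{(m)} \) directly, the importance weight \( {\pi}_i^{(m)} \) is obtained from an unconstrained weight \( {\omega}_i^{(m)} \) in order to simplify the calculation:

\[ {\pi}_i^{(m)}=\frac{e^{-{\omega}_i^{(m)}}}{\sum_{m^{\prime }}{e}^{-{\omega}_i^{\left({m}^{\prime}\right)}}} \qquad (5) \]

The objective loss function has the same form as Eq. 3, but it is minimized with respect to the locations \( {y}_i^{(m)} \) of the points in all the maps and the associated unconstrained weights \( {\omega}_i^{(m)} \).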
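A minimal NumPy sketch of the multi-map similarity described above may help make the construction concrete. It assumes that the importance weights come from a softmax over maps with the sign convention of multiple maps t-SNE [15]; the function names and the array layout (`Y` of shape maps × points × dimensions) are hypothetical choices for this illustration.

```python
import numpy as np

def importance_weights(omega):
    """Map unconstrained weights omega (M, N) to pi (M, N); each column sums to 1."""
    z = -omega                               # sign convention assumed from [15]
    z = z - z.max(axis=0, keepdims=True)     # stabilise the exponentials
    e = np.exp(z)
    return e / e.sum(axis=0, keepdims=True)

def multiple_maps_q(Y, omega):
    """Weighted sum over maps of Student t similarities, normalised over all pairs."""
    pi = importance_weights(omega)
    M, N, _ = Y.shape
    num = np.zeros((N, N))
    for m in range(M):
        sq = np.sum(Y[m] ** 2, axis=1)
        D = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Y[m] @ Y[m].T, 0.0)
        # a pair only contributes in map m if both points have weight there
        num += np.outer(pi[m], pi[m]) / (1.0 + D)
    np.fill_diagonal(num, 0.0)
    return num / num.sum(), pi

rng = np.random.default_rng(0)
Y = rng.normal(size=(3, 8, 2))      # 3 maps, 8 points, 2 dimensions (toy sizes)
omega = rng.normal(size=(3, 8))
Q, pi = multiple_maps_q(Y, omega)
```

Because a pair's contribution in a map is the product of the two importance weights, point A can be close to B in one map and B close to C in another without forcing A and C together, which is how non-transitive similarities are represented.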

### Multiple maps t-SNE with Laplacian regularization

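mm-tSNE regularization [16] adds a Laplacian regularization term on the importance weights to the cost function \( C(Y) \). The graph Laplacian of the similarity matrix is \( L=\mathit{diag}\left({\sum}_j{p}_{ij}\right)-P \), where *P* is the matrix of similarities \( {p}_{ij} \).

The regularized cost is minimized by gradient descent with a standard momentum term [19], where *Y* are the model parameters, \( {v}^{(t)} \) is the velocity, \( \gamma \in \left[0,1\right] \) is the momentum coefficient, \( \eta \) is the learning rate at iteration *t*, and \( \frac{\partial C(Y)}{\partial Y} \) is the gradient.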
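To illustrate why a graph Laplacian built from the similarities pulls the importance weights of similar points together, the sketch below checks numerically that the quadratic form with *L* equals a similarity-weighted sum of squared weight differences. The exact regularization term of the paper (its scaling and the coefficient *λ*) is not shown in this text, so `laplacian_penalty` is a hypothetical stand-in rather than the authors' formula.

```python
import numpy as np

def graph_laplacian(P):
    """L = diag(sum_j p_ij) - P for a symmetric similarity matrix P."""
    return np.diag(P.sum(axis=1)) - P

def laplacian_penalty(pi, P):
    """Sum over maps of pi^(m)T L pi^(m); pi has shape (maps, points)."""
    L = graph_laplacian(P)
    return float(np.einsum("mi,ij,mj->", pi, L, pi))

def pairwise_penalty(pi, P):
    """0.5 * sum_m sum_ij p_ij (pi_i^(m) - pi_j^(m))^2, written out explicitly."""
    diff = pi[:, :, None] - pi[:, None, :]
    return float(0.5 * np.sum(P[None, :, :] * diff ** 2))

rng = np.random.default_rng(0)
A = rng.random((6, 6))
P = (A + A.T) / 2.0                 # toy symmetric similarities
np.fill_diagonal(P, 0.0)
P /= P.sum()
pi = rng.random((3, 6))
pi /= pi.sum(axis=0, keepdims=True)  # weights of each point sum to 1 over maps

quadratic = laplacian_penalty(pi, P)
explicit = pairwise_penalty(pi, P)
```

The two values agree, so minimizing the quadratic form penalizes pairs with large \( p_{ij} \) whose weights differ within the same map.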

### Simplified Nesterov momentum

In the Nesterov momentum update, \( {\theta}^{(t)} \) are the model parameters, \( {v}^{(t)} \) is the velocity, \( {\mu}^{(t)}\in \left[0,1\right] \) is the momentum coefficient, \( {\varepsilon}^{(t)}>0 \) is the learning rate at iteration *t*, \( f\left(\theta \right) \) is the objective function, and \( \nabla f\left({\theta}^{\prime}\right) \) is shorthand notation for the gradient \( \left.\frac{\partial f\left(\theta \right)}{\partial \theta}\right|_{\theta ={\theta}^{\prime }} \).

Unlike the standard momentum term, Nesterov momentum evaluates the gradient not at the current parameter position \( {\theta}^{\left(t-1\right)} \) but at the look-ahead position \( {\theta}^{\left(t-1\right)}+{\mu}^{\left(t-1\right)}{v}^{\left(t-1\right)} \), which is the position reached by applying the last momentum update to the current parameters. The gradient correction to the velocity \( {v}^{(t)} \) is therefore calculated at the point \( {\theta}^{\left(t-1\right)}+{\mu}^{\left(t-1\right)}{v}^{\left(t-1\right)} \). If \( {\mu}^{\left(t-1\right)}{v}^{\left(t-1\right)} \) is a poor update, \( \nabla f\left({\theta}^{\left(t-1\right)}+{\mu}^{\left(t-1\right)}{v}^{\left(t-1\right)}\right) \) will point back towards \( {\theta}^{\left(t-1\right)} \) more strongly than the gradient computed at \( {\theta}^{\left(t-1\right)} \), hence providing a larger and more timely correction to \( {v}^{(t)} \). Fig. 1 (b) illustrates the geometric significance of this phenomenon. The equivalent form of Nesterov momentum makes the difference from standard momentum explicit: the direction of the update changes by an amount of \( {\mu}^{\left(t-1\right)}\left[\nabla f\left({\hat{\theta}}^{\left(t-1\right)}\right)-\nabla f\left({\hat{\theta}}^{\left(t-2\right)}\right)\right] \), and this difference of gradients is essentially an approximation of second-order information about the objective function. Because Nesterov momentum uses this second-order information, it is more efficient than the standard momentum term at correcting a large and inappropriate velocity in each iteration, which makes it run faster than the momentum method and can further reduce the error rate of the loss function.
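The difference between the two update rules can be seen on a toy problem. The sketch below follows the textbook formulation of standard and Nesterov momentum [28, 29] on a small ill-conditioned quadratic; the objective, the constant coefficients `mu` and `eps`, and the function names are assumptions made for illustration and are not taken from the experiments in this article.

```python
import numpy as np

A = np.diag([1.0, 25.0])  # toy objective f(theta) = 0.5 * theta^T A theta

def f(theta):
    return 0.5 * theta @ A @ theta

def grad(theta):
    return A @ theta

def standard_momentum(theta, steps, mu=0.9, eps=0.02):
    v = np.zeros_like(theta)
    for _ in range(steps):
        v = mu * v - eps * grad(theta)            # gradient at the current position
        theta = theta + v
    return theta

def nesterov_momentum(theta, steps, mu=0.9, eps=0.02):
    v = np.zeros_like(theta)
    for _ in range(steps):
        v = mu * v - eps * grad(theta + mu * v)   # gradient at the look-ahead position
        theta = theta + v
    return theta

def nesterov_simplified(theta, steps, mu=0.9, eps=0.02):
    """Same iterates, tracked in the look-ahead variable Theta = theta + mu * v."""
    look = theta.copy()                           # v starts at 0, so Theta_0 = theta_0
    v = np.zeros_like(theta)
    for _ in range(steps):
        v_new = mu * v - eps * grad(look)
        look = look - mu * v + mu * v_new + v_new
        v = v_new
    return look - mu * v                          # convert back to theta

theta0 = np.array([1.0, 1.0])
theta_std = standard_momentum(theta0, 50)
theta_nag = nesterov_momentum(theta0, 50)
theta_simple = nesterov_simplified(theta0, 50)
```

The simplified variant only needs one gradient evaluation at the stored look-ahead point per iteration, which is why it is convenient to drop into an existing momentum-based optimizer.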

### Multiple maps t-SNE regularization based on Nesterov momentum

In this article, unlike the earlier versions of mm-tSNE, we use the Nesterov momentum method to optimize the target loss function, which lets the loss function reach its optimal value faster and more reliably and yields a higher neighborhood preservation ratio.

In the resulting update, *Y* represents the model parameters to be optimized, \( {v}^{(t)} \) represents the velocity at the *t*-th iteration, \( \gamma \in \left[0,1\right] \) represents the momentum coefficient, \( \eta \) represents the learning rate at the *t*-th iteration, and \( \frac{\partial C(Y)}{\partial Y} \) represents the gradient.
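For reference, a Nesterov-style update written with the symbols defined above could take the following form. This is the textbook formulation [28, 29] applied to the cost \( C(Y) \); a constant momentum coefficient and this exact arrangement of terms are assumptions, not a transcription of the authors' equation. The standard momentum update is shown first for comparison, and the only difference is the point at which the gradient is evaluated.

```latex
\begin{aligned}
\text{standard momentum:}\quad
  v^{(t)} &= \gamma\, v^{(t-1)} - \eta \left.\frac{\partial C(Y)}{\partial Y}\right|_{Y = Y^{(t-1)}},
  & Y^{(t)} &= Y^{(t-1)} + v^{(t)} \\
\text{Nesterov momentum:}\quad
  v^{(t)} &= \gamma\, v^{(t-1)} - \eta \left.\frac{\partial C(Y)}{\partial Y}\right|_{Y = Y^{(t-1)} + \gamma v^{(t-1)}},
  & Y^{(t)} &= Y^{(t-1)} + v^{(t)}
\end{aligned}
```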

#### Datasets

To assess the performance of our approach, we apply our method to several datasets, including a phenotypic similarity dataset and a microbial dataset. The microbial dataset consists of 6313 orthologous proteins from 345 individual intestinal microorganisms [30]. After data preprocessing, a similarity matrix of 1299 KOs is obtained. The phenotypic similarities come from the Online Mendelian Inheritance in Man (OMIM) database [31, 32] and cover 1025 phenotypes related to 21 diseases, according to the disease classification information from the Human Disease Network [8]. Among them, similarity values less than 0.5 are filtered out.
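The filtering step can be sketched as a simple threshold on the similarity matrix. Setting the filtered entries to zero, rather than dropping rows or columns, is an assumption, and the function name and the small matrix of made-up values are hypothetical.

```python
import numpy as np

def filter_similarities(S, threshold=0.5):
    """Keep similarities >= threshold and set the rest to 0 (assumed convention)."""
    S = np.asarray(S, dtype=float)
    return np.where(S >= threshold, S, 0.0)

# Made-up values for illustration only; not taken from the OMIM data.
S_raw = np.array([
    [1.00, 0.62, 0.31],
    [0.62, 1.00, 0.48],
    [0.31, 0.48, 1.00],
])
S = filter_similarities(S_raw)
```

After thresholding, many pairs have similarity exactly 0 while both members remain similar to a common third item, which is one way non-metric structure appears in such data.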

#### Evaluation indicators

### Neighborhood preservation ratio

The neighborhood preservation ratio is based on the assumption that the neighborhood of a point \( {x}_i \) in the high-dimensional space is exactly the same as the neighborhood of its image \( {y}_i \) in the low-dimensional space. That is, after the neighboring points around the sample point \( {x}_i \) in the high-dimensional space are projected into a two-dimensional space by the dimensionality reduction method, the neighboring points around \( {y}_i \) should coincide with those in the high-dimensional space. The neighborhood preservation ratio is a measure proposed by Laurens van der Maaten [15], which measures how well similarities in the high-dimensional space are preserved in the low-dimensional space by the mm-tSNE method. For each data point *i*, we choose its *k* highest \( {p}_{ij} \)-values in the high-dimensional space as its *k* nearest neighbors (\( {N}^{i1} \) for short), and select the *k* highest \( {q}_{ij} \)-values in the low-dimensional space as its *k* nearest neighbors (\( {N}^{i2} \) for short). By calculating the intersection of \( {N}^{i1} \) and \( {N}^{i2} \), it can be determined whether the dimensionality reduction method preserves the distribution of neighboring points of the data in the high-dimensional space. NPR is therefore the average ratio of preserved neighbors:

\[ NPR=\frac{1}{n}\sum_{i=1}^n\frac{\left|{N}^{i1}\cap {N}^{i2}\right|}{k} \]

where \( \left|{N}^{i1}\cap {N}^{i2}\right| \) is the number of neighbors of point *i* that are common to the high-dimensional space and the low-dimensional space, and *n* represents the total number of visualized target data points.
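The procedure above translates directly into code. The sketch below assumes that the neighbor sets are taken from dense similarity matrices `P` and `Q` and that ties are broken by index order; the function names are hypothetical.

```python
import numpy as np

def top_k_neighbors(S, k):
    """Indices of the k most similar points for every row of S, excluding the point itself."""
    S = np.array(S, dtype=float)      # work on a copy
    np.fill_diagonal(S, -np.inf)      # a point is never its own neighbor
    return np.argsort(-S, axis=1, kind="stable")[:, :k]

def neighborhood_preservation_ratio(P, Q, k):
    """Average fraction of the k nearest neighbors under P that are also nearest under Q."""
    n = len(P)
    high = top_k_neighbors(P, k)
    low = top_k_neighbors(Q, k)
    kept = [len(set(high[i]) & set(low[i])) for i in range(n)]
    return float(np.mean(kept)) / k

rng = np.random.default_rng(0)
P_demo = rng.random((10, 10))
Q_demo = rng.random((10, 10))
npr_demo = neighborhood_preservation_ratio(P_demo, Q_demo, k=3)
```

For mm-tSNE, `Q` would be the multi-map similarity matrix, so a neighbor counts as preserved if it is close in any map where both points carry weight.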

### Error rate

The error rate represents the cost of using the *KL* divergence method to model the difference between the *Q* distribution and the *P* distribution.

### Time complexity

The time complexity of the algorithm is measured by the number of times the basic operations are repeated.

## Results

The figure shows how the neighborhood preservation ratio varies with the number of maps and the value of *λ* when the mm-tSNE regularization based on Nesterov momentum algorithm is applied. The x-axis represents the value of *λ* in the experiment, and the y-axis represents the number of maps. The color change in the legend represents a gradual decrease in the neighborhood preservation ratio from high to low. When *λ* = 0.002 and the number of maps is 27, the neighborhood preservation ratio is maximized. Nevertheless, according to the experimental results, we choose 15 as the number of maps and set *λ* as 15 as our model parameters, because this is sufficient to model the non-metric structure of the phenotype similarities and KO similarities. When the mm-tSNE regularization based on Nesterov momentum is applied, the relationship between the NPR and the number of maps is shown in Fig. 5. When *λ* = 0.005 and *m* = 15, we obtain the highest neighborhood preservation ratio. Overall, the mm-tSNE regularization based on Nesterov momentum performs better than the other methods and reduces the time complexity of the algorithm, in terms of the convergence rate, from \( O\left(1/k\right) \) (after *k* steps) to \( O\left(1/{k}^2\right) \) [21] (see Fig. 6). Because the data processed by the proposed algorithm is a matrix of size *N* × *N*, the space complexity of the proposed algorithm does not improve relative to the original algorithms; it is \( O\left({N}^2\right) \).

## Discussion

Table 1 Extracted similarities from original matrix

| Phenotype with OMIM ID | AS (OMIM:101200) | MOWS (OMIM:235730) | HWS (OMIM:106260) | EVC (OMIM:225500) |
|---|---|---|---|---|
| AS (OMIM:101200) | 1 | 0.5957 | 0 | 0.5148 |
| MOWS (OMIM:235730) | 0.5957 | 1 | 0.5298 | 0 |
| HWS (OMIM:106260) | 0 | 0.5298 | 1 | 0.5392 |
| EVC (OMIM:225500) | 0.5148 | 0 | 0.5392 | 1 |

Table 2 Importance weights for extracted phenotypes

| Phenotype | Map 9 | Map 15 |
|---|---|---|
| AS (OMIM:101200) | 0.5967 | 0.3896 |
| MOWS (OMIM:235730) | 9.0475e-04 | 0.9920 |
| HWS (OMIM:106260) | 0.1436 | 0.8348 |
| EVC (OMIM:225500) | 0.9474 | 0.002 |

Besides MOWS (similarity 0.5957), AS has another near neighbor in Map 15 (see Fig. 7): Hay-Wells syndrome (HWS, OMIM:106260). AS, MOWS and HWS are all neighbors in Map 15. Surprisingly, however, the similarity between AS and HWS in the original matrix is 0 (see Table 1). We therefore analyzed these three phenotypes in more depth. Apert syndrome is a congenital disease whose main symptoms include craniosynostosis, midface hypoplasia, and a tendency toward fusion of bone structures in the hands and feet [33, 34, 35]. Mowat-Wilson syndrome is an autosomal dominant complex dysplasia characterized by a variety of clinical symptoms such as mental retardation, motor retardation, epilepsy, vasovagal disease and neuropathy, caused by mutations in individual functions [36, 37, 38]. HWS is a rare, complex disease characterized by congenital ectodermal dysplasia with a variety of symptoms including thinning hair, mild hypohidrosis, scalp infection, dental hypoplasia, and maxillary dysplasia [39, 40, 41]. Although these three diseases belong to different disease classes (tissue, developmental and multiple, respectively), they share symptoms such as nail and tooth dysplasia and skeletal deformities. The experimental result shows that although the text mining method [42] measures the direct similarity between AS and HWS as 0, our method does deduce their true relationship from the data. This differs from the modeling of non-transitive similarities, because all three phenotypes lie in the same metric space, Map 15.
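The zero entries in Table 1 are what make these similarities non-metric: two phenotypes can both be similar to a third while having no measured similarity to each other. The sketch below enumerates such triples using the values of Table 1; the helper name is hypothetical and `EVC` labels the phenotype with OMIM:225500.

```python
import numpy as np

names = ["AS", "MOWS", "HWS", "EVC"]
S = np.array([                       # values copied from Table 1
    [1.0,    0.5957, 0.0,    0.5148],
    [0.5957, 1.0,    0.5298, 0.0],
    [0.0,    0.5298, 1.0,    0.5392],
    [0.5148, 0.0,    0.5392, 1.0],
])

def intransitive_triples(S, names):
    """Triples (a, b, c): a and c are both similar to b, yet S[a, c] is 0."""
    n = len(names)
    found = []
    for b in range(n):
        for a in range(n):
            for c in range(a + 1, n):
                if b in (a, c):
                    continue
                if S[a, b] > 0 and S[b, c] > 0 and S[a, c] == 0:
                    found.append((names[a], names[b], names[c]))
    return found

triples = intransitive_triples(S, names)
```

The triple `("AS", "MOWS", "HWS")` is among the results, which matches the case discussed above: AS and HWS are each similar to MOWS but have a direct similarity of 0.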

*Escherichia coli* and rRNA transcription [44]. From Table 3 we can see that although these three KOs are similar in Map 7, they are not similar to each other in other maps. For example, K05340 in Map 12 is not similar to K06204. Likewise, K06204 is not similar to K05340 in Map 13. These non-transitive similarities cannot be expressed by traditional data visualization methods.

Table 3 The weights for KOs similarity. Large values are shown in bold

| KO | Map 1 | Map 2 | Map 3 | Map 5 | Map 7 | Map 10 | Map 12 | Map 13 |
|---|---|---|---|---|---|---|---|---|
| | 0.006 | 0.0041 | 0.0073 | 0.0082 | | 0.0046 | | |
| | 0.0064 | 0.0035 | 0.0056 | 0.007 | | 0.0049 | | 0.0061 |
| | 0.0030 | 0.0029 | 0.0021 | 0.0934 | | 0.0031 | 0.0034 | |

## Conclusions

We propose a new method to optimize the mm-tSNE regularization cost function. Experimental results show that this method outperforms several versions of mm-tSNE when measured by neighborhood preservation ratio and error rate. This study also shows that non-metric properties are ubiquitous in biological and microbiological data and should be considered in future studies. Traditional visualization techniques are effective when applied to small and medium-scale data, but they still face a huge challenge when applied to large biological and microbiological data. In future research, we will propose a method to address the high computational complexity and the visualization problems caused by increasing data volume and high dimensionality.

## Notes

### Acknowledgements

Not applicable.

### Consent to publication

Not applicable.

### Funding

Publication costs are funded by the National Natural Science Foundation of China (61532008) and the National Key Research and Development Program of China (2017YFC0909502).

### Availability of data and materials

The social network dataset used in our experiment can be downloaded from https://lvdmaaten.github.io/multiplemaps/Multiple_maps_t-SNE/Multiple_maps_t-SNE.html. This dataset is publicly available and free to use.

The microbial dataset used in our experiment can be downloaded from ftp://penguin.genomics.cn/pub/10.5524/100001_101000/100036/Intermediate_results/. This dataset is publicly available and free to use.

The phenotypic similarity dataset used in our experiment can be downloaded from http://www.cmbi.ru.nl/MimMiner/cgi-bin/main.pl. This dataset is publicly available and free to use.

### About this supplement

This article has been published as part of *BMC Bioinformatics Volume 19 Supplement 20, 2018: Selected articles from the IEEE BIBM International Conference on Bioinformatics & Biomedicine (BIBM) 2017: bioinformatics.* The full contents of the supplement are available online at https://bmcbioinformatics.biomedcentral.com/articles/supplements/volume-19-supplement-20.

### Authors’ contributions

XS and XJ designed the algorithm based on mm-tSNE regularization. XZ implemented the mm-tSNE regularization based on Nesterov momentum algorithm and ran the experiments. KW and YM helped plan the experimental analysis. JL contributed to writing the manuscript. TH and XH supervised and helped conceive the study. All authors read and approved the final manuscript.

### Ethics approval and consent to participate

Not applicable.

### Competing interests

The authors declare that they have no competing interests.

### Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## References

- 1. Brunner HG, Van Driel MA. From syndrome families to functional genomics. Nat Rev Genet. 2004;5:545–51.
- 2. Lim J, et al. A protein–protein interaction network for human inherited ataxias and disorders of Purkinje cell degeneration. Cell. 2006;125(4):801–14.
- 3. Limviphuvadh V, et al. The commonality of protein interaction networks determined in neurodegenerative disorders (NDDs). Bioinformatics. 2007;23(16):2129–38.
- 4. Oti M, Huynen MA, Brunner HG. Phenome connections. Trends Genet. 2008;24(3):103–6.
- 5. Wooley JC, Godzik A, Friedberg I. A primer on metagenomics. PLoS Comput Biol. 2010;6(2):e1000667.
- 6. Freudenberg J, Propping P. A similarity-based method for genome-wide prediction of disease relevant human genes. Bioinformatics. 2002;18(suppl2):S110–5.
- 7. Lage K, et al. A human phenome-interactome network of protein complexes implicated in genetic disorders. Nat Biotechnol. 2007;25(3):309–16.
- 8. Oti M, et al. Predicting disease genes using protein–protein interactions. J Med Genet. 2006;43(8):691–8.
- 9. Xu W, Jiang X, Li G. Nonmetric property of diabetes-related genes in human gut microbiome. In: IEEE International Conference on Bioinformatics and Biomedicine; 2013.
- 10. Loscalzo J, Kohane I, Barabasi AL. Human disease classification in the postgenomic era: a complex systems approach to human pathobiology. Mol Syst Biol. 2007;3:124.
- 11. Wang Q, Jia P, Cuenco KT, Feingold E, Marazita ML, Wang L, et al. Multi-dimensional prioritization of dental caries candidate genes and its enriched dense network modules. PLoS One. 8:e76666. https://doi.org/10.1371/journal.pone.0076666.
- 12. Csermely P, Korcsmáros T, Kiss HJM, London G, Nussinov R. Structure and dynamics of molecular networks: a novel paradigm of drug discovery: a comprehensive review. Pharmacol Ther. 2013;138(3):333–408.
- 13. Arumugam M, et al. Enterotypes of the human gut microbiome. Nature. 2011;473:174–180. PubMed: 21508958.
- 14. Legendre P, Legendre L. Numerical Ecology. Vol. 20. Elsevier; 2012.
- 15. Van der Maaten L, Hinton G. Visualizing non-metric similarities in multiple maps. Mach Learn. 2012;87(1):33–55.
- 16. Xu W, Jiang X, Hu X, Li G. Visualization of genetic disease-phenotype similarities by multiple maps t-SNE with Laplacian regularization. BMC Med Genet. 2014;7(2):1–9.
- 17. Li C, Li H. Network-constrained regularization and variable selection for analysis of genomic data. Bioinformatics. 2008;24(9):1175–82.
- 18. He X, et al. Laplacian regularized Gaussian mixture model for data clustering. Knowledge and data engineering. IEEE Transactions on. 2011;23(9):1406–18.
- 19. Qian N. On the momentum term in gradient descent learning algorithms. Neural networks. 1999;12(1):145–51.
- 20. Shen X, Zhu X, Jiang X, Hu X. Visualization of disease relationships by multiple maps t-SNE regularization based on Nesterov accelerated gradient. In: IEEE International Conference on Bioinformatics and Biomedicine; 2017.
- 21. Nesterov Y. A method for unconstrained convex minimization problem with the rate of convergence *O*(1/k^{2}). Doklady ANSSSR (translated as Soviet Math Docl). 269:543–7.
- 22. Nesterov Y. Introductory lectures on convex optimization: a basic course. Applied optimization. Kluwer academic Publ. London: Boston, Dordrecht; 2004.
- 23. Van der Maaten L, Hinton G. Visualizing Data using t-SNE. J Mach Learn Res. 2008;9(11).
- 24. Hinton GE, Roweis S. Stochastic neighbor embedding. In NIPS’2002; 2003.
- 25. Lacoste-Julien S, Sha F, Jordan MI. DiscLDA: discriminative learning for dimensionality reduction and classification. In NIPS, volume. 2008;22.
- 26. Mao Y, Balasubramanian K, Lebanon G. Dimensionality reduction for text using domain knowledge. In: Proceedings of the 23rd international conference on computational linguistics: posters, COLING '10, Association for Computational Linguistics, Stroudsburg, PA, USA; 2010. p. 801–9.
- 27. Jamieson AR, et al. Exploring nonlinear feature space dimension reduction and data representation in breast CADx with Laplacian eigenmaps and t-SNE. Med Phys. 2010;37:339.
- 28. Sutskever I. Training recurrent neural networks, Ph.D. thesis. Toronto: CS Dept., U; 2012.
- 29. Bengio Y, Boulanger Lewandowski N, Pascanu R. Advances in optimizing recurrent networks. In Proceedings of the 38th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2013), May; 2013.
- 30. Qin J, et al. A metagenome-wide association study of gut microbiota in type 2 diabetes. Nature. 2012;490(7418):55–60.
- 31. Hamosh A, et al. Online Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes and genetic disorders. Nucleic acids research. 2005;33(suppl 1):D514–7.
- 32. Jiang X, et al. Modularity in the genetic disease phenotype network. FEBS Lett. 2008;582(17):2549–54.
- 33. Mantilla-Capacho JM, Arnaud L, Diaz-Rodriguez M, Barros-Nunez PA. Syndrome with preaxial polydactyly showing the typical mutation Ser252Trp in the FGFR2 gene. Genet Counsel. 2005;16:403–6.
- 34. Moloney DM, Slaney SF, Oldridge M, Wall SA, Sahlin P, Stenman G, Wilkie AOM. Exclusive paternal origin of new mutations in Apert syndrome. Nature Genet. 1996;13:48–53.
- 35. Lajeunie E, De Parseval N, Gonzales M, Delezoide AL, Journeau P, Munnich A, Le Merrer M, Renier D. Clinical variability of Apert syndrome. J Neurosurg. 2000;90:443.
- 36. Mowat DR, Wilson MJ, Goossens M. Mowat-Wilson syndrome. J Med Genet. 2003;40:305–10.
- 37. Strenge S, Heinritz W, Zweier C, Rauch A, Rolle U, Merkenschlager A, Froster UG. Pulmonary artery sling and congenital tracheal stenosis in another patient with Mowat-Wilson syndrome. (letter). Am J Med Genet. 2007;143A:1528–30.
- 38. Horn D, Weschke B, Zweier C, Rauch A. Facial phenotype allows diagnosis of Mowat-Wilson syndrome in the absence of Hirschsprung disease. Am J Med Genet A. 2004;124A:102–4.
- 39. Hay RJ, Wells RS. The syndrome of ankyloblepharon, ectodermal defects and cleft lip and palate: an autosomal dominant condition. Brit J Derm. 1976;94:287–9.
- 40. McGrath JA, Duijf PHG, Doetsch V, Irvine AD, de Waal R, Vanmolkot KRJ, Wessagowit V, Kelly A, Atherton DJ, Griffiths WAD, Orlow SJ, Ausems MGEM, Yang A, McKeon F, Bamshad MA, Brunner HG, Hamel BCJ, van Bokhoven H. Hay-Wells syndrome is caused by heterozygous missense mutations in the SAM domain of p63. Hum Mol Genet. 2001;10:221–9.
- 41. Bertola DR, Kim CA, Sugayama SMM, Albano LMJ, Utagawa CY, Gonzalez CH. AEC syndrome and CHAND syndrome: further evidence of clinical overlapping in the ectodermal dysplasias. Pediat Derm. 2000;17:218–21.
- 42. van Driel MA, et al. A text-mining analysis of the human phenome. European journal of human genetics. 2006;14(5):535–42.
- 43. Zhou J, Ashouian N, Delepine M, Mastsuda F, Chevillard C, Rivlet R, Schildkraut CL, Birshtein BK. The origin of a developmentally regulated lgh replicon is located near the border of regulatory domains for lgh replication and expression. PNAS. 2002;99(21):13693–8.
- 44. Adachi Y, Asakura Y, Sato Y, Tajiama T, Nakajima T, Yamamoto T, Fujieda K. Novel SLC12A1 (NKCC2) mutations in two families with Bartter syndrome type1. Endocr J. 12 Nov 2007;54(6):1003–7.

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.