# A novel algorithm based on bi-random walks to identify disease-related lncRNAs

## Abstract

### Backgrounds

There is evidence that lncRNAs are associated with distinct and diverse biological processes. The dysfunction or mutation of lncRNAs is implicated in a wide range of diseases. An accurate computational model can benefit the diagnosis of diseases and help us gain a better understanding of the underlying molecular mechanisms. Although many related algorithms have been proposed, there is still much room to improve their accuracy.

### Results

We developed a novel algorithm, BiWalkLDA, to predict disease-related lncRNAs, and tested it on three real datasets, which contain 528 lncRNAs, 545 diseases and 1216 interactions in total. To compare performance with other algorithms, leave-one-out cross validation was performed for BiWalkLDA and three existing algorithms, SIMCLDA, LDAP and LRLSLDA. Additional tests were designed to analyze the effects of the parameters *α*, *β*, *l* and *r*, which can help users select suitable values of these parameters for their own applications. In a case study of prostate cancer, eight of the top ten disease-related lncRNAs reported by BiWalkLDA were previously confirmed in the literature.

### Conclusions

In this paper, we develop an algorithm, BiWalkLDA, to predict lncRNA-disease associations by using bi-random walks. It constructs a lncRNA-disease network by integrating interaction profile and gene ontology information, and it solves the cold-start problem by using the interaction profiles of neighbors. Bi-random walks were then applied to three real biological datasets. Results show that our method outperforms other algorithms in predicting lncRNA-disease associations in terms of both accuracy and specificity.

### Availability

## Keywords

LncRNA-disease association, Bi-random walks, Gene ontology, Interaction profile

## Abbreviations

- AR: Androgen receptor
- IMC: Inductive matrix completion
- LOOCV: Leave-one-out cross validation
- ROC: Receiver-operating characteristics

## Background

It has been suggested that protein-coding genes account for only about 1.5% of the human genome, and that their number is only about twice that of the worm and the fruit fly [1]. However, 74.7% of the human genome is covered by primary transcripts [2]. This implies that non-coding RNAs play major roles in the regulation of gene expression. The presence or absence of some non-coding RNAs can down- or up-regulate a cascade of gene expression, which makes them potential drug targets for disease therapy. Many researchers have put effort into discovering the functions of long non-coding RNAs (lncRNAs). Recent studies have found strong associations between lncRNAs and diseases. Many lncRNAs play functional roles in diverse biological processes, such as cell proliferation, RNA binding complexes, immune surveillance, neuronal processes, morphogenesis and gametogenesis [3]. Their dysfunction may cause various diseases. For example, HOTAIR induces androgen-independent androgen receptor (AR) activation, which plays a central role in establishing an oncogenic cascade that drives prostate cancer progression, and it drives AR-mediated transcription programs in the absence of androgen [4]. Therefore, predicting lncRNA function offers a new way to understand regulatory mechanisms and disease pathology. There is an urgent demand for fast and accurate algorithms to predict lncRNA-disease associations.

Many computational tools have recently been developed to predict potential lncRNA-disease associations and functional patterns in biological networks [5, 6, 7, 8, 9, 10]. These computational methods fall mainly into three categories. The first is based on matrix factorization, which can be seen as a linear model of latent factors. In these methods, a latent factor is generated for each lncRNA and each disease, and the dot product of two latent factors represents their similarity. The objective of matrix factorization is to learn the latent factors that minimize the prediction error. These methods have recently been widely used to predict lncRNA-disease relationships. For example, MFLDA reduces high-dimensional heterogeneous data sources into low-rank matrices via matrix tri-factorization, which helps to explore and exploit their intrinsic and shared structure [11]. SIMCLDA translates the lncRNA-disease association prediction problem into a recommendation problem, which can be solved with inductive matrix completion (IMC) [12]. However, matrix factorization carries the risk of over-fitting and can have high time complexity.

The second type of method is based on the idea of "guilt-by-association". These methods are guided by the assumption that similar diseases or lncRNAs have similar connection patterns. If disease A and lncRNA A are known to be related, and disease A and disease B are very similar, we can infer that disease B may also be related to lncRNA A. The performance of these algorithms therefore depends heavily on the accuracy of the similarity measures. Many "guilt-by-association" algorithms have been proposed. For example, RWRlncD infers potential human lncRNA-disease associations by implementing the random walk with restart method on a lncRNA functional similarity network [13]. IRWRLDA predicts novel lncRNA-disease associations by integrating known lncRNA-disease associations, disease semantic similarity, and various lncRNA similarity measures, and makes predictions based on an improved random walk with restart [14].

The third type of method focuses on classification. Features are extracted from the complex network, and binary classifiers are then applied to predict whether a connection exists between a lncRNA and a disease. A typical algorithm of this type is LRLSLDA, which constructs a cost function in the lncRNA and disease spaces and makes predictions by combining several classifiers in these spaces into a single classifier [15]. LDAP predicts potential lncRNA-disease associations by using a bagging SVM classifier based on lncRNA similarity and disease similarity [16].

In this paper, we propose a novel algorithm, BiWalkLDA, to predict potential lncRNA-disease associations. The design of BiWalkLDA is guided by the "guilt-by-association" assumption. To construct a more accurate similarity network, we integrate two types of data: interaction profiles and gene ontology. Furthermore, our method is designed to solve the cold-start problem. BiWalkLDA uses a bi-random walk algorithm to predict lncRNA-disease associations based on the similarity networks we construct. The experiments were carried out on three real datasets downloaded from the LncRNADisease database [17]. Algorithm performance was evaluated by leave-one-out cross validation (LOOCV). Results show that BiWalkLDA outperforms four other state-of-the-art algorithms, and that it is robust across different datasets and parameters in predicting novel lncRNA-disease associations.

## Methods

### Construction of disease similarity networks

\(d_i\) and \(d_j\), respectively. Like previous algorithms, we also construct a disease similarity network by using known disease-lncRNA associations. The construction process can be divided into two steps. (1) We construct an adjacency matrix \(A_{n_{l} \times n_{d}}\), where \(n_l\) is the number of lncRNAs and \(n_d\) is the number of diseases. \(A_{ij}=1\) indicates that the \(i\)th lncRNA is associated with disease \(d_j\); otherwise \(A_{ij}=0\). (2) With the matrix \(A\), we refer to the \(i\)th column of \(A\) as \(IP(d(i))\), the interaction profile of disease \(d_i\). \(IP(d(i))\) is a binary vector of length \(n_l\) and represents the association pattern of disease \(d(i)\). We then calculate the similarity between two diseases based on the Gaussian kernel:

\[S_{GKD}(d(i), d(j)) = \exp\left(-\gamma_d \left\lVert IP(d(i)) - IP(d(j)) \right\rVert^{2}\right)\]

where \(\gamma_d\) is the bandwidth of the kernel, which is calculated as follows:

\[\gamma_d = 1 \Big/ \left(\frac{1}{n_d} \sum_{i=1}^{n_d} \left\lVert IP(d(i)) \right\rVert^{2}\right)\]

where \(n_d\) is the number of diseases. So far, we have constructed \(S_{GKD}\) based on known associations between lncRNAs and diseases, and \(S_{GO}\) based on disease-related GO sets. We then use a simple linear model to fuse the two similarity networks:

\[S_d = \alpha S_{GO} + (1 - \alpha) S_{GKD}\]

Here *α* is a hyperparameter that controls the proportion of \(S_{GKD}\) and \(S_{GO}\). If *α*=1, disease similarity is calculated only from gene ontology information. If *α*=0, disease similarity is calculated only from known disease-lncRNA associations. When the association matrix is sparse, a larger *α* is preferable so that similarity can be gained from gene ontology. This technique makes the algorithm more robust.
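The construction above can be illustrated with a small sketch in plain Python. It assumes the standard Gaussian interaction profile kernel, with the bandwidth taken as the inverse of the mean squared profile norm, and the linear fusion controlled by *α*. The function names, the toy adjacency matrix, and the placeholder GO similarity values are hypothetical and are not taken from the paper's data.

```python
import math

def gaussian_profile_similarity(profiles):
    """Gaussian kernel similarity between interaction profiles (one profile per row)."""
    n = len(profiles)
    # Bandwidth (assumed form): inverse of the mean squared norm of all profiles.
    mean_sq_norm = sum(sum(x * x for x in p) for p in profiles) / n
    gamma = 1.0 / mean_sq_norm
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            sim[i][j] = math.exp(-gamma * sq_dist)
    return sim

def fuse(s_go, s_gkd, alpha):
    """Linear fusion: alpha * S_GO + (1 - alpha) * S_GKD, element by element."""
    return [[alpha * g + (1 - alpha) * k for g, k in zip(row_go, row_gkd)]
            for row_go, row_gkd in zip(s_go, s_gkd)]

# Toy adjacency matrix A: 3 lncRNAs (rows) x 3 diseases (columns).
A = [[1, 0, 1],
     [1, 1, 0],
     [0, 0, 1]]

# Disease interaction profiles are the columns of A.
disease_profiles = [list(col) for col in zip(*A)]
s_gkd = gaussian_profile_similarity(disease_profiles)

# Placeholder GO-based similarity (made-up values for illustration only).
s_go = [[1.0, 0.2, 0.5],
        [0.2, 1.0, 0.1],
        [0.5, 0.1, 1.0]]

s_d = fuse(s_go, s_gkd, alpha=0.2)
```

The same `gaussian_profile_similarity` function applies to lncRNAs when it is given the rows of `A` instead of its columns. The sketch does not guard against an all-zero association matrix, for which the bandwidth would be undefined.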

### Construction of lncRNA similarity network

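We use \(IP(l(i))\), the \(i\)th row of \(A\), to represent the interaction profile of lncRNA \(l(i)\). \(IP(l(i))\) is a binary vector of length \(n_d\) and represents the association pattern of lncRNA \(l(i)\). The lncRNA Gaussian similarity is then calculated with the following formulas:

\[S_{GKL}(l(i), l(j)) = \exp\left(-\gamma_l \left\lVert IP(l(i)) - IP(l(j)) \right\rVert^{2}\right)\]

\[\gamma_l = 1 \Big/ \left(\frac{1}{n_l} \sum_{i=1}^{n_l} \left\lVert IP(l(i)) \right\rVert^{2}\right)\]

where \(\gamma_l\) is the bandwidth of the kernel and \(n_l\) is the number of lncRNAs.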

### Calculation of interaction profiles for new lncRNAs

Taking lncRNA \(l(i)\) as an example, the neighbors of \(l(i)\) are selected by a similarity threshold in which \(n_l\) is the number of lncRNAs: if the similarity between \(l(i)\) and \(l(j)\) is larger than the mean of the similarity, \(l(j)\) is defined as a neighbor of \(l(i)\). \(IP(l(i))\) is then set to the mean of its neighbors' interaction profiles:

\[IP(l(i)) = \frac{1}{|N(lnc_i)|} \sum_{l(j) \in N(lnc_i)} IP(l(j))\]

where \(N(lnc_i)\) is the set of neighbors of lncRNA \(l(i)\) and \(|N(lnc_i)|\) is the size of \(N(lnc_i)\).

Notice that our approach differs from the traditional approach to the cold-start problem. Typically, the traditional method fills in the profile of a new lncRNA with the mean interaction profile of all other lncRNAs, which amounts to making predictions based on popularity. In contrast, BiWalkLDA uses local topological structure to predict missing interactions. Given a new lncRNA, we first find all its similar (or nearest) lncRNAs, which are likely to share common disease interactors with the node of interest. The key point is therefore the definition of the similarity function. Unlike other algorithms, we assume that lncRNAs sparsely connected to diseases contribute more to the given node, meaning that they are likely to share common disease nodes. For example, suppose an inactive user didn't buy Harry Potter, although the book is a best seller. How likely is a new user to buy the book? In our model, new users are more likely to learn from inactive users.
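The neighbor-based filling of a new lncRNA's interaction profile can be sketched as follows. This is an illustration under assumptions rather than the authors' implementation: the threshold is taken as the mean of the new lncRNA's similarities to all other lncRNAs (the text only says "the mean of the similarity"), and the function name, the similarity values, and the toy profiles are hypothetical.

```python
def fill_new_profile(new_idx, profiles, sim_row):
    """Estimate the profile of lncRNA `new_idx` as the mean profile of its neighbors.

    profiles: list of binary interaction profiles, one per lncRNA.
    sim_row:  similarity of lncRNA `new_idx` to every lncRNA (same order as profiles).
    """
    others = [j for j in range(len(profiles)) if j != new_idx]
    # Assumed threshold: mean similarity between the new lncRNA and all others.
    threshold = sum(sim_row[j] for j in others) / len(others)
    neighbors = [j for j in others if sim_row[j] > threshold]
    if not neighbors:
        # No lncRNA exceeds the threshold: leave the profile unchanged.
        return list(profiles[new_idx])
    n_diseases = len(profiles[new_idx])
    return [sum(profiles[j][c] for j in neighbors) / len(neighbors)
            for c in range(n_diseases)]

# Toy data: lncRNA 0 is new (all-zero profile); lncRNA 2 is densely connected.
profiles = [[0, 0, 0],
            [1, 0, 0],
            [1, 1, 1],
            [0, 1, 0]]
sim_row = [1.0, 0.6, 0.1, 0.5]  # hypothetical similarities of lncRNA 0 to all lncRNAs

filled = fill_new_profile(0, profiles, sim_row)
print(filled)  # [0.5, 0.5, 0.0]
```

In this toy example the densely connected lncRNA 2 falls below the threshold, so only the sparsely connected lncRNAs 1 and 3 contribute, which mirrors the "inactive user" intuition described above.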

### The algorithm of Bi-random walk

The prediction based on disease similarity is computed as

\[a_{ij} = \sum_{k=1}^{n_d} a_{ik} \cdot sim_d(k, j)\]

where \(a_{ij}\) represents the possibility that lncRNA(i) and disease(j) are related, and \(sim_d(k, j)\) represents the similarity of disease(k) and disease(j). The calculation therefore traverses every disease \(k\) and sums up \(a_{ik} \cdot sim_d(k, j)\). It can be seen as a linear model based on similarity. Because we want to keep part of the original \(a_{ij}\), the formula can be written as:

\[a_{ij} = \beta \sum_{k=1}^{n_d} a_{ik} \cdot sim_d(k, j) + (1 - \beta)\, a_{ij}\]

Here \(a_{ij}\) is always less than 1. The formula above makes predictions based on disease similarity. Similarly, we can make predictions based on lncRNA similarity and then combine the two results to make the final prediction. The whole algorithm can thus be divided into three steps: (1) we predict new scores based on disease similarity and lncRNA similarity according to the random walk algorithm; (2) we use the mean of the two scores as the result of this round of prediction; (3) the first two steps are repeated until the maximum number of iterations is reached.

In detail, we first row-normalize both the lncRNA similarity network and the disease similarity network. Because random walk is a linear prediction model based on similarity, the similarity should be normalized so that the prediction results are between 0 and 1:

\[S_d(i, j) = \frac{S_d(i, j)}{\sum_{k=1}^{n_d} S_d(i, k)}\]

where the denominator is the sum of the \(i\)th row of \(S_d\). Similarly, we normalize the lncRNA similarity with the following formula:

\[S_l(i, j) = \frac{S_{GKL}(i, j)}{\sum_{k=1}^{n_l} S_{GKL}(i, k)}\]

where the denominator is the sum of the \(i\)th row of \(S_{GKL}\). The adjacency matrix \(A\) also needs to be initialized. Scores of all known lncRNA-disease associations are set to \(1/n\), where \(n\) is the total number of known lncRNA-disease associations. Scores of the unobserved associations are set to zero:

\[S_{ini} = \frac{A}{n}, \qquad n = \sum_{i,j} A_{ij}\]

\(S_{ini}\) represents the initial probability, and the sum of the initial probabilities is 1. Because the importance of predictions based on different similarity networks may differ, we introduce two parameters, \(l\) and \(r\), as the maximal numbers of iterations of the left and right random walks on the two networks. The more iterations, the more important the prediction through that similarity network. The iterative process can be described by the following formulas:

\[R_l = \beta \cdot S_l \cdot S_{ini}^{t-1} + (1 - \beta) \cdot S_{ini}\]

\[R_d = \beta \cdot S_{ini}^{t-1} \cdot S_d + (1 - \beta) \cdot S_{ini}\]

where \(S_d\) and \(S_l\) represent the disease and lncRNA similarity networks, and \(S_{ini}\) represents the initial score of all disease-lncRNA associations. \(\beta\) is the decay factor, which controls the degree of retention of initial information. \(R_l\) represents the score of the random walk on the lncRNA similarity network and \(R_d\) represents the score of the random walk on the disease similarity network. In the iterative function, we use the average of \(R_d\) and \(R_l\) as \(S_{ini}^{t}\) in step \(t\). This process can be seen as a combination of lncRNA similarity and disease similarity to make predictions. When the number of iterations reaches \(\max(l, r)\), \(S_{ini}^{t}\) is the final result, which represents the possibilities of all lncRNA-disease associations. The pseudocode of the bi-random walk algorithm can be seen in Algorithm 1.
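A minimal sketch of the bi-random walk iteration in plain Python is given below. It assumes the update \(\beta \cdot S_l \cdot S^{t-1} + (1-\beta) \cdot S_{ini}\) on the lncRNA side and \(\beta \cdot S^{t-1} \cdot S_d + (1-\beta) \cdot S_{ini}\) on the disease side, and it assumes that once \(t\) exceeds \(l\) or \(r\) only the walk that is still active contributes to the average. Algorithm 1 is not reproduced in this text, so these details, the function names, and the toy matrices are illustrative assumptions.

```python
def matmul(x, y):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def row_normalize(m):
    """Divide every row by its sum (rows summing to zero are left as zeros)."""
    out = []
    for row in m:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out

def blend(walked, s_ini, beta):
    """beta * walked + (1 - beta) * s_ini, element by element."""
    return [[beta * w + (1 - beta) * s for w, s in zip(row_w, row_s)]
            for row_w, row_s in zip(walked, s_ini)]

def bi_random_walk(A, s_l, s_d, beta=0.8, l_steps=2, r_steps=2):
    """Score all lncRNA-disease pairs; A is n_l x n_d, s_l is n_l x n_l, s_d is n_d x n_d."""
    n_known = sum(sum(row) for row in A)
    s_ini = [[v / n_known for v in row] for row in A]   # initial probabilities sum to 1
    s_l, s_d = row_normalize(s_l), row_normalize(s_d)
    score = s_ini
    for t in range(1, max(l_steps, r_steps) + 1):
        walks = []
        if t <= l_steps:   # walk on the lncRNA similarity network (left)
            walks.append(blend(matmul(s_l, score), s_ini, beta))
        if t <= r_steps:   # walk on the disease similarity network (right)
            walks.append(blend(matmul(score, s_d), s_ini, beta))
        # Average the walks that were performed in this round.
        score = [[sum(vals) / len(walks) for vals in zip(*rows)] for rows in zip(*walks)]
    return score

# Toy example: 3 lncRNAs x 2 diseases with made-up similarity matrices.
A = [[1, 0],
     [1, 1],
     [0, 1]]
s_l = [[1.0, 0.5, 0.1],
       [0.5, 1.0, 0.4],
       [0.1, 0.4, 1.0]]
s_d = [[1.0, 0.3],
       [0.3, 1.0]]

scores = bi_random_walk(A, s_l, s_d, beta=0.8, l_steps=2, r_steps=2)
```

In this toy run, the unobserved pair (lncRNA 0, disease 1) receives a positive score because lncRNA 0 is similar to lncRNA 1, which is associated with disease 1, and because disease 1 is similar to disease 0.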

## Data and materials

Detailed information for three datasets

| Datasets | Version | No. of lncRNAs | No. of diseases | No. of interactions |
|---|---|---|---|---|
| Dataset1 | 2012 | 112 | 150 | 276 |
| Dataset2 | 2014 | 131 | 169 | 319 |
| Dataset3 | 2015 | 285 | 226 | 621 |

## Results

We use leave-one-out cross validation (LOOCV) to test the performance of BiWalkLDA. LOOCV is a widely used strategy for evaluating the quality of algorithms. In each round, one known association is held out as the test sample, and all other lncRNA-disease associations form the training set. All unobserved associations are treated as the candidate set and are scored by BiWalkLDA. A corresponding ranked list is generated from the predicted scores. True positive rates (TPR, sensitivity) and false positive rates (FPR, 1-specificity) are then calculated at different thresholds. Based on the calculated values of TPR and FPR, the receiver-operating characteristics (ROC) curve can be plotted. We use the area under the ROC curve (AUC) as the evaluation criterion, which reflects the global prediction accuracy. An AUC of 1 indicates perfect prediction, while an AUC of 0.5 indicates purely random performance.
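The LOOCV procedure can be sketched as below. This is a simplified illustration, not the evaluation code used in the paper: it assumes that the AUC is computed per held-out association as the fraction of unobserved candidates ranked below it (ties counted as one half) and then averaged over all folds, whereas a pooled ROC curve may give slightly different values. The function names and the two baseline predictors are hypothetical.

```python
def loocv_auc(A, predict):
    """Average per-fold AUC under leave-one-out cross validation.

    A:       binary lncRNA-disease adjacency matrix (list of rows).
    predict: function mapping a training matrix to a score matrix of the same shape.
    """
    known = [(i, j) for i, row in enumerate(A) for j, v in enumerate(row) if v == 1]
    unobserved = [(i, j) for i, row in enumerate(A) for j, v in enumerate(row) if v == 0]
    fold_aucs = []
    for ti, tj in known:
        train = [list(row) for row in A]
        train[ti][tj] = 0                      # hide the test association
        scores = predict(train)
        test_score = scores[ti][tj]
        candidates = [scores[i][j] for i, j in unobserved]
        below = sum(1 for s in candidates if s < test_score)
        ties = sum(1 for s in candidates if s == test_score)
        fold_aucs.append((below + 0.5 * ties) / len(candidates))
    return sum(fold_aucs) / len(fold_aucs)

A = [[1, 0, 1],
     [0, 1, 0]]

def oracle(train):
    """Hypothetical perfect predictor: always returns the true matrix."""
    return A

def constant(train):
    """Hypothetical uninformative predictor: the same score everywhere."""
    return [[0.0 for _ in row] for row in train]

print(loocv_auc(A, oracle))    # 1.0
print(loocv_auc(A, constant))  # 0.5
```

Any scoring function with the same signature, such as a bi-random walk predictor, can be passed as `predict`.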

### The effects of parameters

#### The effects of *α*

*α* is a hyperparameter that controls the proportion of \(S_{GO}\) and \(S_{GKD}\) in the fused disease similarity. If *α* = 1, disease similarity is calculated only from gene ontology information. If *α* = 0, disease similarity is calculated only from known disease-lncRNA associations. BiWalkLDA uses gene ontology information as a supplement to \(S_{GKD}\), which strengthens the generalization ability of the algorithm. To test the performance of the algorithm under different *α* values, we varied *α* from 0 to 1 in steps of 0.1 and used BiWalkLDA to make predictions. The experimental results are shown in Fig. 1. When *α* = 0.1, BiWalkLDA obtains the best results on dataset1 and dataset2. On dataset3, it peaks at *α* = 0.3. Small changes in *α* do not have much impact on the results. Therefore, we recommend setting *α* between 0.1 and 0.3 when using BiWalkLDA. The experimental results show that the fusion of \(S_{GKD}\) and \(S_{GO}\) can improve the accuracy of the algorithm. Meanwhile, the algorithm achieves good performance even if only the GO similarity network is used, which indicates that the algorithm still works in the absence of disease-lncRNA association information.

#### The effects of *β*

*β* is the decay factor in the bi-random walk algorithm. It determines the degree of retention of initial information in each iteration. If *β* = 0, all initial information is retained. If *β* = 1, all initial information is used to predict new scores in each round. Both extremes are inappropriate and result in poor performance. To test the performance of the algorithm under different *β* values, we varied *β* from 0 to 1 in steps of 0.1 and used BiWalkLDA to make predictions. The experimental results are shown in Fig. 2. When 0.1 ≤ *β* ≤ 0.9, the results vary only slightly, which indicates that BiWalkLDA is robust to *β*. BiWalkLDA achieves its best AUC at *β* = 0.8 on dataset1 and dataset2, and at *β* = 0.7 on dataset3. Intuitively, if the initial data are sufficient, a smaller *β* is more appropriate. Because dataset3 contains more known lncRNA-disease associations, the optimal *β* on dataset3 is smaller than on the other datasets. Finally, we set *β* = 0.8 as the default for all three datasets.

#### The effects of l and r

The effects of parameters l and r in dataset1

| | r = 1 | r = 2 | r = 3 | r = 4 | r = 5 | r = 6 | r = 7 |
|---|---|---|---|---|---|---|---|
| l = 1 | 0.7618 | 0.7230 | 0.6902 | 0.6714 | 0.6585 | 0.6448 | 0.6304 |
| l = 2 | 0.8124 | 0.7890 | 0.7292 | 0.6985 | 0.6802 | 0.6702 | 0.6564 |
| l = 3 | 0.8008 | 0.8214 | 0.8140 | 0.7295 | 0.7010 | 0.6838 | 0.6713 |
| l = 4 | 0.7919 | 0.8092 | 0.8230 | 0.8243 | 0.7285 | 0.7000 | 0.6850 |
| l = 5 | 0.7848 | 0.7989 | 0.8115 | 0.8238 | 0.8267 | 0.7269 | 0.6988 |
| l = 6 | 0.7778 | 0.7911 | 0.8006 | 0.8119 | 0.8236 | 0.8268 | 0.7255 |
| l = 7 | 0.7729 | 0.7834 | 0.7920 | 0.8007 | 0.8116 | 0.8233 | 0.8263 |

### Comparison with other algorithms

### De novo lncRNA-disease prediction

### Case studies

Top ten reported lncRNAs for prostate cancer

| Rank | Name of lncRNA | PMID |
|---|---|---|
| 1 | H19 | PMID: 24988946 |
| 2 | CDKN2B-AS1 | Unconfirmed |
| 3 | MALAT1 | PMID: 23845456 |
| 4 | HOTAIR | PMID: 26411689 |
| 5 | MEG3 | PMID: 26610246 |
| 6 | PVT1 | PMID: 21814516 |
| 7 | BCYRN1 | Unconfirmed |
| 8 | GAS5 | PMID: 23676682 |
| 9 | NEAT1 | PMID: 25415230 |
| 10 | UCA1 | PMID: 26550172 |

## Conclusion

Many recent studies suggest that lncRNAs are strongly associated with various complex human diseases and that they play important roles in gene expression regulation and post-transcriptional modification. Predicting lncRNA-disease associations can help us understand the biological mechanisms of disease and reduce the cost of experimental verification. However, discovering the relationship between lncRNAs and diseases by means of computational models is still a very challenging problem, and computational tools are much in demand. Although many computational models have been proposed, their prediction accuracy still has considerable room for improvement. To improve on the performance of existing algorithms, we present a novel algorithm, BiWalkLDA, based on bi-random walks for the prediction of lncRNA-disease associations. It integrates gene ontology and interaction profile data to calculate disease similarity, and it solves the cold-start problem by using the local structure of a lncRNA's neighbors. Four state-of-the-art computational methods and BiWalkLDA were applied to predict lncRNA-disease associations on three different datasets. Results show that BiWalkLDA is superior to the other existing algorithms in terms of both accuracy and recall. There are still many problems to be dealt with. Existing models are based on small-scale datasets. Although these algorithms can achieve high accuracy, their results are often repetitive, and they cannot be applied to large-scale data. In future work, we will consider developing more effective algorithms to solve this problem.

## Notes

### Acknowledgments

Many thanks go to Dr. Bolin Chen and Dr. Jiajie Peng for discussion.

### About this supplement

This article has been published as part of *BMC Bioinformatics Volume 20 Supplement 18, 2019: Selected articles from the Biological Ontologies and Knowledge bases workshop 2018*. The full contents of the supplement are available online at https://bmcbioinformatics.biomedcentral.com/articles/supplements/volume-20-supplement-18.

### Authors’ contributions

JH designed the computational framework; YG, JL, YZ, and JW performed all the analyses of the data and wrote the manuscript; XS is the major coordinator, who contributed considerable time and effort to the discussion of this project. All authors read and approved the final manuscript.

### Funding

Publication costs were funded by the National Natural Science Foundation of China (Grant No. 61702420); This project has also been funded by the National Natural Science Foundation of China (Grant No. 61332014, 61702420 and 61772426); the China Postdoctoral Science Foundation (Grant No. 2017M613203); the Natural Science Foundation of Shaanxi Province (Grant No. 2017JQ6037); the Fundamental Research Funds for the Central Universities (Grant No. 3102018zy032); the Top International University Visiting Program for Outstanding Young Scholars of Northwestern Polytechnical University.

### Ethics approval and consent to participate

Not applicable

### Consent for publication

Not applicable

### Competing interests

The authors declare that they have no competing interests.

## References

- 1. Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, et al. Initial sequencing and analysis of the human genome. Nature. 2001; 3(6822):346.
- 2. Djebali S, Davis CA, Merkel A, Dobin A, Lassmann T, Mortazavi AM, et al. Landscape of transcription in human cells. Nature. 2012; 489(7414):101.
- 3. Guttman M, Amit I, Garber M, French C, Lin MF, Feldser D, et al. Chromatin signature reveals over a thousand highly conserved large non-coding RNAs in mammals. Nature. 2009; 458(7235):223.
- 4. Zhang A, Zhao J, Kim J, et al. LncRNA HOTAIR enhances the androgen-receptor-mediated transcriptional program and drives castration-resistant prostate cancer. Cell Rep. 2015; 13(1):209–21.
- 5. Hu J, Gao Y, Zheng Y, Shang X. KF-finder: Identification of key factors from host-microbial networks in cervical cancer. BMC Syst Biol. 2018; 12(S4):54.
- 6. Hu J, Gao Y, He J, Zheng Y, Shang X. WebNetCoffee: a webbased application to identify functionally conserved proteins from Multiple PPI networks. BMC Bioinformatics. 2018; 19(1):422.
- 7. Hu J, Zheng Y, Shang X. MiteFinderII: a novel tool to identify miniature inverted-repeat transposable elements hidden in eukaryotic genomes. BMC Med Genomics. 2018; 11(S5):101.
- 8. Hu J, Shang X. Detection of Network Motif Based on a Novel Graph Canonization Algorithm from Transcriptional Regulation Networks. Molecules. 2017; 22(12):2194.
- 9. Hu J, Wang J, Li J, Lin J, Liu T, Zhong Y, Liu J, Zheng Y, Gao Y, He J, Shang X. MD-SVM: A novel SVM-based algorithm for the motif discovery of transcription factor binding sites. BMC Bioinformatics. 2019; 20(S7). https://doi.org/10.1186/s12859-019-2735-3.
- 10. Peng J, Guan J, Shang X. Predicting Parkinson’s disease genes based on node2vec and autoencoder. Front Genet. 2019; 10. https://doi.org/10.3389/fgene.2019.00226.
- 11. Fu G, Wang J, Domeniconi C, Yu G. Matrix factorization based data fusion for the prediction of lncRNA-disease associations. Bioinformatics. 2017; 34(9):1529–37.
- 12. Lu C, Yang M, Luo F, Wu FX, Li M, Pan Y, et al. Prediction of lncRNA-disease associations based on inductive matrix completion. Bioinformatics. 2018; 34(19):3357–64. https://doi.org/10.1093/bioinformatics/bty327.
- 13. Sun J, Shi H, Wang Z, Zhang C, Liu L, Wang L, et al. Inferring novel lncRNA-disease associations based on a random walk model of a lncRNA functional similarity network. Mol Biosyst. 2014; 10(8):2074–081.
- 14. Chen X, You ZH, Yan GY, Gong DW. IRWRLDA: improved random walk with restart for lncRNA-disease association prediction. Oncotarget. 2016; 7(36):57919–31.
- 15. Chen X, Yan GY. Novel human lncRNA-disease association inference based on lncRNA expression profiles. Bioinformatics. 2013; 29(20):2617–24.
- 16. Lan W, Li M, Zhao K, Liu J, Wu FX, Pan Y, et al. LDAP: a web server for lncRNA-disease association prediction. Bioinformatics. 2017; 33(3):458–60.
- 17. Chen G, Wang Z, Wang D, Qiu C, Liu M, Chen X, et al. LncRNADisease: a database for long-non-coding RNA-associated diseases. Nucleic Acids Res. 2013; 41(Database issue):D983–D986.
- 18. Chen X. KATZLDA: KATZ measure for the lncRNA-disease association prediction. Sci Rep. 2014; 5:16840.
- 19. Aken BL, Ayling S, Barrell D, Clarke L, Curwen V, Fairley S, et al. The Ensembl gene annotation system. Database J Biol Databases Curation. 2016; 2016:baw093. https://doi.org/10.1093/database/baw093.
- 20. Bauer-Mehren A, Rautschka M, Sanz F, Furlong LI. DisGeNET. Bioinformatics. 2010; 26(22):2924–292.
- 21. Chen X, Huang YA, You ZH, Yan GY, Wang XS. A novel approach based on KATZ measure to predict associations of human microbiota with non-infectious diseases. Bioinformatics. 2016; 33(5):733–9.
- 22. Zhu M, Chen Q, Liu X, Sun Q, Zhao X, Deng R, et al. LncRNA H19/miR-675 axis represses prostate cancer metastasis by targeting TGFBI. FEBS J. 2015; 281(16):3766–75.
- 23. Ren S, Liu Y, Xu W, Sun Y, Lu J, Wang F, et al. Long noncoding RNA MALAT-1 is a new potential therapeutic target for castration resistant prostate cancer. J Urol. 2013; 190(6):2278–87.
- 24. Luo G, Wang M, Wu X, Tao D, Xiao X, Wang L, et al. Long non-coding RNA MEG3 inhibits cell proliferation and induces apoptosis in prostate cancer. Cell Physiol Biochem. 2015; 37(6):2209.
- 25. Meyer KB, Maia AT, O’Reilly M, Ghoussaini M, Prathalingam R, Portergill P, et al. A functional variant at a prostate cancer predisposition locus at 8q24 is associated with PVT1 expression. PLoS Genet. 2011; 7(7):e1002165.
- 26. Pickard MR, Mourtadamaarabouni M, Williams GT. Long non-coding RNA GAS5 regulates apoptosis in prostate cancer cell lines. Biochim Biophys Acta. 2013; 1832(10):1613–23.
- 27. Chakravarty D, Sboner A, Nair SS, Giannopoulou E, Li R, Hennig S, et al. The oestrogen receptor alpha-regulated lncRNA NEAT1 is a critical modulator of prostate cancer. Nat Commun. 2014; 5:5383.
- 28. Na XY, Liu ZY, Ren PP, Yu R, Shang XS. Long non-coding RNA UCA1 contributes to the progression of prostate cancer and regulates proliferation through KLF4-KRT6/13 signaling pathway. Int J Clin Exp Med. 2015; 8(8):12609–16.

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License(http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.