A Deep Neural Network Model with Attribute Network Representation for lncRNA-Protein Interaction Prediction
- Authors: Wei M. (1), Yu C. (1), Li L. (2), You Z. (3), Lei-Wang (4)
- Affiliations:
  1. School of Information Engineering, Xijing University
  2. College of Agriculture and Forestry, Longdong University
  3. School of Computer Science, Northwestern Polytechnical University
  4. Guangxi Key Lab of Human-Machine Interaction and Intelligent Decision, Guangxi Academy of Sciences
- Issue: Vol 19, No 4 (2024)
- Pages: 341-351
- Section: Life Sciences
- URL: https://snv63.ru/1574-8936/article/view/643857
- DOI: https://doi.org/10.2174/0115748936267109230919104630
- ID: 643857
Abstract
Background: Long non-coding RNA (lncRNA) not only helps regulate the biological functions of protein-coding genes; its dysfunction is also associated with the onset and progression of various diseases. Studies have shown that an in-depth understanding of the mechanisms of lncRNA action is of great significance for disease treatment. However, traditional wet-lab experiments are time-consuming, laborious, and expensive, and they involve subjective factors that may affect experimental accuracy.
Objective: Most methods for predicting lncRNA-protein interactions (LPIs) rely on a single feature or use noisy features. To address this problem, we propose CSALPI, a computational model based on a deep neural network.
Methods: First, the model uses cosine similarity to extract lncRNA-lncRNA and protein-protein similarity features and denoises them with a sparse autoencoder. Second, a neighbor enhancement autoencoder reconstructs the denoised features so that neighboring nodes receive similar representations. Finally, a Light Gradient Boosting Machine (LightGBM) classifier predicts potential LPIs.
Results: To demonstrate the reliability of CSALPI, multiple evaluation metrics were used under 5-fold cross-validation, and excellent results were achieved. In the case study, the model successfully predicted 7 out of 10 disease-associated lncRNA-protein pairs.
Conclusion: CSALPI can serve as an effective computational complement to biological experiments for predicting potential LPIs.
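
The similarity step and the 5-fold evaluation split described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not say which vectors are compared, so the sketch assumes each lncRNA is represented by its row in a binary lncRNA-protein interaction matrix and each protein by its column. The toy matrix, the function names `cosine_similarity_matrix` and `k_fold_indices`, the zero-vector convention, and the sample count of 12 are all hypothetical. The sparse autoencoder, the neighbor enhancement autoencoder, and the LightGBM classifier are not shown.

```python
import math
import random


def cosine_similarity_matrix(profiles):
    """Return the pairwise cosine similarity between the rows of `profiles`."""
    norms = [math.sqrt(sum(x * x for x in row)) for row in profiles]
    n = len(profiles)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            if norms[i] == 0.0 or norms[j] == 0.0:
                # Assumption: an all-zero profile is treated as similarity 0.
                value = 0.0
            else:
                dot = sum(a * b for a, b in zip(profiles[i], profiles[j]))
                value = dot / (norms[i] * norms[j])
            sim[i][j] = sim[j][i] = value
    return sim


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and split them into k near-equal test folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


# Hypothetical toy data: rows are lncRNAs, columns are proteins,
# and 1 marks a known interaction.
interactions = [
    [1, 0, 1, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 0],
]

# lncRNA-lncRNA similarity compares rows of the interaction matrix.
lnc_sim = cosine_similarity_matrix(interactions)

# Protein-protein similarity compares columns, i.e. rows of the transpose.
prot_sim = cosine_similarity_matrix([list(col) for col in zip(*interactions)])

# Five disjoint test folds over 12 hypothetical lncRNA-protein samples.
folds = k_fold_indices(n_samples=12, k=5)
```

In each cross-validation round, one fold would serve as the test set and the remaining four as the training set, so every sample is tested exactly once.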
About the authors
Meng-Meng Wei
School of Information Engineering, Xijing University
Email: info@benthamscience.net

Chang-Qing Yu
School of Information Engineering, Xijing University
Author for correspondence.
Email: info@benthamscience.net

Li-Ping Li
College of Agriculture and Forestry, Longdong University
Author for correspondence.
Email: info@benthamscience.net

Zhu-Hong You
School of Computer Science, Northwestern Polytechnical University
Email: info@benthamscience.net

Lei-Wang
Guangxi Key Lab of Human-Machine Interaction and Intelligent Decision, Guangxi Academy of Sciences
Email: info@benthamscience.net