*Result*: A Comparison of Three Machine Learning Methods for Multivariate Genomic Prediction Using the Sparse Kernels Method (SKM) Library.

Title:
A Comparison of Three Machine Learning Methods for Multivariate Genomic Prediction Using the Sparse Kernels Method (SKM) Library.
Authors:
Montesinos-López OA; Facultad de Telemática, Universidad de Colima, Colima 28040, Mexico., Montesinos-López A; Centro Universitario de Ciencias Exactas e Ingenierías (CUCEI), Universidad de Guadalajara, Guadalajara 44100, Mexico., Cano-Paez B; Facultad de Ciencias, Universidad Nacional Autónoma de México (UNAM), México City 04510, Mexico., Hernández-Suárez CM; Instituto de Ciencias Tecnología e Innovación, Universidad Francisco Gavidia, El Progreso St., No. 2748, Colonia Flor Blanca, San Salvador CP 1101, El Salvador., Santana-Mancilla PC; Facultad de Telemática, Universidad de Colima, Colima 28040, Mexico., Crossa J; International Maize and Wheat Improvement Center (CIMMYT), Texcoco 56237, Mexico.; Colegio de Postgraduados, Montecillo 56230, Mexico.
Source:
Genes [Genes (Basel)] 2022 Aug 21; Vol. 13 (8). Date of Electronic Publication: 2022 Aug 21.
Publication Type:
Journal Article; Research Support, Non-U.S. Gov't
Language:
English
Journal Info:
Publisher: MDPI; Country of Publication: Switzerland; NLM ID: 101551097; Publication Model: Electronic; Cited Medium: Internet; ISSN: 2073-4425 (Electronic); Linking ISSN: 20734425; NLM ISO Abbreviation: Genes (Basel); Subsets: MEDLINE
Imprint Name(s):
Original Publication: Basel : MDPI
Contributed Indexing:
Keywords: genomic selection; multi-environment; multi-trait; plant breeding; statistical machine learning
Entry Date(s):
Date Created: 20220826; Date Completed: 20220829; Latest Revision: 20250728
Update Code:
20260130
PubMed Central ID:
PMC9407886
DOI:
10.3390/genes13081494
PMID:
36011405
Database:
MEDLINE

*Further Information*

*Genomic selection (GS) has changed the way plant breeders select genotypes. GS uses phenotypic and genotypic information to train a statistical machine learning model, which is then used to predict phenotypic (or breeding) values of new lines for which only genotypic information is available. Many statistical machine learning methods have been proposed for this task. Multi-trait (MT) genomic prediction models take advantage of correlated traits to improve prediction accuracy, and for this reason some multivariate statistical machine learning methods are popular for GS. In this paper, we compare the prediction performance of three MT methods: the MT genomic best linear unbiased predictor (GBLUP), the MT partial least squares (PLS) and the MT random forest (RF) methods. Benchmarking was performed with six real datasets. We found that the three methods produce similar results, but with genotype (G) and environment (E) as predictors, that is, E + G, the MT GBLUP achieved superior performance, whereas with the predictors E + G + genotype × environment (GE) and G + GE, random forest achieved the best results. We also found that the best predictions were achieved with the predictors E + G and E + G + GE. We also provide the R code for implementing these three statistical machine learning methods in the Sparse Kernels Method (SKM) library, which offers options not only for single-trait prediction with various statistical machine learning methods but also for MT prediction, which can help to better capture the complex patterns in the datasets that are common in genomic selection.*