Adaptability and phenotypic stability of soybean genotypes regarding epicotyl length using artificial neural network and non-parametric test



INTRODUCTION
Soybean (Glycine max (L.) Merrill) is an important source of protein and vegetable oil worldwide and can be used for human food and animal feed (Gontijo et al., 2023; Paixão et al., 2023; Pradebon et al., 2023). The soybean crop has one of the most historically successful programs and has driven changes in Brazilian agriculture, namely the professionalization and specialization of agriculture in some regions of Brazil; it currently has the largest cultivated area and production in the country (Silva, Silva, Bezerra, & Sediyama, 2021). In the 2019/2020 agricultural season, Brazil became the world's largest soybean producer (Embrapa Soja, 2021). According to Oda et al. (2015), part of this success is due to the genetic improvement programs of several Brazilian research institutions and universities. Moreover, multidisciplinarity has become a mandatory requirement in soybean breeding, from the planning of potential parents to the commercialization of seeds of the new cultivar (Matsuo, Borém, & Sediyama, 2021b).
Changes in soybean improvement have been observed after the approval of the Cultivar Protection Law, in the sense of increasing the number and quality of cultivars available to rural producers. That happened as a result of greater investments by different research institutions, in addition to the number of companies (public and private) active in soybean improvement, internationalization of research (mainly in biotechnology topics) and improvements in the field research network (Matsuo, Borém, Sediyama, & Ferreira, 2021a).
Soybean cultivars are differentiated by several phenotypic characters, such as anthocyanin pigmentation of the hypocotyl, predominant flower color and type of plant growth, among others. Nogueira et al. (2008) suggested the need to expand the list of descriptors used in this differentiation. In this sense, among several characters, epicotyl length has stood out for identifying differences between cultivars (Nogueira et al., 2008; Matsuo, Sediyama, Cruz, Oliveira, & Cadore, 2012a; Matsuo, Sediyama, Cruz, & Oliveira, 2012b; Chaves et al., 2017; Camargos, Campos, Alves, Ferreira, & Matsuo, 2019; Hanyu, Costa, Cecon, & Matsuo, 2020; Gontijo et al., 2021). However, no manuscripts reporting its phenotypic stability and its behavior as a function of different environments were identified in the literature.
This knowledge is important because, when a genotype is analyzed in different environments, its phenotypic value is influenced not only by the environment to which it is subjected and by its genotypic effect, but also by an additional component called the genotype x environment interaction (Cruz, Regazzi, & Carneiro, 2012). Studies of the genotype x environment interaction, despite being of great importance for breeding, do not provide detailed information about the behavior of each genotype in response to environmental variations (Cruz et al., 2012). For this purpose, these authors recommend the analysis of adaptability and stability, which makes it possible to identify cultivars with predictable behavior that are responsive to environmental variations, under specific or broad conditions. Several methodologies are available for analyzing the stability and adaptability of a group of genotypes tested in a series of environments, such as the Artificial Neural Networks of Nascimento et al. (2013) and the method of Lin and Binns (1988), modified by Carneiro (1998), among others.
The artificial neural network (ANN) methodology for phenotypic adaptability and stability of Nascimento et al. (2013) is based on training an artificial neural network according to the methodology of Eberhart & Russell (1966). Simulated genotypes are used in the training and validation of the neural network. In this way, no assumptions about the model are made, and the classification of genotypes in terms of adaptability and stability is based not only on the genotypes under study but on a large collection of genotypes simulated according to the characteristics of the experiment under study. Thus, the interpretation of adaptability by ANN should be similar to that of Eberhart & Russell (1966), whereas for stability the concept used in the ANN of Nascimento et al. (2013) is based on the work of Finlay and Wilkinson (1963), which differs from Eberhart & Russell (1966) by treating stability as invariance rather than predictability. Artificial neural networks focused on adaptability and stability analysis have already been used to evaluate the stability and adaptability of grain yield of elite soybean lines by traditional and neural network-based approaches (Oda, Sediyama, Matsuo, Nascimento, & Cruz, 2019), and to evaluate the genotype x environment interaction in soybean genotypes grown in Alto Paranaíba, Minas Gerais, identifying genotypes indicated for broad (general) environments and those recommended for favorable or unfavorable environments through different adaptability and stability methodologies (Matsuo, Dezordi, Nascimento, & Cruz, 2022).
The methodology of Lin and Binns (1988) is based on the estimation of the Pi parameter, which measures the deviation of the average of a genotype from the maximum in each environment. Carneiro (1998) proposed an improvement of the method to make it capable of determining the behavior of the genotypes in specific environments (favorable and unfavorable), in which the Pi parameter came to be interpreted as the MAEC (Measure of Adaptability and Stability of Behavior) and the environments were classified based on the environmental indexes (Cruz & Carneiro, 2006). The method of Lin and Binns (1988), modified by Carneiro (1998), by means of the bisegmented straight line, can also be applied when the interest is in the best values of the variable analyzed or in a fixed value (highest average, lowest average or a specific value) (Cruz & Carneiro, 2006).
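As an illustration of the Pi parameter described above, the following minimal sketch assumes the usual form Pi = Σj (Yij − Mj)² / (2n), where Yij is the mean of genotype i in environment j, Mj is the maximum mean in environment j and n is the number of environments. The split into favorable and unfavorable environments by the sign of the environmental index follows the idea attributed to Carneiro (1998). The function name and the data are hypothetical and are not taken from the experiments.

```python
from statistics import mean


def lin_binns_pi(means):
    """means[i][j] is the mean of genotype i in environment j.

    Returns one tuple per genotype: (P general, P favorable, P unfavorable).
    """
    n_env = len(means[0])
    # Reference of Lin and Binns: the maximum mean observed in each environment.
    env_max = [max(row[j] for row in means) for j in range(n_env)]
    # Environmental index: environment mean minus grand mean (balanced data assumed).
    env_mean = [mean(row[j] for row in means) for j in range(n_env)]
    grand_mean = mean(env_mean)
    index = [m - grand_mean for m in env_mean]
    favorable = [j for j in range(n_env) if index[j] >= 0]
    unfavorable = [j for j in range(n_env) if index[j] < 0]

    def p(row, envs):
        # Mean squared distance to the environment maximum, divided by 2.
        if not envs:
            return float("nan")
        return sum((row[j] - env_max[j]) ** 2 for j in envs) / (2 * len(envs))

    return [(p(row, range(n_env)), p(row, favorable), p(row, unfavorable))
            for row in means]


# Hypothetical means (mm) for 3 genotypes in 3 environments.
example_means = [[50, 60, 70], [40, 60, 80], [50, 50, 60]]
pi_values = lin_binns_pi(example_means)
```

Lower values indicate genotypes closer to the maximum across the environments considered; in this hypothetical data set the third genotype has the largest general Pi.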
The identification of genotypes that are responsive or not to environmental improvement is important for soybean breeding, specifically regarding the differentiation of genotypes. Thus, the objective was to identify soybean genotypes that present low or high averages, are highly stable across the environments analyzed and show adaptability to different environments.

MATERIAL AND METHODS
The experiments were conducted under greenhouse conditions at the Universidade Federal de Viçosa, Rio Paranaíba Campus, in Rio Paranaíba, MG (19°12'50.4"S, 46°13'57.7"W; 1133 m a.s.l.). Pots with 3 dm³ of soil capacity were used and arranged on benches. The seeds, of random size, were planted at a depth of 2 cm and the seedlings were grown according to the recommendation for the cultivar (Sediyama, 2009).
Experiment A. When the seedlings reached the VC development stage (Fehr & Caviness, 1977) they were thinned, leaving two per pot, and the epicotyl length, in mm, was evaluated when the plants reached the V2 and V3 development stages. In each planting season, a randomized block design with four replications was used, with the experimental unit consisting of two plants, and the plot mean was used in the analysis of the epicotyl length data.
Experiment B. Similar experiments were conducted in six planting seasons (November/2014, January/2015, March/2015, October/2015, February/2016, and April/2016), in which 16 genotypes were evaluated (P98Y30, BRSMG820RR, BRSMG850GRR, BRS810C, BRSValiosaRR, BRSMG811CRR, MG/BR 46 (Conquista), BG4277, BRSMG760SRR, TMG1175RR, BRSMG752S, BG4272, PRE6336, BMXTornadoRR, NA5909RG and PRE5808).
When the seedlings reached the VC development stage (Fehr & Caviness, 1977) the plants were thinned leaving three per pot and the evaluation of the epicotyl length, in mm, was performed when the plants reached the V2 development stage. In each planting season, a randomized block design with four repetitions was considered, with the experimental unit being composed of three plants and the mean of the plot was used in the analysis of the epicotyl length data.
Statistical analysis. Experiments A and B were analyzed separately. The data were initially subjected to individual analysis of variance (i.e., each planting season was treated as a growing environment). Subsequently, the homogeneity of the residual variances was assessed using the Fmax test (Cruz et al., 2012), and the variances were considered homogeneous when the ratio between the largest and smallest residual mean square was less than 7.0 (Pimentel-Gomes, 1990). The joint analysis of variance was performed considering the randomized block design in a simple factorial scheme (genotypes x environments), with fixed effects for genotypes, random effects for environments and for the genotype x environment interaction, and a 5% error probability (type I error). Additionally, the coefficients of genotypic determination (H², %) were estimated.
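The homogeneity criterion above reduces to a simple ratio, sketched below. The function names and the residual mean squares are hypothetical illustrations; only the threshold of 7.0 comes from the text.

```python
def fmax_ratio(residual_mean_squares):
    """Ratio between the largest and smallest residual mean square."""
    return max(residual_mean_squares) / min(residual_mean_squares)


def variances_homogeneous(residual_mean_squares, threshold=7.0):
    """True when the ratio is below the threshold adopted in the text (7.0)."""
    return fmax_ratio(residual_mean_squares) < threshold


# Hypothetical residual mean squares from four individual analyses of variance.
example_rms = [10.0, 12.5, 18.1, 15.0]
ratio = fmax_ratio(example_rms)
can_pool = variances_homogeneous(example_rms)
```

When `can_pool` is true, the individual experiments can be combined in the joint analysis of variance.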
The model for the joint analysis of variance was $Y_{ijk} = m + G_i + B/A_{jk} + A_k + GA_{ik} + E_{ijk}$, where: $Y_{ijk}$ is the observed value of the character in genotype $i$, in block $j$, within environment $k$; $m$ is the overall mean; $G_i$ is the fixed effect of genotype $i$; $B/A_{jk}$ is the random effect of block $j$ within environment $k$; $A_k$ is the random effect of environment $k$; $GA_{ik}$ is the random effect of the interaction of genotype $i$ with environment $k$; and $E_{ijk}$ is the experimental error.
After verifying the significance of the interaction (p < 0.05), the Scott-Knott test was performed to group the genotypes within each environment, and adaptability and stability were studied through the Artificial Neural Network of Nascimento et al. (2013) and the method of Lin and Binns (1988), modified by Carneiro (1998), to analyze the behavior of each genotype across the different environments.
Regarding the artificial neural network (ANN), the simulation of the data for training purposes and the classification of the genotypes as to adaptability and stability by means of the ANN followed Nascimento et al. (2013). A back-propagation network with a single hidden layer was used, with one input, one intermediate and one output layer. The input layer had 4 or 6 inputs, corresponding to the number of epicotyl length averages evaluated in 4 and 6 environments (for Experiments A and B, respectively). In the intermediate layer, the number of neurons was an integer between 1 and 15. The output layer was composed of 6 neurons, referring to the classification of the genotype in one of the six classes defined by Eberhart & Russell (1966), as described by Nascimento et al. (2013). The arguments required by the network function, such as the number of neurons in the hidden layer, the initial values for the weights, the decay rate and the maximum number of iterations, were chosen considering the network that provided an error of at most 2% for the test set, as performed by Nascimento et al. (2013) and Barroso, Nascimento, Nascimento, Silva and Ferreira (2013). These authors suggest that the best network architecture should be established considering a classification error lower than 2%.
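The architecture described above can be sketched as a forward pass through a single-hidden-layer network. This is only an illustration of the layer sizes, not the trained network of the study: the weights below are random and untrained, the logistic hidden units and softmax output are assumptions, and the function names and input values are hypothetical.

```python
import math
import random


def init_network(n_inputs, n_hidden, n_outputs=6, seed=0):
    """Random (untrained) weights; each row holds a bias followed by the weights."""
    rng = random.Random(seed)

    def layer(n_in, n_out):
        return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                for _ in range(n_out)]

    return layer(n_inputs, n_hidden), layer(n_hidden, n_outputs)


def forward(network, x):
    """Return one probability per output class for the input vector x."""
    hidden_w, output_w = network
    # Hidden layer with logistic activation (assumed).
    hidden = [1.0 / (1.0 + math.exp(-(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)))))
              for w in hidden_w]
    # Output layer with softmax so that the six class scores sum to 1 (assumed).
    scores = [w[0] + sum(wi * hi for wi, hi in zip(w[1:], hidden)) for w in output_w]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical standardized means of one genotype in 4 environments (Experiment A layout).
network = init_network(n_inputs=4, n_hidden=5)
probabilities = forward(network, [0.1, -0.3, 0.5, 0.2])
predicted_class = probabilities.index(max(probabilities))  # index 0..5 of the six classes
```

In practice the number of hidden neurons (1 to 15), the decay rate and the iteration limit would be tuned on simulated genotypes until the test-set classification error is at most 2%, as described in the text.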
For the analysis using the method of Lin and Binns (1988), modified by Carneiro (1998), specifically regarding the use of bisegmented straight lines according to Cruz & Carneiro (2006), β0m was assigned the minimum or the maximum value of the variable throughout the experiment, and the regression coefficients for the unfavorable and favorable environments (β1m and β2m) were set to zero. Under this condition, the genotype with the best genotypic performance, that is, the most adapted and with the greatest stability of behavior, is the one closest to the ideal in each location, which presents the lowest estimate for the MAEC parameter (Cruz & Carneiro, 2006).
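With both slopes of the ideal genotype set to zero, the ideal response is the same constant in every environment, so the MAEC reduces to a scaled squared distance to that constant. The sketch below assumes the form Pi = Σj (Yij − β0m)² / (2n); the function name and the data are hypothetical.

```python
def maec_fixed_ideotype(means, target="min"):
    """means[i][j] is the mean of genotype i in environment j.

    target="min" uses the lowest value in the whole data set as the ideal,
    target="max" uses the highest. Returns one MAEC estimate per genotype.
    """
    all_values = [value for row in means for value in row]
    ideal = min(all_values) if target == "min" else max(all_values)
    n_env = len(means[0])
    return [sum((value - ideal) ** 2 for value in row) / (2 * n_env)
            for row in means]


# Hypothetical means (mm) for 3 genotypes in 3 environments.
example_means = [[50, 60, 70], [40, 60, 80], [50, 50, 60]]
maec_low = maec_fixed_ideotype(example_means, target="min")
maec_high = maec_fixed_ideotype(example_means, target="max")
best_for_low = maec_low.index(min(maec_low))
```

The genotype with the smallest estimate is the one that stays closest to the chosen ideal across environments; here the third genotype is closest to the lowest-average ideotype.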
For the adaptability and stability analysis using Artificial Neural Networks, the nnet function from the nnet package (Venables & Ripley, 2002), implemented in R (R Development Core Team, 2021), was used, and the other analyses were performed in the Genes program (Cruz, 2013).

RESULTS AND DISCUSSION
The genotypes showed significant differences (p < 0.01) for epicotyl length in the four environments and the two developmental stages in Experiment A and in the six environments in Experiment B (Table 1). This indicates that it is possible to identify genotypes that differ in epicotyl length in all environments analyzed (Experiments A and B). The coefficients of variation ranged from 8.3% to 17.1%. These values are similar to those reported in the literature, such as in Matsuo et al. (2012a), Nogueira et al. (2008) and Hanyu et al. (2020), whose coefficient of variation values obtained were, respectively, lower than 16%, 19% and 12% and 19.85%. The ratio between the largest and smallest residual mean square of the experiments was 1.81 for the evaluations performed in the V2 stage (Exp A), 1.72 for V3 (Exp A) and 4.37 for Experiment B. These values were lower than 7.0 and indicate, according to Pimentel-Gomes (1990), homogeneity of variances, thus allowing the joint analysis of the experiments.
In the joint analysis (Table 2), the effects of the genotype x environment interaction (GxA), of environments and of genotypes were significant (p < 0.01) in both experiments. The significance of the GxA interaction shows that the response of the genotypes varies across the different environments. The coefficients of variation in the joint analysis were 15.4% and 13.7% for the V2 and V3 stages, respectively, in Experiment A, and 13.3% for V2 in Experiment B.
The estimates obtained for the coefficient of genotypic determination (H²) were higher than 0.79 (i.e., > 79%), considering the environments individually (Table 1) and the joint analyses of variance (Table 2). Estimates higher than 70% indicate high genetic influence on phenotypic expression (Chaves et al., 2017). Therefore, the results of the present work indicate high genetic influence on epicotyl length. Similar results were obtained by Nogueira et al. (2008), Matsuo et al. (2012a) and Hanyu et al. (2020). The genotypes MG/BR46_Conquista, TMG 4185, BRSGO 7560 and BRSMG 752 S were allocated to the highest average group and the genotypes FT Cristalina and BRS 283 to the lowest average groups for epicotyl length measured at the V2 stage (Table 3). By the ANN method of Nascimento et al. (2013) (Table 3), among those selected as low and high average by Scott-Knott, the genotypes BRS 283, FT Cristalina and BRSMG 752 S presented high stability and a recommendation for general environments for the measurements performed at the V2 stage.
In alfalfa (Medicago sativa L.) genotypes, the percentages of agreement between the methodology of Eberhart & Russell (1966) and a trained artificial neural network were 81.52% for adaptability and 83.69% for stability (Barroso et al., 2013). Teodoro et al. (2015) indicated that the magnitudes of agreement between the methods were 100% and 70%, respectively, for phenotypic adaptability and stability. Carvalho et al. (2018) found 100% and 81.82% agreement for phenotypic adaptability and stability of cotton genotypes. Alves et al. (2019), when considering all cultivars analyzed, identified 87.5% and 81.25% agreement between the methods (Eberhart & Russell (1966) and Artificial Neural Networks) for adaptability and stability, respectively. In Oda, Sediyama, Cruz, Nascimento, and Matsuo (2022), the agreement in the classification of genotypes between the ANN proposed by Nascimento et al. (2013) and the method of Eberhart and Russell (1966) was 94% for adaptability and 53% for phenotypic stability. Therefore, according to Nascimento et al. (2013), Teodoro et al. (2015) and Carvalho et al. (2018), ANNs can be considered an effective alternative to measure the adaptability and phenotypic stability of genotypes in breeding programs.
Table 3. Averages of epicotyl length, in mm, measured in Experiment A, V2 stage, in 28 soybean genotypes as a function of 4 environments and estimates of adaptability and stability parameters by using the ANN method of Nascimento et al. (2013) and Lin and Binns (1988), modified by Carneiro (1998).
considering the ideal genotype as the one with the lowest average and the genotypes MG/BR46_Conquista and BRSMG 752 S when considering the ideotype as the one with the highest average.
Through this methodology, the most adapted genotypes with the greatest stability of behavior are those with values closer to the ideal (lowest, average, highest or a fixed value of the average), that is, with lower estimates for the MAEC parameter, Pi (Cruz & Carneiro, 2006). For the V3 stage, the genotypes MG/BR46_Conquista, TMG 4185, BRSGO 7560 and BRSMG 752 S formed the group with the highest average and FT Cristalina and BRS 283 formed the two groups with the lowest averages (Table 4). For the V3 stage, the ANN methodology indicated that BRS 283, TMG 4185, BRSGO 7560, BRSMG 752 S and FT Cristalina were of high stability and recommended for general environments.
Table 4. Averages of epicotyl length, in mm, measured in Experiment A, V3 stage, in 28 soybean genotypes as a function of 4 environments and estimates of adaptability and stability parameters by using the ANN method of Nascimento et al. (2013) and Lin and Binns (1988), modified by Carneiro (1998).
Also by this methodology, MG/BR46_Conquista was shown to be of high stability, but recommended for favorable environments. By the method of Lin and Binns (1988), modified by Carneiro (1998), when the highest average was taken as the ideal, the genotypes TMG 4185, BRSGO 7560 and BRSMG 752 S had the best genotypic performance, whereas FT Cristalina and BRS 283 were the most recommended when the ideal was the lowest average. In this case, FT Cristalina and BRS 283 were allocated to the groups of the lowest averages in the four environments and show high stability of behavior because the values obtained were the closest to the lowest average in each of the evaluation sites.
The genotypes BRSMG 850 GRR, MG/BR46_Conquista and BG 4272 were allocated to the highest average groups in 5 of the 6 analyzed environments, and TMG 1175 RR and BMX Tornado RR to the lowest average groups, considering each environment separately (Table 5). All genotypes identified as having low or high averages were classified as having high stability by the ANN of Nascimento et al. (2013). For adaptability, TMG 1175 RR is recommended for general environments, BG 4272 and BMX Tornado RR are recommended for unfavorable environments, while BRSMG 850 GRR and MG/BR46_Conquista are recommended for favorable environments. By the method of Lin and Binns (1988), modified by Carneiro (1998), the genotypes TMG 1175 RR and BMX Tornado RR are the most recommended when the ideotype is the one with the lowest average, and the genotypes BRSMG 850 GRR, MG/BR46_Conquista and BG 4272 when the ideotype is the one with the highest average (Table 6).
The search for genotypes that are not, or only slightly, responsive to environmental improvements is justified in this work by the need to identify those that, even in the presence of some environmental stimulus, maintain an average statistically similar to that obtained in the environment that did not receive this stimulus. Thus, the aim was to identify genotypes of wide adaptability or with adaptability to unfavorable environments. This is because the concept of adaptability refers to the ability of genotypes to respond to environmental stimuli, and genotypes are classified as: genotypes with broad or general adaptability (β1i = 1); genotypes with specific adaptability to favorable environments (β1i > 1); and genotypes with specific adaptability to unfavorable environments (β1i < 1) (Barroso et al., 2013).
Table 6. Averages of epicotyl length, in mm, measured in Experiment B, V2 stage, in 16 soybean genotypes as a function of 6 environments and estimates of adaptability and stability parameters by using the ANN method of Nascimento et al. (2013) and Lin and Binns (1988), modified by Carneiro (1998).
Conversely, it becomes possible to identify genotypes that are responsive to environmental improvements, i.e., with specific adaptability to favorable environments. In this sense, the genotypes BRS 8381 and TMG 4185 (in V2 of Experiment A), MG/BR46_Conquista (in V3 of Experiment A) and BRSMG 850 GRR, BRS Valiosa RR, MG/BR46_Conquista and BG 4277 (in Experiment B) were recommended for favorable environments. This indicates that these genotypes responded favorably to environmental stimulus and, therefore, that epicotyl length can be influenced when the growing environment is favorable for its growth.
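The adaptability classes by β1i mentioned above come from regressing each genotype's means on the environmental index, as in Eberhart & Russell (1966). The sketch below computes only the point estimates, assuming balanced data; the function names and the data are hypothetical, and deciding whether an estimate differs from 1 would additionally require a significance test that is not shown here.

```python
from statistics import mean


def environmental_indices(means):
    """Environment mean minus grand mean, for means[i][j] (genotype i, environment j)."""
    n_env = len(means[0])
    env_mean = [mean(row[j] for row in means) for j in range(n_env)]
    grand_mean = mean(env_mean)
    return [m - grand_mean for m in env_mean]


def regression_coefficients(means):
    """Estimate of beta_1i for each genotype: sum(Y_ij * I_j) / sum(I_j ** 2)."""
    index = environmental_indices(means)
    denominator = sum(i * i for i in index)
    return [sum(y * i for y, i in zip(row, index)) / denominator for row in means]


# Hypothetical means (mm) for 3 genotypes in 3 environments.
example_means = [[50, 60, 70], [40, 60, 80], [50, 50, 60]]
betas = regression_coefficients(example_means)
```

By construction the estimates average 1 across genotypes; in this hypothetical data set the second genotype has an estimate above 1 (more responsive to better environments) and the other two have estimates below 1.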
Thus, it was verified that there is a difference between genotypes, that the epicotyl length is a character that is influenced by environmental effects and that the genotypes present different responses when cultivated in different environments. In other words, the epicotyl length in some genotypes is stable and poorly responsive to environmental improvements, while in others the response is stable and