Molecular and phenotypic diversity of ICARDA spring barley (Hordeum vulgare L.) collection

Plant breeders seek diverse genotypes for hybridization so that progeny segregate for traits of importance, allowing selection and genetic gain. Information on molecular and agro-morphological diversity reduces the effort of parental selection and aids the advancement of generations. A phenotypic and molecular diversity study, using 24 traits (agronomic and disease) and 6519 SNPs in a diverse collection of 336 spring barley genotypes, was carried out at the Marchouch and Jemma Shiam research stations in Morocco. Based on structure and multivariate analyses, strong differentiation between the two- and six-row types was observed. The linkage disequilibrium (LD) decay of the current collection (for the combined population) extended up to 3.58 cM (r² = 0.15), while LD decay was estimated at 3.91 and 2.36 cM for two- and six-row barley, respectively. PCA of agro-morphological traits revealed that grains per spike, net form of net blotch (NFNB), spot form of net blotch (SFNB), and 1000 kernel weight were the most discriminatory traits in the current collection. Association mapping in the two independent populations will be ideal for identifying markers and QTL related to traits. The generated information on relatedness between individuals will help identify diverse genotypes for breeding programs.


Introduction
Barley (Hordeum vulgare L.) is one of the most important cereal crops in the world, with nearly 50 million hectares (ha) of harvested area and 145 million tons (t) of production worldwide (FAOSTAT 2015). It was domesticated from its wild relative Hordeum vulgare subsp. spontaneum (K. Koch) around 10,000 years ago in the Fertile Crescent (Badr et al. 2000; Zohary and Hopf 2000). New evidence based on the RBP2 gene shows that barley was domesticated both in the Fertile Crescent and on the Tibetan Plateau (Wang et al. 2016a, b). Barley is mainly used for animal feed, brewing malts and human consumption (Munoz-Amatriain et al. 2014; Hayes et al. 2002) and is considered a staple food in several regions of the world, including North and East Africa (Shewayrga and Sopade 2011).
Electronic supplementary material: The online version of this article (doi:10.1007/s10722-017-0527-z) contains supplementary material, which is available to authorized users.
The worldwide distribution of barley is due to its wide adaptation to diverse agro-ecologies and to different abiotic stresses (drought, cold, heat, and salinity). Recent studies indicated that polymorphism in the flowering time genes HvCO1, HvFT1, Ppd-H1, and VRN1-H1 has contributed to the adaptation of barley to diverse agro-ecologies (Aslan et al. 2015). With changes in climate, food productivity has to be increased to meet global food demand. Barley can be considered a model species because its ability to grow in different environments has shaped its diversity, accumulating a rich pool of genes through adaptation to a wide range of environments and survival in harsh conditions (Grando et al. 2001). An extensive amount of data has been generated from genetic diversity surveys in wild and cultivated barley over the past decade (Munoz-Amatriain et al. 2014; Comadran et al. 2009; Orabi et al. 2007; Brantestam et al. 2006; Feng et al. 2006; Malysheva-Otto et al. 2006; Pandey et al. 2006; Chabane et al. 2005; Hou et al. 2005; Hamza et al. 2004; Baek et al. 2003; Matus and Hayes 2002; Struss and Plieske 1998; William et al. 1997). Genetic diversity studies are important tools for crop improvement because they identify diverse parental lines for hybridization and for introgressing desirable genes into elite germplasm (Chakravorty et al. 2013; Gyawali et al. 2013). Such studies can also provide information about the resource allocation that affects the long-term maintenance of diverse germplasm collections (McClean et al. 2012). An understanding of diversity and genetic structure is also important for association mapping, since population structure can lead to spurious associations and must be controlled for to reduce false positives (Gyawali et al. 2016). High-throughput genotyping platforms and candidate gene studies have promoted association mapping as a viable approach for quantitative trait locus (QTL) mapping. It is an alternative to traditional QTL mapping that uses the recombination events from multiple lineages and exploits the natural variation in large samples. Genotyping a diverse collection will help identify genomic regions of interest that control phenotypic variation.
The success of association mapping depends on the extent and patterns of linkage disequilibrium (LD). The extent of LD in a given population determines the density of markers required for a whole genome scan, which has implications for the identification of candidate genes associated with traits of interest (Szalma et al. 2005). Patterns of LD help discern regions of low LD, which have implications for breeders' selection. The overall LD helps in understanding the population genetic processes that shaped the present diversity of plants (Iqbal et al. 2012; Gurung et al. 2011; Mackay and Powell 2007; Malysheva-Otto et al. 2006; Gupta et al. 2005; Flint-Garcia et al. 2003), because LD is affected by mating systems, recombination, selection, and genetic bottlenecks (Hamblin et al. 2011; Flint-Garcia et al. 2003). Therefore, it is important to know the population structure and the diversity of the population that will be used for association mapping.
High-throughput SNP genotyping platforms have revolutionized the gene mapping and genome-wide association studies (GWAS) in plants (Tian et al. 2011). The barley 9 K iSelect Illumina SNP platform gives whole genome coverage and an adequate genetic characterization of germplasm collections, which will make the diversity contained in a given collection efficiently accessible to barley breeders (Munoz-Amatriain et al. 2014;Comadran et al. 2009). This 9 K SNP chip has been effective in the identification of QTL in several studies, including Turuspekov et al. (2016), Tamang et al. (2015) and Mamo and Steffenson (2015).
The International Center for Agricultural Research in the Dry Areas (ICARDA) has the global mandate for barley improvement among the Consultative Group for International Agricultural Research (CGIAR) centers and holds one of the largest barley collections (more than 30,000 barley accessions including wild relatives, landraces, and cultivars) in its gene banks across the world. In order to conduct GWAS for multiple traits of interest, a collection of 336 genotypes consisting of elite lines from multiple agro-ecological environments, released cultivars, landraces, and differentials was assembled, representing much of the diversity present in ICARDA's spring barley breeding gene pool adapted to variable environments. The objectives of the current study were to explore the genetic and phenotypic diversity of the collection and to determine the patterns of population structure and LD within it.

Plant materials
A total of 336 barley genotypes, including advanced breeding lines, cultivars, and landraces from ICARDA and from other sources, such as genotypes introduced from different countries into ICARDA's barley breeding program, were used for this study (Supplemental Table S1). Genotypes were selected to represent tolerance to abiotic stresses (drought and heat) and biotic stresses (foliar diseases including rust, net blotch, spot blotch, and powdery mildew). Further, genotypes were selected from both the low-input (stressed conditions for moisture and fertility) and the high-input (favorable conditions) barley breeding programs of ICARDA. The selected genotypes also represented the feed, food, and malt barley programs of ICARDA. While selecting genotypes, care was taken to include representative samples of both two- and six-row barley. All genotypes are of spring growth habit; 199 are six-row and 137 are two-row barley. Any genotypes showing winter or facultative growth habit were removed from the collection. Furthermore, the collection can be classified as hulled barley (276), primarily used for feed and malting purposes, and hulless barley (60), used for food. The collection consisted of 230 genotypes from low-input breeding programs (adapted to abiotic stresses), 82 from the high-input breeding program (adapted to favorable production conditions), and the remaining 24 genotypes, which are frequently used by both programs. All available information on this collection is presented in Supplemental Table S1.

Field experiment and phenotyping
Evaluations of agronomic traits and screening for disease resistance were carried out at two research stations in Morocco. The experiments were laid out in an alpha-lattice design with two replications during the 2014-15 season at the Marchouch (MCH) (33°33′38.2″N 6°41′24.7″W) and Jemma-Shiam (JS) (32°21′09.3″N 8°50′32.0″W) stations. The Marchouch research station is considered to have high production potential, with no moisture or soil fertility stress. In contrast, the JS research station lacks a water supply for crop growth and depends on rain-fed conditions; growing conditions at JS are therefore considered moisture and nutrient stressed. Data were recorded at both locations for agro-morphological traits and yield components, including days to heading (DH), days to maturity (DM), plant height (PH), spike length (SL), grains per spike (G/S), biological yield ha⁻¹ (BY), grain yield ha⁻¹ (GY), harvest index (HI = GY/BY), 1000 kernel weight (TKW) and hectoliter (test) weight in kg/hectoliter (HW). The genotypes were also screened for adult plant resistance (APR) to spot form of net blotch (SFNB), net form of net blotch (NFNB), and powdery mildew (PM) under natural conditions. At JS, PM resistance was evaluated at Zadoks growth stage 19-29 using a 1-5 scale. At the adult stage (Zadoks GS 77-87), disease rating was visually recorded using a double-digit scale (00-99), where the first digit indicates vertical disease progress on the plant and the second digit refers to severity measured as infected leaf area (Saari and Prescott 1975).
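The two derived quantities in this paragraph, the harvest index and the double-digit disease score, can be illustrated with a minimal Python sketch. The function names and example values are hypothetical; they are not taken from the study's data or from Genstat.

```python
def harvest_index(grain_yield, biological_yield):
    """Harvest index HI = GY / BY, with both yields in the same unit."""
    if biological_yield <= 0:
        raise ValueError("biological yield must be positive")
    return grain_yield / biological_yield


def split_double_digit(score):
    """Split a 00-99 score into (vertical progress, severity) digits."""
    if not 0 <= score <= 99:
        raise ValueError("score must lie between 0 and 99")
    return divmod(score, 10)


# Hypothetical plot: 2.0 t/ha of grain from 5.0 t/ha of total biomass.
print(harvest_index(2.0, 5.0))
# Hypothetical score 75: progress digit 7, severity digit 5.
print(split_double_digit(75))
```

Keeping the two digits separate makes it possible to analyze disease height and leaf-area severity independently, if desired.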
The statistical analyses for all traits, in each location (MCH or JS), were performed using Genstat v18 (VSN International, GenStat.co.uk). Multivariate analysis was performed on the measured qualitative and quantitative traits using principal component analysis (PCA) as implemented in Genstat v18. ANOVA was performed to evaluate the effects of genotypes (G), environment (E) and the G × E interaction. In addition, the relatedness of traits was investigated using Pearson's correlation coefficients. In the multivariate analysis of agronomic traits, only the 326 genotypes with no missing data were considered. The remaining 10 genotypes had data missing for at least one trait and were excluded from the PCA. For further investigation, a dendrogram based on trait means from both locations was generated using hierarchical cluster analysis with the group average linkage method in Genstat v18.

SNP genotyping and diversity
Single plants of each line were grown in a greenhouse and the leaf tissue was lyophilized. Genomic DNA was extracted using the method described in Slotta et al. (2008). The barley genotypes were genotyped using the 9 K iSelect SNP array based on Illumina's Infinium Assay (Illumina, San Diego, CA, USA) at the Cereal Crop Research Unit, USDA-ARS, Fargo, ND. The obtained SNP data were further filtered to remove SNPs with (a) a minor allele frequency of 0.05 or lower, or (b) a rate of missing values above 10%.
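The two filtering criteria can be sketched in Python as follows. This is a minimal illustration, not the pipeline used in the study: the 0/1/2 allele-count coding, the use of None for missing calls, and all function and SNP names are assumptions made for the example.

```python
def minor_allele_frequency(calls):
    """MAF from calls coded as 0, 1 or 2 copies of one allele (None = missing)."""
    observed = [c for c in calls if c is not None]
    if not observed:
        return 0.0
    p = sum(observed) / (2 * len(observed))
    return min(p, 1.0 - p)


def missing_rate(calls):
    """Fraction of genotypes with no call at this SNP."""
    return sum(c is None for c in calls) / len(calls)


def filter_snps(snps, maf_cutoff=0.05, max_missing=0.10):
    """Keep SNPs with MAF above the cutoff and missing rate at most max_missing."""
    return {
        name: calls
        for name, calls in snps.items()
        if minor_allele_frequency(calls) > maf_cutoff
        and missing_rate(calls) <= max_missing
    }


# Hypothetical toy data for ten genotypes.
snps = {
    "snp_a": [0, 0, 2, 2, 0, 2, 0, 2, 0, 2],        # MAF 0.5, no missing calls
    "snp_b": [0] * 10,                              # monomorphic, MAF 0
    "snp_c": [0, 2, None, None, 0, 2, 0, 2, 0, 2],  # 20% missing
}
kept = filter_snps(snps)
print(sorted(kept))
```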
Diversity statistics including genetic diversity, major allele frequency and Polymorphic Information Content (PIC) were analyzed using PowerMarker v3.25 (Liu and Muse 2005). The phylogenetic analysis was conducted using Nei distance matrix (Nei 1972), computed by PowerMarker and used as input to generate the Unweighted Pair-Group Method using Arithmetic averages (UPGMA) dendrogram, viewed in TreeView X v0.5 (Page 1996). The genetic distance (D) among the genotypes was estimated by Unbiased Measures of genetic distance (Nei 1972). The genetic relationships between genotypes were further investigated by principal coordinate analysis (PCoA) based on the Nei genetic distance matrix in NTSYSpc 2.02i (Rohlf 2000).
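As an illustration of the diversity statistics named above, the sketch below uses the standard textbook definitions of gene diversity (1 minus the sum of squared allele frequencies) and PIC. It is an assumption that these match the estimators in PowerMarker, which may apply additional sample-size or inbreeding corrections; the function names are hypothetical.

```python
def gene_diversity(freqs):
    """Gene diversity (expected heterozygosity): 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphism information content: 1 - sum(p_i^2) - sum_{i<j} 2 p_i^2 p_j^2."""
    homozygosity = sum(p * p for p in freqs)
    cross = sum(
        2.0 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return 1.0 - homozygosity - cross


# A bi-allelic SNP with equal allele frequencies gives the upper bounds.
print(gene_diversity([0.5, 0.5]))  # 0.5
print(pic([0.5, 0.5]))             # 0.375
```

For a bi-allelic marker both statistics peak at equal allele frequencies, so gene diversity cannot exceed 0.5 and PIC cannot exceed 0.375.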

Population structure analysis
Analysis of the population structure among barley genotypes was performed using the Bayesian model-based analysis implemented in STRUCTURE v2.3.4 (Hubisz et al. 2009; Falush et al. 2003; Pritchard et al. 2000). Each individual is assigned to different groups according to a membership coefficient (q_i; Σq_i = 1.0). The posterior probabilities were estimated using the Markov Chain Monte Carlo (MCMC) method. The number of hypothetical populations (K) tested was from 1 to 7. For each K, 5 runs were set and the MCMC chains were run with a burn-in period of 100,000, followed by 100,000 iterations, using the admixture model with correlated allele frequencies. The most likely number of subpopulations was determined using ΔK (Evanno et al. 2005) as implemented in Structure Harvester (Earl and vonHoldt 2012).
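The ΔK statistic of Evanno et al. (2005) divides the absolute second-order change in the log-likelihood L(K) by the standard deviation of L(K) across replicate runs. The sketch below is a minimal illustration with invented log-likelihood values. Computing the second-order change from run means is an assumption about the averaging step, and the function name is hypothetical.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Delta K = |L(K+1) - 2 L(K) + L(K-1)| / sd(L(K)), using run means for L."""
    means = {k: mean(runs) for k, runs in lnp_by_k.items()}
    delta = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue  # undefined at the smallest and largest K tested
        sd = stdev(lnp_by_k[k])
        if sd > 0:
            delta[k] = abs(means[k + 1] - 2 * means[k] + means[k - 1]) / sd
    return delta


# Invented log-likelihoods for five runs at each K (not the study's values).
lnp = {
    1: [-1000.0, -1002.0, -998.0, -1001.0, -999.0],
    2: [-800.0, -801.0, -799.0, -800.5, -799.5],
    3: [-790.0, -795.0, -785.0, -792.0, -788.0],
    4: [-785.0, -790.0, -780.0, -787.0, -783.0],
}
delta = evanno_delta_k(lnp)
best_k = max(delta, key=delta.get)
print(best_k, delta)
```

Because the statistic needs both neighbours of K, testing K from 1 to 7 yields ΔK values only for K = 2 to 6.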

Linkage disequilibrium
The estimates of the linkage disequilibrium (LD) of SNPs were determined for pairs of loci using the software package Tassel 3.0 (Bradbury et al. 2007), using only SNPs with known marker positions. The squared allele-frequency correlations (r²) (Weir 1979) were calculated for each intra-chromosomal combination. The distribution and extent of LD were visualized by plotting r² values against the genetic distance in cM for all intra-chromosomal marker pairs, using nonlinear regression as described in Remington et al. (2001) and implemented in SAS 9.3.
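For two bi-allelic loci, the squared allele-frequency correlation is r² = D² / (pA(1 − pA) pB(1 − pB)), with D = pAB − pA·pB. The sketch below is a minimal illustration, not the Tassel implementation. It assumes fully homozygous lines coded 0/1 per locus (a simplification for inbred barley), with None for missing calls, and the function name is hypothetical.

```python
def ld_r2(locus_a, locus_b):
    """r^2 between two bi-allelic loci coded 0/1 per line (None = missing)."""
    pairs = [
        (a, b) for a, b in zip(locus_a, locus_b)
        if a is not None and b is not None
    ]
    n = len(pairs)
    if n == 0:
        return float("nan")
    p_a = sum(a for a, _ in pairs) / n
    p_b = sum(b for _, b in pairs) / n
    p_ab = sum(a * b for a, b in pairs) / n  # frequency of the 1-1 haplotype
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return float("nan")  # at least one locus is monomorphic
    d = p_ab - p_a * p_b
    return d * d / denom


# Complete association versus independence in four hypothetical lines.
print(ld_r2([0, 1, 0, 1], [0, 1, 0, 1]))  # 1.0
print(ld_r2([0, 0, 1, 1], [0, 1, 0, 1]))  # 0.0
```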

Phenotypic diversity
Descriptive statistics (minimum, maximum, mean, standard error of the mean, and range) of the 24 traits are presented in Table 1. The agro-morphological traits of individual genotypes are presented in Supplemental Table S1. The agronomic data of two- and six-row types are presented in Supplemental Tables S2 and S3, respectively. Using data for the quantitative and qualitative traits from both locations, the first three principal components (PCs) accounted for 66.4% of the total variability. The first PC explained 25.45% of the total variation (Fig. 1). In particular, G/S and NFNB and SFNB resistance in both locations were the variables with high positive loadings, while TKW had the largest negative loading. The second PC explained 21.79% of the total variation. In the second PC, SFNB resistance in JS and TKW had the highest positive loadings, while G/S and NFNB resistance in JS were the variables with the largest negative loadings. The third component, which explained 19.16% of the total variation, was associated with high positive loadings of NFNB resistance in both locations and TKW, while the largest negative loading was associated with SFNB resistance in JS (Table 1). The PCA of agronomic traits measured in MCH and JS is presented in Supplemental Figs. 1a and 1b. The ANOVA results for agronomic traits are presented in Tables 2 and 3. A highly significant (P < 0.01) effect of genotypes (G) was found for DH, PH, SL, NFNB, SFNB, G/S, BY, GY and HI (Table 2). A significant (P < 0.05) effect of environments (MCH and JS) was found for DH, PH, SL, NFNB, GY, BY, and HI. Likewise, a highly significant (P < 0.01) effect of the G × E interaction was observed for DH, SL, NFNB, SFNB, and GY. A highly significant (P < 0.01) effect of genotypes was found for DM, TKW, HI, and PM-Adult in MCH, while PM-Seedling was non-significant in JS (Table 3).

Correlation between phenotypic traits
Correlation coefficients (r²) were highly significant (P < 0.001) in 46 of the 276 trait combinations, where r² ranged from 0.01 to 0.96 (Fig. 2). The correlation coefficients and P values are presented in a correlation matrix in Supplemental Table S4. High positive correlations (r² ≥ 0.5) were found between BY and GY at both locations (BY-JS and GY-JS; BY-MCH and GY-MCH); DH-MCH and DM-MCH; and G/S-JS and G/S-MCH with row type. High biomass implies high grain yield, and days to heading and days to maturity are similarly related. Highly significant negative correlations were found for G/S with TKW and HW at both locations; HI-MCH with PH-MCH; and row type with SL, TKW and HW. The row type appears to play a key role in the number of grains per spike, TKW and HW: the two-row types have fewer grains than the six-row types and tend to have heavier and larger grains, which determine the TKW and HW. A significant positive correlation was observed between NFNB resistance at the two locations (r² = 0.39). Similarly, the correlation was positive and significant for resistance to SFNB at the two locations (r² = 0.33). This indicates that resistance/susceptibility was mainly governed by genetic factors, while the environment had very little impact.
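The count of 276 trait combinations is consistent with all unordered pairs among 24 variables, since 24 × 23 / 2 = 276. The sketch below checks this arithmetic and shows a plain Pearson correlation on invented trait vectors. The function name and the data are hypothetical and do not reproduce the Genstat analysis.

```python
from math import comb, sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Number of unordered pairs among 24 variables.
n_pairs = comb(24, 2)
print(n_pairs)  # 276

# Invented biomass and grain yield values for four plots.
biomass = [4.0, 5.0, 6.0, 7.0]
grain = [1.6, 2.0, 2.4, 2.8]
print(pearson_r(biomass, grain))
```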

Genetic diversity and cluster analysis
A subset of 6940 genome-wide SNPs was used to assess genetic diversity in the collection. Out of 6940, 1982 did not have a known chromosome position and the remaining 4958 were distributed over all seven chromosomes. The subset was further filtered for minor allele frequency (MAF ≤ 0.05) and missing SNPs (>10%), and a final set of 6519 SNPs was used for further analyses (Table 4). Gene diversity and polymorphism information content (PIC) values on different chromosomes varied from 0.005 to 0.500 and 0.006 to 0.375, with average values of 0.366 and 0.290, respectively (Table 4). The genetic similarity between genotypes, quantified using Nei genetic distance (Nei 1972), resulted in two main clusters of significant size corresponding to row type. Furthermore, within the same cluster, genotypes were grouped depending on their adaptation (high-input barley, low-input barley, landrace). The largest distance (D = 0.89) was found between AM-27 (LIMON/BICHY2000//DEFRA/DESCONOCIDA-BAR) and AM-300 (Arimont). The smallest genetic distance (D = 0.00) was observed among seventeen pairs of genotypes, all sister lines originating from the same crosses. In order to demonstrate the phylogenetic relationships of the 336 barley genotypes studied, an Unweighted Pair-Group Method using Arithmetic averages (UPGMA) dendrogram was generated (Fig. 3a), and all genotypes were assigned to two major groups (two- and six-row barley genotypes) and three sub-groups (high-input barley, low-input barley, landraces).

Population structure analysis
The break point of ΔK in the current study was K = 2 (Fig. 3b). As described by Evanno et al. (2005), the true value of K is where ΔK, an ad hoc quantity based on the second order rate of change of the likelihood function with respect to K, reaches its peak. Out of 336 genotypes, 138 (41.08%) were assigned to Q1 and 84 (25%) were assigned to Q2, while the remaining 114 genotypes (33.92%) were admixed (membership coefficient, q_i ≤ 0.8). The genetic structure of the collection was also analyzed using Principal Coordinate Analysis (PCoA). The PCoA of genetic distance revealed a clear differentiation between the two- and six-row barley sub-populations (Fig. 4). The first and second axes explained 45.49 and 18.05% of the variation, respectively, and separated genotypes into different clusters corresponding to the row type. One of the clusters mostly contained two-row genotypes while the other contained six-row barley genotypes. However, some overlap between the two- and six-row clusters was also observed.
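The assignment rule described here, with a genotype placed in a subpopulation only when its largest membership coefficient exceeds 0.8 and otherwise treated as admixed, can be sketched as follows. The genotype names and coefficients are invented for illustration, and the function name is hypothetical.

```python
def assign_groups(memberships, threshold=0.8):
    """Label each genotype Q1, Q2, ... or 'admixed' from its membership vector."""
    labels = {}
    for name, q in memberships.items():
        best = max(range(len(q)), key=q.__getitem__)
        # Coefficients at or below the threshold count as admixed.
        labels[name] = f"Q{best + 1}" if q[best] > threshold else "admixed"
    return labels


# Invented membership coefficients at K = 2 (each row sums to 1.0).
memberships = {
    "g1": [0.95, 0.05],
    "g2": [0.10, 0.90],
    "g3": [0.60, 0.40],
}
labels = assign_groups(memberships)
print(labels)
```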

Linkage disequilibrium (LD)
The extent of LD was assessed among all chromosomes as well as for the two sub-populations separately. For all genotypes, 16.27% of the total SNP pairs were in LD at P < 0.001 and 26.53% at P < 0.05. In our samples, the genome-wide LD decay was 3.58 cM at r² > 0.15 (Fig. 5). For the two-row genotypes, the proportion of SNP pairs in LD was 29.62% (P < 0.05) and 19.65% (P < 0.001), and for the six-row genotypes it was 32.78% (P < 0.05) and 21.96% (P < 0.001). The decay values were about 3.91 cM for two-row barley and 2.26 cM for six-row barley (Supplemental Fig. 2a and 2b).
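Once a decay curve of r² against genetic distance is available, the decay distance is the point at which the curve drops below the chosen threshold (here r² = 0.15). The sketch below reads that distance off a curve by linear interpolation. This is a deliberately simplified stand-in for the nonlinear regression used in the study, the curve values are invented, and the function name is hypothetical.

```python
def decay_distance(curve, threshold=0.15):
    """First distance (cM) at which r^2 falls below the threshold.

    curve: list of (distance_cM, r2) points sorted by distance.
    Returns None if the curve never crosses the threshold from above.
    """
    for (d0, r0), (d1, r1) in zip(curve, curve[1:]):
        if r0 >= threshold > r1:
            # Linear interpolation between the two bracketing points.
            return d0 + (r0 - threshold) * (d1 - d0) / (r0 - r1)
    return None


# Invented decay curve, not the fitted curve from the study.
curve = [(0.0, 0.45), (2.0, 0.25), (4.0, 0.05)]
print(decay_distance(curve))
```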

Phenotypic diversity
Descriptive statistics (mean, range and standard error of the means) of 24 agronomic traits showed high levels of variation in barley genotypes. For example, the number of QTL reported for yield was about 60 (Wang et al. 2016a, b; Xue et al. 2010; Pillen et al. 2003; Marquez-Cedillo et al. 2001; Teulat et al. 2001), and for disease resistance there were 31 QTL for leaf rust (Kertho et al. 2015) and between 8 and 13 for various strains of SFNB in barley (Tamang et al. 2015). Based on PCA of the phenotypic traits, this barley collection was mainly clustered with respect to disease resistance (SFNB and NFNB), number of grains per spike, and TKW. This clustering was quite evident since there was strong variability in net blotch (both NFNB and SFNB) response among the genotypes. The variation in the number of grains per spike reflects the row type, but was not enough to separate our population into two groups as revealed by the SNP markers. Thousand kernel weight (TKW) had the highest positive PC2 loading compared to the rest of the traits, which shows that there was high variation in TKW in this collection. Most of the two-row genotypes had larger grain than the six-row genotypes in this study (Supplemental Table S2), which was in agreement with previous reports (Ayoub et al. 2002; Marquez-Cedillo et al.; Kjaer and Jensen 1996). Grain weight compensates for environmental stresses at early stages if favorable conditions prevail during the period of grain filling. In dry areas, moisture stress is prevalent at all stages, especially grain filling, and ICARDA barley breeders tend to select material based on grain weight within each level of inputs. Although in many cases the coefficients (r²) were low, there were significant correlations among different traits. Hence a trade-off between key traits should be taken into consideration during selection and breeding. The disease reactions for NFNB and SFNB indicated a similar response to each disease at both locations, suggesting that the pathotypes at both locations might be similar.
Table notes: a DM, days to maturity; TKW, 1000 kernel weight; HW, hectoliter (test) weight; PM-Adult, powdery mildew severity recorded at adult stage using the double-digit scale in Marchouch. b PM-Seedling, powdery mildew recorded at seedling stage using a 1-5 scale in Jemma Shiam, where 1 is a resistant response and 5 is susceptible. *, ** significant at the 0.05 and 0.01 probability levels.

The classification of genotypes based on hierarchical clustering using Euclidean distance resulted in two main groups, six-row and two-row types. This supports the classification obtained with the SNP markers. However, the subgroups within a given cluster showed contrasting expression of agronomic traits. Based on the agronomic merit of each subgroup, the genotypes can be classified according to their disease resistance/susceptibility, biomass, yield, height and earliness. No specific differentiation can be made based on other traits. The maximum distance was found between AM-1 (Alanda/5/Aths/4/Pro/TolI//Cer*2/TolI/3/5106/6/Baca'S'/3/AC253//CI08887/CI05761), a six-row accession that is highly susceptible to NFNB and semi-dwarf with short spikes, and AM-304 (CI3576), a two-row landrace that is highly resistant to NFNB and tall with long spikes. This range of agronomic traits and disease resistance in barley genotypes reflects the wide genetic variability present in our collection, which is a fundamental condition for genetic improvement. Similar observations based on agro-morphological traits in barley collections were reported earlier by Shakhatreh et al. (2010) and Manjunatha et al. (2007).

Genetic diversity
The current study is amongst the first in ICARDA to deliberately assemble and analyze a specific population representing very diverse cultivated barley from ICARDA germplasm to provide a platform for GWAS of several important traits. We used SNP markers because they offer a highly polymorphic, codominant, and high-throughput marker system which can be used in germplasm characterization and selection of desirable alleles in breeding programs (Lombardi et al. 2014). Minor allele frequency and expected heterozygosity are directly correlated. This additional measure can determine the proportion of rare alleles (MAF < 0.2), which in turn determines the diversity of the population. In our study, we found an average expected heterozygosity of 0.29, which is comparable to that observed in other studies (Lombardi et al. 2014; Emanuelli et al. 2013; Jones et al. 2007; Ching et al. 2002). Furthermore, the average gene diversity in our sample was 0.366, which is slightly higher than the values reported by Rodriguez et al. (2012) and Sun et al. (2011) using SSR markers: 0.298 in barley landraces from Sardinia and 0.338 in a worldwide set of barley genotypes, respectively. Higher genetic diversity is generally expected in the current mapping panel because of the diverse nature of the genotypes used, which originated from different barley breeding programs across the globe and from landraces collected in diverse geographical regions. Therefore, by selecting SNPs based on their high polymorphism levels, the discriminating power of the SNPs can be considerably increased (Jones et al. 2007). Many of ICARDA's breeding lines analyzed in this study share common parents. As genetic distance is based on the principle that shared alleles are identical by descent, this measure of discrimination power is meaningful in our population. The maximum distance was found between Arimont, an American six-row, naked genotype, and LIMON/BICHY2000//DEFRA/DESCONOCIDA-BAR, a two-row malt barley cross; the two derive from widely separated localities and breeding programs. Conversely, the lowest distance was found between pairs of sister lines from the ICARDA breeding programs, which is expected as they had the same parentage.

Population structure and linkage disequilibrium
Cluster analysis based on Nei (1972) distances separated, with some exceptions, the genotypes according to their row type. Our results agree with previous studies showing a clear separation between two- and six-row types (Usubaliev et al. 2013; Chaabane et al. 2009; Chen et al. 2009; Lasa and Igartua 2001; Franckowiak and Lundqvist 1997). Historically, ICARDA breeders made several two-by-six row crosses, which was evident in this study from the identification of admixtures (Fig. 2b). This admixture was clearly shown by the pedigrees of ICARDA barley breeding lines, where both two- and six-row genotypes were included in particular crosses (Supplemental Table S1). Hence, both structure and PCA analyses support the hypothesis of genetic admixture of two- and six-row barley in ICARDA germplasm. Although the optimum number of subpopulations was two (K = 2), genotypes tended to cluster (based on their coefficient of membership, Q_i) according to their adaptation mode (high-input barley, low-input barley), regardless of their row type. This is expected since ICARDA had, in the past, two distinct barley breeding programs, located in Syria and Mexico, based on target countries and end uses. The one in Syria was the low-input breeding program, where the developed genotypes are more adapted to stressed environments (poor crop management, cold and drought conditions) and are bred for feed and food purposes. In contrast, the genotypes developed in Mexico under the high-input breeding program are more adapted to favorable conditions (high rainfall or irrigation and appropriate crop management) and are mainly bred for malt or feed. However, in the current study, the structural tendencies may not be absolute, as 34% of the genotypes were admixed; these may derive from crosses of different parents and may be suitable for both environments.
In our study, LD at P < 0.001 was observed in 16.27% of loci pairs and in 26.53% at the P < 0.05 significance level, where 74.4% are linked (<40 cM). Our results considerably exceeded the LD reported by Rodriguez et al. (2012) using S-SAP markers, who observed 15% of loci pairs in LD at P < 0.05 in 25 genotypes of Hordeum spontaneum and 13% of loci pairs at P < 0.01 in a landrace population of Sardinia. Our results were lower than the proportion reported by Malysheva-Otto et al. (2006), who observed 42% of loci pairs in LD at P < 0.05 in 207 European two-row spring barley genotypes using SSR markers. The most plausible explanations for the moderately low LD in our collection compared to Malysheva-Otto et al. (2006) are, first, the use of bi-allelic SNP markers and, second, the nature of the barley germplasm used in this study. Our panel includes a considerable number of landraces, while the breeding lines used in the current study were generated by frequently including landraces in ICARDA's barley breeding programs. The number of detected loci pairs in LD is greater with multi-allelic markers such as SSRs than with bi-allelic markers such as SNPs. Also, the level of LD is higher in cultivated barley than in landraces and wild genetic resources (Flint-Garcia et al. 2003). In the current study, we used bi-allelic SNP markers and nearly 12% of our population consisted of landraces or cultivars with a background of wild barleys; therefore, a low average level of LD was expected (Massman et al. 2011; Cockram et al. 2008; Malysheva-Otto et al. 2006).
Mean r² LD values higher than 0.15 extended up to 3.58 cM in our study, and we argue that the current marker density (0.231 cM/SNP) is sufficient for genome-wide association studies in barley. In the case of bi-allelic markers, previous studies have reported successful association mapping in barley using a marker density of 1 DArT marker per 1.5 cM (Comadran et al. 2009) and 1 SNP marker per 0.72 cM (Pasam et al. 2012; Massman et al. 2011; Cockram et al. 2008). In this study, the barley 9 K Illumina SNP array (6519 SNP markers) gave an approximate coverage of 1 SNP marker per 0.231 cM. The 9 K SNP platform has been successfully used for various GWAS of different traits in barley (Tamang et al. 2015; Munoz-Amatriain et al. 2014).
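The sufficiency argument can be made concrete by dividing the LD extent by the average marker spacing, which gives the approximate number of markers expected within one LD window. The sketch below uses only the two figures quoted in this paragraph. It assumes evenly spaced markers, which is a simplification, and the function name is hypothetical.

```python
def markers_per_ld_window(ld_extent_cm, spacing_cm_per_marker):
    """Approximate number of markers within one LD window, assuming even spacing."""
    if spacing_cm_per_marker <= 0:
        raise ValueError("marker spacing must be positive")
    return ld_extent_cm / spacing_cm_per_marker


# LD extent of 3.58 cM and average spacing of 0.231 cM per SNP.
ratio = markers_per_ld_window(3.58, 0.231)
print(round(ratio, 1))  # 15.5
```

On these figures, roughly 15 SNPs would fall within the distance over which mean r² stays above 0.15.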

Conclusions
This study provides a detailed description of a population that represents a wide range and historical survey of barley diversity within ICARDA germplasm and that comprises a considerable proportion of the genetic and phenotypic variation underlying the different strategies for adaptation to different environments. We have demonstrated that the barley genotypes studied were genetically and phenotypically diverse, and strongly structured. The marker coverage, population stratification and level of LD in our germplasm set were appropriate for running GWAS for key traits in barley. To detect QTL with the highest confidence and avoid spurious associations, association mapping should be carried out both in the combined population and independently in the two subpopulations, i.e. two- and six-row barley.