GENETIKA, 2015, Vol. 51, No. 6, pp. 694-703
DEVELOPMENT AND ASSESSMENT OF EST-SSR MARKER FOR THE GENETIC DIVERSITY AMONG TOBACCOS (Nicotiana tabacum L.)
© 2015 C. Cai1, *, Y. Yang2, *, L. Cheng1, C. Tong3, J. Feng1
1Tobacco Research Institute of Hubei Province, Wuhan 430030, Hubei, China
e-mail: firstname.lastname@example.org
2College of Life Science, Hubei University, Wuhan 430062, Hubei, China
e-mail: email@example.com
3Oil Crops Research Institute, Chinese Academy of Agricultural Sciences, Wuhan 430062, Hubei, China
e-mail: firstname.lastname@example.org
Received July 09, 2014
Because of their advantages, EST-SSR markers have been employed as powerful tools for genetic diversity analysis, comparative mapping, and phylogenetic studies. In this study, a total of 429,869 tobacco (Nicotiana tabacum L.) ESTs were downloaded from public databases, offering an opportunity to identify SSRs in ESTs by data mining. In total, 38,165 SSRs were identified from 379,967 uni-ESTs, a frequency of one SSR per 5.52 kb. Mono- and trinucleotide repeat motifs were the dominant repeat types, accounting for 40.53 and 34.51% of all SSRs, respectively. After eliminating mononucleotide-containing sequences, 86 pairs of primers were designed and tested for amplification in four tobacco accessions. Only 15 primer pairs (17.44%) showed polymorphism, and these were then used to assess the genetic diversity of 20 tobacco accessions. Unweighted pair-group method with arithmetic average (UPGMA) dendrograms and principal coordinates analysis (PCA) plots revealed genetic differentiation between N. rustica and N. tabacum, and between oriental tobacco and the other accessions of N. tabacum. The present study reports the development of EST-SSR markers in tobacco by exploiting EST databases and confirms that this is an effective way to develop markers. These EST-SSRs can serve in studies on cultivar identification, genetic diversity analysis, and genetics in tobacco.
Tobacco (Nicotiana tabacum L.) is a natural allopolyploid (2n = 48) derived from ancestors of the diploid species Nicotiana sylvestris (2n = 24) and Nicotiana tomentosiformis (2n = 24) [1, 2], which formed up to six million years ago. Tobacco is one of the most important economic crops in the world and is cultivated in more than 100 countries. Around 1,900 accessions of N. tabacum are maintained at the ex situ U.S. Nicotiana Germplasm Collection, and they are classified into flue-cured, burley, oriental, cigar, dark (air/sun cured), and primitive tobacco on the basis of curing method and biochemical characteristics. Burley tobaccos are believed to derive from a mutation identified in a strain of Maryland tobacco, and flue-cured tobaccos are closely related to dark fire-cured tobaccos. However, high morphological similarity and low genetic differentiation are observed in tobacco, which limits the development of new tobacco cultivars. One of the major limitations to the application of genomic technology in tobacco improvement is the paucity of informative DNA markers. Therefore, it is important to develop efficient molecular markers for genetic research in N. tabacum.
* These authors contributed equally to this work.
Molecular marker technology has been used to study genetic polymorphism at the genomic level in many crop species [5, 6]. Several kinds of molecular markers have been developed, such as RAPD (Random Amplified Polymorphic DNA), AFLP (Amplified Fragment Length Polymorphism), ISSR (Inter-Simple Sequence Repeat), SSR (Simple Sequence Repeat), and EST-SSR (Expressed Sequence Tag-derived SSR) markers. RAPD and AFLP markers have been applied to analyze genetic diversity and relationships among tobacco cultivars, and they revealed a relatively low level of genetic diversity among tobacco accessions [7, 8]. Interestingly, they were also useful for detecting chromatin introgressed from wild relatives [4, 9, 10]. Both marker types have their own advantages in the analysis of genetic variation. However, RAPD shows poor consistency and low reproducibility [11, 12], and the AFLP system requires multiple steps and suffers from high pseudopolymorphism. These disadvantages limit their in-depth application in plant genetic research.
SSRs, including genomic SSRs and EST-SSRs, are the preferred type of molecular marker. They have many advantages over other markers, such as codominance, high levels of polymorphism, high reproducibility, and abundant distribution throughout genomes. SSR markers have been widely applied in plant genetic research [13, 14]. At present, only a limited number of studies using SSR markers in tobacco have been reported. A total of 637 pairs of genomic SSR primers were developed on the basis of large-scale genomic sequencing of tobacco, and 282 polymorphic primers were used to construct a genetic linkage map for this species. Moon et al. (2009) selected 71 pairs of SSR primers to detect changes in the genetic diversity of American flue-cured tobacco germplasm over seven decades of cultivar development, and found a gradual reduction in genetic diversity for flue-cured tobacco. Using 49 SSR markers, the genetic diversity and relationships of 312 worldwide tobacco accessions were assessed, and a tobacco core collection of 89 accessions was constructed.
Recently, with the progress of tobacco cDNA sequencing projects, a large number of tobacco EST sequences have been published, which provides an excellent opportunity to develop EST-SSR markers in tobacco on a large scale. Developing EST-SSRs in silico by mining EST databases is time-saving and highly efficient. Moreover, EST-SSRs possess two intrinsic advantages over genomic SSRs. First, EST-SSRs are more transferable across taxonomic boundaries because DNA sequences in expressed regions of the genome are conserved. Second, EST-SSRs are located in transcribed sequences and may be related to the function of the corresponding genes [18, 19]. To date, EST-SSRs have been employed as powerful genetic markers for genetic mapping, genetic diversity analysis, and phylogenetic studies. However, no study has been reported on the mining of EST-SSRs in tobacco.
In this study, we developed a set of EST-SSR markers based on tobacco EST sequences and used the resulting markers to assess the genetic diversity and relatedness among 20 tobacco accessions. The results reported here will be a valuable resource for further breeding and improvement of tobacco varieties.
MATERIAL AND METHODS
Plant material and DNA isolation
Four burley tobacco accessions, Va1050 with white color, Va1050 with red color, Burley37 with white color, and Burley37 with red color, were used for primer screening. A set of 20 tobacco accessions, including 18 of N. tabacum and 2 of N. rustica, was selected to detect genetic polymorphism and to assess the effectiveness of EST-SSRs in tobacco (Table 1). The 18 N. tabacum accessions comprise 2 oriental tobaccos, 2 cigar tobaccos, 5 burley tobaccos, 3 flue-cured tobaccos, 3 sun-cured tobaccos, and 3 Maryland tobaccos. All accessions were collected from the Lab of Tobacco Research Institute of Hubei Province. Young leaves from each material, grown in the greenhouse, were sampled at the two-leaf stage and used for DNA extraction. Total DNA was extracted using the improved CTAB method and stored at -20°C.

Table 1. The 20 tobacco accessions for genetic diversity analysis in the present study

Code  Accession name      Species      Type
1     Rustica             N. rustica
2     Yangjiapa rustica   N. rustica
3     Canik               N. tabacum   Oriental tobacco
4     Samsun              N. tabacum   Oriental tobacco
5     Havana10            N. tabacum   Cigar tobacco
6     Little Dutch        N. tabacum   Cigar tobacco
7     B21                 N. tabacum   Burley tobacco
8     B37                 N. tabacum   Burley tobacco
9     Tn86                N. tabacum   Burley tobacco
10    Dabai No. 1         N. tabacum   Burley tobacco
11    Ebai No. 20         N. tabacum   Burley tobacco
12    Zhongyan 100        N. tabacum   Flue-cured tobacco
13    K326                N. tabacum   Flue-cured tobacco
14    Honghua Dajinyuan   N. tabacum   Flue-cured tobacco
15    Badong Dananyan     N. tabacum   Sun-cured tobacco
16    Datiebanyan         N. tabacum   Sun-cured tobacco
17    Maobaziyan          N. tabacum   Sun-cured tobacco
18    Maryland-1          N. tabacum   Maryland tobacco
19    Md609               N. tabacum   Maryland tobacco
20    Kuangye Maryland    N. tabacum   Maryland tobacco
Data mining and primer design
A total of 429,869 tobacco EST sequences were downloaded from the GenBank database (NCBI, http://www.ncbi.nih.gov/, 12 May 2013). First, short (<100 bp) and long (>700 bp) EST sequences were removed using a Perl script, est_trimmer.pl (http://pgrc.ipk-gatersleben.de/misa/). The remaining EST sequences were then assembled with CD-HIT (http://weizhong-lab.ucsd.edu/cd-hit/), with the parameters set so that sequences overlapped by more than 14 nucleotides and shared more than 98% similarity. Finally, MISA (http://pgrc.ipk-gatersleben.de/misa/) was used to detect SSR loci. The repeat unit size and minimum number of repeats were set to 1/10, 2/6, 3/5, 4/5, 5/5, and 6/5 for mono-, di-, tri-, tetra-, penta-, and hexanucleotide motifs, respectively, and the maximum insertion of non-repeat nucleotides was 100 bp.
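The length filter and the repeat thresholds described above can be illustrated with a minimal Python sketch. This is not the est_trimmer.pl or MISA code used in the study: the function names `length_ok`, `is_compound` and `find_ssrs` are hypothetical, only perfect (uninterrupted) repeats are reported, and the 100 bp interruption setting for compound SSRs is not modelled.

```python
# Minimum repeat counts per motif length, as stated in the text
# (1/10, 2/6, 3/5, 4/5, 5/5, 6/5).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def length_ok(seq, min_len=100, max_len=700):
    """Keep ESTs that are neither short (<100 bp) nor long (>700 bp)."""
    return min_len <= len(seq) <= max_len


def is_compound(motif):
    """True if the motif is itself a repeat of a shorter unit (e.g. 'AGAG')."""
    n = len(motif)
    return any(n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n))


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    Illustrative assumption: each motif length is scanned independently
    from left to right, and the first qualifying motif phase is reported.
    """
    seq = seq.upper()
    hits = []
    for unit, min_rep in MIN_REPEATS.items():
        i = 0
        while i + unit <= len(seq):
            motif = seq[i:i + unit]
            if set(motif) - set("ACGT") or is_compound(motif):
                i += 1
                continue
            j = i + unit
            while seq[j:j + unit] == motif:  # extend while the motif repeats
                j += unit
            reps = (j - i) // unit
            if reps >= min_rep:
                hits.append((i, motif, reps))
                i = j  # continue after the reported repeat
            else:
                i += 1
    return sorted(hits)
```

Calling `find_ssrs` on a sequence returns the zero-based start position, motif, and repeat count of each hit; in a pipeline such as the one described, `length_ok` would be applied to each EST before clustering and SSR detection.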
EST-SSR primers were designed using Primer Premier 5.0 with the following parameters: the optimal primer length = 20 bp (range = 18-25 bp), the length of PCR products = 100-3

Table 2. Distributions of the major SSR motifs identified in ESTs of tobacco

Repeat type        Number   Proportion, %*   Frequency, %   Mean distance, kb
Mononucleotide     15470    40.53            4.07           13.63
  A/T              13972    36.61            3.68
  C/G              1498     3.93             0.39
Dinucleotide       8434     22.10            2.22           25.00
  AG/CT            4640     12.16            1.22
  AC/GT            2415     6.33             0.64
  AT/TA            1345     3.52             0.35
Trinucleotide      13171    34.51            3.46           16.01
  AGC/CTG          5658     14.83            1.49
  AAG/CTT          3120     8.18             0.82
  AAC/GTT          975      2.55             0.26
  ATC/ATG          616      1.61             0.16
  CCG/CGG          597      1.56             0.16
  AAT/ATT          594      1.56             0.16
  AGG/CCT          548      1.44             0.14
  ACC/GGT          506      1.33             0.13
Tetranucleotide    713      1.87             0.19           295.67
Pentanucleotide    175      0.46             0.05           1204.63
Hexanucleotide     202      0.53             0.05           1043.61

* Repeat motifs with a proportion of less than 1% are not listed here.
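The derived columns of Table 2 can be recomputed from the raw counts, as the following hedged Python sketch shows. Proportion is taken as the count divided by all SSRs, and frequency as the count divided by the 379,967 uni-ESTs. The total sequence length is not reported in the text, so it is back-calculated here from the stated density of one SSR per 5.52 kb; this is an assumption, and the resulting mean distances agree with the table only to within rounding. The function name `summarize` is hypothetical.

```python
TOTAL_UNI_ESTS = 379_967   # uni-ESTs searched, from the text
KB_PER_SSR = 5.52          # one SSR per 5.52 kb, from the text

# SSR counts per repeat type, from Table 2
COUNTS = {
    "Mononucleotide": 15470,
    "Dinucleotide": 8434,
    "Trinucleotide": 13171,
    "Tetranucleotide": 713,
    "Pentanucleotide": 175,
    "Hexanucleotide": 202,
}


def summarize(counts, n_ests, kb_per_ssr):
    """Return (total SSRs, {type: (proportion %, frequency %, mean distance kb)})."""
    total = sum(counts.values())
    # Assumption: total length is estimated from the rounded SSR density,
    # because the text does not report it directly.
    est_total_kb = total * kb_per_ssr
    rows = {}
    for name, n in counts.items():
        rows[name] = (
            round(100 * n / total, 2),    # share of all SSRs
            round(100 * n / n_ests, 2),   # SSRs per 100 uni-ESTs
            round(est_total_kb / n, 2),   # approximate mean distance
        )
    return total, rows


total, rows = summarize(COUNTS, TOTAL_UNI_ESTS, KB_PER_SSR)
for name, (prop, freq, dist) in rows.items():
    print(f"{name:16s} {prop:6.2f} {freq:5.2f} {dist:9.2f}")
```

The six counts sum to the 38,165 SSRs reported in the text, which gives a quick consistency check on the table.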