OryzaSNP Project
The International Rice Functional Genomics Consortium (IRFGC) has been funded to perform sequence analysis on diverse cultivars of rice (Oryza sativa) for the purpose of identifying SNPs. After a request to the community for suggestions, 20 cultivars were chosen for inclusion in this project: Nipponbare, IR64, SHZ2, Azucena, LTH, N22, Dom Sufid, Pokkali, Moroberekan, Aswina, Dular, Rayada, M202, FR13A, Swarna, Sadu Cho, Zhenshan 97B, Minghui 63, Cypress and Tainung 67. Sequence analysis has been performed on 100 Mb of gene-rich genomic sequence from each of these cultivars using Perlegen hybridization-based “resequencing” technology.
Resequencing data were analyzed using both a Perlegen model-based method and a machine learning method. These SNP identification algorithms are similar to those used in a comparable Arabidopsis thaliana SNP identification project. The two methods have slightly different specificities, and rice SNPs that were identified by both methods and that fall within non-repetitive 25-mers were determined to be the highest-quality SNPs, with the lowest false positive rate. Over 160,000 high-quality SNPs have been identified in the rice genome. The recall rate for SNP discovery was 10.7%, as determined by Sanger-based sequencing of a portion of the sequences that were resequenced by Perlegen's hybridization-based method. The false positive rate for the high-quality SNPs was 2.9%.
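The selection rule described above (keep only SNPs called by both methods that also lie in a non-repetitive 25-mer) can be illustrated with a minimal Python sketch. This is not the project's actual pipeline. The function names, the use of 0-based positions on a single reference string, the choice to center the 25-mer on the SNP, and the reading of "non-repetitive" as "occurs exactly once in the reference" are all assumptions made for illustration.

```python
from collections import Counter


def kmer_counts(reference, k=25):
    """Count every k-mer in the reference sequence (hypothetical helper)."""
    return Counter(reference[i:i + k] for i in range(len(reference) - k + 1))


def high_quality_snps(model_calls, ml_calls, reference, k=25):
    """Return positions called by both methods whose centered k-mer is unique.

    model_calls, ml_calls: sets of 0-based SNP positions (assumed format).
    A k-mer is treated as non-repetitive if it occurs exactly once in the
    reference; this definition is an assumption, not taken from the project.
    """
    counts = kmer_counts(reference, k)
    half = k // 2
    kept = set()
    # Only SNPs reported by both calling methods are considered.
    for pos in model_calls & ml_calls:
        start = pos - half
        # Skip positions too close to either end for a full centered k-mer.
        if start < 0 or start + k > len(reference):
            continue
        if counts[reference[start:start + k]] == 1:
            kept.add(pos)
    return kept
```

For a toy run, a shorter window such as k=5 can be used on a short string; a position inside a homopolymer run is then rejected because its window occurs more than once, while a position in unique sequence is kept.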
The oligo arrays that were used for the hybridization-based “resequencing” were designed based on the IRGSP version 4 rice pseudomolecules. However, because there are two commonly used versions of the Nipponbare pseudomolecules, each with its own gene models, the SNPs were also mapped relative to the TIGR/MSU rice pseudomolecules. This allowed all SNPs to be annotated relative to both the RAP rice gene models and the TIGR/MSU rice gene models. Researchers may view the SNPs via the Oryza SNP Genome Browser (TIGR/MSU-based or IRGSP-based) or by using any of several web-based search pages.
The OryzaSNP Project data are also mirrored at the project's IRRI-based website.