Tag Archives: genomic selection

NextGen PhD Student Visits Cornell for training on Prediction Modeling

May 13, 2016, Ithaca NY: Olumide Alabi, NextGen Cassava PhD student with the International Institute for Tropical Agriculture (IITA) in Ibadan, Nigeria, recently visited Jean-Luc Jannink’s laboratory group at Cornell University for training on prediction modeling. Olumide reports on his visit here:

Date: 8th March to 6th April, 2016

Location: Dr. Jean-Luc Jannink’s Research group, Plant breeding and genetics department, Bradfield Hall, Cornell University, Ithaca, NY

Acquiring practical skills in genomic prediction modeling was the basis of my brief visit to Cornell. I received useful explanations of the prediction modeling processes as they apply to the past and present genomic selection cycles being implemented in the IITA-NextGen Cassava Breeding Project.

The major activities included:

  1. The prediction modeling for the IITA-Genomic Selection

Marnin Wolfe, postdoctoral associate at Cornell, guided me from the known in genomic prediction in general to the unknown, with practical step-by-step activities using the IITA-NextGen cassava dataset. I received concrete training on the use of the single-step model and on its limitation: it can be computationally intensive with large datasets. I was also trained on the two-step model; formation of the kinship matrix using the “A.mat” function; model.matrix; kin.blup; phenotype dataset curation for prediction modeling; the G-BLUP model; the RR-BLUP model; the inclusion of multiple random effects in prediction modeling using the EMMREML model; and the general theory and coding syntax associated with these models. Among the newest concepts to me were the IITA Cycle 3 prediction and de-regressed BLUPs, especially the theory of reliability estimation and prediction error variance (PEV), and how these influence the accuracy of our predictions. Marnin did well in guiding me through these concepts both theoretically and practically, with exercises, reading assignments, and brainstorming sessions. To wrap up, I was guided through the entire IITA-GS Cycle 3 prediction model; Marnin provided the code to me with detailed explanations.
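
To make two of these ideas concrete, here is a minimal sketch in plain Python; it is not the code Marnin provided. It builds a genomic relationship (kinship) matrix in the VanRaden style, similar in spirit to what a function like “A.mat” produces, and de-regresses a BLUP by dividing it by its reliability, taken as 1 - PEV / genetic variance. The function names, the 0/1/2 dosage coding, and the toy numbers are all assumptions made for illustration.

```python
def additive_relationship_matrix(dosages):
    """Kinship matrix from a lines x markers table of 0/1/2 allele dosages.

    Assumes no missing genotypes and at least one polymorphic marker.
    """
    n_lines = len(dosages)
    n_markers = len(dosages[0])
    # Frequency of the counted allele at each marker.
    freqs = [sum(row[k] for row in dosages) / (2.0 * n_lines)
             for k in range(n_markers)]
    # Centre each marker by twice its allele frequency.
    centred = [[row[k] - 2.0 * freqs[k] for k in range(n_markers)]
               for row in dosages]
    # Scale by 2 * sum(p * (1 - p)) across markers.
    scale = 2.0 * sum(p * (1.0 - p) for p in freqs)
    return [[sum(a * b for a, b in zip(ri, rj)) / scale for rj in centred]
            for ri in centred]


def reliability(pev, sigma2_g):
    """Reliability of a BLUP: 1 - PEV / genetic variance."""
    return 1.0 - pev / sigma2_g


def deregress(blup, pev, sigma2_g):
    """De-regressed BLUP: the BLUP divided by its reliability."""
    return blup / reliability(pev, sigma2_g)


# Hypothetical toy genotypes: 4 lines scored at 4 markers.
dosages = [
    [0, 1, 2, 1],
    [1, 1, 0, 2],
    [2, 0, 1, 1],
    [1, 2, 1, 0],
]
K = additive_relationship_matrix(dosages)

# Hypothetical BLUP with PEV 0.2 under a genetic variance of 1.0.
drg = deregress(0.5, pev=0.2, sigma2_g=1.0)
```

A BLUP estimated with low reliability is shrunk strongly towards zero, so dividing by the reliability undoes that shrinkage before the value is used as the response in a second-step model.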

  2. Fitting the appropriate model for the genetic gain estimation

Estimating the “expected gain” from applying GS in cassava is not straightforward, as parents are selected on a selection index built from the GEBVs of several traits for each individual. Compared with the conventional breeder’s equation, gain estimation under GS needs a small adjustment, namely the selection accuracy factor in the model. To obtain this, we had to correlate the S.I_GEBVs (predicted) of the lines with their S.I_BLUPs (observed). In my brainstorming with Marnin, we came up with the concept highlighted below:

rA = corr(S.I_GEBVs, S.I_BLUPs)

where

S.I_GEBVs = wt_T1 × GEBV_T1 + wt_T2 × GEBV_T2 + … + wt_TN × GEBV_TN

S.I_BLUPs = wt_T1 × BLUP_T1 + wt_T2 × BLUP_T2 + … + wt_TN × BLUP_TN

wt_T = the economic weight used for trait T in the selection index model

Hence, the rA could be appropriately fitted in the breeder’s equation for the expected gain estimation.
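
The calculation can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the code we used: the economic weights, trait values, selection intensity, and genetic standard deviation below are hypothetical, and the gain formula is the textbook form of the breeder’s equation (intensity × accuracy × genetic standard deviation), which may differ in detail from the model finally fitted.

```python
from math import sqrt


def selection_index(trait_values, weights):
    """Weighted sum of trait values for each line.

    trait_values: one row per line, one value per trait.
    weights: one economic weight per trait, in the same trait order.
    """
    return [sum(w * v for w, v in zip(weights, row)) for row in trait_values]


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def expected_gain(intensity, accuracy, sigma_a):
    """Textbook breeder's equation: intensity * accuracy * genetic SD."""
    return intensity * accuracy * sigma_a


# Hypothetical economic weights for two traits.
weights = [1.0, 2.0]

# Hypothetical GEBVs (predicted) and BLUPs (observed) for five lines.
gebvs = [[0.8, 0.1], [0.2, 0.5], [1.1, -0.3], [-0.4, 0.9], [0.0, 0.2]]
blups = [[1.0, 0.0], [0.1, 0.7], [0.9, -0.1], [-0.6, 0.8], [0.3, 0.1]]

si_gebvs = selection_index(gebvs, weights)
si_blups = selection_index(blups, weights)

r_a = pearson(si_gebvs, si_blups)

# Hypothetical selection intensity and genetic SD of the index.
gain = expected_gain(intensity=1.4, accuracy=r_a, sigma_a=2.0)
```

Because both indices use the same weights, rA measures how well the predicted ranking of lines on the index agrees with the observed one.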

  3. GWAS exploration on the plant type dataset

Dunia (Research Associate) guided me through genome-wide association studies (GWAS) using datasets on plant type and the associated SNP data. To better handle the categorical nature of the plant type trait (compact_1, open_2, umbrella_3 and cylindrical_4), Marnin suggested recoding it as binomial scores (e.g. compact: 0_absent, 1_present), so that each category is analyzed as one trait at a time. This enabled us to fit a GLIMMIX model, with the flexibility of a link function, to estimate variance components.
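
The recoding step can be sketched as follows. This is only an illustration of the idea, assuming the 1 to 4 scoring listed above; the function name and the example scores are hypothetical, and missing scores are kept as missing rather than counted as absent.

```python
# Plant type scores as listed above: 1 = compact, 2 = open,
# 3 = umbrella, 4 = cylindrical.
PLANT_TYPES = {1: "compact", 2: "open", 3: "umbrella", 4: "cylindrical"}


def binary_traits(scores):
    """Turn one categorical score per plant into one 0/1 trait per category.

    A missing score (None) stays missing in every derived trait.
    """
    return {
        name: [None if s is None else int(s == code) for s in scores]
        for code, name in PLANT_TYPES.items()
    }


# Hypothetical scores for five plants, one of them missing.
scores = [1, 3, None, 4, 1]
traits = binary_traits(scores)
compact = traits["compact"]  # 1 where the plant was scored compact
```

Each derived 0/1 column can then be analyzed as a separate binary trait, one at a time.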

  4. Participation in the research group and graduate student seminars and symposiums

Skills acquired

I can now implement genomic prediction in practice with more confidence, given an appropriate dataset. I gained a detailed understanding of the past IITA GS cycle selections and a first-hand understanding of the present Cycle 3 predictions (thanks to Marnin). I also got a better grasp of several aspects of statistical modeling to include in my thesis report, especially the expected gain estimation concept and some genomic prediction steps.

Acknowledgement

My appreciation goes to Dr. Jean-Luc Jannink for the time and audience he gave me while I was in Ithaca, the update meetings in his office, and the facilitation of my visit, among other things.

Many thanks to Marnin for devoting so much time to coaching me. In fact, he was my tutor throughout my stay in Ithaca. Dunia did a great job, as did my NextGen graduate student colleagues, Ugo, Uche, and Alfred. Alex of BTI is appreciated for his kind gestures all through my time in Ithaca. I must also mention the logistics support from Dan, Karen, and the team in the IP-CALS office.

I want to thank my supervisors at IITA, Drs. Peter Kulakow and Ismail Rabbi, for granting the home support needed to visit Cornell during this period. Thanks to Dr. Chiedozie Egesi and Dr. Hale Tufan. My final appreciation goes to the Cornell-NextGen Cassava project for the full support. My regards to all.

Olumide during his training at Cornell and with Marnin Wolfe, bottom left


NextGen Cassava featured in The Economist

Dr. Chiedozie Egesi, NextGen Cassava project manager


During the recent annual AAAS (American Association for the Advancement of Science) meeting in Washington, D.C., The Economist interviewed Dr. Chiedozie Egesi, NextGen Cassava project manager. Chiedozie spoke about the potential of the NextGen Cassava project to improve the cassava crop and address challenges such as disease, low yield, and vitamin deficiency.

See the full story, “Cassava-nova,” in the online edition of The Economist.

NEXTGEN PhD Student Roberto Lozano Attends Cold Spring Harbor Laboratory Course

NEXTGEN Cassava PhD student Roberto Lozano recently attended a two-week course on Statistical Methods for Functional Genomics at Cold Spring Harbor Laboratory (CSHL), and he reports on it here:

CSHL is considered among the leading research institutions in the world in molecular biology and genetics, not only because of its history (a considerably long list of Nobel laureates) but also because of the research currently taking place there.

Part of my research as a graduate student is focused on using high-throughput genomic data to identify functional regions across the cassava genome and try to use this information to improve Cassava GS-assisted breeding. Some of the high-throughput genomic data will come from transcriptome sequencing, chromatin footprinting and methylation profiling analysis.

Statistical Methods for Functional Genomics course attendees


High-throughput sequencing has become a major technique in biological research. However, analyzing the big data sets these technologies produce carries challenges that are not always properly tackled, and errors in the analysis can threaten the biological inferences that are made. All the techniques that I plan to use in my research carry some unique difficulties, and the statistical principles underlying their analysis methods are sometimes complex. This course tackled all those techniques, and the instructors and speakers have wide experience working with that kind of data. That is what initially drew me to apply for this course.

DNA Sculpture at CSHL


After taking it, I have to admit that it was as good as it could get. All the instructors were great; each of them leads their own top-notch research group, and they were really helpful and resourceful. The invited speakers were great as well, showing some of the latest techniques and applications of next-gen sequencing. The attendees came from all around the world and worked in both academia and private companies, and the wide variety of their study fields (cancer, neurobiology, plant genomics, immunology and more) ensured lots of interesting discussions. Finally, I have to mention that even the location of Cold Spring Harbor Laboratory was something else: a beautiful environment that lets people focus on their research.