Tag Archives: IITA

NextGen PhD Student Visits Cornell for training on Prediction Modeling

May 13, 2016, Ithaca NY: Olumide Alabi, NextGen Cassava PhD student with the International Institute for Tropical Agriculture (IITA) in Ibadan, Nigeria, recently visited Jean-Luc Jannink’s laboratory group at Cornell University for training on prediction modeling. Olumide reports on his visit here:

Date: 8th March to 6th April, 2016

Location: Dr. Jean-Luc Jannink’s research group, Plant Breeding and Genetics department, Bradfield Hall, Cornell University, Ithaca, NY

The purpose of my brief visit to Cornell was to acquire practical skills in genomic prediction modeling. I received useful explanations of the prediction modeling processes applied in the past and present genomic selection cycles of the IITA-NextGen Cassava Breeding Project.

The main activities included:

  1. The prediction modeling for the IITA-Genomic Selection

Marnin Wolfe, postdoctoral associate at Cornell, guided me from the known (genomic prediction in general) to the unknown, with practical step-by-step activities using the IITA-NextGen cassava dataset. I received concrete training on the single-step model and on its limitation: it can be computationally intensive with large datasets. I was also trained on the two-step model; forming the kinship matrix with the “A.mat” function; model.matrix; kin.blup; curating phenotype datasets for prediction modeling; the G-BLUP and RR-BLUP models; including multiple random effects in prediction models using EMMREML; and the general theory and coding syntax associated with these models. Among the newest concepts to me were those I met while being guided through the IITA Cycle 3 prediction: de-regressed BLUPs and, especially, the theory of reliability estimation, PEV (prediction error variance), and how these influence the accuracy of our predictions. Marnin guided me through these concepts well, both theoretically and practically, with exercises, reading assignments, and brainstorming sessions. To wrap up, I was taken through the entire IITA-GS Cycle 3 prediction model; Marnin provided the code with detailed explanations.
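As a rough illustration of how reliability, PEV, and de-regressed BLUPs relate, here is a minimal Python sketch. It is not the code Marnin provided. It assumes one common definition, reliability = 1 - PEV / genetic variance, and the simplest de-regression, which divides each BLUP by its reliability. The function names and all numbers are hypothetical.

```python
def reliability(pev, genetic_variance):
    """Reliability of a BLUP under the assumed definition 1 - PEV / sigma2_g."""
    return 1.0 - pev / genetic_variance


def deregress(blup, rel):
    """Simple de-regressed BLUP: the BLUP divided by its reliability."""
    if rel <= 0:
        raise ValueError("reliability must be positive to de-regress a BLUP")
    return blup / rel


# Made-up values for two clones: (BLUP, PEV), with genetic variance 1.0.
genetic_variance = 1.0
clones = {"clone_A": (0.40, 0.20), "clone_B": (0.40, 0.60)}

deregressed = {}
for name, (blup, pev) in clones.items():
    rel = reliability(pev, genetic_variance)
    deregressed[name] = deregress(blup, rel)
    print(name, "reliability:", round(rel, 2), "de-regressed BLUP:", round(deregressed[name], 2))
```

The two clones have the same BLUP, but the one with the larger PEV has lower reliability and so is scaled up more by de-regression.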

  2. Fitting the appropriate model for the genetic gain estimation

Estimating the “expected gain” from GS in cassava is not straightforward, because parents are selected on a selection index built from the GEBVs of several traits. Compared with the conventional breeder’s equation, the GS version needs one adjustment: a selection accuracy term. To obtain it, we correlated the predicted index values of the lines (S.I_GEBVs) with the observed ones (S.I_BLUPs). In my brainstorming with Marnin, we came up with the concept below:

rA = corr(S.I_GEBVs, S.I_BLUPs)

where S.I_GEBVs = wt_1 × GEBV_T1 + wt_2 × GEBV_T2 + … + wt_N × GEBV_TN

wt_i = the economic weight used for trait T_i in the selection index model


Hence, rA can be used as the accuracy term in the breeder’s equation for the expected gain estimation.
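The calculation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the trait values and economic weights are made up, the helper names are hypothetical, and the expected gain uses the usual textbook form of the breeder’s equation, gain = selection intensity × accuracy × genetic standard deviation of the index, with hypothetical values for intensity and standard deviation.

```python
import math


def selection_index(trait_values, weights):
    """Weighted sum of trait values for each line (one inner list per line)."""
    return [sum(w * v for w, v in zip(weights, line)) for line in trait_values]


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def expected_gain(intensity, accuracy, genetic_sd):
    """Breeder's equation with an accuracy term: i * rA * sigma."""
    return intensity * accuracy * genetic_sd


# Hypothetical data: four lines, two traits, economic weights 1.0 and 0.5.
weights = [1.0, 0.5]
gebvs = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]]   # predicted
blups = [[1.2, 1.8], [1.9, 1.5], [2.7, 4.1], [4.2, 2.6]]   # observed

si_gebvs = selection_index(gebvs, weights)
si_blups = selection_index(blups, weights)
r_a = pearson(si_gebvs, si_blups)

# Intensity 1.4 and genetic SD 2.0 are placeholders, not project values.
gain = expected_gain(intensity=1.4, accuracy=r_a, genetic_sd=2.0)
print("rA:", round(r_a, 3), "expected gain:", round(gain, 3))
```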

  3. GWAS exploration on the plant type dataset

Dunia (Research Associate) guided me through genome-wide association studies (GWAS) using datasets on plant type and the associated SNP data. To better handle the categorical nature of the Plant Type trait (compact_1, open_2, umbrella_3 and cylindrical_4), Marnin suggested recoding it as binary scores (e.g. Compact: 0_absent, 1_present), so that each plant type is analyzed as a separate trait, one at a time. This enabled us to fit a GLIMMIX model, with the flexibility of a link function when estimating variance components.
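The recoding step can be sketched as follows. This Python fragment only shows how one categorical score becomes four presence/absence traits; it does not fit the GLIMMIX model. The category codes follow the text above, while the function name and the example scores are hypothetical.

```python
# Plant type codes as described in the text.
PLANT_TYPES = {1: "compact", 2: "open", 3: "umbrella", 4: "cylindrical"}


def binarize_plant_type(scores):
    """Turn categorical scores (1-4) into one 0/1 trait per plant type.

    Missing scores (None) stay missing in every binary trait.
    """
    traits = {}
    for code, name in PLANT_TYPES.items():
        traits[name] = [None if s is None else int(s == code) for s in scores]
    return traits


# Made-up scores for six plants, one of them missing.
scores = [1, 3, 1, 4, None, 2]
binary_traits = binarize_plant_type(scores)
for name, values in binary_traits.items():
    print(name, values)
```

Each resulting 0/1 vector can then be analyzed as its own trait in a separate model run.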

  4. I participated in research group and graduate student seminars and symposia.

Skills acquired

Given an appropriate dataset, I can now implement genomic prediction in practice with more confidence. I gained a detailed understanding of the past IITA GS cycles and a first-hand understanding of the present Cycle 3 predictions (thanks to Marnin). I also have a clearer idea of several aspects of statistical modeling to include in my thesis report, especially the expected gain estimation concept and some genomic prediction steps.


My appreciation goes to Dr. Jean-Luc Jannink for the time and attention he gave me while I was in Ithaca, for the update meetings in his office, and for facilitating my visit, among other things.

Many thanks to Marnin for devoting so much time to coaching me. In fact, he was my tutor throughout my time in Ithaca. Dunia did a great job, as did my NextGen graduate student colleagues, Ugo, Uche, and Alfred. I appreciate Alex of BTI for his kind gestures throughout my stay. I must also mention the logistics support from Dan, Karen, and the team in the IP-CALS office.

I want to thank my supervisors at IITA, Drs. Peter Kulakow and Ismail Rabbi, for granting the support from home that I needed to visit Cornell during this period. Thanks to Dr. Chiedozie Egesi and Dr. Hale Tufan. My final appreciation goes to the Cornell-NextGen Cassava project for the full support. My regards to all.

Olumide during his training at Cornell and with Marnin Wolfe, bottom left

NextGen Cassava featured in The Economist

Dr. Chiedozie Egesi, NextGen Cassava project manager

During the recent annual AAAS (American Association for the Advancement of Science) meeting in Washington, D.C., The Economist interviewed Dr. Chiedozie Egesi, NextGen Cassava project manager. Chiedozie spoke about the potential of the NextGen Cassava project to improve the cassava crop and address challenges such as disease, low yield, and vitamin deficiency.

See the full story, “Cassava-nova,” in the online edition of The Economist.

NEXTGEN Sends 58 Team Members to Nanning, China for World Congress on Root and Tuber Crops

Fifty-eight NEXTGEN scientists representing NEXTGEN partners from Africa, North America, and South America will attend the first World Congress on Root and Tuber Crops (WCRTC) in Nanning, China January 18-22. A 5-day conference drawing more than 500 scientists from across the globe, WCRTC represents the merger of the 3rd Scientific Conference of the Global Cassava Partnership for the 21st Century (GCP21) & the 17th Symposium of the International Society for Tropical Root Crops (ISTRC). During the conference, dedicated to adding value to root and tuber crops, more than twenty NEXTGEN scientists will give presentations on their current research on topics ranging from diseases threatening cassava to breeding to biodiversity. Additionally, 17 NEXTGEN Masters and PhD students will present posters on their research at Cornell University and Makerere University in Kampala, Uganda.

NEXTGEN is a proud sponsor of the World Congress on Root and Tuber Crops and is honored to support four outstanding female cassava researchers with NEXTGEN Cassava Early Career Female Scientist travel awards to attend the conference. These awards, presented to Teddy Amuge from Uganda; Sally Mallowa-Nyawanda from Kenya; Sarah Nanyiti from Uganda; and Nneka Okereke from Nigeria, will provide an opportunity for the awardees to meet with cassava experts from around the world and to present their research to a large and influential audience.

NEXTGEN Students and Postdoc Attend Genomic Selection Course in Aarhus, Denmark

Ismail Kayondo, Dunia Pino del Carpio, and Olumide Alabi at Aarhus University

NEXTGEN Cassava PhD students Olumide Alabi (IITA) and Ismail Kayondo (NaCRRI) and postdoctoral fellow Dunia Pino del Carpio (Cornell) recently attended a week-long course at Aarhus University, Denmark, titled “Statistical Models for Genomic Predictions in Animals and Plants.” Below is Olumide Alabi’s report on the course:

Day 1: Background on genomic selection; classical MAS; basic prediction using GWAS; environmental effects
The course started in the morning with a theoretical background and introduction to the topics stated above. A hands-on exercise on the optimization of breeding plans using phenotypic selection and genomic selection was simulated with varying population sizes and marker densities. In the afternoon, a real dataset was provided to work with, from which we individually ran a simple GWAS model using R and later fitted prediction models from the GWAS result after we had identified significant SNP effects. The last exercise of the day was fitting the GWAS model with environmental effects and comparing prediction models using different cross validation schemes.

  • snp_select <- which(gwas_BW_results[, 4] < 2.7e-5)
  • lm(dat_train$BW ~ factor(dat_train$Sex) + factor(dat_train$Batch) + geno_train$V2)

Lessons: Drawing conclusions by comparing predictions computed from larger or smaller sets of SNPs, chosen by applying different p-value thresholds to the GWAS results.
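The course exercises were in R. As a language-neutral illustration of the thresholding idea, the minimal Python sketch below shows how the set of selected SNPs grows as the p-value threshold is relaxed. The p-values are made up, and the function name is hypothetical; only the 2.7e-5 threshold comes from the exercise above.

```python
def select_snps(p_values, threshold):
    """Return the (0-based) indices of SNPs whose p-value is below the threshold."""
    return [i for i, p in enumerate(p_values) if p < threshold]


# Made-up GWAS p-values for five SNPs.
p_values = [3.0e-7, 0.41, 1.2e-5, 4.0e-4, 0.03]

selected = {}
for threshold in (2.7e-5, 1e-3, 0.05):
    selected[threshold] = select_snps(p_values, threshold)
    print("threshold", threshold, "->", len(selected[threshold]), "SNPs")
```

Each selected set would then feed its own prediction model, so that the models can be compared by cross-validation. Note that R's which() returns 1-based positions, whereas the indices here are 0-based.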

Day 2: Whole-genome SNP regression model; introduction to single-trait and multi-trait GBLUP; cross‐validation systems
After preliminary theoretical discussions of each of the topics listed above, much of the time was devoted to hands-on exercises. The first exercise of the day was random regression with the BGLR package in R. It was noted that BGLR does not accept missing marker data, so missing genotypes were replaced with the mean genotype (2p, where p is the allele frequency). We installed DMU, an R package developed by the Danish group, and used it to run multi-trait WGRR and GBLUP. Finally, we carried out a cross-validation exercise with varying k-fold schemes.
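The two data-handling steps mentioned above, mean imputation and k-fold splitting, can be sketched in plain Python. This is an illustration only and does not call BGLR or DMU. It assumes genotypes coded 0/1/2 with None for missing calls and at least one observed call per marker; the function names and data are hypothetical.

```python
import random


def impute_with_marker_mean(genotypes):
    """Replace missing calls (None) with the marker mean, which equals 2p."""
    n_markers = len(genotypes[0])
    means = []
    for j in range(n_markers):
        observed = [row[j] for row in genotypes if row[j] is not None]
        means.append(sum(observed) / len(observed))
    return [
        [means[j] if row[j] is None else row[j] for j in range(n_markers)]
        for row in genotypes
    ]


def k_fold_indices(n_individuals, k, seed=0):
    """Shuffle individual indices and deal them into k folds."""
    indices = list(range(n_individuals))
    random.Random(seed).shuffle(indices)
    return [indices[fold::k] for fold in range(k)]


# Made-up genotypes: three individuals (rows) by two markers (columns).
genotypes = [[0, 2], [None, 1], [2, None]]
imputed = impute_with_marker_mean(genotypes)
print(imputed)

# Each fold serves once as the validation set; the rest form the training set.
folds = k_fold_indices(n_individuals=10, k=5)
for validation in folds:
    training = [i for fold in folds if fold is not validation for i in fold]
    print("validate on", sorted(validation), "train on", len(training), "individuals")
```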

Day 3: Making, scaling and interpreting Genomic Relationships matrices; single step GBLUP and scaling G and A
I initially found it difficult to comprehend some aspects of the day 3 topics and exercises; however, the assigned publications (VanRaden, 2008; Legarra et al., 2015), additional explanation by the instructor, and interaction with colleagues in the class helped. Time was dedicated to the single-step approach for genomic evaluation, the compatibility of the G and A matrices, and running the single-step model in Rdmu using the pedigree file.
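For readers new to genomic relationship matrices, a minimal Python sketch of the first method in VanRaden (2008) follows: G = ZZ' / (2 Σ p(1 − p)), where Z holds genotypes coded 0/1/2 centered by twice the allele frequency. It assumes complete genotypes, at least one polymorphic marker, and allele frequencies estimated from the same sample; the function name and data are hypothetical, and this is not the course code.

```python
def vanraden_g(genotypes):
    """Genomic relationship matrix from 0/1/2 genotypes (rows = individuals)."""
    n = len(genotypes)
    m = len(genotypes[0])
    # Allele frequency per marker, estimated from the sample itself.
    freqs = [sum(row[j] for row in genotypes) / (2 * n) for j in range(m)]
    # Scaling factor that puts G on a scale comparable to the pedigree matrix A.
    scale = 2 * sum(p * (1 - p) for p in freqs)
    # Center each genotype by its expected value 2p.
    z = [[row[j] - 2 * freqs[j] for j in range(m)] for row in genotypes]
    return [
        [sum(a * b for a, b in zip(z[i], z[k])) / scale for k in range(n)]
        for i in range(n)
    ]


# Made-up genotypes: three individuals by three markers.
genotypes = [[0, 1, 2], [1, 1, 0], [2, 1, 1]]
g_matrix = vanraden_g(genotypes)
for row in g_matrix:
    print([round(value, 3) for value in row])
```

Because the frequencies come from the same sample, every row of G sums to zero here; scaling G against A, as covered in the course, matters when base-population frequencies differ.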

Day 4: Bayesian shrinkage models; Bayesian mixture/variable selection models
There was a theoretical explanation of the prior and posterior distributions of the parameters used in the modeling. We practiced exercises on the mixture model approach and compared different modeling approaches for GWAS and genomic prediction (LASSO, Bayes A, Bayes B) using the BGLR and Rdmu packages. This has motivated me to read more on Bayesian statistics.

Day 5: Relationships in data; genomic feature models; usual SNP QC
One of the fascinating lessons of the day for me was the genomic feature model, using either GBLUP models or the Bayesian approach. You can either use a GBLUP model, building G-matrices for SNPs from one chromosome versus the other chromosomes, or a Bayesian model that directly models 19 different variances for the SNPs in each chromosome.

General comments

Olumide Alabi

  • The lessons of the summer course will be very useful for me in the immediate term, as I will hopefully participate fully alongside Marnin and Uche in the NEXTGEN GS Cycle 3 genomic predictions of the IITA GS program.
  • The attendance of this course has filled the gap pointed out to me during my Comprehensive exam by the panel: “Assuming all the support and the associated institutions in my program are not there, how will I cope to implement GS on my own in terms of the predictions, marker system management…”.
  • Although I cannot claim 100% understanding of all the theories and exercises at once, the interactive nature of the course was of immense help to my comprehension of what I could apply in my current research and future endeavours.
  • The concepts learnt in the course will help me in detailing some of the background concepts of several approaches in my final thesis and publication efforts.
  • The value of meeting several new people, exchanging research efforts, and the adventure of getting around parts of Aarhus city in the evenings after class cannot be overemphasized. Although the course was titled “summer course,” it was cold throughout, and we experienced very long days with only about 4 hours of darkness at night.

I acknowledge the NEXTGEN program management for the capacity-building investment by giving us the opportunity of attending courses that are of relevance to our present research efforts and preparing us for future research endeavours.

Ugandan and Nigerian Scientists Attend CIAT Training in Cali, Colombia

Ugandan and Nigerian scientists attend a training at CIAT in Cali, Colombia

In the framework of the RTB-ENDURE project and in collaboration with the NEXTGEN Cassava Project, Ugandan and Nigerian colleagues from the Ugandan National Agricultural Research Organization, IITA, and IIRR are now in Colombia to attend a training organized by CIAT for strengthening the capacities to assess the postharvest physiological deterioration of cassava and the feasibility of adopting technologies for extending the shelf-life of the roots in Uganda.