Statistical Analyses Trapped in Time – And It Could Really Cost You

Years ago, when I was conducting yield trials as a plant breeder, drought seemed imminent a few months after planting. Rather than lose the entire trial, I requested sprinkler irrigation at the research station. Unfortunately, water pressure to the far end of the station wasn’t ideal, and one of the pipe connections didn’t fit properly. I could see the tragedy unfolding: uneven irrigation perpendicular to the blocks in the randomized complete block design (RCBD) and thus a high coefficient of variation (CV).

During the season, anyone could see an obvious gradient along the plots. The plants on the left side of the trial were shorter, and the plants on the right were taller and greener. This was a statistician’s worst nightmare. Due to the conditions and variability, I thought the trial might still be a loss.

When I finally analyzed the trial, the CV was indeed high. However, around that time, I happened to come across a small software program that analyzed trials with a spatial algorithm to account for trends in the field. I wondered if it would work for me. Much to my surprise, it estimated the trends and even pinpointed the exact spot where the pipes didn’t connect and water gushed out for a few hours. The CV went way down, and heritability went way up.

But what shocked me most were the adjustments in the means for the varieties, now estimated in the absence of confounding trends – in this case, from uneven irrigation. The RCBD analysis just gives the arithmetic mean of each entry, whereas the spatial analysis estimates means in the absence of trends (which are estimated iteratively from the plot data alone). Among the top five as ranked by the RCBD analysis, three were actually low yielding. Furthermore, some of those ranking lower in the trial were actually among the top yielding when estimated by this spatial data analysis. It was a night-and-day difference.

Which analysis would I believe? What if I chose the wrong varieties because they were incorrectly ranked? This had massive implications for my breeding program. Years later, and after further research co-published in Crop Science and American Statistician, I remain firmly convinced that nearest neighbor analysis is a simple but powerful approach in spatial analysis. It is called “nearest neighbor” because the nearest plots (the ones immediately to the left and right of a central plot) are used in the algorithm in sequence.
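
For readers who want to see the idea in code, here is a minimal Python sketch of a Papadakis-style nearest neighbor adjustment. It is an illustration under stated assumptions, not the algorithm used in any particular software package: the function names, the data layout (each field row as a list of variety and yield pairs in plot order) and the number of iterations are all hypothetical choices. The trend around each plot is approximated by the average residual of its immediate left and right neighbors, and the variety means are then corrected with a pooled covariance slope.

```python
from collections import defaultdict


def group_means(labels, values):
    """Arithmetic mean of values for each label."""
    totals, counts = defaultdict(float), defaultdict(int)
    for label, value in zip(labels, values):
        totals[label] += value
        counts[label] += 1
    return {label: totals[label] / counts[label] for label in totals}


def neighbor_covariate(residuals):
    """Mean residual of the immediate left and right plots in one row."""
    covariate = []
    for i in range(len(residuals)):
        near = residuals[max(i - 1, 0):i] + residuals[i + 1:i + 2]
        covariate.append(sum(near) / len(near) if near else 0.0)
    return covariate


def nn_adjusted_means(rows, iterations=3):
    """Trend-adjusted variety means (illustrative sketch, assumes non-empty rows).

    rows: list of field rows, each a list of (variety, yield) in plot order.
    """
    varieties = [v for row in rows for v, _ in row]
    yields = [y for row in rows for _, y in row]
    raw_means = group_means(varieties, yields)  # plain arithmetic means
    means = dict(raw_means)
    for _ in range(iterations):
        # Neighbor covariate from residuals around the current variety means.
        x = []
        for row in rows:
            x.extend(neighbor_covariate([y - means[v] for v, y in row]))
        x_means = group_means(varieties, x)
        x_grand = sum(x) / len(x)
        # Pooled within-variety slope of yield on the neighbor covariate.
        sxy = sum((xi - x_means[v]) * (yi - raw_means[v])
                  for v, xi, yi in zip(varieties, x, yields))
        sxx = sum((xi - x_means[v]) ** 2 for v, xi in zip(varieties, x))
        slope = sxy / sxx if sxx > 0 else 0.0
        # Shift each variety mean by the trend its plots happened to sit on.
        means = {v: raw_means[v] - slope * (x_means[v] - x_grand)
                 for v in raw_means}
    return means


# Made-up example data: three varieties in three rows of plots.
trial = [
    [("A", 10.0), ("B", 12.0), ("C", 14.0)],
    [("A", 10.0), ("C", 12.0), ("B", 14.0)],
    [("B", 10.0), ("A", 12.0), ("C", 14.0)],
]
print(nn_adjusted_means(trial))
```

When the field shows no trend at all, the neighbor covariate is zero everywhere and the sketch returns the ordinary arithmetic means. A trial this small is only meant to show the mechanics; real trials have many more plots per row, which is where neighbor residuals start to carry useful information about the field.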

In the years since my discovery, I’ve had the opportunity to reanalyze yield trial data for a number of seed companies, and the spatial analysis often proved superior, giving greater confidence in selecting the best hybrids and varieties. In fact, one client reanalyzed 25 years of yield data, comparing the spatial analysis to the RCBD and incomplete block analyses, and came to different conclusions through the spatial analysis, which was decidedly superior.

This meant somewhat different decisions as to the release of final varieties, which are million-dollar decisions for seed companies. But still, all too many companies rely on the RCBD analysis developed by Sir Ronald Fisher in the 1920s, almost as if statistical time stood still.

(First published May 2017)

Discover more of the statistical capabilities of AGROBASE Generation II®

Dr. Dieter Mulitze is Founder and CEO of Agronomix Software – data management and analysis software for plant breeders and variety testers. 

Overcoming a Mental Roadblock for Using Plant Breeding Software

Are you seriously contemplating using plant breeding software for your research program?

Pushing Spreadsheets with your PhD?

Do you know how much time you, or plant breeders within your company, spend looking at spreadsheets? It might surprise you. I’ve recently had conversations with leaders at one large seed company who estimate breeders there spend about 20 percent of their time looking at spreadsheets.

Who is the Most Important Person in Your Company?

Have you ever met a CEO or company president who feels he or she is of celebrity status? Unfortunately, it’s not uncommon these days. We live in a celebrity-centric world, where emphasis is put on what and how many events you attend, how many hands you shake, and the number of followers you have accumulated on social media.

Plant Breeders Aren’t Robots, They’re Human

Each year, I spend several weeks analyzing data, navigating back and forth between Excel sheets and plugging in data. I have about a two-week window when this is all I do. I stop only to sleep and eat … and my family, well, they don’t see much of me during that time of year. That …

Are You Losing 2.5 Weeks of Work?

We all know that plant breeding is a numbers game. Twenty years ago, a plant breeder would make as many crosses as possible, from hundreds to even thousands. The more crosses were made, the better the odds of success. While that notion still holds true, breeders are limited or empowered by their ability to work with these …

How a Three-Minute Interview and Flight 155 Became Pivotal Career Moments

Dieter Mulitze PhD, Founder and CEO, Agronomix

I am often asked how I ended up starting a software company for plant breeding. Honestly, it was never my plan even for a second, and looking back I can identify the critical moments. When we do such reflection, we can be surprised by what we see, and better still, learn to identify such pivotal moments or events for our future. Here’s my story, even if it seems somewhat unbelievable.

I grew up on a dairy farm in eastern Ontario, so “agriculture was in my DNA.” My university career started at the University of Waterloo in mathematics and computer science, but a year later I switched to the University of Guelph into Crop Science. Back to my roots, you might say. After my second year I again needed a summer job and decided to visit the professors in Crop Science to see if there were any summer positions.
