Introduction

 

The Bionutrient Institute (BI), formerly the Real Food Campaign (RFC), emerged from a collaboration between the Bionutrient Food Association, Next 7, and Our Sci LLC in 2018 with four primary goals:

  1. Determine the amount of variation in nutrition in the food supply.

  2. Relate soil health and nutrient density outcomes to crop and soil management practices.

  3. Predict nutritional parameters in produce using spectral data and metadata.

  4. Build a public library of crop nutrition and soil and crop management data.


Over the past several years an additional goal has emerged: establishing a practical, empirical, and measurement-based suite of nutrient density measurements with clear and strong connections to human health. Nutrient density is most commonly defined as the level of nutrients per unit calorie. This definition is useful when comparing two different foods. For example, kale has on average a relatively high level of nutrients but a low level of calories, while potatoes have more calories per unit nutrient. Therefore, kale would have a higher nutrient density score than potatoes. However, this definition assumes that all kale or all potatoes are relatively nutritionally uniform.

The broader food movement is beginning to explore a new understanding of nutrient density. When we speak of increasing nutrient density in foods, we aren't trying to categorize some types of food as more nutritious than other types. Rather, we are attempting to identify the factors that produce the most nutritious food in every category. How nutritious is one bunch of kale in relation to another, or one sack of potatoes to another? How can producers increase the nutrient density of those foods?

In its first two years of operation, the BI (RFC) Food and Soil Survey showed that there was significant variation (up to 200:1) in antioxidants, polyphenols and minerals in carrots, spinach, kale, lettuce, grapes and cherry tomatoes (2018 and 2019 Final Reports). However, the BI needed to capture more detailed crop management and variety data to delve deeper into these relationships in non-grain crops. This meant that the 2020 survey required a shift away from sampling in grocery stores and farmers markets toward working directly with farmers to better capture detailed management data.


In 2020, the BI sought to expand the reach and potential impact of its annual Food and Soil Survey.


Significant changes to the program in 2020 included: 

  • Expanding the number of crops included in the survey from 6 in 2019 to 20 crops

  • Including crops that 1) are grown on significantly more acreage than the specialty crops first included and 2) make up a larger percentage of people's diets.

  • Increasing the total number of samples from 2000 in 2019 to 4000 samples.

  • Increasing the number of labs from one to three.

  • Collecting more detailed metadata on management and variety for each crop by increasing the percentage of samples sourced directly from farms. This included increasing the number of direct grower partners from 30 in 2019 to 150 in 2020.

  • Better balancing the sourcing of samples from different climate regions and soil types across the US.

  • Engaging directly with more farm organizations and food supply chain companies.

 

Primary Outcomes

 

Expanded the operations and global footprint of the Bionutrient Institute:

  1. We worked with partners at California State University, Chico and Valorex in France to open up two additional Bionutrient Institute labs.

  2. Increased the suite of crops analyzed by the Bionutrient Institute from six crops in 2019 to 20 crops in 2020. In addition to increasing the number of crops, we added wheat and oats, which are grown on significantly more acreage and comprise a larger portion of people’s diets than the other fruit and vegetable crops in the program. 

Developed stronger relationships with numerous stakeholders

Partner organizations that the BI engaged with in 2020 included:


Developed and deployed beta models to predict nutrients and soil carbon
We developed and deployed models to predict antioxidants, polyphenols or Bionutrient Quality Index (a quality index developed by the BI) in 12 crops. These models were deployed in the bionutrient.surveystack.io app so that community members who purchased a Bionutrient Meter can test predictions on crops from their farms or grocery stores. We have further developed a validation program whereby these individuals can submit the same samples to the BI lab where they can be tested and compared to the model outputs.

 

Conducted a series of projects within the BI Framework
In addition to the primary report included here, numerous partners used BI lab processes to ask their own research questions and generate their own, more specific, project reports. Short descriptions of these trials and links to those reports are available below. In other cases, partners conducting long-term research trials on different regenerative practices included BI lab processes in their 2020 activities. These participants submitted detailed management data to the BI and their data and lab results are integrated into the 2020 final report. These projects will also be analyzed and reported on separately by each respective research partner. 


Trials with separate reports:

SMALL GRAINS REPORT

We collaborated with Pipeline Foods to produce a project funded by Bank of America. This project expanded the BI crop suite to include wheat and oats. Within the project, we received 298 wheat and 372 oat samples from 45 farmers across 13 states and from General Mills. We found evidence that regenerative practices, especially no-till, correlated with higher soil carbon and nutrient density outcomes in grain. Furthermore, we were able to develop good prediction models of soil carbon and nutritional outcomes in whole wheat and oats.

FOOD DESERT REPORT

We engaged with citizen science partners in six cities across the USA to evaluate the nutrient density of produce in stores in and out of food deserts and by grocery store class. More specifically, we evaluated nutrient density in lower-priced grocery stores compared to higher-end stores like local boutiques and Whole Foods to see how economic status may affect access to nutrient dense food. Variability in nutrients was greater between cities than between store types, suggesting that location is more important than whether a sample was purchased from a high- or low-end grocery store as it relates to identifying the most nutritious foods.

HYDRO TOMATOES REPORT 

With support from Next7, we developed a targeted sampling plan to compare hydroponically grown and soil-grown organic tomatoes. While this study covered only a single location (Boulder, CO), it demonstrates how partners could layer their own experiments into the broader framework of the BI. In the study, soil-grown tomatoes had higher mineral and polyphenol content than hydroponically grown tomatoes, which is surprising given that hydroponic systems provide minerals directly to the plant.

BLUEBERRY VARIETY REPORT

We worked with 5 blueberry farmers in southwest Michigan to compare the nutrient density profiles of 3 late season blueberry varieties. All of the blueberries were grown under management and in the same climate region and soil type. Blueberry variety did not have a significant impact on antioxidant or polyphenol content, but did affect Brix readings and Ca and K content.


Trials without reports:

 

SUSTAINABLE POTATO TRIALS

We engaged with researchers from six land grant and research universities across the US to evaluate the impacts of different potato varieties and regenerative management practices on potato nutrition. 

REGENERATIVE AGRICULTURAL TRIALS

The BI lab at California State University, Chico analyzed 424 samples from three separate regenerative research projects being conducted on their campus:

  • Comparisons of organic tillage vs no-till treatment in peppers supported by their organic vegetable project (OVP)

  • A stacked BEAM compost/no BEAM comparison on top of the tillage/no-till comparison with late fall produce (mizuna, mustard, kale and lettuce) funded by a Conservation Innovation Grant (CIG) from NRCS.

  • Comparing long-term (15+ years) butternut squash tillage and no-till production system trials.

 
 

Lab Methods

 
 

The BI Lab is designed to be a high throughput, low-cost lab, allowing the BI to get a snapshot of the nutrient density of thousands of food samples every year. To meet this goal, the BI needs to identify and perform lab tests which are inexpensive (low capital cost, low ongoing cost), capture the broadest perspective on quality (classes of compounds rather than individual compounds), and tend to correlate with easier-to-measure parameters like spectral reflectance.

Grower and Citizen Science Partners submitted produce and soil samples to the BI lab and completed full surveys for each sample in the field, which included (depending on sample type) store information, farm information, detailed management data, as well as visual and taste evaluations. This information was collected prior to samples being received in the lab. Full details about sample metadata collection are available in the Sample and Data Collection section below. 

Upon arrival at the BI lab, samples were processed and analyzed for the soil and crop parameters in Table 1. Full descriptions of all lab methods are available here.

 

Table 1. Summary of measurements for all produce and soil samples received directly from farmers.

 

 

Statistical Methods

The BI’s Food and Soil Survey is a large observational study that depends on individuals to submit samples and sample metadata to the lab. As such, the data are often not normally distributed, and normality is a requirement for many common statistical approaches. Therefore, the BI uses a combination of traditional approaches (regression, ANOVA, etc.) and more novel, non-parametric methods to identify relationships between farm practices, environmental parameters, crop variety, and nutrient density outcomes.

The non-parametric methods used in this study test the values in an observational group in relation to a reference set. For example, when examining the effect of tillage intensity on outcomes, we used “no-till” as the reference category and compared how “light” and “heavy” tillage shifted the median value in relation to no-till. We consider a median shift to be significant if the p-value < 0.1, or if the 90% confidence interval did not cross 0 (the reference value). These thresholds of confidence are lower than p < 0.05, which is the standard for traditional academic research, because that level of confidence is often not feasible for on-farm research in real world settings. Furthermore, our goal is to discover trends and patterns within the dataset that are strong enough to warrant further testing, and setting overly stringent parameters would reject many potentially interesting trends. 
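As a rough illustration of the median-shift idea, the sketch below computes the percent shift of a group median relative to a reference median and a 90% interval around it. The report does not say how its intervals were computed; the percentile bootstrap used here is an assumption, as are the function names and the made-up calcium values.

```python
import random
import statistics


def median_shift_pct(group, reference):
    """Percent shift of the group median relative to the reference median."""
    ref_median = statistics.median(reference)
    return 100.0 * (statistics.median(group) - ref_median) / ref_median


def bootstrap_ci(group, reference, n_boot=2000, level=0.90, seed=0):
    """Percentile bootstrap interval for the percent median shift (assumed method)."""
    rng = random.Random(seed)
    shifts = []
    for _ in range(n_boot):
        # Resample each set with replacement, keeping its original size.
        g = rng.choices(group, k=len(group))
        r = rng.choices(reference, k=len(reference))
        shifts.append(median_shift_pct(g, r))
    shifts.sort()
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    return shifts[lo_idx], shifts[hi_idx]


def is_significant(ci):
    """Flag the shift when the interval excludes 0, the reference value."""
    low, high = ci
    return low > 0 or high < 0


# Made-up Ca values (mg per 100 g fresh weight), for illustration only.
no_till = [310, 295, 330, 305, 320, 315, 300, 325, 312, 308]
heavy_till = [220, 235, 210, 240, 225, 230, 215, 228, 222, 233]

shift = median_shift_pct(heavy_till, no_till)
ci = bootstrap_ci(heavy_till, no_till)
print(round(shift, 1), ci, is_significant(ci))
```

With real survey data the groups overlap far more than in this toy example, so the interval will often cross 0 and the shift would not be flagged.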

Models to predict nutrient density outcomes from sample metadata, spectral data, and soil data were developed using multiple linear regression analysis and Random Forest, an ensemble machine learning method. To test Random Forest prediction models, we used k-fold cross-validation to guard against overfitting.
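The k-fold procedure can be sketched in a few lines. To keep the example dependency-free, a one-variable least-squares line stands in for the Random Forest model; that substitution, the synthetic data, and all function names are assumptions made for illustration and are not the BI's pipeline.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (stand-in for any model)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def cross_validate(xs, ys, k=5):
    """Return the mean squared error measured on each held-out fold."""
    errors = []
    for held_out in k_fold_indices(len(xs), k):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        # Fit on the training folds only, then score on the unseen fold.
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        mse = sum((ys[i] - (a + b * xs[i])) ** 2 for i in held_out) / len(held_out)
        errors.append(mse)
    return errors


# Synthetic data: a hypothetical spectral reading x and a nutrient value y.
rng = random.Random(1)
xs = [i / 10 for i in range(50)]
ys = [2.0 + 3.0 * x + rng.gauss(0, 0.5) for x in xs]

fold_mse = cross_validate(xs, ys, k=5)
print(sum(fold_mse) / len(fold_mse))
```

Because every sample is scored only by a model that never saw it during fitting, the averaged fold error is a more honest estimate of predictive performance than the error on the training data.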

 

Sample and Data Collection

 

Partner Programs

In the 2020 Grower Partner program we were able to onboard 119 farmers in the US, 78 (65.5%) of whom successfully submitted samples to the BI lab. An additional 83 farmers submitted samples and sample metadata to the French lab. In total, we successfully increased the number of Grower Partners to 161 across the USA and Europe in 2020, up from 30 farm partners in 2019 (Fig. 1). 

This increase in participation has several causes. In 2019, the most frequently mentioned constraint to participating in Bionutrient Institute programs was the requirement to have an Android phone in order to submit management and sample metadata. In 2020, our technology partner Our Sci LLC released a new data collection platform (surveystack.io), a progressive web app which is fully cross-platform (Android, iOS and computers). Additionally, we reached out to farmer organizations across the country who helped market the program to their membership. These groups included Pasa Sustainable Agriculture, Virginia Association of Biological Farming (VABF), Organic Association of Kentucky (OAK), NOFA-Mass, Pipeline Foods and the BFA membership.

Grower Partners in the US and France were expected to complete 3 forms for every crop that they submitted:

  • Planting Form captured key management data up to planting. This included whether cover crops had been used up to 12 months prior to planting, how the seed bed was prepared for planting (tillage, broadforking, no-till, etc), whether a crop was transplanted from a greenhouse, and what amendments, if any, were added at or before planting.

  • In Season Management Form captured key management activities post-planting. This included irrigation, weed and pest control and what fertilizers or other amendments were added.

  • Sample Collection Form captured the sample date, crop variety and sample number to track the sample from the field to the lab.

We also collaborated with 31 Citizen Science Partners (formerly called data partners). Of these, 25 regularly contributed samples from stores, farmers markets and gardens in their area. Another 6 partners participated in the Food Desert experiment [LINK FOOD DESERT REPORT]. This program provides a template for how the Bionutrient Institute and Citizen Science Partners can help advance the research surrounding nutrient density. In this project, specific questions about the effects of food deserts and socio-economic status on nutrient density were asked, then the community helped to both provide samples and data to answer those questions and added that data to the greater BI library of nutritional outcomes. 

Citizen Science partners were asked to complete one of the following forms:

  • Sample Collection Form captured summarized data about crop labels and management practices for samples sourced from grocery stores, farmers markets and gardens, along with the crop variety and the sample number used to track the sample from the store, market or field to the lab.

  • Citizen Science Food Desert Experiment Form captured sample and associated metadata specific to the Food Desert Experiment.

Figure 1. Map of the location of each Grower Partner (Green) and Citizen Science Partner (Orange) in 2020.


Sample Summary

A complete breakdown of how many samples the BI labs received for each crop is presented in Table 2. Between all three labs, the BI received 3,851 samples, and 3,662 (95%) of those samples were fit to analyze. Of all samples received, 2,260 (58.7%) arrived directly from farmers and 2,040 had soil samples as well as produce samples. Of the 2,260 samples arriving directly from farmers, 1,755 (77.7%) had at least a planting form submitted capturing key management data. These samples had the most granular management data associated with them. Another way BI volunteers captured management data was through a one-time interview with the producer at the time of collection. This interview allowed the BI to gather top-level management data such as whether the grower practiced no-till, used cover crops or added compost to their crop. We received this level of management detail for 370 samples (10%). An additional 1,242 samples had limited sample metadata. This means that either they came from a store and the only information came from the label (certified organic, greenhouse grown, etc.), or an interview with the producer was not possible, so the sample had only limited metadata.

Between direct Grower Partners who collected granular metadata, Grower Partners aggregating samples from multiple farms, and Citizen Science partners collecting samples and interviewing growers, we received samples from 129 farms in the US and our partner lab in France received samples from an additional 83 farmers. In the US the average Grower Partner submitted 17.3 samples and French partners submitted 3.85 samples per partner.

Table 2. Summary of samples by crop and the level of detail of the associated metadata. 

Challenges and Changes in 2020 Data

Like everyone, the Bionutrient Institute had to adjust to Covid-19 restrictions and take extra precautions to keep project staff safe. For the BI labs, this meant reducing the number of staff present in the lab at one time and implementing social distancing policies. To still approach our goal of analyzing 4,000 samples, we asked our staff to work staggered shifts ranging from 6 am to 10 pm. An additional Covid-19 related challenge was a severe slowdown in USPS delivery times in the summer and early fall of 2020. The slower delivery time increased the time samples were en route and led to increased sample loss, especially for leafy green samples. Therefore, we suspended leafy green sampling during the summer and did not resume accepting leafy green samples until the fall, when temperatures cooled and the shipping time started to return to normal.

An additional challenge in 2020 was a significant scale-up of high quality data captured by the Food and Soil Survey. As previously mentioned, the BI labs not only received almost twice as many samples as in 2019, but a much larger percentage of those samples also came directly from producers, with much more metadata associated with them. This increase in the amount of total data collected stressed our existing data pipelines and complicated the task of analyzing the data and identifying trends. Therefore, the BI hired a specialist whose primary responsibilities include: 1) managing and improving the BI’s automated data pipeline, 2) analyzing the dataset to identify trends and 3) building visualization tools to share results with the community.

 

Results

 

About This Section

In previous years, the results section of the year-end report included a detailed analysis of the key outcomes and helped the BI to identify areas to improve upon in the next year's Food and Soil Survey. However, as we increased the scale of our data and metadata collection in 2020, that same report structure is no longer feasible. For example, in years past we could easily share links to dozens of boxplots that allowed the community to examine sources of variability on their own. Due to the increase in the number of crops analyzed in 2020, we would need to share hundreds of boxplots to display the same sources of variability. Likewise, whereas in previous years we provided a detailed analysis of the full dataset, it would not be possible to complete analysis of the 2020 data in the same manner--there is simply too much data on too many crops and management practices to complete such an analysis in a timely manner. Over the next year we will work with research partners to identify key relationships and data gaps and develop manuscripts for publication in peer-reviewed science journals.

In this section, then, we will provide a broad overview of key research outcomes instead of a complete analysis of all the results and key findings. We will highlight 1-2 examples of key lessons learned in each section. For example, we will examine Calcium content in kale samples to investigate the sources of variation. We will also examine the impact of tillage intensity on soil carbon and respiration to explore the relationships between management practices and soil health parameters. These examples illustrate how the BI library can be used to answer questions that are important to the community.

Finally, throughout this section, we will also illustrate how any member of the community can complete the same investigations (or their own investigations) using the BI’s interactive Data Explorer Dashboard.

Data Explorer Dashboard

All of the 2020 data is now available in an interactive Data Explorer Dashboard and we are hoping to also add the 2018 and 2019 data within the next 6 months to a year. The data explorer allows any member of the public to compare results for any of the crops sampled by the BI using a selection of preset filters (climate region, store label, etc) and the ability to create custom filters. For example, a user could set the filters to compare the antioxidant content of potatoes grown under certified organic, no-till and biodynamic systems in just the northeast part of the country. Additionally, producers who submitted samples from their farms can view and download their data and compare their results to the rest of the BI community using the same set of preset and custom filter tools. This tool will allow the BI to streamline the process of returning results to producers, cutting down on the time between sample submission and receiving results and will provide producers with the context they need to interpret the complex data collected by the BI. Throughout the remainder of this report, we will present figures commonly used by scientists to visualize data alongside the visualizations of the same trends using the data explorer dashboard. 

Description of Variation

In 2018 and 2019, a key outcome from the Food and Soil Survey was that significant variation exists in nutrient and mineral content in the crops measured by the BI lab. The next step, then, is to try to understand how that variation can affect consumers in more practical terms. As a first step, Figure 2 presents the mean, median, and min-to-max range of observations for 8 minerals measured by the BI lab (Ca, Cu, Fe, K, Mg, Mn, P, and Zn). Instead of being reported as concentrations, these values have been converted to “percent of Recommended Daily Allowances,” based on National Institutes of Health fact sheets for each mineral. Additionally, we have converted the USDA-reported mineral content of all 8 minerals, as published in the USDA FoodData Central Database, into “% RDA per 100g fresh weight.” All of the ranges presented in Figure 2 are α-trimmed, meaning we removed the top and bottom 5% of observations to eliminate extreme outliers and present a range that is more representative of what is present in the food supply.
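The trimming step can be sketched as follows. The function name `trimmed_range` is hypothetical, and rounding the number of dropped observations down when 5% of the sample count is not a whole number is an assumption; the report does not state how that case was handled.

```python
import math


def trimmed_range(values, trim=0.05):
    """Return (min, max) after dropping the top and bottom `trim` share of values."""
    ordered = sorted(values)
    # Assumption: round the number of dropped observations down.
    cut = math.floor(len(ordered) * trim)
    kept = ordered[cut:len(ordered) - cut]
    return kept[0], kept[-1]


# Illustrative data: 100 evenly spaced observations from 1 to 100.
observations = list(range(1, 101))
print(trimmed_range(observations))
```

For small crops the trim removes very few points (none at all below 20 observations under the rounding rule assumed here), so extreme values can still set the reported range.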

This means that instead of reporting the range of calcium observations in kale as 150-400 mg Ca per 100 g fresh weight, which means very little to a consumer, we report that the Ca content of kale in our study ranged from 15-40% of the recommended daily allowance of Ca (according to the NIH fact sheet, the average adult requires 1000 mg of Ca per day). This range is significant because it means that consumers eating low quality kale (in this case defined as kale with low Ca content) would need to eat 2.67 times more fresh kale to receive the same amount of Ca present in high quality kale. Not only do we see a significant variation in mineral content for many of the crops measured, but the BI average and the USDA average are often quite close to each other and within the detected range of observations. This is an important validation that the methods used by the BI labs are comparable to those of the USDA in determining mineral concentration.
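The kale arithmetic above can be reproduced directly. The sketch uses only the figures quoted in the text (150 and 400 mg Ca per 100 g, and a 1000 mg daily allowance); the function and variable names are illustrative.

```python
def percent_rda(mg_per_100g, rda_mg):
    """Express a mineral concentration as a percent of the daily allowance."""
    return 100.0 * mg_per_100g / rda_mg


CA_RDA_MG = 1000          # adult Ca allowance quoted in the text
low_ca, high_ca = 150, 400  # mg Ca per 100 g fresh kale, from the text

low_pct = percent_rda(low_ca, CA_RDA_MG)
high_pct = percent_rda(high_ca, CA_RDA_MG)

# How much more low-Ca kale is needed to match the Ca in high-Ca kale.
ratio = high_ca / low_ca

print(low_pct, high_pct, round(ratio, 2))
```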

Figure 3 presents the range plots of antioxidants and polyphenols for all crops included in the Food and Soil Survey. Unlike the minerals presented in Figure 2, there are no standard recommended daily allowances or USDA averages for antioxidants and polyphenols. Similar to Figure 2, however, there are large variations in nutrient concentration. Of particular interest is the large variation in antioxidant content in blueberries. Blueberries are often considered a superfood, largely because of their high antioxidant content. It is true that the average antioxidant content of blueberries is higher than that of any of the other crops in the Food and Soil Survey. However, low quality blueberries may not have any more antioxidants than the average oat sample.

 Figure 2. Range, median, mean and USDA average values for 8 minerals measured by the BI labs by crop. The range equals the area from the minimum to maximum value after the top and bottom 5% of observations have been trimmed to remove outliers.

 

Figure 3. Range, median and mean values for antioxidants and polyphenols by crop. The range equals the area from the minimum to maximum value after the top and bottom 5% of observations have been trimmed to remove outliers. There are no recommended daily allowances or USDA averages for antioxidants and polyphenols.


Sources of Variation

In 2018 and 2019, the BI identified climate region, soil type, crop variety and farm management practice as sources of variation for which we could collect meaningful sample metadata. An additional goal of the 2020 season, based on previous seasons' results, was to better balance the samples coming from different regions, soil types and farm practices so that we could better understand these individual effects. Figures 4a and 4b display the box and whisker plots for Ca in kale by climate region and crop variety. Looking at the number of observations for each climate region, we can see that we did balance the samples much better than in previous years, with many regions having a similar number of samples. We can also see that the climate regions with the most observations show a large range in Ca content. These results suggest that while climate regions may affect variation, they are not the driving force. Kale variety (Fig. 4b) does seem to impact variation, but many of the varieties have few observations, making it difficult to separate out variety from other potential sources of variation, such as management. Figures 4c and 4d display the same parameter (Ca in kale) by climate region and crop variety using the Data Explorer Dashboard.

Figure 4. Boxplots of Ca in kale by climate region (A) and crop variety (B) and screenshots of variability plots generated using the Data Explorer Dashboard for Ca in kale by climate region (C) and crop variety (D).

In both the kale variety boxplot (Fig 4b) and variability plot for crop variety from the Data Explorer (Fig. 4d) it is clear that Lacinato kale, a common Kale variety, exhibits a large range of Ca values. Those values range from well below the USDA average to well above the average (Fig 5). This wide range within a single crop variety suggests that other factors beyond variety itself are driving those variations.

Figure 5. Histogram of Ca observations in Lacinato kale.

 

Figure 6. Median shifts of Ca content in kale 1) managed using heavy tillage relative to no-till (top left) and 2) grown using different no-till land preparation methods relative to tilled kale samples (top right). Green highlighting represents a positive effect and red a negative effect. The shading of the bar represents the statistical confidence in the effect: dark shading is statistically significant (p < 0.1), medium shading is not statistically significant but may warrant further examination, and light shading represents very low confidence. The number is the percent shift from the reference.

 

The next step in exploring the sources of variation in Ca content in kale was to use the same non-parametric analyses used in 2019 and in the 2020 Grains Report to further investigate whether farm practices are affecting Ca content in kale. As previously mentioned, the observational and non-standardized nature of this study makes using traditional statistical methods nearly impossible. Therefore, we use non-parametric analysis to test the “median shift” (percentage change) of individual nutrients grown under different practices compared to a reference set of samples. Results are split into three categories based on our level of confidence. Dark shading (green or red) indicates that the observed effect is statistically significant and therefore more likely to be real (p < 0.1). Medium shading indicates that the observed effect was not quite statistically significant (0.1 < p < 0.5) but was close enough to warrant follow-up investigation to prove or disprove that the effect is real. Finally, light shading effects are just as likely to be caused by random chance as by farm practice and therefore should not be investigated. The color represents the direction of the effect: green is positive and red is negative. The number is the percent change.
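The three confidence categories can be written as a small lookup. The thresholds 0.1 and 0.5 come from the text; how values exactly on a boundary are assigned, the treatment of a zero shift, and the function names are assumptions made for this sketch.

```python
def confidence_tier(p_value):
    """Map a p-value to the shading category described in the text."""
    if p_value < 0.1:
        return "dark"      # statistically significant
    if p_value < 0.5:
        return "medium"    # worth follow-up investigation
    return "light"         # very low confidence


def describe(shift_pct, p_value):
    """Combine shading, color (direction) and percent shift into a label."""
    # Assumption: a shift of exactly 0 is labeled red, since it is not positive.
    color = "green" if shift_pct > 0 else "red"
    return f"{confidence_tier(p_value)} {color} ({shift_pct:+.0f}%)"


# Illustrative inputs only; these are not values from the survey.
print(describe(-30, 0.04))
print(describe(16, 0.30))
print(describe(4, 0.80))
```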

By investigating median shifts, we can explore the impacts of specific management practices more effectively. For example, kale grown with “heavy tillage” (defined as tillage 6 or more inches deep) had Ca content that was nearly 30% lower than kale produced using no-till management. Likewise, comparing different no-till practices to tillage, kale grown using solarization or broadforking had 16-20% higher Ca and using sheet mulching had 32% higher Ca than tilled kale samples. In this example, all of the effects listed are statistically significant.

Linking farm practices, soil health, and nutrient density outcomes

Within the regenerative agriculture community there is a belief that “regenerative practices” will improve soil health, which will in turn improve the nutrient density of the crops grown in that soil (https://regenerationinternational.org/why-regenerative-agriculture/). However, given the number of factors that influence nutrient density--climate, soil type, crop variety, management practices, etc--it is difficult to test this hypothesis. By conducting the annual Food and Soil Survey and capturing detailed metadata on these factors, we are building the deepest library of this data that is currently available. By examining this library we can then identify trends and understand where to look next.

To set up this examination, we started 2020 with the goal of dramatically scaling up on-farm data collection. We accomplished this by increasing the number of samples directly from farms from 813 samples (40% of all samples) in 2019 to 2260 (58.7% of all samples) in 2020 (Fig. 7). Additionally, the BI lab received 1,566 and 1,730 soil samples respectively from 0-10 cm and 10-20 cm soil depths, associated with 2,040 different produce samples (53% of all samples). This allows us to evaluate the impact of farm practice on soil properties measured by the BI labs (soil carbon and respiration), and those soil properties on nutrient density outcomes.

Figure 7. Sample sources from farms, farmers markets and stores in 2019 and 2020.

Farm practice and nutrient density outcomes

We began this investigation by examining the median shift that different amendments and tillage intensities produced directly on nutrient outcomes (Fig. 8). A close examination of the table reveals that the results are often contradictory, with a given practice positively affecting one crop and negatively affecting another crop. For example, using mulch resulted in 22% higher antioxidant content in beets but 63% lower antioxidant content in carrots. Likewise, the same practice or amendment may have a positive impact on one analyte but a negative impact on another analyte in the same crop. Again using mulch as an example, antioxidant content in zucchini grown with mulch was shifted down 5%, while polyphenol content for zucchini was shifted up 10%. One reason for these dramatic shifts in effect size and direction is the small sample sizes available. There were 35 beet samples grown with mulch, but only 9 carrot samples grown with mulch. 

Even though we drastically increased the number of samples analyzed in BI labs in 2020, we received between 36 and 480 samples per crop. This is within the same range of samples we received of individual crops in 2018 and 2019. Therefore, we did not gain any statistical power to analyze the effect of farm management on nutrient density outcomes in crop samples. In a few years, after aggregating numerous years' worth of samples and building a larger library of outcomes, these median shift tables may be more informative. At this time, however, we need to identify ways to aggregate samples together to produce larger sample sizes.

Figure 8. The median shifts of different amendments compared to crops grown without amendments (top) and of differing tillage intensities compared to no-till samples (bottom). Heavy tillage is defined as at least one tillage pass meeting or exceeding 6 inches in depth; light tillage is defined as no tillage pass reaching 6 inches in depth. Green highlighting represents a positive effect and red a negative effect. The shading of the bar represents the statistical confidence in the effect: dark shading is statistically significant (p < 0.1), medium shading is not statistically significant but may warrant further examination, and light shading represents very low confidence. The number is the percent shift from the reference.

 

In 2019, we started experimenting with a Bionutrient Quality Index (BQI) value, which was developed from normalized values of a subset of nutrients and minerals measured in the BI lab (Box 1). For example, tomatoes and peppers are each ranked from highest to lowest for a given nutrient (e.g. antioxidants); the highest tomato sample and the highest pepper sample are each given a score of 100, and the lowest of each a score of 0. Once complete, all of the crops have the same range of values, 0-100, and can be compared to each other. Next, BQI combines multiple nutrients and minerals into a single value that we can use to aggregate complex data in order to: 1) take a more holistic look at how nutritious a sample was and 2) combine a number of crops together to increase the statistical power of our comparisons and learn more from the data. In 2019, when the Food and Soil Survey included only six crops, we were able to analyze all six crops together to examine the relationships between farm practice, crop labels and nutrient density outcomes. However, after expanding the Food and Soil Survey to 20 crops, this analysis did not yield useful results (data not shown).
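The exact BQI definition is given in Box 1 and is not reproduced here. As a rough illustration of the normalize-then-combine idea, the sketch below assumes min-max scaling to 0-100 within each crop and an unweighted mean across nutrients. Both choices, along with the function names and sample values, are assumptions for illustration; the actual index may use ranks or different weights.

```python
def scale_0_100(values):
    """Min-max scale values so the lowest maps to 0 and the highest to 100."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All samples identical: no spread to scale, so use the midpoint (arbitrary choice).
        return [50.0 for _ in values]
    return [100.0 * (v - lo) / (hi - lo) for v in values]


def index_scores(samples, nutrients):
    """Return one 0-100 score per sample, scaled within each crop and averaged over nutrients."""
    by_crop = {}
    for i, sample in enumerate(samples):
        by_crop.setdefault(sample["crop"], []).append(i)

    totals = [0.0] * len(samples)
    for idxs in by_crop.values():
        for nutrient in nutrients:
            scaled = scale_0_100([samples[i][nutrient] for i in idxs])
            for i, score in zip(idxs, scaled):
                totals[i] += score
    return [t / len(nutrients) for t in totals]


# Made-up lab values, for illustration only.
samples = [
    {"crop": "tomato", "antioxidants": 100, "polyphenols": 20},
    {"crop": "tomato", "antioxidants": 300, "polyphenols": 60},
    {"crop": "tomato", "antioxidants": 200, "polyphenols": 30},
    {"crop": "pepper", "antioxidants": 1000, "polyphenols": 5},
    {"crop": "pepper", "antioxidants": 3000, "polyphenols": 15},
]

print(index_scores(samples, ["antioxidants", "polyphenols"]))
```

Because each crop is scaled against its own minimum and maximum, the best pepper and the best tomato both score 100 even though their raw values differ by an order of magnitude. That is what allows samples from different crops to be pooled.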

Important Note: The first goal of the Food and Soil Survey is to build a library of outcomes. We will engage with experts to continue to analyze the results from the library. This may mean refining the methods included in this report or changing our methodologies completely to provide the most robust analysis possible of the library.

To refine the BQI approach, we decided to group crops with similar management together. We should not expect that a given farm practice will have the same nutrient density impact on a specialty crop grown on a partial acre as on a grain crop grown on fields ranging from dozens to hundreds of acres in size. Nor should we expect the relationships between tillage and nutrient density outcomes to be similar for perennial crops and tuber/root crops. Therefore, we aggregated wheat and oats as ‘grain’ crops and apple, blueberry and grapes as ‘fruit’ crops with similar perennial management (Table 3). 

Table 3. Grouping different crops to better aggregate crops by management.

Figure 9 displays the effects of farm practice on BQI for each of the five crop types outlined above. Like the median shift tables, we saw inconsistent effects across different crops. For example, cover crops seem to have a positive impact on vegetables and fruit, but negative or not significant effects on grains and tubers. Additionally, with the exception of no-till in grains, the size of the effect is usually very small, less than 5% different from the reference.

Note: The large increase in BQI in grains was, in part, related to a single farmer organization that participated in the BI and had high quality nutrient outcomes. However, we did not have enough samples from across the US to determine if the effects were due to no-till or other climate and management effects. For more details about no-till in grains, see our grains report.

This section has mostly focused on the challenges that present themselves when trying to identify quantitative relationships between farm practices and nutrient density outcomes in specific crops. This is not surprising given the complexity of natural systems, and of soil ecosystems in particular. All of the results in this section will be affected by the sources of variation laid out previously--climate, soil type, and crop variety--as well as small sample size effects from the large number of crops included in the survey. Therefore, in the next section we will focus on the relationships between soil parameters and nutrient density outcomes and between farm practices and soil health, independent of the crops being grown.

Figure 9. The effect of crop management on BQI for 5 different crop classes: vegetables (top left), tubers (top right), grains (middle left), leafy greens (middle right), and fruit (bottom left). The darker the shading of the bar, the greater the statistical confidence in the results. The y-axis is the size of the effect (0.02 = 2%).

Impact of Farm Practice on Soil Parameters

To begin this examination, we used multiple linear regression analysis to predict antioxidants, polyphenols and BQI for crop samples using only the total carbon, respiration and pH values generated for soil samples in the BI lab. Both total carbon and respiration are recognized as indicators of soil health. Eleven out of the twenty crops tested in 2020 had sufficient samples to test this relationship, which is presented in Table 4. While the strength of the relationships ranged from very weak (r² < 0.1) to quite strong (r² = 0.61), there were many statistically significant relationships between soil properties and crop quality outcomes. In beets, for example, soil carbon and respiration from the 10-20 cm depth range were positively correlated with antioxidant, polyphenol and BQI outcomes.
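For readers unfamiliar with the technique, the sketch below fits an ordinary least squares model with three predictors (soil carbon, respiration, pH) and reports r². It is a self-contained illustration that solves the normal equations directly and is not the analysis code used for Table 4. The function names and the synthetic data are assumptions, and in practice a statistics library would normally be used instead, since it also reports p-values.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; cannot fit")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_mlr(rows, y):
    """Least-squares fit of y on the predictors in rows; returns ([intercept, b1, ...], r2)."""
    design = [[1.0] + list(r) for r in rows]  # prepend intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    coef = solve_linear(xtx, xty)
    pred = [sum(c * v for c, v in zip(coef, d)) for d in design]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - p) ** 2 for yi, p in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return coef, 1.0 - ss_res / ss_tot


# Synthetic soil rows: (total carbon %, respiration, pH). Not BI data.
soil = [(2.5, 20, 6.5), (3.0, 22, 6.8), (5.2, 30, 6.2),
        (4.1, 27, 7.0), (6.0, 35, 6.4), (3.6, 18, 5.9)]
# Synthetic outcome built from a known linear rule so the fit can be checked.
outcome = [2 + 3 * c + 0.5 * r - p for c, r, p in soil]

coefficients, r2 = fit_mlr(soil, outcome)
print([round(c, 3) for c in coefficients], round(r2, 3))
```

A negative fitted coefficient, such as the one on pH in this synthetic example, is the kind of result shown in red in Table 4.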

Table 4. Multiple linear regression coefficients for the relationships between soil parameters and nutrient density outcomes. Models that are statistically significant are highlighted in bold. Coefficients are statistically significant if they are highlighted in green or red. Green highlights indicate a positive correlation and red highlights indicate a negative correlation.

These results indicate that soil health does influence nutrient density outcomes, but not always in a positive way. We will explore why some of these relationships may be negative a little later in this section. More importantly, this statistical link between soil health parameters and nutrient density allows us to focus our examination on the connection between management practices and soil parameters, instead of looking for direct links between management practice and nutrient density outcomes in each crop.

To provide an example of the type of analysis that is currently possible with the library of data on hand, we will examine the effects of tillage intensity on soil carbon and respiration. Figure 10 presents the median shift of soil carbon and respiration values, compared to a no-till reference, for crops grown using light and heavy tillage. For the purposes of this analysis, a crop was grown using heavy tillage when at least one of the tillage passes was 6 or more inches deep. Light tillage is defined as when none of the tillage passes reached 6 inches in depth. We chose these definitions because they make it easy to process the existing management data captured by producers and automatically categorize tillage as “light” or “heavy.” For grain crops, soils under both heavy and light tillage had less soil carbon than soils under no-till management. However, in produce farming systems, soils with light tillage had more carbon than no-till soil, while soils under heavy tillage had significantly lower soil carbon. Also, we see that in grain systems, soil respiration increased with the use of tillage while it decreased in the produce farming systems.
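The light/heavy rule above is simple enough to express directly. The sketch below assumes each sample's management record can be reduced to a list of tillage pass depths in inches, with an empty list meaning no-till. That record format and the function name `classify_tillage` are hypothetical.

```python
HEAVY_DEPTH_IN = 6.0  # threshold from the definition above, in inches


def classify_tillage(pass_depths_in):
    """Classify a sample as 'no-till', 'light' or 'heavy' from its tillage pass depths."""
    if not pass_depths_in:
        return "no-till"  # assumption: no recorded passes means no-till
    if any(depth >= HEAVY_DEPTH_IN for depth in pass_depths_in):
        return "heavy"    # at least one pass reached 6 inches or more
    return "light"        # every pass was shallower than 6 inches


for depths in ([], [2.0, 4.0], [3.0, 6.0], [8.0]):
    print(depths, classify_tillage(depths))
```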

Figure 10. The median shift (%) in soil carbon and respiration under heavy and light tillage compared to no-till in grain and diversified vegetable operations. All of the effects shown here are statistically significant at p < 0.1.

Tillage as a practice is known to destroy the continuous soil pore network and infrastructure that has been created by soil organisms. Conversely, no-till supports the activities of the soil biota, improves soil's ability to store moisture, and increases soil carbon content.[1] At the same time, tillage mixes the soil, bringing organic matter into contact with soil microorganisms, often stimulating biological activity, and re-distributing organic matter throughout the plow layer.

These seemingly conflicting results are a strong reminder that context matters. The average grain grower partner owned 3,173 acres. In contrast, the average produce grower partner owned 53 acres, and over 50% of our produce grower partners owned less than 5 acres of land. This difference in scale will affect management practices, and therefore the impacts of those management practices. For example, while it was common for small-scale produce partners to add numerous organic amendments to the soil, most grain partners added little or no organic amendments.

Therefore, in grain systems the primary source of organic material was the crop residue, so tilling in the residue brought it into contact with soil microorganisms and provided them with a food source. The end result was that the residue was mineralized more quickly, with a significant portion of its carbon being converted into CO2 by the soil biology, so total carbon went down and soil respiration went up. Conversely, the regular organic additions to soils in smaller diversified vegetable operations were enough to offset carbon loss from light tillage, leading to increased soil carbon at the 0-4 inch depth. When sampling, producers were asked to move aside organic amendments and mulch layers and to sample only the mineral soil, so layers of thick organic materials that may build up in no-till systems were not sampled.

For produce samples under heavy tillage, the significant reduction in soil carbon is most likely driven by two factors: increased mineralization of organic materials due to soil disturbance, and deep tillage operations mixing carbon below the 8 inch depth that was sampled.

The varying impacts of tillage practices on respiration for different crop groups may be driven by initial differences in the soil carbon levels in those fields. Figure 11 presents the median soil carbon and respiration values by crop at the 0-4 inch depth. Many of the specialty crops included in the BI have higher than average soil C and respiration levels. For example, the average soil that produced mizuna, bok choy, spinach, kale and lettuce had soil C values greater than 5%, whereas wheat, oats, potato and blueberry had average soil C values between 2.5 and 3.5%. The same trend held true for respiration, with specialty crops generally averaging greater than 25 µg C g soil⁻¹ and wheat, oats, potato and blueberry having less than 25 µg C g soil⁻¹. These results suggest that when soils are carbon-limited, the practice of mixing organic materials into the soil increases respiration. Conversely, when soil carbon is not limiting, tillage-based disruptions of microorganisms reduce respiration.

Figure 11. Mean soil carbon and respiration values by crop for the 0-4 inch depth increment.

The previous example shows the types of analyses that can be done using the data in the BI library. Realistically, however, there are so many different management practices captured on so many crops that we would struggle to include them all here. Over the next year, the Bionutrient Institute will continue to engage with soil scientists, agronomists and other researchers to investigate the relationships between farm practices and soil health parameters. Those results will be shared with the public in the form of infographics and peer-reviewed scientific journal publications. In the meantime, members of the public can use the Data Explorer to investigate relationships of interest to them. For example, Figure 12 uses the Data Explorer to look at the effect of mulch, synthetic fertilizer and organic amendments on soil respiration in potato fields. Not surprisingly, mulches and organic amendments increased soil respiration over the average while synthetic fertilizer decreased soil respiration.

Figure 12. Range plots comparing the effects of mulch, synthetic fertilizers and organic amendments on soil respiration (0-4 inches) in potatoes using the Data Explorer Dashboard.

 

Predicting Variation

In 2018 and 2019, the BI evaluated the feasibility of using a low-cost handheld sensor--the Bionutrient meter--to predict nutrient density in crops. To do this, each food sample analyzed in the BI lab had its reflectance spectra measured with the Bionutrient meter (a 10-channel spectrometer covering 365-940 nm) and the Siware spectrometer (1300-2500 nm range, 30 nm resolution). First, the surface reflectance was scanned, and then the sample was juiced and the reflectance spectra of the juice were measured. This allowed the BI lab to generate a large database of spectral data and nutrient outcomes. Prediction models for nutrients were then tested on this database using linear regression and the Random Forest ensemble machine learning method. Some of the key questions we sought to answer were:

  • Can the lower-cost, but less accurate, Bionutrient meter capture enough spectral data to generate predictions?

  • Does the increased resolution of the Siware spectrometer translate into significantly better nutrient predictions?

  • Can we develop prediction models based on raw samples (e.g. whole leaves or fruit) or do we need to juice the sample to develop a usable prediction model?

In evaluating the database and prediction models in 2019 we determined that:

  1. The Bionutrient meter, combined with appropriately attainable metadata, is just as effective at predicting nutrient quality as the benchtop Siware device.

  2. Using variety data may provide a large boost to the predictive capacity of the Bionutrient meter. Therefore, the BI should put more emphasis on capturing variety data in 2020.

  3. The best predictive capacity came when attempting to categorize samples as above average, average, or below average, not when predicting absolute values.

This year, instead of evaluating the potential to predict nutrient concentrations in crops, we evaluated prediction models through the lens of which models we would feel comfortable releasing to the public as functional prediction models. More specifically, we sought to develop prediction models that used the Bionutrient meter to estimate three nutrient density measures in crops: antioxidants, polyphenols and BQI.

To guide the release of prediction models we developed a set of criteria that we could easily communicate to the community:

  1. The models would only require data that could be easily collected by a non-scientist member of the community when they were measuring the spectra of the food with the Bionutrient meter.

  2. We will predict the ranking of a nutrient and not its absolute value. At this early stage of model deployment it is more useful to tell a consumer where an item of food ranks in our database than to display the absolute value (ex: Kale leaf ‘X’ has 255 FRAP units of Antioxidants). This gives the community a sense of the relative quality that is easy to interpret.

  3. We will provide the user with a value range at an 80% accuracy level. For example, a model will output a ranking (ex: 72) and a range (ex: 60 - 85). These values reflect the model's best guess of the percentile of the nutrient and the 80% confidence range (we are 80% confident the value falls within the given range).

  4. The maximum allowable average value range, or width, is set to 35 percentile points. Therefore, any model whose average width is greater than 35 at the 80% accuracy threshold will not be deployed.
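Criteria 2 through 4 can be illustrated with a short sketch. It assumes a ranking is the percentage of reference samples with a lower value, and that a model's width is the upper bound of its 80% range minus the lower bound, averaged over its predictions. The report does not state how width is measured (full range or +/- half range), so that definition, the function names and the values below are all assumptions.

```python
from bisect import bisect_left

MAX_AVG_WIDTH = 35.0  # deployment threshold from criterion 4


def percentile_rank(value, reference_values):
    """Percent of reference samples with a value strictly below `value`."""
    ordered = sorted(reference_values)
    return 100.0 * bisect_left(ordered, value) / len(ordered)


def deployable(ranges, max_width=MAX_AVG_WIDTH):
    """True if the average width of the (low, high) 80% ranges is within the threshold."""
    widths = [high - low for low, high in ranges]
    return sum(widths) / len(widths) <= max_width


# Made-up reference values for one crop and nutrient.
reference = [100, 150, 200, 250, 300]
print(percentile_rank(255, reference))       # ranking of a new sample

# Made-up 80% ranges from two hypothetical models.
print(deployable([(60, 85), (40, 70)]))      # narrow ranges
print(deployable([(10, 60), (30, 80)]))      # wide ranges
```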

Furthermore, in 2021 we will target sampling of crops that fell just short of our deployment criteria to boost the number of observations and (hopefully) release models for these crops early in 2022.

Figure 13. The width (+/- percentile range) for each prediction model deployed in August 2021 using the Bionutrient Meter at 80% accuracy.

 

Figure 14. Prediction Model output using the bionutrient.surveystack.io app to run the Bionutrient meter. The number is the predicted ranking, meaning that the sample has more polyphenols than 66% of the samples submitted to the BI labs for that crop type. The width represents the +/- accuracy range at 80% accuracy.

In addition to developing a subset of functional prediction models for a beta release, we tested 20 different prediction models for each crop nutrient using different sets of spectral data and metadata. More details about all of these prediction models are available in the detailed data analysis briefs developed for each crop type.

- Grains Data Brief.
- Leafy Greens Data Brief.
- Vegetables, tubers and fruit Data Brief.

One theme in 2020 that is consistent with results from 2018 and 2019 is that using the Siware spectrometer, or homogenizing the samples through juicing or grinding, did not significantly increase the quality of the prediction models (data not shown).

Next Steps for model deployment

Releasing a set of functional models to the public in 2021 is the realization of many years of work within the Bionutrient Institute. However, it is only the first step in what is sure to be a long process. In order for the Bionutrient meter to be a useful tool within the food supply chain, building transparency and trust in the outcomes are long-term requirements. The first requirement, transparency, has been a key principle of the lab and meter development process since 2018. All of the data and models used are open source and available to the public. The second requirement, trust, will require testing and interaction with the community of users.

We are working with Citizen Science and Grower partners who purchased a Bionutrient meter to validate the in-field predictions. To do this, partners will submit to the BI lab a subset of the samples that they test using the Bionutrient meter, so that the same samples can be analyzed in the lab. This will allow us to compare predictions to actual lab outcomes. This data will help us optimize the prediction models that we have released to:

- Increase the accuracy of the prediction models
- Reduce the width for the 80% accuracy predictions
- Expand the number of nutrients we can predict
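One simple way to compare in-field predictions with lab outcomes is to check how often the lab-measured ranking falls inside the predicted 80% range. The sketch below assumes each validation record is a (low, high, lab ranking) triple. That format, the function name `coverage` and the values are hypothetical.

```python
def coverage(records):
    """Fraction of records whose lab-measured ranking lies within the predicted range."""
    hits = sum(1 for low, high, lab in records if low <= lab <= high)
    return hits / len(records)


# Made-up validation records: (predicted low, predicted high, lab ranking).
validation = [
    (60, 85, 70),
    (40, 70, 75),   # lab result falls outside the predicted range
    (10, 45, 30),
    (50, 80, 52),
    (20, 50, 20),
]

print(coverage(validation))
```

If the ranges are well calibrated, coverage on validation samples should be close to 0.80. Coverage well below that would suggest the ranges are too narrow, and coverage well above it would suggest they could be tightened.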

Furthermore, throughout the 2nd half of 2021, we will work with our 2020 data set to release more prediction models. In particular, we will target crops that fell just short of our deployment criteria to boost the number of observations and (hopefully) release models for these crops by the end of 2021.

 

Conclusions

 
  • In 2020, the BI was successful in scaling up the Food and Soil Survey to include 3 labs, almost doubling the number of samples received (3,851), working directly with 5 times more farmers than in 2019, and capturing detailed management data on a much larger percentage of the samples received.

  • We saw significant variation in nutrient and mineral content in almost all of the 20 crops measured by the BI labs in 2020.
    - By conducting more intentional experimental design in 2020, getting more data from farms and balancing samples from across a wider range of climate regions and soil types, we were better able to examine factors affecting nutrient density outcomes.

  • We identified significant correlations between soil health parameters and nutrient density outcomes. Often those relationships were positive, but sometimes they were negative.

  • Tillage intensity can have a significant impact on soil carbon and respiration, but the context (the crop being grown, size of the farm, etc.) determines the size and direction of that impact.

  • With the release of the Data Explorer Dashboard, anyone can examine the sources of variation using a series of preset and custom filters, or compare management impacts for any crop or nutrient measured by the BI lab.

  • After developing a set of criteria to guide the release of nutrient density estimation models using the Bionutrient Meter, we released models to predict antioxidants, polyphenols and/or BQI on 12 crops.

 
 

Next Steps

  • Continue to optimize the nutrient density estimation models that were released in 2021.

  • Release new models for crops or nutrients that were not included in the initial model release.

  • Engage with other domain experts to analyze the BI dataset and develop manuscripts for publication in peer-reviewed science journals.

  • Continue to improve the BI data pipeline to shorten the time between samples arriving in the lab, results being reported to grower partners, and the data becoming available on the Data Explorer Dashboard for public consumption.

 

IN PROGRESS:

We are engaging with experts across numerous scientific disciplines, whose feedback will help us improve the analysis and interpretation of the 2020 data and the BI nutrient density library as a whole.

This means that there may continue to be revisions to this report.