Portfolio Analysis of NIH Prevention Research

The ODP plays an important role in characterizing the NIH prevention research portfolio. The ODP has been analyzing the NIH research portfolio since 2013, and in 2015, the ODP began developing more precise machine learning tools and approaches to better describe and understand NIH-funded prevention research. We published our initial findings in 2018 and continue to release new data on a regular basis.

The ODP’s process and the major results of our work are highlighted below.

NIH Investment in Prevention Research

According to the ODP's most recent analysis (published in March 2021):

  • Primary and secondary prevention research represents 20.7% of NIH research projects and 27.4% of NIH research funding.
  • A large proportion of prevention research projects included observational studies (63.9%), analysis of existing data (46.5%), or methods research (24.0%).
  • Projects using a randomized clinical trial design represented just over a tenth (12.3%) of the NIH prevention research portfolio.


Continuing our detailed analysis of the NIH prevention research portfolio, the ODP has examined more than 14,000 NIH-funded research projects across 12 activity codes for grants awarded in fiscal years 2012–2019. These 12 activity codes represent 90.8% of all new projects and 80.1% of all dollars used in NIH research supported by extramural grants and cooperative agreements.

The ODP collaborated with the Office of Portfolio Analysis (OPA) to develop and validate novel machine learning algorithms that identify prevention research projects. Because the machine learning method was specifically trained to recognize applied prevention research, it more accurately identifies applied prevention projects than other, earlier approaches.

The ODP regularly publishes the results of its analyses of the NIH prevention research portfolio in specific areas, which are listed below. New publications are added as they become available.

Primary and Secondary Prevention Research


¹ Protocol: Coding Abstracts Using the ODP Prevention Taxonomy - version 1.0 (PDF)
² Protocol: Coding Abstracts Using the ODP Prevention Taxonomy - version 3.1 (PDF)

Study Designs Used in NIH Prevention Research


² Protocol: Coding Abstracts Using the ODP Prevention Taxonomy - version 3.1 (PDF)

Diet and Physical Activity


¹ Protocol: Coding Abstracts Using the ODP Prevention Taxonomy - version 1.0 (PDF)

Health Care Delivery


² Protocol: Coding Abstracts Using the ODP Prevention Taxonomy - version 3.1 (PDF)

Substance Use


¹ Protocol: Coding Abstracts Using the ODP Prevention Taxonomy - version 1.0 (PDF)


Why the ODP Analyzes NIH Prevention Research

The ODP seeks to better understand the NIH’s investment in prevention research by methodically characterizing the NIH prevention research portfolio, providing more information about what the NIH is funding and in greater detail than previously available.

More specific identification and analysis of NIH-funded prevention research enables the ODP to assess the progress and changes in NIH-funded prevention research over time. These efforts help the ODP describe trends in NIH-funded prevention research and identify gaps in the NIH prevention research portfolio that could benefit from targeted investments, potentially addressing important modifiable risk factors and, therefore, reducing the burden of preventable disease.

The ODP provides leadership for the development, coordination, and implementation of prevention research in collaboration with NIH Institutes and Centers and other partners. Fulfilling this role depends on the ODP’s ability to accurately characterize studies across a number of dimensions such as topic area, study design, population studied, and type of prevention research.

How the ODP Identifies and Classifies NIH Prevention Research

Defining and Coding Prevention Research

The ODP defines prevention research as encompassing both primary and secondary prevention research in humans, as well as prevention-related methods for use in humans. The definition does not include basic or preclinical studies that could still be years away from preventing disease or disability.

Working closely with the OPA, the ODP developed new methods for applying machine learning to the NIH’s prevention research portfolio, based on the ODP’s definition of prevention research.

Once the machine learning tools identify relevant NIH prevention research projects, the ODP’s prevention research taxonomy (PDF)—along with a detailed protocol (PDF)—serves as a set of rules for coding project abstracts. The taxonomy is a framework for classifying research and includes 140 non-mutually exclusive topics grouped into six categories. The protocol provides teams of coders with instructions, definitions, and examples to support the accurate, standardized classification of research projects.
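Because taxonomy topics are not mutually exclusive, a single project can carry several codes at once, so the shares reported for individual topics can add up to more than 100% (as with the study design percentages reported above). The following is a minimal Python sketch of that bookkeeping. The project IDs, topic labels, and the `topic_shares` function are hypothetical and for illustration only; the actual topics are defined in the ODP taxonomy.

```python
from collections import Counter

# Hypothetical coded projects: each project maps to a SET of topic codes,
# because taxonomy topics are not mutually exclusive.
coded_projects = {
    "P001": {"observational study", "analysis of existing data"},
    "P002": {"randomized trial"},
    "P003": {"observational study", "methods research"},
    "P004": {"analysis of existing data"},
}


def topic_shares(projects):
    """Return the percentage of projects that carry each topic code."""
    counts = Counter(topic for topics in projects.values() for topic in topics)
    total = len(projects)
    return {topic: 100.0 * n / total for topic, n in counts.items()}


shares = topic_shares(coded_projects)
# The shares are computed per topic against the same project total,
# so their sum can exceed 100 when projects carry more than one code.
total_of_shares = sum(shares.values())
```

In this toy data set, two of the four projects carry two codes each, which is why the per-topic shares overlap.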

Figure: Overall process of characterizing the NIH prevention research portfolio using a novel machine learning algorithm. (1) Database of funded NIH grants (FY2012–2017). (2) Grants are fed into the machine learning program. (3) The machine learning program identifies prevention and non-prevention projects; staff manually code 50% of the prevention projects and 5% of the non-prevention projects. (4) Quality control checks of project coding. (5) Extrapolation of data. (6) Database of well-annotated NIH-funded prevention research.
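The sampling and extrapolation steps in this process can be sketched in Python as follows. Only the two sampling fractions (50% and 5%) come from the process description. The page does not say how the extrapolation is performed, so the inverse-probability weighting used here is an assumption, as are the function names and data layout.

```python
import random

# Sampling fractions from the process description. Everything else here
# (function names, data layout, weighting scheme) is an illustrative assumption.
SAMPLING_FRACTION = {"prevention": 0.50, "non-prevention": 0.05}


def draw_sample(projects, seed=0):
    """Pick projects for manual coding, stratified by the model's label.

    `projects` is a list of (project_id, predicted_label) pairs.
    """
    rng = random.Random(seed)
    sample = []
    for label, fraction in SAMPLING_FRACTION.items():
        stratum = [p for p in projects if p[1] == label]
        k = round(len(stratum) * fraction)
        sample.extend(rng.sample(stratum, k))
    return sample


def extrapolate(coded_sample):
    """Estimate the portfolio-wide number of true prevention projects.

    `coded_sample` is a list of (predicted_label, is_prevention) pairs
    produced by manual coding. Each coded project stands in for
    1 / fraction projects in its stratum (assumed weighting method).
    """
    return sum(
        1.0 / SAMPLING_FRACTION[predicted]
        for predicted, is_prevention in coded_sample
        if is_prevention
    )
```

For example, with 100 projects labeled prevention and 400 labeled non-prevention, `draw_sample` selects 50 and 20 projects respectively. If coders then confirm 45 of the 50 and find 1 missed prevention project among the 20, `extrapolate` estimates 45 × 2 + 1 × 20 = 110 prevention projects overall. Sampling the non-prevention stratum at all is what lets the estimate account for projects the model missed.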

The ODP continually refines and trains its machine learning algorithms to identify prevention research projects. Efforts are currently underway to apply machine learning to support coding more specific details of individual prevention research grants based on the ODP taxonomy, such as the health condition, study population, study design, and type of prevention research.

¹ Protocol: Coding Abstracts Using the ODP Prevention Taxonomy - version 1.0 (PDF)

Last updated on February 9, 2023