Advancing Medicine through Experiments on Animals: Room for Improvement?

Several recent articles have discussed problems that have arisen in translating findings from animal studies into clinical treatments.[1],[2],[3] Some have interpreted this limited translational success as evidence that animal studies are not useful in generating treatments for disease. We challenge that conclusion in light of all the medical advances that have stemmed from animal experiments. In our view, these articles reveal underlying problems in terms of how science is funded, conducted, and reported. The result has been premature efforts to translate preliminary findings in animals to human medicine. The problems we see are these:

  • A scarcity of research funding, which makes replicating preliminary studies a low priority;
  • The unintended consequences of efforts to improve animal welfare, which can mean that a study uses too few animals to produce statistically valid data; and
  • Journals’ reluctance to include sufficient detail about research protocols and to publish information about approaches that did not succeed.

Also at issue is growing pressure to produce positive results even when there is just as much to be learned from failures.


Research in a time of scarcity: Because funding is limited, granting agencies can fund only a small percentage of the proposals submitted. Moreover, breaking new scientific ground is seen as the way to maintain public support for their missions. Repeating studies for the sake of validation has much less cachet. Given that there is little optimism about near-term prospects for funding increases, granting agencies need to become savvier about the value of replicating scientific studies as part of the translational research process.

Unintended consequences of animal welfare measures: In most countries, efforts to promote animal welfare follow the “3 Rs” principles of Russell and Burch, namely, reduce, refine, and replace.[4] As the authors intended, this has encouraged greater concern for animal welfare in study design and execution. However, for those who oppose research, reduction is often viewed as an end in itself. The starting point is a set of very reasonable tenets along these lines:

  1. Experiments should involve the smallest number of animals that will produce a statistically significant outcome;
  2. No unnecessary duplication of studies should occur; and
  3. The phylogenetically lowest species to accomplish the scientific goals of the study should be used.

Each of these principles includes important qualifiers that sometimes tend to be discounted in the face of pressure to show “progress” by reducing animal numbers. Disregarding those qualifiers has negative consequences downstream when efforts are made to build upon the findings.

When too few animals are used in a study, it is “underpowered,” meaning that the experiments do not produce enough data to draw solid conclusions. This can lead to both false positives (i.e., concluding that a treatment is successful when it is not) and false negatives (i.e., concluding that a treatment is not effective when it really is). The statistical tool scientists use to determine the appropriate number of subjects needed to produce statistically valid results is called a power analysis. A power analysis must take into account all the factors that could affect the treatment, but in many cases the researcher cannot identify every factor before the study begins. As a result, the power analysis is an educated guess. It is still the best tool available even in the face of many unknowns, but it can’t give a definitive answer in terms of how many subjects are needed to draw valid conclusions. When study data are analyzed, unexpected sources of variation may be discovered. This means that achieving statistically valid conclusions requires more data—in other words, more animals. However, it is not always possible to do the additional experiments: regulators may be reluctant to approve additional experiments, or there may not be enough funding to do more studies.
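To make the "educated guess" concrete, here is a minimal sketch of a power analysis for the simplest case: comparing a treated group with a control group. It uses the standard normal approximation for a two-sided, two-sample comparison. The function names, the 5% significance level, the 80% power target, and the example numbers are illustrative assumptions, not values taken from any study discussed here.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, SD 1)


def animals_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate animals needed per group for a two-sided, two-sample test.

    effect_size is the expected difference between group means divided by
    the expected standard deviation (Cohen's d). Normal approximation only;
    an exact t-test calculation gives slightly larger numbers.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


def achieved_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power actually achieved with n_per_group animals."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return _STD_NORMAL.cdf(effect_size * math.sqrt(n_per_group / 2) - z_alpha)


if __name__ == "__main__":
    # Hypothetical planning assumption: a 10-unit difference with an SD of 10.
    planned = animals_per_group(10 / 10)
    # Hypothetical surprise: the data show an SD of 15 instead of 10.
    needed = animals_per_group(10 / 15)
    print("planned per group:", planned)
    print("needed per group given the larger SD:", needed)
    print("power actually achieved:", round(achieved_power(10 / 15, planned), 2))
```

Under these assumed numbers, a study planned with 16 animals per group would have needed 36 per group once the extra variation is taken into account, and its real power falls from the intended 80% to a little under 50%. That is the situation described above: the study was planned in good faith, but valid conclusions now require more animals.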

The other problem is the push to use model organisms that are low on the phylogenetic or evolutionary ladder. Animal models are meant to serve as stand-ins for human beings based upon their physiological similarities. Yet every organism has many specialized adaptations, so whatever their physiological similarities, there are still differences between species. In addition, there can be significant variations between individuals within one species. Unless the end goal actually is to cure mice or rats, potential treatments should be tried in several species before human clinical trials, including more phylogenetically advanced species such as dogs, cats, pigs, or nonhuman primates. This increases the likelihood of successful translation, helps identify problematic side effects, and may also help determine whether other species themselves could benefit from the treatment.

The devil is in the details: It is also important to have scientists in different labs repeat the same experiment since “non-experimental” factors can also influence outcomes. Differences in the animals’ food, the type of bedding materials in their cages, chemicals used to sterilize instruments and cages, and even the water they drink may affect the outcome of the study. The role of variability in these non-experimental factors may only become apparent when one lab is unable to repeat the findings of a study. The first step is to review every aspect of the experimental protocol, but it is sometimes also necessary to look at other “non-experimental factors” like those listed above. Recently several prominent organizations have produced guidelines recommending more extensive reporting of animal study methods to reduce this kind of variability.

In 2010, the U.K.-based National Centre for the Replacement, Refinement and Reduction of Animals in Research prepared the Animal Research: Reporting In Vivo Experiments (ARRIVE) guidelines, while in 2011, the U.S.-based National Academy of Sciences published Guidance for the Description of Animal Research in Scientific Publications. In 2012, the National Institute of Neurological Disorders and Stroke (NINDS) published “A call for transparent reporting to optimize the predictive value of preclinical research”[5] to report the outcomes of a workshop it organized to explore which important details of experimental procedure ought to be described.

Failure to include sufficient detail to permit replication of a study can often be laid at the doorstep of scientific journals. Journals have traditionally allowed limited space to describe methods in the interest of keeping the cost of journal printing and mailing under control. As a result of the NINDS workshop, Nature[6] has expressed a willingness to provide more experimental details. It would be helpful if other journals followed suit.

The positive side of negative findings: Journals are in the business of publishing cutting-edge science that drives a field forward. Most are reluctant to accept articles that report experiments that don’t work or treatments that fail to show the desired effect, since those “negative findings” aren’t likely to be cited in other papers, and citations are a key factor in building a journal’s reputation. However, without access to negative findings, scientists don’t have the opportunity to learn from others’ experience, so they may go down the same blind alleys. In addition, positive findings may be interpreted as more significant than they really are when relevant negative findings about the same topic have not been aired. Everyone would benefit from making negative findings widely available, but it’s not clear how to launch reputable venues for publishing them.

At first blush, it might seem that the approaches discussed above will increase the number of animals in experimental studies, but that is not necessarily the case. If negative findings are shared more widely, unproductive approaches will be abandoned more quickly. If early-stage, exploratory studies are repeated for the sake of validation, more “false positives” will be weeded out. Then the treatments that go on to further animal studies and human clinical trials are more likely to produce successful clinical outcomes. Given the high cost of clinical trials, this approach will save money in the long run, making more funding available for both basic research and drug development studies.
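A little arithmetic shows how replication weeds out false positives. The sketch below estimates what share of "positive" findings reflect a real effect, first after a single study and then when an independent replication must also succeed. Every number in it is a hypothetical assumption chosen for illustration: that 10% of the treatments tested truly work, that each study has 80% power, and that each uses a 5% false-positive rate. The function name is likewise ours.

```python
def share_of_positives_that_are_real(prior, power, alpha, n_studies=1):
    """Fraction of treatments passing n_studies independent studies that truly work.

    prior: assumed fraction of tested treatments that really are effective
    power: chance that one study detects a real effect
    alpha: chance that one study reports an effect that is not there
    """
    true_positives = prior * power ** n_studies
    false_positives = (1 - prior) * alpha ** n_studies
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    # Hypothetical inputs: 10% of candidates work, 80% power, 5% false-positive rate.
    single = share_of_positives_that_are_real(0.10, 0.80, 0.05, n_studies=1)
    replicated = share_of_positives_that_are_real(0.10, 0.80, 0.05, n_studies=2)
    print(f"real effects among single-study positives: {single:.0%}")
    print(f"real effects among replicated positives:   {replicated:.0%}")
```

Under these assumptions, about 64% of single-study positives are real, compared with roughly 97% of positives that also survive one replication. The exact figures depend entirely on the assumed inputs, but the direction of the effect is what matters: fewer dead-end candidates move on to later animal studies and clinical trials.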

Taking these steps will require new thinking by those who regulate animal and human research, those who publish research articles, and those who allocate funds for research. However, the pay-off is worthwhile because it may lead to earlier identification of promising treatments for human and animal patients.



[1] van der Worp HB, Howells DW, Sena ES, Porritt MJ, Rewell S, et al. Can Animal Models of Disease Reliably Inform Human Studies? PLoS Med 7(3): e1000245, 2010.

[2] Houser SR, Margulies KB, Murphy AM, Spinale F, Francis GS, et al. Animal Models of Heart Failure. Circ Res 111: 131-150, 2012.

[3] Improving the Utility and Translation of Animal Models for Nervous System Disorders: Workshop Summary. The National Academies Press, 2013.

[4] Russell WMS, Burch RL. The Principles of Humane Experimental Technique. London: Methuen & Co Ltd., 1959.

[5] Landis SC, et al. A call for transparent reporting to optimize the predictive value of preclinical research. Nature 490: 187-191, 2012.

[6] Raising Standards. Nat Immunol 14: 415, 2013.