Challenges

Fraud and paper mills

Richardson et al. (2025)

Paper mills flourish because of research systems that evaluate scientists using publication metrics, thereby inadvertently providing an incentive for misconduct. (Abalkina et al. 2025)

Certain research fields seem to be particularly susceptible, namely those in which the number of possible experiments far exceeds the available scientific resource. Fields we know of include non-coding RNAs in human cancer and crystallography — vast numbers of different RNA combinations and crystal structures can potentially be investigated. (Abalkina et al. 2025)

Aquarius et al. (2025)

Detecting fraud

The first step is to check whether a published paper has been flagged in a database that lists retractions and corrections, such as the Retraction Watch database, or on PubPeer (Abalkina et al. 2025).

Correlates of fraud

  • Very large effect sizes
    • If your Cohen’s \(d > 1.72\) (roughly the height difference between the sexes), why doesn't everyone already know about the effect?
  • Lack of open code/data
  • Image manipulation
    • Duplication
    • Fabrication (vertical line test)
  • No reporting of funding or ethical approval
  • Round numbers in sample sizes
    • e.g. N=100 participants with exactly 50 per group: only about an 8% chance under simple randomization
  • Unexpected patterns in data sets
    • Download raw data, plot correlations: plausible? (requires domain knowledge)
  • Statistical inconsistencies
    • GRIM test: granularity-related inconsistency of means (a related check exists for standard deviations)
      • Mean of \(N\) integers can only have specific values
    • Auto-correlation of random sequences (humans are bad RNGs)
    • Inconsistencies in reported \(p\)-values
      • Is the test statistic consistent with the reported \(p\)-value?
      • Rounded numbers in tables can produce only a limited range of \(p\)-values
        • Does the reported \(p\)-value fall outside that range?
        • A list of \(p\)-values that can be reproduced perfectly from the rounded numbers suggests that no raw data were analyzed
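
Three of the checks above are simple enough to sketch in a few lines of Python. The block below is a minimal illustration, not an established tool: the function names are hypothetical, the \(p\)-value check assumes a two-sided z-test (other tests need the matching distribution), and the tolerance assumes that reported values were rounded to the stated number of decimals.

```python
from math import ceil, comb, floor
from statistics import NormalDist


def prob_equal_split(n: int) -> float:
    """Probability that simple (coin-flip) randomization of n participants
    produces two groups of exactly n/2 each."""
    if n % 2:
        return 0.0  # an odd total cannot be split evenly
    return comb(n, n // 2) / 2 ** n


def grim_consistent(reported_mean: float, n: int, decimals: int = 2) -> bool:
    """GRIM check: could the mean of n integers round to reported_mean?

    The sum of n integers is itself an integer, so only the integer sums
    closest to reported_mean * n need to be tested.
    """
    tol = 0.5 * 10 ** -decimals + 1e-9  # half a unit in the last reported digit
    target = reported_mean * n
    return any(
        abs(total / n - reported_mean) <= tol
        for total in (floor(target), ceil(target))
    )


def p_from_z(z: float) -> float:
    """Two-sided p-value of a z statistic (assumption: standard normal test)."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def p_consistent(z: float, reported_p: float, decimals: int = 3) -> bool:
    """Does the recomputed p-value match the reported one after rounding?"""
    tol = 0.5 * 10 ** -decimals + 1e-9
    return abs(p_from_z(z) - reported_p) <= tol
```

For example, `prob_equal_split(100)` gives roughly 0.08, matching the 8% figure above, and `grim_consistent(5.19, 28)` is `False` because no sum of 28 integers yields a mean that rounds to 5.19. A failed check is a reason to look closer or to ask the authors for the raw data, not proof of fraud.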

“Bosom peril” is not “breast cancer”: How weird computer-generated phrases help researchers find scientific publishing fraud
Bulletin of the Atomic Scientists (13 January 2022)

A tortured phrase is an established scientific concept paraphrased into a nonsensical sequence of words.

  • Arbitrary timberland arrangement (random forest classifier)
  • Counterfeit consciousness (artificial intelligence)
  • Man-made brainpower (artificial intelligence)
  • Mean square blunder (mean square error)
  • Flag to clamor (signal to noise)
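
A first-pass screen for tortured phrases can be sketched as a lookup against a list of known examples. The block below is an illustrative assumption rather than any published detector: the dictionary holds only the phrases listed above, and the function name `find_tortured_phrases` is hypothetical. Real screening would need a far larger curated list.

```python
import re

# Known tortured phrases mapped to the established term they replace.
# Only the examples from this section are included.
TORTURED_PHRASES = {
    "arbitrary timberland arrangement": "random forest classifier",
    "counterfeit consciousness": "artificial intelligence",
    "man-made brainpower": "artificial intelligence",
    "mean square blunder": "mean square error",
    "flag to clamor": "signal to noise",
    "bosom peril": "breast cancer",
}


def find_tortured_phrases(text: str) -> list[tuple[str, str]]:
    """Return (tortured phrase, expected term) pairs found in text.

    Matching is case-insensitive and tolerates line breaks or repeated
    spaces between the words of a phrase.
    """
    hits = []
    for phrase, expected in TORTURED_PHRASES.items():
        words = (re.escape(word) for word in phrase.split())
        pattern = r"\b" + r"\s+".join(words) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            hits.append((phrase, expected))
    return hits
```

Calling `find_tortured_phrases("We trained an arbitrary timberland arrangement to lower the mean square blunder.")` flags both phrases together with the terms they most likely stand for. A hit suggests that the text may have passed through a paraphrasing tool and deserves manual inspection.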

Peer review and publishing

Potential solutions for the strained peer review system, as discussed in the Nature article by Adam (2025):

  • Publish, review, curate: publish as a preprint, have it reviewed, then curate it for particular research communities
  • Organize peer review independently of journals: authors can submit paper with reviews
  • Selective use of peer review
  • Funding lotteries for grant proposals passing basic quality check
  • Widen pool of reviewers

Alternative publishing model:

Strain on scientific publishing blog:

References

Abalkina, Anna, René Aquarius, Elisabeth Bik, David Bimler, Dorothy Bishop, Jennifer Byrne, Guillaume Cabanac, Adam Day, Cyril Labbé, and Nick Wise. 2025. “‘Stamp Out Paper Mills’ — Science Sleuths on How to Fight Fake Research.” Nature 637 (8048): 1047–50. https://doi.org/10.1038/d41586-025-00212-1.
Adam, David. 2025. “The Peer-Review Crisis: How to Fix an Overloaded System.” Nature 644 (8075): 24–27. https://doi.org/10.1038/d41586-025-02457-2.
Aquarius, René, Elisabeth M. Bik, David Bimler, Morten P. Oksvold, and Kevin Patrick. 2025. “Tackling Paper Mills Requires Us to Prevent Future Contamination and Clean up the Past – the Case of the Journal Bioengineered.” Bioengineered 16 (1): 2542668. https://doi.org/10.1080/21655979.2025.2542668.
Richardson, Reese A. K., Spencer S. Hong, Jennifer A. Byrne, Thomas Stoeger, and Luís A. Nunes Amaral. 2025. “The Entities Enabling Scientific Fraud at Scale Are Large, Resilient, and Growing Rapidly.” Proceedings of the National Academy of Sciences 122 (32): e2420092122. https://doi.org/10.1073/pnas.2420092122.