arxiv bans authors for unchecked ai-generated papers

source: techcrunch ai: research repository arxiv will ban authors for a year if they let ai do all the work

level: technical

arxiv, the open preprint repository, is tightening its rules against careless use of large language models in scientific papers. the site, which hosts research before peer review, has become a key distribution channel in fields like computer science and math. it already requires new authors to get endorsements and is becoming an independent nonprofit to better address ai-generated content.

thomas dietterich, chair of arxiv's computer science section, announced that authors will face a one-year ban if a paper shows incontrovertible evidence that they did not check llm output. such evidence includes hallucinated references or leftover comments to and from the llm. after the ban, authors must first have any future submission accepted by a reputable peer-reviewed venue. the rule is a one-strike policy, but moderators must flag issues and section chairs must confirm the evidence before penalties apply, and an appeals process is available.

the policy does not prohibit llm use but requires authors to take full responsibility for content regardless of how it is generated. if researchers copy inappropriate language, plagiarized content, biased content, errors, or misleading content directly from an llm, they are still accountable. recent research shows fabricated citations are rising in biomedical papers, likely due to llms, highlighting the need for such measures.

why it matters: this policy directly affects ai and data science researchers who use llms to draft papers, and it underscores the need for human verification to maintain scientific integrity.
