
Navigating the Rising Tide of AI-Generated Publications: Insights from ACSE Advisory Cabinet

By Guest Editor | Apr 03, 2024

In the rapidly evolving landscape of scholarly publishing, new challenges continue to emerge, and organizations like the Asian Council of Science Editors (ACSE) must remain vigilant and proactive in addressing them. One such pressing issue that has recently come to the forefront is the proliferation of AI-generated publications by leading publishers.

Recognizing the significance of this new development, ACSE convened its Advisory Cabinet to thoroughly examine the case and explore potential solutions to mitigate such occurrences in the future. In this article, we explore the insights and recommendations provided by our esteemed panel of advisors, which we hope will shed more light on the ongoing discourse around AI use in content development and peer review in publications.

Details of Submitted Case:

Retraction Watch Uncovers AI-Generated Paper Published by Elsevier
In a surprising turn of events, a tweet by Retraction Watch has ignited a firestorm on Twitter regarding an article published by Elsevier that was seemingly authored by the AI language model ChatGPT. The paper in question was titled "The three-dimensional porous mesh structure of Cu-based metal-organic-framework - aramid cellulose separator enhances the electrochemical performance of lithium metal anode batteries," and was authored by Manchu Zhang, Lining Wu, Tao Yang, Bing Zhu, and Yangai Liu. It appeared in the peer-reviewed journal "Surfaces and Interfaces," Volume 46, March 2024 (DOI: 10.1016/j.surfin.2024.104081).


The tweet quickly went viral, prompting widespread discussions among the scientific community and beyond. Many have expressed astonishment that an AI language model could generate a paper detailed enough to pass through the editorial process of a reputable publisher like Elsevier.

Upon closer examination of the paper (accessible here: Link), it appears that the entire editorial process, including the reviewing team, may have overlooked the fact that an AI language model generated the paper.

This revelation has raised significant questions about the reliability of the peer review process and the potential implications of AI-generated content in scholarly publishing. While AI technologies have shown remarkable capabilities in generating content efficiently, this incident underscores the need for greater scrutiny and oversight in the editorial process.

As discussions continue to unfold on social media platforms, it remains to be seen how publishers like Elsevier will address this issue and what measures will be taken to prevent similar incidents in the future.

In a related development, another Twitter user has brought attention to yet another paper published by Elsevier, which raises further concerns about the use of AI-generated content in scholarly publishing. The paper, titled "Successful management of an iatrogenic portal vein and hepatic artery injury in a 4-month-old-female patient: A case report and literature review," appears to have been written, at least in part, by ChatGPT.

The case report, detailing the successful management of an iatrogenic portal vein and hepatic artery injury in a 4-month-old female patient, was published in an Elsevier journal. The article delves into the case specifics and provides a literature review on the subject matter.


This revelation adds fuel to the ongoing discussion sparked by Retraction Watch's tweet, further highlighting the prevalence of AI-generated content within the scientific literature. The availability of such papers on reputable platforms like Elsevier raises questions about the efficacy of the editorial process in identifying and addressing AI-generated content.

The paper in question can be accessed via the following link: Link

As the debate surrounding AI-generated content in scholarly publishing continues to gain momentum, it underscores the need for heightened scrutiny and transparency in the editorial process.

ACSE Advisory Cabinet Members' Feedback:

The ACSE Advisory Cabinet Members (ACM) were consulted to gather insights, opinions, and suggestions regarding the discussed issue. To ensure the confidentiality of our cabinet members, each response has been assigned an opinion number. Below, you'll find the responses from each member:

Opinion by ACM01:

I find myself hesitant to offer commentary on these incidents as they have been presented, for several reasons. Firstly, legitimate concerns have been raised regarding the (mis)use of AI. However, the final published outputs underscore significant flaws within our publishing system and workflow, despite publication being a collaborative effort involving numerous competent individuals: authors, reviewers, editors, typesetters, copy editors, and proofreaders. It appears that diligence in performing their respective roles has been lacking. The absence of a comprehensive review of the manuscript by any party is perplexing.

Secondly, in the Radiology Case Reports paper, it is evident that the authors paid the full article processing charge (APC) with the expectation of receiving a standard, satisfactory service from the publisher. However, the delivered outcome falls short of this expectation. While we often discuss consumer rights, it remains uncertain who would challenge or pursue legal action (and where) in the event of such rights being violated.

Thirdly, it raises the question of whether readers should also be considered consumers. This proposition is debatable, given that we do not directly pay for access to, or utilization of, the flawed open access output in Radiology Case Reports. Consequently, our rights may not be infringed upon in a strict sense. However, this assertion raises ethical considerations.

Lastly, as of this note on March 30, 2024, it is surprising that Surfaces and Interfaces has yet to remove the first AI-generated sentence from the Introduction, even after 38 days. Similarly, after 22 days, Radiology Case Reports has not addressed the last paragraph of the Discussion, which totals 90 words and is entirely AI-generated. What does this delayed or absent response from journals and publishers signify? It implies a lack of concern, with the implicit message being one of indifference.

Opinion by ACM02:

A few thoughts come to mind on reading these case reports. Clearly, the first example is not a great paper, but it has other problems of grammar and writing that seem unlikely to have been introduced by an AI like ChatGPT. For instance, in the methods, some sentences are missing a verb: ChatGPT is very good about always having a subject and a verb in every sentence. In other words, output from current AI is unlikely to resemble the work once generated by the MIT Computer Science and Artificial Intelligence Laboratory, whose “SCIgen” program could produce full papers (https://news.mit.edu/2015/how-three-mit-students-fooled-scientific-journals-0414).

Realistically, no AI model can “write” (let alone “author”) a paper, because these models are algorithms, not authentic intelligences. ChatGPT, for instance, can deliver only up to a certain word count, generated by an algorithm that produces statistically likely text from its available data. Another AI, called Claude, summarizes text, but it also has a limited output. Authors who are using AI to write up data they collected (as opposed to producing hoaxes like SCIgen papers) would need to give the AI specific directions and then stitch the resulting short outputs into a manuscript themselves, which requires a lot of composition work and fact-checking.

Of course, authors would need to disclose the use of AI, including in the methods (perhaps in an online appendix?), just as any other writing support must be disclosed. It might be helpful for authors writing in English as a foreign language to have guidelines for using AI to check grammar, as long as the use is disclosed.

Expecting editors and peer reviewers to detect AI use seems unrealistic: even AIs can have difficulty identifying AI-generated text. And although the example in the first case does look a little odd, there are other reasons that statement might have appeared. We should keep in mind that peer reviewers are providing a free service to the profession and that editors already have a difficult time identifying peer reviewers. Thus, the peer review role should be limited to reviewing the apparent quality of what reviewers are presented with. It is similar to problems of data integrity or data collection based on the text of a submitted manuscript: peer reviewers have the completed product, not access to every step of production.

It seems clear, however, that the peer reviewers in at least one case did not read the paper carefully enough, perhaps mostly examining the data instead of checking the text. Since the journal in question is produced by a large publishing house, one possible solution is to employ more copy editors to detect these sorts of errors and possible problems of integrity. Increasingly, though, subeditors of journals are also working for free, which limits the work that can realistically be expected of them.

As mentioned later by my colleagues, editors can develop policies for AI use and disclosure, and professional societies have an important role to play in providing guidance. The Committee on Publication Ethics (COPE) and the World Association of Medical Editors (WAME) guidance on the use of AI are good models that other editors can adopt or adapt. Authors should be asked to make specific disclosures, which could be built into the journal submission system. I realize that the submission process is already sometimes complicated, but asking a direct question about AI use is a good way to remind authors that they need to disclose all writing support, as well as any other use of AI. This kind of disclosure would also provide an opportunity to differentiate the use of AI as a scientific method, as an editorial tool, or as a reference generator; a minimal sketch of what such a question might look like follows.
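As an illustration only, the sketch below shows one hypothetical way a submission system could record such an AI-use disclosure, distinguishing AI as a scientific method, an editorial tool, or a reference generator. The field names, categories, and the AIDisclosure structure are assumptions made for this example and are not taken from any real submission platform.

# Hypothetical sketch of a submission-system AI-use disclosure (not any real system's schema).
from dataclasses import dataclass, field
from enum import Enum
from typing import List

class AIUseCategory(Enum):
    SCIENTIFIC_METHOD = "AI used as part of the research methodology"
    EDITORIAL_TOOL = "AI used to improve language and readability"
    REFERENCE_GENERATOR = "AI used to generate or format references"

@dataclass
class AIDisclosure:
    used_ai: bool                                   # the direct yes/no question asked at submission
    categories: List[AIUseCategory] = field(default_factory=list)
    tools_and_versions: str = ""                    # e.g. "ChatGPT (GPT-4), March 2024"
    statement: str = ""                             # free-text statement for the acknowledgments/methods

# Example of what a submitting author might declare:
disclosure = AIDisclosure(
    used_ai=True,
    categories=[AIUseCategory.EDITORIAL_TOOL],
    tools_and_versions="ChatGPT (GPT-4)",
    statement="AI was used only to improve the language and readability of the Introduction.",
)
print(disclosure)

Captured this way, the declaration could flow straight into the published acknowledgments or methods section, making the disclosure visible to editors, reviewers, and readers alike.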

In closing, I will add that AIs can be used in many different ways as research tools as well as in generating text or as editorial tools. Publishers, for instance, are increasingly using AIs to format references and citations in scholarly books. Therefore, I think it is important to be specific about what kinds of AI use are permissible and how best to disclose appropriate uses of AI.

Opinion by ACM03:

Thank you for reaching out regarding this matter.  I deeply value these exchanges of ideas among my esteemed colleagues, and I find these insights about our exponentially developing field absolutely invaluable!

At the current state of understanding of AI, there is likely to be amplified outrage and social media frenzy around the seemingly unethical use of AI in content development as well as in the peer review process. However, it may be useful to recognize that many of the opinions and comments circulating may be speculative at this early juncture. The proliferation of outrage through mechanisms such as sharing, likes, comments, and reactions on social media platforms contributes to the viral dissemination of information and, consequently, the polarization of viewpoints on this emerging technology. Some reactions about the problematic use of AI are obvious and very understandable; for instance, there has to be strong pushback on the lack of transparency by not declaring the use of AI. It seems obvious that there is a need for further guidelines in this rapidly developing area.

 I would advocate for educational initiatives, spearheaded by organizations like ACSE, ISMPP, and DIA, to equip professionals in the publishing sphere with the knowledge and tools necessary to navigate this terrain effectively. In my opinion, adopting a measured and systematic approach that prioritizes education and guideline development is imperative.

Helpfully, many of the questions that are jarring or troubling here may not be entirely new; the pace of change, however, is a new variable we will need to learn to adapt to. The rapid pace seems to evoke severe and polarized reactions, and we can already see camps forming for and against AI in both the industry and academic sectors. Nevertheless, the obvious fact seems to be that AI use will continue to increase exponentially, and that it will drive innovation and efficiency. Along the way there will certainly be unfortunate cases in the literature such as this one. The silver lining is that the active discourse around such events is loud and messy, but it undoubtedly leads to more maturity in our understanding of this exciting new variable. I am very much looking forward to new developments in this area, and also to the guardrails we will need to develop to deal with misuse of AI during this inevitable and irreversible rapid growth.

Opinion by ACM04:

I echo my colleagues' concerns regarding the implications raised, which suggest inherent issues across various stages of the publication process. As highlighted by Lisa, one key issue revolves around the lack of disclosure of AI use. It's worth noting that the use of AI tools is not in itself problematic; indeed, such tools can be genuinely useful for authors, particularly those for whom English isn't their primary language.

Nevertheless, as the International Committee of Medical Journal Editors (ICMJE) and World Association of Medical Editors (WAME) guidelines have made clear, authors should remain responsible for the contents of their work, and any use of AI tools should be transparently disclosed (in the acknowledgments and/or methods sections) so that editors, reviewers, and readers have this information available to them. In the specific example cited, the presence of the phrase "Certainly, here is a possible introduction for your topic:" unequivocally indicates the involvement of an AI large language model (LLM); however, if these words had been deleted, it would be far less clear, and, as Lisa also commented, reliable tools to detect AI-generated text are not available to editors (in the same way as, for example, plagiarism detection tools), so editors will rely on authors making suitable disclosures. To this end, publishers and editors should make sure the need for these disclosures is clearly signposted on a journal's website, such as in the Author Guidelines, and should also be able to provide advice to authors on how to disclose usage appropriately.

Moreover, as discussed by my colleagues, there are procedural concerns surrounding the handling of the manuscript by the journal. The oversight of this particular sentence by the editor, reviewers, and copyeditors is disconcerting. While this oversight may not inherently implicate the validity of the study's results or conclusions (as Lisa comments, the reviewers may have focused on the data rather than the language), it undoubtedly undermines readers' confidence in the paper and erodes trust in the publication process at large.

Moving forward, it is incumbent upon the publisher to conduct a thorough investigation to rectify any inaccuracies or oversights identified, potentially necessitating a correction or even retraction, contingent on the investigation's findings. Additionally, there should be an emphasis on disseminating the outcomes of this investigation to educate authors, publishers, and editors alike on the appropriate and transparent use of AI technologies in scholarly publishing.

Opinion by ACM05:

I concur with the above comprehensive analysis, from the perspectives of authors, reviewers, and publishers/journals in terms of controlling the use of generative AI in scholarly writing and publishing processes, while also underscoring broader considerations regarding research and publishing integrity in general. Furthermore, I wish to augment the discourse with the following observations:

In this specific case, I maintain that the authors should have used the generative AI (ChatGPT) tool primarily for English editing and writing assistance, rather than for generating the entire experimental results and the article itself. At the very least, they should have expunged the ChatGPT dialogue phrases from the Introduction and transparently disclosed the AI usage in crafting their paper. Such disclosure would have elucidated their limitations in language proficiency (particularly for the first author, who is usually a graduate student, and the supervising co-author, who failed to perform a final check on the paper).

The journal/publisher indeed has guidelines for the declaration of generative AI in scientific writing in its author instructions (https://www.elsevier.com/about/policies-and-standards/publishing-ethics#Authors), which state: 'The PUBLISHER's AI author policy states that authors are allowed to use generative AI and AI-assisted technologies in the writing process before submission, but only to improve the language and readability of their paper and with the appropriate disclosure, ...' Regrettably, the journal's editorial and production editors failed to scrutinize the paper meticulously enough to identify such issues and lacked the technical measures to address matters of this kind. If properly disclosed, and in the absence of plagiarism and with proper referencing of external sources, the use of new technologies, including generative AI and AI-assisted technologies, in the writing process before submission solely to improve the language and readability of the paper should be acceptable, just as more and more new technologies are being used in the research itself.

Although this obvious problem should have also been identified by the reviewer(s), their primary task is to check and confirm the quality of the paper in terms of sound science and proper presentation.

This again suggests that all parties involved in the peer review process should improve their performance - journals/publishers should implement stricter peer review procedures and provide all possible cutting-edge technical support, and the reviewers should pay more attention to verifying the quality of the paper in terms of science and presentation, as well as providing comprehensive and constructive comments for the authors to improve. It also underscores the importance of adopting more open, transparent, and interactive peer reviews in science and its publication.

Finally, even major (mostly commercial) STEM publishers, though generally regarded as large and reputable because of their vast journal portfolios, broad subject coverage, and leadership in publishing technologies, effectively run their business through individual journal operations, which vary across subject areas, with some journals edited in-house and others being partnership journals edited externally. All of these factors can challenge their daily publishing activities, including, but not limited to, the specific cases picked out here.

Opinion by ACM06:

I appreciate the council’s initiative in addressing this crucial issue. An additional, yet interconnected, concern pertains to the evolving evaluation criteria for published papers. There is an ongoing discussion surrounding the evaluation of Open Educational Resources (OER) content, with a shift towards advocating for qualitative metrics alongside traditional quantitative measures. Presently, our reliance leans heavily toward citation scores and numerical counts, with less emphasis on other evaluative dimensions. This topic currently commands significant attention within Slovenia, reflecting the broader international conversation on enhancing the assessment framework for scholarly outputs.

I would start with the current state of the AI discussion. Presently, a notable challenge lies in the rapid advancement of AI technology, which is outpacing the development of clear regulatory frameworks and guidelines governing its usage. Undoubtedly, researchers (or at least some of them) use AI tools to speed up the writing of articles for publication. The tools are out there, waiting to be used. Many AI tools no longer just help with the literature review; they also perform statistical analysis on the data provided and produce an interpretation and conclusion. To list a few: Scopus AI, Elicit, ResearchRabbit, PowerDrill, SciSpace, Perplexity, Semantic Scholar, Jenni.ai, Scholarcy, Scite, Consensus, Litmaps, GraphMaker, and Bard; and, for writing, numerous apps paraphrase text, such as Grammarly, Instatext, DeepL, and Quillbot.

However, publishers (as I learned, though this may differ from journal to journal) do not necessarily object to text enhancement through these tools. I also believe that we cannot prevent authors from using them, but I think we need clear regulations on how to use them.

Authors face the dual pressures of meeting academic benchmarks for advancement, such as habilitation, while contending with vague journal policies regarding AI tool utilization in manuscript preparation. These policies are still in the early stages. Editors, too, are left to research these matters and devise solutions on their own in order to make responsible decisions about the review process.

It is paramount to familiarize ourselves with the functionality of AI tools and the distinctive content and writing style they generate. I also think that in a short time it will be impossible to distinguish between human-written text and AI-generated text; we simply do not have the resources that could help us detect AI-written text. Moreover, the exponential surge in manuscript submissions over recent years exacerbates the strain on the review process, with a tripling observed within the past three years alone.

 I wish to clarify that my intention is not to absolve editors of accountability in the presented case studies, as these instances have underscored notable deficiencies in the review process.  However, I anticipate a heightened likelihood of similar occurrences if corrective measures are not instituted. This necessitates the introduction of additional review stages to monitor for inconsistencies, currently lacking within the process. Yet, while editors can propose amendments, the implementation of necessary upgrades to submission systems lies within the purview of publishers, a process likely to entail a considerable timeframe.

I am somewhat taken aback by the publisher's failure to retract the paper thus far, particularly when considering past instances where papers were retracted for seemingly minor reasons. The following papers authored by Macchiarini have been retracted:

November 2012: retracted by the journal for copying a table from another paper without citing it.

March 2017: retracted by the authors after Karolinska requested retraction in December 2016, and after Nature had issued an editorial notice of concern in October 2016.

Macchiarini's 2011 Lancet paper described the treatment of Beyene. In February 2016 the Royal Swedish Academy of Sciences called for the Lancet to correct the paper, as Beyene had died; in March 2016 four authors asked to be removed as authors, and in April 2016 the Lancet issued a notice of concern. This paper too has since been retracted.

October 2023: the 2008 Lancet paper was retracted by the journal for falsification, including the claim that 'the graft immediately provided the recipient with a functional airway, improved her quality of life, and had a normal appearance and mechanical properties at 4 months'.

October 2023: the 2013 Lancet paper was retracted by the journal for containing fabrication and falsification in several places, and three falsified figures.

Opinion by ACM07:

When this article was brought to my attention and I was asked how it could have gotten past the peer reviewers, I countered with, “How did it get past the copyeditors?” While this was just playful ribbing, I think we both had a point, which has also been expressed by others in this discussion. There are many touch points in the publication process designed to validate, fact-check, and correct content, and each of those touch points failed.
As someone who works on the technology side of scholarly publishing, I can confirm that the tools are not yet mature enough to easily catch AI-generated content in the same way that there are tools for plagiarism checking, methodology checking, and reference checking. However, it is just a matter of time. Image checking has a similar problem. There are tools to catch image manipulation, but AI-generated images are not manipulated, which means that catching fake images is much more difficult than catching altered images. 

The use of AI is not new; we all use spell check, and as AI gets more integrated into the writing tools we use every day, it is going to become a generally accepted part of the writing process. The concern should not be that an AI tool was used to prepare research for publication; the concern should be that an AI tool has made up the research out of whole cloth. The impulse to create fake science comes out of the publish-or-perish culture in academia. I recently participated in a peer review innovations workshop where the topic of research integrity became a major talking point. The group consisted mostly of publishers and editors, who felt that institutions, like universities and research centers, needed to take responsibility for their researchers’ actions, since those institutions ultimately have the most influence over researcher careers and funding sources.

Having worked for 30 years in peer review, as a desk editor, and as a systems provider, I also agree that the peer review process is extremely challenging due to the excess of research papers and the tendency of editors to utilize the same pool of reviewers again and again. There is a need to diversify the reviewer pool, which would not only increase the number of available reviewers but would also bring in new and diverse voices. We need collaborative peer review, where senior researchers work with junior researchers, openly and formally, to teach them how to peer review. There are some good examples of peer review services doing some of this today, for example PREreview, Review Commons, PCI, and PreLights.

Finally, some additional strategies for combating AI-generated research include FAIR and open data sharing, Registered Reports, transparent peer review, and documentation of the editorial process through a mechanism like DocMaps.

Acknowledgment:

We extend our sincere appreciation to the ACSE advisors Dikran Toroser, Haseeb Md. Irfanullah, Laura Dormer, Lisa DeTora, Mingfang Lu, Tony Alves and Vesna Skrbinjek for contributing their insights on this critical issue. We hope that their feedback will prove invaluable in addressing similar challenges in the future.

Keywords

AI-Generated Publications, ACSE Advisory Cabinet, Retraction Watch, Elsevier
