Author's Commentary on "AI-Generated Diet for Pregnant Women"
As artificial intelligence applications based on Large Language Models (e.g., ChatGPT and Elicit) become prevalent in research, discussing their possible benefits and shortcomings is crucial. In this scenario, DietGPT combines subjects’ anonymized medical information and food intake with insights from existing literature to offer tailored dietary recommendations. While this is a fictitious scenario, chatbots have been used to offer dietary advice. For example, in February 2022, the National Eating Disorders Association (NEDA) launched its chatbot, called “Tessa,” to offer advice to people seeking help for eating disorders. NEDA disabled Tessa in May 2023 after some users complained about receiving harmful advice (more here).
The case prompts researchers to reflect on using software in research, modifying IRB applications, and dealing with accidental findings. By wrestling with these issues, researchers will be better prepared to navigate similar situations when they engage with software applications and will develop a deeper understanding of the challenges involved.
Below are tips for each of the discussion questions.
- Are there any requirements for using mobile applications in research?
Researchers might refer to relevant directives provided by their IRB, highlight the Terms of Service, User Agreements, and liability limitations of mobile applications, and consider FDA suggestions on using mobile medical applications.
- After being contacted by the advocacy group, Mandy thinks to herself: “It is not my fault that DietGPT’s suggestions to some participants did not work. Calculating sugar intake and generating daily meal suggestions for all participants were so complicated and time-consuming that I could not have replicated them, so there was no way I could have prevented this.” Is she right to think so?
Researchers could discuss alternatives to using DietGPT and emphasize Mandy's duty of due diligence (e.g., reading reports and reviews about DietGPT, asking colleagues for their views, testing DietGPT and comparing its conclusions with those of similar applications).
- When researchers use mobile applications, who is responsible for harmful suggestions and conclusions and why? What could Mandy have done to mitigate risks?
Researchers could point to tools' lack of agency and consciousness to stress that, when employing any software or mobile application, users are ultimately responsible. Furthermore, researchers could emphasize Mandy's duty of due diligence to mitigate risks.
- Suppose the advocacy group determines that DietGPT has provided culturally insensitive suggestions, leading to their dismissal by non-American mothers. How might this conclusion influence our perspective on the adequacy of Mandy's research design in ensuring participants’ safety?
Researchers could discuss Mandy’s responsibility to recognize and anticipate DietGPT's limitations and to communicate them to participants in a timely manner. For example, Mandy could have incorporated various strategies into her research protocol to evaluate how closely subjects followed DietGPT's suggested meal plans (e.g., using regular check-ins and surveys).
- Non-publication of results (the so-called file drawer problem) is unethical and considered wasteful (see here). However, Mandy’s findings were tangential to her main research question, so this is not a clear-cut case. What would you do if you were in Mandy’s shoes?
Researchers could highlight the possibility of using alternative publication outlets that encourage publishing unexpected results, with the aim of raising awareness about DietGPT’s limitations. Additionally, reaching out to colleagues who work on preterm birth to seek advice and explore the possibility of larger follow-up studies would be helpful.
- Mandy has three reasons for not wanting to publish her results. Are her concerns valid?
Researchers can refer to the IRB modification process (also called IRB amendment) and relevant institutional regulations (see, e.g., NIH Policy 3014-204). Researchers can also share experiences of collaborating with colleagues who were not involved in a study from the start, and of the challenges of dealing with accidental findings or results that were not based on a hypothesis being tested.
- The case indicates that Mandy has not lied. Even in her response to the advocacy group, she dodged the question without lying per se. Can you name any flouted moral principles other than honesty and highlight their significance for research?
Researchers can use various frameworks suggested for the ethical conduct of research to identify relevant principles. Examples include the Office of Research Integrity's shared values and the ethical principles of research offered by David Resnik or by Beauchamp & Childress.
- If DietGPT’s suggestions had decreased the likelihood of preeclampsia, Mandy would publish her results. What would be the ethical way to acknowledge DietGPT’s contribution?
Software applications (including chatbots) should be cited in-text and mentioned among references, but they cannot be authors or be mentioned in the acknowledgements (see here).
- Can you think of any ethical norms that mandate Mandy to publish her tangential results about preterm birth?
Researchers can use various frameworks suggested for the ethical conduct of research to identify relevant ethical values and principles. Examples include the Office of Research Integrity's shared values and the ethical principles of research offered by David Resnik.
- Can you think of any ethical norms that mandate Mandy to communicate her tangential results about preterm birth to participants? White American mothers were not negatively affected. Should they also be informed?
Researchers can discuss the erosion of society’s trust in science and use various frameworks suggested for the ethical conduct of research to identify relevant principles. Examples include the Office of Research Integrity's shared values and the ethical principles of research offered by David Resnik.