Studying LLM Usage in Software Engineering (S5)

Description

Studying how software engineers use LLMs and LLM-based tools is important for understanding the current state of practice in SE. Researchers can observe how software engineers use LLM-based tools in the field, or study whether and how they adopt such tools, which usage patterns emerge, and which benefits and challenges they perceive. Surveys, interviews, observational studies, or analysis of usage logs can provide insights into how LLMs are integrated into development processes, how they influence decision making, and what factors affect their acceptance and effectiveness. Such studies can inform improvements to existing LLM-based tools, motivate the design of novel tools, or derive best practices for LLM-assisted software engineering. They can also uncover risks or deficiencies of existing tools.

Example(s)

Based on a convergent mixed-methods study, Russo (2024) found that early adoption of generative AI by software engineers is primarily driven by compatibility with existing workflows. Khojah et al. (2024) investigated the use of ChatGPT (GPT-3.5) by professional software engineers in a week-long observational study. They found that most developers do not use the code generated by ChatGPT directly but instead use the output as a guide to implement their own solutions. In a case study, Azanza et al. (2024) found that LLMs could “enhance personalized, instant onboarding support; however, relying on proprietary external LLMs poses significant data privacy risks”. Jahic and Sami (2024) found that most participants from the 15 software companies they surveyed had already adopted AI (especially ChatGPT) for SE tasks, but cited copyright and privacy issues, as well as inconsistent or low-quality outputs, as barriers to adoption.

Retrospective studies that analyze data generated while developers use LLMs can provide additional insights into human-LLM interactions. For example, researchers can employ data mining methods to build large-scale conversation datasets, such as the DevGPT dataset introduced by Xiao et al. (2024). Conversations can then be analyzed using quantitative (Rabbi et al. 2024) and qualitative (Mohamed, Parvin, and Parra 2024) analysis methods.
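As an illustration of what a simple quantitative pass over mined conversations might look like, the following minimal Python sketch computes a few descriptive statistics. The record layout (a conversation as a list of turns with `prompt` and `answer` fields), the `summarize` function, and the sample data are hypothetical and chosen only for illustration; they do not reflect the DevGPT schema or the analyses in the cited studies.

```python
from statistics import mean

# Hypothetical record layout: each conversation is a list of turns, and
# each turn is a dict with "prompt" and "answer" strings. This is NOT the
# DevGPT schema; adapt the field names to the dataset at hand.
FENCE = "`" * 3  # Markdown code-fence marker, used as a rough proxy for "contains code"


def summarize(conversations):
    """Return simple descriptive statistics for a list of conversations."""
    turns = [len(conv) for conv in conversations]
    answers = [turn["answer"] for conv in conversations for turn in conv]
    with_code = sum(1 for answer in answers if FENCE in answer)
    return {
        "conversations": len(conversations),
        "mean_turns": mean(turns) if turns else 0.0,
        "share_answers_with_code": with_code / len(answers) if answers else 0.0,
    }


# Made-up sample data, for illustration only.
sample = [
    [{"prompt": "How do I reverse a list?",
      "answer": "Try " + FENCE + "python\nxs[::-1]\n" + FENCE}],
    [
        {"prompt": "What is a mutex?",
         "answer": "A lock that enforces mutual exclusion."},
        {"prompt": "Can you show an example?",
         "answer": FENCE + "c\npthread_mutex_lock(&m);\n" + FENCE},
    ],
]

print(summarize(sample))
```

Counts like these only describe surface properties of the interactions. Interpreting why developers prompt the way they do would still require qualitative analysis of the conversation content.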

References

Azanza, Maider, Juanan Pereira, Arantza Irastorza, and Aritz Galdos. 2024. “Can LLMs Facilitate Onboarding Software Developers? An Ongoing Industrial Case Study.” In 36th International Conference on Software Engineering Education and Training, CSEE&T 2024, 1–6. IEEE. https://doi.org/10.1109/CSEET62301.2024.10662989.

Jahic, Jasmin, and Ashkan Sami. 2024. “State of Practice: LLMs in Software Engineering and Software Architecture.” In 21st IEEE International Conference on Software Architecture, ICSA 2024 - Companion, Hyderabad, India, June 4-8, 2024, 311–18. IEEE. https://doi.org/10.1109/ICSA-C63560.2024.00059.

Khojah, Ranim, Mazen Mohamad, Philipp Leitner, and Francisco Gomes de Oliveira Neto. 2024. “Beyond Code Generation: An Observational Study of ChatGPT Usage in Software Engineering Practice.” Proc. ACM Softw. Eng. 1 (FSE): 1819–40. https://doi.org/10.1145/3660788.

Mohamed, Suad, Abdullah Parvin, and Esteban Parra. 2024. “Chatting with AI: Deciphering Developer Conversations with ChatGPT.” In 21st IEEE/ACM International Conference on Mining Software Repositories, MSR 2024, Lisbon, Portugal, April 15-16, 2024, edited by Diomidis Spinellis, Alberto Bacchelli, and Eleni Constantinou, 187–91. ACM. https://doi.org/10.1145/3643991.3645078.

Rabbi, Md. Fazle, Arifa I. Champa, Minhaz Fahim Zibran, and Md. Rakibul Islam. 2024. “AI Writes, We Analyze: The ChatGPT Python Code Saga.” In 21st IEEE/ACM International Conference on Mining Software Repositories, MSR 2024, Lisbon, Portugal, April 15-16, 2024, edited by Diomidis Spinellis, Alberto Bacchelli, and Eleni Constantinou, 177–81. ACM. https://doi.org/10.1145/3643991.3645076.

Russo, Daniel. 2024. “Navigating the Complexity of Generative AI Adoption in Software Engineering.” ACM Transactions on Software Engineering and Methodology 33 (5): 1–50.

Xiao, Tao, Christoph Treude, Hideaki Hata, and Kenichi Matsumoto. 2024. “DevGPT: Studying Developer-ChatGPT Conversations.” In 21st IEEE/ACM International Conference on Mining Software Repositories, MSR 2024, Lisbon, Portugal, April 15-16, 2024, edited by Diomidis Spinellis, Alberto Bacchelli, and Eleni Constantinou, 227–30. ACM. https://doi.org/10.1145/3643991.3648400.