
ChatGPT vs accounting students: who comes out on top?


Tāmaki Makaurau – University of Auckland students outperformed ChatGPT overall in a large crowd-sourced study using more than 25,000 questions from the accounting assessments of 186 institutions.

The study also found that the artificial intelligence tool sometimes fabricated facts, made nonsensical errors such as adding two numbers in a subtraction problem, and often provided descriptive explanations for its answers even when they were incorrect.

The study’s 328 co-authors from around the world, including University of Auckland accounting and finance academics Ruth Dimes and David Hay, entered assessment questions into ChatGPT-3 and evaluated the accuracy of its responses between December 2022 and January 2023.

Ruth Dimes, who directs the University's business master's programme, used two recent exams from the analysing financial statements course.

“I entered the exam questions into ChatGPT and recorded how it performed compared to the students’ grades. My findings were consistent with the study overall and I was surprised that ChatGPT didn’t perform as well as I thought it might have,” she says.

Professor of Auditing David Hay used exam and test questions from the auditing course and found that the bot performed slightly better on auditing material than on financial accounting material, but still not as well as the students.

The study, led by Professor David Wood of Brigham Young University in Utah, includes a total of 25,817 questions (25,181 gradable by ChatGPT) that appeared across 869 different class assessments.

The study also posed 2,268 questions from textbook test banks covering topics such as accounting information systems (AIS), auditing, financial accounting, managerial accounting, and tax.

The co-authors evaluated ChatGPT’s answers to the questions they entered and determined whether they were correct, partially correct, or incorrect.

The results indicate that across all assessments, students scored an average of 76.7 percent, while ChatGPT scored 47.4 percent based on fully correct answers.

However, when given some credit for partially correct answers, ChatGPT would have scraped through many courses with an overall average of 56.5 percent.
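The gap between the two averages comes down to how partially correct answers are weighted. The study's exact weighting scheme isn't stated in this article, so the sketch below assumes half credit per partially correct answer purely for illustration; the counts are hypothetical, chosen only to reproduce the reported percentages.

```python
# Sketch of partial-credit scoring. Assumption: a partially correct answer
# earns half a point (the study's actual weighting is not given here).
def average_score(correct, partial, total, partial_weight=0.5):
    """Percentage score when partial answers earn a fraction of a point."""
    return 100 * (correct + partial_weight * partial) / total

# Hypothetical counts for illustration only:
full_only = average_score(correct=474, partial=0, total=1000)       # 47.4
with_partial = average_score(correct=474, partial=182, total=1000)  # 56.5
```

With strict grading only the 474 fully correct answers count (47.4 percent); adding half credit for 182 partially correct answers lifts the average to 56.5 percent, matching the article's figures under this assumed scheme.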

The study also revealed differences in ChatGPT’s performance based on the topic area of the assessment. Specifically, the chatbot performed relatively better on AIS and auditing assessments compared to tax, financial, and managerial assessments.

Dimes says she’s interested in seeing how newer versions of ChatGPT and other AI tools would perform if a similar study were undertaken at another point in time.
