1 Introduction

Concerns about job degradation and unemployment due to the introduction of AI-related technology have intensified with the emergence of Large Language Models (LLMs), which are portrayed as capable of replacing humans in key aspects of mental work, and therefore of replacing even humans with advanced training. Questions about the replacement of humans by computing artifacts of all kinds are not new; they emerged concurrently with capitalist modernity [1]. There are, however, important differences. Of direct relevance to ethics is the difference between an earlier period of electronic computing, in which the technologically-induced loss of jobs and the impoverishment of their content were considered matters of ‘social’ policy, and a recent one, in which the discussion of this issue is overwhelmed by an interest in the rapidly expanding field known as ‘ethics of technology’, which has AI ethics at its core. As some of us have recently argued, this transition overlaps with the change from ideologizing the end of work in the context of a passage to a ‘postindustrial society’ to talk about a ‘4th Industrial Revolution’. Of special interest to the issue of technologically-induced loss/degradation of jobs is the associated passage from a welfare to a neoliberal version of capitalism, inseparable as this has been from the shifting of the issue from social policy to ethics of technology, and from the rhetoric about the coming of the postindustrial order to the announcement of one more industrial revolution [2].

The present article seeks to introduce one additional dimension of this shift, which pertains not only to posing questions about how the use of intelligent machines will change work, but also to the possibility of finding it increasingly difficult to avoid the use of these machines when formulating those very questions. The shock that the introduction of ChatGPT, an exemplar of LLMs and of AI more generally, seems to have generated, even among the protagonists of the AI scientific and engineering community [7], suggests that the prospect of having to use a ChatGPT-type technology in order to formulate the questions to be asked about that same technology may be very close, if not already here. Simply put, we may soon (if not already) have to work using ChatGPTs in order to question how ChatGPTs are changing work; we may soon (if not already) find ourselves at the point where the questions about the ethics of using ChatGPTs are being shaped by the use of ChatGPTs.

To elaborate on this prospect, in this article we introduce both a ChatGPT-generated questionnaire about ChatGPT and the responses to this questionnaire from a group of humans (H) and from the use of ChatGPT (M). In order to make the comparison more reflective, we actually produced two kinds of ChatGPT replies to the questionnaire: one based on ChatGPT’s data, without the mediation of a further question on our part (machine-based answer 1, in short M1), and one after the mediation of a question that asked ChatGPT to reply as if it were a human (M2). Considering how significant the biases promoted through the black-boxed design of AI may be, having to rely on the use of AI for answers about the ethics of AI, and in fact even for raising questions about the ethics of AI, represents a great challenge. The present article invites attention to this challenge through a first report of, and reflection on, the answers (and questions) received by working with ChatGPT on a questionnaire about the loss/degradation of jobs due to the introduction of ChatGPT.

Artificial Intelligence is by now presented as a solution to all problems, from planetary environmental degradation [11] to global health crises [12]. However, as has been well documented, AI is shaped by the social, political and economic environment from which it emerges (and within which it is embedded), and therefore it ought not to be perceived as neutral [13, 14]. Garvey refers to AI as a “suite of techniques intended to make machines capable of performing tasks considered ‘intelligent’ when performed by people” [15]. As Borenstein and Howard argue, humans are the “root of the problem” when it comes to the ethical issues raised by artificial intelligence, because, like every other technology, AI is designed and used by people. Ethical issues concerning AI do not appear and disappear magically; they may stem from any of the people (designers, developers, engineers, users, etc.) involved with it in any possible way [16]. The manner in which computer engineers and scientists configure technology plays a crucial role in our daily lives [17]. At the same time, as Garvey argues, the scientists and engineers who configure AI technology tend to focus on its benefits rather than on the problems that need to be addressed [14].

As mentioned above, in former times (1960s–1990s) the discourse was about the postindustrial society that artificial intelligence would bring, whereas more recently (1990s–present) the discourse about a postindustrial era has been eclipsed and almost everyone talks about one more Industrial Revolution, namely the 4th one. In the debates about the emergence of a postindustrial society, the assumption was that humans would no longer have to work because intelligent machines would do the job for them [2]. By comparison, in discussions about a 4th Industrial Revolution, the emphasis is placed on humans losing their jobs to artificial intelligence [18]. The idea of ‘creative destruction’ that dominates these discussions is that, “roughly put, new forms of industrial production destroy previous economic structures, while creating new ones, and, thus, the created new jobs will balance out the ones lost”. Creative destruction “is accompanied by the tacit suggestion that growth is of workers’ profit and well-being”. Discourses about, first, the lack of skills and ways of overcoming it, second, the power relations inside society, and, third, suggestions for a robust public sector are almost eclipsed. In fact, “the invoking of creative destruction obliterates the discussion of the accompanying power structure” [2].

Concerns about the prospect of the replacement of humans by intelligent artifacts, like the ones expressed in reference to recent and emerging language models, are actually as old as capitalism. Presenting computing artifacts as capable of artificial intelligence has a much deeper past than normally assumed, with artificial intelligence having been mechanical (in the age of steam) and electrical (in the age of electricity) before it became electronic. From early on in industrial modernity, computing artifacts were presented as intelligent, while those who worked with them were portrayed as their mere “attendants”, “keepers” and “operators”. Indicatively, state-of-the-art electrification computers of the interwar period (e.g., “calculating boards”, “artificial lines” and “network analyzers”) were strongly ideologized as thinking machines [1, 19].

We here focus on the chatbot ChatGPT and its presentation in discourses concerning quality of work and unemployment due to AI. ChatGPT is a “model using Reinforcement Learning from Human Feedback (RLHF)” that “interacts in a conversational way” and is based on the GPT-3 model designed by OpenAI [20]. With the use of ChatGPT, human-like text can be generated in response to human input, which is why it is discussed as a tipping point for AI [21]. ChatGPT’s responses are considered appropriate for an expanded range of uses, customer service and support, content creation, and language translation being only some of them [22].

There is, indeed, a rapidly growing interest in the way that the use of ChatGPT may degrade the quality of jobs and cause unemployment. In December 2022, Nature hosted two articles, entitled “Are ChatGPT and AlphaCode going to replace programmers?” and “AI bot ChatGPT writes smart essays—should professors worry?” [23, 24]. In the first, Castelvecchi criticizes the idea that the use of ChatGPT has already reached a point where it could be a threat to people’s jobs, while in the second, Stokel-Walker discusses the ability of someone to use ChatGPT to write essays and highlights the risk that this might entail for education. The question that Stokel-Walker tries to answer is whether ChatGPT could actually be used to write assignments that professors could not recognize as having been written with the use of AI. Starke, in an article entitled “No, the new AI chatbot ChatGPT won’t take your job”, published in ScienceNorway, argues that the complex questions of an exam cannot actually be answered with the use of ChatGPT. In his view, professors would have no problem telling whether an essay has been written by humans alone or by humans who have used ChatGPT [25].

Discourses and questions about the possible domination of ChatGPT in fields such as customer service, content creation, marketing, journalism, media, editing, writing, and art are by now clearly in the foreground. Despite the short time that ChatGPT has been available (it was released on the 30th of November 2022 [23]), a wealth of articles has already been published regarding its risks and, more specifically, regarding how it could lead to highly increased levels of unemployment and job degradation. In a Search Engine Journal article titled “Will ChatGPT Take Your Job?”, Frederick agrees that machines will take our jobs, but argues that this will happen in the distant future; for now, in his view, the use of AI and ChatGPT will only help us do our jobs “with much greater efficiency and effectiveness” [26]. Marr, in an article titled “How Will ChatGPT Affect Your Job If You Work In Advertising And Marketing?”, published in Forbes, advises people to adopt AI tools such as ChatGPT in their jobs, claiming that such adoption is the only way to stay competitive and keep up with future needs [27]. In an article titled “ChatGPT and other AI apps are going to create new winners and losers in the job market”, published in Telecoms, Wooden argues that the use of AI systems such as ChatGPT will lead to the replacement of some jobs, but also to the creation of others [28]. In a New York Times article titled “Does ChatGPT Mean Robots Are Coming For the Skilled Jobs?”, Krugman argues that AI is replacing jobs that demand manual labor, but could also become a “knowledge worker”. Even though he believes that AI will “improve our lives in general”, he is concerned that many people may end up unemployed due to AI systems such as ChatGPT [29].

Sanders and Schneier suggest that “if we’re lucky, maybe this kind of strategy-generating A.I. could revitalize the democratization of democracy by giving this kind of lobbying power to the powerless” [30]. In their view, the impact of AI systems such as ChatGPT on unemployment is not a matter of luck; it depends on the way that these systems are both designed and used by society. Telving likewise argues that AI systems like ChatGPT are not independent of society. The issue, therefore, is how citizens may ask the right questions in order, first, to make organizations and engineers design and develop, and, second, to make governments regulate, technology in such a way that there will be “time to think before launch”. Society should not perceive technology merely as a tool to be used; it should focus on the whole of the process of designing technology [31].

We may conclude this introduction by emphasizing that the rhetoric about the replacement of humans by artificial intelligence, in this case that of ChatGPT, has to be measured against stories showing that new forms of intelligent human work are indispensable for making ChatGPT usable; for example, the story that connects the extremely high profits expected by the company that owns ChatGPT to the equally extremely low wages of the Kenyan workers employed to cleanse ChatGPT of toxic content. This is the story that Perrigo tells in an article titled “OpenAI Used Kenyan Workers on Less Than $2 Per Hour to Make ChatGPT Less Toxic”, which suggests that prospective profits from ChatGPT presupposed the exposure of African workers to an extremely toxic mental work environment, given that these workers were paid $1.32 to $2 per hour to clean ChatGPT of inappropriate content. This content was so mentally toxic that many of the workers (in this case from Kenya) exposed to it ended up suffering from mental breakdowns [32]. This story matches the one told by Hecht, who has shown that the profit from nuclear electricity plants operating in countries of the Global North presupposed the concealment of the extremely toxic manual labor to which African workers (from South Africa) were exposed in the Global South, in the context of mining the uranium needed to run these plants [33]. The story of the Kenyan ChatGPT workers adds legitimacy to the strong concerns that Edwards expressed regarding the accountability and transparency of LLMs in his “ChatGPT: An Author Without Ethics” [34]. We may just add that stories like this suggest that ChatGPT reproduces a pattern of successfully presenting a (digital) computing machine as intelligent only by concealing the (analog) human labor without which it could not be run; in this case, the human labor needed to filter out the toxic content that would make ChatGPT problematic [19].

2 Methods

2.1 Research question

Articles in science news magazines like Nature, and in influential newspapers like The New York Times, suggest that AI systems such as ChatGPT will lead to unemployment and to a loss in the quality of jobs. By drawing up a questionnaire with the use of ChatGPT, and by comparing the answers given by humans (H) with those given by means of ChatGPT (M), we sought to elaborate on how humans and ChatGPT compare as regards their answers about unemployment and job degradation. We do so by taking advantage of a further comparison, namely that between the replies to the questionnaire produced by means of ChatGPT when relying only on its data (M1) and those produced by means of ChatGPT after asking it to imitate a human (M2).

2.2 Questionnaire design

The questionnaire was prepared with the use of ChatGPT. We started by asking questions about possible unemployment and loss in the quality of jobs. Taking into account the ChatGPT responses to these questions, we moved on to ask ChatGPT to provide us with a full multiple-choice questionnaire consisting of 10 questions, with at most 5 possible answers to every question. Humans could select more than one answer, where appropriate. Notably, the questionnaire designed with the use of ChatGPT (Table 1) does not meet the requirements of a five-point Likert scale, in which every question should have exactly 5 possible answers [35]. Limiting as the lack of a Likert scale may be, we decided to go along with it, given that our purpose was to point to the possible limits of relying on ChatGPT. As mentioned in the introduction, we decided to ask ChatGPT to respond to the questionnaire twice. When first asked to respond to the questionnaire, ChatGPT replied: “my opinions are generated from the data that I've been trained on, and may not necessarily reflect the opinions of a human”. To probe how this affected the way the questionnaire was answered, we asked ChatGPT to reply to the questionnaire once more, this time by imitating a human. The comment from ChatGPT before doing so was: “If I were an average person answering this questionnaire, my answers might be different based on my personal experiences, knowledge, beliefs and opinions… but I can provide a general perspective”.
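Since we worked through the ChatGPT web interface, the exchange just described cannot be replayed exactly. Still, for readers who wish to reproduce a similar pipeline programmatically, the following is a minimal, hypothetical sketch, assuming the `openai` Python package's chat interface (as offered in early 2023); the prompts shown are paraphrases for illustration, not the exact wording we used.

```python
# Hypothetical sketch of the three-step exchange: questionnaire
# generation, M1 answers (model's own data), M2 answers (imitating
# a human). Assumes the early-2023 `openai` package and an API key
# in the OPENAI_API_KEY environment variable; prompts are paraphrased.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

def ask(prompt: str) -> str:
    """Send one prompt to the chat model and return its reply."""
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return response["choices"][0]["message"]["content"]

# Step 1: have the model draft the questionnaire itself.
questionnaire = ask(
    "Write a multiple-choice questionnaire of 10 questions about the "
    "impact of language models like ChatGPT on unemployment and the "
    "quality of jobs, with at most 5 possible answers per question."
)

# Step 2 (M1): the model answers based only on its own data.
m1 = ask("Answer the following questionnaire:\n" + questionnaire)

# Step 3 (M2): the model answers again, imitating a human respondent.
m2 = ask("Answer the following questionnaire as if you were an "
         "average person:\n" + questionnaire)
```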

Table 1 The questionnaire designed with the use of ChatGPT, which was answered by humans (H) and, twice, by means of ChatGPT: first based on its data (M1), and second by imitating a human (M2)

2.3 Sampling and data collection

Information about what ChatGPT is was provided in the introduction to the questionnaire, based on a text generated using ChatGPT, so that even those who were unfamiliar with ChatGPT would know the context. The first question was, in fact, about whether respondents knew about ChatGPT or not. The questionnaire was generated with the use of ChatGPT in January 2023. After the questionnaire took its final form (that of Table 1), it was posted online by means of Google Forms. The questionnaire, communicated via social media, received 216 anonymous responses during January and February 2023, from respondents in Europe and North America. They formed a group that was rather heterogeneous in terms of age, sex and occupation, with most of its members having advanced training. We treat this as a group of human respondents that can help us point to the need for the kind of comparisons we propose, leaving open the potential for studies that enrich this approach with respondent groups more representative of both global and local contexts. The data collected were analyzed using statistical software (Microsoft Office Excel) and the results are reported as numerical values and percentages in the “Results and discussion” section.
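The analysis itself amounts to simple frequency counts. As an illustration, the following is a minimal sketch of the tallying behind the percentages reported below, assuming a hypothetical CSV export of the Google Forms responses; the file name and column names are stand-ins, not our actual export.

```python
# Minimal sketch of the tallying behind the reported percentages.
# "responses.csv" and the column names ("Q2", "Q3") are hypothetical
# stand-ins for the Google Forms export; multi-select answers are
# assumed to be separated by ";" within a cell.
import csv
from collections import Counter

def tally(filename: str, question: str, multi: bool = False) -> dict:
    """Return each answer's share of respondents, in percent."""
    counts: Counter = Counter()
    total = 0
    with open(filename, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            total += 1
            cell = row[question]
            answers = cell.split(";") if multi else [cell]
            counts.update(a.strip() for a in answers if a.strip())
    # For multi-select questions (e.g., Question 3) the shares can
    # sum to more than 100%.
    return {a: round(100 * n / total, 1) for a, n in counts.items()}

print(tally("responses.csv", "Q2"))
print(tally("responses.csv", "Q3", multi=True))
```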

3 Research ethics

This research was conducted so as to fully guarantee respondent anonymity and personal data confidentiality. Informed consent was obtained from each respondent. Before answering the questionnaire, respondents were provided with sufficient information about the research, the data to be collected, and how these data would be used and stored. They were also asked to give their consent to the future anonymous use of their data for research purposes. Additionally, respondents were informed that the questionnaire they were asked to answer had been generated with the use of AI (ChatGPT), as well as of their rights to withdraw from the study and to have their data deleted or anonymized should they change their minds.

4 Results and discussion

4.1 Human responses

Table 2 presents the demographic characteristics of the respondents. The mean age of the 216 respondents was 30.5 years, with a standard deviation of 9.07. In terms of gender, the majority of respondents were male (60.2%), followed by female (36.1%) and other (3.7%). All percentages have been rounded to one decimal place.

Table 2 Demographic characteristics of the respondents

With regard to education level, the majority of respondents held a graduate degree (54.6%), followed by holders of a university degree (35.2%) and high school graduates (10.2%). Regarding occupation, the largest group of respondents were professionals (e.g., physicians, lawyers and engineers) (41.7%), followed by students (25%), managers (12%), and other (21.3%).

The first question sought to determine whether respondents were familiar with ChatGPT prior to being asked to complete the questionnaire. A total of 48.1% of the respondents reported that they knew what ChatGPT is, while 51.9% reported that they did not.

Respondents were then asked to rate the extent to which they believe that language models like ChatGPT will impact the quality of jobs and unemployment in the future (Question 2). A total of 46.3% reported that they believe such models will strongly impact the quality of jobs and unemployment, which suggests that a significant portion of the respondents are aware of the potential disruption that language models may cause in the job market. Another 41.2% reported that the use of language models like ChatGPT will affect the future quality of jobs and level of unemployment to some extent. The remaining 12.5% reported that they believe language models like ChatGPT will affect the quality of jobs and unemployment either only slightly or not at all, which suggests some uncertainty or skepticism among respondents about the potential impact of the use of language models on job loss/degradation. Overall, the answers to the second question suggest that the majority of respondents believe that the use of language models like ChatGPT will have a significant impact on the quality of jobs and unemployment in the future.

Question 3 asked respondents to specify the industries expected to be affected the most; they were able to select more than one answer. The majority (82.4%) answered that language models like ChatGPT will mostly affect the customer service and support industry. The second most frequently selected industry was content creation, chosen by 49.1% of the respondents. Manufacturing, healthcare, and other industries received 21.3%, 23.1%, and 25% of the answers, respectively.

The fourth question concerned the level of concern regarding the possibility that the increased use of language models like ChatGPT would cause job reductions in specific industries. The majority of respondents stated their concern, with 50% reporting being somewhat concerned and 25.9% very concerned. However, 22.2% of respondents reported being not very concerned and 1.9% not concerned at all. This suggests that a large majority of the respondents expect a negative impact of language models like ChatGPT on employment.

Question 5 concerned the possibility that the increased use of language models would generate new job opportunities. Opinions were divided, with 45.4% answering that the increased use of language models will lead to some new job opportunities and 43.5% that they do not believe that many new opportunities will be created. Only 6.5% answered that many new opportunities will be created, while 4.6% reported that they do not believe that any new opportunities will open up at all. The answers to Questions 4 and 5 are shown in Fig. 1.

Fig. 1 Responses to questions 4 and 5 of the questionnaire

The sixth question was about how society should respond. The majority of respondents (57.4%) answered that society should respond by providing training and retraining programs as well as unemployment benefits. In addition, 41.7% replied that a universal basic income should be guaranteed and 39.8% that society should support the creation of new jobs. Only 9.3% answered that the implementation of automation technology should be slowed down, whereas 2.8% selected other options. These results suggest that respondents consider retraining and unemployment benefits to be the most important responses.

Question 7 concerned the overall potential benefits and drawbacks of using language models like ChatGPT. Most respondents (63.9%) selected cost savings, increased efficiency, and improved accuracy in certain tasks as potential benefits, while 57.4% selected job displacement and the need for additional training and support for those affected by automation as potential drawbacks. Furthermore, 35.2% identified improved customer service and support as a potential benefit and 30.6% increased productivity. Only 1.9% of the respondents selected other options. These results suggest that respondents are aware of both the potential benefits and the potential drawbacks of the use of language models like ChatGPT, and, further, that in their view the benefits could outweigh the drawbacks.

The responses to the eighth question point to respondent diversity regarding awareness of specific language models like ChatGPT being used in the workforce: 28.7% were aware of pertinent examples, 38% reported not being aware of any such examples, and 27.8% replied that they had heard about it but did not have enough relevant information, while 5.5% answered that they did not know anything about it. These results indicate that knowledge about the current usage of these models is not extensive.

Question 9 concerned the responsibility of governments and organizations to mitigate the negative effects of the use of language models like ChatGPT on employment. A considerable majority (71.3%) replied that governments and organizations have a big responsibility, while 19.4% answered that they have a moderate responsibility, and 9.3% a small responsibility or none. These results suggest that the majority of respondents strongly believe that governments and organizations have a significant responsibility to address the potential negative impact of the use of language models on employment.

The last question (10) asked whether, on the whole, respondents thought that the development and use of language models like ChatGPT should be regulated to ensure that they are used ethically and responsibly. The majority of respondents (62%) strongly agreed with this statement, 24.1% agreed, 7.4% were neutral, and 6.5% disagreed or strongly disagreed. These results suggest that the majority of respondents believe that the development and use of language models like ChatGPT should be regulated. Responses to Questions 9 and 10 are shown in Fig. 2.

Fig. 2 Responses to questions 9 and 10 of the questionnaire

4.2 ChatGPT responses

The answers provided by the human group were compared to the answers provided by ChatGPT when based solely on its data (M1), and when asked to imitate a human (M2).

On the first question, about knowledge of what ChatGPT is, the M1 answer was positive while the M2 answer was negative. Recalling that 51.9% of the human group responded that they did not know what ChatGPT is before opening the questionnaire, it is the M2 answer that agreed with most of the respondents.

As regards the second question, the M1 response was that language models will significantly affect the quality of jobs and unemployment in the future, while the M2 reply was that they will do so only to some extent. In this case, the M1 response corresponds to the most popular answer in the group of humans (46.3%) and the M2 one to the second most popular answer (41.2%).

The M1 and M2 responses were the same as regards the industries expected to be affected the most (Question 3), namely customer service and support, and content creation. They agree with the answers given by the humans, 82.4% of whom selected customer service and support and 49.1% content creation.

The M1 answer to the fourth question was “not at all concerned” while the M2 answer was “somewhat concerned”. By contrast, 75.9% of the human answers were “very concerned” or “somewhat concerned”. There is here a noticeable difference between M1 and M2, and a clear difference between H and M1.

The M1 response on the potential for new opportunities for employment due to language models (Question 5) was “yes, many new opportunities”, which is the answer that only 6.5% of the respondents gave. By contrast, the M2 response was “yes, some new opportunities”, which is the answer given by the highest percentage of humans (45.4%).

The M1 and M2 answers to the question about how society should respond (Question 6) were the same, namely that society should invest in providing training programs, retraining, and unemployment benefits, and they agreed with the answer of the majority of the human respondents (57.4%).

Question 7, about potential benefits and drawbacks, was answered similarly by M1 and M2, and their answers are in agreement with the preferred H answers (63.9% expecting “cost savings, increased efficiency, and improved accuracy in certain tasks”; 57.4% “job displacement, and the need for additional training and support for those impacted by automation”).

The M1 answer to the eighth question was “yes, I am aware of examples” of language models being used in the workforce, the answer given by 28.7% of the humans. The M2 answer, “I heard about it, but I don't have enough information”, was chosen by 27.8% of the respondents. Neither, then, matched the most popular human answer, that of not being aware of any such examples (38%).

On Question 9, about the responsibility of governments and organizations to mitigate the negative effects of language models on employment, the M1 answer was “big”, like that of the majority of the humans (71.3%), while the M2 answer was “moderate”, like that of a minority of the humans (19.4%).

Finally, the M1 and M2 answers to the tenth question simply agreed that “the development and use of language models like ChatGPT should be regulated to ensure that they are used ethically and responsibly”, whereas 62% of the members of the human group had chosen “strongly agree”.

5 Conclusion

We may summarize the major agreements and differences in the responses to the questionnaire by H and M, as well as by M1 and M2, by noticing that the M1 answers agreed with the H ones in Questions 2, 3, 6, 7 and 9 but differed in Questions 1, 4, 5, 8 and 10, while the M2 answers agreed in Questions 1, 3, 4, 5, 6 and 7 and differed in Questions 2, 8, 9 and 10. As for a comparison between the M1 and the M2 answers, they differ in Questions 1, 2, 4, 5, 8 and 9. For a tabularization of these comparisons, see Table 3.

Table 3 A comparison of H, M1 and M2 responses to the questionnaire
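As a check on these tallies, the comparisons of Table 3 can be reproduced with a few lines of code. In the following minimal sketch, the short labels are our paraphrases of each question's most popular human answer (H) and of the two machine answers (M1, M2), not the exact questionnaire wording.

```python
# Minimal sketch reproducing the agreement tallies behind Table 3.
# Labels paraphrase the modal human answer (H) and the M1/M2 answers.
answers = {  # question: (H, M1, M2)
    1: ("no", "yes", "no"),
    2: ("strong impact", "strong impact", "some impact"),
    3: ("service + content", "service + content", "service + content"),
    4: ("somewhat concerned", "not at all concerned", "somewhat concerned"),
    5: ("some new jobs", "many new jobs", "some new jobs"),
    6: ("training + benefits", "training + benefits", "training + benefits"),
    7: ("savings / displacement", "savings / displacement", "savings / displacement"),
    8: ("not aware", "aware", "heard of it"),
    9: ("big", "big", "moderate"),
    10: ("strongly agree", "agree", "agree"),
}

h_m1 = [q for q, (h, m1, m2) in answers.items() if m1 == h]
h_m2 = [q for q, (h, m1, m2) in answers.items() if m2 == h]
m1_m2 = [q for q, (h, m1, m2) in answers.items() if m1 == m2]

print("M1 agrees with H on:", h_m1)    # [2, 3, 6, 7, 9]
print("M2 agrees with H on:", h_m2)    # [1, 3, 4, 5, 6, 7]
print("M1 agrees with M2 on:", m1_m2)  # [3, 6, 7, 10]
```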

First, based on the synthetic picture offered by this table, we may note that the answers produced with the use of ChatGPT when based on its data (M1) are, comparatively, rather optimistic. They portray a future in which people should not be concerned about the impact of ChatGPT on the job market, because it would not only create “many new opportunities” for employment but would also not lead to job losses. This future differs from the one emerging from the answers of the human group of reference, the strong majority of which (75.9%) is very or somewhat concerned that the use of ChatGPT will lead to job losses, while about half of it (48.1%) thinks that the use of ChatGPT will create few, if any, new opportunities. We may add here the difference between the M answers (“agree”, for both M1 and M2) and the H ones (“strongly agree”, from 62% of respondents) with reference to the need for regulation.

Second, it is worth noting that the responses produced with the use of ChatGPT even after it was asked to imitate a human (M2) agree with the most popular answers of the human group (H) in only 6 out of the 10 questions. The qualitative dimension of the difference between the H and M2 responses is apparent in the responses to Question 9, with 71.3% of the humans thinking that governments and organizations have a “big” responsibility to mitigate the negative effects of language models on employment, while the M2 ChatGPT answer is in favor of a “moderate” one.

Third, the difference between the M1 answers, produced by means of ChatGPT and based solely on its data, and the M2 ones, produced with the use of ChatGPT after it was asked to imitate a human, also invites attention: M1 and M2 gave the same answer in only 4 out of the 10 questions. At the very minimum, this difference invites simultaneous attention both to the data that LLMs rely on and to the way the answers to questions based on these data are affected by the possible mediation of an extra orientation, like the one offered in the case of M2 by asking ChatGPT to imitate a human.

If the use of ChatGPT in designing and responding to similar questionnaires is soon to appear as (or even become) a research necessity, if it is not one already, there is an urgency to move on to mapping and addressing the ethical challenges involved. In our case, the use of ChatGPT to both produce and reply to a questionnaire might have intensified the effect of biases in the research conducted by a human group like that of the authors of this article. For example, biases due to the opaqueness of the data that the use of an LLM generally relies on, already a big source of concern for researchers, may interact in complex ways with the limits of the data specifically collected for research like ours. In the research on which this article is based, we collected only a minimal amount of new data from humans because we were interested in pointing to this complexity through the detection of noticeable variations between human (H) and LLM-type machine-mediated (M) responses, as well as between M responses to questions with (M2) and without (M1) a decisive intervention by the research team.