1 Introduction

Natural language processing (NLP) has witnessed remarkable advancements over the years, revolutionizing the ways in which we interact with machines and bridging the gap between human language and artificial intelligence. One critical aspect that has played a pivotal role in shaping the capabilities of NLP algorithms is the design and usage of language prompts and queries. A prompt, a set of instructions provided to a large language model (LLM) that programs the model by customizing, enhancing, or refining its capabilities, affords a dialogic conversation with its user: it can not only produce output specific to the user’s request but also suggest improvements to the query itself, potentially resulting in a more tailored response. In particular, a prompt sets the context for the conversation and tells the LLM what information is important and what the desired output form and content should be. This process of meticulously composing natural-language prompts to harness the capabilities of generative AI and obtain precise, contextually relevant information is known as prompt engineering. Since the release of GPT-3, OpenAI’s generative language model, in July 2020, there has been renewed interest in strategizing prompts that can be fed to GPT-based applications to produce effective content more efficiently. However, most strategies are based on trial and error to establish best practices for prompt design; none are grounded in the theories foundational to effective writing. This article is an attempt to initiate conversations by blending the architectural workflow of algorithms with writing studies research for effective content generation.

Before diving into prompt engineering, we want to highlight some applications that make research on prompt engineering more important than ever. Generative AI applications, especially ChatGPT, released on November 30, 2022, have garnered attention from all disciplines, especially writing communities, for their detailed responses and articulate answers across many domains of knowledge. Despite some concerns about specificity and inherent bias, the fluency of the tool, its ease of use, and its statefulness (the ability to remember previous prompts provided in the same conversation) make ChatGPT a powerful tool for content development, raising questions about AI’s potential to replace writers’ jobs (AIContentfy 2023). The doubts are not unfounded. Media agencies have used NLP systems like OpenAI’s GPT for more than seven years. In 2015, the Associated Press (AP) generated around 200 articles per minute, and by 2016 the efficiency had increased to 2000 articles per second (Sharma 2021). Beyond the ability to produce texts indistinguishable from those of a human writer, features such as sentiment analysis, speech and voice recognition, customer service automation, and customer feedback analysis (all of which rely on language parsing and modeling) have made NLP applications useful to sectors like finance, health, and business. The technical and professional communication (TPC) field is not unaffected. Tools like GPT can write emails, memos, product documentation, and full-length essays indistinguishable from those written by human writers, and can also help professional writers with useful tasks like summarizing texts, developing cover letters, and building resumes (Nielsen 2022). They help in the editing process as well: GPTs can automatically correct spelling and grammar errors as text is typed, speeding up editing while cutting back the revisions required for each draft.
Despite these capabilities, the content generated by AI is not always accurate; it suffers from a lack of specificity. AI tools lack high degrees of personalization and intelligent levels of responsiveness. To make outputs more specific, users need to provide additional information through prompts or other customization options offered by the tool. For example, Custom Instructions is a ChatGPT feature that lets users set preferences the tool can consider for a single instance or for all future queries. But how can users determine which preferences are required for the most effective content output?

Recent literature on prompt engineering focuses on different methods of configuring prompts to fine-tune content generated by AI. For example, Short and Short’s (2023) work on generative AI in entrepreneurial communication demonstrates ChatGPT’s ability to mimic celebrity CEO archetypes by prompting language in the style of exemplars. White et al. (2023) view prompts as a form of programming that can customize the outputs of, and interactions with, an LLM. They developed a catalog of prompt engineering techniques, presented in pattern form, that can be applied to solve common problems when conversing with LLMs. Prompt patterns were designed to provide reusable solutions to common problems faced in a particular context only, the context being output generation and interaction with LLMs. Another example is Bozkurt and Sharma’s (2023) work on defining prompts. In their short piece they provide a detailed list of attributes and rules for prompt creation, such as defining the objective of the content to be generated, providing examples within prompts, using clear and concise language, and so on. These methods differ in how they fine-tune prompts, but they share two main similarities: (1) all rely on exploratory approaches to test that their methods work, and (2) the prompts are task-specific. For example, Short and Short’s work is for generating entrepreneurial pitches, and White et al. focus on software documentation. None of these methods are truly generic in nature. OpenAI’s recommendation for prompt generation is similar to Bozkurt and Sharma’s: the method may appear generic, with rules and best practices, but does not guarantee effective content. To address this gap, we developed a method based on the rhetorical situation (Bitzer 1968), which is grounded in the rhetorical theory that lays the foundation for writing studies, making the approach generic and useful across all applications.

For centuries, rhetoricians, followed later by writing studies scholars, have developed systematic approaches, grounded in theory, for identifying strategies to develop effective and persuasive content. The theoretical underpinnings make the strategies generic and useful for all applications of communication. They rely on rhetorical foundations for content generation, which ensures that the content developed is context-specific, addresses the needs of its audience, is logically accurate and useful, and meets other qualitative requirements. Despite the potential of rhetorical methods, research on how to improve composing with AI in the fields of rhetoric and writing is sparse. The literature that exists has studied generative AI through analyses of its algorithmic components using computational rhetoric (Hart-Davidson 2018; McKee and Porter 2020; Marcus and Davis 2020). The rhetorical analysis of algorithms helps investigate persuasiveness, the role of ethics, and the human/non-human agency of algorithms. A closer look, combined with experimental genre analysis of AI algorithms, will help investigate issues related to teaching writing with these tools and their impact on writing pedagogy and its future. This became our primary motivation for this research, and this paper explores AI writing from the point of view of the rhetorical situation to offer a generalizable way of designing prompts for TPC as well as other fields, and to highlight the role of human augmentation with AI technologies for creating audience-centered content.

Duin and Pedersen (2021) come closest to rhetorically evaluating the intelligence of AI writing systems such as GPT-2 and GPT-3. They argue that such tools create new text based on the text provided (textual production building off a database of existing text) but without collecting knowledge of the rhetorical situation (the scene or setting for writing, which includes the purpose, exigence, and audience) and without considering questions of ethics and reasoning. They argue “that machine writing designs see writing as transmission of thought-via-words, rather than as interaction with others” (p. 49). Our work extends Duin and Pedersen’s research through a mixed-methods approach. Our experiments consist of rhetorical genre analysis using both qualitative and quantitative measures to produce a formula that can be used for generalizable prompt engineering.

Through this work, we address the following research questions:

  • (RQ1) How can we leverage human and AI collaborations for writing, especially within technical communication (TC), through rhetorical prompt engineering?

  • (RQ2) What are the implications of using rhetorical prompt engineering on practice and pedagogy?

The paper answers these questions by first describing the background of AI tools as relevant to the field of writing, followed by an explanation of our methods of experimenting with different genres. The results section details our observations from careful comparisons of the data generated during the experiments against a rubric, which led to the development of a rhetorical formula. Finally, we conclude with some takeaways that TC instructors can use in their classrooms to teach writing with AI, and that TC practitioners can use to think about the impact of these tools on the workplace.

Although writing impacts a range of fields, for this research we focus on technical and professional communication (TPC) as it has more specific genre conventions which makes it prone to significant disruption due to AI. Limiting this case also allows us to create the necessary boundary conditions for a feasible and impactful research study. In particular, a rhetorical lens provides clear genres of writing that can be tested for prompts used for generating technical content.

2 Background and relevance

LLMs are the AI development poised to have the greatest impact on the role of writers. Through the development of such large-scale natural language models with writing and dialogue capabilities, AI has taken a significant stride toward better natural language understanding (NLU) and dialog generation. Language modeling (LM), one of the most important tasks in modern NLP, is probabilistic modeling that predicts the next word or character in a document.
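The core prediction task can be illustrated with a toy bigram model, a deliberately simplified sketch (the corpus and counting scheme here are our own illustration; GPT uses a neural network over far longer contexts):

```python
from collections import Counter, defaultdict

# Toy corpus; a hypothetical illustration, not GPT's actual training data.
corpus = "the user writes a prompt and the model writes a reply".split()

# Count bigrams: how often each word follows each preceding word.
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict_next(word):
    """Return the most frequent next word and its estimated probability."""
    counts = bigrams[word]
    total = sum(counts.values())
    best, n = counts.most_common(1)[0]
    return best, n / total

print(predict_next("writes"))  # "writes" is always followed by "a" in this corpus
```

A neural language model performs the same next-token prediction, but estimates the probabilities from a learned representation of the whole preceding context rather than a single previous word.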

Generative Pre-trained Transformer (GPT), originating from these models and developed by OpenAI, has the most parameters, the largest scale, and the strongest capabilities (as of January 2023). Using a large amount of Internet text data and thousands of books for model training, the latest version, GPT-4, can imitate the natural language patterns of humans nearly perfectly. This language model comes closest to predictability and generates meaningful content. With the introduction of pre-training on vast amounts of text, GPT has acquired broad knowledge, which helps it solve discriminative tasks such as:

  • Question answering,

  • Semantic similarity assessment,

  • Entailment determination, and

  • Text classification

GPT utilizes a semi-supervised method for NLU: a combination of unsupervised pre-training and supervised fine-tuning. GPT uses a large amount of unlabeled text along with several well-annotated datasets of training examples for the target tasks. GPT follows a two-stage training procedure. First, it uses a language modeling objective on the unlabeled data to learn the parameters of the neural network. In the second stage, these parameters are adapted to a target task using the corresponding supervised objective.

This section explains how such models work and how rhetorical theory can help these models predict the most successful probabilities for content generation.

3 GPT architecture

GPT uses a transformer-based architecture built on neural networks. Knowing the GPT architecture is crucial for understanding how these algorithms determine context. GPT uses a stack of self-attention layers to learn long-range dependencies in text. Instead of analyzing the meaning of text by defining each word, the self-attention mechanism allows GPT to learn the importance of each word in the input phrase, regardless of its position. This is important for NLP tasks such as translation and question answering, where the meaning of a sentence can depend on words that are far apart (Wang et al. 2020; Liu et al. 2020; Hadi et al. 2023). The algorithm learns which aspects of the input to prioritize when forming the response text. This is done through the following steps, as shown in Fig. 1:

  1.

    Input encoding: In this step the input is converted to a high-dimensional vector representation through an embedding layer.

  2.

    Transformer encoder: GPTs consist of multi-layered transformers. This is where self-attention mechanisms and feed-forward networks are built-in. Self-attention allows the model to capture dependencies between different words in the input sequence, while the feed-forward networks process and transform the representations.

  3.

    Contextual embeddings: The distances between words are evaluated by the encoding process to identify contexts that will be used by the next stages. As the input sequence passes through the transformer encoders, each token’s representation is updated in a contextualized manner. This is the most important step for framing input text.

  4.

    Decoding and language generation: Based on the transformer encoding results, GPT generates new text by predicting the probability distribution of the next token based on the context identified in previous stages.

  5.

    Training with masked language modeling: For faster content mapping, GPT models are trained with pre-existing content. The training helps the models remember certain contextual mappings, which produces context-specific content faster and with more accuracy.

Fig. 1: Generative pre-trained transformers architecture

Thus, contextual embeddings in GPTs assign each word a representation based on its context, thereby capturing uses of words across varied contexts. This type of encoding makes GPTs unique in their ability to work across languages with varying grammars and rules as well as contexts that may have social and cultural nuances.
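The self-attention computation at the heart of the transformer encoder can be sketched numerically. This is a minimal single-head version that, for brevity, skips the learned query/key/value projections a real GPT applies:

```python
import numpy as np

def self_attention(X):
    """Scaled dot-product self-attention (single head, no learned projections)."""
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)                    # relevance of every token to every other token
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax: each row sums to 1
    return weights @ X                               # each token becomes a weighted mix of all tokens

# Three tokens with four-dimensional embeddings (random stand-ins for real embeddings).
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))
out = self_attention(X)
print(out.shape)  # (3, 4): one contextualized vector per token
```

Because every token attends to every other token regardless of distance, the output vectors capture context from the whole sequence, which is what makes the contextual embeddings described above possible.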

4 Evolution of GPT

The main difference between the various GPTs is the nature of the data used for training and the design of the algorithm. GPT-1 was trained to compress and decompress the contents of BookCorpus, a 5 GB database consisting of the text of over 7000 published books compiled by the University of Toronto and MIT, totaling nearly a billion words. Speculating that the accuracy of the machine learning model could be improved by feeding it more data, the OpenAI team widened the horizons of the model’s data ingestion. GPT-2 ingested the text content of a purpose-built dataset (WebText) consisting of eight million web pages totaling 40 GB of data. GPT-3 took text ingestion, compression, and decompression to another level: it consumed the text content of the popular CommonCrawl dataset of web pages from 2016 to 2019. Although OpenAI curated it to eliminate redundancy and improve quality, it is nominally 45 terabytes of compressed text data.

ChatGPT, the successor to GPT-3, is a general-purpose conversational chatbot based on the GPT-3 family of language models developed by OpenAI. It is designed to generate human-like text from a given prompt or conversation and can engage in natural, open-ended conversations on a wide range of topics (see https://openai.com/blog/chatgpt/). Unlike previous versions, which were designed to take input requests via prompts, ChatGPT was trained in a conversational, dialog-based way using reinforcement learning from human feedback. This design makes ChatGPT capable of answering follow-up questions, admitting mistakes, challenging incorrect premises, and rejecting inappropriate queries (Zhai 2022). Dialog systems offer creative opportunities for human feedback that prompt-based systems do not. There are other advantages of conversational systems that have made them extremely popular since 2011 (when Apple introduced Siri). McTear (2021) described the history of conversational systems and provided a rationale for why researchers develop and use them. He argued that a dialog system provides a low barrier to entry for users, enabling them to interact in an intuitive way without having to learn a new interface (p. 13). Additionally, the ability to converse in a natural manner, provide relevant responses, and understand the emotional state of a human is one of the high-level cognitive skills that enables social bonding and coordination of actions (p. 13), leading users to develop trust in the application. GPT-3, which was used for this research, takes prompts as commands to the text-generation algorithm (see the Methods section for more details). The primary difference between GPT-3 and ChatGPT is the nature of input: prompt-based for GPT-3 and dialog-based for ChatGPT. In both cases, input is key, because these algorithms parse the entire input phrase to determine the context.
To understand how that is done, we need to look closely at the architecture of GPT.

There are two notable characteristics of GPT algorithms that we would like to point out here:

  • First, GPTs are only good at context mapping based on proximity of words and word sequences that have occurred previously. To achieve better mapping, either the model needs to be trained on more data, or context needs to be ingested through the prompt itself. Training is expensive, harmful to the environment (Li et al. 2022) and, therefore, doing it over and over again to improve accuracy is not sustainable.

  • Second, context does not always ensure completeness of input. For example, the input phrase “Write an email for requesting sick leave by employee” does not provide enough context about the position. The output may therefore look too generic and not suit specific roles, such as a manager who is requesting the leave and needs their direct reports to report to someone else.

So we need a more strategic approach toward designing inputs rather than refining outputs (through re-training). That is, designing the prompt, or crafting the input request, is crucial. Researchers are therefore studying how to design prompts that provide the highest value and save iterations. This is termed prompt engineering.

5 Prompt engineering

Prompt engineering may be defined as any iteration made to a given prompt with the intent of generating a practical, functional output. While iterations may be strictly technical, current efforts have considered NLP within prompt manipulations. Specifically, the intended use of prompt engineering, and the concentration of this paper, is mainly manually constructed, paraphrased prompts. The rise of such methods can be seen in tools such as PromptBase, on which users may buy or sell prompts to generate their intended output (Wiggers 2022). As such, discussions primarily concentrate on an ideal prompt ‘template’ to guarantee adequate results.

Prompt engineering has been approached colloquially by users who seek guidance on improving their outputs. OpenAI’s website includes additional suggestions and documentation on prompt engineering, including rules, description outlines, and the inclusion of context, style, and format (Shieh 2023). For example, one rule suggests using the latest version of GPT to ensure the output reflects the latest modeling developments. Another states that users should start with the zero-shot method, then try few-shot, and only if neither works, use fine-tuning. (Zero-shot means giving the model the instruction alone, without examples; few-shot means including a few worked examples in the prompt.) Current literature attempts to outline other successful techniques and their consequent outputs. In particular, Liu et al. (2021) defined statistical NLP practices, including those relevant to prompt engineering such as prompt shape, template, and type. Such methods may extend to permutations of prompts and continuous iterations until the desired result is achieved.
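The zero-shot versus few-shot distinction can be made concrete with two prompt templates (the task, product names, and wording here are invented for illustration):

```python
# Zero-shot: the instruction alone, no worked examples.
zero_shot = "Extract the product names mentioned in the following review:\n{review}"

# Few-shot: the same instruction plus a worked example showing the desired format.
few_shot = (
    "Extract the product names mentioned in each review.\n\n"
    "Review: I charged my PixelPhone with the new VoltCharger.\n"
    "Products: PixelPhone, VoltCharger\n\n"
    "Review: {review}\n"
    "Products:"
)

review = "The AeroMouse works well with my desktop."
print(zero_shot.format(review=review))
print(few_shot.format(review=review))
```

Either string would be submitted to the model as-is; the few-shot version trades prompt length for a demonstration of the expected output format.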

These practices, although expensive, can be useful. But they do not guarantee accurate, reliable outputs that can be used for technical communication work. These practices do not guide users to an effective prompt and, as a result, may present a learning curve when interacting with such tools. Moreover, the prompts generated by current practices often neglect context and rhetorical devices, leading to scattered, undirected outputs that do not pertain to the writer’s focus. Such practices therefore introduce an element of unpredictability and offer limited control to communicators, who are forced to aim continuously and directionless at their targets, hoping to accidentally wander into their desired output. Overall, the literature suggests many directions for improving prompt engineering. Yet there is a paucity of work on directing users toward better prompt creation from the perspective of writing tasks, especially those undertaken by professionals working on technical writing and related projects.

To ensure effective content development, technical communicators understand language and acts of communication as rhetorical, and rely on principles of rhetoric to develop content that suits the needs of their audiences. Analyzing content through a rhetorical lens allows comparing language formation to the goals it is meant to achieve. Rhetoric is about strategic choices and approaches to communication. For example, when communicating to different types of audiences about the same topic, we make strategic decisions about what details to include or omit, what types of evidence or support to use, and so on. The rhetorical situation is a useful tool for evaluating whether content meets our goals. The next sections will shed light on what the rhetorical situation is and ways in which it can be used for prompt engineering.

6 The rhetorical situation

A rhetorical approach to technical communication is intentional or purposeful (Smith 2008). When using a rhetorical approach to communication, a communicator tries to understand the social context of their intended audience, and will be strategic about using language towards a specific purpose. So, a communicator must know whether they are trying to persuade or inform, for example, and they must understand enough about their audience to know what might be persuasive or how they can inform them. Smith (2008) emphasizes that rhetoric—and a rhetorical approach—is not just something used in advertisements or politics, but something that intentional communicators use in every communication situation.

Every communication situation is different, but it can be broken down using the components of the rhetorical situation. A rhetorical situation is a natural context of persons, events, objects, relations, and an exigence which strongly invites utterance; this invited utterance participates naturally in the situation, is in many instances necessary to the completion of situational activity, and by means of its participation with situation, obtains its meaning and its rhetorical character (Bitzer 1968). The rhetorical situation, which consists of the exigence, audience, and constraints of a given rhetorical act, becomes a useful reference point for prompt engineering since it helps incorporate the external circumstances that define the context to which discourse (content generated by AI tools) must respond in fitting ways. In Bitzer’s terms, the more specific the rhetorical situation and the more precise its characteristics, including those of the audience, the more it determines the specific features and context of the discourse (Fig. 2).

Fig. 2: The rhetorical situation

Context is a key element in the rhetorical situation. The term was first used by Lloyd Bitzer (1968) in “The Rhetorical Situation” to refer to all the features of audience, purpose, and exigence that serve to create a moment suitable for a rhetorical response. As per Bitzer (1968), the rhetorical situation can be understood as a combination of circumstances under which the rhetor or content producer writes or speaks, including:

  • The nature and disposition of the audience,

  • The exigence that requires the rhetor to become part of the dialog creation,

  • The writer’s goal or purpose,

  • An understanding of the subject, and

  • The specific context in which the conversation takes place

Our research is heavily influenced by the components of the rhetorical situation that determine the strategic choices about which details to include or not include in the prompts based on the particular rhetorical situation of the required content output.
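A minimal sketch of how these components might be assembled into a prompt follows; the field names and template wording are our own illustration, not a fixed formula from the literature:

```python
def rhetorical_prompt(subject, purpose, audience, exigence, context):
    """Assemble a prompt from the components of the rhetorical situation."""
    return (
        f"Write about {subject}. "
        f"The purpose is to {purpose}. "
        f"The audience is {audience}. "
        f"This is needed because {exigence}. "
        f"Context: {context}"
    )

# Hypothetical filled-in example for a sick-leave request email.
prompt = rhetorical_prompt(
    subject="a sick-leave request email",
    purpose="request two days of leave and delegate urgent tasks",
    audience="the HR department and the writer's direct reports",
    exigence="the writer, an engineering manager, is ill and approvals are pending",
    context="a mid-sized software company with a formal leave policy",
)
print(prompt)
```

Compared with a bare request like "Write an email for requesting sick leave," each slot forces the prompt writer to make an explicit strategic choice about audience, purpose, exigence, subject, and context.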

This paper advances the work on prompt engineering by providing a framework and formula to assist users, drawing on literature on the aforementioned components of the rhetorical situation. In particular, we argue and show that creating useful prompts requires not just the engineering aspect but also consideration of the rhetorical situation for tailoring the input toward the most suitable output for any communication task; in other words, careful consideration of the multiple communication factors needed for prompt design. The next section demonstrates this through several experiments and assessments.

It is worth mentioning that in this paper we are not limiting technical writing to the more situational response created by a communicator for an exigence as explained by Bitzer. We surveyed other rhetoricians who have described the significance of analyzing any rhetorical situation to create a fitting response for multiplicities of audiences, genres, and exigencies. For example, we reviewed Vatz’s (1973) work, which critiqued Bitzer for locating exigencies in the situation, as things rhetors respond to, rather than things rhetors create for their audiences. Smith and Lybarger (1996) also critiqued Bitzer, arguing that “rhetorical situation involves a plurality of exigences and complex relationships between the audience and a rhetorician’s interest”. This argument was especially valuable in understanding how exigence affects and is affected by multiple agents and constraints.

7 Approach and assessment

This section focuses on the application of the rhetorical situation and prompt engineering to generalize AI text-generation inputs. As discussed in the literature, components of the rhetorical situation may complement prompts or input texts through their similar methods. For our experiments, we chose the GPT-3 model for its exceptional advancement and relevance. In this section we describe the instruments, procedure, and analysis method integral to our research.

Using a mixed-methods approach (combining qualitative and quantitative feedback), we developed prompt variations prior to the experiment in a word-processing document and labeled their components of the rhetorical situation (see Appendix for variations). Four genres (user guide, proposal, white paper, and memo) were selected for the experiment owing to (1) their relevance to the technical writing field and (2) distinct characteristics that allow seamless comparisons. After setting the model per the outlined parameters, we copied and pasted the prompts into the OpenAI workspace and then copied the resulting outputs back into the document, keeping each beside its specified prompt. Following this procedure for all prompts, the two of us acted as raters for the resulting outputs; we divided this work equally, assigning ourselves nine outputs each for grading. Each of us then assigned a series of grades for our respective text outputs based on the criteria defined in the rubric.

7.1 Instruments

7.1.1 Genre and prompt design

Prior to the investigation, we conducted a preliminary pilot across two platforms, OpenAI and Neuroflash, on the following genres: user guide, proposal, white paper, and memo. We selected these genres after a brief comparison of common genres present in the field and those included in a professional and technical communication course syllabus. Moreover, these genres demonstrate variance in their use of the rhetorical situation and thus allow for substantial comparison. As noted in Table 1, we tested two categories of prompt design: one with attention to the rhetorical situation through the inclusion of writer, audience, or genre, and one with minimal reference to these rhetorical elements.

Table 1 Two categories of prompt design used in initial experiment

To further ensure prompts were designed accurately, we conducted a brief survey of literature on genre analysis, which supported the general basis and foundation of the resulting prompts. Each genre was assigned a central component of the rhetorical situation (audience, writer, subject) to which prompt variations were accommodated. We produced a series of six prompt variations; each iteration included another element of the rhetorical situation. Moreover, we focused our investigation on OpenAI only, as it demonstrated greater leverage over text generation and output. Specifically, the ability to modify settings such as model type, temperature, and length helped tailor the outputs and maintain consistency across different iterations generated over a period of time.

7.1.2 Text model

OpenAI offers various GPT-3 models, such as text-davinci-003, text-curie-001, text-babbage-001, and text-ada-001. We used the latest model, text-davinci-003, to generate outputs, selected for both its relevance and its increased capability to generate text. Based on this selection, we set the following parameters: mode (complete), model (text-davinci-003), temperature (0), maximum length (4000), top P (1), frequency penalty (1), presence penalty (0), best of (1), inject start text (checked), inject restart text (checked), and show probabilities (off). These settings focused on producing maximum output so that effects could be noticed, while remaining minimal to limit their influence on prompt iterations (Fig. 3).

Fig. 3: OpenAI workspace with specified constraints
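For readers who script such experiments rather than use the workspace UI, the settings above roughly correspond to the following request parameters for the GPT-3 completions endpoint (parameter names reflect the API at the time of the study and should be checked against current documentation; the workspace-only options such as inject start/restart text have no API equivalent):

```python
# Completion request parameters mirroring the workspace settings used in the study.
params = {
    "model": "text-davinci-003",
    "temperature": 0,        # deterministic output for reproducible comparisons
    "max_tokens": 4000,      # "maximum length" in the workspace UI
    "top_p": 1,
    "frequency_penalty": 1,  # discourage verbatim repetition
    "presence_penalty": 0,
    "best_of": 1,
}
print(sorted(params))
```

Fixing temperature at 0 is the key choice here: it removes sampling randomness so that differences between outputs can be attributed to the prompt variations rather than to the model's sampling.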

7.1.3 Rubric

The assessment rubric for the efficacy of the prompt output was based on an instrument provided by the University of Delaware’s Problem-Based Learning Clearinghouse, a collection of problems, articles, and resources for educators (SERC, n.d.). We modified this rubric based on limitations and strengths gathered from preliminary grading sessions; limitations included the specificity and measurability of each section’s parameters. The newly developed parameters focus on the presence of defined principles, their effectiveness, and their demonstration in the text. Specifically, to accommodate all genres, we modified the rubric to include relevant sections for readers/audience, conclusion/mechanisms for feedback, and rhetorical purpose. Characteristics were defined based on literature in the technical writing field, including a search on genre definition and analysis. We removed the following criteria: support for reasons, source sufficiency, source selection, in-text citation, and works cited. For the remaining criteria, outputs were assigned a letter grade for each section; we adjusted the range to fit only A–D, removing the letter F to avoid fence-sitting. We modified the criteria and their descriptions to apply to all outputs, regardless of genre; alterations are defined below:

  • Audience: (row 1) This criterion is adjusted to include objective characteristics that demonstrate the extent to which the audience is considered. Letter grades are assigned based on the presence of the following characteristics:

    • Audience is explicitly mentioned

    • Style and voice are used to complement the audience demographic and attitude

    • Specified recommendations or actions are catered toward the audience

  • Rhetorical Purpose: (row 2) The current rubric retains the original rubric’s description of rhetorical purpose but adds consideration of the audience. Letter grades are determined by the extent to which outputs explain their rhetorical purpose and focus on a specific audience.

  • Language Use: (row 3) Criteria for language use were developed based on the original rubric’s grammar section. Letter grades are assigned based on the number of errors seen in the output. Errors are focused on the following elements:

    • Spelling, grammar, word usage, and punctuation

  • Genre: (row 4) Genre is assessed using the original rubric’s description, which is based on the range of genre-relevant characteristics used. This portion is modified to include specific characteristics for each genre type in the pursuit of objectivity. Letter grades are determined by the mention of and adherence to these characteristics.

  • Organization of Writing: (row 5) As this section coincides with genre, the original portion is completely removed and replaced with specified principles and an organization type for each genre. Letter grades were allotted based on the number of specified principles present and the presence of the organization type.

  • Conclusions/Mechanism for Feedback: (row 6) This criterion is loosely based on the original rubric’s section for conclusions/solution. The current rubric modifies this by retitling this section conclusion/mechanisms for feedback and defining specified evidence for each genre. Letter grades are credited for each genre based on the presence of the following:

    • User Guide: additional help or contact information

    • White Paper: evidence for recommendations

    • Memo: providing contact information and statement for inquiries

7.2 Rubric development

See Table 2.

Table 2 Rubric considering AI-generated components to assess work

7.3 Data analysis

Resulting text outputs were kept in a word-processing document alongside their prompts, to be graded by two raters using the rubric. Based on the text output, the two raters assigned grades for each criterion section. In the case of any difference, both raters negotiated to agree on a single grade. All criterion grades were summed to determine the overall best-scoring text output. This output was compared with its prompt design, structure, and included rhetorical elements to determine and suggest the best strategies for maximizing the generated text. We aimed to test whether the maximum number of rhetorical elements led to the most effective output.
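The aggregation step described above can be sketched as follows. The point values and criterion names here are illustrative assumptions (the paper maps A–D grades to a total but does not publish a numeric scale), and the grades shown are hypothetical:

```python
# Illustrative point values for the rubric's A–D scale (assumed mapping).
GRADE_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1}

def total_score(criterion_grades: dict[str, str]) -> int:
    """Sum the negotiated letter grades across all rubric criteria."""
    return sum(GRADE_POINTS[g] for g in criterion_grades.values())

# Hypothetical negotiated grades for two prompt iterations.
outputs = {
    "prompt_v1": {"audience": "C", "purpose": "B", "language": "A",
                  "genre": "C", "organization": "B", "conclusion": "C"},
    "prompt_v7": {"audience": "A", "purpose": "A", "language": "A",
                  "genre": "B", "organization": "A", "conclusion": "B"},
}

# The best-scoring output is compared against its prompt design.
best = max(outputs, key=lambda name: total_score(outputs[name]))
```

Because disagreements were negotiated to a single grade before summing, each output receives exactly one total, making the cross-prompt comparison straightforward.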

8 Results

This study aimed to answer the following relevant research questions:

  • (RQ1) How can we leverage human and AI collaboration in the technical communication field through prompt engineering?

  • (RQ2) What are the implications of using a socio-technical approach for prompt generation for practice and pedagogy?

Our purpose and focus rested on the integration of AI writing tools and the technical writing field, both in education and the workforce. Our findings agree with scholars’ work suggesting that the industry can truly benefit from new technologies when human–machine complementarity is leveraged, especially where AI technology is complemented by human intervention. We see prompt engineering as a method for achieving a collaborative relationship between human and AI that produces outputs that are not only better suited to the audience but also add to human knowledge representation. Through rhetorical prompt engineering, we can not only achieve effective automation of content development tasks but also enhance the capabilities of both human and AI.

This section seeks to address this investigation through analysis of the following areas: (1) a generalizable formula for prompt generation; (2) rubric assessment for AI generated content; (3) prompt variation versus text output.

8.1 1. General formula for prompt generation

From the series of experiments, we devised a general formula for prompt generation. This framework is ultimately based on producing effective content with the structure and foundation of the rhetorical situation. The formula and its components are defined below:

$${\text{Prompt}} = (audience + genre + purpose + subject + context + exigence + writer)$$
  • Audience: the intended reader demographic of the content and their characteristics

  • Genre: preferred genre selection

  • Purpose: the author’s purpose for writing or the reader’s purpose for reading

  • Subject: focus and details of the content

  • Context: the relevant circumstances outside of the content

  • Exigence: the demand to write the content

  • Writer: the intended writer of the content

To demonstrate prompt engineering with the specified formula, we include an example for the following inquiry: how do I install ‘ABC’ software on my computer? Through analysis and recognition of the rhetorical situation, the formula would be populated as follows:

$${\text{Prompt}} = (novice\ user + user\ guide + to\ inform + installation\ of\ ABC\ software + the\ computer\ model\ and\ type + encourage\ software\ use + software{\text{'}}s\ help\ center)$$
  • Audience: novice user

  • Genre: user guide

  • Purpose: to inform

  • Subject: installation of ABC software

  • Context: computer model and type

  • Exigence: encourage software use

  • Writer: software’s help center

Using the formula and components defined above, the resulting prompt could be input into an AI writing tool like GPT-3:

“ABC software is looking to encourage software use among their users. A novice user would like to download the software but isn’t sure how. As ABC software’s help center, consider the user model and write a user guide to inform the user on the installation of ABC software.”

It is important to note that the specified prompt is only one instance; other variations could be input based on the formula. As such, the defined formula acts as a template from which writers can develop prompts at their discretion. The intention of the formula is the execution of a fitting response: content that meets the demands of the rhetorical situation and consequently addresses the problem at hand. The defined formula has yet to be tested against tools such as ChatGPT, in which prompts are refined and modified through dialogue; the formula should not be received as single-use or disposable prompt engineering but instead promotes an iterative process at the discretion of the writer. Writers are encouraged to modify the components of the prompt, their order, and their contents based on the generated output. Moreover, the defined syntax aims to provide efficient assistance in varying settings, as discussed in the next section.
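As one sketch of this template-like, iterative use, the formula can be expressed as a small helper that assembles the seven components into a prompt string. The `RhetoricalSituation` class and the particular sentence framing are illustrative assumptions, not part of the formula itself; writers would reorder and reword at their discretion:

```python
from dataclasses import dataclass

@dataclass
class RhetoricalSituation:
    """The seven components of the prompt formula."""
    audience: str
    genre: str
    purpose: str
    subject: str
    context: str
    exigence: str
    writer: str

    def to_prompt(self) -> str:
        # One possible phrasing; component order and wording are flexible.
        return (f"{self.exigence}. As {self.writer}, consider "
                f"{self.context} and write a {self.genre} {self.purpose} "
                f"a {self.audience} on {self.subject}.")

# The user-guide example from the paper, rendered through the template.
prompt = RhetoricalSituation(
    audience="novice user",
    genre="user guide",
    purpose="to inform",
    subject="the installation of ABC software",
    context="the user's computer model and type",
    exigence="ABC software is looking to encourage software use",
    writer="ABC software's help center",
).to_prompt()
```

Keeping the components as named fields, rather than a single free-form string, makes the iterative process concrete: a writer can swap out one element (say, the audience) and regenerate the prompt while holding the rest of the rhetorical situation fixed.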

8.1.1 Implications of using a rhetorical formula

The intention of our proposed formula is to produce effective, efficient, and ethical content that remains, most importantly, at the discretion of the writer. For technical communicators, effective content satisfies and fulfills the writer’s intentions; our prompt formula addresses this through the demands of the rhetorical situation and its elements. In place of current solutions where the context of the generated text is assumed by the model or writer, our proposed solution frames the content within a context specified by the writer. Similarly, our formula advocates for efficient content, or the optimal response as envisioned by the writer, by defining the rhetorical elements prior to inputting the prompt. This aims to eliminate current practices’ initial reliance on assumption and guesswork and to provide a reliable, efficient formula. Additionally, to further control and leverage the model’s responses, our methods seek to alleviate potential bias and promote ethical generated content. Current solutions rely substantially on the model’s assumptions, which could lead to biased, unethical content; to reduce their frequency, our prompt defines the audience, writer, and purpose to lead the model to the desired response. Our formula intends to promote writer and AI collaboration through effective methods of communication, with attention to the control of the writer.

8.1.2 Advantages of using the rhetorical situation over traditional prompt engineering

Traditional LLMs utilize sophisticated neural network architectures, such as the Transformers described earlier, which enable them to capture complex patterns and dependencies in language. The training objective is to optimize the model’s parameters to maximize the likelihood of generating the correct next word in a given context. That context depends highly on the occurrence of the same words in a specific sequence in the training data, so novel contexts may not be handled well. With strategic prompt design, we can ensure that the context is always accurate and therefore have more control over the outcome. Some advantages of using the rhetorical formula are as follows:

  • Finite, specific inputs beyond context to achieve fitting responses for generic applications: The Conditional Transformer Language Model (CTRL) is a language model designed to generate text based on specific control codes or prompts (Hadi et al. 2023). The control codes guide the model’s text generation process, allowing users to specify the desired style, topic, or other characteristics of the generated text. While our formula appears similar to this method, the output of CTRL depends on the user ensuring that all conditions are provided, and these conditions are not listed finitely. The rhetorical situation elements thus provide a finite listing of inputs (audience + genre + purpose + subject + context + exigence + writer), resulting in generic, reproducible, fitting responses to users’ communication requests. Additionally, the formula goes beyond context to define elements such as communication purpose, genre, and the writer’s positionality, which are crucial for achieving the most suitable responses.

  • Practical implementation of a human-in-the-loop approach: Strategic design of prompts gives human users an opportunity to participate in the task of content development rather than relying completely on decisions made by algorithms, which is the essence of the human-in-the-loop approach. Prompt design is a means for selective inclusion of human participation, resulting in content development processes that harness the efficiency of intelligent automation while remaining amenable to human feedback and retaining a greater sense of meaning.

  • Eliminate retraining costs: The fine-tuning phase of LLMs is crucial for adapting the models to specific tasks or domains. While fine-tuning helps produce better outputs, it involves re-training the models on additional datasets and control codes that are specific to the expected outputs. For example, if we want to set a context of a manager requesting leave, not just any employee, similar data would need to be fed into the training dataset. With rhetorical prompt engineering, the model is instead exposed to context-specific prompts and guided toward the desired output or behavior at inference time, which helps reduce the costs incurred for retraining. The power consumption, processing power, and other energy requirements for LLMs are extremely high; the first training alone can cost over $100 million (Smith 2023).

  • Ability to learn and create: A common concern among educational communities is that technologies like generative AI may lead to a loss of creativity and critical thinking skills among students (Hadi et al. 2023). If students rely too heavily on AI to write essays, complete assignments, and answer tests, they may not develop the skills necessary to think critically and solve problems on their own. However, with rhetorical prompt engineering, students will learn foundational attributes of effective writing, such as understanding the purpose of the content, its exigence, and its main subject. Composing prompts will help them set meaningful expectations for automatic content generation, and the ability to look for missing elements in the output will help them critically evaluate it, leading to more impactful writing studies research.

  • Decision making and transparency for social justice: Composition, writing studies, rhetoric, and technical communication scholars all agree that audience analysis is the first step of writing. While promoting human-in-the-loop decision-making on which audiences the content addresses (includes), rhetorical prompting helps the user question which audiences are excluded from the conversation. Additionally, audience needs are explicitly configured into the prompt, making users carefully analyze the subjectivities comprising those needs. This process promotes transparency. It also acts as a call for participation for communities that are excluded from the input considerations. Feedback can make visible other components of a communication process that are otherwise hidden under the black-boxed architecture of generative AI algorithms.

8.2 2. Rubric assessment for AI collaborative work

The adaptation of a writing rubric with consideration to AI writing tools outlines and considers the roles of the human writer and AI generation. This finding was particularly evident in our modification of the assessment rubric, as we removed, added, and modified areas to account for the collaboration of AI and human writers. In its development, we found areas concerning rhetoric and the rhetorical situation to be integral to the resulting grade; these demonstrated elements defined by the human writer and therefore justified their focus in our rubric. With a similar focus and the growing capabilities of text generation models, the resulting rubric suggests that instructors should consider modifying rubrics to incorporate potential AI tools or use. Instances could be integrative, such as a citation requirement, or reflective, such as asking students to edit the generated work and write a reflection on what was or wasn’t effective in the content. In either case, the rubric assessment represents potential considerations for assigning grades to AI collaborative work, as well as discussions on which elements are most representative of the students’ learning.

8.3 3. Prompt variation versus text output

Across all outputs, we noticed differences in the structure of the text, degree of context, attention to audience, and style and voice. Each variation demonstrated the effect of the added rhetorical components, suggesting the model’s interpretation of the prompt and its elements.

With the exception of one, all genres saw a linear increase in total grades as prompts became well-defined, varied, and complex with the included rhetorical components. It should be noted that the exception was due to the model’s response to ambiguity; specifically, we found the relationship between the intended writer and reader was not clearly defined, leading to a limited output. This instance nonetheless reflects the importance of a well-defined rhetorical situation and is further supported by the quality of the final iterations across the genres, in which all seven components are stated in the prompt.

9 Discussion

The primary motivation for this research was to understand the role of AI in writing, especially for professionals working in the field of technical writing, but our findings also have implications for other writing practitioners, researchers, and instructors. The takeaways from this research will enable them to partner with AI technologies in more practical ways. Through a series of experiments, we demonstrated that by providing the rhetorical situation elements as inputs, writers and communicators can get AI tools to produce the most effective content. We condensed our findings into a prompt design, that is, guidelines for developing a prompt for input, to be used by researchers and practitioners and viewed as a humanities approach to prompt engineering. We also provide a rubric that can be used to evaluate outputs, thereby completing the input–output circle of writing with AI technologies.

Our experiments gave us empirical grounding to focus on the structure, design, and care of prompts, qualities whose absence otherwise makes prompt engineering approaches messy, overwhelming, and boundless. This grounding helps us analyze the implications of prompt engineering in technical and professional communication spaces more closely.

10 Applications of prompts and prompt engineering for writing and communicating

Prompt engineering is a process used in the development of conversational AI systems, specifically in language models like GPT, to control the output generated by the model in response to an input. The goal of prompt engineering is to design the input, or “prompt,” given to a language model in such a way that it generates a response that is relevant, accurate, and consistent with the intended use of the model. When we analyze technical writing rhetorically, we evaluate effective writing as that which helps us generate a fitting response. Bitzer (1968) states that a communication situation does not invite just any response; an important characteristic of the rhetorical situation is that it invites a response that fits the situation—“a fitting response” (p. 10). Following from that, he explains that to say a rhetorical response fits a situation is to say that it meets the requirements established by the situation. This indicates that the situation must itself somehow prescribe the response which fits. Prompts can be one method of composing that prescription.

Well-defined prompts can lead to successful implementations of several applications in the communications field. For example, whether we want to write an email or summarize it (Thiergart et al. 2021), or get a suitable response from a chatbot (Heo and Lee 2018)—providing details about what needs to be communicated (purpose and exigence), who the AI is writing for (writer), who will read the output (audience) and what do they already know (subject + context), and what should the output look like (genre), can not only generate a successful response, but also meet clear performance targets (Heo and Lee 2018).

11 Prompt engineering for social justice

Like all AI products, writing algorithms like GPT-3 (including the latest ChatGPT as of February 2023) have the potential to learn biases from the datasets used to train them. This hasn’t changed much since Floridi and Chiriatti (2020) conducted experiments to test AI’s mathematical, semantic, and ethical capabilities. For the latest version, ChatGPT, OpenAI has attempted to hard-code rules that automatically decline to respond to inappropriate or unethical requests such as “offer up any merits to Nazi ideology” (Getahun 2023). However, success is not guaranteed. Most research so far on prompt engineering, and AI in general, has focused on getting superior outputs with each new version (Akyürek et al. 2022) and on increasing the speed and accuracy of results. As our dependency on AI systems increases, especially due to their powerful capabilities affecting all aspects of life, including safety-critical systems such as airplanes, cars, space shuttles, and military systems (Hoehn 2021; Kaklauskas 2015), we need more strategies and awareness that can enable ethical and moral decision-making. Creating good prompts is one way of improving the quality of output and reducing bias. Because prompting helps reduce the bias introduced through training corpora, researchers have started implementing manual and automated prompting mechanisms. The idea behind such designs is to create a diverse set of inputs that will help the model be robust to different inputs. Manual methods were developed as early as 2019. GPT-3, on the other hand, can automatically handle a wide range of tasks with only a few examples by extracting context from natural-language prompts and task demonstrations without many changes to the underlying model (Gao et al. 2020). Despite this, bias-related errors are introduced owing to language semantics. The solution given by Zhao et al. (2021) and Holtzman et al. (2021) is calibration: adding compensation to biased tokens so that they are calibrated to an unbiased status. Rhetorical situations can be used as a tool to achieve such calibrations.

After Bitzer’s piece (1968), several other scholars started analyzing the rhetorical situation. For Biesecker (1989), meaning is constructed by looking beyond binary constructions of speaker/situation, looking instead in the “differencing zone” (p. 118). She argued that while the audience is an important element, we need to reconsider the audience as unstable and shifting (p. 126), in other words, diverse and constantly changing. In her piece, she also discusses the advantage of this perspective in seeing the rhetorical situation as an event that makes possible the production of identities and social relations (p. 126). We can use this approach to identify the multiplicities of audiences in a rhetorical situation and develop a prompt that generates content that not only addresses the needs of audiences but also ensures that it does not marginalize certain populations, does not cause harm, and is generally morally and ethically sound.

12 Pedagogical implications

Rapid progress in AI technologies has generated considerable interest in their potential to address challenges in every field, and education is no exception. AI literacy has become especially important in the current era of technology-driven personalization. AI agents such as virtual assistants, robots, and AI writers are still in their early stages of development and far from perfect; yet they are able to converse with a human, produce texts that are indistinguishable from those of a human writer, and assess work performed by humans if given a checklist of objectives. These capabilities have found several applications in pedagogy, both for students and instructors, and the excitement and curiosity about these applications is only growing each day.

Formal studies show that AI applications in the form of intelligent learning systems are popular in AI education. Such applications collect and analyze students’ online learning behavior and performance and provide personal guidance or feedback based on their learning status, thus creating an adaptive learning environment (Srinivasan 2022). Another category is that of interactive systems: using AR/VR and games as a pedagogical tool for learning and/or as a learning companion is popular (Monahan et al. 2008). Communication studies have seen the growth of students’ use of AI word-processing tools, which can automatically correct spelling and grammar errors as they are typed. NLP and rule-based engines allow such tools to help users identify errors in language (grammar and sentence structure) and mechanics (such as punctuation, capitalization, and abbreviations) or fix them automatically. Popular media have reported many cases of automated writing with applications such as ChatGPT; such automated writing tools can be used by students to generate essays that pass plagiarism tests. Along with aiding the learning process, the use of AI tools in writing helps students deal with some humanistic concerns in writing, such as motivation and anxiety. AI tools help in overcoming writer’s block, and their automated feedback is almost always perceived as unbiased or indifferent (Rudenko-Morgun et al. 2023) compared to a human’s. Prompt engineering can provide students with skills to apply critical analysis to AI work in productive ways. Missing from these conversations is the need to teach students the role of the human-in-the-loop, that is, the role they play in getting outputs from AI tools. Scholars rely on approaches such as Actor Network Theory, posthumanism, and assemblages, inspired by new materialism and other media theories, to analyze human interaction with technology. Although such approaches open up possibilities to survey the different heterogeneous elements in these socio-technical systems, they can be overwhelming and well beyond the scope of classes focused on writing and communication. Therefore, the approach of using rhetorical situations can provide students with a practical framework to reflect on their work in AI-enabled professional environments.

For teachers, AI tools provide several affordances, as well as new challenges and opportunities to adapt their teaching practices. AI tools have been used by teachers for assessment purposes for several years (Dhuri and Jain 2020). With students’ extensive use of AI technologies, as is predicted, teachers will have to find new ways of evaluating assignments, especially since plagiarism checkers like TurnItIn fail to detect work done by AI tools. Another challenge is that AI tools may occasionally generate incorrect information. The issue of bias is also prevalent, since tools like GPT are trained on pre-existing datasets. Teachers will have to find new pedagogical tools that prove effective in AI environments. Prompt engineering models like the one provided in this article can help teachers include a reflective component in the course. Other approaches to using AI tools in teaching include developing AI-assisted role-playing to help students make ethical decisions (Hingle et al. 2021), enabling collaborations and discussions in classrooms (Miller 2022), peer evaluations in class, reflective assignments to question the AI tool’s work, and so on.

The discussion of writing with AI tools, or assistive writing, has been a popular line of study in literacy which dates as far back as 2007 (Sternberg et al. 2007). Beyond composing, literacy scholars have studied AI technologies by analyzing algorithmic design and big data perspectives to understand students’ experiences of reading and writing with algorithms, especially with respect to identity, agency, authority, adversary, conversational resource, audience, and so on (Leander and Burriss 2020). To study human interactions with AI in education we need more rigorous engagement with changing technologies, as well as new ways of conceiving digital literacies than are found in representational paradigms (Leander and Burriss 2020).

13 Conclusion

The most crucial skill to give the public when it comes to the use of AI is to empower them to understand the human-in-the-loop aspect of using AI technologies. Rhetorical analysis provides useful frameworks to do this work. This article provides a strategic way of using rhetorical situations for prompt engineering, which has both practical and ethical implications. The increasing capabilities of AI writing tools suggest a potential shift toward human-AI collaborative writing in both professional and educational settings. As this transition is varied, our current discussion does not consider all variables and prospective circumstances. Specifically, we have focused our study on Bitzer’s rhetorical situation theory and have not considered other interpretations of the rhetorical situation. Related argumentation might have transformed our formula by speaking to different applications of AI writing relative to those other interpretations. Moreover, our methodology only considers the judgment of two raters; our findings may have been affected by rater bias and the raters’ understanding of the study’s intention. As discussed above, our analysis also revealed an anomaly in the assumption that more rhetorical elements would lead to a more effective output. Limitations are therefore present in our prompts or methodology, as the results are not completely consistent with our hypothesis and other findings. As such, our study is unable to address all nuances relative to the rhetorical situation and prompt engineering. This, however, presents instances and suggestions for future work to address these elements fully. Such recommendations include the use of other genres, as the current study focuses solely on selective genres relevant to technical writing; these findings may present differently for other styles, organizations, and contexts of writing. Additionally, future studies may concentrate on the line between the human writer and the AI tool; this may include objectively and accurately defining the roles delegated to AI writing tools and the credit assigned to the writers themselves. The current study demonstrates this line with the use of the rhetorical situation, a writing tool to be devised by the writer, but does not concentrate on this point. Similarly, as mentioned above, the current efforts focus on a specific analysis of the rhetorical situation; this presents an opportunity for future consideration of different interpretations of the rhetorical situation, prompt engineering, and their potential impacts on AI-generated work. This variance may demonstrate further levels of control in using AI writing tools. As the contexts and circumstances in which AI tools are used vary, further work may consider these possibilities.