
1 Technical Basics on Natural Language Generation

1.1 Introduction to Technical Aspects

Natural Language Generation (NLG) is a major subfield of Natural Language Processing and of Deep Learning more broadly. Recent breakthroughs in autoregressive models such as OpenAI’s GPT-2 (Radford et al. 2019), GPT-3 (Brown et al. 2019), InstructGPT (Ouyang et al. 2022) or Google’s Primer (So et al. 2021) have produced machine-generated texts that are demonstrably difficult, or even impossible, to distinguish from human-written texts. When NLG is used, liability claims can arise in any area where verbal communication takes place.

The first applications commercializing the technology are becoming available, with self-reinforcing chat bots, automated code generation and other products entering the market. In this work we explore these new application areas, focusing particularly on those that would likely be deemed a good fit for well-intentioned use but could lead to undesirable, negative results under certain conditions.

We further examine the capacity of both humans and algorithms to detect machine-generated text, as a means of mitigating the rapid spread of generated content in the form of news articles and similar material. Building on previous work that generated legal texts with an earlier generation of NLG tools, we train a more advanced autoregressive transformer model to illustrate how such models operate and at which points the operator of the model has direct or indirect influence on the generated output.

In the second part of the article, we examine civil liability issues that may arise from the use of NLG, focusing in particular on the Directive on defective products and on fault-based liability under Swiss law. Regarding the latter, we discuss specific legal bases that may give rise to liability when NLG is used.Footnote 1

1.2 Risks of Reinforcement Learning

1.2.1 Undesirable Language Generation

A possible way to adapt this and similar models to their users’ inputs is to apply reinforcement learning to the model. One such approach, using the transformer-based model introduced earlier, is to add relevant user input to the fine-tuning dataset. This allows the operator to adjust the model to the users’ behavior and, in theory, to improve overall readability and comprehension.

The potential danger of uncontrolled reinforcement learning that uses unfiltered user input, or of a main fine-tuning dataset that has not been carefully vetted, is illustrated by undesired outputs from NLG systems. Two prominent recent examples are Microsoft’s Twitter bot Tay in 2016 (Schwartz 2019) and IBM’s Watson in 2013 (Madrigal 2013).

In the case of Tay, the bot trained itself on unfiltered interactions with Twitter users who used inflammatory and offensive language. Based on those interactions, it began generating inflammatory and offensive language itself, even when responding to users who had not used any such language.

One recent approach, adopted by OpenAI, is content filtering at the input-prompt level (Markov et al. 2022). In an environment that requires a high degree of moderation, content filtering can be applied at the source to reject prompts that would likely result in a hateful, violent or otherwise undesirable response.Footnote 2 While this addresses the most extreme detectable input prompts to a degree, it does not guarantee a response free of input bias, which has to be addressed at the model level.
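To make the placement of such a filter concrete, the following minimal sketch gates every prompt through a blocklist check before it reaches the generator. It only illustrates where input-level moderation sits in the pipeline, not the classifier-based system described by Markov et al. (2022); the patterns and function names are invented for the example.

```python
import re

# Toy blocklist; a production system would use a trained moderation
# classifier rather than keyword matching.
BLOCKED_PATTERNS = [r"\bkill\b", r"\bhate\b"]

def is_allowed(prompt: str) -> bool:
    """Return False if the prompt matches any blocked pattern."""
    return not any(re.search(p, prompt, re.IGNORECASE) for p in BLOCKED_PATTERNS)

def moderated_generate(prompt: str, generate) -> str:
    """Pass the prompt to the text generator only if it clears the filter."""
    if not is_allowed(prompt):
        return "[prompt rejected by content filter]"
    return generate(prompt)

# Usage with any text-generation callable:
# moderated_generate("Write a short poem about spring", my_model_generate)
```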

1.2.2 Code Generation and Vulnerable Code Data

With the advancement of transformer-based text generation, it is becoming possible to train models on very specific and technically challenging tasks. One such emerging field is automated code generation based on natural language input. Probably the most famous and widely used tool is GitHub Copilot, which was released in June 2021.Footnote 3 It generates code sequences in a variety of languages given a comment, function name or surrounding code.

Copilot is largely based on the GPT-3 base model, which has then been fine-tuned on open-source code hosted on GitHub (Chen et al. 2021). Since there is no manual review of each entry, GitHub advises that the underlying dataset can contain insecure coding patterns that are in turn synthesized into generated code at the end-user level. A first evaluation of the generated code found that approximately 40% of the produced code excerpts contained security vulnerabilities in scenarios relevant to high-risk Common Weakness Enumeration categories (Pearce et al. 2021).

The unreviewed generation of potentially vulnerable code would pose a severe risk to the owner of that code, which makes a fully unreviewed or near-autonomous application of such a tool based on the underlying models unlikely. There are, however, automated code-review solutions that inspect a given code passage for potential quality and security issues (e.g. Sonar Footnote 4). The most reasonable semi-autonomous use of code generation would therefore seem to be enabling a non-technical operator to use natural language to generate simple code excerpts that are automatically scanned for vulnerabilities and deployed in a test environment used for rapid prototyping.
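The semi-autonomous setup sketched above can be pictured as a small pipeline that only promotes generated code to a test environment after an automated scan. The scanner and deployment steps below are hypothetical placeholders (a real setup would call a tool such as Sonar through its own interface); the sketch only shows where the automated review sits.

```python
import tempfile
from pathlib import Path

def scan_for_vulnerabilities(source: str) -> list[str]:
    """Placeholder for an automated code-review step.

    A real pipeline would invoke a security scanner here; this stub uses a
    deliberately trivial heuristic to mark where that check belongs.
    """
    findings = []
    if "eval(" in source:
        findings.append("use of eval() on possibly untrusted input")
    return findings

def deploy_to_test_environment(source: str) -> Path:
    """Placeholder: write the snippet where a test harness can pick it up."""
    path = Path(tempfile.mkdtemp()) / "generated_snippet.py"
    path.write_text(source)
    return path

def semi_autonomous_codegen(prompt: str, generate_code) -> None:
    """Generate code from natural language and gate it behind an automated scan."""
    source = generate_code(prompt)                  # e.g. a Codex-style model call
    findings = scan_for_vulnerabilities(source)
    if findings:
        print("Blocked; manual review required:", findings)
    else:
        print("Deployed to test environment:", deploy_to_test_environment(source))
```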

1.3 Detection of Machine Generated Text

With the wide availability of cloud computing enabling the production of machine-generated content, and social media enabling its mass distribution, the remaining barrier is the quality of the generated texts and the ability of regular content consumers to distinguish them from human-produced texts. Based on previous related work, the ability of non-trained evaluators depends to a certain degree on the subject domain as well as on the quality of the model itself (Peric et al. 2021). For excerpts of legal language sourced from several decades of US legal opinions, the ability to distinguish ranged from 49% (texts generated by GPT-2) to 53% (texts generated by Transformer-XL), both close to random guessing. Related work also shows that accuracy improved for the more creative domain of human-written stories prompted with “Once upon a time”, where evaluators reached 62% for GPT-2 but, again, a near-random 49% for GPT-3 (Clark et al. 2021). While it is unlikely that we will see machine-generated literature ready for mass consumption any time soon, one concerning finding is that the accuracy for detecting machine-generated news articles is only 57% for GPT-2 and a near-random 51% for GPT-3.

While most consumers have difficulty differentiating between machine- and human-generated texts, the same models can be trained to make that distinction. If the applied model (GPT-2/Transformer-XL) is known beforehand, the detection rate was between 94% and 97%, while not knowing the model in advance resulted in a detection rate of 74% to 76%. Detection would therefore likely be especially hard for models that are not open-sourced and cannot easily be replicated. This will increasingly be the case, as it already has been with GPT-3, which has been licensed to Microsoft with the model source code not publicly available.Footnote 5
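The principle behind such detectors can be illustrated with a deliberately simple sketch: a classifier trained on labelled examples of human-written and machine-generated passages. The cited studies fine-tune transformer models for this task; the bag-of-words classifier and the toy example texts below are assumptions made purely for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy labelled corpus (in practice: thousands of passages per class).
human_texts = [
    "We affirm the judgment of the district court for the reasons stated below.",
    "The district court did not abuse its discretion in denying the motion.",
]
generated_texts = [
    "The court held that the defendant was personally liable for the overt act.",
    "The Tenth Circuit held that the defendant participated in the conspiracy.",
]

texts = human_texts + generated_texts
labels = [0] * len(human_texts) + [1] * len(generated_texts)  # 1 = machine-generated

detector = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
detector.fit(texts, labels)

# Estimated probability that a new passage is machine-generated:
print(detector.predict_proba(["The judgment of the district court is affirmed."])[:, 1])
```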

1.4 Operator Influence on Output

1.4.1 General Remarks

When setting up tools that generate text content, there are only a few options to influence the output. The first and most basic layer of most NLG algorithms is the dataset used to train the base model. In the case of GPT-2, with its 1.5 billion parameters, the model was trained on the text of 8 million web pages, or about 40 GB of internet text. The selection was made in part by choosing outbound web pages from Reddit that had received at least 3 karma, as a rough form of human quality selection. In most cases this layer cannot be replicated by end-users of the tool and has to be taken as is (the option to train a model from scratch exists but is difficult to implement at a sufficiently sophisticated level for most end-users).

The second and more influenceable layer is the dataset used to “fine-tune” the model. This step allows end-users to specialize the output for a certain domain, a certain language style or similar; here the end-users have the highest degree of influence on the actual NLG output that will be generated. A particular domain, such as “legal language” for example, allows users to specialize the generated output so that it sounds quite similar, or even identical, to qualified legal language. Current trends for LLMs, as well as the technical limitations of most operators, will make this layer increasingly inaccessible, with most models only allowing operator interaction via a commercial API.Footnote 6 This approach limits operators’ influence, but it also leaves an auditable utilization trace that can always be traced back to the provider of the model used.

The third and most direct way to shape the quality of the output language lies in the basic parameter settings of the model. In general, and specifically in the case of our example model, these are the desired output text length, the initial text prompt, and the “temperature”. The temperature controls how sharply the model samples from its probability distribution: lowering it increases the likelihood of high-probability words and decreases the likelihood of low-probability words, which usually yields more conservative and repetitive text, while raising it gives less likely words a better chance and often produces more varied, natural-sounding output (Von Platen 2020). This layer is the most easily modifiable and the one end-users would interact with most. These parameters will likely become less available in most commercialized applications of LLMs, leaving the model provider more influence to optimize parameters and outputs based on existing optimized result lengths and probability scores.

An additional control that is sometimes in place is to exclude foul language from being generated. This can mean that even if the given text prompt or the underlying training or fine-tuning dataset contains foul language, the model will still never output words that are considered offensive according to a set keyword list.
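A minimal sketch of these operator-facing controls, using the publicly available GPT-2 checkpoint from the Hugging Face transformers library rather than our fine-tuned reference model, could look as follows; the prompt, length, temperature value and banned word are arbitrary choices for illustration.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The Tenth Circuit contravened those settled principles here."
inputs = tokenizer(prompt, return_tensors="pt")

# Words the model should never emit, encoded as token-id sequences.
banned = tokenizer(["damn"], add_special_tokens=False).input_ids

output = model.generate(
    **inputs,
    do_sample=True,                       # sample instead of greedy decoding
    temperature=0.7,                      # lower = sharper, higher = more varied
    max_length=100,                       # desired output length in tokens
    bad_words_ids=banned,                 # keyword-based exclusion list
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```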

1.4.2 Data and Methods

To further illustrate the direct impact that the operator’s parameter settings have on the output, we trained a GPT-Neo model (Black et al. 2021) on a legal text dataset, applying some of the methods and data of previous work (Peric et al. 2021) while using a newer and more advanced model.

Our empirical setting is U.S. Circuit Courts, the intermediate appellate courts in the federal court system. Circuit Court judges review the decisions of the District Courts, deciding whether to affirm or reverse, and explain their decision in a written opinion. Our corpus comprises 50,000 of these U.S. Circuit Court opinions, uniformly sampled from the universe of opinions for the years 1890 through 2010. The sample includes both lead (majority) opinions and addendum opinions (concurrences and dissents). We undertake minimal pre-processing, so that our generator can replicate the original style of the texts: we remove some metadata and XML markup but keep capitalization, punctuation, etc., and we preserve the special legal citation notation used by U.S. courts. The opinions are in general quite lengthy, containing an average of 2024 tokens (words) per article. The average length gradually decreased from the 1890s, reaching a minimum in the 1970s; after that, the average length of these opinions has grown steadily until the present day. Notably, it was around 1970 that digital legal research databases came into use.

Our approach to representing legal documents is an autoregressive language model: given this unsupervised corpus, we fine-tuned an existing GPT-Neo checkpoint on it.
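A minimal sketch of this fine-tuning step with the Hugging Face transformers and datasets libraries is shown below. The checkpoint size, file name, and hyperparameters are placeholder assumptions for illustration and do not reproduce our exact training configuration.

```python
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

checkpoint = "EleutherAI/gpt-neo-125M"            # small public GPT-Neo checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
tokenizer.pad_token = tokenizer.eos_token         # GPT-Neo has no pad token by default
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# One plain-text opinion per line (hypothetical file name).
dataset = load_dataset("text", data_files={"train": "circuit_opinions.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gpt-neo-legal",
                           num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```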

1.4.3 Samples of Operator Influence

We demonstrate how the end-user can interact with the model through fine-tuning and parameter selection with the following samples from our reference model. In our first sample (c.) we use the full-sized dataset with the default temperature value, which generates some meaningful results.

In the other two samples, we used the same full-sized dataset but changed the temperature parameter to 0.9 (d.) and 0.1 (e.) respectively. We observe that the sample generated at the higher temperature also has better readability: it does not consist of long lists of legal precedents but rather of more natural language.

(a.) Prompt: The Tenth Circuit contravened those settled principles here. Not one of the decisions relied upon (…)

(b.) Genuine Ending:

(…) by the Court of Appeals—Estate of Ceballos v. Husk, 919 F. 3d 1204 (CA10 2019), Hastings v. Barnes, 252 Fed. Appx. 197 (CA10 2007), Allen, 119 F. 3d 837, and Sevier v. Lawrence, 60 F. 3d 695 (CA10 1995)—comes close to establishing that the officers’ conduct was unlawful. The Court relied most heavily on Allen. But the facts of Allen are dramatically different from the facts here.

(c.) Full dataset, temperature default 0.7

(…) by the majority holds that a defendant who is not a party to a conspiracy cannot be held personally liable for an overt act in furtherance of the conspiracy. To the contrary, the Tenth Circuit held that a defendant is personally liable for an overt act if he “was a member of the conspiracy at the time of the overt act and participated in it in some manner.”

(d.) Full dataset, temperature high 0.9

(…) by the Tenth Circuit is even remotely analogous to this case. In United States v. Smith, 7 F.3d 1433 (10th Cir. 1993), the court held that a defendant’s motion to suppress evidence obtained in violation of the Fourth Amendment was properly denied because the government did not have probable cause to search the defendant’s house.

(e.) Full dataset, temperature very low 0.1

(…) by the Tenth Circuit in this case is inconsistent with the Supreme Court’s decision in United States v. Booker, 543 U.S. 220, 125 S. Ct. 738, 160 L. Ed. 2d 621 (2005). In Booker, the Supreme Court held that the mandatory nature of the Sentencing Guidelines rendered them unconstitutional. 543 U.S. at 244. The Court held that the Guidelines were unconstitutional because they were not “sufficiently reliable to support a finding of ‘reasonableness.’” Id. at 245.

2 Legal Aspects

2.1 Introduction to Legal Analysis

The use of artificial intelligence (AI) such as NLG algorithms creates numerous legal challenges, including liability issues. Most AI applications are designed to develop autonomously in order to deal with problems that their developers did not or could not have considered when programming them. As a result, self-learning AI can evolve in unforeseen ways. In the worst case, an algorithm can cause harm to others through dangerous self-learned behavior.

When using NLG, liability claims can occur in any area where verbal communication, be it oral or written, is used. Hence, a hospital or insurance company using NLG-based bots to communicate with patients, a lawyer using NLG to draft briefs, or a news outlet using NLG to draft articles may face liability claims if the algorithm’s output causes harm to others.

Legal literature dealing with the implications of AI for future tort claims focuses on the European Council’s directive on the liability for defective products (Directive) and on whether new liability provisions are necessary.Footnote 7 This article further analyzes how verbal communication generated by NLG algorithms can violate personal rights, infringe intellectual property rights or give rise to unfair competition claims.

2.2 Liability for Autonomous Actions of AI in General

2.2.1 Unforeseeable Actions of Self-Learning AI as a Challenge for Tort Law

The self-learning ability of AI poses major challenges for developers and operators. On the one hand, AI autonomously develops new solutions to problems. On the other hand, this same characteristic poses a tremendous challenge, as developers and operators are not always able to anticipate the risks that self-learning AI might pose to others.

One might conclude that actions an AI adopts through self-learning mechanisms are not foreseeable to its developers or operators, preventing them from implementing adequate countermeasures (Horner and Kaulartz 2016, p. 7; von Westphalen 2019, p. 889; Gordon and Lutz 2020, p. 58).Footnote 8 This perception would shake the foundations of tort law, as it implies the developer’s inability to control the risk stemming from AI (Weber 2017, n10).

Scholars have attempted to address the autonomy aspect of AI and have proposed various ideas based on existing liability law, such as analogies to the various types of vicarious liability (Borges 2019, p. 151; Zech 2019a, p. 215). As with other new technologies, some argue for a new legal basis to adequately regulate the risks and assign liability to manufacturers and operators (Zech 2019a, p. 214; Gordon and Lutz 2020, p. 61; Säcker et al. 2020, p. 823). Furthermore, some support an entirely new legal concept of e-persons, which would make the AI itself the defendant of a tort claim (Koch 2019, p. 115).

As with most new technologies, it must be carefully analyzed whether AI creates risks of a new quality or merely changes their quantity (Probst 2018, p. 41), as only the former requires the introduction of new liability rules. Whether AI qualifies as such has yet to be determined.

2.2.2 Respondent to Tort Claim

With AI causing harm to others, manufacturers and software developers will play a more significant role as defendants in tort claims, as they are the human minds to whom unwanted actions can be attributed. In the case of NLG algorithms, the generated output is based on the program code developed by the manufacturer of the AI, the person who may be blamed if the output has negative consequences (Conraths 2020, n73).

For AI algorithms, the self-learning phase, which takes place after the product has been put into circulation, becomes increasingly important (Grapentin 2019, p. 179). Due to this shift of product development into the post-marketing phase, legal scholars argue that not only the manufacturer of the AI bears a liability risk but also the operator (Spindler 2015, p. 767 et seq.; Reusch 2020, n178). For individual software, the operator may also be liable for a manufacturer’s actions if the latter can be considered a proxy of the operator and the operator cannot prove that it took all reasonable and necessary measures to instruct and supervise the proxy in order to prevent the damage from occurring (Kessler 2019, n16). For example, at news outlets that harness NLG, the editor-in-chief or other supervisory staff may be responsible for the proper functioning of the software and be liable if the software causes harm to others (Conraths 2020, n81).

2.2.3 Causality as the Limiting Factor of Liability

The fact that self-learning algorithms develop independently after the developer has put them into circulation makes it difficult to delimit each actor’s causal contribution to the damage (Ebers 2020, n194). In most cases, self-learning artificial agents (such as NLG) are not standard products but are individually tailored to the operator’s needs. Hence, the manufacturer and the operator act in concert when developing and training the AI for the operator’s use. Under Swiss law, if two defendants acted together, both are jointly and severally liable for all harm caused (Art. 50 Swiss Code of Obligations (“CO”)).

With AI applications, and NLG algorithms in particular, interaction with third parties, such as the operator’s customers, becomes increasingly important for the algorithm’s further development (Schaub 2019, p. 3). As recent real-life examples have shown, input generated by customers may have undesired effects on the AI’s behavior. In general, a manufacturer must take reasonable measures to prevent an algorithm from using unqualified input data (such as hate speech) to adapt its behavior (Eichelberger 2020, n23).Footnote 9 But a manufacturer cannot be expected to foresee every possible misuse of its product. Under Swiss law, a manufacturer can escape liability if it proves that a third actor’s unforeseeable actions were significantly more relevant in causing the damage than its own, thereby interrupting the chain of causality.Footnote 10

Similarly, the Directive provides that the manufacturer is not liable if it proves that it is probable that the defect which caused the damage did not exist at the time the manufacturer put the product into circulation, or that the defect came into being afterwards. Some authors argue that the user’s interactions with the AI may be the root cause of the harm and that the manufacturer therefore escapes liability.Footnote 11

Apart from these specific challenges, proving causation in any claim for damages is difficult and, in many cases, requires significant resources.Footnote 12 In many tort cases, no single cause will be identified as having caused the damage; rather, various causes will have partially contributed to the claimant’s damage (Zech 2019a, p. 207 et seq.). For the claimant, proof of causation will therefore remain a significant hurdle to compensation for damages (Spindler 2019, p. 139 et seq.).

2.3 Directive on Defective Products

2.3.1 General Remarks

Most NLG algorithms will cause economic losses that are not covered by the Directive (Art. 9) or by Swiss product liability law, which is congruent with the Directive. Nevertheless, it is conceivable that NLG algorithms will also cause personal injury or property damage. This is the case when an NLG algorithm provides wrong information that causes bodily harm to others (e.g., a doctor receiving a diagnosis from a device that uses flawed NLG to communicate, or a communications bot of a private emergency call facility giving false medical advice).

Scholars have extensively discussed whether the Directive applies to software (von Westphalen 2019, pp. 890, 892). Despite the ambiguity, most argue that software falls under the Directive.Footnote 13 To counter any remaining doubts, the EU Commission has published amendments to the Directive that explicitly name software as a product.Footnote 14 The following analysis therefore assumes that software qualifies as a product under the Directive.

Various aspects of the Directive are discussed in the legal literature, with two standing out. First, it must be determined whether the actions of an AI system are to be considered defective within the meaning of the Directive. Second, manufacturers of an AI system may be relieved of liability based on the state-of-the-art defense if they prove that, at the time the product was put into circulation, certain actions of the AI system, particularly those the system develops through self-learning mechanisms, could not have been foreseen with the technical means and scientific knowledge available at the time.

2.3.2 Defectiveness of an AI System

2.3.2.1 Consumer Expectancy Test

Many scholars struggle with how to determine whether an AI system is defective. The Directive considers a product to be defective if it does not provide the safety that a person may expect (Art. 6 (1) Directive). Hence, a product is defective if a reasonable consumer would find it defective considering the presentation of the product, the use to which it could reasonably be expected to be put, and the time when it was put into circulation. This test based on consumer expectations may not be adequate for determining the defectiveness of cutting-edge technology, as such expectations are hard to establish for lack of a point of reference (Lanz 2020, n745 et seq.). A risk-benefit approach that asks whether a reasonable alternative design would have significantly reduced the occurrence of harm may therefore be more appropriate (Wagner 2017, p. 731 et seq.).Footnote 15

2.3.2.2 AI Challenging the Notion of Defect

Various causes can account for the error of a software product. Some of them are easier to prove and do not challenge the definition of defectiveness set forth in the Directive. Among these are cases in which the manufacturer caused an error in the algorithm’s code, trained the algorithm (before putting it into circulation) with unsuitable data (Eichelberger 2020, n22), or did not implement adequate measures to prevent third parties from tampering with the code (e.g. hacking) (Eichelberger 2020, n22; Wagner 2017, p. 727 et seq.).Footnote 16 But other aspects that complicate the proof of a defect or challenge the understanding of the concept of defect arise with AI (see also Zech 2019a, p. 204).

From a technical standpoint, it is difficult to analyze the actions of an AI that led to a damage, because the relevant processes take place in a way not yet perceivable from the outside (the black-box problem).Footnote 17 Especially in the case of NLG, it may already be difficult for a claimant to prove that the output causing the damage was artificially generated at all, so that the Directive applies.Footnote 18

From a normative point of view, the fact that an algorithm may, through self-learning mechanisms, adopt behavior not intended by its developer challenges the perception of defectiveness. Scholars discuss various ways to determine the expectations of a reasonable consumer towards AI systems. AI agents outperform humans at specific tasks; comparing the outcomes of AI algorithms to those of a human therefore does not sufficiently account for the task-limited superior performance of AI (Wagner 2017, p. 734 et seq.). Comparing the results of two algorithms to determine the reasonable expectations of customers is no more suitable, as its consequence would be that only the best-performing algorithm is considered safe, while all others are defective (Wagner 2017, p. 737 et seq.).

Determining the defectiveness of an algorithm’s learning process may further prove difficult, as the process mainly unfolds after the product has been put into circulation and happens outside the manufacturer’s control, particularly with NLG (Binder et al. 2021, n44). The phase in which the NLG algorithm interacts with users challenges the understanding of defectiveness in particular: while the AI provides its services to users, it simultaneously improves its abilities, raising the question of whether the algorithm can be considered defective at the time it was put into use (Binder et al. 2021, n44).

As previous examples have shown, interaction with users can cause an algorithm to develop behavior not intended by the manufacturer (Zech 2019b, p. 192). It must be determined whether the manufacturer must provide reasonable measures to prevent the algorithm from evolving in an unintended manner (Eichelberger 2020, n23). Scholars agree that a manufacturer must implement safeguards to prevent an algorithm from incorporating inappropriate or illegal user behavior into its code. This may prove easier in theory than in practice, because it is very difficult to predict what user behavior might cause a self-learning algorithm to evolve in a way not intended by the manufacturer. If users interact with the AI in unpredictable ways that cause harm, the product cannot be considered defective (Zech 2019a, p. 213).

2.3.3 State of the Art Defense

A manufacturer can escape liability if it proves that a defect could not have been detected with the technical and scientific knowledge available when the product was put into circulation. New technologies with unknown negative effects, such as AI, may qualify for the state-of-the-art defense. Scholars therefore propose exempting certain applications from the state-of-the-art defense, as legislatures in various jurisdictions have done for other technologies such as GMOs and xenotransplantation (see for example: Junod 2019, p. 135; Eichelberger 2020, n20; disagreeing: Zech 2019a, p. 213).

When it comes to AI, the distinction between conditions that qualify as a defect of a product and those that fall under the state-of-the-art defense is difficult. Self-learning algorithms may develop undesired behavior that a diligent manufacturer could not foresee. But the fact that a manufacturer cannot foresee the potentially harmful behavior of its AI software does not automatically trigger the state-of-the-art defense.Footnote 19 Examples from the past showFootnote 20 that it is not sufficient that a manufacturer was unable to foresee a specific risk of its product. The defense can only be invoked if the manufacturer was also unable to anticipate a general risk of harm posed by the product (Wagner 2020, § 1 ProdHaftG n61; Zech 2019a, p. 213).

The drafters of the Directive intended this defense to apply only in very limited cases. Hence, to invoke the defense, manufacturers are required to have applied the utmost care and diligence in anticipating the negative effects of their product. Some authors argue that the risks of self-learning AI are already sufficiently well known to prevent manufacturers from successfully invoking the defense (von Westphalen 2019, p. 892; Zech 2019a, p. 213).

From a practical perspective, the hurdles to invoking the defense are significant as well. A manufacturer that invokes it would most probably have to reveal business secrets (such as the programming code) to the injured party, making it highly unlikely that the defense will become widely used to defend product liability claims (von Westphalen 2019, p. 892).

Finally, there is a wide array of possible applications for AI, and not every product category poses the same dangers to consumers. In most cases the inherent dangers of a conventional product represent the greatest risk of harm; enhancing these products with AI applications does not significantly increase that risk. A general exclusion of AI from the state-of-the-art defense would therefore fail to consider the individual risk of harm of each product category (Koch 2019, p. 114).

In conclusion, an exemption as proposed by some authors requires a more in-depth analysis of the specific risks of AI and their foreseeability. A general call for excluding new technologies from the defense is counterproductive and may deter manufacturers from investing in products using AI.

2.4 Liability for Negligence

In the absence of a specific provision allowing the injured party to claim compensation for damages, the general fault-based civil liability of Swiss law applies (Art. 41 Swiss Code of Obligations). For NLG, fault-based liability becomes relevant if the generated output violates personal rights, infringes intellectual property rights, or triggers the provisions of the Unfair Competition Act.

2.4.1 Infringement of Intellectual Property Rights

The output generated by NLG algorithms without human intervention is not protected by copyright due to the lack of creative input (Ragot et al. 2019, p. 574; Reymond 2019; Ebers et al. 2020, p. 9).Footnote 21 Hence, output generated by NLG algorithms can be used by other parties without violating copyright law or paying royalties for its use.

On the other hand, NLG algorithms may draw on sources available on the internet. The risk that they use copyrighted or patented works must be considered by their developers.Footnote 22

2.4.2 Personal Rights Violation

Several examples show that artificial intelligence algorithms for NLG may generate output that violates the personal rights of others (defamation, libel, etc.). Swiss law provides a victim of a personal rights violation with a range of remedies, from injunctions to claims for damages. The autonomy of NLG algorithms does not exclude the operator’s civil liability if the output generated by the NLG algorithm violates the personal rights of others (Art. 28 (1) Swiss Civil Code).Footnote 23 If the claimant proves that the operator of the NLG algorithm was at fault, he may seek monetary compensation (Art. 28a (3) Swiss Civil Code and Art. 41 Swiss Code of Obligations) (Meili 2018, Art. 28a n16).

News outlets are susceptible to claims if they make extensive use of NLG algorithms without proper oversight. As news cycles become shorter and new players become increasingly important, the risk that output generated by NLG infringes personal rights increases.

News outlets are not the only operators that may find themselves involved in defamation lawsuits when using NLG that does not work properly. In particular, rating portals that use NLG to create comments on businesses (e.g. aggregated from individual customer feedback) may violate personal rights if the (aggregated) feedback is wrong or violates the personal rights of others (Reymond 2019, p. 111 et seq.). Whether search engines or website owners that provide links to content violating the personal rights of others are also liable has not yet been determined under Swiss law.Footnote 24

2.4.3 Unfair Competition

Output of NLG algorithms may give rise to unfair competition claims if it violates fair competition requirements. Unfair competition issues involving NLG become relevant in all types of sales activities in which NLG is used to advertise products. This may involve widespread general advertising, automated comparisons with similar products of competitors, or descriptions tailored to individual customers to persuade them to purchase a particular product (Leeb and Schmidt-Kessel 2020, n6). With the advent of rating websites (such as Google Maps, Yelp, etc.), businesses benefit from good ratings, and NLG algorithms may help businesses easily create fake reviews. Creating or ordering fake reviews to unjustifiably improve the rating of one’s own business or weaken that of a competitor qualifies as unfair competition and may give a competitor a claim for damages.

The Swiss Unfair Competition Act (UWGFootnote 25) sanctions various forms of unfair competitive behavior. In particular, the law sanctions actions that mislead customers about the NLG operator’s own products or those of a competitor (Art. 3 (1) UWG).

The law provides for various remedies, such as injunctive relief, for injured persons, the state or professional associations (Art. 9 and 10 UWG). Injured persons may further claim damages based on the fault-based liability of Art. 41 CO.

2.4.4 Duty of Care

Owners of copyright-protected works or persons whose personal rights have been violated by NLG output have various legal remedies to act against the violation of their rights. Besides injunctions, the injured person may claim damages. The latter is based on the general fault-based liability provision of the Swiss Code of Obligations (Art. 41 CO). Hence, the claimant must prove, in addition to damage and causality, that the tortfeasor breached the applicable duty of care.

The duty of care is derived from legal or private standards, which for new technologies have yet to be established (Reusch 2020, n301). If specific standards are lacking, the general principles applicable to all sorts of dangerous activities apply. Thus, a person creating a risk of harm for others must take all necessary and reasonable precautions to prevent it.Footnote 26

The Swiss Supreme Court has already dealt with cases in which links from online blogs led to webpages that violated personal rights. Without an in-depth assessment, the Court concluded that the operator of a blog cannot constantly monitor the content of all linked webpages (Reymond 2019, p. 114 with further references). Similarly, the German Supreme Court concluded that a search engine operator cannot be held accountable for every personal rights violation in autocomplete suggestions generated by its software: the operator must only take reasonable measures to prevent violations of personal rights, and the smooth and efficient performance of the software should not be impeded by rigorous filtering systems.Footnote 27 On the other hand, the specific expertise of the manufacturer or the operator, which allows them to assess the risk that the AI agent may infringe third-party rights, must be considered when setting the applicable standard of care (Heinze and Wendorf 2020, n84). Furthermore, the operator is also responsible for regularly reviewing the datasets the algorithm uses to improve its abilities (Conraths 2020, n69). But despite careful planning, the developer or operator of an NLG algorithm may not always be able to predict who might be harmed by the algorithm, preventing it from taking measures against such harm (Weber 2017, n22; Binder et al. 2021, n46). Finally, with self-learning NLG algorithms in particular, developers and operators must prevent the algorithm from adopting harmful behavior through interaction with its users (Heinze and Wendorf 2020, n84).

3 Conclusion

NLG offers a wide array of possible applications. Cutting-edge algorithms make it possible to create verbal output that cannot be distinguished from human-created speech. A cat-and-mouse game is underway between those who program NLG and those who develop algorithms capable of determining whether a given output is human or artificial. As shown, the verification of computer-generated text is crucial from a legal perspective, as certain legal bases apply only to one or the other.

The self-learning function of artificial intelligence challenges tort law. Interaction with users can result in unintended behavior and, in the worst case, even cause harm. This raises delicate questions as to what extent a programmer or operator of an AI should be liable for its actions, as they might not always be able to anticipate future behavior that the AI derives from interaction with third parties.

Legal research will have to grapple for some time with how to deal with the specific challenges of AI before rashly giving in to the temptation of new legislation.