How to Design an AI Ethics Board

GovAI · Research Paper · 21 pages

Published: April 1, 2023

Jonas Schuett∗, Centre for the Governance of AI, jonas.schuett@governance.ai
Anka Reuel∗, Stanford University, anka@cs.stanford.edu
Alexis Carlier, Centre for the Governance of AI, alexispcarlier@gmail.com

Abstract

Organizations that develop and deploy artificial intelligence (AI) systems need to take measures to reduce the associated risks. In this paper, we examine how AI companies could design an AI ethics board in a way that reduces risks from AI. We identify five high-level design choices: (1) What responsibilities should the board have? (2) What should its legal structure be? (3) Who should sit on the board? (4) How should it make decisions and should its decisions be binding? (5) What resources does it need? We break down each of these questions into more specific sub-questions, list options, and discuss how different design choices affect the board’s ability to reduce risks from AI. Several failures have shown that designing an AI ethics board can be challenging. This paper provides a toolbox that can help AI companies to overcome these challenges.

1 Introduction

It is becoming increasingly clear that state-of-the-art artificial intelligence (AI) systems pose significant societal risks. AI systems used for drug discovery could be misused for the design of biochemical weapons [116]. A failure of AI systems used to control nuclear power plants or other critical infrastructure could also have devastating consequences [35]. Another concern is that, as models become larger and larger, certain dangerous capabilities might emerge at some point. Scholars and practitioners are increasingly worried about power-seeking behavior, situational awareness, and the ability to persuade people [25, 79, 82]. Organizations that develop and deploy AI systems need to take measures to reduce these risks to an acceptable level.

In this paper, we examine how AI companies could design an AI ethics board in a way that reduces risks from AI. By “ethics board”, we mean a collective body intended to promote an organization’s ethical behavior. Some AI companies already have an AI ethics board. For example, Meta’s Oversight Board makes binding decisions about the content on Facebook and Instagram [86, 58, 121]. Microsoft’s AI, Ethics and Effects in Engineering and Research (AETHER) Committee advises their leadership “on the challenges and opportunities presented by AI innovations” [68]. DeepMind’s Institutional Review Committee (IRC) oversees their human rights policy [34] and has already played a key role in the AlphaFold release [57].
These examples show that AI ethics boards are of practical relevance. But there have also been a number of failures. Google’s Advanced Technology External Advisory Council (ATEAC) faced significant resistance over the inclusion of controversial members. It was shut down only one week after its announcement [94, 5, 42, 118]. Axon’s AI and Policing Technologies Ethics Board was effectively discontinued in June 2022 after three years of operations [109]. Nine out of eleven members resigned after Axon announced plans to develop taser-equipped drones to be used in schools without consulting the board first [39]. (In late 2022, Axon announced their new ethics board: the Ethics & Equity Advisory Council [EEAC], which gives feedback on a limited number of products “through a racial equity and ethics lens” [11].) These cases show that designing an AI ethics board can be challenging. They also highlight the need for more research.

∗Equal contribution. arXiv:2304.07249v1 [cs.CY], 14 Apr 2023.

Although there has been some research on AI ethics boards, the topic remains understudied. The most important work for our purposes is a whitepaper by Accenture [101]. They discuss key benefits of AI ethics boards and identify key design questions. However, their discussion lacks both breadth and depth. They discuss only a handful of design considerations and do not go into detail. They also do not focus on leading AI companies and risk reduction. Besides that, there is some literature on the purpose [55, 115, 73] and practical challenges of AI ethics boards [93, 45]. There are also several case studies of existing boards, including Meta’s Oversight Board [121] and Microsoft’s AETHER Committee [78]. And finally, there is some discussion of the role of AI ethics boards in academic research [14, 112].

Taken together, there seem to be at least two gaps in the literature. First, there is only limited work on the practical question of how to design an AI ethics board. Second, there is no discussion of how specific design considerations can help to reduce risks from AI. In light of these gaps, the paper seeks to answer two research questions (RQs):

• RQ1: What are the key design choices that AI companies have to make when setting up an AI ethics board?

• RQ2: How could different design choices affect the board’s ability to reduce risks from AI?

The paper has two areas of focus. First, it focuses on organizations that develop state-of-the-art AI systems. This includes medium-sized research labs (e.g. OpenAI, DeepMind, and Anthropic) and big tech companies (e.g. Microsoft and Google). We use the term “AI company” or “company” to refer to them. Although we do not mention other types of companies (e.g. hardware companies), we expect that they might also benefit from our analysis. Second, the paper focuses on the board’s ability to reduce risks (see RQ2). By “risk”, we mean the “combination of the probability of occurrence of harm and the severity of that harm” [52]. (But note that there are other risk definitions [51].) In terms of severity, we focus on adverse effects on large groups of people and society as a whole, especially threats to their lives and physical integrity. We are less interested in financial losses and risks to organizations themselves (e.g. litigation or reputation risks). In terms of likelihood, we also consider low-probability, high-impact risks, sometimes referred to as “black swans” [113, 9, 60]. The two main sources of harm (“hazards”) we consider are accidents [1, 7] and cases of misuse [22, 41, 2].

In the following, we consider five high-level design choices: What responsibilities should the board have (Section 2)? What should its legal structure be (Section 3)? Who should sit on the board (Section 4)? How should it make decisions and should its decisions be binding (Section 5)? What resources does it need (Section 6)? We break down each of these questions into more specific sub-questions, list options, and discuss how they could affect the board’s ability to reduce risks from AI. The paper concludes with a summary of the most important design considerations and suggestions for further research (Section 7).

2 Responsibilities

What responsibilities should the board have? We use the term “responsibility” to refer to the board’s purpose (what it aims to achieve), its rights (what it can do), and duties (what it must do). The board’s responsibilities are typically specified in its charter or bylaws. In the following, we focus on responsibilities that could help to reduce risks from AI (see RQ2). The ethics board could advise the board of directors (Section 2.1), oversee model releases and publications (Section 2.2), support risk assessments (Section 2.3), review the company’s risk management practices (Section 2.4), interpret AI ethics principles (Section 2.5), or serve as a contact point for whistleblowers (Section 2.6). Note that these responsibilities are neither mutually exclusive nor collectively exhaustive. The board could also have more than one responsibility.

2.1 Advising the board of directors

The board of directors plays a key role in the corporate governance of AI companies [28]. It sets the company’s strategic priorities, is responsible for risk oversight, and has significant influence over management (e.g. it can replace senior executives). But since many board members only work part-time and rely on information provided to them by management, they need support from an independent ally in the company [33]. Internal audit can be this ally, but the ethics board could serve as an additional layer of assurance [102]. Options. The ethics board could provide strategic advice on various topics. It could advocate against high-risk decisions and call for a more prudent and wiser course.

• Research priorities. Most AI companies have an overarching research agenda (e.g. DeepMind’s focus on reinforcement learning [108] or Anthropic’s focus on empirical safety research [3]). This agenda influences what projects the company works on. The ethics board could try to influence that agenda. It could advocate for increasing focus on safety and alignment research [1, 47, 79]. More generally, it could caution against advancing capabilities faster than safety measures. The underlying principle is called “differential technological development” [19, 84, 100].

• Commercialization strategy. The ethics board could also advise on the company’s commercialization strategy. On the one hand, it is understandable that AI companies want to monetize their systems (e.g. to pay increasing costs for compute [105]). On the other hand, commercial pressure might incentivize companies to cut corners on safety [6, 76]. For example, Google famously announced that it would “recalibrate” the level of risk it is willing to take in response to OpenAI’s release of ChatGPT [43]. It has also been reported that disagreements over OpenAI’s commercialization strategy were the reason why key employees left the company to start Anthropic [119].

• Strategic partnerships. AI labs might enter into strategic partnerships with profit-oriented companies (see e.g. the extended partnership between Microsoft and OpenAI [67]) or with the military (see e.g. “Project Maven”, Google’s collaboration with the U.S. Department of Defense [29]). Although such partnerships are not inherently bad, they could contribute to an increase in risk (e.g. if they lead to nuclear weapons being equipped with AI technology [64]).

• Fundraising and M&A transactions. AI companies frequently need to bring in new investors. For example, in January 2023, it was reported that OpenAI raised $10B from Microsoft [48, 83]. But if new investors care more about profits, this could gradually shift the company’s focus away from safety and ethics towards profit maximization. The same might happen if AI companies merge or get acquired. The underlying phenomenon is called “mission drift” [44].

Discussion. How much would advising the board of directors reduce risk? This depends on many different factors. It would be easier if the ethics board has a direct communication channel to the board of directors, ideally to a dedicated risk committee. It would also be easier if the board of directors is able to do something about risks. They need risk-related expertise and governance structures to exercise their power (e.g. a chief risk officer [CRO] as a single point of accountability). But the board of directors also needs to take risks seriously and be willing to do something about them. This will often require a good relationship between the ethics board and the board of directors. Conversely, it would be harder for the ethics board to reduce risk if the board of directors mainly cares about other things (e.g. profits or prestige), especially since the ethics board is usually not able to force the board of directors to do something.

2.2 Overseeing model releases and publications

Many risks are caused by accidents [1, 7] or the misuse of specific AI systems [22, 41, 2]. In both cases, the deployment decision is a decisive moment. Ideally, companies should discover potential failure modes and vulnerabilities before they deploy a system, and stop the deployment process if they cannot reduce risks to an acceptable level. But not all risks are caused by the deployment of individual models. Some risks also stem from the publication of research, as research findings can be misused [116, 22, 41, 2, 8, 107, 21]. Risks that arise from the dissemination of potentially harmful information, including research findings, are called “infohazards” [20, 62]. Publications can also fuel harmful narratives. For example, it has been argued that the “arms race” rhetoric is highly problematic [26]. Options. An ethics board could try to reduce these risks by creating a release strategy [111, 110, 81] and norms for the responsible publication of research [32, 8, 106, 91]. For example, the release strategy could establish “structured access” as the norm for deploying powerful AI systems [106]. Instead of open-sourcing new models, companies might want to deploy them via an application programming interface (API), which would allow them to conduct know-your-customer (KYC) screenings and restrict access if necessary, while allowing the world to use and study the model. The release strategy could also specify instances where a “staged release” seems adequate. Staged release refers to the strategy of releasing a smaller model first, and only releasing larger models if no meaningful cases of misuse are observed. OpenAI coined the term and championed the approach when releasing GPT-2 [111]. But note that the approach has also been criticized [32]. The ethics board could also create an infohazard policy. The AI research organization Conjecture has published its policy [62]. We expect that most AI companies have similar policies but do not make them public.
In addition to that, the board could oversee specific model releases and publications (not just the abstract strategies and policies). It could serve as an institutional review board (IRB) that cares about safety and ethics more generally, not just the protection of human subjects [14, 112]. In particular, it could review the risks of a model or publication itself, do a sanity check of existing reviews, or commission an external review (Section 2.3). Discussion. How much would this reduce risk? Among other things, this depends on whether board members have the necessary expertise (Section 4.4), whether the board’s decisions are binding (Section 5.2), and whether they have the necessary resources (Section 6). The decision to release a model or publish research is one of the most important points of intervention for governance mechanisms that are intended to reduce risks. An additional attempt to steer such decisions in a good direction therefore seems desirable.
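As a rough illustration, the staged release logic described in this section could be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure any company actually uses: the `Stage` record, the `misuse_reports` counter, and the zero-tolerance default threshold are all hypothetical and chosen only to make the rule "release larger models only if no meaningful misuse is observed" concrete.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Stage:
    """One step of a hypothetical staged release (names and sizes are illustrative)."""
    name: str
    parameters_millions: int
    misuse_reports: int = 0  # meaningful misuse cases observed after this release


def next_stage_allowed(released: list[Stage], max_misuse_reports: int = 0) -> bool:
    """Allow the next, larger release only if every earlier stage stayed at or
    below the tolerated number of meaningful misuse cases (an assumed threshold)."""
    return all(s.misuse_reports <= max_misuse_reports for s in released)


def plan_release(stages: list[Stage]) -> list[str]:
    """Walk through the stages from smallest to largest model and stop at the
    first point where observed misuse blocks further releases."""
    released: list[Stage] = []
    for stage in sorted(stages, key=lambda s: s.parameters_millions):
        if not next_stage_allowed(released):
            break
        released.append(stage)
    return [s.name for s in released]
```

In practice, what counts as a "meaningful" case of misuse is a judgment call. An ethics board that oversees releases would presumably have to define it rather than rely on a simple counter.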

2.3 Supporting risk assessments

By “risk assessment”, we mean the identification, analysis, and evaluation of risks [52, 51]. Assessing the risks of state-of-the-art AI systems is extremely difficult: (1) The risk landscape is highly complex and evolves rapidly. For example, the increasing use of so-called “foundation models” [18] might lead to new diffuse and systemic risks (e.g. threats to epistemic security [104]). (2) Defining normative thresholds is extremely difficult: What level of risk is acceptable? How fair is fair enough? (3) In many cases, AI companies are also detached from the people who are most affected by their systems, often historically marginalized communities [70, 16]. (4) Risk assessments might become even more difficult in the future. For example, systems might become capable of deceiving their operators and only “pretending” to be safe in a testing environment [79]. Options. The ethics board could actively contribute to the different steps of a risk assessment. It could use a risk taxonomy to flag missing hazards [120], comment on a heatmap that illustrates the likelihood and severity of a risk [50], or try to circumvent a safety filter [99]. It could also commission a third-party audit [97, 23, 37, 98, 72, ?] or red team [40, 92, 99]. It could report its findings to the board of directors which would have the necessary power to intervene (Section 2.1). Depending on its power, it might even be able to veto or at least delay deployment decisions (Section 5.2). Discussion. Some companies already take extensive measures to assess risks before deploying state-of-the-art AI systems [57, 24, 3]. It is unclear how much value the support of an ethics board would add to such efforts. But especially when dealing with catastrophic risks, having an additional “layer of defense” seems generally desirable. The underlying concept is called “defense in depth” [30]. This approach could be seen as a solution to the problem that “there is no silver bullet” [24].
But supporting risk assessments could also have negative effects. If other teams rely on the board’s work, they might assess risks less thoroughly. This would be particularly problematic if the board is not able to do it properly (e.g. it can only perform sanity checks). But this effect could be mitigated by clearly communicating expectations and creating appropriate incentives.
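To make the heatmap mentioned in this section more concrete, the following sketch classifies each cell of a likelihood-severity grid. It follows the definition of risk as a combination of probability and severity, but everything else is an assumption made for illustration: the five-point ordinal scales, the product score, and the cut-off values are hypothetical and not taken from the paper or from any standard.

```python
from __future__ import annotations

# Assumed five-point ordinal scales (illustrative labels only).
LIKELIHOOD = ["rare", "unlikely", "possible", "likely", "almost certain"]
SEVERITY = ["negligible", "minor", "moderate", "major", "catastrophic"]


def risk_level(likelihood: str, severity: str) -> str:
    """Classify one heatmap cell from ordinal likelihood and severity ratings.

    The score is the product of the two 1-5 ranks. The cut-offs are assumed
    values; with them, a catastrophic harm is never rated "low", even when it
    is rare, which keeps low-probability, high-impact risks visible.
    """
    score = (LIKELIHOOD.index(likelihood) + 1) * (SEVERITY.index(severity) + 1)
    if score >= 15:
        return "high"
    if score >= 5:
        return "medium"
    return "low"


def heatmap() -> list[list[str]]:
    """Build the full grid: one row per likelihood, one column per severity."""
    return [[risk_level(l, s) for s in SEVERITY] for l in LIKELIHOOD]
```

A board commenting on such a heatmap would mostly be challenging the inputs (is "rare" really the right rating?) and the cut-offs, which is where the normative questions raised above come in.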

2.4 Reviewing risk management practices

Instead of or in addition to supporting specific risk assessments (Section 2.3), the ethics board could review the company’s risk management practices more generally. In other words, it could try to improve the company’s “risk governance” [117, 63]. Risk management practices at AI companies seem to be less advanced compared to other industries like aviation [49]. “They might look good on paper, but do not work in practice” [102]. There are not yet any established best practices and companies rarely adhere to best practices from other industries (though there are promising developments around risk management standards). And practices that companies develop themselves might not be as effective. For example, there might be blind spots for certain types of risks (e.g. diffuse or systemic risks) or they might not account for cognitive biases (e.g. availability bias or scope neglect [122]). Options. The ethics board could assess the adequacy and effectiveness of the company’s risk management practices. It could assess whether the company complies with relevant regulations [103], standards [80, 53], or its own policies and processes. It could also try to find flaws in a more open-ended fashion. Depending on its expertise and capacity, it could do this on its own (e.g. by reviewing risk-related policies and interviewing people in risk-related positions) or commission an external review of risk management practices (e.g. by an audit firm [71]). Note that this role is usually performed by the company’s internal audit function, but the ethics board could provide an additional layer of assurance [102]. They could report their findings directly to the risk committee of the board of directors and the chief risk officer (CRO) who could make risk management practices more effective. Discussion. If companies already have an internal audit function, the additional value would be limited; the ethics board would merely be an additional defense layer [102]. 
However, if companies do not already have an internal audit function, the added value could be significant. Without a deliberate attempt to identify ineffective risk management practices, some limitations will likely remain unnoticed [102]. But the value ultimately depends on the individuals who conduct the review. This might be problematic because it will require a very specific type of expertise that most members of an ethics board do not have (Section 4.4). It is also very time-consuming, so a part-time board might not be able to do it properly (Section 4.5). Both issues should be taken into account when appointing members.

2.5 Interpreting AI ethics principles

Many AI companies have ethics principles [54, 46], but “principles alone cannot guarantee ethical AI” [69]. They are necessarily vague and need to be put into practice [74, 123, 104]. Options. The ethics board could interpret principles in the abstract (e.g. defining terms or clarifying the purpose of specific principles) or in concrete cases (e.g. whether a new research project violates a specific principle). In doing so, it could influence a wide range of risk-related decisions. For example, the board might decide that releasing a model that can easily be misused would violate the principle “be socially beneficial”, which is part of Google’s AI principles (Google, n.d.). When interpreting principles, the board could take a risk-based approach: the higher the risk, the more the company needs to do to mitigate it [12, 65, 27]. The board could also suggest amendments to the principles. Discussion. How much would this reduce risk? It will be more effective if the principles play a key role within the company. For example, Google’s motto “don’t be evil”—which it quietly removed in 2018—used to be part of its code of conduct and, reportedly, had a significant influence on its culture [31]. Employees could threaten to leave the company or engage in other forms of activism if principles are violated [13]. Interpreting ethics principles would also be more effective if the board’s interpretation is binding (Section 5.2), and if the principles are public, mainly because civil society could hold the company accountable [28]. It would be less effective if the principles are mainly a PR tool. This practice is called “ethics washing” [15, 104, 38].

2.6 Contact point for whistleblowers

Detecting misconduct is often difficult: it is hard to observe from the outside, while insiders might not report it because they face a conflict between personal values and loyalty [56, 36] or because they fear negative consequences [17]. For example, an engineer might find a severe safety flaw, but the research lead wants to release the model nonetheless and threatens to fire the engineer if they speak up. In such cases, whistleblower protection is vital. Options. An ethics board could protect whistleblowers by providing a trusted contact point. The ethics board could report the case to the board of directors, especially the board risk committee, who could engage with management to do something about it. It could also advise the whistleblower on steps they could take to protect themselves (e.g. seeking legal assistance) or to do something about the misconduct (e.g. leaking the information to the press or a government agency). Discussion. The ethics board would be more trustworthy than other organizational units (at least if it is independent from management). But since it would still be part of the company (Section 3.2), or at least in a contractual relationship with it (Section 3.1), confidentiality would be less of a problem. This can be particularly important if the information is highly sensitive and its dissemination could be harmful in itself [116, 20, 21]. The ethics board can only serve this role if employees trust it, if they know about the board’s commitment to whistleblower protection, and if at least one board member has relevant expertise and experience. For more information on the drivers of effective whistleblowing, we refer to the relevant literature [77, 4].
Anecdotally, whistleblowing within large AI companies has had some successes, though it did not always work [28]. Overall, this role seems very promising, but the issue is highly delicate and could easily make things worse.

3 Structure

What should the board’s (legal) structure be? We can distinguish between external (Section 3.1) and internal structures (Section 3.2). The board could also have substructures (Section 3.3).

[Figure 1: Three potential structures of an external ethics board (panels a, b, and c)]

3.1 External boards

The ethics board could be external. The company and the ethics board could be two separate legal entities. The relationship between the two entities would then be governed by a contract. Options. The ethics board could be a nonprofit organization (e.g. a 501(c)(3)) or a for-profit company (e.g. a public-benefit corporation [PBC]). The individuals who provide services to the company could be members of the board of directors of the ethics board (Figure 1a). Alternatively, they could be a group of individuals contracted by the ethics board (Figure 1b) or by the company (Figure 1c). There could also be more complex structures. For example, Meta’s Oversight Board consists of two separate entities: a purpose trust and a limited liability company (LLC) [90, 114]. The purpose trust is funded by Meta and funds the LLC. The trustees are appointed by Meta; they appoint the individuals and manage the LLC. The individuals are contracted by the LLC and provide services to Facebook and Instagram (Figure 2). Discussion. External ethics boards have a number of advantages: (1) They can legally bind the company through the contractual relationship (Section 5.2). This would be much more difficult for internal structures (Section 3.2). (2) The board would be more independent, mainly because it would be less affected by internal incentives (e.g. board members could prioritize the public interest over the company’s interests). (3) It would be a more credible commitment because it would be more effective and more independent. The company might therefore be perceived as being more responsible. (4) The ethics board could potentially contract with more than one company. In doing so, it might build up more expertise and benefit from economies of scale. But external boards also have disadvantages. We expect that few companies are willing to make such a strong commitment, precisely because it would undermine the company’s own independence. It might also take longer to get the necessary information and a nuanced view of the inner workings of the company (e.g. norms and culture).

[Figure 2: Structure of Meta’s Oversight Board]
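The structure of Meta’s Oversight Board described in this section can also be written down as a small set of labeled relationships. The sketch below only transcribes the relationships stated in the text. The edge-list representation, the entity labels, and the helper functions `relations_from` and `funding_chain` are illustrative assumptions, not part of the source.

```python
from __future__ import annotations

# Each edge is (source, relation, target), transcribed from the description
# of the Oversight Board in the text; the labels themselves are illustrative.
EDGES = [
    ("Meta", "funds", "Purpose Trust"),
    ("Meta", "appoints", "Trustees"),
    ("Purpose Trust", "funds", "LLC"),
    ("Trustees", "manage", "LLC"),
    ("Trustees", "appoint", "Individuals"),
    ("LLC", "contracts", "Individuals"),
    ("Individuals", "provide services to", "Facebook"),
    ("Individuals", "provide services to", "Instagram"),
]


def relations_from(entity: str) -> list[tuple[str, str]]:
    """List all (relation, target) pairs that start at the given entity."""
    return [(rel, dst) for src, rel, dst in EDGES if src == entity]


def funding_chain(start: str) -> list[str]:
    """Follow "funds" edges from a starting entity until the money stops."""
    chain = [start]
    current = start
    while True:
        funded = [dst for src, rel, dst in EDGES if src == current and rel == "funds"]
        if not funded or funded[0] in chain:
            break
        current = funded[0]
        chain.append(current)
    return chain
```

Writing the structure down this way makes one design feature easy to see: the individuals who review cases are paid through the trust and the LLC rather than directly by the company.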

3.2 Internal boards

The ethics board could also be part of the company. Its members would be company employees. And the company would have full control over the board’s structure, its activities, and its members. Options. An internal board could be a team, i.e. a permanent group of employees with a specific area of responsibility. But it could also be a working group or committee, i.e. a temporary group of employees with a specific area of responsibility, usually in addition to their main activity. For example, DeepMind’s IRC seems to be a committee, not a team [57, 34]. Discussion. The key advantage of internal boards is that it is easier for them to get information (e.g. because they have a better network within the organization). They will typically also have a better understanding of the inner workings of the company (e.g. norms and culture). But internal structures also have disadvantages. They can be disbanded at the discretion of senior management or the board of directors. It would be much harder to play an adversarial role and openly talk about risks, especially when potential mitigations are in conflict with other objectives (e.g. profits). The board would not have much (legal) power. Decisions cannot be enforced. To have influence, it relies on good relationships with management (if collaborative) or the board of directors (if adversarial). Finally, board members would be less protected from repercussions if they advocate for unfavorable measures.

3.3 Substructures

Both internal and external boards could have substructures. Certain responsibilities could be delegated to a part of the ethics board. Options. Two common substructures are committees and liaisons. (Note that an internal ethics board can be a committee of the company, but the ethics board can also have committees.) (1) Committees could be permanent (for recurring responsibilities) or temporary (to address one-time issues). For example, the board could have a permanent “deployment committee” that reviews model releases (Section 2.2), or it could have a temporary committee for advising the board on an upcoming M&A transaction. For more information about the merits of committees in the context of the board of directors, we refer to the relevant literature. Meta’s Oversight Board has two types of committees: a “case selection committee” which sets criteria for cases that the board will select for review, and a “membership committee” which proposes new board members and recommends the removal or renewal of existing members [87]. They can also set up other committees. (2) Liaisons are another type of substructure. Some members of the ethics board could join specific teams or other organizational structures (e.g. attend meetings of research projects or the board of directors). They would get more information about the inner workings of the company and can build better relationships with internal stakeholders (which can be vital if the board wants to protect whistleblowers, see Section 2.6). Conversely, non-board members could be invited to attend board meetings. This could be important if the board lacks the necessary competence to make a certain decision (Section 4.4). For example, they could invite someone from the technical safety team to help them interpret the results of a third-party model audit. Microsoft’s AETHER Committee regularly invites engineers to working groups [66]. Discussion. On the one hand, substructures can make the board more complex and add friction. On the other hand, they allow for faster decision-making because fewer people are involved and group discussions tend to be more efficient. Against this background, we expect that substructures are probably only needed in larger ethics boards (Section 4.3).

4 Membership

Who should sit on the board? In particular, how should members join (Section 4.1) and leave the board (Section 4.2)? How many members should the board have (Section 4.3)? What characteristics should they have (Section 4.4)? How much time should they spend on the board (Section 4.5)? And should they be compensated (Section 4.6)?

4.1 Joining the board

How should members join the board? Options. We need to distinguish between the appointment of the initial and subsequent board members. Initial members could be directly appointed by the company’s board of directors. But the company could also set up a special formation committee which appoints the initial board members. The former was the case at Axon’s AI and Policing Technologies Ethics Board [10], the latter at Meta’s Oversight Board [88]. Subsequent board members are usually appointed by the board itself. Meta’s Oversight Board has a special committee that selects subsequent members after a review of the candidates’ qualifications and a background check [88]. But they could also be appointed by the company’s board of directors. Candidates could be suggested (not appointed) by other board members, the board of directors, or the general public. At Meta’s Oversight Board, new members can be suggested by other board members, the board of directors, and the general public [88]. Discussion. The appointment of initial board members is particularly important. If the company does not get this right, it could threaten the survival of the entire board. For example, Google appointed two controversial members to the initial board which sparked internal petitions to remove them and contributed to the board’s failure [94]. The appointment should be done by someone with enough time and expertise. This suggests that a formation committee will often be advisable. The board would be more independent if it can appoint subsequent members itself. Otherwise, the company could influence the direction of the ethics board over time.

4.2 Leaving the board

How should members leave the board? Options. There are at least three ways in which members could leave the board. First, their term could expire. The board’s charter or bylaws could specify a term limit. Members would leave the board when their term expires. For example, at Meta’s Oversight Board, the term ends after three years, but appointments can be renewed twice [88]. Second, members could resign voluntarily. While members might resign for personal reasons, a resignation can also be used to express protest. For example, in the case of Google’s ATEAC, Alessandro Acquisti announced his resignation on Twitter to express protest against the setup of the board [5]. Similarly, in the case of Axon’s AI and Policing Technologies Ethics Board, nine out of eleven members publicly resigned after Axon announced plans to develop taser-equipped drones to be used in schools without consulting the board first [39]. Third, board members could be removed involuntarily. Discussion. Since any removal of board members is a serious step, it should only be possible under special conditions. In particular, it should require a special majority and a special reason (e.g. a violation of the board’s code of conduct or charter). To preserve the independence of the board, it should not be possible to remove board members for substantive decisions they have made.

Table 1: Size of different AI ethics boards

Ethics board | Members | Source
Meta’s Oversight Board | 22 | [89]
Microsoft’s AETHER Committee | 20 | [78]
Google’s ATEAC | 8 | [118]
Axon’s AI and Policing Technologies Ethics Board | 11 | [10]
Axon’s Ethics & Equity Advisory Council | 11 (US), 7 (UK) | [11]

4.3 Size of the board

How many members should the board have? Options. In theory, the board can have any number of members. In practice, most boards have between 10 and 20 members (Table 1). Discussion. On the one hand, larger boards can work on more cases and they can go into more detail. They can also be more diverse [45]. On the other hand, it will often be difficult to find enough qualified people. Group discussions in smaller boards tend to be easier and it is easier to reach consensus (e.g. if a qualified majority is required [Section 5.1]). Smaller boards allow for closer personal relationships between board members. But conflicts of interest could have an outsized effect in smaller boards. As a rule of thumb, the number of members should scale with the board’s workload (“more cases, more members”).

4.4 Characteristics of members

What characteristics should board members have? Options. When appointing board members, companies should at least consider candidates’ expertise, diversity, seniority, and public perception. Discussion. (1) Different boards will require different types of expertise [101]. But we expect most boards to benefit from technical, ethical, and legal expertise. (2) Members should be diverse along various dimensions, such as gender, race, and geographical representation [45]. For example, Meta’s Oversight Board has geographic diversity requirements in its bylaws [87]. They should adequately represent historically marginalized communities [70, 16]. Diverse perspectives are particularly important in the context of risk assessment (Section 2.3). For example, this will make it more likely that unprecedented risks are identified. (3) Board members may be more or less senior. By “seniority”, we mean a person’s position of status which typically corresponds to their work experience and is reflected in their title. More senior people tend to have more subject-matter expertise. The board of directors and senior management might also take them more seriously. As a consequence, it might be easier for them to build trust, get information, and influence key decisions. This is particularly important for boards that only advise and are not able to make binding decisions. However, it will often be harder for the company to find senior people. And in many cases, the actual work is done by junior people. (4) Finally, some board members might be “celebrities”. They would add “glamor” to the board, which the company could use for PR reasons. Conversely, appointing highly controversial candidates (e.g. candidates who express sympathy for extreme political views) might put off other candidates and undermine the board’s credibility.

4.5 Time commitment

How much time should members spend on the board? Options. Board members could work full-time (around 40 hours per week), part-time (around 15-20 hours per week), or even less (around 1-2 hours per week or as needed). None of the existing (external) boards seem to require full-time work. Members of Meta’s Oversight Board work part-time [59]. And members of Axon’s AI and Policing Technologies Ethics Board only had two official board meetings per year, with ad-hoc contact between these meetings [10]. Discussion. The more time members spend working on the board, the more they can engage with individual cases. This would be crucial if cases are complex and stakes are high (e.g. if the board supports pre-deployment risk assessments, see Section 2.3). Full-time board members would also get a better understanding of the inner workings of the company. For some responsibilities, the board needs this understanding (e.g. if the board reviews the company’s risk management practices, see Section 2.4). However, we expect it to be much harder to find qualified candidates who are willing to work full-time because they will likely have existing obligations or other opportunities. This is exacerbated by the fact that the relevant expertise is scarce. And even if a company finds qualified candidates who are willing to work full-time, hiring several full-time members can be a significant expense.

4.6 Compensation

Should board members be compensated? Options. There are three options. First, serving on the ethics board could be unpaid. Second, board members could get reimbursed for their expenses (e.g. for traveling or for commissioning outside expertise). For example, Axon paid its board members $5,000 per year, plus a $5,000 honorarium per attended board meeting, plus travel expenses [10]. Third, board members could be fully compensated, either via a regular salary or honorarium. For example, it has been reported that members of Meta’s Oversight Board are being paid a six-figure salary [59]. Discussion. Not compensating board members or only reimbursing their expenses is only reasonable for part-time or light-touch boards. Full-time boards need to be compensated. Otherwise, it will be extremely difficult to find qualified candidates. For a more detailed discussion of how compensation can affect independence, see Section 6.1.

5 Decision-making

5.1 Decision-making process

How should the board make decisions? Options. We expect virtually all boards to make decisions by voting. This raises a number of questions:

• Majority. What majority should be necessary to adopt a decision? Boards could vote by absolute majority, i.e. a decision is adopted if it is supported by more than 50% of votes. For certain types of decisions, the board may also require a qualified majority (e.g. a unanimous vote or a 67% majority). Alternatively, boards could vote by plurality (or relative majority), i.e. a decision is adopted if it gets more votes than any other option, even if it does not receive more than half of all votes cast. The majority could be calculated based on the total number of board members (e.g. if the board has 10 members, 6 votes would constitute a simple majority), or the number of members present (e.g. if 7 members are present, 4 votes would constitute a simple majority). At Meta’s Oversight Board, “outcomes will be determined by majority rule, based on the number of members present” [87].

• Voting rights. Who should be able to vote? There are three options. First, all board members could have voting rights. Second, only some board members could have voting rights. For example, only members of subcommittees could be able to vote on issues related to that subcommittee. This is the case at Meta’s Oversight Board [87]. It would also be conceivable that some members only advise on special issues; they might be less involved in the board’s day-to-day work. These board members, while formally being part of the board, might not have voting rights. Third, non-board members could have (temporary) voting rights. For example, the board could ask external experts to advise on specific issues. These experts could be granted voting rights for this particular issue.

• Voting power. A related, but different question is: how much should a vote count? We expect this question to be irrelevant for most boards, as “one person, one vote” is so commonsensical. However, in some cases, boards may want to deviate from this. For example, the board could use quadratic voting, which allows individuals to express the degree of their preferences, rather than just the direction of their preferences [96, 61].

• Quorum. What should the minimum number of members necessary to vote be? This is called a “quorum”. In principle, the quorum can be anything between one and all board members, though there might be legal requirements for some external structures. A natural quorum is the number of board members who could constitute a majority (e.g. more than 50% of board members if a simple majority is sufficient). It is also possible to have a different quorum for different types of decisions. Note that a lack of quorum might make the decision void or voidable.

• Voting method. How should the board vote? The most common voting methods are paper ballot, show of hands, postal vote, and electronic vote (e.g. using a voting app). According to its bylaws, voting at Meta’s Oversight Board takes place “in-person or electronically” [87].

• Abstention. In some cases, board members may want to abstain from a vote (e.g. because they do not feel adequately informed about the issue at hand, are uncertain, or mildly disapprove of the decision, but do not want to actively oppose it). Abstention could always be permitted or prohibited. The board could also allow abstention for some decisions, but not for others. Board members must abstain if they have a conflict of interest. At Meta’s Oversight Board, abstention is only prohibited for one type of decision, namely for case deliberation [87].

• Proxy voting. Some board members may want to ask someone else to vote on their behalf. This is called “proxy voting”. Proxy voting could always be permitted or prohibited. The board could also allow proxy voting under certain circumstances (e.g. in the event of illness), only for certain decisions (e.g. less consequential decisions), or upon request. Meta’s Oversight Board does not allow proxy voting [87].

• Frequency of board meetings. How often should the board meet to vote? There are three options. First, the board could meet periodically (e.g. weekly, monthly, quarterly, or annually). Second, the board could meet on an ad hoc basis. Special meetings could be arranged at the board’s discretion, upon request by the company, and/or based on a catalog of special occasions (e.g. prior to the deployment of a new model). Third, the board could do both, i.e. meeting periodically and on an ad hoc basis. Meta’s Oversight Board meets annually and has special board meetings “in emergency or exceptional cases” [87]. Google’s ATEAC planned to have four meetings per year [118].

• In-person or remote meetings. Should board meetings be held in person or remotely? We expect this design choice to be less important than most others, but it is a necessary one nonetheless. At Meta’s Oversight Board, meetings take place in person, though it does allow exceptions “in limited and exceptional circumstances”; its committees meet either in person or remotely [87].

• Preparation and convocation of board meetings. How should board meetings be prepared and convened? More precisely, who can convene a board meeting? What is the notice period? How should members be invited? What should the invitation entail? And do members need to indicate if they will attend? At Meta’s Oversight Board, “written notice of periodic and special meetings must specify the date, time, location, and purpose for convening the board. This notice will be provided at least eight weeks in advance for in-person convenings and, unless in case of imminent emergency, at least two days in advance for remote convenings. Members are required to acknowledge receipt of this notice and also indicate their attendance in a timely fashion” [87].

• Documentation and communication of decisions. Finally, it needs to be specified how decisions are documented and communicated. More precisely, which decisions should be documented and communicated? What exactly should be documented and communicated? And who should get access to the documentation? At Meta’s Oversight Board, “minutes will be taken and circulated to board members within one week” [87]. It does not publicly release meeting minutes, but has sometimes allowed reporters in its meetings. Google’s ATEAC planned to “publish a report summarizing the discussions” [118]. Axon’s AI Ethics Board published two annual reports [95]. In their 2019 report, they also highlight “the importance of public engagement and transparency” [10].

Discussion. Some of these questions might seem like formalities, but they can significantly affect the board’s work. For example, if the necessary majority or the quorum is too high, the board might not be able to adopt certain decisions. This could bias the board towards inaction. Similarly, if the board is not able to convene ad hoc meetings, or only upon request by the company, it would not be able to respond adequately to emergencies.
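The arithmetic behind the majority and quorum rules above can be made concrete with a minimal sketch. It is only an illustration under stated assumptions, not a description of any existing board's procedure: the function names (`votes_needed`, `is_quorate`, `is_adopted`, `quadratic_cost`) are hypothetical, the default quorum is the “natural quorum” described above, and a vote without quorum is simply treated as not adopted (in practice it might be void or voidable). The last function uses the standard quadratic voting rule that casting v votes costs v² voice credits, which the text above does not spell out.

```python
import math
from fractions import Fraction
from typing import Optional


def votes_needed(base: int, threshold: Fraction = Fraction(1, 2), strict: bool = True) -> int:
    """Smallest number of yes-votes that meets the threshold.

    base: the number the majority is calculated on (all members or members present).
    strict=True means "more than" the threshold (e.g. more than 50%);
    strict=False means "at least" the threshold (e.g. at least two thirds).
    """
    exact = threshold * base
    return math.floor(exact) + 1 if strict else math.ceil(exact)


def is_quorate(members_present: int, total_members: int, quorum: Optional[int] = None) -> bool:
    """Check the quorum; by default, the number of members who could form a simple majority."""
    if quorum is None:
        quorum = votes_needed(total_members)
    return members_present >= quorum


def is_adopted(yes_votes: int, members_present: int, total_members: int,
               count_on_present: bool = True,
               threshold: Fraction = Fraction(1, 2), strict: bool = True) -> bool:
    """Assumption: without quorum the decision is treated as not adopted."""
    if not is_quorate(members_present, total_members):
        return False
    base = members_present if count_on_present else total_members
    return yes_votes >= votes_needed(base, threshold, strict)


def quadratic_cost(votes: int) -> int:
    """Voice credits needed to cast this many votes on one issue under quadratic voting."""
    return votes ** 2


if __name__ == "__main__":
    print(votes_needed(10))  # majority of a 10-member board
    print(votes_needed(7))   # majority of 7 members present
    print(is_adopted(4, members_present=7, total_members=10))
    print(is_adopted(4, members_present=7, total_members=10, count_on_present=False))
```

The same four yes-votes pass when the majority is counted on the seven members present but fail when it is counted on all ten members, which illustrates why the choice of base and quorum is more than a formality.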

5.2 Bindingness of decisions

Should the board’s decisions be binding? Options. This mainly depends on the board’s structure (Section 3). External boards can be set up in a way that their decisions are binding, i.e. enforceable by legal means. Both parties need to contractually agree that the board’s decisions are in fact binding. This agreement could also contain further details about the enforcement of the board’s decisions (e.g. contractual penalties). It is worth noting, however, that the ethics board cannot force the company to follow its decisions. The worst legal consequence for the company is a contractual liability. If the ethics board is part of the company, it is very difficult, if not impossible, to ensure that the board’s decisions are legally binding. If the board is able to make binding decisions, it needs to be specified whether and, if so, under what conditions the company can override them. For example, the contract could give the company’s board of directors the option to override a decision if they achieve the same voting majority as the ethics board. But even if the board’s decisions are not enforceable by legal means, there are non-legal means that can incentivize the company to follow the board’s decision. For example, the board could make its decisions public, which could spark a public outcry. One or more board members could (threaten to) resign, which might lead to negative PR. Employees could also (threaten to) leave the company (e.g. via an open letter), which could be a serious threat, depending on how talent-constrained the company is. Finally, shareholders could engage in shareholder activism. In practice, the only ethics board that is able to make binding decisions is Meta’s Oversight Board, which has the power to override content moderation decisions. Discussion. Boards that are able to make legally binding decisions are likely more effective, i.e. they are able to achieve their goals to a higher degree (e.g. reducing risks to an acceptable level). They would also be a more credible commitment to safety and ethics. However, we expect that many companies would oppose creating such a powerful ethics board, mainly because it would undermine the company’s power. There might also be legal constraints on how much power the company can transfer to the ethics board.

6 Resources

What resources does the board need? In particular, how much funding does the board need and where should the funding come from (Section 6.1)? How should the board get information (Section 6.2)? And should it have access to outside expertise (Section 6.3)?

6.1 Funding

How much funding does the board need and where should the funding come from? Options. The board might need funding to pay its members’ salaries or reimburse expenses (Section 4.6), to commission outside expertise (e.g. third-party audits or expert consulting), or to organize events (e.g. in-person board meetings). Funding could also allow board members to spend their time on non-administrative tasks. For example, the Policing Project provided staff support, facilitated meetings, conducted research, and drafted reports for Axon’s former AI and Policing Technologies Ethics Board [95]. How much funding the board needs varies widely, from essentially no funding to tens of millions of dollars. For example, Meta’s Oversight Board has an annual budget of $20 million [85]. Funding could come from the company (e.g. directly or via a trust) or philanthropists. Other funding sources do not seem plausible (e.g. state funding or research grants). Discussion. The board’s independence could be undermined if funding comes directly from the company. The company could use the provision of funds as leverage to make the board take decisions that are more aligned with its interests. A more indirect funding mechanism therefore seems preferable. For example, Meta funds the purpose trust for multiple years in advance [85].

6.2 Information

How should the board get information? Options. What information the board needs is highly context-specific and mainly depends on the board’s responsibilities (Section 2). The board’s structure determines what sources of information are available (Section 3). While internal boards have access to some information by default, external boards have to rely on public information and information the company decides to share with them. Both internal and external boards might be able to gather additional information themselves (e.g. via formal document requests or informal coffee chats with employees). Discussion. Getting information from the company is convenient for the board, but the information might be biased. The company might, intentionally or not, withhold, overemphasize, or misrepresent certain information. The company could also delay the provision of information or present it in a way that makes it difficult for the board to process (e.g. by hiding important information in long documents). To mitigate these risks, the board might prefer gathering information itself. In particular, the board might want to build good relationships with a few trusted employees. While this might be less biased, it would also be more time-consuming. It might also be impossible to get certain first-hand information (e.g. protocols of past meetings of the board of directors). It is worth noting that not all company information is equally biased. For example, while reports by management might be too positive, whistleblower reports might be too negative. The most objective information will likely come from the internal audit team and external assurance providers [102]. In general, there is no single best information source. Boards need to combine multiple sources and cross-check important information.

6.3 Outside expertise

Should the board have access to outside expertise? Options. There are at least three types of outside expertise the ethics board could draw on. First, it could hire a specialized firm (e.g. a law or consulting firm) to answer questions that are beyond its expertise (e.g. whether the company complies with the NIST AI Risk Management Framework). Second, it could hire an audit firm (e.g. to audit a specific model, the company’s governance, or its own practices). Third, it could build academic partnerships (e.g. to red-team a model). Discussion. It might make sense for the ethics board to rely on outside expertise if it has limited expertise or time. It could also use outside expertise to get a more objective perspective, as information provided by the company can be biased (Section 6.2). However, the company might use the same sources of outside expertise. For example, if a company is open to a third-party audit, it would commission the audit directly (why would it ask the ethics board to do it on its behalf?). In such cases, the ethics board would merely “double-check” the company’s or the third party’s work. While the added value would be low, the costs could be high (especially for commissioning an external audit or expert consulting).

Table 2: Summary of design choices

High-level questions (each followed by its sub-questions / options)

What responsibilities should the board have?

• Advising the board of directors

• Overseeing model releases and publications

• Supporting risk assessments

• Reviewing risk management practices

• Interpreting AI ethics principles

• Serving as a contact for whistleblowers

What should the board’s legal structure be?

• The board could be a separate legal entity that contracts with the company (external board)

• It could also be part of the company (internal board)

• Should it have substructures (e.g. committees)?

Who should sit on the board?

• How should initial and subsequent members be appointed?

• How should they leave the board?

• How many members should the board have?

• What characteristics should they have?

• How much time should they spend on the board?

• Should they be compensated?

How should the board make decisions?

• What decision-making process should the board use?

• Should its decisions be binding?

What resources does the board need?

• How much funding does the board need and where should the funding come from?

• How should the board get information?

• Should the board have access to outside expertise?

7 Conclusion

Summary. In this paper, we have identified key design choices that AI companies need to make when setting up an ethics board (RQ1). For each of them, we have listed different options and discussed how they would affect the board’s ability to reduce risks from AI (RQ2). Table 2 contains a summary of the design choices we have covered.

Key claims. Throughout this paper, we have made four key claims. First, ethics boards can take many different shapes. Most design choices are highly context-specific. It is therefore very difficult to make abstract recommendations. There is no one-size-fits-all. Second, ethics boards should be seen as an additional “layer of defense”. They do not have an original role in the corporate governance of AI companies. They do not serve a function that no other organizational structure serves. Instead, most ethics boards support, complement, or duplicate existing efforts. While this reduces efficiency, an additional safety net seems warranted in high-stakes situations. Third, merely having an ethics board is not sufficient. Most of the value depends on its members and their willingness and ability to pursue its mission. Thus, appointing the right people is crucial. Conversely, there is precedent that appointing the wrong people can threaten the survival of the entire board. Fourth, while some design choices might seem like formalities (e.g. when the board is quorate), they can have a significant impact on the effectiveness of the board (e.g. by slowing down decisions). They should not be taken lightly.

Questions for further research. The paper left many questions unanswered and more research is needed. In particular, our list of design choices is not comprehensive. For example, we did not address the issue of board oversight. If an ethics board has substantial powers, the board itself also needs adequate oversight. A “meta oversight board” (a central organization that oversees various AI ethics boards) could be a possible solution. Apart from that, our list of potential responsibilities could be extended. For example, the company could grant the ethics board the right to appoint one or more members of its board of directors. The ethics board could also oversee and coordinate responses to model evals. For example, if certain dangerous capabilities are detected, the company may want to contact government and coordinate with other labs to pause capabilities research.

We wish to conclude with a word of caution. Setting up an ethics board is not a silver bullet: “there is no silver bullet” [24]. Instead, it should be seen as yet another mechanism in a portfolio of mechanisms.

Acknowledgements

We are grateful for valuable feedback from Christina Barta, Carrick Flynn, Cullen O’Keefe, Virginia Blanton, Andrew Strait, Tim Fist, and Milan Griffes. Anka Reuel worked on the project during the 2022 CHERI Summer Research Program. All remaining errors are our own.

References

[1] D. Amodei, C. Olah, J. Steinhardt, P. Christiano, J. Schulman, and D. Mané. Concrete problems in AI safety. arXiv preprint arXiv:1606.06565, 2016. [2] M. Anderljung and J. Hazell. Protecting society from AI misuse: When are restrictions on capabilities warranted? arXiv preprint arXiv:2303.09377, 2023. [3] Anthropic. Core views on AI safety: When, why, what, and how. https://www.anthropic. com/index/core-views-on-ai-safety, 2023. [4] C. R. Apaza and Y. Chang. What makes whistleblowing effective: Whistleblowing in Peru and South Korea. Public Integrity, 13(2):113–130, 2011. [5] A. Aquisti. https://twitter.com/ssnstudy/status/1112099054551515138, 2019. [6] S. Armstrong, N. Bostrom, and C. Shulman. Racing to the precipice: A model of artificial intelligence development. AI & Society, 31:201–206, 2016. [7] Z. Arnold and H. Toner. AI accidents: An emerging threat. Center for Security and Emerging Technology, Georgetown University, 2021. [8] C. Ashurst, E. Hine, P. Sedille, and A. Carlier. AI ethics statements: Analysis and lessons learnt from NeurIPS broader impact statements. In 2022 ACM Conference on Fairness, Accountability, and Transparency, pages 2047–2056, 2022. [9] T. Aven. On the meaning of a black swan in a risk context. Safety Science, 57:44–51, 2013. [10] Axon. First report of the Axon AI & Policing Technology Ethics Board, 2019. [11] Axon. Ethics & Equity Advisory Council. https://www.axon.com/eeac, 2022. [12] R. Baldwin and J. Black. Driving priorities in risk-based regulation: What’s the problem? Journal of Law and Society, 43(4):565–595, 2016. [13] H. Belfield. Activism by the AI community: Analysing recent achievements and future prospects. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, pages 15–21, 2020. [14] M. S. Bernstein, M. Levi, D. Magnus, B. A. Rajala, D. Satz, and Q. Waeiss. Ethics and society review: Ethics reflection as a precondition to research funding. Proceedings of the National Academy of Sciences, 118(52), 2021. 
[15] E. Bietti. From ethics washing to ethics bashing: A view on tech ethics from within moral philosophy. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, pages 210–219, 2020. [16] A. Birhane, W. Isaac, V. Prabhakaran, M. Díaz, M. C. Elish, I. Gabriel, and S. Mohamed. Power to the people? Opportunities and challenges for participatory AI. Equity and Access in Algorithms, Mechanisms, and Optimization, pages 1–8, 2022. [17] B. Bjørkelo. Workplace bullying after whistleblowing: Future research and implications. Journal of Managerial Psychology, 28(3):306–323, 2013. [18] R. Bommasani, D. A. Hudson, E. Adeli, R. Altman, S. Arora, S. von Arx, M. S. Bernstein, J. Bohg, A. Bosselut, E. Brunskill, E. Brynjolfsson, S. Buch, D. Card, R. Castellon, N. Chatterji, A. Chen, K. Creel, J. Q. Davis, D. Demszky, C. Donahue, M. Doumbouya, E. Durmus, S. Ermon, J. Etchemendy, K. Ethayarajh, L. Fei-Fei, C. Finn, T. Gale, L. Gillespie, K. Goel, N. Goodman, S. Grossman, N. Guha, T. Hashimoto, P. Henderson, J. Hewitt, D. E. Ho, J. Hong, K. Hsu, J. Huang, T. Icard, S. Jain, D. Jurafsky, P. Kalluri, S. Karamcheti, G. Keeling, F. Khani, O. Khattab, P. W. Koh, M. Krass, R. Krishna, R. Kuditipudi, A. Kumar, F. Ladhak, M. Lee, T. Lee, J. Leskovec, I. Levent, X. L. Li, X. Li, T. Ma, A. Malik, C. D. Manning, S. Mirchandani, E. Mitchell, Z. Munyikwa, S. Nair, A. Narayan, D. Narayanan, B. Newman, A. Nie, J. C. Niebles, H. Nilforoshan, J. Nyarko, G. Ogut, L. Orr, I. Papadimitriou, J. S. Park, C. Piech, E. Portelance, C. Potts, A. Raghunathan, R. Reich, H. Ren, F. Rong, Y. Roohani, C. Ruiz, J. Ryan, C. Ré, D. Sadigh, S. Sagawa, K. Santhanam, A. Shih, K. Srinivasan, A. Tamkin, R. Taori, A. W. Thomas, F. Tramèr, R. E. Wang, W. Wang, B. Wu, J. Wu, Y. Wu, S. M. Xie, M. Yasunaga, J. You, M. Zaharia, M. Zhang, T. Zhang, X. Zhang, Y. Zhang, L. Zheng, K. Zhou, and P. Liang. On the opportunities and risks of foundation models. arXiv preprint arXiv:2108.07258, 2022. [19] N. 
Bostrom. Existential risks: Analyzing human extinction scenarios and related hazards. Journal of Evolution and Technology, 9(1), 2001. [20] N. Bostrom. Information hazards: A typology of potential harms from knowledge. Review of Contemporary Philosophy, 10:44–79, 2011. [21] N. Bostrom. The vulnerable world hypothesis. Global Policy, 10(4):455–476, 2019. [22] M. Brundage, S. Avin, J. Clark, H. Toner, P. Eckersley, B. Garfinkel, A. Dafoe, P. Scharre, T. Zeitzoff, B. Filar, H. Anderson, H. Roff, G. C. Allen, J. Steinhardt, C. Flynn, S. O. hÉigeartaigh, S. Beard, H. Belfield, S. Farquhar, C. Lyle, R. Crootof, O. Evans, M. Page, J. Bryson, R. Yampolskiy, and D. Amodei. The malicious use of artificial intelligence: Forecasting, prevention, and mitigation. arXiv preprint arXiv:1802.07228, 2018. [23] M. Brundage, S. Avin, J. Wang, H. Belfield, G. Krueger, G. Hadfield, H. Khlaaf, J. Yang, H. Toner, R. Fong, T. Maharaj, P. W. Koh, S. Hooker, J. Leung, A. Trask, E. Bluemke, J. Lebensold, C. O’Keefe, M. Koren, T. Ryffel, J. Rubinovitz, T. Besiroglu, F. Carugati, J. Clark, P. Eckersley, S. de Haas, M. Johnson, B. Laurie, A. Ingerman, I. Krawczuk, A. Askell, R. Cammarota, A. Lohn, D. Krueger, C. Stix, P. Henderson, L. Graham, C. Prunkl, B. Martin, E. Seger, N. Zilberman, S. O. hÉigeartaigh, F. Kroeger, G. Sastry, R. Kagan, A. Weller, B. Tse, E. Barnes, A. Dafoe, P. Scharre, A. Herbert-Voss, M. Rasser, S. Sodhani, C. Flynn, T. K. Gilbert, L. Dyer, S. Khan, Y. Bengio, and M. Anderljung. Toward trustworthy AI development: Mechanisms for supporting verifiable claims. arXiv preprint arXiv:2004.07213, 2020. [24] M. Brundage, K. Mayer, T. Eloundou, S. Agarwal, S. Adler, G. Krueger, J. Leike, and P. Mishkin. Lessons learned on language model safety and misuse. OpenAI, 2022. [25] J. Carlsmith. Is power-seeking AI an existential risk? arXiv preprint arXiv:2206.13353, 2022. [26] S. Cave and S. S. ÓhÉigeartaigh. An AI race for strategic advantage: Rhetoric and risks. 
In Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, pages 36–40, 2018. [27] J. Chamberlain. The risk-based approach of the European Union’s proposed artificial intelligence regulation: Some comments from a tort law perspective. European Journal of Risk Regulation, 14(1):1–13, 2022. [28] P. Cihon, J. Schuett, and S. D. Baum. Corporate governance of artificial intelligence in the public interest. Information, 12(7), 2021. [29] K. Conger and D. Cameron. Google is helping the Pentagon build AI for drones. Gizmodo, 2018. [30] O. Cotton-Barratt, M. Daniel, and A. Sandberg. Defence in depth against human extinction: Prevention, response, resilience, and why they all matter. Global Policy, 11(3):271–282, 2020. [31] P. Crofts and H. van Rijswijk. Negotiating ’evil’: Google, Project Maven and the corporate form. Law, Technology and Humans, 2(1):1–16, 2020. [32] R. Crootof. Artificial intelligence research needs responsible publication norms. Lawfare Blog, 2019. [33] H. Davies and M. Zhivitskaya. Three lines of defence: A robust organising framework, or just lines in the sand? Global Policy, 9:34–42, 2018. [34] DeepMind. Human rights policy. https://www.deepmind.com/human-rights-policy, 2022. [35] J. Degrave, F. Felici, J. Buchli, M. Neunert, B. Tracey, F. Carpanese, T. Ewalds, R. Hafner, A. Abdolmaleki, D. de Las Casas, et al. Magnetic control of tokamak plasmas through deep reinforcement learning. Nature, 602(7897):414–419, 2022. [36] J. Dungan, A. Waytz, and L. Young. The psychology of whistleblowing. Current Opinion in Psychology, 6:129–133, 2015. [37] G. Falco, B. Shneiderman, J. Badger, R. Carrier, A. Dahbura, D. Danks, M. Eling, A. Goodloe, J. Gupta, C. Hart, et al. Governing AI safety through independent audits. Nature Machine Intelligence, 3(7):566–571, 2021. [38] L. Floridi. Translating principles into practices of digital ethics: Five risks of being unethical. Ethics, Governance, and Policies in Artificial Intelligence, pages 81–90, 2021. 
[39] B. Friedman, W. Abd-Almageed, M. Brundage, R. Calo, D. Citron, R. Delsol, C. Harris, J. Lynch, and M. McBride. Statement of resigning Axon AI ethics board members. Policing Project, 2022.
[40] D. Ganguli, L. Lovitt, J. Kernion, A. Askell, Y. Bai, S. Kadavath, B. Mann, E. Perez, N. Schiefer, K. Ndousse, A. Jones, S. Bowman, A. Chen, T. Conerly, N. DasSarma, D. Drain, N. Elhage, S. El-Showk, S. Fort, Z. Hatfield-Dodds, T. Henighan, D. Hernandez, T. Hume, J. Jacobson, S. Johnston, S. Kravec, C. Olsson, S. Ringer, E. Tran-Johnson, D. Amodei, T. Brown, N. Joseph, S. McCandlish, C. Olah, J. Kaplan, and J. Clark. Red teaming language models to reduce harms: Methods, scaling behaviors, and lessons learned. arXiv preprint arXiv:2209.07858, 2022.
[41] J. A. Goldstein, G. Sastry, M. Musser, R. DiResta, M. Gentzel, and K. Sedova. Generative language models and automated influence operations: Emerging threats and potential mitigations. arXiv preprint arXiv:2301.04246, 2023.
[42] Googlers Against Transphobia. Googlers against transphobia and hate. Medium, 2019.
[43] N. Grant. Google calls in help from Larry Page and Sergey Brin for A.I. fight. The New York Times, 2023.
[44] M. G. Grimes, T. A. Williams, and E. Y. Zhao. Anchors aweigh: The sources, variety, and challenges of mission drift. Academy of Management Review, 44(4):819–845, 2019.
[45] A. Gupta and V. Heath. AI ethics groups are repeating one of society’s classic mistakes. MIT Technology Review, 2020.
[46] T. Hagendorff. The ethics of AI ethics: An evaluation of guidelines. Minds and Machines, 30(1):99–120, 2020.
[47] D. Hendrycks, N. Carlini, J. Schulman, and J. Steinhardt. Unsolved problems in ML safety. arXiv preprint arXiv:2109.13916, 2022.
[48] L. Hoffman and R. Albergotti. Microsoft eyes $10 billion bet on ChatGPT. Semafor, 2023.
[49] W. Hunt. The flight to safety-critical AI. Center for Long-Term Cybersecurity, UC Berkeley, 2020.
[50] IEC. 31010:2019 Risk management — Risk assessment techniques, 2019.
[51] ISO. 31000:2018 Risk management — Guidelines, 2018.
[52] ISO/IEC. Guide 51:2014 Safety aspects — Guidelines for their inclusion in standards, 2014.
[53] ISO/IEC. 23894:2023 Information technology — Artificial intelligence — Guidance on risk management, 2023.
[54] A. Jobin, M. Ienca, and E. Vayena. The global landscape of AI ethics guidelines. Nature Machine Intelligence, 1(9):389–399, 2019.
[55] S. R. Jordan. Designing artificial intelligence review boards: Creating risk metrics for review of AI. In 2019 IEEE International Symposium on Technology and Society (ISTAS), pages 1–7, 2019.
[56] P. B. Jubb. Whistleblowing: A restrictive definition and interpretation. Journal of Business Ethics, 21:77–94, 1999.
[57] K. Kavukcuoglu, P. Kohli, L. Ibrahim, D. Bloxwich, and S. Brown. How our principles helped define AlphaFold’s release. https://www.deepmind.com/blog/how-our-principles-helped-define-alphafolds-release, 2022.
[58] K. Klonick. The Facebook Oversight Board: Creating an independent institution to adjudicate online free expression. Yale Law Journal, 129(2418), 2020.
[59] K. Klonick. Inside the making of Facebook’s supreme court. New Yorker, 2021.
[60] N. Kolt. Algorithmic black swans. Washington University Law Review, 101, 2023.
[61] S. P. Lalley and E. G. Weyl. Quadratic voting: How mechanism design can radicalize democracy. In AEA Papers and Proceedings, volume 108, pages 33–37, 2018.
[62] C. Leahy, S. Black, C. Scammell, and A. Miotti. Conjecture: Internal infohazard policy. Alignment Forum, 2022.
[63] S. A. Lundqvist. Why firms implement risk governance: Stepping beyond traditional risk management to enterprise risk management. Journal of Accounting and Public Policy, 34(5):441–466, 2015.
[64] M. M. Maas. How viable is international arms control for military artificial intelligence? Three lessons from nuclear weapons. Contemporary Security Policy, 40(3):285–311, 2019.
[65] T. Mahler. Between risk management and proportionality: The risk-based approach in the EU’s Artificial Intelligence Act proposal. Nordic Yearbook of Law and Informatics, 2021.
[66] Microsoft. Putting principles into practice: How we approach responsible AI at Microsoft. https://www.microsoft.com/cms/api/am/binary/RE4pKH5, 2020.
[67] Microsoft. Microsoft and OpenAI extend partnership. https://blogs.microsoft.com/blog/2023/01/23/microsoftandopenaiextendpartnership/, 2023.
[68] Microsoft. Our approach. https://www.microsoft.com/en-us/ai/our-approach, 2023.
[69] B. Mittelstadt. Principles alone cannot guarantee ethical AI. Nature Machine Intelligence, 1(11):501–507, 2019.
[70] S. Mohamed, M.-T. Png, and W. Isaac. Decolonial AI: Decolonial theory as sociotechnical foresight in artificial intelligence. Philosophy & Technology, 33:659–684, 2020.
[71] J. Mökander and L. Floridi. Operationalising AI governance through ethics-based auditing: An industry case study. AI and Ethics, pages 1–18, 2022.
[72] J. Mökander, J. Morley, M. Taddeo, and L. Floridi. Ethics-based auditing of automated decision-making systems: Nature, scope, and limitations. Science and Engineering Ethics, 27(44), 2021.
[73] J. Morley, A. Elhalal, F. Garcia, L. Kinsey, J. Mökander, and L. Floridi. Ethics as a service: A pragmatic operationalisation of AI ethics. Minds and Machines, 31(2):239–256, 2021.
[74] J. Morley, L. Floridi, L. Kinsey, and A. Elhalal. From what to how: An initial review of publicly available AI ethics tools, methods and research to translate principles into practices. Science and Engineering Ethics, 26(4):2141–2168, 2020.
[75] J. Mökander, J. Schuett, H. R. Kirk, and L. Floridi. Auditing large language models: A three-layered approach. arXiv preprint arXiv:2302.08500, 2023.
[76] W. Naudé and N. Dimitri. The race for an artificial general intelligence: Implications for public policy. AI & Society, 35:367–379, 2020.
[77] J. P. Near and M. P. Miceli. Effective whistle-blowing. Academy of Management Review, 20(3):679–708, 1995.
[78] J. Newman. Decision points in AI governance. Center for Long-Term Cybersecurity, UC Berkeley, 2020.
[79] R. Ngo, L. Chan, and S. Mindermann. The alignment problem from a deep learning perspective. arXiv preprint arXiv:2209.00626, 2023.
[80] NIST. Artificial Intelligence Risk Management Framework (AI RMF 1.0), 2023.
[81] OpenAI. Best practices for deploying language models. https://openai.com/blog/best-practices-for-deploying-language-models, 2022.
[82] OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774, 2023.
[83] OpenAI. OpenAI and Microsoft extend partnership, 2023.
[84] T. Ord. The precipice: Existential risk and the future of humanity. Hachette Books, 2020.
[85] Oversight Board. Securing ongoing funding. https://www.oversightboard.com/news/1111826643064185-securing-ongoing-funding-for-the-oversight-board/, 2022.
[86] Oversight Board. https://www.oversightboard.com, 2023.
[87] Oversight Board. Bylaws. https://www.oversightboard.com/sr/governance/bylaws, 2023.
[88] Oversight Board. Charter. https://oversightboard.com/attachment/494475942886876/, 2023.
[89] Oversight Board. Our commitment. https://www.oversightboard.com/meet-the-board/, 2023.
[90] Oversight Board. Trustees. https://www.oversightboard.com/governance, 2023.
[91] Partnership on AI. Managing the risks of AI research, 2021.
[92] E. Perez, S. Huang, F. Song, T. Cai, R. Ring, J. Aslanides, A. Glaese, N. McAleese, and G. Irving. Red teaming language models with language models. arXiv preprint arXiv:2202.03286, 2022.
[93] M. Petermann, N. Tempini, I. K. Garcia, K. Whitaker, and A. Strait. Looking before we leap. Ada Lovelace Institute, 2022.
[94] K. Piper. Google’s brand-new AI ethics board is already falling apart. Vox, 2019.
[95] Policing Project. Reports of the Axon AI ethics board. https://www.policingproject.org/axon, 2020.
[96] E. A. Posner and E. G. Weyl. Quadratic voting as efficient corporate governance. The University of Chicago Law Review, 81(1):251–272, 2014.
[97] I. D. Raji and J. Buolamwini. Actionable auditing: Investigating the impact of publicly naming biased performance results of commercial AI products. In Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society, pages 429–435, 2019.
[98] I. D. Raji, P. Xu, C. Honigsberg, and D. Ho. Outsider oversight: Designing a third party audit ecosystem for AI governance. In Proceedings of the 2022 AAAI/ACM Conference on AI, Ethics, and Society, pages 557–571, 2022.
[99] J. Rando, D. Paleka, D. Lindner, L. Heim, and F. Tramèr. Red-teaming the Stable Diffusion safety filter. arXiv preprint arXiv:2210.04610, 2022.
[100] J. Sandbrink, H. Hobbs, J. Swett, A. Dafoe, and A. Sandberg. Differential technology development: A responsible innovation principle for navigating technology risks. SSRN, 2022.
[101] R. Sandler, J. Basl, and S. Tiell. Building data and AI ethics committees. Accenture & Northeastern University, 2019.
[102] J. Schuett. Three lines of defense against risks from AI. arXiv preprint arXiv:2212.08364, 2022.
[103] J. Schuett. Risk management in the Artificial Intelligence Act. European Journal of Risk Regulation, pages 1–19, 2023.
[104] E. Seger. In defence of principlism in AI ethics and governance. Philosophy & Technology, 35(2):45, 2022.
[105] J. Sevilla, L. Heim, A. Ho, T. Besiroglu, M. Hobbhahn, and P. Villalobos. Compute trends across three eras of machine learning. arXiv preprint arXiv:2202.05924, 2022.
[106] T. Shevlane. Structured access: An emerging paradigm for safe AI deployment. In The Oxford Handbook of AI Governance, 2022.
[107] T. Shevlane and A. Dafoe. The offense-defense balance of scientific knowledge: Does publishing AI research reduce misuse? In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, pages 173–179, 2020.
[108] D. Silver, S. Singh, D. Precup, and R. S. Sutton. Reward is enough. Artificial Intelligence, 299, 2021.
[109] R. Smith. Axon committed to listening and learning so that we can fulfill our mission to protect life, together. https://www.axon.com/news/technology/axon-committed-to-listening-and-learning, 2022.
[110] I. Solaiman. The gradient of generative AI release: Methods and considerations. arXiv preprint arXiv:2302.04844, 2023.
[111] I. Solaiman, M. Brundage, J. Clark, A. Askell, A. Herbert-Voss, J. Wu, A. Radford, G. Krueger, J. W. Kim, S. Kreps, M. McCain, A. Newhouse, J. Blazakis, K. McGuffie, and J. Wang. Release strategies and the social impacts of language models. arXiv preprint arXiv:1908.09203, 2019.
[112] M. Srikumar, R. Finlay, G. Abuhamad, C. Ashurst, R. Campbell, E. Campbell-Ratcliffe, H. Hongo, S. R. Jordan, J. Lindley, A. Ovadya, et al. Advancing ethics review practices in AI research. Nature Machine Intelligence, 4(12):1061–1064, 2022.
[113] N. N. Taleb. The Black Swan: The Impact of the Highly Improbable. Random House, 2007.
[114] V. Thomas, J. Duda, and T. Maurer. Independence with a purpose: Facebook’s creative use of Delaware’s purpose trust statute to establish independent oversight. Business Law Today, 2019.
[115] S. Tiell. Create an ethics committee to keep your AI initiative in check. Harvard Business Review, 15, 2019.
[116] F. Urbina, F. Lentzos, C. Invernizzi, and S. Ekins. Dual use of artificial-intelligence-powered drug discovery. Nature Machine Intelligence, 4(3):189–191, 2022.
[117] M. B. Van Asselt and O. Renn. Risk governance. Journal of Risk Research, 14(4):431–449, 2011.
[118] K. Walker. An external advisory council to help advance the responsible development of AI. https://blog.google/technology/ai/external-advisory-council-help-advance-responsible-development-ai/, 2019.
[119] R. Waters and M. Kruppa. Rebel AI group raises record cash after machine learning schism. https://www.ft.com/content/8de92f3a-228e-4bb8-961f-96f2dce70ebb, 2021.
[120] L. Weidinger, J. Mellor, M. Rauh, C. Griffin, J. Uesato, P.-S. Huang, M. Cheng, M. Glaese, B. Balle, A. Kasirzadeh, Z. Kenton, S. Brown, W. Hawkins, T. Stepleton, C. Biles, A. Birhane, J. Haas, L. Rimell, L. A. Hendricks, W. Isaac, S. Legassick, G. Irving, and I. Gabriel. Ethical and social risks of harm from language models. arXiv preprint arXiv:2112.04359, 2021.
[121] D. Wong and L. Floridi. Meta’s Oversight Board: A review and critical assessment. Minds and Machines, pages 1–24, 2022.
[122] E. Yudkowsky. Cognitive biases potentially affecting judgment of global risks. In Global catastrophic risks, pages 91–119, 2008.
[123] J. Zhou and F. Chen. AI ethics: From principles to practice. AI & Society, pages 1–11, 2022.