Could Software Developers Be Held Legally Responsible for Management Decisions? Thoughts on VA HB2121
Some concerns about Del. Maldonado's Bill to Regulate AI
Let me start by saying that I am in no way a legal expert, nor particularly knowledgeable about the language of laws and bills. I have, however, worked professionally as a software developer for nearly a decade, I hold a bachelor's degree in Computer Science along with some additional education in machine learning algorithms, and I have worked on a lot of software that would wrongly be considered AI under this bill.
Concerns About Definitions
"Artificial intelligence" means a set of technologies that enables machines to perform tasks under varying and unpredictable circumstances that typically require human oversight or intelligence, or that can learn from experience and improve performance when exposed to data sets.
This definition encompasses a wide array of software not typically considered "artificial intelligence." Virtually any software that analyzes data could fall under this definition, as it includes "machines performing tasks under varying and unpredictable circumstances that typically require human oversight or intelligence." This broad phrasing essentially describes almost all software ever written.
For instance, all software is ideally developed to handle a wide range of varying and unpredictable circumstances. If you purchase an item from an e-commerce store, does the store's software handle the case where you attempt to order -1 items? Such an edge case—an attempt to purchase a negative quantity—could certainly be considered unpredictable, and the quantities vary. Without computers, such orders would require manual human oversight. Should software that handles this scenario by simply checking for positive values count as AI?
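To make this concrete, here is a minimal, hypothetical sketch (the function is my own invention, not anything from the bill) of exactly the kind of deterministic edge-case handling in question. Nothing here learns from anything, yet it performs a task "under varying and unpredictable circumstances" that would otherwise require human oversight:

```python
# A hypothetical e-commerce quantity check: plain deterministic code,
# not machine learning, yet it handles "varying and unpredictable" input
# that would otherwise need manual human review.
def validate_order_quantity(quantity: int, stock_on_hand: int) -> int:
    """Return a valid quantity, or raise if the order makes no sense."""
    if quantity <= 0:
        raise ValueError("Quantity must be a positive integer.")
    if quantity > stock_on_hand:
        raise ValueError("Quantity exceeds available stock.")
    return quantity

# An order for -1 items is simply rejected -- no "intelligence" involved.
try:
    validate_order_quantity(-1, stock_on_hand=10)
except ValueError as err:
    print(f"Order rejected: {err}")
```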
"...or that can learn from experience and improve performance when exposed to data sets."
This part of the definition aligns more closely with what is generally understood to be "AI." Artificial intelligence typically refers to a subset of software known as "machine learning." This involves systems trained on historical data to develop a mathematical model for producing outcomes based on inputs, rather than systems explicitly coded to produce specific outputs for given inputs.
The distinction between AI and traditional machine learning often lies in the complexity of the algorithms and models used. For example, software employing Bayes’ theorem to predict a flower type based on color, length, and shape would not generally be considered AI. However, software capable of analyzing a photograph of a flower, extracting its features, and then predicting its type could reasonably be classified as AI.
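As a rough illustration of where that line sits, here is a sketch of the first case using scikit-learn's bundled iris dataset and a naive Bayes classifier (my example, not one from the bill). It genuinely learns a statistical model from data, yet few practitioners would market it as "AI":

```python
# A classic naive Bayes flower classifier: machine learning by any
# definition, but rarely what people mean by "artificial intelligence".
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)  # petal/sepal measurements and species labels
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = GaussianNB().fit(X_train, y_train)  # applies Bayes' theorem to the features
print(f"Test accuracy: {model.score(X_test, y_test):.2f}")
```

The photograph case is different in kind: the system first has to extract features from raw pixels before any classification happens, which is much closer to what most people mean by "AI."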
A more precise definition might replace the "or" in the original definition with "and," but even this might not narrow the scope sufficiently.
Later in the bill, the following terms are defined:
"Foundation model" means a machine learning model that (i) is trained on broad data at scale, (ii) is designed for generality of output, and (iii) can be adapted to a wide range of distinctive tasks.
"Generative artificial intelligence" means artificial intelligence based on a foundation model that is capable of and used to produce synthetic digital content, including audio, images, text, and videos.
This is a much better definition. It focuses on key characteristics that distinguish AI from typical deterministic software. Adding a requirement for foundation models to the AI definition would address many of my concerns.
Duties of Developers
59.1-608. Duties of developers.
A. A developer of a generative artificial intelligence system or service shall apply provenance data, either directly or through the use of third-party technologies, to synthetic digital content that is wholly generated by such developer's generative artificial intelligence system or service.
B. A developer of a generative artificial intelligence system or service shall make available to the public:
1. A provenance application tool; and
2. A provenance reader.
"Developer" means any person doing business in the Commonwealth that develops or significantly updates a generative artificial intelligence system or service that is offered, sold, leased, given, or otherwise provided to consumers in the Commonwealth.
This provision raises significant concerns. It places the entire burden of compliance on software developers, without addressing the roles of project managers, executives, stakeholders, and others who might push for features that could contravene the law.
Software developers are not licensed in any state. There is no licensing board, no formal code of ethics, no whistleblowing or reporting structure, and no real protection against termination for refusing to follow illegal instructions. On a team of 15 developers where management has deprioritized work required by this law, how would you determine which developer to hold legally responsible? Why wouldn't the corporation bear responsibility for the decisions it makes?
That said, there is an aspect of this law that I appreciate. It mandates the creation of a software tool called a "provenance reader," defined as follows:
"Provenance reader" means an online service, product, or feature that enables a user to view the provenance data, if any, of synthetic digital content.
If this means that all training data used to produce a model must be made public, it could have significant benefits. For example, it would help ensure that companies are not using copyrighted materials without permission and that the data used to train models accurately represents reality, preventing potential manipulation of the public. There are some concerns here, though: what if I build an AI system that is trained on personally identifiable information (PII)? Surely we should not expose individuals' personal data for the entire world to see. A provision like this needs to come with more clarity on an organization's responsibility to provide data security to end users.
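Setting the training-data question aside, the mechanics of the mandated tools themselves are easy to imagine. Below is a purely hypothetical sketch of a provenance "application tool" and "reader" pair using a JSON sidecar file; the bill names these tools but does not define a data format, and real implementations would more likely embed metadata using a standard such as C2PA:

```python
# Hypothetical provenance tooling: a JSON "sidecar" written next to a
# generated file, plus a reader that returns it. Illustrative only; the
# bill does not specify any data format for provenance records.
import hashlib
import json
from pathlib import Path

def apply_provenance(content_path: str, model_name: str) -> None:
    """Provenance application tool: record origin data for generated content."""
    digest = hashlib.sha256(Path(content_path).read_bytes()).hexdigest()
    record = {
        "generated_by": model_name,
        "sha256": digest,  # ties the record to this exact file's contents
        "synthetic": True,
    }
    Path(content_path + ".provenance.json").write_text(json.dumps(record, indent=2))

def read_provenance(content_path: str) -> dict | None:
    """Provenance reader: return provenance data for the content, if any."""
    sidecar = Path(content_path + ".provenance.json")
    return json.loads(sidecar.read_text()) if sidecar.exists() else None

# Demo: tag a stand-in "generated" file, then read its provenance back.
Path("generated.txt").write_text("synthetic content")
apply_provenance("generated.txt", model_name="example-model-v1")
print(read_provenance("generated.txt"))
```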
This Bill Is a Good Idea
The intent of this bill is great. We absolutely need to ensure that AI software is used responsibly and that the public and consumers can understand how their data is being used. But I'm extremely worried about the scope of this bill. As written, it could apply to almost all software written in the state of Virginia. It places far too much of the burden on software developers, who have no real recourse when management instructs them to take an action that is illegal under this law. And while it rightfully pushes for transparency in data, it doesn't address any privacy concerns that might arise from that transparency. In fact, the word "privacy" does not appear once in the current draft of the bill.
Note: The photo used in this article was provided for free under the Unsplash license by Tingey Injury Law Firm, West Charleston Boulevard, Las Vegas, NV, USA