Assigning Blame and Credit to AI Systems

In the 1950s, Isaac Asimov imagined a world where robots lived and worked in human society. One of his short science fiction stories follows the trials and tribulations of a robot, “Cal”, who serves a human writer and “wishes” to be a writer himself after discovering that he cannot be considered helpful for performing his master’s work: writing. This heartwarming and profound story is a benign example of the future that previous generations imagined (perhaps erroneously, although charmingly enough).

An Asymptote to the Asimovian Atelier

In recent years, developments in AI and machine learning techniques have come to encompass generative modeling and algorithms used to generate art, to send control signals that help self-driving vehicles correct themselves on the highway, or to generate music and text for artists and writers alike. There have been forays into AI-based authoring, AI-based composition of music, and AI-based writing of novels. While these algorithms are incipient in their scope and capability, some generative algorithms such as GPT-3 show huge promise. In 2016, I was part of a session conducted by Google’s TensorFlow team where we performed simple style transfer tasks using deep learning on convolutional neural networks. The advanced algorithms we have access to in 2020 make these tasks quite trivial, and the current state of research makes even more advanced algorithms a very real possibility. A good technical summary of how to build some of these algorithms, including the DCGANs and CycleGANs that have become so popular, is the book embedded below. There is plenty of literature for the technically minded, and GitHub has scores of repositories on generative AI, not to mention other kinds of AI models.

Despite this creeping progress that AI researchers today are making towards the world imagined by Isaac Asimov several decades ago, certain key elements of AI and its impact on our society are likely to be up for discussion and debate well into future decades.

The problem of accountability and blame (or credit) assignment in AI systems is one of these important ongoing discussions, and it promises to upend some of our existing notions about how people, human systems and natural systems function. We are bound to look more closely at the underlying chain of causation in AI-based systems, and at the cause-and-effect chains that arise from their interactions with human society at large, in all spheres from science and medicine to engineering and technology, to art and creativity.

Yes, but is it art?

Edmond De Belamy, a piece of art created by a Generative Adversarial Network

This discussion promises to shine new light on agency, free will, decision making and other aspects of how society functions. It starts with relatively innocuous questions such as the one posed in this fantastic paper, “Who gets credit for AI generated art?”, which discusses Edmond De Belamy. From there it broadens into an exploration of our mental models of what reasonable causes and effects are, what phenomena can be considered indivisible or black boxes, and what the meaning and extent of human agency is. Let’s start with the obvious: how our media (and by extension most of society) see artificial intelligence, using the example of the art discussed in the aforementioned paper.

Human interpretations (from the media) of the credit assignment problem posed by the generative AI painter

Operational Domains, System Definitions and Moral Crumple Zones

To understand some elements of how AI-based systems can be handled in society, we need (but may not necessarily get) the following:

  1. Acceptable boundaries and a definition of what constitutes an AI agent’s field of operation (domain)
  2. A range of effects and first order characteristics of such effects that are a direct consequence of the agent’s action
  3. An understanding of how second, third and higher order effects may be ascribed to the agent, or to broader systems at play.

In addition to the above, a clear operational definition or description of the AI system in question is important in any discussion that ascribes blame or credit to it. We will see why this is important in a later section of the post.

At the outset, it may seem as though a self-driving car that has killed a person deserves blame, or that its manufacturer is liable to legal action. It may seem that the person using the DCGAN algorithm in the aforementioned paper on AI generated art would be entitled to collect the proceeds from the sale of art produced by the algorithm. It may also seem as though the external environment plays no tangible role in the outcomes of the AI systems, since they are “AI systems” after all.

However, things get much more complex when we start looking at the details of such real-world interactions. The acceptable boundaries of an AI agent’s field of operation are not always well defined. For instance, would a self-driving car be to blame if there was an unexpected event (such as a human-caused accident, or an oil tanker spill) on the road ahead and, because of a limitation of the system, the car went out of control in this environment and ended up killing its occupants? Would the creator of the DCGAN algorithm have to be penalized if the art doesn’t fetch some anticipated price at an auction? One of these questions may sound more serious and the other more innocuous, but in fact they both tie back to the same boundaries and domain of operation we discussed earlier, and to what we accept as the first order effects or characteristics of an AI agent, and what we don’t.

One could group the ambiguities around disproportionate responses by human systems to the results of AI systems as “moral crumple zones”. Indeed, in the context of human interaction with the natural world (because the two domains of beasts and men intersect there), we could term these “ecological crumple zones”, where human interest in extracting value of a specific kind from natural systems makes us prone to causing widespread and disproportionate imbalances in the world, as can be seen in over-fishing or over-whaling, the hunting of animals to extinction and the destruction of the rainforests. In the context of the AI-human interaction question, moral crumple zones (as defined by Epstein et al.) constitute disproportionate responses directed at one facet of an AI system because of the agony that an AI’s action has caused a human or a group of humans.

Our task, then, is to reduce the span of these “moral crumple zones”, and a key to this is the operational definition of the AI system mentioned above. This isn’t necessarily a direct or even useful answer to the dilemma posed by the art sale or the self-driving car’s blame assignment problem, but it does take us in that general direction.

A Submarine by Any Other Name: Narrow and Broad AI

It is helpful sometimes to think of agents in natural systems in terms of synthetic equivalents, although this may be a form of shoehorning their relevance, value and significance. For instance, one could call a fish, “a natural submersible agent” – although this may capture an aspect of the fish, i.e., its locomotion, it doesn’t necessarily do justice to the role of the fish in the context of the broader ecosystem, as an organism that’s part of the grand natural dance and as a complex organism in its own right, given to a physiological homeostasis of its own. In the same way, we could refer to birds and aircraft as analogues in the natural and synthetic spheres, and although you could argue that both are “evolved” (one by nature, another by the human hand), it is clear that the same limitations of the analogy as discussed earlier apply to this situation too.

Clearly, anthropomorphic AI is a long way off. Whatever stories the media spin around us about the coming strong AI, or the fact that “An AI beat Lee Sedol at the Game of Go”, it remains that these are engineered systems, and are not to be confused with agents with human-level intelligence. Surely, much of this AI hyperbole exists despite qualifications to the contrary, because it is a sign of our times to ascribe undue credit or undue intelligence to things that are not especially intelligent of their own accord. A DCGAN algorithm, for instance, consists of a generator neural network and a discriminator neural network, which are trained in lock-step in alternating training cycles, seeking to create a generator that just barely manages to fool the discriminator in a generation task. This doesn’t constitute broad intelligence.
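To make the “lock-step” training of a generator and a discriminator concrete, here is a minimal sketch in plain Python. It is not a DCGAN: there are no convolutions or images, only a two-parameter generator and a two-parameter logistic discriminator working on single numbers. The target distribution (a Gaussian centred at 4), the learning rate and all function and variable names are assumptions made purely for illustration.

```python
import math
import random


def sigmoid(s):
    """Numerically stable logistic function."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)


def train_toy_gan(steps=200, batch=32, lr=0.05, seed=0):
    """Alternate discriminator and generator updates on a 1-D toy problem."""
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator: x_fake = a * z + b, with z ~ N(0, 1)
    w, c = 0.0, 0.0  # discriminator: D(x) = sigmoid(w * x + c)
    d_updates = g_updates = 0

    for _step in range(steps):
        # Discriminator cycle: push D(real) towards 1 and D(fake) towards 0.
        # Loss: -log D(x_real) - log(1 - D(x_fake)); generator is held fixed.
        gw = gc = 0.0
        for _ in range(batch):
            x_real = rng.gauss(4.0, 1.0)  # assumed "real data" distribution
            x_fake = a * rng.gauss(0.0, 1.0) + b
            d_real = sigmoid(w * x_real + c)
            d_fake = sigmoid(w * x_fake + c)
            gw += (d_real - 1.0) * x_real + d_fake * x_fake
            gc += (d_real - 1.0) + d_fake
        w -= lr * gw / batch
        c -= lr * gc / batch
        d_updates += 1

        # Generator cycle: push D(fake) towards 1, i.e. try to fool D.
        # Loss: -log D(x_fake); discriminator is held fixed.
        ga = gb = 0.0
        for _ in range(batch):
            z = rng.gauss(0.0, 1.0)
            d_fake = sigmoid(w * (a * z + b) + c)
            ga += (d_fake - 1.0) * w * z
            gb += (d_fake - 1.0) * w
        a -= lr * ga / batch
        b -= lr * gb / batch
        g_updates += 1

    return {"a": a, "b": b, "w": w, "c": c,
            "d_updates": d_updates, "g_updates": g_updates}


if __name__ == "__main__":
    print(train_toy_gan())
```

Everything the system “knows” is the four numbers a, b, w and c, nudged back and forth by two gradient rules. Training of this kind can oscillate rather than settle, so the sketch makes no claim about what values it ends up with.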

Broad AI, in the words of Gary Marcus, is the converse of the narrow AI capabilities we have begun to see in the world today. Essentially, narrow AI systems like the DCGAN alluded to above, or the deep learning network that guides a self-driving car, are simplistic, purpose-built decision-making systems, which over time have come to be conflated with the capabilities of a human. Lee Sedol lost to a reinforcement learning model. I’ll admit this is a model of significant complexity, but it is a model nevertheless, and not a full-fledged intelligent agent in any sense other than in the narrow domain. While Lee Sedol retains the ability to gaze at the stars, wonder, appreciate art and a good cup of tea, and still pontificate on the ramifications of move #37 made by AlphaGo, AlphaGo itself remains a narrow construct, excellent at taking decisions in a specific domain and a specific context, with no clue of how to solve broader problems in human society.

The broader point here, then, is that anthropomorphism in the terminology used to describe AI can be a hurdle to truly understanding what the AI system is doing in the first place. A submarine and a fish aren’t one and the same thing. Even though both use the principles of hydrodynamics to accomplish locomotion tasks, one is a machine with a limited specification set and clearly defined system and operational parameters, while the other is a complex, evolving lifeform with broad influence and ramifications for the ecosystem it inhabits. The moral crumple zone for an evolving lifeform is different from that for an anthropomorphic robot. The moral crumple zone for the fish is likely to be smaller, and that for the AI agent is likely to be larger, especially if it is used as a tool by sophisticated creatures such as human beings. The AI agent is likely (given the nature of the progress we’re making in AI systems) to be of far more influence and may wield much more power than the fish, which plays its part harmoniously in a whole and will be either predator or prey in a vast ecological chain of cause and effect. The cyclical nature of the lifeform’s ecological homeostasis gives it a termination point, whereas artificial intelligence agents may not function in the same way, and may have much more longevity and influence, especially given that they are tools and ultimately are human means to a human end.

AGI and Alexander the God: A Shift in Agency

It isn’t unfair to suggest, in light of this discussion on narrow and broad AI, that an artificial general intelligence (AGI) may be subject to far more scrutiny than the narrow intelligences of today. I posit that when a system extends broadly enough, its agency shifts. Initially, an individual or team may have been culpable for the undesirable side effects of the system (such as blaming a specific individual for the lack of seat belts when a car fails to secure a passenger). Eventually, the problem becomes a systemic one that expands to include the whole system itself (a great example of this being the fraudulent, stupid and self-serving financial system that gave us the sub-prime crisis in the late 2000s).

The agency of a narrow AI system can be understood more or less easily, and matters of legal and scientific dispute may thereby be elucidated more clearly than in the case of a broad AI or a general intelligence. In the latter case, we essentially have parts of a whole to blame for different effects on the boundaries of the competence of the system.

Another of Asimov’s sci-fi short stories discusses a young protagonist who sets out to become rich by predicting the stock market. He successfully pulls this off by getting his computer to learn how the stock market behaves from the news and the associated changes in the market, and is able to tailor his trades to world events and phenomena, leaving him a very rich man. This story, like many others in science fiction, is saddled with the twin problems of technical debt and innovation in a vacuum, not to mention the efficient market hypothesis: had Alexander been able to do even a fraction of what is claimed in the story, he would have run up against a wall by doing so himself, compared to if he were “standing on the shoulders of giants who came before him”. In this sense, Alexander’s project in the Asimov story may be likened to a vast, multi-generational capitalist project in a never-ending free market economy (which in turn sounds less and less realistic given the vagaries of the world economy over time, the more I think about it). Such a project would, in all likelihood, have expanded its scope from being merely a profit engine to becoming one capable of influencing the world, because after a point, the shortest path to the goal of maximizing rewards would involve direct intervention. Somewhere along this trajectory, Alexander’s AI would stop being narrow AI and start becoming a more general intelligence. So, if a rainforest disappeared last week because Alexander’s AI needed it deforested in order to tick certain stocks up, would you blame that on the profit maximization rule built into the AI, or would you blame it on the greed of the humans who benefit from Alexander’s AI? At this scale, we all realize (as the legislators will) that the AI is a means to an end.

Coming back to the problem of ascribing credit and blame: the story in question is typical of how the media and the non-technical world at large see AI, as a looming, foreboding influence in the present that will enable large scale income inequality and that will somehow summon demons out of the fabric of reality. In fact, I’m not ashamed to use the hyperbole here, since Stephen Hawking himself ended up using a similar metaphor. The reality, of course, is that we have a number of narrow AI systems interacting with broader human-created systems such as trade or industry, and these can have an outsize effect on how we interact with the world. How does one characterize agency in this situation? Narrow AI systems may have a specification, but can a broader system with multiple specialization areas (and by extension multiple domains, multiple interaction points with society, greater complexity and lesser interpretability) be considered an agent in its own right for legal and social / cultural / scientific purposes? These broad AIs will also tend to have bigger moral crumple zones. Does that give society the responsibility to treat these AIs as they would treat, say, an air hostess who has to apologize to the customer on behalf of the airline, or a customer service agent who has to take the blame for a product? If yes, does the idea of humaneness become relevant in this context, and what does that entail?

Ascribing credit and blame in the case of the widespread use of narrow AI in complex human systems is probably as complex as, if not more complex than, the broader and hitherto unseen problem of legislating an artificial general intelligence.

Concluding Remarks

Credit and blame assignment in AI systems, like speculation and wonder, is innate to how humans respond to and characterize phenomena, systems and forces that are beyond our full grasp, but which we perceive to be simpler to understand than they often are. This isn’t necessarily a bad thing, but an indication of how we think. Short of upending everything we understand and know about agency and pulling us into an infinite regress of causality (“The whataboutery could indeed go all the way back to the Big Bang,” said he humorously), we can devise practical ways and means of characterizing the domains and scope of AI systems and plan for progress. This includes healthy discussions about the scope and limits of algorithms, the kinds of situations in which they may be used (and their due regulation, such as in warfare, including cyberwarfare), and keeping the human stakeholders of these systems informed in advance of the risks of using both narrow AI systems today and broader AI systems in the future. Without planning for progress in AI systems in this manner, we’re likely to inherit a future in which the admixture of human societal systems and AI capabilities leaves us befuddled: a society that cannot maximize outcomes for individuals because it has to carry on in a manner that is convenient to the clueless bulk of society or, at the very least, to the architects of the mess that it has become.

