
Does AI come with built-in values and ethics?



By Derek Dempsey


Derek Dempsey is a highly experienced industry consultant and former academic specialising in applied AI, machine learning and predictive modelling. With a particular focus on financial crime prevention, he has worked for many years with organisations across many different industries around the world.



The dominant view of technology is that, in itself, it is value-neutral: it is how it is used that is ethical or not. However, there is a competing view that some technology carries implicit value by its very nature. Nuclear power, and its development into a destructive weapon, is often cited as an example; arguably, it is an inherently dangerous technology because of its potential to cause harm. Similarly, it is widely argued that AI is an inherently dangerous technology which, if allowed to develop unchecked, could bring real harm to humanity. Clearly, AI does raise new issues and will continue to do so. Some see it as an existential threat.


Separately, there is the argument that AI systems reflect, and potentially entrench, human values that are implicit in the data used to train them. This includes human biases relating to gender, ethnicity, age, disability and so on. This only applies to human-centred AI systems trained on this type of data, but that is an important segment. Here, the argument concerns the quality of the solution rather than the existential threat it poses.


Therefore, we have two concepts of value here:

  1. There is systemic value, whereby an AI system incorporates value by its intrinsic nature: because of what it is and what it can potentially do. For example, live facial recognition or social scoring, both prohibited in EU Member States by the European Union's Artificial Intelligence Act (2024).

  2. There is content value, whereby an AI system incorporates values within the decisions or output it generates. For example, gender bias in a recruitment algorithm.


AI systems are trained on datasets that are built and generally labelled for the training task (see Note). The labelling defines the outcomes that the system has to learn. For vision systems, it might be the objects in the images that need to be labelled by humans. For a fraud detection system, historic data on previously confirmed fraud cases forms the basis of the labelling. The AI system therefore reflects both data availability and previous human judgements unless active steps are taken to mitigate this. Where these systems use human characteristics, they should be checked for bias, and if unacceptable bias is found, it needs to be corrected. There are usually ways to do this, but correction also introduces an ethical stance into the system by essentially removing undesirable values (e.g. ethnic bias). So neutrality is itself an important value here, one that most people think should be required of all such systems.
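As a minimal illustration of the kind of bias check described above, the sketch below compares a model's positive-decision rates across groups and reports the ratio between the lowest and highest rate (sometimes called the disparate impact ratio). The column names and data are assumptions made purely for the example, not part of any particular system.

```python
import pandas as pd

def selection_rates(df: pd.DataFrame, group_col: str, outcome_col: str) -> pd.Series:
    """Share of positive model decisions within each group."""
    return df.groupby(group_col)[outcome_col].mean()

def disparate_impact_ratio(rates: pd.Series) -> float:
    """Ratio of the lowest to the highest group selection rate.
    Values well below 1.0 suggest the model favours some groups."""
    return rates.min() / rates.max()

# Hypothetical scored data: 'approved' is the model's binary decision,
# 'gender' is the protected characteristic being checked.
data = pd.DataFrame({
    "gender":   ["F", "F", "F", "F", "M", "M", "M", "M"],
    "approved": [0,   1,   0,   1,   1,   1,   0,   1],
})

rates = selection_rates(data, "gender", "approved")
print(rates)
print(f"Disparate impact ratio: {disparate_impact_ratio(rates):.2f}")
```

A check like this measures only one notion of fairness; deciding which notion should apply, and what counts as unacceptable, remains a human judgement.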


The COMPAS pre-trial assessment system

A case where both concepts of value arise is the use of pre-trial assessment systems in US courts. COMPAS was developed in the US to provide a pre-trial score indicating whether a suspect should be granted bail, with the aim of addressing the huge problem of pre-trial detention. The system is trained on historic pre-trial case data and aims to replicate those decisions while removing any bias. Thus, whatever normative values were applied previously will be reflected in the AI decisions. This usage is very concerning for many people, but those involved argued, with some justification, that automated decisioning can be consistent and evidence-based, and therefore potentially able to address unconscious biases in human decisioning. In practice, the system was advisory to the sitting judge, as all such systems require human oversight and ultimate approval. However, it was clearly influential in judges' decisions.


An independent review showed that the COMPAS system was biased against black suspects in certain ways. Although the findings were contested, some state judiciaries chose to discontinue its use. Retraining the system to remove this bias could have been an option, but once the public argument had been lost this was difficult. More fundamental is the second type of argument: that such decision-making should never be entrusted to AI. Even though the judge retained the final say, it is easy to envisage an automated system, over time, providing the main basis for the decision or even operating independently. One argument here is the lack of transparency in decision-making. Another is that such important decisions about human welfare should never be in the hands of AI. Where, then, is the accountability for these decisions? These systems continue to be used quite widely, and their benefit or otherwise continues to be debated. In general, their use has reduced levels of pre-trial incarceration but has not, as had been hoped, addressed perceived ethnic bias within the system.
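The "certain ways" matter, because much of the dispute turned on which fairness measure is the right one. The review in question focused on error rates: among people who did not go on to reoffend, one group was flagged as high risk more often than another. A hedged sketch of that kind of check (false positive rate per group) is below; the column names and data are invented for illustration and do not come from COMPAS itself.

```python
import pandas as pd

def false_positive_rate(df: pd.DataFrame, label_col: str, pred_col: str) -> float:
    """Among actual negatives (did not reoffend), the share wrongly flagged high risk."""
    negatives = df[df[label_col] == 0]
    return float((negatives[pred_col] == 1).mean()) if len(negatives) else float("nan")

# Hypothetical assessment records: 'reoffended' is the observed outcome,
# 'high_risk' the system's prediction, 'group' the demographic group.
cases = pd.DataFrame({
    "group":      ["A", "A", "A", "A", "B", "B", "B", "B"],
    "reoffended": [0,   0,   0,   1,   0,   0,   0,   1],
    "high_risk":  [1,   1,   0,   1,   1,   0,   0,   1],
})

for group, subset in cases.groupby("group"):
    print(group, round(false_positive_rate(subset, "reoffended", "high_risk"), 2))
```

Different fairness measures can point in different directions for the same system, which is part of why the public argument proved so hard to settle.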


Self-driving cars – why do we want them?

Self-driving vehicles represent one of the most advanced AI technologies in operation. Perhaps in ten years' time they will be a common sight on the roads of all major cities. Is this a good thing? Why do we want self-driving vehicles? Why do we want robotic automation over human agency? More pertinently, who is developing these vehicles? Who benefits? Is this another example of technological advancement regardless of human impact? Is such technology really value-neutral if it eliminates jobs and significantly changes our social environment? This question is the subject of Stephanie Hare's book Technology Is Not Neutral: A Short Guide to Technology Ethics (2022). Hare only looks briefly at self-driving vehicles, but her thesis is that we need to consider the potential societal impact of such new technologies, not simply develop them and then see what happens.


Value encoding

Some systems require 'value encoding' rather than 'implicit learning'. Value encoding is the method whereby certain values can be represented in, and effectively processed by, AI systems. A self-driving vehicle, for example, needs to have responses to potential incidents and collisions programmed into it so that it avoids causing harm to humans. There have already been incidents. In Arizona in 2018, an autonomous vehicle (AV) was confused by a woman walking across a road with a bicycle. The AV identified her first as an unknown object, then as a vehicle, and finally as a bicycle. Due to these misclassifications, the system did not accurately predict her path and crashed into her. The onboard safety driver was reportedly distracted, watching a television show on her phone at the time of the incident, and failed to react in time to prevent the collision. Every such incident, tragic as it may be, can be used to improve the system so that, ultimately, far fewer accidents occur than with human drivers. Variations on the famous trolley problem may also arise. Does the vehicle swerve to miss two people only to risk causing the death of another person, for example? How is this encoded? How does the system deal with the multiple ways such situations can arise? All this needs to be addressed in the development phase.
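One way to picture value encoding is as a hard constraint applied when the vehicle's planner ranks candidate manoeuvres: options predicted to harm a person are ruled out before any other cost, such as comfort or speed, is considered. The sketch below is purely illustrative, not how any production AV stack works, and every name and number in it is an assumption.

```python
from dataclasses import dataclass

@dataclass
class Manoeuvre:
    name: str
    predicted_human_harm: bool   # output of a (hypothetical) prediction model
    comfort_cost: float          # invented secondary objective: lower is smoother/faster

# Fixed fallback if no candidate is predicted safe: a minimum-risk stop.
EMERGENCY_STOP = Manoeuvre("emergency_brake", predicted_human_harm=False, comfort_cost=1.0)

def choose_manoeuvre(candidates: list[Manoeuvre]) -> Manoeuvre:
    """Encode 'do not harm humans' as a hard constraint, then optimise comfort."""
    safe = [m for m in candidates if not m.predicted_human_harm]
    return min(safe, key=lambda m: m.comfort_cost) if safe else EMERGENCY_STOP

options = [
    Manoeuvre("continue_lane", predicted_human_harm=True,  comfort_cost=0.1),
    Manoeuvre("swerve_left",   predicted_human_harm=False, comfort_cost=0.6),
]
print(choose_manoeuvre(options).name)   # -> swerve_left
```

As the Arizona case shows, the hard part is usually not the rule itself but the perception and prediction feeding it: a constraint is only as good as the classification of what is in front of the vehicle.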


Intelligent systems and creative problem-solving

Can we say, then, that such AI systems are making autonomous ethical judgements? Even if values are encoded as rules, they still need to be correctly applied: the system needs to interpret its world. A robot faced with the classic trolley problem and programmed with Asimov's First Law of Robotics (a robot must not harm a human or, through inaction, allow a human to come to harm) would face an impossible dilemma. It cannot cause harm to a human, so it cannot switch the tracks; and it cannot, through inaction, allow harm to a human, so it must switch the tracks. There is no utilitarian calculus within Asimov's law under which many human lives count for more than one. Unless the robot can find a creative solution, it is caught in an infinite loop. As would many people be.

Interestingly, a robot that responded by endlessly switching the tracks back and forth might just solve the dilemma by derailing the trolley.
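The deadlock described above can be made concrete with a toy check: if the only rule is "cause no harm and allow no harm through inaction", then in a trolley-style scenario every available action violates it, and the rule system yields no permissible choice. This is a thought-experiment sketch with invented numbers, not a real robot controller.

```python
# Toy trolley scenario: five people on the current track, one on the siding.
ACTIONS = {
    "do_nothing":    {"harm_caused": 0, "harm_allowed_by_inaction": 5},
    "switch_tracks": {"harm_caused": 1, "harm_allowed_by_inaction": 0},
}

def permitted_by_first_law(effects: dict) -> bool:
    """Asimov's First Law as a hard constraint: no harm caused, none allowed."""
    return effects["harm_caused"] == 0 and effects["harm_allowed_by_inaction"] == 0

permissible = [name for name, effects in ACTIONS.items() if permitted_by_first_law(effects)]
print(permissible)   # -> []  neither action satisfies the rule, hence the deadlock
```

A utilitarian variant would instead minimise total harm and pick "switch_tracks"; which rule a system encodes is exactly the kind of value choice made by its designers rather than by the system itself.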


Concluding comments

The development of AI is an ongoing learning process for humanity. AI systems may be fundamentally mathematical, but they can, and indeed will, acquire values through training data or through intentional programming. Their human designers, however, will always determine their purpose and goals, and judge whether they are pursuing them correctly and safely. In this respect they are like any other technology. The amount of human engagement around all AI systems is enormous. The advent of systems that truly learn by themselves is yet to happen.


At the same time, as they attain increasing capacity for autonomous action and agency, the question of their value-base becomes increasingly important. Questions of bias are important but not fundamental to the larger debate. Bias can be measured and often addressed, but this doesn’t solve societal problems of bias. We still need to do that ourselves. The myth of ‘entrenchment’ is really based on human failure to sort out our own biases. If we don’t, then our AI systems will continue to reflect them.


The question of why we need these AI systems, such as self-driving vehicles, is interesting. Would a single woman needing a taxi home late at night choose a self-driving vehicle, given the choice? Quite possibly. Will they reduce accidents? Quite possibly. Most systems satisfy some demand. Halting technological progress is also difficult. If we reached a consensus, we could ban some systems. The EU AI Act has done this for multiple systems across the EU zone, though not for self-driving vehicles. The EU view is that some technologies, such as live facial recognition, are either liable to be misused or fundamentally wrong because they represent a threat to civil liberties. This speaks to the nature of the society that we want.


Many AI systems operate in value-neutral environments. A navigation system just navigates. Many systems that influence human choices, such as recommendation algorithms for music or literature, are largely applying your own preferences and biases back to you. This is mostly incidental, but potentially useful as well as profitable. People worry about money-lending decisions, but these, and almost everything financial, are already highly regulated, whether guided by AI or not.


Generative AI enables the creation of content: text, audio or visual. We have yet to see what values might be contained or trained into these systems but they can be trained to hold certain views and opinions. Just as a fiction writer can create characters, these systems will be able to take on personas. This will probably need to be regulated, at the very least to ensure that people are aware they are interacting with an AI. Ultimately, we can decide what values and systems should be permitted.


Note

In this context, labelling refers to the process of assigning specific annotations or tags to data to provide AI systems with the necessary information to learn and make predictions during training.
