On “Trustworthy” AI.

If anything is true, it’s that AI has become a hot topic in recent years. However, it’s almost as if the people who ‘make’ AI are rather worried that we won’t trust it. There is a great deal of chatter about making AI more trustworthy in some way, as if this will be a solution to the problem.

In one of the chapters of the Trust Systems book I make an attempt to address the problem. It’s not what you think it is.

Here’s an extract. I hope you enjoy it. Comments are, as ever, welcomed.

Consider this: when an AI is released into the “real” world, every experience it has changes it. It is almost instantly no longer the thing that was released. Who is to blame if it fails? When Tay was released (rather: subjected) to Twitter in 2016 she was innocent in the sense that she didn’t know different (although some things she did know not to touch). Her subsequent descent into racism and homophobia was perfectly understandable (have you seen what gets posted on ‘social’ media?). Much more to the point, she wasn’t the agent that was released onto Twitter as soon as she was released. There really is no-one to blame.


Sure, Microsoft apologized, but most importantly, Microsoft apologized like this: “We are deeply sorry for the unintended offensive and hurtful tweets from Tay…” It is easy to say that Microsoft was at fault, but Tay posted the tweets.

Did you notice something in the preceding paragraphs? I’ll leave you to think about it.

There is a great deal of airtime devoted to making AI more trustworthy by, for example, increasing transparency, or predictability, or whatever, in the hope that people will trust it. The goal is to get people to trust AI, of course, so that all its beneficence will be showered upon us, and we will be, as it were, “All watched over by machines of loving grace.” (Which, if you don’t know it, is the last line of a poem by Richard Brautigan, as well as the source of a rock band’s name!)

Sure, that was sarcasm, but the point is this: some people want us to “trust” AI. Naturally, the answer would seem to be to make AI more trustworthy.

This is answering the wrong question.

Trustworthiness is something that, if you have got this far, you know is the province of the thing or person you are thinking of trusting. That is to say, we don’t give trustworthiness to something; it either is or is not trustworthy to some extent. What we (can choose to) give is trust. More to the point, we can choose to give trust even if the thing we are trusting is untrustworthy. Even if we know it is untrustworthy.

To labour the point a little more, let’s turn to the Media Equation. As a reminder: people treat technology as a social actor (they are even polite to technology).

The argument that we shouldn’t trust technology because it is basically just an inanimate, manufactured ‘thing’ is entirely moot.

I’m not going to argue one way or another about whether or not we should trust an AI. That cat is already out of the proverbial bag. If you haven’t seen that yet, let me spell it out for you: that people already see their technology as a social actor means that they almost certainly also think of it in terms of trust. It truly doesn’t matter if they should or not, they just do.

This leaves us with only one option, which is what Reeves and Nass told us all along: design technology on a path of least resistance. Accept that people will be doing what people do and make it easier for them to do so. Even if you don’t, they will anyway, so why make it hard?

Let’s briefly return to the trustworthiness of AI. I’ve already said it’s pretty much a done deal anyway — we will see AI in terms of trust regardless of what might happen. The argument that we should make AI more trustworthy so that people will trust it is pointless.

What is not pointless is thinking about what “trustworthy” actually means. It doesn’t mean “more transparent”, for instance. Consider: the more we know about something, the more we can control (or predict) its actions, and so the less we need to even consider trust. Transparency doesn’t increase trustworthiness; it just removes the need to trust in the first place (or gives us a really good reason not to trust!).

But of course, AI, autonomous vehicles, robot surgeons and the like are not transparent. As we alluded to in a 2012 paper, we have already crossed the line of making things too hard for mere mortals to understand. Coupled with the rather obvious fact that there is no way you can make a learning system really transparent, even to its creator, after it has learned something that wasn’t controlled, we are left with only the choice to consider trust. There is not another choice. Transparency is a red herring.

That given, what can we do? We are already in a situation where people will be thinking about trust, one way or another. What is it that we can do to make them be more positive?

Again: this is not the right question.


If you want someone to trust you, be trustworthy. It’s actually simple. Behave in a trustworthy fashion. Be seen as trustworthy. Don’t steal information. Don’t make stupid predictions. Don’t accuse people with different skin colours of being more likely to re-offend than others. Don’t treat women differently from men. Don’t flag black or brown people as cheating in exams simply because of the colour of their skin.

Just don’t. It’s honestly not that hard.

It’s actually not rocket science (which is good because I am not a rocket scientist). If the systems we create behave in a way that people see as untrustworthy, they will not trust them. And of course, with excellent reason.

And if you are about to say “it is hard, actually” then what follows is almost certainly true as a result: we are applying AI in all kinds of places where we shouldn’t, because the AI can’t do it properly yet.

And we expect people will want to trust it positively?

Let me ask one question: if you saw a human being behaving the way much of the AI we have experienced does toward different kinds of people, what would you do?

Published by Steve

Partner, Dad, Prof, Writer

One thought on “On “Trustworthy” AI.”

  1. Hi Steve, It would seem that zero trust (a neutral position, with trust required to proceed) has been adopted by security vendors’ marketing teams as a slogan to gain sales. To answer your question, a distrust model would be applicable for AI: with learning bias and neural networks, humans simply cannot trust or understand the logic behind the MACHINE decision. A distrust model is a negative position, and in a security context it would have the MACHINE isolated from areas of trust (such as Twitter posts), with the untrusted outputs validated and augmented by humans (like a prison work program in a human context). The intent of such a model is to empower and uplift human intelligence. I would be interested in your thoughts along these lines. Kind regards, Phil #HAMI (Human augmented MACHINE INPUT)

