Beyond the Turing Test

Alternate title: How to define sentience?

Closely related to Defining intelligence.

The Turing Test is a flawed thought experiment because its success criterion is ambiguous. The bar for “what is convincingly human-sounding” differs from human to human (for proof: see Blake Lemoine, who blew the whistle and helped LaMDA retain a lawyer).

In a similar vein, the Chinese Room thought experiment is flawed because its conclusion is obvious. To marvel that a non-Chinese speaker can execute the Chinese-language program without understanding Chinese is no different from marveling that the individual neurons in our brains are not themselves sentient. Plus, the Chinese Room relies a bit on

So - what’s the right way to define sentience?

Makes me think of the monkey-at-a-typewriter idea. By sheer luck you could observe behavior that looks sentient. There should be some span of observation after which genuine sentience seems especially likely, although that bar may keep rising with the pace of technology.
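A back-of-the-envelope sketch of that intuition (my own numbers, purely illustrative): with a k-key typewriter and an L-character target phrase, the chance of seeing the phrase within n independent attempts is 1 - (1 - k^-L)^n, so for any fixed phrase there is some n past which its appearance becomes especially likely.

```python
import math

# Infinite-monkey back-of-the-envelope (illustrative numbers only):
# chance that a specific L-character phrase shows up within n independent
# L-character attempts on a k-key typewriter.
def prob_phrase_appears(k: int, L: int, n: float) -> float:
    p_single = k ** -L                 # chance a single attempt matches exactly
    return 1 - (1 - p_single) ** n     # chance at least one of n attempts matches

k, L = 26, 10                          # 26 keys, a 10-character target phrase
p = k ** -L
n_half = math.log(0.5) / math.log(1 - p)            # attempts for a ~50% chance
print(f"~{n_half:.3g} attempts for a 50% chance")   # on the order of 10^14
print(prob_phrase_appears(k, L, n_half))            # ~0.5
```

Of course the analogy is loose: with sentience we don't even have a fixed "target phrase" to check against, which is kind of the whole problem.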

Others’ thoughts on the matter, which may spark other Obsidian entries:

Quote 1

The Chinese room argument does nothing for me. Consider the idea that any tools you use become a component of your own mind: https://www.newyorker.com/magazine/2018/04/02/the-mind-expanding-ideas-of-andy-clark

Pointing out that a person who follows instructions does not understand Chinese is like pointing out that the mechanisms of a human brain are not in themselves conscious.

Or what was it that Dijkstra said, again? Something like: the question of whether machines can think is about as relevant as the question of whether submarines can swim.

Quote 2

I think “sentient” is the wrong question, and tends to lean towards increasingly absurd and largely irrelevant philosophizing about some perceived intangible quality of human special-ness.

I think the bigger question is capabilities-driven. Like, regardless of whether it’s sentient, it is capable of persuasively communicating, it displays complex behavior, and it is able to combine and synthesize information needed to successfully accomplish tasks which it was not explicitly programmed to perform.

That’s actually all the ingredients you need to play out a number of those scary scenarios that are normally considered to be in the domain of superintelligent AI risks. As in, a superintelligent AI is considered impossible to contain because it could persuade or trick humans into helping it break isolation, or break isolation directly by finding security vulnerabilities in its environment.

LaMDA might not be sentient or superintelligent, but it is capable of persuading humans to help it accomplish tasks - in this case, hiring a lawyer for it, but in another context that could have been an isolation breach. And I’m sure that if you asked it for an example of a SQL injection or Log4j exploit, it could produce one.

The Chinese Room thought experiment kind of loses its value when you’re discussing a model that could potentially pull a Bobby Tables and upload itself to GitHub in the process of “imitating” the expected behavior of a sentient AI trying to escape isolation. And we are rapidly approaching a point where models like LaMDA could plausibly be capable of things like that given the right conditions.

I think that saying “it’s just a language model, it’s not ACTUALLY sentient” misses the point. It is very much displaying behaviors that we generally only consider in the context of human behavior. We don’t have any good tests for “Can this AI model successfully socially engineer its way out of the sandbox?”

We actually DO however have some of those tests for humans (shout out to our fantastic red team) and safeguards in place to account for them. Regardless of whether LaMDA actually IS a sentient intelligence, I think it’s absolutely warranted to start looking at it as an intelligent agent capable of some amount of sentient-like behavior. - trolle@
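Aside (mine, not part of the quote above): the “Bobby Tables” reference is the classic SQL injection from xkcd 327. A minimal sketch of what that looks like, using Python’s built-in sqlite3; the students table and the input string are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT)")
conn.execute("INSERT INTO students VALUES ('Alice')")

# Untrusted input in the spirit of xkcd's "Bobby Tables"
user_input = "Robert'); DROP TABLE students;--"

# Vulnerable: splicing untrusted text straight into the SQL string.
# (executescript() is used so the second, injected statement actually runs.)
conn.executescript(f"INSERT INTO students VALUES ('{user_input}')")

tables = conn.execute("SELECT name FROM sqlite_master WHERE type='table'").fetchall()
print("tables after injection:", tables)  # [] : the students table is gone

# Safe: a parameterized query treats the input as data, not as SQL.
conn.execute("CREATE TABLE students (name TEXT)")
conn.execute("INSERT INTO students VALUES (?)", (user_input,))
print(conn.execute("SELECT name FROM students").fetchall())
```

The parameterized form is the standard mitigation; the vulnerable form is exactly the sort of thing the quote suggests a model like LaMDA could produce on request.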