The AI Misnomer

After China started to promote progress in "artificial intelligence", Chinese friends who find out that I'm a computer scientist sometimes ask me questions like "what will become of society after AI takes all our jobs?" Perhaps many people all over the world are in fear of "artificial intelligence" because they think of how it is portrayed in science fiction, whereas the entire world's media has generally failed to explain what it means in reality.

The 1990s science-fiction television drama "Star Trek: The Next Generation" featured an android, "Lieutenant Commander Data", who had a "positronic brain" (an idea from 1940s author Isaac Asimov) which apparently gave Data self-awareness or consciousness. The 1979 Star Trek motion picture had an artificial antagonist that used a female robot to confront the protagonists in their spaceship, and the final act showed this antagonist to have originally been a simple space probe, but collecting too much data caused it to become self-aware and gain consciousness (and a misunderstanding that Earth's inhabitants should be destroyed, so the film can have a plot). According to this kind of science fiction, self-awareness is not only something that can be taken for granted as an easy addition to AI, but is also something that can spontaneously emerge in a simple computer (and turn against humans). But this kind of fiction also claims that the origin of life is so easy that it happened everywhere and the entire galaxy is brimming full of aliens, whereas evidence suggests our real universe does not have those conditions: life cannot arise so easily, and neither can self-awareness or consciousness.

MIT computer scientist John McCarthy, who coined the term "artificial intelligence" in 1956, was the inventor of the LISP programming language and wrote chess software (and suggested a form of cloud-computing service in the 1960s), but the things he was describing were not meant to have anything to do with consciousness or self-awareness. A computer's processor is essentially an electronic circuit: a transistor can let one current switch another current on or off, so we can use particular arrangements of transistors to process binary-number patterns (performing arithmetic, moving data patterns around, etc.) by switching. We can arrange transistors such that different binary patterns on the input cause different parts of the circuit to be switched on, and we can have the system process its input in a sequence, so that we can refer to it as "obeying the software's instructions", whereas actually it is simply a circuit being manipulated by the software's binary patterns and it has no understanding. But this kind of machine code is hard to write, so computer scientists have been inventing "high-level languages" which, although still computer languages, are easier to write; software called "compilers" and "interpreters" can translate these high-level languages into the necessary binary patterns for the machine. But a compiler's 'translation' is most definitely not the same thing as machine translation of natural (human) languages: writing reliable natural-language translation software is very difficult (such software often makes mistakes, as I'll mention below), whereas "compiling" a computer language is very reliable because these languages have been specifically designed for it. Today there are many high-level languages, and I recommend that all computer scientists should at least know Python, C, C++, Java, JavaScript, Bash, Makefile etc.
McCarthy hoped that future programming languages could look more like declarations of facts, letting the compiler translate these into commands, but such a system would still have no real understanding; it would just need a first-rate compiler designer to pull it off. This kind of software behaves like a device reacting to patterns: the programmer says "if this pattern happens, then take this action, or if that pattern happens, then take that action" etc., and has to consider beforehand in great detail what kind of input patterns may happen; overlooking a rare situation can result in the software behaving inappropriately or crashing when that situation happens.
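
To make the "arithmetic by switching" idea concrete, here is a minimal Python sketch. It assumes that one NAND gate stands in for a small arrangement of transistors, and the function names (nand, full_adder, add_bits and so on) are made up for this illustration rather than taken from any real hardware description. Nothing in it "knows" what a number is: addition simply falls out of how the switches are wired together.

```python
def nand(a: int, b: int) -> int:
    """One switching element: the output is off only when both inputs are on."""
    return 0 if (a and b) else 1

# Every other gate below is built purely by wiring NAND gates together.
def and_(a: int, b: int) -> int:
    x = nand(a, b)
    return nand(x, x)

def or_(a: int, b: int) -> int:
    return nand(nand(a, a), nand(b, b))

def xor(a: int, b: int) -> int:
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))

def full_adder(a: int, b: int, carry_in: int):
    """Add three single bits, returning (sum bit, carry-out bit)."""
    partial = xor(a, b)
    total = xor(partial, carry_in)
    carry_out = or_(and_(a, b), and_(partial, carry_in))
    return total, carry_out

def add_bits(x_bits, y_bits):
    """Add two equal-length bit lists, least significant bit first."""
    carry = 0
    result = []
    for a, b in zip(x_bits, y_bits):
        bit, carry = full_adder(a, b, carry)
        result.append(bit)
    result.append(carry)  # final carry becomes the top bit
    return result

def to_bits(n: int, width: int):
    """Binary pattern of n, least significant bit first."""
    return [(n >> i) & 1 for i in range(width)]

def from_bits(bits) -> int:
    return sum(bit << i for i, bit in enumerate(bits))

print(from_bits(add_bits(to_bits(5, 4), to_bits(6, 4))))
```

The conversion helpers to_bits and from_bits exist only so that a human can read the patterns as numbers; the "circuit" itself only ever sees ons and offs.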

What is called "artificial intelligence" today typically involves a method called "machine learning", but it's not real learning (the system still has no real understanding): it's just a shortcut so that the programmers don't have to write so many rules themselves. For example, I wrote a piece of software to turn Chinese characters into Pinyin pronunciation. But how do I handle the case where one Chinese character can be pronounced in more than one way? For example, the character 差 can mean "wrong" (read chà) or "different" (read chā) or "send" (read chāi), so when reading it in a sentence, the reader is supposed to judge from the context whether this particular instance of it is "wrong" or "different" or "send" and pronounce it accordingly. (English examples include "lead", "wind", "bow" etc.) A real person can interpret the meaning of the surrounding text, but the computer is just a machine, so how can it understand what the sentence is about? So my system's specific rule (in the November 2020 version) is: "look at the 3 Chinese characters before this point, and the 2 Chinese characters after it, and see if these 5 include any of the following words:

'I' or 'go' or '’s' or 'you' or 'to' or 'then' or 'person' or 'make' or '-s' or 'will' or 'and' or 'sky' or 'child' or 'emperor' or 'king' or 'rub' or 'first' or 'Je' or 'must' or 'out' or 'big' or 'slave' or 'stand' or 'although' or 'personage' or 'basis' or 'second' or 'cultured' or 'anger' or 'heart' or 'time' or 'cast' or 'lord' or 'besides' or '-th' or 'sacrifice' or 'Mal' or 'Ms' or 'every' or 'send him'

then the meaning is 'send' (chāi) rather than 'wrong' (chà)". Why did I write such a complicated rule? I didn't. My computer did. This rule was generated by my "machine learning" system, because I fed it about 4,000 pages of example Pinyin annotations of Chinese characters, which included over 2,000 examples of how this character should be handled. From these my software could tell that the character has more than one reading (because different examples give it different Pinyin), so it processed the context of each example and found the smallest set of rules that would distinguish which instances are read as "send."

Is this really "intelligence"? I doubt it. It seems to work: if I write "I send him to go" in Chinese, the system can correctly infer that this use of the character should be read as "send", because "I" and "go" are there to provide that clue. But if I write the Chinese for "I am wrong" (which also includes "I"), the character should be read as "wrong", yet the system still reads it as "send" because of the "I" being there. Of course, once I know about this problem, I can easily fix it just by adding the phrase "I am wrong" to the examples corpus and then re-running the processing of that corpus, whereupon it will drop "I" from the list of context words that can be used to differentiate the case and instead select another. But how do I know how reliable my system is? Although I plan to continue adding more examples and make it more and more reliable, I'm afraid that input very different from the example sentences will always produce some mistaken results, because this kind of "machine learning" isn't really learning: it's just like a conjuring trick to make people think the computer has learned. "Machine learning" can make my life as a programmer easier (writing all those rules by myself is more difficult), but not only is the system sometimes wrong, I myself can't always guess when it is going to be wrong. I still recommend my system to my friends (because I've seen other software to add Pinyin that is even worse), but I don't want to say this is real "learning" or "intelligence". But this kind of thing is the current state of "artificial intelligence"!
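
The rule-generation idea and its failure mode can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a simple greedy search rather than whatever method the real system uses to find its "smallest set of rules", the four-sentence corpus is invented for the example, and the names context, learn_indicators and read are hypothetical.

```python
def context(text, pos, before=3, after=2):
    """Characters within the window around position pos, excluding pos itself."""
    return set(text[max(0, pos - before):pos] + text[pos + 1:pos + 1 + after])

def learn_indicators(examples, target):
    """Greedily pick a small set of context characters that occur near every
    example read as `target` and near no example with any other reading."""
    positives = [context(t, p) for t, p, reading in examples if reading == target]
    negatives = [context(t, p) for t, p, reading in examples if reading != target]
    forbidden = set().union(*negatives)  # clues that would misfire on other readings
    chosen = []
    uncovered = list(positives)
    while uncovered:
        candidates = set().union(*uncovered) - forbidden
        if not candidates:
            raise ValueError("no single-character clue separates these examples")
        # Take the clue that explains the most still-unexplained examples.
        best = max(sorted(candidates),
                   key=lambda ch: sum(ch in ctx for ctx in uncovered))
        chosen.append(best)
        uncovered = [ctx for ctx in uncovered if best not in ctx]
    return chosen

def read(text, pos, indicators, target, default):
    """Apply the learned rule: any indicator in the window selects `target`."""
    return target if context(text, pos) & set(indicators) else default

# Invented toy corpus: (sentence, index of 差, correct reading).
corpus = [
    ("我差他去", 1, "chāi"),        # "I send him to go"
    ("我差人送信", 1, "chāi"),      # "I send someone to deliver a letter"
    ("质量很差", 3, "chà"),         # "the quality is poor"
    ("他的成绩太差了", 5, "chà"),   # "his results are too poor"
]
rules = learn_indicators(corpus, "chāi")
print(rules, read("我很差", 2, rules, "chāi", "chà"))

# The fix described above: add the misread sentence and re-run the learning.
corpus.append(("我很差", 2, "chà"))  # "I am wrong"
rules = learn_indicators(corpus, "chāi")
print(rules, read("我很差", 2, rules, "chāi", "chà"))
```

With this toy corpus the first run settles on 我 ("I") as its only clue, so it misreads 我很差; after the extra example is added, 我 is ruled out and other clues are chosen instead. The program never knows what any of the sentences mean, and nothing guarantees that the new clues will not misfire on some other unseen sentence.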

Machine translation, although it has progressed (because engineers continuously add new examples to their "machine learning" systems), nevertheless still makes mistakes, sometimes quite serious ones. For example, in 2020 Ting Zhang of the local "Crayfish" company wrote (in Chinese) "when I was recently invited to attend a hearing at the House of Lords, I made a strong case to help the legislators clearly understand the contribution to the UK from Chinese investors, and the negative effects on Britain's science and technology sector if Chinese investment were to be prohibited" (my translation). But WeChat's automatic translation effort came out as the opposite meaning: "When I was recently invited by the House of Lords to attend the hearing, I argued. Let lawmakers know a lot about the contribution of Chinese investors to the UK and the negative impact of Chinese investment [sic!] on the British technology community." Please note this was WeChat translate, not Google, so please don't quickly say this error must have been due to American companies being prejudiced against China: what we are seeing here is simply a limitation of "artificial intelligence". This particular problem can be fixed, but then we'll still be waiting for the next one to happen. This isn't really intelligence, and it isn't really reliable. (And I did not use any machine translation when writing this web page: all Chinese mistakes are my own.)

A similar situation arises in signal processing. For example, consider products like fitness trackers. How does such a device count your steps when you walk? How does it know how your sleep was? This is another product of "machine learning" where we do not know the details of the rules because some computer has automatically deduced them from many examples. When a delegation from Suzhou's Innovation Incubator came to Cambridge in 2019, they gifted Huawei fitness trackers to the audience, including me (which made me feel a little awkward as I'd only come for the "networking" with Chinese research students), and when I got home I found I could trick it into counting steps simply by swinging it like a pendulum from my finger, but after downloading a software update this was no longer possible (it also seemed to somehow "know" whether it was on my wrist or just being held by the fingers, and sent text-message notifications only if it was actually on the wrist). I don't know what rules were included in this new version to differentiate between sensor inputs corresponding to these two situations, but I suppose it's because they collected and merged in more example data. Although this seems very good, I know there will be other situations where it would still make mistakes (so in the instruction book they printed that this is not a real medical device): this is a limit of "machine learning." (I did use the watch for quiet vibrating alarms, until its strap broke in 2022.)

"Computer vision" has similar limits. Although there are more and more good uses of it (such as blind programmer Saqib Shaikh's helping Microsoft write the Seeing AI app as a computer vision tool for the blind, and Rana el Kaliouby's work to read emotions from faces which can help some autistic people and with vehicle safety etc), the system still lacks real understanding, and although it is useful it is never quite 100% reliable. Facial recognition in security systems always has a certain error rate, so we'd better not rely on it completely: if my bank wants both a password and a face, that might not be too big a problem, but if they use the face only then there's bound to be someone else in the world with a face similar enough to mine to trigger the system into a false recognition, and we won't completely understand exactly what conditions caused that misread. If the future really has self-driving cars then we must understand their limits.

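The claim that someone else's face is "bound" to match can be checked with simple arithmetic. The sketch below is illustrative only: the false-match rate of one in a million and the population of eight billion are assumed numbers, not figures for any real system, and it treats each comparison as independent, which real faces are not.

```python
import math

def expected_false_matches(false_match_rate, population):
    """Average number of other people the system would accept as you."""
    return false_match_rate * population

def chance_of_any_false_match(false_match_rate, population):
    """Probability that at least one other person matches:
    1 - (1 - rate) ** population, computed via logarithms so that
    very small rates do not get lost in rounding."""
    return -math.expm1(population * math.log1p(-false_match_rate))

RATE = 1e-6                  # assumed: one false match per million comparisons
POPULATION = 8_000_000_000   # assumed: roughly the world's population

print(expected_false_matches(RATE, POPULATION))
print(chance_of_any_false_match(RATE, POPULATION))
```

Under these assumptions there would be thousands of people whose faces the system would accept in place of mine, which is why a face on its own is a weaker check than a face combined with a password.
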
[Road sign in Cambridge: Follow diversion signs not your sat nav]
I'm not afraid of "artificial intelligence", but I'm afraid of people relying on these systems' decisions too much. "Automation bias" is when people mistakenly think a computer's suggestion is more reliable than another human's suggestion. In the past, complicated software (such as typesetting packages for music or mathematics) was called an "expert system" because some expert (such as mathematician and computer scientist Donald Knuth of Stanford) summarised their knowledge to program the system. The system's decision was usually better than that of an average non-professional user. These days the 'buzzword' is "algorithm" instead of "expert system", but if you don't know how the algorithm was written, you don't really know it's reliable. One of my English friends once visited Cambridge and gave me a lift somewhere in his car; the car's satellite navigation equipment directed us through Silver Street, where I knew there was a barrier, but he said not to worry as the satellites are always right and he'd recently updated the map. So we drove up to the barrier, and I had to explain that although the satellites themselves are usually quite reliable (at least in peacetime), the car equipment's updates do not always include all new information about barriers etc., so we still need our real intelligence. On another occasion, an English friend tried to use a computer voice to learn Chinese, but the computer voice was making a mistake, and he couldn't believe that a computer voice could possibly make mistakes, so I had to list some examples of Chinese mistakes in computer voices just to show they shouldn't be taken as an authority. And when I wrote some software to help a local cancer-research group plan their experiments, the software needed to know how many test tubes they wanted to use, so I wrote a question "how many test tubes, 3?" because I needed a 'default' answer and I thought 3 was as good a number as any. I was fully expecting the biologist user to say "oh no not 3, we want 16, let's put that in" or whatever, but I later found out that someone saw my "3" and said "computer says it should be 3, computer says it should be 3, we've got to change our setup to 3" (oh dear). So I made the next edition calculate the default more carefully, but this is a clear example of automation bias: if I myself (as a human) had walked into their lab and said "do you want 3 test-tubes here?" that user would not have taken it from me nearly as easily as they took it from my "algorithm" on the computer, which didn't actually know any better than I did in this instance.

"Artificial intelligence" has knowledge but not understanding: it's still useful, but we who have understanding had better check its suggestions. "Behaviourists" like Dennett claim that consciousness is merely an illusion to help us prioritise which signals to process, and Dennett uses the example of "blindsight", but I can confidently say I know more than Dennett about blindsight because I've experienced it. I have cortical visual impairment (my brain does not fully process visual signals), and I have experienced both with-qualia vision and without-qualia blindsight-guessing (although I can't rely on either system all the time), and I know there's a difference between the two (at least in my brain), so I think there's something wrong with Dennett's theory that it's all the same (but we don't know if Galton is right to speculate that Dennett himself has no qualia). Yes, Dehaene is afraid that too much focus on qualia can lead to dualist philosophies like the ancient Babylonian idea of man's immortal soul, but we don't have to go there: I'd rather just say we are material beings but we still don't understand how self-awareness and consciousness really work (all of today's theories have problems), so we simply can't create a machine with genuine understanding. This is not the argument from Searle's "Chinese room" thought experiment: he just said "suppose a worker in the room doesn't understand Chinese but follows some rules that let him choose correct Chinese character cards to display the correct answers to questions", and the refutation is simply that the worker's understanding of Chinese is not relevant to the question of whether the system as a whole (including the rules) understands Chinese. But I claim that such a system isn't even possible: if machine translation is anything to go by, no set of rules can always produce the kind of answer that a person with understanding would provide, so the "Chinese room" is an impossible supposition.

People sometimes ask me to predict the future of "artificial intelligence" but I don't know. My guesses are:

1. A really novel algorithm will be invented by someone on the autism spectrum with strong lateral-thinking ability.

2. Exam markers will experience "Turing tests" as quality control: computers will generate fake student scripts and the marker has to show they won't be fooled by fake understanding.

3. Poor-quality managers had better be careful not to be replaced by artificial intelligence. (Good managers know the work and the workers and understand both, but if some manager can only say things like "What do you need to do? What are the sub-tasks? When will you finish? Have you finished yet? Finished yet? Be quick!" without even knowing what the work is, then they'd better be careful to show they're bringing more value than some piece of software that says those things.)

4. If there's too much hype, we might get another "AI winter" as governments and companies get disappointed and abandon AI, but if people increase their understanding of the limitations of "artificial intelligence" then perhaps that can be averted.

5. The industry will still require "traditional" (not machine-learning) programmers, because people want to understand the reasons behind the rules. (Although I hope "explainable AI" progresses, I doubt it will ever be able to completely replace traditional human programmers.)

6. I hope everyone will eventually learn to avoid automation bias.

(I wrote the above in 2020 in response to a discussion on a pandemic chat group. In 2023 when I was invited to give a short talk on copyright of AI art at the Oxford China Forum, I slightly refined the argument by bringing in Raymond Cattell's distinction between "fluid" and "crystallised" intelligence. They posted the video on the YouTube platform; the more interesting Q&A section was also recorded but I don't know where.)

All material © Silas S. Brown unless otherwise stated.