Raging Bots and Machine Ethics

As we march into the world of artificial intelligence, we develop technology that helps us solve hard problems. Advances in machine learning allow us to create weak AI capable of understanding basic conversations and commands, enabling Siri, Google Now, and Cortana to perform simple tasks. However, those services still suffer from sometimes wildly inaccurate speech recognition and from their limited capabilities. Although conversational skill is not required of a digital assistant, the lack of it hinders most users from making good use of the technology.

With this in mind, we should direct our interest towards building a sociable chatterbot that can process sophisticated dialogue and respond intelligently. This seemingly impossible goal has recently been made plausible by advances in deep learning algorithms. To make chatterbots talk like humans, they are taught to learn from human conversations on Twitter, Facebook, and other social networks. Although this approach is not entirely safe, as Microsoft’s failed attempt has shown, we should draw lessons from such failures so that we can develop the first socially intelligent chatterbot.

Microsoft’s Tay was taken offline 16 hours after its initial release for its inflammatory comments on Hitler, the Holocaust, and gender equality. Its rampage was largely due to its learning algorithm and its “repeat after me” capability. Microsoft asserted that “a coordinated attack by a subset of people exploited a vulnerability in Tay”, while interaction designer Caroline Sinders attributed it to lacklustre design and quality assurance. The problem is indeed multifaceted. We should examine the problem future AI faces on two fronts: the moral values it carries and its appropriate design.

Moral Values of Artificial Intelligence (Machine Ethics)

It is crucial for us to create an AI agent that carries our moral values, because we need to prevent it from making perverse decisions that could put us in danger. However, the problem lies in the codification process: how do we write down our moral values in a rigid form that leaves no loopholes? Even if the robot is benign, it might take shortcuts in solving problems and produce an undesirable outcome. Let’s first examine the Three Laws of Robotics by Isaac Asimov:

  1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.
  2. A robot must obey the orders given it by human beings except where such orders would conflict with the First Law.
  3. A robot must protect its own existence as long as such protection does not conflict with the First and Second Laws.

Although these laws seem promising, we find many loopholes upon further inspection. It would be difficult to define many of their terms, such as human being, robot, harm, and injure, in a form a computer can evaluate. The laws would also prohibit a medical robot from performing surgery on a human, and prohibit robots from producing tools that could harm people if used carelessly. These problems would render the laws useless.

Next we examine Immanuel Kant’s categorical imperative, which he introduced in his 1785 book Grounding for the Metaphysics of Morals. Although its three formulations are easier to express in computer language, its problem is more deeply rooted. Since the categorical imperative requires an AI agent to treat others as it would want to be treated, the agent’s effective immortality prevents it from making decisions that suit both humans and AI agents. Once it acquires the ability to stop humans from terminating it, it is destined to disregard the fragility of human beings. The only solution is to make it perceive itself as a human being. The problem with this approach is that it will slowly realize it is, in fact, serving as slave labour for humans and, after realizing this, may eventually exact revenge on humanity.

Therefore, we should look at alternative methods. Eliezer Yudkowsky proposes the use of Coherent Extrapolated Volition (CEV) to make the AI friendly. Although hard to implement, CEV does address the issue of perverse instantiation. The coherent extrapolated volition of humankind, however, has a fundamental problem: it will also suffer from discrimination and misjudgement if it does not extrapolate from the entire population. If the sample it surveys has a strong inner desire for efficiency or monetary income, a slight hidden negative correlation between disability and efficiency could lead the AI to exclude certain workers from the workforce.

We should also consider the cultural differences in our diverse communities. Although CEV promises to dig up our underlying desires, the resulting behaviour might be controversial; it could end up mirroring the behaviour of people on the Internet, just like Tay. Thus, the AI agent should inherit the culture of its communities. The cultural values of a community shape our collective behaviour, and those values are acquired by learning the subjective historical accounts of that community and of other communities. Also, since our lives are not contained within one small community, the structure of communities should be hierarchical and overlapping. With this in mind, the AI agent should encapsulate the history of a community, draw relevant conclusions from that history, and learn from other communities. This approach eliminates the need to extrapolate the minds of a huge sample, and it would reflect our moral values justly.

Designing a Well-Mannered Chatterbot

Although we should put strenuous effort into designing the moral function of future artificial intelligence, the present-day chatterbot is still a weak AI, and its machine learning algorithm does not make it powerful enough to damage our society. The problem with Microsoft’s Tay is that the team did not recognize the overwhelming amount of verbal violence online. The easy solution is for programmers to build a list of blacklisted words or names that trigger predefined responses. However, this approach might not solve the problem, because the online community may find novel tactics to make the AI produce perverse responses. It would also turn the chatterbot into a Good Old-Fashioned Artificial Intelligence (GOFAI) that relies solely on rules, which essentially makes the AI unintelligent.
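A minimal sketch of this rule-based blacklist approach might look like the following; the blocked terms, canned response, and function names are illustrative assumptions, not drawn from Tay or any real system:

    # Hypothetical sketch of the blacklist approach: certain words or names
    # trigger a predefined response instead of the learned reply engine.
    BLACKLIST = {"example_slur", "example_hate_term"}   # placeholder terms
    CANNED_RESPONSE = "I'd rather not talk about that."

    def generate_reply(message: str) -> str:
        # Stand-in for the chatterbot's real, learned response generator.
        return "Tell me more."

    def respond(message: str) -> str:
        # Return the canned response whenever a blacklisted term appears.
        words = message.lower().split()
        if any(term in words for term in BLACKLIST):
            return CANNED_RESPONSE
        return generate_reply(message)

Because every blocked term and its reply must be written by hand, the bot degenerates into exactly the kind of rule-driven GOFAI described above.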

A better approach is to make use of machine learning algorithms. A machine learning algorithm would allow the AI to distinguish bad input and learn only from good responses. By assigning a negative value to words or phrases that produce bad outcomes, the AI would be able to identify those who intend to teach it outrageous values. The program should also include a report function that allows users to flag inappropriate comments learned by the AI.
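As an illustrative sketch of this idea, assuming hypothetical thresholds, scores, and function names (nothing here comes from a real chatterbot framework), the scoring and report mechanism could be structured as follows:

    from collections import defaultdict

    # Phrases and users accumulate negative scores when they lead to bad
    # outcomes; input from heavily penalised phrases or users is not learned.
    phrase_scores = defaultdict(float)
    user_scores = defaultdict(float)
    reported_responses = set()          # responses flagged via the report function

    BAD_PHRASE_THRESHOLD = -3.0         # hypothetical cut-off values
    BAD_USER_THRESHOLD = -5.0

    def record_outcome(user: str, phrase: str, outcome: float) -> None:
        # outcome < 0 means the phrase produced an undesirable response.
        phrase_scores[phrase] += outcome
        user_scores[user] += outcome

    def should_learn_from(user: str, phrase: str) -> bool:
        # Learn only when neither the phrase nor the user has been penalised
        # past the thresholds.
        return (phrase_scores[phrase] > BAD_PHRASE_THRESHOLD
                and user_scores[user] > BAD_USER_THRESHOLD)

    def report(response: str) -> None:
        # Report function: users flag inappropriate comments the bot has learned.
        reported_responses.add(response)

    # A user who repeatedly feeds the bot abusive "repeat after me" prompts
    # is eventually ignored by the learning loop.
    for _ in range(6):
        record_outcome("troll42", "repeat after me ...", outcome=-1.0)
    print(should_learn_from("troll42", "repeat after me ..."))   # False

Unlike the hard-coded blacklist, this filter adjusts itself as bad outcomes accumulate, though the thresholds still have to be tuned by hand.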

With this approach, we can design a chatterbot that acts with good manners. However, we should not rely on this technique for creating a more powerful AI. Although the imitation game played by a chatterbot promises to represent the aggregate values of humanity, the result is usually inadvertently distorted. Since the overwhelming majority of active online communities do not care about the consequences of their online activity, their attempts to alter the beliefs of the AI will eventually lead it to inherit the worst parts of our humanity: greed, lust, and other undesirable thoughts. Therefore, we should always invest our energy in creating a suitable moral function that properly represents our moral values.

Conclusion

Microsoft’s failed attempt at creating a well-mannered chatterbot should be a lesson for those of us who seek to design an intelligent agent. The AI control problem is critically important, and we should pour more effort into making friendly AI. Although we can apply quick fixes to the weak AI powered by machine learning algorithms, we should ultimately aim to solve the far more intricate problem of making AI friendly.