SEOUL, SOUTH KOREA — Korean Go grandmaster Lee Sedol has won his first game in this week’s challenge match with AlphaGo, an artificially intelligent computing system developed by researchers at Google. With AlphaGo winning the match’s first three games earlier in the week, the machine had already claimed victory in this historic test of artificial intelligence. But on Saturday evening inside Seoul’s Four Seasons hotel, with his win in Game Four, Lee Sedol clawed back a degree of pride for himself and, indeed, the millions of people who watched the match online.
AlphaGo’s dominance in the first three games was notable because no machine had previously beaten a top human player at Go—and because some of the technologies at the heart of the system are already used inside Google and many other big-name Internet companies. AlphaGo highlights the enormous power of these technologies, and it points the way forward for the machine learning techniques that drove its success. These techniques are poised to reinvent everything from scientific research to robotics. And yet, as Lee Sedol showed today, machines are by no means infallible.
Because AlphaGo is driven by machine learning technologies—technologies that allow machines to learn tasks largely on their own—Google could, over the next weeks and months, retrain AlphaGo to an even higher level of performance. But Lee Sedol’s win in Game Four is a reminder that even the most proficient AI still has a long way to go before it can truly duplicate human thought. Yes, a machine can beat a top human at Go. But it can’t pass an eighth grade science test—much less converse like a human or, well, exhibit good old common sense.
Where is the Weakness?
Though the match had been decided the day before—with AlphaGo taking a three-games-to-none lead in the best-of-five contest—Game Four began with its own drama. As match commentator Chris Garlock said just before the game began, one big question remained: Does AlphaGo have a weakness? It was a question that first arose during the press conference in the wake of Game Three.
The press conference was a rather solemn affair, with Lee Sedol apologizing to the Korean public and the larger Go community. “I don’t know what to say today, but I think I will have to express my apologies first,” he told the press, through an interpreter. “I should have shown a better result, a better outcome, a better contest in terms of the games played. I do apologize for not being able to satisfy a lot of people’s expectations.” The Korean admitted to buckling under the immense public pressure—the match was literally front-page news in Korea, where an estimated 8 million people play Go and Lee Sedol is a national figure even among those who don’t follow the game—and now that much of the pressure was off, he vowed to continue looking for that weakness.
“Although AlphaGo is a strong program, I would not say that it is a perfect program,” he said. “Yes, compared to human beings, its moves are different and at times superior. But I do think there are weaknesses for AlphaGo, and I felt that during the first game and the second game as well.”
He was particularly upset with his play during the second game, when he felt he made crucial mistakes and failed to capitalize on mistakes by AlphaGo. “There were a number of opportunities that I admittedly missed,” he said.
Game Two All Over Again
Game Four began a lot like Game Two, as if Lee Sedol was trying to make amends for past mistakes. As in Game Two, he played white, which meant that AlphaGo made the first move, and he responded in much the same way he did three days earlier. “It’s just about the same game,” commentator Michael Redmond said six moves into the match. But a replay of Game Two wouldn’t be easy. Playing white—and moving second—is a significant disadvantage.
As the game progressed, the other English commentator, Chris Garlock, asked whether Lee Sedol, in an effort to find a weakness, might resort to playing moves that were as unusual as possible. But as Redmond pointed out, that didn’t really work for the Korean in Game One. Judging from what we know about the way AlphaGo operates, it’s unlikely that unusual or even blatantly weird moves would be particularly effective against the machine.
Using what are called deep neural networks—networks of hardware and software that mimic the web of neurons in the human brain—AlphaGo first learned the game of Go by analyzing thousands upon thousands of moves made by real live human players. But then, thanks to another technology called reinforcement learning, it climbed to an entirely different and higher level by playing game after game after game against itself. In essence, these games generated all sorts of new moves that the machine could use to retrain itself. By definition, these are inhuman moves.
This system does not operate by playing in familiar ways. It thrives by playing in a way no human ever would.
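The self-play idea can be sketched in miniature. The toy below is not Go and not AlphaGo’s architecture—the game (players alternately remove one to three stones; whoever takes the last stone wins) and the simple tabular update are stand-ins for the deep neural networks described above—but it shows the same loop: the policy plays against itself, and moves made by the eventual winner are reinforced.

```python
import random

# Illustrative sketch only: self-play reinforcement learning on a toy game.
# The game and update rule here are hypothetical stand-ins, not AlphaGo's.

def train(n_stones=10, episodes=5000, seed=0):
    rng = random.Random(seed)
    # value[(stones, move)] estimates the chance that `move` at `stones` wins
    value = {}
    for _ in range(episodes):
        stones, history, player = n_stones, [], 0
        while stones > 0:
            moves = [m for m in (1, 2, 3) if m <= stones]
            if rng.random() < 0.1:                    # explore occasionally
                move = rng.choice(moves)
            else:                                     # otherwise exploit
                move = max(moves, key=lambda m: value.get((stones, m), 0.5))
            history.append((player, stones, move))
            stones -= move
            player ^= 1                               # other player's turn
        winner = history[-1][0]                       # last mover took the last stone
        for p, s, m in history:                       # credit the winner's moves
            target = 1.0 if p == winner else 0.0
            old = value.get((s, m), 0.5)
            value[(s, m)] = old + 0.1 * (target - old)
    return value

value = train()
```

After enough self-play games, the table learns, for instance, that taking all three remaining stones is a winning move—knowledge no one programmed in directly, which is the essence of the approach.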
‘Off The Map’
As the game progressed, Lee Sedol was taking far more time with each move than his inanimate opponent. This was also the case in Games Two and Three, when, after his play clock ran down, the Korean was forced to play at a rapid-fire pace. Over the course of these games, AlphaGo seemed to manage its time quite well, and this was no accident. Before the match, Demis Hassabis, who oversees the team that built AlphaGo, told us that the team recently added another neural network that helps the system manage time.
Generally speaking, the game continued to resemble Game Two. Lee Sedol seemed to command a large amount of territory on the board, while AlphaGo seemed to command very little. This was hardly a sign that Lee Sedol was ahead in the game, but it did indicate that he was using much the same strategy he used in the second game. “That’s more or less what he is doing,” Redmond said. To this point, the Korean grandmaster had yet to find any real weaknesses. But the commentators had some suggestions. “I’d just pull the plug,” Redmond said, in his typically dry way. “It’s dependent on its Internet connection, isn’t it? All we need is someone with scissors.”
Indeed, AlphaGo does depend on an Internet connection, which ties into a vast network of machines inside Google data centers across the globe. But a pair of scissors wouldn’t really cut it. Prior to the match, Google actually ran its own fiber optic cables into the Four Seasons Hotel, so that it could be sure its Internet connection wouldn’t go down. Jeff Dean, one of the company’s most tenured and most important engineers, was on hand earlier in the week to help with the technical setup.
When they first built AlphaGo, Hassabis and team trained and ran the system on a single machine. But in October, just before the system’s closed-door match with three-time European champion Fan Hui, the team upgraded to a much higher level of processing power. Deep neural networks typically run on large networks of machines equipped with graphics processing units, or GPUs, chips that were originally designed to render images for games and other highly graphical software but are also well suited to this breed of machine learning. In October, Hassabis said that AlphaGo ran on a network that spanned 170 GPU cards and 1,200 standard processors, or CPUs.
‘A Very Dangerous Fight’
As the game approached the two hour mark, Redmond called it “Lee Sedol’s type of game.” In other words, it was developing “into a very dangerous fight.” Lee Sedol likes to play on a knife edge. And he’s very good at it. But as Redmond pointed out, so is AlphaGo.
Lee Sedol seemed to be in a better place than he was in Game Three, and he seemed calmer as well. But after another twenty minutes of play, Redmond, himself a very successful Go player, felt that AlphaGo had the edge. “It feels pretty good for black,” he said. And Lee Sedol had only about 25 minutes left on his play clock, nearly an hour less than AlphaGo—though only about 70 moves had been played. Once a play clock runs out, a player must make each move in less than 60 seconds.
At this point, AlphaGo started to play what Redmond and Garlock considered unimpressive or “slack” moves. But the irony is that this may indicate the machine is confident of a win. AlphaGo makes moves that maximize its probability of winning, not its margin of victory. “This was AlphaGo saying: ‘I think I’m ahead. I’m going to wrap this stuff up,’” Garlock said. “And Lee Sedol needs to do something special, even if it doesn’t work. Otherwise, it’s just not going to be enough.” Indeed, Lee Sedol, leaning forward with his face in his hands, seemed to be particularly deep in concentration, and he took several minutes to make his next move.
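The distinction is easy to see with numbers. The candidate moves and estimates below are purely hypothetical—illustrative only—but they show why a probability-maximizer plays the quiet, “slack”-looking move where a margin-maximizer would gamble on the big one.

```python
# Hypothetical candidate moves with made-up estimates, to illustrate the
# selection criterion described in the commentary: highest win probability,
# not largest expected margin of victory.

candidates = [
    {"move": "quiet, slack-looking move", "win_prob": 0.90, "margin": 1.5},
    {"move": "aggressive invasion",       "win_prob": 0.70, "margin": 20.0},
]

by_probability = max(candidates, key=lambda c: c["win_prob"])
by_margin = max(candidates, key=lambda c: c["margin"])

print(by_probability["move"])  # the move AlphaGo-style selection would prefer
print(by_margin["move"])       # the move a margin-maximizer would prefer
```

The probability-maximizer happily trades a 20-point win for a 1.5-point one if the smaller win is more certain—which is exactly why its endgame can look unimpressive to human commentators.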
Then, near the three hour mark, his play clock ran out. Meanwhile, AlphaGo was playing in its typically “confusing” way, as Redmond put it. “I get the impression that AlphaGo has gone off on a tangent,” he said. Again, this hardly meant AlphaGo was in trouble, but it prompted Garlock to ask if the machine would ever resign.
It would, and its approach to resignation is surprisingly, well, human. According to David Silver, another researcher on the team that built AlphaGo, the machine will resign not when it has zero chance of winning, but when its chance of winning dips below 20 percent. “We feel that this is more respectful to the way humans play the game,” Silver told me earlier in the week. “It would be disrespectful to continue playing in a position which is clearly so close to loss that it’s almost over.”
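The rule Silver describes reduces to a single threshold check. In the sketch below, the win-probability estimate is just a placeholder number; in the real system it comes from AlphaGo’s own evaluation of the board, which this sketch does not attempt to model.

```python
# Hedged sketch of the resignation rule as Silver describes it: resign once
# the machine's estimate of its winning chances falls below 20 percent.

RESIGN_THRESHOLD = 0.20

def should_resign(estimated_win_probability: float) -> bool:
    # "Dips below" 20 percent, so the comparison is strictly less-than.
    return estimated_win_probability < RESIGN_THRESHOLD

print(should_resign(0.35))  # losing but not hopeless: keep playing
print(should_resign(0.12))  # clearly lost: resign
```

Note that the threshold is about respect for the human opponent, not about the machine’s own interests—a zero-percent rule would have it grind out positions any strong player would concede.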
The machine did not resign. But it did start playing moves that the commentators described as absolutely terrible. “This move just doesn’t make sense,” Redmond said at one point. “I get the feeling AlphaGo is running out of winning moves.” Redmond acknowledged that, in the past, AlphaGo often made poor moves when it felt it was ahead. But “not that bad,” he said.
‘He Has a Chance This Time’
Suddenly, the commentators felt that the match was looking quite good for Lee Sedol. But the Korean continued to struggle with time. He had twice failed to make a move within the allotted 60 seconds. And then he made a move just milliseconds before the clock ran out again. If he hadn’t beaten the clock, his allotted time per move would have dropped to 30 seconds. “That was close,” Redmond said. And then he did it again.
About three hours and forty minutes into the match, Lee Sedol stood up from the table and left the room, taking an (allowed) break from play. Under the rules, his play clock did not resume until he had returned. The play had reached a level of excitement we hadn’t seen since Game One. As Redmond said: “Lee Sedol has a chance this time.”
The commentators continued to ask about the machine resigning, and finally, Google’s David Silver walked into the commentary room and passed them word that the machine would resign when its chances dipped below that 20 percent threshold. They then asked Silver to join them on stage and explain AlphaGo’s string of what they considered inexplicably bad moves. But he declined.
In any event, Redmond said that he wouldn’t be surprised if the game went all the way to its conclusion—without a resignation—something that hadn’t happened in any of the previous games. And then he made an unqualified prediction. “I think Lee Sedol is going to win here,” he said.
The End Game
But the Korean still faced clock trouble. And it was telling that AlphaGo had not resigned. The machine still had 15 minutes left on its original play clock as the end game arrived, with the two opponents rapidly trying to rack up the points they had angled for over the last four and a half hours.
Then AlphaGo played what Redmond called “another nonsense move.” He and Garlock were unimpressed with the machine’s end game. And as the commentators continued to question its moves, the machine did indeed resign.
There was an enormous cheer from the Korean commentary room and its throng of Korean reporters and photographers. And then came the applause in the English room. A day before, the atmosphere had been palpably solemn. But Lee Sedol did find a weakness in AlphaGo. And the mood changed.