TechieTricks.com
What AlphaGo Can Teach Us About How People Learn


We are, of course, looking at ways to apply MuZero to real world problems, and there are some encouraging initial results. To give a concrete example, traffic on the internet is dominated by video, and a big open problem is how to compress those videos as efficiently as possible. You can think of this as a reinforcement learning problem because there are these very complicated programs that compress the video, but what you see next is unknown. But when you plug something like MuZero into it, our initial results look very promising in terms of saving significant amounts of data, maybe something like 5 percent of the bits that are used in compressing a video.

Longer term, where do you think reinforcement learning will have the biggest impact?

I think of a system that can help you as a user achieve your goals as effectively as possible. A really powerful system that sees all the things that you see, that has all the same senses that you have, which is able to help you achieve your goals in your life. I think that is a really important one. Another transformative one, looking long term, is something which could provide a personalized health care solution. There are privacy and ethical issues that have to be addressed, but it will have huge transformative value; it will change the face of medicine and people’s quality of life.

Is there anything you think machines will learn to do within your lifetime?

I don’t want to put a timescale on it, but I would say that everything that a human can achieve, I ultimately think that a machine can. The brain is a computational process; I don’t think there’s any magic going on there.

Can we reach the point where we can understand and implement algorithms as effective and powerful as the human brain? Well, I don’t know what the timescale is. But I think that the journey is exciting. And we should be aiming to achieve that. The first step in taking that journey is to try to understand what it even means to achieve intelligence. What problem are we trying to solve in solving intelligence?

Beyond practical uses, are you confident that you can go from mastering games like chess and Atari to real intelligence? What makes you think that reinforcement learning will lead to machines with common sense understanding?

There’s a hypothesis, we call it the reward-is-enough hypothesis, which says that the essential process of intelligence could be as simple as a system seeking to maximize its reward, and that process of trying to achieve a goal and trying to maximize reward is enough to give rise to all the attributes of intelligence that we see in natural intelligence. It’s a hypothesis, we don’t know whether it is true, but it kind of gives a direction to research.

If we take common sense specifically, the reward-is-enough hypothesis says well, if common sense is useful to a system, that means it should actually help it to better achieve its goals.
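
To make "a system seeking to maximize its reward" concrete, here is a minimal sketch of a reward-maximizing agent on a toy problem. It is an illustration only: it is not MuZero and not anything described in the interview. It assumes a three-armed bandit with made-up payout means, and an epsilon-greedy rule that mostly picks the action with the best estimated reward and occasionally explores. The function name `run_bandit` and all parameters are hypothetical.

```python
import random


def run_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a toy multi-armed bandit (illustrative only)."""
    rng = random.Random(seed)          # seeded so runs are repeatable
    n_arms = len(true_means)
    counts = [0] * n_arms              # how often each arm was pulled
    estimates = [0.0] * n_arms         # running average reward per arm
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random arm.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest estimated reward.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Noisy reward drawn around the arm's (hidden) true mean.
        reward = rng.gauss(true_means[arm], 1.0)

        # Incremental update of the running average for this arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return estimates, counts, total_reward


# Assumed payout means; the agent never sees these directly.
estimates, counts, total_reward = run_bandit([0.2, 0.5, 0.9])
most_pulled = max(range(len(counts)), key=lambda a: counts[a])
print("pull counts:", counts)
print("most-pulled arm:", most_pulled)
```

The agent is told nothing about which arm is best; any preference it develops comes only from chasing reward. That is the hypothesis in miniature, although whether the same principle scales up to abilities like common sense is exactly the open question.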

It sounds like you think that your area of expertise—reinforcement learning—is in some sense fundamental to understanding, or “solving,” intelligence. Is that right?

I really see it as very essential. I think the big question is, is it true? Because it certainly flies in the face of how a lot of people view AI, which is that there’s this incredibly complex collection of mechanisms involved in intelligence, and each one of them has its own kind of problem that it’s solving or its own special way of working, or maybe there’s not even any clear problem definition at all for something like common sense. This theory says, no, actually there may be this one very clear and simple way to think about all of intelligence, which is that it’s a goal-optimizing system, and that if we find the way to optimize goals really, really well, then all of these other things will emerge from that process.


