Episode 4: Examining the gap between AI and the human brain – the transcript
Ellie Burns and Mike Fingeroff explore the gap between what AI can do today and what the human brain can do, and relate that gap to the challenges hardware designers face in their design flow. They also touch on the clashing requirements of building a generic AI application that can perform many tasks versus applications that perform one task really well. To listen to the podcast, click here. For those who prefer reading to listening, here is the transcript of this episode:
Mike Fingeroff: The gap between what the best AI applications can do today and what the human brain can do is vast. Today we will discuss that gap and the challenges that hardware designers face in their design flow. We will also discuss the clashing requirements of coming up with a generic AI application that can perform many tasks versus applications that perform one task really well.
Mike Fingeroff: It sounds like the complexity of all these new and next-generation AI algorithms are really pushing the limits of today’s compute platforms. So, what are the challenges that algorithm and hardware design teams are going to face going forward? And what are some of the gaps that need to be addressed for these challenges in modern design flows?
Ellie Burns: Frankly, the gap between the efficiency of AI learning and the efficiency of the human brain is huge. While we hear about all these great things with robotics and autonomous vehicles, our brain is an amazing, amazing machine. Let me give you an example of the learning capability of a child. Somewhere between 18 months and two years old, children start doing something that is absolutely remarkable. You show them how to do something once or twice, and then they start practicing the task themselves. Imagine sitting down with your two-year-old and showing her how to put a block inside a hole. She then practices that on her own and quickly learns the task. This is called ‘one-shot learning’. The biggest, most sophisticated computers have to be trained on hundreds of thousands of examples before they learn anything, let alone how to put that block in the square hole. This one-shot learning is the Holy Grail in AI, and we are really far from it right now.
Ellie Burns: Two-year-olds can formulate plans and they can understand the world with abstract concepts. Intelligence is the combination of one-shot learning and these abstract concepts. This makes their learning seamless: they can go from being shown something once to asking, “Okay, how does this work in the real world?” So, it’s this combination that AI scientists are still hoping to crack. We’re still pretty far from it, but making progress one little bit at a time.
Ellie Burns: The second thing is that the human brain is very power efficient. To put it into perspective, the human brain weighs about three pounds and, to do its amazing job, it uses a little over 10 watts of electricity. Deep neural networks like AlexNet and ResNet, often performing just one particular task, can consume megawatts of power. That’s 10 watts compared to thousands and thousands of watts. We still have a pretty long way to go in AI. However, we’re starting to see a lot of value in markets that are inferencing at the edge, where lower power and energy consumption are a must. But if we keep on the track that we’re on (according to reports I have read), AI will use up a tenth of the world’s electricity by 2025. We need to come up with ways to significantly reduce this.
Mike Fingeroff: That’s incredible! It would also seem, with all the different applications that we’re seeing for AI, that it’s going to be difficult to build a generic solution that’s going to address each one of these individual end applications.
Ellie Burns: Yes, that’s a really good point, because the network it’s going to take to do autonomous driving and ADAS is not the same network that’s needed for natural language. The whole industry has this kind of struggle, because it takes a lot to build an ASIC, a new processor, or new hardware. Everybody wants a solution that is generic enough to work with all different kinds of networks, so that they can recoup their costs and sell it to everyone. But if I’m the end-application person, don’t I want the very best platform to run ADAS? And I don’t want to buy a high-end NVIDIA GPU if what I really want is a little tiny device on my watch that helps me do foreign language translation. I can hardly wait to go to France and be able to just push a button on my watch, speak into it, and have it speak out! That’s an application where I want a very specific thing, and a generic processor is not always the best. I want something that’s very specific for the task at hand. But this is a struggle, because hardware development takes a long time. We need to be able to come up with these new architectures and new platforms. I really am seeing a tendency for companies to want a very specific piece of hardware for a very specific job. But it’s expensive and it’s difficult to do.
Ellie Burns: We’re also seeing new architectures, and the field is growing so fast that the hardware development process, which is relatively slow, isn’t keeping up. For example, AlexNet is eight layers and 1.4 gigaflops; ResNet is 152 layers; Baidu’s network takes 80 gigaflops and about 7,000 hours. You’ve probably read about AlphaGo? AlphaGo, one of the most complex neural networks, took almost 2,000 CPUs and 280 GPUs to train, and it cost about $3,000 per game in electricity alone. Somehow we have to figure out a way to develop these things cost-effectively. How can we get the kind of performance we need to do all these exciting things, but at a practical level of energy and efficiency, so that we don’t use every drop of energy on this planet just on AI applications?
Mike Fingeroff: This seems like an incredible challenge to today’s RTL design teams.
Ellie Burns: Yes, it definitely is! So to close up this section: the core algorithms and all these different architectures are changing so rapidly. The last article I was reading stated that the compute requirements for AI and ML right now are doubling every 3.4 months. Just imagine that! Every 3.4 months, I need double the compute resources. This isn’t sustainable at this point. And the RTL design methodology is not able to keep up. We can’t generate new hardware in 3.4 months; by the time you generate new hardware for a new task, it’s obsolete. There’s not enough time to do an ASIC. We need to figure out a way for hardware design teams to keep up with this pace. How can they have enough time to figure out the best architecture and the best algorithm for a particular application? Because, in the end, it’s that application that needs to be able to go to market with competitive differentiation.
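As a rough back-of-the-envelope sketch of what a 3.4-month doubling period implies for hardware schedules, here is a small Python calculation; the 24-month ASIC development cycle used below is an illustrative assumption, not a figure from the podcast.

# Sketch: growth implied by "compute demand doubles every 3.4 months".
# The 24-month ASIC cycle is an assumed, illustrative figure.
DOUBLING_PERIOD_MONTHS = 3.4

def compute_growth(months):
    # Factor by which compute demand grows over 'months', given the doubling period.
    return 2 ** (months / DOUBLING_PERIOD_MONTHS)

print(f"Growth over 12 months: {compute_growth(12):.1f}x")              # ~11.6x
print(f"Growth over a 24-month ASIC cycle: {compute_growth(24):.0f}x")  # ~133x

In other words, under that trend, by the time a chip designed for today’s workload reaches production it would face compute demands two orders of magnitude larger, which is the mismatch Ellie describes.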
Ellie Burns: In the next set of podcasts, we want to take a deeper dive into the challenges that AI and ML chip developers are facing, and look at some of the potential solutions and methodologies that can help make a difference and close this gap.
Mike Fingeroff: Alright, great! Well, thanks, Ellie! It’s been a pleasure talking to you!
Ellie Burns: Okay. Well, I will talk to you soon!