Episode 2: Understanding training vs inferencing and AI in industry – the transcript
Previously, I summarized Episode 2: Understanding training vs inferencing and AI in industry, where our experts Ellie Burns and Mike Fingeroff got together to discuss the training and inferencing process for neural networks, how AI systems are deployed, and some of the industries that are benefiting from AI. For those who prefer reading to listening, here is the transcript of this episode:
Mike Fingeroff: Hello, everyone! I am Mike Fingeroff, a technologist with the Catapult High-Level Synthesis group. With me is Ellie Burns, director of marketing for the Catapult product line.
Mike Fingeroff: In the world of AI, a key concept is how to train a neural network to perform a particular task efficiently and accurately. A hardware solution is then created that uses the results of that training, a process called inferencing. People are often confused about these two concepts, so we will be discussing them today. We will also be discussing the role of AI within the industry.
Mike Fingeroff: One of the things that people often differentiate between, when talking about neural networks, is training versus inferencing. Can you expand on that a little bit?
Ellie Burns: Yes, it’s really important to understand the difference between a neural network and a normal algorithm. In a normal algorithm, you tell it, “This is how you process, this is exactly what you do. You go from step A to step B to step C.” In a neural network, you first have to train the network and then you can use it. There’s the training part – which is basically training or learning – and the inferencing part, which is using the network after it’s been trained. So, before a deep neural network can be deployed, the algorithm is first defined by the algorithm engineer, then it’s trained – and you typically use some sort of ML framework. Examples are TensorFlow from Google, Caffe, and PyTorch. There are several different frameworks, generally based on Python, but there are other infrastructures that can help with the algorithm and training process.
Ellie Burns: That training process consists of feeding in really large data sets, such as the ImageNet data set, which is now up over several million images. But they need to be labeled – and that’s the important part. You feed that data into the network, and then you calculate an error based on the network output and how it compared to the expected result. You go through that data and ask, “Is this a cat?” The image is labeled as a cat, but the network goes through and says, “Oh, I think that this is 20% a cat, or only 5% a cat.” Then you use that error to tune the weights of the network, and the next time an image goes through, it uses those updated weights. It is a very iterative process. What you’re trying to do is continue to feed different images into the network and tune the weights so that eventually it converges on the correct answer. When you hear the term error rate, that’s really what it refers to – the network keeps saying, “I’m going to feed different paths through this network until it gets the correct answer.”
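(Editor’s note: as a rough illustration of the feed-data / compute-error / tune-weights loop Ellie describes, here is a minimal PyTorch-style sketch. The tiny model and the random “labeled images” are stand-ins for a real network and a real labeled data set such as ImageNet, not anything used on the podcast.)

```python
# Minimal sketch of the iterative training loop: feed labeled data,
# compute the error (loss), and use it to tune the network's weights.
import torch
import torch.nn as nn

# A deliberately tiny CNN classifier (hypothetical architecture).
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 2),                          # two classes, e.g. "cat" vs "not cat"
)
loss_fn = nn.CrossEntropyLoss()               # measures error against the labels
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for step in range(100):                       # iterative: many passes over data
    images = torch.randn(16, 3, 32, 32)       # batch of (stand-in) images
    labels = torch.randint(0, 2, (16,))       # their human-provided labels
    outputs = model(images)                   # e.g. "I think this is 20% a cat"
    loss = loss_fn(outputs, labels)           # error vs. the expected result
    optimizer.zero_grad()
    loss.backward()                           # use the error...
    optimizer.step()                          # ...to tune the weights
```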
Ellie Burns: This training process is extremely computationally complex and requires millions and millions of computations. It can take weeks of training time on high-performance GPU farms. The training process typically uses floating-point data types. Floating point is very accurate, and training needs that accuracy to feed back the error and get it mathematically correct. Once you have those weights, you can take them, feed in new data, and use that in deployment for inferencing. I can train the network on a big GPU farm and then, what I really want to do is deploy it to my small little device or watch or glasses. At that point, I just need to take those weights and put them onto dedicated, focused hardware. I can turn the arithmetic into what we call fixed point, which doesn’t need to be floating point, and I can use smaller bit widths. Essentially, in hardware terms, I get to keep making that network a little bit smaller without affecting the data rate. Smaller equates to lower power. Inferencing and training have different hardware requirements.
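(Editor’s note: here is a toy sketch of the idea of taking trained floating-point weights and mapping them to smaller fixed-point values for an inference engine. The simple symmetric 8-bit scaling scheme below is an illustrative assumption, not the method of any particular tool discussed on the podcast.)

```python
# Toy example: quantize trained float32 weights to 8-bit fixed point.
import torch

weights_fp32 = torch.randn(256)               # trained floating-point weights

bits = 8                                      # smaller bit width for hardware
qmax = 2 ** (bits - 1) - 1                    # 127 for signed 8-bit values
scale = weights_fp32.abs().max() / qmax       # map the float range onto int8

weights_int8 = torch.clamp(
    torch.round(weights_fp32 / scale), -qmax - 1, qmax
).to(torch.int8)                              # what gets stored on the device

# At inference time the hardware works on the int8 values and only needs the
# single scale factor to interpret them, which saves area and power.
weights_dequant = weights_int8.float() * scale
print(torch.max(torch.abs(weights_fp32 - weights_dequant)))  # quantization error
```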
Mike Fingeroff: We’re seeing these inference engines being deployed across a wide range of applications, anywhere from handheld applications, where your cell phone is able to process images directly from the camera, to more safety-critical applications, like autonomous driving. How well are these networks really performing? How accurate are they when they’re being used in some of these more safety-critical applications?
Ellie Burns: That’s a really good question, because what we’re seeing is that AI is still really young and emerging. These algorithms are changing all of the time. While they’re very advanced compared to a decade ago, I’d say they’re pretty simplistic compared to the human brain. We’re still nowhere near the point of what a human brain can do. Today’s state of the art is the convolutional neural network (CNN), which focuses on a static image. It processes one image at a time. A CNN can recognize, for example, “I’ve got a pedestrian standing on the sidewalk,” but the CNN doesn’t process things over time. It’s not predictive. It can’t predict whether that pedestrian is about to step into the street. Also, modern CNNs don’t really consider all of the data in the frame. A CNN only needs to think about what’s in front of it, what’s on the road, what’s on the side of the road. It doesn’t need to see what’s in the sky, or process a lot of the other things in the frame. Think about all the things that a human can do when driving. Think about all the decisions the driver is making! When you see something way down the road, you can make a prediction as to what it might do based on your past experience. Right now, AI and ML don’t have that. They’re really just processing images and trying to identify what’s in each image, but the human brain gets to take that experience and make a decision. So again, we have a long way to go before we get to the point of putting all of those things together.
Mike Fingeroff: What industries do you see that are either interested in or currently using artificial intelligence and machine learning, and what types of applications are they currently focusing on?
Ellie Burns: I’m starting to see it in so many places that I didn’t really expect. We’ve heard a lot about autonomous drive and all of the vision applications, but I want to emphasize that a lot of these applications have not been replacing workers but are targeting productivity. What computers are really, really good at is taking in huge amounts of data, processing that data, and finding patterns. For example, we’ve been working with Facebook, and Facebook is doing the next generation of augmented and virtual reality – bringing virtual reality applications together with glasses that you can wear while doing everyday activities. It’s really fascinating stuff!
Ellie Burns: Also, there is a huge boom in healthcare and medicine. In healthcare and medicine, a lot of images are used. You could have thousands and thousands of images and use AI to more quickly and readily discover, let’s say, cancer and different treatments. We are seeing the security and facial recognition markets grow as well.
Ellie Burns: The other thing that I’m finding interesting is natural language processing for voice assistants and translation. AI is very, very good at recognizing patterns in language. 5G, AI, and ML are also coming together to serve new markets. Right now, a typical IoT device is a Fitbit. It tracks what you’re doing, but right now it doesn’t have a ton of high-end smarts in it. As 5G and ML come together, you can have all these devices connected. But the last thing that you want to do is send all of the data up into the cloud. What you want to be able to do is process that data directly. You want all the processing of the device data to happen immediately, at very low power, at the edge where the data is coming in. One of the biggest things that we’re starting to see right now is, rather than having AI and ML and deep neural networks processed up in the cloud, a big move where companies want applications that put all the smarts not only in your cell phone, but in your watch, in your glasses, and in the little devices in your home, at the edge.
Mike Fingeroff: Thanks Ellie for clearing up the differences between training and inferencing! It is amazing to see the wide range of AI applications in the industry. I am sure we will be seeing even more impressive products in the near future. Tune in to our next podcast where Ellie and I will discuss the challenges that AI hardware teams face. We will also discuss how AI is moving to edge IoT devices.