Deep Reinforcement Learning: Cog in the AI Wheel


Deep reinforcement learning, abbreviated DRL, has arrived on the AI scene as an exciting and engaging concept. It has huge potential for deployment in a host of areas, such as natural language processing, speech recognition and pattern recognition, to name a few. Not long ago, at the Google DeepMind Challenge Match in 2016, AlphaGo, a computer Go program developed by Google DeepMind, beat 18-time world champion Lee Sedol, a result that was hailed as historic! This, coupled with recent developments, has brought DRL into the spotlight.

To appreciate DRL as a concept, one needs to grasp the subtle distinction between conventional (supervised) machine learning and reinforcement learning. In supervised machine learning, models are trained on an already known set of correct responses. In reinforcement learning, the model, called an agent, improves by interacting with its immediate environment. It works as a system of feedback: when the agent’s action produces the desired consequence, the agent is rewarded, like on scoring a point in a game. This reinforces the agent’s good behavior with identifiable benefits. It is here that RL moves toward the capabilities of humans, who operate effortlessly across a wide variety of tasks, ranging from routine ones to complex cognitive ones. This whole concept of having agents learn through trial and error, where actions are linked to rewards or punishments, is termed reinforcement learning, just as a child learns when its actions are linked to rewards or punishments. Combine this with the agent’s ability to build its own knowledge directly from raw inputs, such as vision, without hand-crafted domain heuristics, just as a human would, and you have DRL. Here “deep” signifies the deep learning of neural networks. Such agents are capable of taking on a multitude of challenging tasks.
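The trial-and-error loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses tabular Q-learning, one common RL algorithm picked here for simplicity (the article does not name a specific method), and a made-up "corridor" game in which the agent starts in the leftmost cell and scores a point on reaching the rightmost one. The function names and parameter values are hypothetical. A DRL system would, roughly speaking, replace the small table of values with a deep neural network.

```python
import random

def train_corridor_agent(n_states=5, episodes=500, alpha=0.5,
                         gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical corridor game (illustrative only)."""
    rng = random.Random(seed)
    # q[state][action]: action 0 = step left, action 1 = step right
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        while state != goal:
            # Trial and error: explore at random sometimes (or when the
            # two estimates are tied), otherwise pick the better-looking action.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state = max(0, state - 1) if action == 0 else state + 1
            # Reward: one point for reaching the goal, nothing otherwise.
            reward = 1.0 if next_state == goal else 0.0
            # Q-learning update: nudge the estimate toward the reward
            # plus the discounted value of the best next action.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

def greedy_policy(q):
    """Best-looking action in every non-goal cell."""
    return ["right" if right > left else "left" for left, right in q[:-1]]

q_table = train_corridor_agent()
print(greedy_policy(q_table))
```

Nobody tells the agent that "right" is the correct answer. It only ever sees the point awarded at the end of the corridor, and the preference for moving right emerges from repeated attempts.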

The reinforcement learning edge

So, what exactly makes reinforcement learning stand out? It offers an answer to the problem of delayed returns, the lag between an immediate action and the results it eventually produces. RL algorithms are designed to work out the correlation between an action and the delayed returns it leads to. Akin to humans, they have to factor in that delay to judge what a certain action has resulted in. Such an agent, which closely mimics the way we act in the real world, is capable of performing well in complicated, ambiguous real-life settings where it has to pick from a set of possible choices. This is accomplished by the underlying “deep” neural network. DRL algorithms, when applied to robotics, enable control policies for robots to be learned directly in the real world simply from camera inputs. Huge advancements in architecture and hardware have opened up the field like never before. The promise of DRL technology propelled Google to acquire DeepMind for a reported $500 million in 2014, leading to a flurry of startups.
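One standard way RL methods account for delayed returns is the discounted return: each step is credited with its own reward plus a shrinking share of every reward that follows. The sketch below is a minimal illustration, not taken from any particular system; the function name and the discount factor `gamma` are assumptions made for the example.

```python
def discounted_returns(rewards, gamma=0.9):
    """Credit each time step with its reward plus discounted future rewards."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step can reuse the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

# A point is scored only on the last of three moves, yet the two
# earlier moves still receive part of the credit.
print(discounted_returns([0.0, 0.0, 1.0], gamma=0.5))  # [0.25, 0.5, 1.0]
```

The earlier an action is relative to the reward, the smaller its share, which gives the agent a simple way to link an action to results that only show up later.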

DRL in action

DRL is well suited to problems in dynamic environments and in environments that require adaptation through progressive learning. It can be deployed to train AI models that bring about automation and drive complex systems such as supply chain logistics, robotics, manufacturing and health care. Because getting the algorithms to converge is complex, DRL has so far been exploited most visibly in controlled settings such as video games (Atari) and others. So the next time you see a car on autopilot or a drone zooming past, you can attribute such visions to the unfolding wonders of DRL!
