Idea of Reinforcement Learning🤩
Reinforcement learning is learning what to do (how to map situations to actions) so as to maximize a numerical reward signal. The learner is not told which actions to take, but instead must discover which actions yield the most reward by trying them. In the most interesting and challenging cases, actions may affect not only the immediate reward but also the next situation and, through that, all subsequent rewards. These two characteristics, trial-and-error search and delayed reward, are the two most important distinguishing features of reinforcement learning.
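To make trial-and-error search and delayed reward concrete, here is a minimal sketch using tabular Q-learning, one common RL algorithm chosen here purely for illustration. Everything in it is an assumption made for the example: the five-cell corridor, the `step` and `train` helpers, and the values of `ALPHA` and `GAMMA`. The agent earns a reward only when it reaches the last cell, so it has to learn that earlier moves to the right pay off later.

```python
import random

N_STATES = 5               # hypothetical corridor: cells 0..4, cell 4 is the goal
ACTIONS = (-1, +1)         # index 0 = move left, index 1 = move right
ALPHA, GAMMA = 0.5, 0.9    # learning rate and discount factor (assumed values)


def step(state, action):
    """Move one cell; the only reward (1.0) arrives on reaching the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, seed=0):
    """Learn action values from random trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            a = rng.randrange(2)                 # try an action at random
            next_state, reward, done = step(state, ACTIONS[a])
            # Delayed reward: the value of the next cell flows back to this one.
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
    return q


q_table = train()
# Best-looking action in each non-goal cell after training.
greedy = ["left" if row[0] > row[1] else "right" for row in q_table[:-1]]
print(greedy)
```

Although the agent is never told that "right" is correct, the reward at the goal gradually propagates back through the table, so cells far from the goal also end up preferring the move that eventually leads to it.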
RL with Robotics🦾 🤯
https://www.youtube.com/watch?v=n2gE7n11h1Y
How RL differs from other types of Machine Learning! 🤔
- Learning from interactions: RL is concerned with learning from interactions with an environment. The RL agent takes actions in an environment, receives feedback in the form of rewards or penalties, and learns to optimize its behavior over time.
- Feedback mechanism: In RL, the feedback signal is provided in the form of rewards or punishments. The agent's goal is to maximize the cumulative reward it receives over a sequence of actions, which requires learning from delayed and cumulative consequences. In contrast, supervised learning relies on labeled examples where the model learns to map inputs to predefined outputs, while unsupervised learning aims to discover patterns or structure in unlabeled data without explicit feedback.
- Exploration and exploitation: RL involves a trade-off between exploration (trying actions whose outcomes are still uncertain) and exploitation (choosing the actions currently believed to be best). Initially, the RL agent needs to explore different actions to gather information about the environment and discover good strategies. As the agent learns, it gradually shifts towards exploiting the learned knowledge to maximize rewards.
- Sequential decision-making: RL is suited for problems that involve sequential decision-making over time. The agent's actions have consequences that influence future states and rewards, creating a sequential dependency. RL algorithms aim to find optimal policies or decision-making strategies that maximize long-term rewards.
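The exploration and exploitation trade-off from the list above can be sketched with an epsilon-greedy agent on a three-armed bandit. This is only one simple strategy among many, and the details are assumptions for illustration: the win probabilities in `TRUE_MEANS`, the `run_bandit` helper, and the decay schedule for `epsilon`. The agent only ever sees rewards, never labels saying which arm is correct.

```python
import random

TRUE_MEANS = (0.2, 0.5, 0.8)   # hypothetical win probabilities, hidden from the agent


def run_bandit(steps=5000, seed=0):
    """Epsilon-greedy agent: explores often at first, then mostly exploits."""
    rng = random.Random(seed)
    n_arms = len(TRUE_MEANS)
    estimates = [0.0] * n_arms   # running average reward per arm
    counts = [0] * n_arms        # how often each arm was pulled
    total_reward = 0.0
    for t in range(1, steps + 1):
        # Assumed schedule: exploration rate decays from 1.0 down to a floor of 0.05.
        epsilon = max(0.05, 1.0 / t ** 0.5)
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                          # explore
        else:
            arm = max(range(n_arms), key=lambda i: estimates[i])  # exploit
        reward = 1.0 if rng.random() < TRUE_MEANS[arm] else 0.0
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]  # incremental mean
        total_reward += reward
    return estimates, counts, total_reward


estimates, counts, total_reward = run_bandit()
print("pulls per arm:", counts)
print("estimated means:", [round(e, 2) for e in estimates])
```

Keeping a small floor on `epsilon` means the agent never stops exploring entirely, which guards against locking onto a poor arm after an unlucky start.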
The Complete Reinforcement Learning Dictionary: