Reinforcement learning (RL) has shown impressive results in a variety of applications. Well-known examples include game playing (including video games), robotics and, recently, “Autonomous navigation of stratospheric balloons”. Many of these successes came about by combining the expressiveness of deep learning with the power of RL.
Even on their own, though, both frameworks come with their own sets of hyperparameters that need careful tuning. Learning rates, regularization, and optimizer and architecture design choices are just a few of the hyperparameters that commonly come up in deep learning. RL adds, among many others, the need to carefully consider how to trade off exploration and exploitation, how to discount rewards, and how to handle large-batch training.