In reinforcement learning (RL), one of the major machine learning (ML) paradigms, an agent interacts with an environment. How well an RL agent solves a problem can depend strongly on choices such as the policy network architecture, the training hyperparameters, and the specific dynamics of the environment. A common strategy for dealing with this sensitivity is to first design a neural architecture carefully, based on experience and domain knowledge, and then optimize the training hyperparameters for one specific formulation of the environment. Doing all of this manually is tedious, and it is not really in the spirit of end-to-end optimization. Therefore, to tackle the RNA design problem (Figure 2), we formulated an automatic reinforcement learning (Auto-RL) approach that searches for the best RL formulation by jointly optimizing the parameters of the environment (e.g. the shape of the reward function), the neural architecture, and the training hyperparameters. The combination of agent (Meta-LEARNA) and environment that our Auto-RL approach yielded achieves new state-of-the-art results on the RNA design problem, and it reaches the previous state-of-the-art performance up to 1765 times faster (Figure 1).
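
To make the idea of a joint search concrete, here is a minimal sketch of what optimizing environment, architecture, and training parameters in one configuration space could look like. Everything in it is an illustrative assumption rather than our actual setup: the parameter names and ranges in `SEARCH_SPACE` are hypothetical, `evaluate` is a toy placeholder for training and validating an agent, and plain random search stands in for whatever optimizer is used in practice.

```python
import math
import random

# Hypothetical joint search space (names and ranges are assumptions).
# Each entry is (kind, low, high).
SEARCH_SPACE = {
    "reward_exponent": ("float", 1.0, 12.0),      # environment: reward shape
    "num_lstm_layers": ("int", 0, 2),             # architecture
    "num_fc_units": ("int", 8, 64),               # architecture
    "learning_rate": ("log_float", 1e-5, 1e-3),   # training hyperparameter
}


def sample_config(rng):
    """Draw one joint configuration covering all three groups of parameters."""
    config = {}
    for name, (kind, low, high) in SEARCH_SPACE.items():
        if kind == "int":
            config[name] = rng.randint(low, high)
        elif kind == "log_float":
            # Sample uniformly on a log scale.
            config[name] = math.exp(rng.uniform(math.log(low), math.log(high)))
        else:
            config[name] = rng.uniform(low, high)
    return config


def evaluate(config):
    """Toy placeholder objective (lower is better).

    In a real setup this would train an agent in the environment defined by
    `config` and return a validation loss.
    """
    return (
        abs(config["reward_exponent"] - 9.0)
        + abs(config["num_fc_units"] - 50) / 50
        + abs(math.log10(config["learning_rate"]) + 4)
    )


def random_search(num_trials, seed=0):
    """Return the best (config, loss) pair found among `num_trials` samples."""
    rng = random.Random(seed)
    best_config, best_loss = None, float("inf")
    for _ in range(num_trials):
        config = sample_config(rng)
        loss = evaluate(config)
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss


if __name__ == "__main__":
    config, loss = random_search(num_trials=100)
    print(config, loss)
```

The point of the sketch is that a single sampled configuration fixes the reward shape, the network, and the learning rate at the same time, so interactions between these choices are part of the search instead of being frozen by an earlier manual decision.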