A versatile DAC can handle any situation at any time. (Image credit: Ina Lindauer)
When designing algorithms we want them to be as flexible as possible such that they can solve as many problems as possible. To solve a specific family of problems well, finding well-performing hyperparameter configurations requires us to either use extensive domain knowledge or resources. The second point is especially true if we want to use algorithms that we did not author ourselves. We most likely know nothing of the internals of the algorithm, so to figure out what makes this black-box tick, far too often, researchers of all flavors fall back to “Graduate Student Descent”.
Automated hyperparameter optimization and algorithm configuration offer far better alternatives, allowing for more structured and less error-prone optimization in potentially infinitely large search spaces. Throughout the search, various useful statistics are logged which can further allow us to gain insights into the algorithm, it’s hyperparameters and tasks the algorithm has to solve. At the end of the search, they present us with a hyperparameter configuration that was found to work well, e.g quickly finds a solution or has the best solution quality.
However the resulting static configurations these approaches are able to find exhibit two shortcomings:
- The configuration will only work well when the configured algorithm will be used to solve similar tasks.
- The iterative nature of most algorithms is ignored and a configuration is assumed to work best at each iteration of the algorithm.
To address Shortcoming 1, a combination of algorithm configuration and algorithm selection can be used. First search for well-performing but potentially complementary configurations (which solve different problems best) and then learn a selection mechanism to determine with which configuration to solve a problem. However, even this more general form of optimization (called per-instance algorithm configuration) is not able to address Shortcoming 2. (more…)