In this post, we will see how to resolve the question of Ray Tune scheduler hyperparam_mutations vs. param_space
Question:
I am having a hard time understanding the need for what seem like two search space definitions in the same program flow. The tune.Tuner() object takes a param_space argument, where we can set up the hyperparameter space to search; however, it can also take a scheduler. As an example, I have a HuggingFace transformer set up with a Population Based Training scheduler, which has its own hyperparam_mutations, and that looks like another hyperparameter space to search.
- What is the interaction between these two spaces?
- If I just want to perturb learning_rate to see its effect on my accuracy, would I put this into the tuner’s param_space or into the scheduler’s hyperparam_mutations?
Best Answer:
This section in one of the PBT user guides touches on both questions. In particular, param_space is used to get the initial samples, and hyperparam_mutations specifies the resample distributions (resampling being one of the possible mutation operations) and determines which parameters actually get mutated. If a parameter is not specified in param_space, PBT samples it from hyperparam_mutations initially. If you only want learning rate to be mutated, then that's the only one that should be specified in hyperparam_mutations.
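To make this concrete, here is a minimal sketch (not the original poster's code) of how the two arguments fit together in Ray Tune 2.x. The trainable train_fn and the per_device_train_batch_size parameter are illustrative assumptions, and the exact reporting call depends on your Ray version. Fixed or initially sampled values go in param_space, while hyperparam_mutations lists only the parameters PBT should perturb, here just learning_rate:

```python
from ray import train, tune
from ray.tune.schedulers import PopulationBasedTraining

# Hypothetical trainable standing in for a HuggingFace training loop.
# It reads hyperparameters from `config` and reports "accuracy" each iteration.
# A real PBT run should also save/restore checkpoints so the exploit step can
# copy weights between trials; that is omitted here for brevity.
def train_fn(config):
    accuracy = 0.0
    for _ in range(10):
        # ... train one epoch using config["learning_rate"] and
        #     config["per_device_train_batch_size"] ...
        accuracy += 0.01  # placeholder metric for the sketch
        train.report({"accuracy": accuracy})  # reporting API assumed for Ray >= 2.7

# hyperparam_mutations: only learning_rate is perturbed/resampled by PBT.
pbt = PopulationBasedTraining(
    time_attr="training_iteration",
    perturbation_interval=2,
    hyperparam_mutations={
        "learning_rate": tune.loguniform(1e-5, 1e-3),
    },
)

# param_space: where the initial samples come from. Parameters not listed here
# (e.g. learning_rate, if omitted) are sampled from hyperparam_mutations instead.
tuner = tune.Tuner(
    train_fn,
    param_space={
        "learning_rate": tune.loguniform(1e-5, 1e-3),           # initial sampling
        "per_device_train_batch_size": tune.choice([16, 32]),   # sampled once, never mutated
    },
    tune_config=tune.TuneConfig(
        metric="accuracy",
        mode="max",
        scheduler=pbt,
        num_samples=4,
    ),
)
results = tuner.fit()
```

With this setup, per_device_train_batch_size is chosen once per trial from param_space and stays fixed, while learning_rate is the only value PBT mutates during training.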
If you have a better answer, please add a comment about this. Thank you!
Source: Stackoverflow.com