Resolved: Ray Tune scheduler hyperparam_mutations vs. param_space

I am having a hard time understanding why there seem to be two search space definitions in the same program flow. The tune.Tuner() object takes a param_space argument, where we can set up the hyperparameter space to search. However, it can also take a scheduler.
As an example, I have a HuggingFace transformer setup with a Population Based Training (PBT) scheduler, which has its own hyperparam_mutations argument. That looks like another hyperparameter space to search.
  1. What is the interaction between these two spaces?
  2. If I just want to perturb learning_rate to see its effect on my accuracy, would I put this into the tuner’s param_space or into the scheduler’s hyperparam_mutations?

Best Answer:

A section in one of the Ray Tune PBT user guides touches on both questions.
  1. param_space is used to draw the initial samples. hyperparam_mutations specifies the distributions used for resampling (resampling being one of the possible mutation operations) and determines which parameters actually get mutated. If a parameter is not specified in param_space, PBT draws its initial value from hyperparam_mutations.

  2. If you only want learning_rate to be mutated, then it is the only parameter that should be specified in hyperparam_mutations.
