Code repository for the paper Hyperparameter Optimization: A Spectral Approach by Elad Hazan, Adam Klivans, Yang Yuan.
The code is written in Python 3 and currently only supports tuning binary hyperparameters (taking values from {-1, +1}).
Before using the code, you may need to implement three parts yourself:
You also need to fill in options.txt with the name of each hyperparameter, one per line. See options_example.txt for an illustration. Do not add extra lines at the end of the file.
Now you can run "python main.py -N 60" if you have 60 hyperparameters. Note that this also means options.txt must contain exactly 60 lines.
Harmonica will run 3 stages by default and then run a base algorithm for fine-tuning. It outputs the best answer at the end.
You may comment/uncomment the two examples in samplings.py, and run "python main.py".
Harmonica extracts important features through a multi-stage learning process. The rough idea is the following (see the paper for more details).
Step 1. Uniformly sample (say) 100 configurations.
Step 2. Extend the feature vector with low-degree monomials of the hyperparameters.
Step 3. Run Lasso on the extended feature vector, with alpha (the weight on the l_1 regularization term) equal to (say) 3.
Step 4. Pick the top (say) 5 important monomials, and fix the variables they contain so as to minimize the sparse linear function that Lasso learned.
Step 5. Update the function f and go back to Step 1.
After running this process for (say) 3 stages, most of the important variables have already been fixed. We can then call a base algorithm such as Hyperband or Random Search (or your favorite hyperparameter-tuning algorithm) to fine-tune the remaining variables.
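Steps 1-4 above can be sketched in a few lines. This is a minimal toy sketch, assuming NumPy and scikit-learn are available; the objective f, the alpha value, and all constants are illustrative and do not reflect the repository's actual interface (scikit-learn also normalizes the l_1 term differently from the paper's convention, so a smaller alpha is used here):

```python
import itertools
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n_vars, n_samples, degree, top_k = 10, 100, 2, 5

def f(x):
    # Toy objective: depends mainly on x0, x1 and their interaction.
    return 2.0 * x[0] - 1.5 * x[1] + 1.0 * x[0] * x[1] + 0.1 * rng.standard_normal()

# Step 1: uniformly sample binary configurations from {-1, +1}^n.
X = rng.choice([-1.0, 1.0], size=(n_samples, n_vars))
y = np.array([f(x) for x in X])

# Step 2: extend each sample with all monomials up to the given degree.
monomials = [c for d in range(1, degree + 1)
             for c in itertools.combinations(range(n_vars), d)]
Phi = np.array([[np.prod(x[list(c)]) for c in monomials] for x in X])

# Step 3: run Lasso on the extended features (alpha chosen for this toy scale).
lasso = Lasso(alpha=0.1)
lasso.fit(Phi, y)

# Step 4: pick the top-k monomials by absolute weight; the variables they
# contain would then be fixed to minimize the learned sparse function.
order = np.argsort(-np.abs(lasso.coef_))[:top_k]
important = [monomials[i] for i in order]
```

On this toy objective, the recovered `important` list contains the monomials on variables 0 and 1 and their interaction, matching how f was built.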
As we can see above, Harmonica itself has a few hyperparameters, such as the number of samples, alpha, the number of important monomials, and the number of stages. Fortunately, we observe that Harmonica is not very sensitive to them; the default values usually work well.
It is easy to make Harmonica run in parallel during the sampling process. Here is a simple way to do it with pssh on Azure (EC2 is similar).
1. Add a few (say 10) virtual machines in the same resource group.
2. For every machine, set up the corresponding DNS name, e.g., hyper001.eastus.cloudapp.azure.com.
3. For simplicity, we rename these virtual machines M1, M2, ... , M10. (You may set up the ssh config file to do so)
4. Assume the main machine is M1. You can use the pssh command to control M1-M10 with a single line:
pssh -h hosts.txt -t 0 -P -i 'ls'
Here hosts.txt is a text file with 10 lines, each containing the host name of one machine (say, M3 or M5). By running this command, M1-M10 will each run 'ls' locally.
Now you are able to run programs on 10 machines simultaneously. How do we make sure each machine works on different tasks?
First, create a shared filesystem.
Once it's done, every machine will have a local directory, say, /shared, which is shared with every other machine.
Now, you may first write a "task.txt" file, which contains, say, 100 tasks. Let's call them T1, T2, ..., T100.
On every machine, run a program that reads the task.txt file and learns there are 100 tasks to do. Each program then claims and runs tasks so that no two machines work on the same one.
Remarks
For any questions, please email Yang: yangyuan@cs.cornell.edu.