# Hyperparameter Search for Deep Learning (Basic)

Every DeepHyper search requires at least 2 Python objects as input:

• run: your “black-box” function returning the objective value to be maximized

• Problem: an instance of deephyper.problem.BaseProblem which defines the search space of input parameters to run

These objects are required for both HPS and NAS, but take on a slightly different meaning in the context of NAS.
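As a minimal illustration of these two inputs (the names and the toy objective below are hypothetical, not part of the tutorial files), a `run` function is just a callable mapping a hyperparameter dictionary to a scalar objective, and the `Problem` object declares the ranges those dictionary values may take:

```python
# A toy "black-box" run function: DeepHyper calls it with a dict of
# hyperparameters and maximizes the returned value.
def run(point):
    x = point["x"]
    return -x ** 2  # maximized at x == 0

# The matching search-space definition would look like this
# (shown as a comment so the sketch stays dependency-free):
#
#   from deephyper.problem import HpProblem
#   Problem = HpProblem()
#   Problem.add_hyperparameter((-10.0, 10.0), "x")

print(run({"x": 2.0}))  # -4.0
```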

We will illustrate DeepHyper HPS using a regression example. We generate synthetic data according to $$y = - \mathbf{x}^{T} \mathbf{x}$$ for random $$N$$-dimensional input vectors $$\mathbf{x}$$. Our regression model is a multilayer perceptron with 1 hidden layer, implemented in Keras. Using HPS, we will then tune the model hyperparameters to optimize the validation $$R^{2}$$ metric.
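To make the target concrete, here is a short NumPy check (illustrative only, not one of the tutorial files) that the label is the negative squared norm of the input vector:

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.uniform(-50, 50, size=10)  # one random 10-dimensional input

y = -x @ x  # y = -x^T x, equivalently -np.sum(x**2)

# The objective is non-positive, and zero only at the origin.
assert y <= 0
assert np.isclose(y, -np.sum(x ** 2))
```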

## Setting up the problem

Note

Be sure to work in a virtual environment where you can easily pip install new packages. This typically entails using either Anaconda, virtualenv, or Pipenv.

Let’s start by creating a new DeepHyper project workspace. This is a directory where you will create search problem instances that are automatically installed and importable across your Python environment.

```bash
$ deephyper start-project hps_demo
```

A new `hps_demo` directory is created, containing the following files:

```
hps_demo/
    hps_demo/
        __init__.py
    setup.py
```

We can now define DeepHyper search problems inside this directory, using either `deephyper new-problem nas {name}` or `deephyper new-problem hps {name}` for NAS or HPS, respectively. Let’s set up an HPS problem called `polynome2` as follows:

```bash
$ cd hps_demo/hps_demo/
$ deephyper new-problem hps polynome2
```

A new HPS problem subdirectory should be in place. This is a Python subpackage containing sample code in the files `__init__.py`, `load_data.py`, `model_run.py`, and `problem.py`. Overall, your project directory should look like:

```
hps_demo/
    hps_demo/
        __init__.py
        polynome2/
            __init__.py
            load_data.py
            model_run.py
            problem.py
    setup.py
```

## Generating data

The sample `load_data.py` will generate the training and validation data for our demo regression problem. While not required by the DeepHyper HPS API, it is helpful to encapsulate data loading and preparation in a separate module. This sample generates data from the function $$f(X) = -\sum_{i=0}^{n-1} x_i^2$$ where $$X \in [a, b]^n$$:

`polynome2/load_data.py`:

```python
import os
import numpy as np

np.random.seed(2018)


def load_data(dim=10, a=-50, b=50, prop=0.80, size=10000):
    """Generate random data for the polynome_2 function -sum(x**2)
    over the continuous range [a, b].

    Args:
        dim (int): size of the input vector for the polynome_2 function.
        a (int): minimum bound for all X dimensions.
        b (int): maximum bound for all X dimensions.
        prop (float): a value in [0., 1.] giving the train/validation
            split: prop is the fraction of data in the training set and
            1. - prop the fraction in the validation set.
        size (int): total amount of data to generate, equal to
            len(training_data) + len(validation_data).

    Returns:
        tuple(tuple(ndarray, ndarray), tuple(ndarray, ndarray)):
            (train_X, train_y), (valid_X, valid_y).
    """

    def polynome_2(x):
        return -sum([x_i ** 2 for x_i in x])

    d = b - a
    x = np.array([a + np.random.random(dim) * d for i in range(size)])
    y = np.array([[polynome_2(v)] for v in x])

    sep_index = int(prop * size)
    train_X = x[:sep_index]
    train_y = y[:sep_index]

    valid_X = x[sep_index:]
    valid_y = y[sep_index:]

    print(f"train_X shape: {np.shape(train_X)}")
    print(f"train_y shape: {np.shape(train_y)}")
    print(f"valid_X shape: {np.shape(valid_X)}")
    print(f"valid_y shape: {np.shape(valid_y)}")
    return (train_X, train_y), (valid_X, valid_y)


if __name__ == "__main__":
    load_data()
```

You can test the `load_data` function:

```bash
python load_data.py
```

The expected output is:

```
[Out]
train_X shape: (8000, 10)
train_y shape: (8000, 1)
valid_X shape: (2000, 10)
valid_y shape: (2000, 1)
```

## The Keras model

`model_run.py` contains the code for the neural network that we will train. The model is implemented in the `run()` function below. We will provide this function to DeepHyper, which will call it to evaluate various hyperparameter settings. This function takes a `point` argument, which is a dictionary of tunable hyperparameters. In this case, we will tune:

• The number of units of the Dense hidden layer (`point['units']`)

• The activation function of the Dense layer (`point['activation']`)

• The learning rate of the RMSprop optimizer (`point['lr']`)

After training, the validation $$R^{2}$$ is returned by the `run()` function. This return value is the objective maximized by the DeepHyper HPS search algorithm.
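The validation $$R^{2}$$ used as the objective can be sanity-checked with plain NumPy; this sketch (independent of Keras and of the tutorial files) applies the same `1 - SS_res / SS_tot` formula as the metric used during training:

```python
import numpy as np

def r2_numpy(y_true, y_pred, eps=1e-7):
    # Coefficient of determination, averaged over output columns.
    ss_res = np.sum((y_true - y_pred) ** 2, axis=0)
    ss_tot = np.sum((y_true - np.mean(y_true, axis=0)) ** 2, axis=0)
    return float(np.mean(1.0 - ss_res / (ss_tot + eps)))

y_true = np.array([[1.0], [2.0], [3.0]])
perfect = r2_numpy(y_true, y_true)                        # ~1.0 for a perfect fit
mean_only = r2_numpy(y_true, np.full_like(y_true, 2.0))   # ~0.0 for predicting the mean
```

A perfect prediction scores about 1, while always predicting the mean scores about 0; negative values mean the model does worse than the mean predictor.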
`polynome2/model_run.py`:

```python
import numpy as np
import keras.backend as K
import keras
from keras.callbacks import EarlyStopping
from keras.layers import Dense
from keras.models import Sequential
from keras.optimizers import RMSprop

import os
import sys

here = os.path.dirname(os.path.abspath(__file__))
sys.path.insert(0, here)
from load_data import load_data


def r2(y_true, y_pred):
    SS_res = keras.backend.sum(keras.backend.square(y_true - y_pred), axis=0)
    SS_tot = keras.backend.sum(
        keras.backend.square(y_true - keras.backend.mean(y_true, axis=0)), axis=0
    )
    output_scores = 1 - SS_res / (SS_tot + keras.backend.epsilon())
    r2 = keras.backend.mean(output_scores)
    return r2


HISTORY = None


def run(point):
    global HISTORY
    (x_train, y_train), (x_valid, y_valid) = load_data()

    if point["activation"] == "identity":
        point["activation"] = None

    model = Sequential()
    model.add(
        Dense(
            point["units"],
            activation=point["activation"],
            input_shape=tuple(np.shape(x_train)[1:]),
        )
    )
    model.add(Dense(1))

    model.summary()

    model.compile(loss="mse", optimizer=RMSprop(lr=point["lr"]), metrics=[r2])

    history = model.fit(
        x_train,
        y_train,
        batch_size=64,
        epochs=1000,
        verbose=1,
        callbacks=[EarlyStopping(monitor="val_r2", mode="max", verbose=1, patience=10)],
        validation_data=(x_valid, y_valid),
    )

    HISTORY = history.history

    return history.history["val_r2"][-1]


if __name__ == "__main__":
    point = {"units": 10, "activation": "relu", "lr": 0.01}
    objective = run(point)
    print("objective: ", objective)
    import matplotlib.pyplot as plt

    plt.plot(HISTORY["val_r2"])
    plt.xlabel("Epochs")
    plt.ylabel("Objective: $R^2$")
    plt.grid()
    plt.show()
```

Note

Adding an `EarlyStopping(...)` callback is a good idea to stop the training of your model as soon as it stops improving.

```python
...
callbacks=[EarlyStopping(
    monitor='val_r2',
    mode='max',
    verbose=1,
    patience=10
)]
...
```

We can first train this model to evaluate the baseline accuracy:

```bash
python model_run.py
```

```
[Out]
objective:  -0.00040728187561035154
```

## Defining the HPS Problem space

The `run` function in `model_run.py` expects a hyperparameter dictionary with three keys: `units`, `activation`, and `lr`. We define the acceptable ranges for these hyperparameters with the `Problem` object inside `problem.py`. Hyperparameter ranges are defined using the following syntax:

• Discrete integer ranges are generated from a tuple: `(lower: int, upper: int)`

• Continuous parameters are generated from a tuple: `(lower: float, upper: float)`

• Categorical or nonordinal hyperparameter ranges can be given as a list of possible values: `[val1, val2, ...]`

You probably have one or more “reference” sets of hyperparameters that are either hand-crafted or chosen by intuition. To bootstrap the search with these so-called starting points, use the `add_starting_point(...)` method.

Note

Several starting points can be defined with `Problem.add_starting_point(**dims)`. All starting points will be evaluated before generating other evaluations.

`polynome2/problem.py`:

```python
from deephyper.problem import HpProblem

Problem = HpProblem()

Problem.add_hyperparameter((1, 100), "units")
Problem.add_hyperparameter(["identity", "relu", "sigmoid", "tanh"], "activation")
Problem.add_hyperparameter((0.0001, 1.0), "lr")

Problem.add_starting_point(units=10, activation="identity", lr=0.01)

if __name__ == "__main__":
    print(Problem)
```

You can look at the representation of your problem:

```bash
python problem.py
```

The expected output is:

```
[Out]
Problem
{'activation': [None, 'relu', 'sigmoid', 'tanh'],
 'lr': (0.0001, 1.0),
 'units': (1, 100)}

Starting Point
{0: {'activation': None, 'lr': 0.01, 'units': 10}}
```

## Running the search locally

Everything is ready to run.
Recall the Python files defining our experiment:

```
polynome2/
    __init__.py
    load_data.py
    model_run.py
    problem.py
```

We have tested the syntax in all of these by running them individually. Now, let’s put it all together by tuning the 3 hyperparameters with asynchronous model-based search (AMBS).

```bash
deephyper hps ambs --problem hps_demo.polynome2.problem.Problem --run hps_demo.polynome2.model_run.run
```

Note

The above command will require a long time to execute completely. If you want to run a shorter search, append `--max-evals 100` to the end of the command to expedite the process.

Note

In order to run DeepHyper locally and on other systems we are using `deephyper.evaluator`. For local evaluations we use the `deephyper.evaluator.SubprocessEvaluator`.

Note

As an alternative to the command line above, paths to the `problem.py` and `model_run.py` files can be passed as arguments. DeepHyper requires that these modules contain an importable `Problem` instance and `run` callable, respectively. It is your responsibility to ensure that any other modules imported in `problem.py` or `model_run.py` are in the Python import search path. We strongly recommend using a virtual environment with the `start-project` and `new-problem` command line tools. This ensures that any helper modules are easily accessible using the syntax `import problem_name.helper_module`.

After the search is over, you will find the following files in your working directory:

```
deephyper.log
results.csv
results.json
```

## DeepHyper analytics

We will use the `deephyper-analytics` command line tool to investigate the results.

Note

See the Analytics installation instructions of `deephyper-analytics`.
Run:

```bash
deephyper-analytics notebook --type hps --output dh-analytics-hps.ipynb results.csv
```

Then start Jupyter:

```bash
jupyter notebook
```

Open the `dh-analytics-hps` notebook and run it:

```
path to data file: polynome2/results.csv
for customization please see: https://matplotlib.org/api/matplotlib_configuration_api.html
```

### Setup & data loading

```python
path_to_data_file = 'polynome2/results.csv'
```

```python
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

from pprint import pprint
from datetime import datetime
from tqdm import tqdm
from IPython.display import display, Markdown

width = 21
height = 13

matplotlib.rcParams.update({
    'font.size': 21,
    'figure.figsize': (width, height),
    'figure.facecolor': 'white',
    'savefig.dpi': 72,
    'figure.subplot.bottom': 0.125,
    'figure.edgecolor': 'white',
    'xtick.labelsize': 21,
    'ytick.labelsize': 21})

df = pd.read_csv(path_to_data_file)
display(Markdown(f'The search did _{df.count()[0]}_ **evaluations**.'))
df.head()
```

The search did 88 evaluations.
```
  activation        lr  units  objective  elapsed_sec
0        NaN  0.010000     10 -67.720345     4.683628
1    sigmoid  0.210479     78 -47.973845     7.850657
2    sigmoid  0.849683     18  -7.910984    11.379633
3       tanh  0.951716     19  -2.596602    16.031375
4    sigmoid  0.898754     74 -21.409714    19.312386
```

### Statistical summary

```python
df.describe()
```

```
               lr      units   objective  elapsed_sec
count  100.000000  100.00000  100.000000   100.000000
mean     0.861301   13.12000   -3.468272   188.652953
std      0.112005   10.78746   11.586969   116.032871
min      0.010000    1.00000  -74.376173     4.683628
25%      0.861376    7.75000   -2.011465    87.576996
50%      0.871134   11.50000   -0.092576   178.604464
75%      0.876806   15.00000    0.494384   288.718287
max      0.997793   78.00000    0.746590   399.764441
```

### Search trajectory

```python
plt.plot(df.elapsed_sec, df.objective)
plt.ylabel('Objective')
plt.xlabel('Time (s.)')
plt.xlim(0)
plt.grid()
plt.show()
```

### Pairplots

```python
not_include = ['elapsed_sec']
sns.pairplot(df.loc[:, filter(lambda n: n not in not_include, df.columns)],
             diag_kind="kde", markers="+",
             plot_kws=dict(s=50, edgecolor="b", linewidth=1),
             diag_kws=dict(shade=True))
plt.show()
```

```python
corr = df.loc[:, filter(lambda n: n not in not_include, df.columns)].corr()
sns.heatmap(corr,
            xticklabels=corr.columns,
            yticklabels=corr.columns,
            cmap=sns.diverging_palette(220, 10, as_cmap=True))
plt.show()
```

### Best objective

```python
i_max = df.objective.idxmax()
df.iloc[i_max]
```

```
activation        relu
lr            0.882041
units               21
objective      0.74659
elapsed_sec    394.818
Name: 98, dtype: object
```

```python
dict(df.iloc[i_max])
```

```
{'activation': 'relu',
 'lr': 0.8820413612862609,
 'units': 21,
 'objective': 0.7465898108482361,
 'elapsed_sec': 394.81818103790283}
```

The best point the search found:

```python
point = {
    'activation': 'relu',
    'lr': 0.8820413612862609,
    'units': 21
}
```

Just pass this point to your `run` function.

`polynome2/model_run.py`:

```python
import numpy as np
import keras.backend as K
import keras
from keras.callbacks import EarlyStopping
from keras.layers import Dense
from keras.models import Sequential
from keras.optimizers import RMSprop

import os
import sys

here = os.path.dirname(os.path.abspath(__file__))
sys.path.insert(0, here)
from load_data import load_data


def r2(y_true, y_pred):
    SS_res = keras.backend.sum(keras.backend.square(y_true - y_pred), axis=0)
    SS_tot = keras.backend.sum(
        keras.backend.square(y_true - keras.backend.mean(y_true, axis=0)), axis=0
    )
    output_scores = 1 - SS_res / (SS_tot + keras.backend.epsilon())
    r2 = keras.backend.mean(output_scores)
    return r2


HISTORY = None


def run(point):
    global HISTORY
    (x_train, y_train), (x_valid, y_valid) = load_data()

    model = Sequential()
    model.add(Dense(
        point['units'],
        activation=point['activation'],
        input_shape=tuple(np.shape(x_train)[1:])))
    model.add(Dense(1))

    model.summary()

    model.compile(loss='mse', optimizer=RMSprop(lr=point['lr']), metrics=[r2])

    history = model.fit(x_train, y_train,
                        batch_size=64,
                        epochs=1000,
                        verbose=1,
                        callbacks=[EarlyStopping(
                            monitor='val_r2',
                            mode='max',
                            verbose=1,
                            patience=10
                        )],
                        validation_data=(x_valid, y_valid))

    HISTORY = history.history

    return history.history['val_r2'][-1]


if __name__ == '__main__':
    point = {
        'activation': 'relu',
        'lr': 0.8820413612862609,
        'units': 21
    }
    objective = run(point)
    print('objective: ', objective)
    import matplotlib.pyplot as plt
    plt.plot(HISTORY['val_r2'])
    plt.xlabel('Epochs')
    plt.ylabel('Objective: $R^2$')
    plt.grid()
    plt.show()
```

And run the script:

```bash
python model_run.py
```

```
[Out]
objective: 0.47821942329406736
```

## Running the search on ALCF’s Theta and Cooley

Now let’s run the same search, but scale out to run parallel model evaluations across the nodes of an HPC system such as Theta or Cooley. First create a Balsam database:

```bash
$ balsam init polydb
```


Start and connect to the polydb database:

```bash
$ source balsamactivate polydb
```

Set up the demo `polynome2` problem, as before:

```bash
$ deephyper start-project hps_demo
$ cd hps_demo/hps_demo/
$ deephyper new-problem hps polynome2
```


Use the balsam-submit command to set up and dispatch an AMBS job to the local scheduler:

```bash
$ deephyper balsam-submit hps polynome2_demo -p hps_demo.polynome2.problem.Problem \
    -r hps_demo.polynome2.model_run.run \
    -t 30 -q debug-cache-quad -n 4 -A datascience -j mpi
```

```
[Out]
Validating Problem...OK
Validating run...OK
Bootstrapping apps...OK
Creating HPS(AMBS) BalsamJob...OK
Performing job submission...
Submit OK: Qlaunch {
    'command': '/lus/theta-fs0/projects/datascience/msalim/deephyper/deephyper/db/qsubmit/qlaunch12.sh',
    'from_balsam': True,
    'id': 12,
    'job_mode': 'mpi',
    'nodes': 4,
    'prescheduled_only': False,
    'project': 'datascience',
    'scheduler_id': 370907,
    'state': 'submitted',
    'wall_minutes': 30,
    'wf_filter': 'test_hps'}
**************************************************************************************************************************************
Success. The search will run at: /myprojects/deephyper/deephyper/db/data/test_hps/test_hps_2ef063ce
**************************************************************************************************************************************
```


Above, balsam-submit takes the following arguments:

1. The first positional argument mode is either hps or nas

2. The second positional argument workflow must be a unique identifier for the run. An error will be raised if this workflow already exists.

3. -p Problem and -r Run arguments define the search, as before

4. -t 30 indicates the walltime (in minutes) of the scheduled job

5. -n 4 requests four nodes on which to run the search. DeepHyper will automatically scale the search out across available nodes.

6. -q Queue and -A Project pass the name of the job queue and project allocation to the HPC scheduler

7. -j or --job-mode must be either mpi or serial. This controls how Balsam launches your model_runs.

Once the search is done, you will find results in the directory shown in the banner: /myprojects/deephyper/deephyper/db/data/test_hps/test_hps_2ef063ce.

Note

The examples so far assume that your DeepHyper models run in the same Python environment as DeepHyper and each model runs on a single node. If you need more control over model execution, say, to run containerized models, or to run data-parallel model training with Horovod, you can hook into the Balsam job controller. See Configuring model execution with Balsam for a detailed example.