Neural Architecture Search (NAS)

A neural architecture search (NAS) problem is defined by three files placed in a problem directory within a Python package:

nas_problems/
    setup.py
    nas_problems/
        __init__.py
        myproblem/
            __init__.py
            load_data.py
            problem.py
            search_space.py

We will illustrate the NAS problem definition with a regression example. A polynomial function is used to generate training and validation data, and we run a NAS to find the best neural network architecture in this search space for the task.

Create a python package

Initialize a new NAS project:

bash
deephyper start-project nas_problems

The project is created and installed in your current Python environment. Now go inside the nas_problems package and create a new problem:

bash
cd nas_problems/nas_problems/
deephyper new-problem nas polynome2

The problem is created. Now go to the polynome2 problem folder:

bash
cd polynome2

Create load_data.py

First, we will look at the load_data.py file, which loads and returns the training and validation data. The load_data function generates data from a function \(f\) with \(X \in [a, b]^n\) such that \(f(X) = -\sum_{i=0}^{n-1} x_i^2\) (for example, with \(n = 3\) and \(X = (1, 2, 3)\), \(f(X) = -(1 + 4 + 9) = -14\)):

polynome2/load_data.py
import os
import numpy as np

np.random.seed(2018)


def load_data(dim=10, a=-50, b=50, prop=0.80, size=10000):
    """Generate a random data distribution for the polynome_2 function f(X) = -sum(x_i**2), with X sampled uniformly in the continuous range [a, b].

    Args:
        dim (int): size of the input vector for the polynome_2 function.
        a (int): minimum bound for all X dimensions.
        b (int): maximum bound for all X dimensions.
        prop (float): a value in [0., 1.] giving the ratio of data assigned to the training set; the remaining `1.-prop` goes to the validation set.
        size (int): amount of data to generate. It is equal to `len(training_data) + len(validation_data)`.

    Returns:
        tuple(tuple(ndarray, ndarray), tuple(ndarray, ndarray)): Numpy arrays `(train_X, train_y), (valid_X, valid_y)`.
    """

    def polynome_2(x):
        return -sum([x_i ** 2 for x_i in x])

    d = b - a
    x = np.array([a + np.random.random(dim) * d for i in range(size)])
    y = np.array([[polynome_2(v)] for v in x])

    sep_index = int(prop * size)
    train_X = x[:sep_index]
    train_y = y[:sep_index]

    valid_X = x[sep_index:]
    valid_y = y[sep_index:]

    print(f"train_X shape: {np.shape(train_X)}")
    print(f"train_y shape: {np.shape(train_y)}")
    print(f"valid_X shape: {np.shape(valid_X)}")
    print(f"valid_y shape: {np.shape(valid_y)}")
    return (train_X, train_y), (valid_X, valid_y)


if __name__ == "__main__":
    load_data()

Test the load_data function:

bash
python load_data.py

The expected output is:

[Out]
train_X shape: (8000, 10)
train_y shape: (8000, 1)
valid_X shape: (2000, 10)
valid_y shape: (2000, 1)

Create search_space.py

Next, we will take a look at search_space.py, which contains the definition of the neural network search space.

polynome2/search_space.py
import collections

import tensorflow as tf

from deephyper.nas.space import AutoKSearchSpace
from deephyper.nas.space.node import ConstantNode, VariableNode
from deephyper.nas.space.op.basic import Tensor
from deephyper.nas.space.op.connect import Connect
from deephyper.nas.space.op.merge import AddByProjecting
from deephyper.nas.space.op.op1d import Dense, Identity


def add_dense_to_(node):
    node.add_op(Identity())  # we do not want to create a layer in this case

    activations = [None, tf.nn.relu, tf.nn.tanh, tf.nn.sigmoid]
    for units in range(16, 97, 16):
        for activation in activations:
            node.add_op(Dense(units=units, activation=activation))


def create_search_space(
    input_shape=(10,), output_shape=(7,), num_layers=10, *args, **kwargs
):

    arch = AutoKSearchSpace(input_shape, output_shape, regression=True)
    source = prev_input = arch.input_nodes[0]

    # loop over skip connections within a range of the 3 previous nodes
    anchor_points = collections.deque([source], maxlen=3)

    for _ in range(num_layers):
        vnode = VariableNode()
        add_dense_to_(vnode)

        arch.connect(prev_input, vnode)

        # * Cell output
        cell_output = vnode

        cmerge = ConstantNode()
        cmerge.set_op(AddByProjecting(arch, [cell_output], activation="relu"))

        for anchor in anchor_points:
            skipco = VariableNode()
            skipco.add_op(Tensor([]))  # empty tensor: the skip connection is not created
            skipco.add_op(Connect(arch, anchor))  # create a skip connection from the anchor node
            arch.connect(skipco, cmerge)

        # ! for next iter
        prev_input = cmerge
        anchor_points.append(prev_input)

    return arch


def test_create_search_space():
    """Generate a random neural network from the search_space definition.
    """
    from random import random
    from tensorflow.keras.utils import plot_model
    import tensorflow as tf

    search_space = create_search_space(num_layers=10)
    ops = [random() for _ in range(search_space.num_nodes)]

    print(f"This search_space needs {len(ops)} choices to generate a neural network.")

    search_space.set_ops(ops)

    model = search_space.create_model()
    model.summary()

    plot_model(model, to_file="sampled_neural_network.png", show_shapes=True)
    print("The sampled_neural_network.png file has been generated.")


if __name__ == "__main__":
    test_create_search_space()
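
As with load_data.py, you can test the search space definition by running the file, which samples a random architecture and saves a picture of it (note that plot_model requires the pydot and graphviz packages to be installed):

bash
python search_space.py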

Create problem.py

Now, we will take a look at problem.py, which contains the problem definition.

polynome2/problem.py
from deephyper.problem import NaProblem
from nas_problems.polynome2.load_data import load_data
from nas_problems.polynome2.search_space import create_search_space
from deephyper.nas.preprocessing import minmaxstdscaler

Problem = NaProblem(seed=2019)

Problem.load_data(load_data)

Problem.preprocessing(minmaxstdscaler)

Problem.search_space(create_search_space, num_layers=3)

Problem.hyperparameters(
    batch_size=32,
    learning_rate=0.01,
    optimizer="adam",
    num_epochs=20,
    callbacks=dict(
        EarlyStopping=dict(
            monitor="val_r2", mode="max", verbose=0, patience=5
        )
    ),
)

Problem.loss("mse")

Problem.metrics(["r2"])

Problem.objective("val_r2__last")


# Just to print your problem, to test its definition and imports in the current python environment.
if __name__ == "__main__":
    print(Problem)

You can look at the representation of your problem:

bash
python problem.py

The expected output is:

[Out]
Problem is:
* SEED = 2019 *
    - search space   : nas_problems.polynome2.search_space.create_search_space
    - data loading   : nas_problems.polynome2.load_data.load_data
    - preprocessing  : deephyper.nas.preprocessing.minmaxstdscaler
    - hyperparameters:
        * verbose: 1
        * batch_size: 32
        * learning_rate: 0.01
        * optimizer: adam
        * num_epochs: 20
        * callbacks: {'EarlyStopping': {'monitor': 'val_r2', 'mode': 'max', 'verbose': 0, 'patience': 5}}
    - loss           : mse
    - metrics        :
        * r2
    - objective      : val_r2__last
    - post-training  : None

Running the search locally

Everything is ready to run. Let's recall the file structure of our experiment:

polynome2/
    __init__.py
    load_data.py
    problem.py
    search_space.py

Each of these files has been tested individually on the local machine. Next, we will run a random search (RDM).

bash
deephyper nas random --evaluator ray --problem nas_problems.polynome2.problem.Problem

Note

To run DeepHyper locally and on other systems, we use the Evaluator interface. For local evaluations we can use the RayEvaluator (as in the command above) or the SubprocessEvaluator; see the sketch below.
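
For example, a local run with the subprocess-based evaluator might look like the following (a sketch, assuming your DeepHyper version registers this evaluator under the subprocess name; the ray command above is the tested path):

bash
deephyper nas random --evaluator subprocess --problem nas_problems.polynome2.problem.Problem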

After the search is over, you will find the following file in your current folder:

deephyper.log

You can now use deephyper-analytics to plot information about the search:

bash
deephyper-analytics parse deephyper.log

A JSON file should have been generated. We will now create a Jupyter notebook (replace $MY_JSON_FILE with the name of the JSON file created by the parse command):

bash
deephyper-analytics single -p $MY_JSON_FILE

jupyter notebook dh-analytics-single.ipynb

Running the search on ALCF’s Theta and Cooley

Now let’s run the search on an HPC system such as Theta or Cooley. First create a Balsam database:

bash
balsam init polydb

Start and connect to the polydb database:

bash
source balsamactivate polydb

Create a Balsam application named AE for the regularized (aging) evolution search:

bash
balsam app --name AE --exe "$(which python) -m deephyper.search.nas.regevo"
[Out]
Application 1:
-----------------------------
Name:           AE
Description:
Executable:     /lus/theta-fs0/projects/datascience/regele/dh-opt/bin/python -m deephyper.search.nas.regevo
Preprocess:
Postprocess:

Create a Balsam job that runs this application on our problem:

bash
balsam job --name poly_exp --workflow poly_exp --app AE --num-nodes 2 --args "--evaluator balsam --problem nas_problems.polynome2.problem.Problem"
[Out]
BalsamJob 575dba96-c9ec-4015-921c-abcb1f261fce
----------------------------------------------
workflow:                       poly_exp
name:                           poly_exp
description:
lock:
parents:                        []
input_files:                    *
stage_in_url:
stage_out_files:
stage_out_url:
wall_time_minutes:              1
num_nodes:                      2
coschedule_num_nodes:           0
ranks_per_node:                 1
cpu_affinity:                   none
threads_per_rank:               1
threads_per_core:               1
node_packing_count:             1
environ_vars:
application:                    AE
args:                           --evaluator balsam --problem nas_problems.polynome2.problem.Problem
user_workdir:
wait_for_parents:               True
post_error_handler:             False
post_timeout_handler:           False
auto_timeout_retry:             True
state:                          CREATED
queued_launch_id:               None
data:                           {}
*** Executed command:         /lus/theta-fs0/projects/datascience/regele/dh-opt/bin/python -m deephyper.search.nas.regevo --evaluator balsam --problem nas_problems.polynome2.problem.Problem
*** Working directory:        /lus/theta-fs0/projects/datascience/regele/polydb/data/poly_exp/poly_exp_575dba96

Confirm adding job to DB [y/n]: y

Submit the search to the Cobalt scheduler:

bash
balsam submit-launch -n 6 -q debug-cache-quad -t 60 -A datascience --job-mode mpi --wf-filter poly_exp
[Out]
Submit OK: Qlaunch {   'command': '/lus/theta-fs0/projects/datascience/regele/polydb/qsubmit/qlaunch1.sh',
    'from_balsam': True,
    'id': 1,
    'job_mode': 'mpi',
    'nodes': 6,
    'prescheduled_only': False,
    'project': 'datascience',
    'queue': 'debug-cache-quad',
    'scheduler_id': 347124,
    'state': 'submitted',
    'wall_minutes': 60,
    'wf_filter': 'poly_exp'}

Once the job has finished, you will find the search results under polydb/data/poly_exp/poly_exp_575dba96.