6. Neural Architecture Search (Basic)#
In this tutorial we will learn the basics of neural architecture search (NAS). We will use artificial data generated from a polynomial function. Then, we will discover how to create a search space of neural architectures using a directed graph. Finally, we will see how to define the NAS settings and how to execute the search.
[1]:
!pip install deephyper["nas"]
!pip install ray
Successfully installed ConfigSpace-0.4.20 aiohttp-3.8.1 aiohttp-cors-0.7.0 aioredis-1.3.1 aiosignal-1.2.0 async-timeout-4.0.1 asynctest-0.13.0 blessed-1.19.0 colorful-0.5.4 deephyper-0.3.3 deprecated-1.2.13 dh-scikit-optimize-0.9.4 frozenlist-1.2.0 gpustat-1.0.0b1 hiredis-2.0.0 liac-arff-2.5.0 multidict-5.2.0 opencensus-0.8.0 opencensus-context-0.1.2 openml-0.10.2 py-spy-0.3.11 pyaml-21.10.1 ray-1.8.0 redis-4.0.1 xmltodict-0.12.0 yarl-1.7.2
6.1. Loading the data#
First, we will create the load_data function, which loads and returns the training and validation data. The load_data function generates data from a function \(f\) where \(\mathbf{x} \in [a, b]^n\) such that \(f(\mathbf{x}) = -\sum_{i=0}^{n-1} x_i^2\). For example, for \(\mathbf{x} = (1, 2)\) we have \(f(\mathbf{x}) = -(1^2 + 2^2) = -5\):
[1]:
import numpy as np


def load_data(verbose=0, dim=10, a=-50, b=50, prop=0.80, size=10000):
    rs = np.random.RandomState(2018)

    def polynome_2(x):
        return -sum([x_i ** 2 for x_i in x])

    d = b - a
    x = np.array([a + rs.random(dim) * d for _ in range(size)])
    y = np.array([[polynome_2(v)] for v in x])

    sep_index = int(prop * size)
    X_train = x[:sep_index]
    y_train = y[:sep_index]

    X_valid = x[sep_index:]
    y_valid = y[sep_index:]

    if verbose:
        print(f"X_train shape: {np.shape(X_train)}")
        print(f"y_train shape: {np.shape(y_train)}")
        print(f"X_valid shape: {np.shape(X_valid)}")
        print(f"y_valid shape: {np.shape(y_valid)}")
    return (X_train, y_train), (X_valid, y_valid)


_ = load_data(verbose=1)
_ = load_data(verbose=1)
X_train shape: (8000, 10)
y_train shape: (8000, 1)
X_valid shape: (2000, 10)
y_valid shape: (2000, 1)
6.2. Define a neural architecture search space#
Let us define the neural architecture search space. To do this we use the KSearchSpace class. We define the ResNetMLPSpace search space as a subclass of KSearchSpace in which we have to implement a build() method that returns the space itself. The __init__ method is used to pass possible options of the search space, such as the maximum number of layers self.num_layers.
The input nodes can be retrieved with self.input_nodes, which is built automatically depending on the input_shape.
The search space is composed of ConstantNode and VariableNode objects. A ConstantNode defines a fixed operation, whereas a VariableNode defines a list of possible operations (i.e., it corresponds to a categorical decision variable). Operations can be defined directly from Keras layers, such as:
Dense = operation(tf.keras.layers.Dense)
All nodes of the search space without outer edges are automatically assumed to be output nodes.
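Before the full example below, here is a minimal sketch of the KSearchSpace contract, using a hypothetical space with a single fixed output layer and no decision variables; it only illustrates the build() pattern named above and is not part of the tutorial's search space:
import tensorflow as tf

from deephyper.nas import KSearchSpace
from deephyper.nas.node import ConstantNode
from deephyper.nas.operation import operation

Dense = operation(tf.keras.layers.Dense)


class MinimalSpace(KSearchSpace):
    def build(self):
        # self.input_nodes is built automatically from input_shape
        output = ConstantNode(op=Dense(self.output_shape[0]))
        self.connect(self.input_nodes[0], output)
        return self  # build() must return the space itself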
[2]:
import collections

import tensorflow as tf

from deephyper.nas import KSearchSpace
from deephyper.nas.node import ConstantNode, VariableNode
from deephyper.nas.operation import operation, Zero, Connect, AddByProjecting, Identity

Dense = operation(tf.keras.layers.Dense)
Dropout = operation(tf.keras.layers.Dropout)
Add = operation(tf.keras.layers.Add)
Flatten = operation(tf.keras.layers.Flatten)

ACTIVATIONS = [
    tf.keras.activations.elu,
    tf.keras.activations.gelu,
    tf.keras.activations.hard_sigmoid,
    tf.keras.activations.linear,
    tf.keras.activations.relu,
    tf.keras.activations.selu,
    tf.keras.activations.sigmoid,
    tf.keras.activations.softplus,
    tf.keras.activations.softsign,
    tf.keras.activations.swish,
    tf.keras.activations.tanh,
]


class ResNetMLPSpace(KSearchSpace):

    def __init__(self, input_shape, output_shape, seed=None, num_layers=3, mode="regression"):
        super().__init__(input_shape, output_shape, seed=seed)

        self.num_layers = num_layers
        assert mode in ["regression", "classification"]
        self.mode = mode

    def build(self):

        source = self.input_nodes[0]
        output_dim = self.output_shape[0]

        out_sub_graph = self.build_sub_graph(source, self.num_layers)

        if self.mode == "regression":
            output = ConstantNode(op=Dense(output_dim))
            self.connect(out_sub_graph, output)
        else:
            output = ConstantNode(
                op=Dense(output_dim, activation="softmax")
            )  # One-hot encoding
            self.connect(out_sub_graph, output)

        return self

    def build_sub_graph(self, input_, num_layers=3):
        source = prev_input = input_

        # look for skip connections within a range of the 3 previous nodes
        anchor_points = collections.deque([source], maxlen=3)

        for _ in range(self.num_layers):
            dense = VariableNode()
            self.add_dense_to_(dense)
            self.connect(prev_input, dense)
            x = dense

            dropout = VariableNode()
            self.add_dropout_to_(dropout)
            self.connect(x, dropout)
            x = dropout

            add = ConstantNode()
            add.set_op(AddByProjecting(self, [x], activation="relu"))

            for anchor in anchor_points:
                skipco = VariableNode()
                skipco.add_op(Zero())
                skipco.add_op(Connect(self, anchor))
                self.connect(skipco, add)

            prev_input = add

            # ! for next iter
            anchor_points.append(prev_input)

        return prev_input

    def add_dense_to_(self, node):
        node.add_op(Identity())  # we do not want to create a layer in this case
        for units in range(16, 16 * 16 + 1, 16):
            for activation in ACTIVATIONS:
                node.add_op(Dense(units=units, activation=activation))

    def add_dropout_to_(self, node):
        a, b = 1e-3, 0.4
        node.add_op(Identity())
        dropout_range = np.exp(np.linspace(np.log(a), np.log(b), 10))  # log-spaced dropout rates
        for rate in dropout_range:
            node.add_op(Dropout(rate))
A KSearchSpace has some useful methods, such as:
- space.sample(choice), which returns a random model from the search space if choice == None, or generates the model corresponding to the given choice otherwise (see the sketch after this list).
- space.choices(), which returns the list of discrete dimensions corresponding to the search space.
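For instance, here is a minimal usage sketch, assuming space is built as in the cell below; the (low, high) bounds correspond to the Choices output shown there:
choices = space.choices()                   # one (low, high) bound per variable node
choice = [low for (low, high) in choices]   # always pick the first operation
model = space.sample(choice)                # deterministic: same choice, same model
random_model = space.sample()               # choice == None: random architecture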
Let us visualize a few randomly sampled neural architecture from this search space.
[3]:
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
from tensorflow.keras.utils import plot_model

shapes = dict(input_shape=(10,), output_shape=(1,))
space = ResNetMLPSpace(**shapes).build()

print("Choices: ", space.choices())

images = []
plt.figure(figsize=(15, 15))
for i in range(4):
    plt.subplot(2, 2, i + 1)
    model = space.sample()
    plot_model(model, "random_model.png", show_shapes=False, show_layer_names=False)
    image = mpimg.imread("random_model.png")
    plt.imshow(image)
    plt.axis('off')
plt.show()
Choices: [(0, 176), (0, 10), (0, 1), (0, 176), (0, 10), (0, 1), (0, 1), (0, 176), (0, 10), (0, 1), (0, 1), (0, 1)]

6.3. Create a problem instance#
Let us define the neural architecture search problem.
[4]:
from deephyper.problem import NaProblem
from deephyper.nas.preprocessing import minmaxstdscaler

# Create a Neural Architecture problem
problem = NaProblem()

# Link the load-data function
problem.load_data(load_data)

# The function passed to preprocessing has to return
# a scikit-learn like preprocessor.
problem.preprocessing(minmaxstdscaler)

# Link the defined search space
problem.search_space(ResNetMLPSpace)

# Fixed hyperparameters for all trained models
problem.hyperparameters(
    batch_size=32,
    learning_rate=0.01,
    optimizer="adam",
    epsilon=1e-7,
    num_epochs=20,
    callbacks=dict(
        EarlyStopping=dict(
            monitor="val_r2", mode="max", verbose=0, patience=5
        )
    ),
)

# Define the optimized loss (it can also be a function)
problem.loss("mse")

# Define metrics to compute for each training and validation epoch
problem.metrics(["r2"])

# Define the maximised objective
problem.objective("val_r2__last")

problem
[4]:
Problem is:
- search space : __main__.ResNetMLPSpace
- data loading : __main__.load_data
- preprocessing : deephyper.nas.preprocessing._base.minmaxstdscaler
- hyperparameters:
* verbose: 0
* batch_size: 32
* learning_rate: 0.01
* optimizer: adam
* num_epochs: 20
* callbacks: {'EarlyStopping': {'monitor': 'val_r2', 'mode': 'max', 'verbose': 0, 'patience': 5}}
- loss : mse
- metrics :
* r2
- objective : val_r2__last
Find more about NaProblem settings in the Problem documentation.
Tip
Adding an EarlyStopping(...) callback is a good idea to stop the training of your model as soon as it stops improving.
...
EarlyStopping=dict(monitor="val_r2", mode="max", verbose=0, patience=5)
...
6.4. Running the search#
Create an Evaluator object using the ray backend to distribute the evaluation of the run-function. For neural architecture search, DeepHyper provides the run_base_trainer function, which automates the training process of a sampled model.
[5]:
import multiprocessing

num_cpus = multiprocessing.cpu_count()
print(f"{num_cpus} CPU{'s' if num_cpus > 1 else ''} are available on this system.")

from deephyper.evaluator import Evaluator
from deephyper.evaluator.callback import TqdmCallback
from deephyper.nas.run import run_base_trainer

evaluator = Evaluator.create(
    run_base_trainer,
    method="ray",
    method_kwargs={
        # Start a new Ray server
        "address": None,
        # Defines the number of available CPUs
        "num_cpus": min(4, num_cpus),
        # Defines the number of CPUs for each task
        "num_cpus_per_task": 1,
        "callbacks": [TqdmCallback()],
    },
)

print("Number of workers: ", evaluator.num_workers)
print("Number of workers: ", evaluator.num_workers)
10 CPUs are available on this system.
/Users/romainegele/Documents/Argonne/deephyper/deephyper/evaluator/_evaluator.py:126: UserWarning: Applying nest-asyncio patch for IPython Shell!
warnings.warn(
2023-01-30 18:08:58,502 INFO worker.py:1518 -- Started a local Ray instance.
Number of workers: 4
Tip
If executed locally, you can open the Ray dashboard at an address like http://127.0.0.1:port in a browser to monitor the CPU usage of the execution.
Finally, you can define a random search, provided by the Random class, and link the previously defined problem and evaluator to it.
[6]:
from deephyper.search.nas import Random
search = Random(problem, evaluator)
[7]:
results = search.search(10)
(run_base_trainer pid=45436) 2023-01-30 18:09:05.625566: W tensorflow/core/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz
(run_base_trainer pid=45437) 2023-01-30 18:09:05.625456: W tensorflow/core/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz
(run_base_trainer pid=45438) 2023-01-30 18:09:05.631501: W tensorflow/core/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz
(run_base_trainer pid=45439) 2023-01-30 18:09:05.642482: W tensorflow/core/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz
After the search is over, you will find the following files in your current folder:
results.csv
save/
Let us visualize the training of our models. First, we need to load the training history of each model; the histories are located in save/history:
[8]:
import os
import json

histories = [os.path.join("save/history", f) for f in os.listdir("save/history/") if ".json" in f]

for i, fpath in enumerate(histories):
    with open(fpath, "r") as fd:
        histories[i] = json.load(fd)

print(list(histories[0].keys()))
['n_parameters', 'training_time', 'loss', 'r2', 'val_loss', 'val_r2']
[9]:
plt.figure()
for h in histories:
    plt.plot(h["val_r2"])
plt.ylabel("Validation $R^2$")
plt.xlabel("Epochs")
plt.show()

Once the search is over, a file named results.csv is saved in the current directory. The same dataframe is returned by the search.search(...) call. It contains the configurations evaluated during the search together with their corresponding objective value (i.e., the validation $R^2$); timestamp_submit, the time at which the evaluator submitted the configuration to be evaluated; and timestamp_gather, the time at which the evaluator received the evaluated configuration (both are relative to the creation of the Evaluator instance). Each neural architecture is encoded as a list of discrete decision variables called arch_seq.
[10]:
results
[10]:
| | p:arch_seq | objective | job_id | m:timestamp_submit | m:timestamp_gather |
|---|---|---|---|---|---|
| 0 | [1, 7, 1, 75, 7, 0, 0, 87, 5, 1, 0, 0] | 0.968497 | 1 | 5.663778 | 11.017621 |
| 1 | [17, 8, 1, 50, 3, 0, 1, 40, 3, 0, 0, 1] | 0.950744 | 0 | 5.663629 | 11.263356 |
| 2 | [89, 9, 1, 92, 9, 0, 1, 142, 9, 1, 0, 1] | 0.975879 | 2 | 5.663916 | 14.272947 |
| 3 | [133, 7, 0, 148, 10, 0, 1, 10, 7, 0, 1, 0] | 0.924481 | 3 | 5.664056 | 14.722977 |
| 4 | [1, 6, 0, 151, 4, 1, 0, 80, 4, 1, 0, 1] | 0.936412 | 4 | 11.035176 | 14.983041 |
| 5 | [127, 1, 0, 168, 9, 1, 0, 131, 5, 1, 0, 1] | 0.965255 | 5 | 11.264783 | 15.918054 |
| 6 | [103, 3, 1, 170, 6, 0, 0, 144, 10, 1, 0, 0] | 0.965917 | 9 | 15.919445 | 19.091425 |
| 7 | [20, 4, 0, 123, 0, 1, 1, 60, 10, 1, 1, 0] | 0.937058 | 8 | 14.984620 | 19.759570 |
| 8 | [84, 1, 1, 112, 1, 0, 0, 165, 2, 0, 0, 0] | 0.930200 | 7 | 14.724561 | 20.370351 |
| 9 | [175, 10, 0, 148, 6, 1, 1, 148, 10, 1, 1, 0] | 0.897664 | 6 | 14.274681 | 21.072806 |
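Since both timestamps are relative to the creation of the Evaluator, the search trajectory can be sketched directly from these columns. A minimal sketch, assuming the results dataframe and the matplotlib import from above:
plt.figure()
# Plot each evaluation's objective against the time it was gathered.
plt.scatter(results["m:timestamp_gather"], results["objective"])
plt.xlabel("Time (s.)")
plt.ylabel("Objective (validation $R^2$)")
plt.show()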
The deephyper-analytics command line tool is one way of analyzing this type of file. For example, to output the best configurations we can use its topk functionality; the same result can be obtained with pandas:
[12]:
results.nlargest(n=3, columns="objective")
[12]:
| | p:arch_seq | objective | job_id | m:timestamp_submit | m:timestamp_gather |
|---|---|---|---|---|---|
| 2 | [89, 9, 1, 92, 9, 0, 1, 142, 9, 1, 0, 1] | 0.975879 | 2 | 5.663916 | 14.272947 |
| 0 | [1, 7, 1, 75, 7, 0, 0, 87, 5, 1, 0, 0] | 0.968497 | 1 | 5.663778 | 11.017621 |
| 6 | [103, 3, 1, 170, 6, 0, 0, 144, 10, 1, 0, 0] | 0.965917 | 9 | 15.919445 | 19.091425 |
Each architecture is described as a vector of scalar values named arch_seq; each scalar value represents the operation chosen for one of the variable nodes of our search space.
6.5. Testing the best configuration#
We can visualize the architecture of the best configuration:
[13]:
best_config = results.iloc[results.objective.argmax()][:-2].to_dict()
arch_seq = json.loads(best_config["p:arch_seq"])
model = space.sample(arch_seq)
plot_model(model, show_shapes=False, show_layer_names=False)
[13]:
