6. Neural Architecture Search (Basic)#
In this tutorial we will learn the basics of neural architecture search (NAS). We will use artificial data generated from a polynomial function. Then, we will discover how to create a search space of neural architectures using a directed graph. Finally, we will see how to define the NAS settings and how to execute the search.
[1]:
!pip install deephyper["nas"]
!pip install ray
Successfully installed ConfigSpace-0.4.20 aiohttp-3.8.1 aiohttp-cors-0.7.0 aioredis-1.3.1 aiosignal-1.2.0 async-timeout-4.0.1 asynctest-0.13.0 blessed-1.19.0 colorful-0.5.4 deephyper-0.3.3 deprecated-1.2.13 dh-scikit-optimize-0.9.4 frozenlist-1.2.0 gpustat-1.0.0b1 hiredis-2.0.0 liac-arff-2.5.0 multidict-5.2.0 opencensus-0.8.0 opencensus-context-0.1.2 openml-0.10.2 py-spy-0.3.11 pyaml-21.10.1 ray-1.8.0 redis-4.0.1 xmltodict-0.12.0 yarl-1.7.2
6.1. Loading the data#
First, we will create the load_data function, which loads and returns the training and validation data. The load_data function generates data from a function \(f\) where \(\mathbf{x} \in [a, b]^n\) such that \(f(\mathbf{x}) = -\sum_{i=0}^{n-1} x_i^2\). For example, for \(\mathbf{x} = (1, 2)\) we have \(f(\mathbf{x}) = -(1^2 + 2^2) = -5\):
[1]:
import numpy as np


def load_data(verbose=0, dim=10, a=-50, b=50, prop=0.80, size=10000):
    rs = np.random.RandomState(2018)

    def polynome_2(x):
        return -sum([x_i ** 2 for x_i in x])

    d = b - a
    x = np.array([a + rs.random(dim) * d for _ in range(size)])
    y = np.array([[polynome_2(v)] for v in x])

    sep_index = int(prop * size)
    X_train = x[:sep_index]
    y_train = y[:sep_index]

    X_valid = x[sep_index:]
    y_valid = y[sep_index:]

    if verbose:
        print(f"X_train shape: {np.shape(X_train)}")
        print(f"y_train shape: {np.shape(y_train)}")
        print(f"X_valid shape: {np.shape(X_valid)}")
        print(f"y_valid shape: {np.shape(y_valid)}")
    return (X_train, y_train), (X_valid, y_valid)


_ = load_data(verbose=1)
_ = load_data(verbose=1)
X_train shape: (8000, 10)
y_train shape: (8000, 1)
X_valid shape: (2000, 10)
y_valid shape: (2000, 1)
6.2. Define a neural architecture search space#
Let us define the neural architecture search space. To do this we use the KSearchSpace class. We define the ResNetMLPSpace search space as a subclass of KSearchSpace in which we have to implement a build() method that returns the space itself. The __init__ method is used to pass possible options of the search space, such as the maximum number of layers self.num_layers.
The input nodes can be retrieved with self.input_nodes, which is built automatically depending on the input_shape.
The search space is composed of ConstantNode and VariableNode objects. A ConstantNode defines a fixed operation, whereas a VariableNode defines a list of possible operations (i.e., it corresponds to a categorical decision variable). Operations can be defined directly from Keras layers, such as:
Dense = operation(tf.keras.layers.Dense)
All nodes of the search space without outer edges are automatically assumed to be output nodes.
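Before the full example below, here is a minimal sketch of the KSearchSpace contract, using a hypothetical space with a single fixed output layer and no decision variables; it only illustrates the build() pattern named above and is not part of the tutorial's search space:
import tensorflow as tf

from deephyper.nas import KSearchSpace
from deephyper.nas.node import ConstantNode
from deephyper.nas.operation import operation

Dense = operation(tf.keras.layers.Dense)


class MinimalSpace(KSearchSpace):
    def build(self):
        # self.input_nodes is built automatically from input_shape
        output = ConstantNode(op=Dense(self.output_shape[0]))
        self.connect(self.input_nodes[0], output)
        return self  # build() must return the space itself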
[2]:
import collections

import tensorflow as tf

from deephyper.nas import KSearchSpace
from deephyper.nas.node import ConstantNode, VariableNode
from deephyper.nas.operation import operation, Zero, Connect, AddByProjecting, Identity

Dense = operation(tf.keras.layers.Dense)
Dropout = operation(tf.keras.layers.Dropout)
Add = operation(tf.keras.layers.Add)
Flatten = operation(tf.keras.layers.Flatten)

ACTIVATIONS = [
    tf.keras.activations.elu,
    tf.keras.activations.gelu,
    tf.keras.activations.hard_sigmoid,
    tf.keras.activations.linear,
    tf.keras.activations.relu,
    tf.keras.activations.selu,
    tf.keras.activations.sigmoid,
    tf.keras.activations.softplus,
    tf.keras.activations.softsign,
    tf.keras.activations.swish,
    tf.keras.activations.tanh,
]


class ResNetMLPSpace(KSearchSpace):

    def __init__(self, input_shape, output_shape, seed=None, num_layers=3, mode="regression"):
        super().__init__(input_shape, output_shape, seed=seed)

        self.num_layers = num_layers
        assert mode in ["regression", "classification"]
        self.mode = mode

    def build(self):

        source = self.input_nodes[0]
        output_dim = self.output_shape[0]

        out_sub_graph = self.build_sub_graph(source, self.num_layers)

        if self.mode == "regression":
            output = ConstantNode(op=Dense(output_dim))
            self.connect(out_sub_graph, output)
        else:
            output = ConstantNode(
                op=Dense(output_dim, activation="softmax")
            )  # One-hot encoding
            self.connect(out_sub_graph, output)

        return self

    def build_sub_graph(self, input_, num_layers=3):
        source = prev_input = input_

        # look for skip connections within a range of the 3 previous nodes
        anchor_points = collections.deque([source], maxlen=3)

        for _ in range(self.num_layers):
            dense = VariableNode()
            self.add_dense_to_(dense)
            self.connect(prev_input, dense)
            x = dense

            dropout = VariableNode()
            self.add_dropout_to_(dropout)
            self.connect(x, dropout)
            x = dropout

            add = ConstantNode()
            add.set_op(AddByProjecting(self, [x], activation="relu"))

            for anchor in anchor_points:
                skipco = VariableNode()
                skipco.add_op(Zero())
                skipco.add_op(Connect(self, anchor))
                self.connect(skipco, add)

            prev_input = add

            # ! for next iter
            anchor_points.append(prev_input)

        return prev_input

    def add_dense_to_(self, node):
        node.add_op(Identity())  # we do not want to create a layer in this case
        for units in range(16, 16 * 16 + 1, 16):
            for activation in ACTIVATIONS:
                node.add_op(Dense(units=units, activation=activation))

    def add_dropout_to_(self, node):
        a, b = 1e-3, 0.4
        node.add_op(Identity())
        dropout_range = np.exp(np.linspace(np.log(a), np.log(b), 10))  # log-spaced dropout rates
        for rate in dropout_range:
            node.add_op(Dropout(rate))
A KSearchSpace has some useful methods, such as:
- space.sample(choice), which returns a random model from the search space if choice == None, or generates the model corresponding to the given choice otherwise (see the sketch after this list).
- space.choices(), which returns the list of discrete dimensions corresponding to the search space.
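For instance, here is a minimal usage sketch, assuming space is built as in the cell below; the (low, high) bounds correspond to the Choices output shown there:
choices = space.choices()                   # one (low, high) bound per variable node
choice = [low for (low, high) in choices]   # always pick the first operation
model = space.sample(choice)                # deterministic: same choice, same model
random_model = space.sample()               # choice == None: random architecture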
Let us visualize a few randomly sampled neural architecture from this search space.
[3]:
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
from tensorflow.keras.utils import plot_model

shapes = dict(input_shape=(10,), output_shape=(1,))
space = ResNetMLPSpace(**shapes).build()

print("Choices: ", space.choices())

images = []
plt.figure(figsize=(15, 15))
for i in range(4):
    plt.subplot(2, 2, i + 1)
    model = space.sample()
    plot_model(model, "random_model.png", show_shapes=False, show_layer_names=False)
    image = mpimg.imread("random_model.png")
    plt.imshow(image)
    plt.axis('off')
plt.show()
Choices: [(0, 176), (0, 10), (0, 1), (0, 176), (0, 10), (0, 1), (0, 1), (0, 176), (0, 10), (0, 1), (0, 1), (0, 1)]

6.3. Create a problem instance#
Let us define the neural architecture search problem.
[4]:
from deephyper.problem import NaProblem
from deephyper.nas.preprocessing import minmaxstdscaler

# Create a Neural Architecture problem
problem = NaProblem()

# Link the load-data function
problem.load_data(load_data)

# The function passed to preprocessing has to return
# a scikit-learn like preprocessor.
problem.preprocessing(minmaxstdscaler)

# Link the defined search space
problem.search_space(ResNetMLPSpace)

# Fixed hyperparameters for all trained models
problem.hyperparameters(
    batch_size=32,
    learning_rate=0.01,
    optimizer="adam",
    epsilon=1e-7,
    num_epochs=20,
    callbacks=dict(
        EarlyStopping=dict(
            monitor="val_r2", mode="max", verbose=0, patience=5
        )
    ),
)

# Define the optimized loss (it can also be a function)
problem.loss("mse")

# Define metrics to compute for each training and validation epoch
problem.metrics(["r2"])

# Define the maximised objective
problem.objective("val_r2__last")

problem
[4]:
Problem is:
- search space : __main__.ResNetMLPSpace
- data loading : __main__.load_data
- preprocessing : deephyper.nas.preprocessing._base.minmaxstdscaler
- hyperparameters:
* verbose: 0
* batch_size: 32
* learning_rate: 0.01
* optimizer: adam
* num_epochs: 20
* callbacks: {'EarlyStopping': {'monitor': 'val_r2', 'mode': 'max', 'verbose': 0, 'patience': 5}}
- loss : mse
- metrics :
* r2
- objective : val_r2__last
Find more about NaProblem settings in the Problem documentation.
Tip
Adding an EarlyStopping(...) callback is a good idea to stop the training of your model as soon as it stops improving.
...
EarlyStopping=dict(monitor="val_r2", mode="max", verbose=0, patience=5)
...
6.4. Running the search#
Create an Evaluator object using the ray backend to distribute the evaluation of the run-function. For neural architecture search, DeepHyper provides the run_base_trainer function, which automates the training process of a sampled model.
[5]:
import multiprocessing

num_cpus = multiprocessing.cpu_count()
print(f"{num_cpus} CPU{'s' if num_cpus > 1 else ''} are available on this system.")

from deephyper.evaluator import Evaluator
from deephyper.evaluator.callback import TqdmCallback
from deephyper.nas.run import run_base_trainer

evaluator = Evaluator.create(
    run_base_trainer,
    method="ray",
    method_kwargs={
        # Start a new Ray server
        "address": None,
        # Defines the number of available CPUs
        "num_cpus": min(4, num_cpus),
        # Defines the number of CPUs for each task
        "num_cpus_per_task": 1,
        "callbacks": [TqdmCallback()],
    },
)

print("Number of workers: ", evaluator.num_workers)
print("Number of workers: ", evaluator.num_workers)
10 CPUs are available on this system.
/Users/romainegele/Documents/Argonne/deephyper/deephyper/evaluator/_evaluator.py:126: UserWarning: Applying nest-asyncio patch for IPython Shell!
warnings.warn(
2023-01-30 18:08:58,502 INFO worker.py:1518 -- Started a local Ray instance.
Number of workers: 4
Tip
If executed locally, you can open the Ray dashboard at an address like http://127.0.0.1:port in a browser to monitor the CPU usage of the execution.
Finally, you can define a random search, provided by the Random class, and link the previously defined problem and evaluator to it.
[6]:
from deephyper.search.nas import Random
search = Random(problem, evaluator)
[7]:
results = search.search(10)
(run_base_trainer pid=45436) 2023-01-30 18:09:05.625566: W tensorflow/core/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz
(run_base_trainer pid=45437) 2023-01-30 18:09:05.625456: W tensorflow/core/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz
(run_base_trainer pid=45438) 2023-01-30 18:09:05.631501: W tensorflow/core/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz
(run_base_trainer pid=45439) 2023-01-30 18:09:05.642482: W tensorflow/core/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz
After the search is over, you will find the following files in your current folder:
results.csv
save/
Let us visualize the training of our models. First, we need to load the training history of each model; the histories are located in save/history:
[8]:
import os
import json

histories = [os.path.join("save/history", f) for f in os.listdir("save/history/") if ".json" in f]

for i, fpath in enumerate(histories):
    with open(fpath, "r") as fd:
        histories[i] = json.load(fd)

print(list(histories[0].keys()))
['n_parameters', 'training_time', 'loss', 'r2', 'val_loss', 'val_r2']
[9]:
plt.figure()
for h in histories:
    plt.plot(h["val_r2"])
plt.ylabel("Validation $R^2$")
plt.xlabel("Epochs")
plt.show()

Once the search is over, a file named results.csv is saved in the current directory. The same dataframe is returned by the search.search(...) call. It contains the configurations evaluated during the search together with their corresponding objective value (i.e., the validation $R^2$); timestamp_submit, the time at which the evaluator submitted the configuration to be evaluated; and timestamp_gather, the time at which the evaluator received the evaluated configuration (both are relative to the creation of the Evaluator instance). Each neural architecture is encoded as a list of discrete decision variables called arch_seq.
[10]:
results
[10]:
| | p:arch_seq | objective | job_id | m:timestamp_submit | m:timestamp_gather |
|---|---|---|---|---|---|
| 0 | [1, 7, 1, 75, 7, 0, 0, 87, 5, 1, 0, 0] | 0.968497 | 1 | 5.663778 | 11.017621 |
| 1 | [17, 8, 1, 50, 3, 0, 1, 40, 3, 0, 0, 1] | 0.950744 | 0 | 5.663629 | 11.263356 |
| 2 | [89, 9, 1, 92, 9, 0, 1, 142, 9, 1, 0, 1] | 0.975879 | 2 | 5.663916 | 14.272947 |
| 3 | [133, 7, 0, 148, 10, 0, 1, 10, 7, 0, 1, 0] | 0.924481 | 3 | 5.664056 | 14.722977 |
| 4 | [1, 6, 0, 151, 4, 1, 0, 80, 4, 1, 0, 1] | 0.936412 | 4 | 11.035176 | 14.983041 |
| 5 | [127, 1, 0, 168, 9, 1, 0, 131, 5, 1, 0, 1] | 0.965255 | 5 | 11.264783 | 15.918054 |
| 6 | [103, 3, 1, 170, 6, 0, 0, 144, 10, 1, 0, 0] | 0.965917 | 9 | 15.919445 | 19.091425 |
| 7 | [20, 4, 0, 123, 0, 1, 1, 60, 10, 1, 1, 0] | 0.937058 | 8 | 14.984620 | 19.759570 |
| 8 | [84, 1, 1, 112, 1, 0, 0, 165, 2, 0, 0, 0] | 0.930200 | 7 | 14.724561 | 20.370351 |
| 9 | [175, 10, 0, 148, 6, 1, 1, 148, 10, 1, 1, 0] | 0.897664 | 6 | 14.274681 | 21.072806 |
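Since both timestamps are relative to the creation of the Evaluator, the search trajectory can be sketched directly from these columns. A minimal sketch, assuming the results dataframe and the matplotlib import from above:
plt.figure()
# Plot each evaluation's objective against the time it was gathered.
plt.scatter(results["m:timestamp_gather"], results["objective"])
plt.xlabel("Time (s.)")
plt.ylabel("Objective (validation $R^2$)")
plt.show()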
The deephyper-analytics command line tool is one way of analyzing this type of file. For example, to output the best configurations we can use its topk functionality; the same result can be obtained with pandas:
[12]:
results.nlargest(n=3, columns="objective")
[12]:
| | p:arch_seq | objective | job_id | m:timestamp_submit | m:timestamp_gather |
|---|---|---|---|---|---|
| 2 | [89, 9, 1, 92, 9, 0, 1, 142, 9, 1, 0, 1] | 0.975879 | 2 | 5.663916 | 14.272947 |
| 0 | [1, 7, 1, 75, 7, 0, 0, 87, 5, 1, 0, 0] | 0.968497 | 1 | 5.663778 | 11.017621 |
| 6 | [103, 3, 1, 170, 6, 0, 0, 144, 10, 1, 0, 0] | 0.965917 | 9 | 15.919445 | 19.091425 |
Each architecture is described as a vector of scalar values named arch_seq; each scalar value represents the operation chosen for one of the variable nodes of our search space.
6.5. Testing the best configuration#
We can visualize the architecture of the best configuration:
[13]:
best_config = results.iloc[results.objective.argmax()][:-2].to_dict()
arch_seq = json.loads(best_config["p:arch_seq"])
model = space.sample(arch_seq)
plot_model(model, show_shapes=False, show_layer_names=False)
[13]:
