2. Hyperparameter search for classification with Tabular data (Keras)#
In this tutorial we present how to use hyperparameter optimization on a basic example from the Keras documentation.
Reference: This tutorial is based on materials from the Keras Documentation: Structured data classification from scratch
Let us start with installing DeepHyper!
Warning
This tutorial should be run with tensorflow>=2.6.
[ ]:
!pip install deephyper
!pip install ray
Collecting deephyper
Downloading deephyper-0.3.3-py2.py3-none-any.whl (962 kB)
|████████████████████████████████| 962 kB 4.2 MB/s
Requirement already satisfied: networkx in /usr/local/lib/python3.7/dist-packages (from deephyper) (2.6.3)
Requirement already satisfied: pydot in /usr/local/lib/python3.7/dist-packages (from deephyper) (1.3.0)
Requirement already satisfied: pandas>=0.24.2 in /usr/local/lib/python3.7/dist-packages (from deephyper) (1.1.5)
Requirement already satisfied: typeguard in /usr/local/lib/python3.7/dist-packages (from deephyper) (2.7.1)
Collecting openml==0.10.2
Downloading openml-0.10.2.tar.gz (158 kB)
|████████████████████████████████| 158 kB 42.2 MB/s
Requirement already satisfied: statsmodels in /usr/local/lib/python3.7/dist-packages (from deephyper) (0.10.2)
Collecting dh-scikit-optimize==0.9.4
Downloading dh_scikit_optimize-0.9.4-py2.py3-none-any.whl (102 kB)
|████████████████████████████████| 102 kB 10.5 MB/s
Requirement already satisfied: tqdm in /usr/local/lib/python3.7/dist-packages (from deephyper) (4.62.3)
Collecting ray[default]>=1.3.0
Downloading ray-1.8.0-cp37-cp37m-manylinux2014_x86_64.whl (54.7 MB)
|████████████████████████████████| 54.7 MB 26 kB/s
Requirement already satisfied: numpy in /usr/local/lib/python3.7/dist-packages (from deephyper) (1.19.5)
Requirement already satisfied: joblib>=0.10.3 in /usr/local/lib/python3.7/dist-packages (from deephyper) (1.1.0)
Requirement already satisfied: Jinja2 in /usr/local/lib/python3.7/dist-packages (from deephyper) (2.11.3)
Requirement already satisfied: scikit-learn>=0.23.1 in /usr/local/lib/python3.7/dist-packages (from deephyper) (1.0.1)
Requirement already satisfied: matplotlib>=3.0.3 in /usr/local/lib/python3.7/dist-packages (from deephyper) (3.2.2)
Requirement already satisfied: tensorflow>=2.0.0 in /usr/local/lib/python3.7/dist-packages (from deephyper) (2.7.0)
Collecting ConfigSpace>=0.4.18
Downloading ConfigSpace-0.4.20-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.2 MB)
|████████████████████████████████| 4.2 MB 49.7 MB/s
Requirement already satisfied: tensorflow-probability in /usr/local/lib/python3.7/dist-packages (from deephyper) (0.14.1)
Requirement already satisfied: xgboost in /usr/local/lib/python3.7/dist-packages (from deephyper) (0.90)
Requirement already satisfied: scipy>=0.19.1 in /usr/local/lib/python3.7/dist-packages (from dh-scikit-optimize==0.9.4->deephyper) (1.4.1)
Collecting pyaml>=16.9
Downloading pyaml-21.10.1-py2.py3-none-any.whl (24 kB)
Collecting liac-arff>=2.4.0
Downloading liac-arff-2.5.0.tar.gz (13 kB)
Collecting xmltodict
Downloading xmltodict-0.12.0-py2.py3-none-any.whl (9.2 kB)
Requirement already satisfied: requests in /usr/local/lib/python3.7/dist-packages (from openml==0.10.2->deephyper) (2.23.0)
Requirement already satisfied: python-dateutil in /usr/local/lib/python3.7/dist-packages (from openml==0.10.2->deephyper) (2.8.2)
Requirement already satisfied: cython in /usr/local/lib/python3.7/dist-packages (from ConfigSpace>=0.4.18->deephyper) (0.29.24)
Requirement already satisfied: pyparsing in /usr/local/lib/python3.7/dist-packages (from ConfigSpace>=0.4.18->deephyper) (2.4.7)
Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib>=3.0.3->deephyper) (1.3.2)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.7/dist-packages (from matplotlib>=3.0.3->deephyper) (0.11.0)
Requirement already satisfied: pytz>=2017.2 in /usr/local/lib/python3.7/dist-packages (from pandas>=0.24.2->deephyper) (2018.9)
Requirement already satisfied: PyYAML in /usr/local/lib/python3.7/dist-packages (from pyaml>=16.9->dh-scikit-optimize==0.9.4->deephyper) (3.13)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.7/dist-packages (from python-dateutil->openml==0.10.2->deephyper) (1.15.0)
Requirement already satisfied: grpcio>=1.28.1 in /usr/local/lib/python3.7/dist-packages (from ray[default]>=1.3.0->deephyper) (1.41.1)
Collecting redis>=3.5.0
Downloading redis-4.0.0-py3-none-any.whl (118 kB)
|████████████████████████████████| 118 kB 66.6 MB/s
Requirement already satisfied: msgpack<2.0.0,>=1.0.0 in /usr/local/lib/python3.7/dist-packages (from ray[default]>=1.3.0->deephyper) (1.0.2)
Requirement already satisfied: protobuf>=3.15.3 in /usr/local/lib/python3.7/dist-packages (from ray[default]>=1.3.0->deephyper) (3.17.3)
Requirement already satisfied: attrs in /usr/local/lib/python3.7/dist-packages (from ray[default]>=1.3.0->deephyper) (21.2.0)
Requirement already satisfied: jsonschema in /usr/local/lib/python3.7/dist-packages (from ray[default]>=1.3.0->deephyper) (2.6.0)
Requirement already satisfied: filelock in /usr/local/lib/python3.7/dist-packages (from ray[default]>=1.3.0->deephyper) (3.3.2)
Requirement already satisfied: click>=7.0 in /usr/local/lib/python3.7/dist-packages (from ray[default]>=1.3.0->deephyper) (7.1.2)
Collecting opencensus
Downloading opencensus-0.8.0-py2.py3-none-any.whl (128 kB)
|████████████████████████████████| 128 kB 62.6 MB/s
Collecting aiohttp>=3.7
Downloading aiohttp-3.8.1-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (1.1 MB)
|████████████████████████████████| 1.1 MB 46.2 MB/s
Collecting gpustat>=1.0.0b1
Downloading gpustat-1.0.0b1.tar.gz (82 kB)
|████████████████████████████████| 82 kB 215 kB/s
Requirement already satisfied: prometheus-client>=0.7.1 in /usr/local/lib/python3.7/dist-packages (from ray[default]>=1.3.0->deephyper) (0.12.0)
Collecting colorful
Downloading colorful-0.5.4-py2.py3-none-any.whl (201 kB)
|████████████████████████████████| 201 kB 55.2 MB/s
Collecting aiohttp-cors
Downloading aiohttp_cors-0.7.0-py3-none-any.whl (27 kB)
Collecting aioredis<2
Downloading aioredis-1.3.1-py3-none-any.whl (65 kB)
|████████████████████████████████| 65 kB 3.4 MB/s
Collecting py-spy>=0.2.0
Downloading py_spy-0.3.11-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl (3.0 MB)
|████████████████████████████████| 3.0 MB 59.0 MB/s
Collecting async-timeout<5.0,>=4.0.0a3
Downloading async_timeout-4.0.1-py3-none-any.whl (5.7 kB)
Requirement already satisfied: charset-normalizer<3.0,>=2.0 in /usr/local/lib/python3.7/dist-packages (from aiohttp>=3.7->ray[default]>=1.3.0->deephyper) (2.0.7)
Requirement already satisfied: typing-extensions>=3.7.4 in /usr/local/lib/python3.7/dist-packages (from aiohttp>=3.7->ray[default]>=1.3.0->deephyper) (3.10.0.2)
Collecting frozenlist>=1.1.1
Downloading frozenlist-1.2.0-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (192 kB)
|████████████████████████████████| 192 kB 45.3 MB/s
Collecting yarl<2.0,>=1.0
Downloading yarl-1.7.2-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (271 kB)
|████████████████████████████████| 271 kB 66.4 MB/s
Collecting multidict<7.0,>=4.5
Downloading multidict-5.2.0-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (160 kB)
|████████████████████████████████| 160 kB 63.0 MB/s
Collecting asynctest==0.13.0
Downloading asynctest-0.13.0-py3-none-any.whl (26 kB)
Collecting aiosignal>=1.1.2
Downloading aiosignal-1.2.0-py3-none-any.whl (8.2 kB)
Collecting hiredis
Downloading hiredis-2.0.0-cp37-cp37m-manylinux2010_x86_64.whl (85 kB)
|████████████████████████████████| 85 kB 3.4 MB/s
Requirement already satisfied: nvidia-ml-py3>=7.352.0 in /usr/local/lib/python3.7/dist-packages (from gpustat>=1.0.0b1->ray[default]>=1.3.0->deephyper) (7.352.0)
Requirement already satisfied: psutil in /usr/local/lib/python3.7/dist-packages (from gpustat>=1.0.0b1->ray[default]>=1.3.0->deephyper) (5.4.8)
Collecting blessed>=1.17.1
Downloading blessed-1.19.0-py2.py3-none-any.whl (57 kB)
|████████████████████████████████| 57 kB 5.2 MB/s
Requirement already satisfied: wcwidth>=0.1.4 in /usr/local/lib/python3.7/dist-packages (from blessed>=1.17.1->gpustat>=1.0.0b1->ray[default]>=1.3.0->deephyper) (0.2.5)
Collecting deprecated
Downloading Deprecated-1.2.13-py2.py3-none-any.whl (9.6 kB)
Requirement already satisfied: threadpoolctl>=2.0.0 in /usr/local/lib/python3.7/dist-packages (from scikit-learn>=0.23.1->deephyper) (3.0.0)
Requirement already satisfied: keras-preprocessing>=1.1.1 in /usr/local/lib/python3.7/dist-packages (from tensorflow>=2.0.0->deephyper) (1.1.2)
Requirement already satisfied: libclang>=9.0.1 in /usr/local/lib/python3.7/dist-packages (from tensorflow>=2.0.0->deephyper) (12.0.0)
Requirement already satisfied: google-pasta>=0.1.1 in /usr/local/lib/python3.7/dist-packages (from tensorflow>=2.0.0->deephyper) (0.2.0)
Requirement already satisfied: tensorflow-io-gcs-filesystem>=0.21.0 in /usr/local/lib/python3.7/dist-packages (from tensorflow>=2.0.0->deephyper) (0.22.0)
Requirement already satisfied: keras<2.8,>=2.7.0rc0 in /usr/local/lib/python3.7/dist-packages (from tensorflow>=2.0.0->deephyper) (2.7.0)
Requirement already satisfied: tensorboard~=2.6 in /usr/local/lib/python3.7/dist-packages (from tensorflow>=2.0.0->deephyper) (2.7.0)
Requirement already satisfied: flatbuffers<3.0,>=1.12 in /usr/local/lib/python3.7/dist-packages (from tensorflow>=2.0.0->deephyper) (2.0)
Requirement already satisfied: tensorflow-estimator<2.8,~=2.7.0rc0 in /usr/local/lib/python3.7/dist-packages (from tensorflow>=2.0.0->deephyper) (2.7.0)
Requirement already satisfied: h5py>=2.9.0 in /usr/local/lib/python3.7/dist-packages (from tensorflow>=2.0.0->deephyper) (3.1.0)
Requirement already satisfied: wheel<1.0,>=0.32.0 in /usr/local/lib/python3.7/dist-packages (from tensorflow>=2.0.0->deephyper) (0.37.0)
Requirement already satisfied: opt-einsum>=2.3.2 in /usr/local/lib/python3.7/dist-packages (from tensorflow>=2.0.0->deephyper) (3.3.0)
Requirement already satisfied: gast<0.5.0,>=0.2.1 in /usr/local/lib/python3.7/dist-packages (from tensorflow>=2.0.0->deephyper) (0.4.0)
Requirement already satisfied: astunparse>=1.6.0 in /usr/local/lib/python3.7/dist-packages (from tensorflow>=2.0.0->deephyper) (1.6.3)
Requirement already satisfied: absl-py>=0.4.0 in /usr/local/lib/python3.7/dist-packages (from tensorflow>=2.0.0->deephyper) (0.12.0)
Requirement already satisfied: wrapt>=1.11.0 in /usr/local/lib/python3.7/dist-packages (from tensorflow>=2.0.0->deephyper) (1.13.3)
Requirement already satisfied: termcolor>=1.1.0 in /usr/local/lib/python3.7/dist-packages (from tensorflow>=2.0.0->deephyper) (1.1.0)
Requirement already satisfied: cached-property in /usr/local/lib/python3.7/dist-packages (from h5py>=2.9.0->tensorflow>=2.0.0->deephyper) (1.5.2)
Requirement already satisfied: tensorboard-plugin-wit>=1.6.0 in /usr/local/lib/python3.7/dist-packages (from tensorboard~=2.6->tensorflow>=2.0.0->deephyper) (1.8.0)
Requirement already satisfied: markdown>=2.6.8 in /usr/local/lib/python3.7/dist-packages (from tensorboard~=2.6->tensorflow>=2.0.0->deephyper) (3.3.4)
Requirement already satisfied: setuptools>=41.0.0 in /usr/local/lib/python3.7/dist-packages (from tensorboard~=2.6->tensorflow>=2.0.0->deephyper) (57.4.0)
Requirement already satisfied: google-auth<3,>=1.6.3 in /usr/local/lib/python3.7/dist-packages (from tensorboard~=2.6->tensorflow>=2.0.0->deephyper) (1.35.0)
Requirement already satisfied: tensorboard-data-server<0.7.0,>=0.6.0 in /usr/local/lib/python3.7/dist-packages (from tensorboard~=2.6->tensorflow>=2.0.0->deephyper) (0.6.1)
Requirement already satisfied: werkzeug>=0.11.15 in /usr/local/lib/python3.7/dist-packages (from tensorboard~=2.6->tensorflow>=2.0.0->deephyper) (1.0.1)
Requirement already satisfied: google-auth-oauthlib<0.5,>=0.4.1 in /usr/local/lib/python3.7/dist-packages (from tensorboard~=2.6->tensorflow>=2.0.0->deephyper) (0.4.6)
Requirement already satisfied: rsa<5,>=3.1.4 in /usr/local/lib/python3.7/dist-packages (from google-auth<3,>=1.6.3->tensorboard~=2.6->tensorflow>=2.0.0->deephyper) (4.7.2)
Requirement already satisfied: cachetools<5.0,>=2.0.0 in /usr/local/lib/python3.7/dist-packages (from google-auth<3,>=1.6.3->tensorboard~=2.6->tensorflow>=2.0.0->deephyper) (4.2.4)
Requirement already satisfied: pyasn1-modules>=0.2.1 in /usr/local/lib/python3.7/dist-packages (from google-auth<3,>=1.6.3->tensorboard~=2.6->tensorflow>=2.0.0->deephyper) (0.2.8)
Requirement already satisfied: requests-oauthlib>=0.7.0 in /usr/local/lib/python3.7/dist-packages (from google-auth-oauthlib<0.5,>=0.4.1->tensorboard~=2.6->tensorflow>=2.0.0->deephyper) (1.3.0)
Requirement already satisfied: importlib-metadata in /usr/local/lib/python3.7/dist-packages (from markdown>=2.6.8->tensorboard~=2.6->tensorflow>=2.0.0->deephyper) (4.8.2)
Requirement already satisfied: pyasn1<0.5.0,>=0.4.6 in /usr/local/lib/python3.7/dist-packages (from pyasn1-modules>=0.2.1->google-auth<3,>=1.6.3->tensorboard~=2.6->tensorflow>=2.0.0->deephyper) (0.4.8)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/local/lib/python3.7/dist-packages (from requests->openml==0.10.2->deephyper) (1.24.3)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.7/dist-packages (from requests->openml==0.10.2->deephyper) (2021.10.8)
Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.7/dist-packages (from requests->openml==0.10.2->deephyper) (2.10)
Requirement already satisfied: chardet<4,>=3.0.2 in /usr/local/lib/python3.7/dist-packages (from requests->openml==0.10.2->deephyper) (3.0.4)
Requirement already satisfied: oauthlib>=3.0.0 in /usr/local/lib/python3.7/dist-packages (from requests-oauthlib>=0.7.0->google-auth-oauthlib<0.5,>=0.4.1->tensorboard~=2.6->tensorflow>=2.0.0->deephyper) (3.1.1)
Requirement already satisfied: zipp>=0.5 in /usr/local/lib/python3.7/dist-packages (from importlib-metadata->markdown>=2.6.8->tensorboard~=2.6->tensorflow>=2.0.0->deephyper) (3.6.0)
Requirement already satisfied: MarkupSafe>=0.23 in /usr/local/lib/python3.7/dist-packages (from Jinja2->deephyper) (2.0.1)
Collecting opencensus-context==0.1.2
Downloading opencensus_context-0.1.2-py2.py3-none-any.whl (4.4 kB)
Requirement already satisfied: google-api-core<3.0.0,>=1.0.0 in /usr/local/lib/python3.7/dist-packages (from opencensus->ray[default]>=1.3.0->deephyper) (1.26.3)
Requirement already satisfied: packaging>=14.3 in /usr/local/lib/python3.7/dist-packages (from google-api-core<3.0.0,>=1.0.0->opencensus->ray[default]>=1.3.0->deephyper) (21.2)
Requirement already satisfied: googleapis-common-protos<2.0dev,>=1.6.0 in /usr/local/lib/python3.7/dist-packages (from google-api-core<3.0.0,>=1.0.0->opencensus->ray[default]>=1.3.0->deephyper) (1.53.0)
Requirement already satisfied: patsy>=0.4.0 in /usr/local/lib/python3.7/dist-packages (from statsmodels->deephyper) (0.5.2)
Requirement already satisfied: cloudpickle>=1.3 in /usr/local/lib/python3.7/dist-packages (from tensorflow-probability->deephyper) (1.3.0)
Requirement already satisfied: dm-tree in /usr/local/lib/python3.7/dist-packages (from tensorflow-probability->deephyper) (0.1.6)
Requirement already satisfied: decorator in /usr/local/lib/python3.7/dist-packages (from tensorflow-probability->deephyper) (4.4.2)
Building wheels for collected packages: openml, liac-arff, gpustat
Building wheel for openml (setup.py) ... done
Created wheel for openml: filename=openml-0.10.2-py3-none-any.whl size=190318 sha256=6985f53d704e157f7c05a7ec1c98b7c65fe0982b15c99d162abed0823233d3d0
Stored in directory: /root/.cache/pip/wheels/9c/9e/f3/6a5ebf16527d7fe22d9bc1652bc9beb5dc9fcfdeb75e805400
Building wheel for liac-arff (setup.py) ... done
Created wheel for liac-arff: filename=liac_arff-2.5.0-py3-none-any.whl size=11731 sha256=5e23e06296998edfab256b372aaa4c035fb1ece982c6e339accb0a190947d8b8
Stored in directory: /root/.cache/pip/wheels/1f/0f/15/332ca86cbebf25ddf98518caaf887945fbe1712b97a0f2493b
Building wheel for gpustat (setup.py) ... done
Created wheel for gpustat: filename=gpustat-1.0.0b1-py3-none-any.whl size=15979 sha256=c62f08af2268980732240c1225196924a2ac59b3199eae3c8e4974be96b21896
Stored in directory: /root/.cache/pip/wheels/1a/16/e2/3e2437fba4c4b6a97a97bd96fce5d14e66cff5c4966fb1cc8c
Successfully built openml liac-arff gpustat
Installing collected packages: multidict, frozenlist, yarl, deprecated, asynctest, async-timeout, aiosignal, redis, opencensus-context, hiredis, blessed, aiohttp, xmltodict, ray, pyaml, py-spy, opencensus, liac-arff, gpustat, colorful, aioredis, aiohttp-cors, openml, dh-scikit-optimize, ConfigSpace, deephyper
Successfully installed ConfigSpace-0.4.20 aiohttp-3.8.1 aiohttp-cors-0.7.0 aioredis-1.3.1 aiosignal-1.2.0 async-timeout-4.0.1 asynctest-0.13.0 blessed-1.19.0 colorful-0.5.4 deephyper-0.3.3 deprecated-1.2.13 dh-scikit-optimize-0.9.4 frozenlist-1.2.0 gpustat-1.0.0b1 hiredis-2.0.0 liac-arff-2.5.0 multidict-5.2.0 opencensus-0.8.0 opencensus-context-0.1.2 openml-0.10.2 py-spy-0.3.11 pyaml-21.10.1 ray-1.8.0 redis-4.0.0 xmltodict-0.12.0 yarl-1.7.2
Note
The following environment variables can be used to suppress some TensorFlow DEBUG, INFO and WARNING log messages.
[1]:
import os
os.environ["TF_CPP_MIN_LOG_LEVEL"] = str(3)
os.environ["AUTOGRAPH_VERBOSITY"] = str(0)
2.1. Imports#
Warning
It is important to follow the import tensorflow as tf import strategy: it prevents serialization errors that would crash the search.
The import strategy from the original Keras tutorial (shown below),
from tensorflow import keras
from tensorflow.keras import layers
...
from tensorflow.keras.layers import IntegerLookup
from tensorflow.keras.layers import Normalization
from tensorflow.keras.layers import StringLookup
resulted in non-serializable data, preventing the search from executing.
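For reference, here is a minimal sketch of the recommended single-import style used throughout this tutorial, where every Keras symbol is reached through the tf namespace:
import tensorflow as tf

# Layers are referenced as tf.keras.layers.<Name> instead of being imported directly.
normalizer = tf.keras.layers.Normalization()
lookup = tf.keras.layers.StringLookup(output_mode="binary")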
[2]:
import pandas as pd
import tensorflow as tf
Note
The following code detects whether GPU devices are available on the current host, so that this notebook can automatically adapt its parallel execution to the locally available resources. Note that this simple check does not detect resources on other nodes.
[3]:
from tensorflow.python.client import device_lib
def get_available_gpus():
local_device_protos = device_lib.list_local_devices()
return [x.name for x in local_device_protos if x.device_type == "GPU"]
n_gpus = len(get_available_gpus())
if n_gpus > 1:
n_gpus -= 1
is_gpu_available = n_gpus > 0
if is_gpu_available:
print(f"{n_gpus} GPU{'s are' if n_gpus > 1 else ' is'} available.")
else:
print("No GPU available")
No GPU available
2.2. The dataset (from Keras.io)#
The dataset is provided by the Cleveland Clinic Foundation for Heart Disease. It’s a CSV file with 303 rows. Each row contains information about a patient (a sample), and each column describes an attribute of the patient (a feature). We use the features to predict whether a patient has a heart disease (binary classification).
Here’s the description of each feature:
Column | Description | Feature Type
---|---|---
Age | Age in years | Numerical
Sex | (1 = male; 0 = female) | Categorical
CP | Chest pain type (0, 1, 2, 3, 4) | Categorical
Trestbpd | Resting blood pressure (in mm Hg on admission) | Numerical
Chol | Serum cholesterol in mg/dl | Numerical
FBS | Fasting blood sugar in 120 mg/dl (1 = true; 0 = false) | Categorical
RestECG | Resting electrocardiogram results (0, 1, 2) | Categorical
Thalach | Maximum heart rate achieved | Numerical
Exang | Exercise induced angina (1 = yes; 0 = no) | Categorical
Oldpeak | ST depression induced by exercise relative to rest | Numerical
Slope | Slope of the peak exercise ST segment | Numerical
CA | Number of major vessels (0-3) colored by fluoroscopy | Both numerical & categorical
Thal | 3 = normal; 6 = fixed defect; 7 = reversible defect | Categorical
Target | Diagnosis of heart disease (1 = true; 0 = false) | Target
[4]:
def load_data():
file_url = "http://storage.googleapis.com/download.tensorflow.org/data/heart.csv"
dataframe = pd.read_csv(file_url)
val_dataframe = dataframe.sample(frac=0.2, random_state=1337)
train_dataframe = dataframe.drop(val_dataframe.index)
return train_dataframe, val_dataframe
def dataframe_to_dataset(dataframe):
dataframe = dataframe.copy()
labels = dataframe.pop("target")
ds = tf.data.Dataset.from_tensor_slices((dict(dataframe), labels))
ds = ds.shuffle(buffer_size=len(dataframe))
return ds
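As a quick sanity check, these two helpers can be combined to build batched training and validation datasets; a minimal sketch of their intended use (the batch size of 32 is an arbitrary choice for illustration):
train_dataframe, val_dataframe = load_data()
train_ds = dataframe_to_dataset(train_dataframe).batch(32)
val_ds = dataframe_to_dataset(val_dataframe).batch(32)
print(train_ds.element_spec)  # a dict of feature tensors paired with a label tensor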
2.3. Preprocessing & encoding of features#
The next cells use tf.keras.layers.Normalization() to apply standard scaling to the numerical features. Then, tf.keras.layers.StringLookup and tf.keras.layers.IntegerLookup are used to encode the categorical variables.
[5]:
def encode_numerical_feature(feature, name, dataset):
# Create a Normalization layer for our feature
normalizer = tf.keras.layers.Normalization()
# Prepare a Dataset that only yields our feature
feature_ds = dataset.map(lambda x, y: x[name])
feature_ds = feature_ds.map(lambda x: tf.expand_dims(x, -1))
# Learn the statistics of the data
normalizer.adapt(feature_ds)
# Normalize the input feature
encoded_feature = normalizer(feature)
return encoded_feature
def encode_categorical_feature(feature, name, dataset, is_string):
lookup_class = (
tf.keras.layers.StringLookup if is_string else tf.keras.layers.IntegerLookup
)
# Create a lookup layer which will turn strings into integer indices
lookup = lookup_class(output_mode="binary")
# Prepare a Dataset that only yields our feature
feature_ds = dataset.map(lambda x, y: x[name])
feature_ds = feature_ds.map(lambda x: tf.expand_dims(x, -1))
# Learn the set of possible string values and assign them a fixed integer index
lookup.adapt(feature_ds)
# Turn the string input into integer indices
encoded_feature = lookup(feature)
return encoded_feature
2.4. Define the run-function#
The run-function defines how the objective that we want to maximize is computed. It takes a config dictionary as input and returns a scalar value that we want to maximize. The config contains a sampled value for each hyperparameter that we want to tune. In this example we will search for:
- units (default value: 32)
- activation (default value: "relu")
- dropout_rate (default value: 0.5)
- num_epochs (default value: 50)
- batch_size (default value: 32)
- learning_rate (default value: 1e-3)
A hyperparameter value can be accessed easily in the dictionary through the corresponding key, for example config["units"].
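To make this contract concrete, here is a toy sketch of a run-function; the objective below is made up for illustration and is not used in this tutorial:
def run_sketch(config: dict) -> float:
    # Hyperparameter values are read from the dictionary by their key.
    units = config["units"]
    learning_rate = config["learning_rate"]
    # A made-up scalar objective that DeepHyper would try to maximize.
    return -abs(units - 64) * learning_rate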
[6]:
def run(config: dict):
tf.autograph.set_verbosity(0)
# Load data and split into validation set
train_dataframe, val_dataframe = load_data()
train_ds = dataframe_to_dataset(train_dataframe)
val_ds = dataframe_to_dataset(val_dataframe)
train_ds = train_ds.batch(config["batch_size"])
val_ds = val_ds.batch(config["batch_size"])
# Categorical features encoded as integers
sex = tf.keras.Input(shape=(1,), name="sex", dtype="int64")
cp = tf.keras.Input(shape=(1,), name="cp", dtype="int64")
fbs = tf.keras.Input(shape=(1,), name="fbs", dtype="int64")
restecg = tf.keras.Input(shape=(1,), name="restecg", dtype="int64")
exang = tf.keras.Input(shape=(1,), name="exang", dtype="int64")
ca = tf.keras.Input(shape=(1,), name="ca", dtype="int64")
# Categorical feature encoded as string
thal = tf.keras.Input(shape=(1,), name="thal", dtype="string")
# Numerical features
age = tf.keras.Input(shape=(1,), name="age")
trestbps = tf.keras.Input(shape=(1,), name="trestbps")
chol = tf.keras.Input(shape=(1,), name="chol")
thalach = tf.keras.Input(shape=(1,), name="thalach")
oldpeak = tf.keras.Input(shape=(1,), name="oldpeak")
slope = tf.keras.Input(shape=(1,), name="slope")
all_inputs = [
sex,
cp,
fbs,
restecg,
exang,
ca,
thal,
age,
trestbps,
chol,
thalach,
oldpeak,
slope,
]
# Integer categorical features
sex_encoded = encode_categorical_feature(sex, "sex", train_ds, False)
cp_encoded = encode_categorical_feature(cp, "cp", train_ds, False)
fbs_encoded = encode_categorical_feature(fbs, "fbs", train_ds, False)
restecg_encoded = encode_categorical_feature(restecg, "restecg", train_ds, False)
exang_encoded = encode_categorical_feature(exang, "exang", train_ds, False)
ca_encoded = encode_categorical_feature(ca, "ca", train_ds, False)
# String categorical features
thal_encoded = encode_categorical_feature(thal, "thal", train_ds, True)
# Numerical features
age_encoded = encode_numerical_feature(age, "age", train_ds)
trestbps_encoded = encode_numerical_feature(trestbps, "trestbps", train_ds)
chol_encoded = encode_numerical_feature(chol, "chol", train_ds)
thalach_encoded = encode_numerical_feature(thalach, "thalach", train_ds)
oldpeak_encoded = encode_numerical_feature(oldpeak, "oldpeak", train_ds)
slope_encoded = encode_numerical_feature(slope, "slope", train_ds)
all_features = tf.keras.layers.concatenate(
[
sex_encoded,
cp_encoded,
fbs_encoded,
restecg_encoded,
exang_encoded,
slope_encoded,
ca_encoded,
thal_encoded,
age_encoded,
trestbps_encoded,
chol_encoded,
thalach_encoded,
oldpeak_encoded,
]
)
x = tf.keras.layers.Dense(config["units"], activation=config["activation"])(
all_features
)
x = tf.keras.layers.Dropout(config["dropout_rate"])(x)
output = tf.keras.layers.Dense(1, activation="sigmoid")(x)
model = tf.keras.Model(all_inputs, output)
optimizer = tf.keras.optimizers.Adam(learning_rate=config["learning_rate"])
model.compile(optimizer, "binary_crossentropy", metrics=["accuracy"])
history = model.fit(
train_ds, epochs=config["num_epochs"], validation_data=val_ds, verbose=0
)
return history.history["val_accuracy"][-1]
The objective maximized by DeepHyper is the scalar value returned by the run-function.
In this tutorial it corresponds to the validation accuracy of the last training epoch, which we retrieve from the History object returned by the model.fit(...) call.
...
history = model.fit(
train_ds, epochs=config["num_epochs"], validation_data=val_ds, verbose=0
)
return history.history["val_accuracy"][-1]
...
Using an objective such as max(history.history['val_accuracy']) can have undesired side effects. For example, the maximum over epochs can be reached by a transient spike in the training curve, selecting a model state that generalizes poorly to new data.
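To illustrate the difference, here is a small self-contained sketch comparing possible objectives computed from a made-up list of per-epoch validation accuracies:
val_acc = [0.70, 0.78, 0.85, 0.81, 0.82]  # made-up per-epoch validation accuracies

last_epoch = val_acc[-1]              # the objective used in this tutorial
best_epoch = max(val_acc)             # rewards the transient spike at 0.85
mean_last_3 = sum(val_acc[-3:]) / 3   # a smoother, more conservative alternative
print(last_epoch, best_epoch, mean_last_3)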
2.5. Define the Hyperparameter optimization problem#
Hyperparameter ranges are defined using the following syntax:
- Discrete integer ranges are generated from a tuple (lower: int, upper: int)
- Continuous parameters are generated from a tuple (lower: float, upper: float)
- Categorical or non-ordinal hyperparameter ranges can be given as a list of possible values [val1, val2, ...]
[7]:
from deephyper.problem import HpProblem
# Creation of an hyperparameter problem
problem = HpProblem()
# Discrete hyperparameter (sampled with uniform prior)
problem.add_hyperparameter((8, 128), "units", default_value=32)
problem.add_hyperparameter((10, 100), "num_epochs", default_value=50)
# Categorical hyperparameter (sampled with uniform prior)
ACTIVATIONS = [
"elu", "gelu", "hard_sigmoid", "linear", "relu", "selu",
"sigmoid", "softplus", "softsign", "swish", "tanh",
]
problem.add_hyperparameter(ACTIVATIONS, "activation", default_value="relu")
# Real hyperparameter (sampled with uniform prior)
problem.add_hyperparameter((0.0, 0.6), "dropout_rate", default_value=0.5)
# Discrete and Real hyperparameters (sampled with log-uniform)
problem.add_hyperparameter((8, 256, "log-uniform"), "batch_size", default_value=32)
problem.add_hyperparameter((1e-5, 1e-2, "log-uniform"), "learning_rate", default_value=1e-3)
problem
[7]:
Configuration space object:
Hyperparameters:
activation, Type: Categorical, Choices: {elu, gelu, hard_sigmoid, linear, relu, selu, sigmoid, softplus, softsign, swish, tanh}, Default: relu
batch_size, Type: UniformInteger, Range: [8, 256], Default: 32, on log-scale
dropout_rate, Type: UniformFloat, Range: [0.0, 0.6], Default: 0.5
learning_rate, Type: UniformFloat, Range: [1e-05, 0.01], Default: 0.001, on log-scale
num_epochs, Type: UniformInteger, Range: [10, 100], Default: 50
units, Type: UniformInteger, Range: [8, 128], Default: 32
2.6. Evaluate a default configuration#
We evaluate the performance of the default set of hyperparameters provided in the Keras tutorial.
[8]:
import ray
# We launch the Ray run-time depending on the detected local resources
# and execute the `run` function with the default configuration.
# WARNING: when GPUs are available it is important to follow this scheme
# to avoid multiple processes (Ray workers vs. the current process)
# locking the same GPU.
if is_gpu_available:
if not(ray.is_initialized()):
ray.init(num_cpus=n_gpus, num_gpus=n_gpus, log_to_driver=False)
run_default = ray.remote(num_cpus=1, num_gpus=1)(run)
objective_default = ray.get(run_default.remote(problem.default_configuration))
else:
if not(ray.is_initialized()):
ray.init(num_cpus=1, log_to_driver=False)
run_default = run
objective_default = run_default(problem.default_configuration)
print(f"Accuracy Default Configuration: {objective_default:.3f}")
Accuracy Default Configuration: 0.820
2.7. Define the evaluator object#
The Evaluator object allows us to change the parallelization backend used by DeepHyper:
- a run_function is required to instantiate it,
- method defines the backend (e.g., "ray"),
- method_kwargs corresponds to the keyword arguments of this chosen method.
evaluator = Evaluator.create(run_function, method, method_kwargs)
Once created, evaluator.num_workers gives access to the number of available parallel workers.
Finally, to submit tasks to the evaluator and collect them, one just needs to use the following interface:
configs = [{"units": 8, ...}, ...]
evaluator.submit(configs)
...
# To collect the first finished task (asynchronous)
tasks_done = evaluator.gather("BATCH", size=1)
# To collect all of the pending tasks (synchronous)
tasks_done = evaluator.gather("ALL")
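For a concrete, end-to-end illustration of this interface, here is a toy sketch reusing the "ray" method from this tutorial; the toy_run function and its "x" hyperparameter are made up for illustration and are not part of the heart-disease example:
from deephyper.evaluator import Evaluator

def toy_run(config):
    # A made-up objective with its maximum at x = 3.
    return -(config["x"] - 3.0) ** 2

toy_evaluator = Evaluator.create(
    toy_run,
    method="ray",
    method_kwargs={"num_cpus": 1, "num_cpus_per_task": 1},
)
print(toy_evaluator.num_workers)  # number of available parallel workers

toy_evaluator.submit([{"x": 0.0}, {"x": 3.0}, {"x": 5.0}])
jobs_done = toy_evaluator.gather("ALL")  # blocks until all three evaluations finish
print(f"{len(jobs_done)} evaluations collected")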
Warning
Each Evaluator
saves its own state, therefore it is crucial to create a new evaluator when launching a fresh search.
[9]:
from deephyper.evaluator import Evaluator
from deephyper.evaluator.callback import TqdmCallback
def get_evaluator(run_function):
# Default arguments for Ray: 1 CPU in total and 1 CPU per evaluation (i.e., a single worker)
method_kwargs = {
"num_cpus": 1,
"num_cpus_per_task": 1,
"callbacks": [TqdmCallback()]
}
# If GPU devices are detected, then 'n_gpus' workers are created
# and each evaluation uses 1 GPU and 1 CPU
if is_gpu_available:
method_kwargs["num_cpus"] = n_gpus
method_kwargs["num_gpus"] = n_gpus
method_kwargs["num_cpus_per_task"] = 1
method_kwargs["num_gpus_per_task"] = 1
evaluator = Evaluator.create(
run_function,
method="ray",
method_kwargs=method_kwargs
)
print(f"Created new evaluator with {evaluator.num_workers} worker{'s' if evaluator.num_workers > 1 else ''} and config: {method_kwargs}", )
return evaluator
evaluator_1 = get_evaluator(run)
Created new evaluator with 1 worker and config: {'num_cpus': 1, 'num_cpus_per_task': 1, 'callbacks': [<deephyper.evaluator.callback.TqdmCallback object at 0x2a06bb670>]}
/Users/romainegele/Documents/Argonne/deephyper/deephyper/evaluator/_evaluator.py:99: UserWarning: Applying nest-asyncio patch for IPython Shell!
warnings.warn(
2.8. Define and run the centralized Bayesian optimization search (CBO)#
A primary pillar of hyperparameter search in DeepHyper is the centralized Bayesian optimization search (henceforth CBO). CBO may be described by the following steps:
An initial batch of hyperparameter configurations is sampled (for example, at random) and evaluated in parallel.
Following the parallelized evaluation of these configurations, a low-fidelity and high-efficiency model (henceforth "the surrogate") is fit to reproduce the relationship between the input variables of the model (i.e., the choice of hyperparameters) and the outputs (generally a measure of validation accuracy).
After obtaining this surrogate of the validation accuracy, we may use ideas from the classical Bayesian optimization literature to adaptively sample the search space of hyperparameters.
First, the surrogate is used to obtain an estimate for the mean value of the validation accuracy at a certain sampling location \(x\) in addition to an estimated variance. The latter requirement restricts us to the use of high efficiency data-driven modeling strategies that have inbuilt variance estimates (such as a Gaussian process or Random Forest regressor).
Regions where the mean is high represent opportunities for exploitation and regions where the variance is high represent opportunities for exploration. An optimistic acquisition function, the upper confidence bound (UCB), can be constructed from these two quantities: \(\text{UCB}(x) = \mu(x) + \kappa \, \sigma(x)\), where \(\mu(x)\) and \(\sigma(x)\) are the surrogate's predicted mean and standard deviation at \(x\).
The unevaluated hyperparameter configurations that maximize the acquisition function are chosen for the next batch of evaluations.
Note that the choice of the variance weighting parameter \(\kappa\) controls the degree of exploration in the hyperparameter search with zero indicating purely exploitation (unseen configurations where the predicted accuracy is highest will be sampled).
The top s configurations (as ranked by the acquisition function) are selected for the new batch.
The process of obtaining s configurations relies on the "constant-liar" strategy, where a sampled configuration is mapped to a dummy output given by a bulk metric of all the configurations evaluated so far (such as the maximum, mean or median validation accuracy).
Prior to sampling the next configuration by acquisition-function maximization, the surrogate is retrained with the dummy output as a data point. As the true validation accuracy becomes available for one of the sampled configurations, the dummy output is replaced and the surrogate is updated.
This allows for scalable asynchronous (or batch-synchronous) sampling of new hyperparameter configurations.
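To make the acquisition step concrete, here is a small self-contained sketch of UCB scoring with a Random Forest surrogate from scikit-learn; it is illustrative only (the toy objective is made up and DeepHyper's internal implementation differs):
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(42)
X = rng.uniform(0, 1, size=(20, 2))              # already-evaluated configurations
y = -((X - 0.5) ** 2).sum(axis=1)                # their (made-up) objective values

surrogate = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

candidates = rng.uniform(0, 1, size=(1000, 2))   # unevaluated candidate configurations
per_tree = np.stack([tree.predict(candidates) for tree in surrogate.estimators_])
mu, sigma = per_tree.mean(axis=0), per_tree.std(axis=0)

kappa = 1.96                                     # exploration/exploitation trade-off
ucb = mu + kappa * sigma                         # UCB(x) = mu(x) + kappa * sigma(x)
next_config = candidates[np.argmax(ucb)]         # chosen for the next evaluation
print(next_config)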
2.8.1. Choice of surrogate model#
Users should note that the surrogate used here is a Random Forest regressor, chosen for its ability to handle non-ordinal data (hyperparameter configurations may not be purely continuous or even numerical). Evidence that it outperforms other surrogates (such as Gaussian processes) on such search spaces is available in [1].
2.8.1.1. Setup CBO#
We create the CBO using the problem
and evaluator
defined above.
[10]:
from deephyper.search.hps import CBO
# Uncomment the following line to show the arguments of CBO.
# help(CBO)
[11]:
# Instantiate the search with the problem and the evaluator that we created before
search = CBO(problem, evaluator_1, initial_points=[problem.default_configuration])
Note
All of DeepHyper's search algorithms have two stopping criteria:
- max_evals (int): the maximum number of evaluations to perform. Defaults to -1 for an unlimited number of evaluations.
- timeout (int): a time budget (in seconds) before stopping the search. Defaults to None for an unlimited time budget.
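Both criteria can be passed to the search call; a minimal sketch (not executed in this notebook), assuming the search object created above:
# Stop after 50 evaluations or after a 10-minute budget, whichever comes first.
# results = search.search(max_evals=50, timeout=600)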
[12]:
results = search.search(max_evals=10)
100%|██████████| 10/10 [00:16<00:00, 2.16s/it, objective=0.836]
Warning
The search call does not output any information about the current status of the search. However, a results.csv file is created in the local directory and can be inspected to see the finished tasks.
The returned results is a Pandas DataFrame where the columns are the hyperparameters and the information stored by the evaluator:
- job_id is a unique identifier corresponding to the order of creation of the tasks
- objective is the value returned by the run-function
- timestamp_submit is the time (in seconds) at which the task was submitted by the evaluator, counted from the creation of the evaluator
- timestamp_gather is the time (in seconds) at which the finished task was collected by the evaluator, counted from the creation of the evaluator
[13]:
results
[13]:
activation | batch_size | dropout_rate | learning_rate | num_epochs | units | job_id | objective | timestamp_submit | timestamp_gather | |
---|---|---|---|---|---|---|---|---|---|---|
0 | relu | 32 | 0.500000 | 0.001000 | 50 | 32 | 1 | 0.803279 | 5.007318 | 7.747537 |
1 | relu | 35 | 0.033381 | 0.000265 | 33 | 103 | 2 | 0.836066 | 7.893480 | 9.366301 |
2 | sigmoid | 10 | 0.304321 | 0.001563 | 18 | 116 | 3 | 0.819672 | 9.396822 | 10.795727 |
3 | tanh | 22 | 0.400652 | 0.000750 | 45 | 120 | 4 | 0.786885 | 10.826229 | 12.536323 |
4 | relu | 13 | 0.154455 | 0.003917 | 55 | 108 | 5 | 0.770492 | 12.567024 | 14.700497 |
5 | gelu | 167 | 0.095698 | 0.000015 | 70 | 108 | 6 | 0.622951 | 14.730562 | 16.405771 |
6 | relu | 29 | 0.474157 | 0.000866 | 13 | 48 | 7 | 0.819672 | 16.436501 | 17.624854 |
7 | gelu | 33 | 0.513944 | 0.000103 | 67 | 12 | 8 | 0.737705 | 17.655086 | 19.627776 |
8 | selu | 26 | 0.209993 | 0.001832 | 34 | 45 | 9 | 0.786885 | 19.728662 | 21.173264 |
9 | elu | 58 | 0.543356 | 0.000043 | 94 | 81 | 10 | 0.803279 | 21.203402 | 24.535908 |
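Beyond displaying the raw table, the results DataFrame can be explored with standard pandas operations, for example (using the column names listed above):
# Top-3 configurations ranked by objective.
top3 = results.sort_values("objective", ascending=False).head(3)
# Wall-clock duration of each evaluation, from submission to collection.
durations = results["timestamp_gather"] - results["timestamp_submit"]
print(top3[["objective", "units", "learning_rate"]])
print(durations.describe())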
The search can be continued without any issue.
[14]:
results = search.search(max_evals=5)
results
100%|██████████| 10/10 [00:18<00:00, 1.82s/it, objective=0.836]
100%|██████████| 5/5 [00:05<00:00, 1.33s/it, objective=0.836]
[14]:
activation | batch_size | dropout_rate | learning_rate | num_epochs | units | job_id | objective | timestamp_submit | timestamp_gather | |
---|---|---|---|---|---|---|---|---|---|---|
0 | relu | 32 | 0.500000 | 0.001000 | 50 | 32 | 1 | 0.803279 | 5.007318 | 7.747537 |
1 | relu | 35 | 0.033381 | 0.000265 | 33 | 103 | 2 | 0.836066 | 7.893480 | 9.366301 |
2 | sigmoid | 10 | 0.304321 | 0.001563 | 18 | 116 | 3 | 0.819672 | 9.396822 | 10.795727 |
3 | tanh | 22 | 0.400652 | 0.000750 | 45 | 120 | 4 | 0.786885 | 10.826229 | 12.536323 |
4 | relu | 13 | 0.154455 | 0.003917 | 55 | 108 | 5 | 0.770492 | 12.567024 | 14.700497 |
5 | gelu | 167 | 0.095698 | 0.000015 | 70 | 108 | 6 | 0.622951 | 14.730562 | 16.405771 |
6 | relu | 29 | 0.474157 | 0.000866 | 13 | 48 | 7 | 0.819672 | 16.436501 | 17.624854 |
7 | gelu | 33 | 0.513944 | 0.000103 | 67 | 12 | 8 | 0.737705 | 17.655086 | 19.627776 |
8 | selu | 26 | 0.209993 | 0.001832 | 34 | 45 | 9 | 0.786885 | 19.728662 | 21.173264 |
9 | elu | 58 | 0.543356 | 0.000043 | 94 | 81 | 10 | 0.803279 | 21.203402 | 24.535908 |
10 | tanh | 181 | 0.585590 | 0.001198 | 97 | 64 | 11 | 0.803279 | 26.024792 | 27.976763 |
11 | relu | 256 | 0.021215 | 0.000020 | 22 | 123 | 12 | 0.180328 | 28.262971 | 29.387546 |
12 | relu | 256 | 0.019209 | 0.000098 | 12 | 89 | 13 | 0.557377 | 29.675868 | 30.765097 |
13 | relu | 256 | 0.580891 | 0.000016 | 23 | 122 | 14 | 0.737705 | 31.123886 | 32.239641 |
14 | relu | 256 | 0.054057 | 0.000020 | 40 | 128 | 15 | 0.245902 | 32.529887 | 33.878031 |
Now that the search is over, let us print the best configuration found during this run.
[15]:
i_max = results.objective.argmax()
best_config = results.iloc[i_max][:-3].to_dict()
print(f"The default configuration has an accuracy of {objective_default:.3f}. \n"
f"The best configuration found by DeepHyper has an accuracy {results['objective'].iloc[i_max]:.3f}, \n"
f"discovered after {results['timestamp_gather'].iloc[i_max]:.2f} secondes of search.\n")
best_config
The default configuration has an accuracy of 0.820.
The best configuration found by DeepHyper has an accuracy 0.836,
discovered after 9.37 seconds of search.
[15]:
{'activation': 'relu',
'batch_size': 35,
'dropout_rate': 0.0333810292433754,
'learning_rate': 0.0002654663875892,
'num_epochs': 33,
'units': 103,
'job_id': 2}
2.9. Restart from a checkpoint#
It can often be useful to continue the search from previous results, for example when the requested allocation was not long enough or when an unexpected crash happened. The CBO search provides the fit_surrogate(dataframe_of_results) method for this use case.
To simulate this, we create a second evaluator evaluator_2 and start a fresh CBO search with strong exploitation kappa=0.001.
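When the previous results only exist on disk (for example after a crash), the same mechanism applies; a hedged sketch, assuming a results file named "results.csv" was written by the earlier search:
# previous_results = pd.read_csv("results.csv")  # pandas is already imported as pd
# search_from_file = CBO(problem, get_evaluator(run), kappa=0.001)
# search_from_file.fit_surrogate(previous_results)
# results_from_file = search_from_file.search(max_evals=10)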
[16]:
# Create a new evaluator
evaluator_2 = get_evaluator(run)
# Create a new CBO search with strong exploitation (i.e., small kappa)
search_from_checkpoint = CBO(problem, evaluator_2, kappa=0.001)
# Initialize the surrogate model of the Bayesian optimization (in CBO)
# with the results of the previous search
search_from_checkpoint.fit_surrogate(results)
Created new evaluator with 1 worker and config: {'num_cpus': 1, 'num_cpus_per_task': 1, 'callbacks': [<deephyper.evaluator.callback.TqdmCallback object at 0x3427e3e50>]}
/Users/romainegele/Documents/Argonne/deephyper/deephyper/evaluator/_evaluator.py:99: UserWarning: Applying nest-asyncio patch for IPython Shell!
warnings.warn(
[17]:
results_from_checkpoint = search_from_checkpoint.search(max_evals=10)
[18]:
results_from_checkpoint
[18]:
activation | batch_size | dropout_rate | learning_rate | num_epochs | units | job_id | objective | timestamp_submit | timestamp_gather | |
---|---|---|---|---|---|---|---|---|---|---|
0 | sigmoid | 256 | 0.427160 | 0.000767 | 30 | 70 | 1 | 0.704918 | 1.533085 | 2.817155 |
1 | relu | 256 | 0.518127 | 0.002222 | 32 | 77 | 2 | 0.819672 | 3.201892 | 4.484528 |
2 | relu | 256 | 0.569377 | 0.002478 | 39 | 79 | 3 | 0.836066 | 4.789920 | 6.043683 |
3 | relu | 256 | 0.588966 | 0.006999 | 31 | 75 | 4 | 0.819672 | 6.352603 | 7.632020 |
4 | relu | 256 | 0.578453 | 0.000659 | 54 | 77 | 5 | 0.786885 | 8.010408 | 9.583746 |
5 | relu | 256 | 0.572587 | 0.003358 | 35 | 87 | 6 | 0.868852 | 9.889757 | 11.116933 |
6 | relu | 256 | 0.582392 | 0.002076 | 36 | 86 | 7 | 0.852459 | 11.428699 | 12.755969 |
7 | relu | 256 | 0.584875 | 0.003595 | 43 | 85 | 8 | 0.803279 | 13.066058 | 14.417546 |
8 | relu | 256 | 0.536006 | 0.008031 | 34 | 74 | 9 | 0.803279 | 14.798247 | 16.009628 |
9 | relu | 256 | 0.571806 | 0.001217 | 35 | 76 | 10 | 0.803279 | 16.325069 | 17.680495 |
[19]:
i_max = results_from_checkpoint.objective.argmax()
best_config = results_from_checkpoint.iloc[i_max][:-3].to_dict()
print(f"The default configuration has an accuracy of {objective_default:.3f}. "
f"The best configuration found by DeepHyper has an accuracy {results_from_checkpoint['objective'].iloc[i_max]:.3f}, "
f"finished after {results_from_checkpoint['timestamp_gather'].iloc[i_max]:.2f} secondes of search.")
best_config
The default configuration has an accuracy of 0.820. The best configuration found by DeepHyper has an accuracy 0.869, finished after 11.12 seconds of search.
[19]:
{'activation': 'relu',
'batch_size': 256,
'dropout_rate': 0.5725867102466046,
'learning_rate': 0.0033577267689664,
'num_epochs': 35,
'units': 87,
'job_id': 6}
2.10. Add conditional hyperparameters#
Now we want to add the option of searching for a second fully-connected layer. We simply add two new lines to the run-function:
if config.get("dense_2", False):
x = tf.keras.layers.Dense(config["dense_2:units"], activation=config["dense_2:activation"])(x)
[20]:
def run_with_condition(config: dict):
tf.autograph.set_verbosity(0)
train_dataframe, val_dataframe = load_data()
train_ds = dataframe_to_dataset(train_dataframe)
val_ds = dataframe_to_dataset(val_dataframe)
train_ds = train_ds.batch(config["batch_size"])
val_ds = val_ds.batch(config["batch_size"])
# Categorical features encoded as integers
sex = tf.keras.Input(shape=(1,), name="sex", dtype="int64")
cp = tf.keras.Input(shape=(1,), name="cp", dtype="int64")
fbs = tf.keras.Input(shape=(1,), name="fbs", dtype="int64")
restecg = tf.keras.Input(shape=(1,), name="restecg", dtype="int64")
exang = tf.keras.Input(shape=(1,), name="exang", dtype="int64")
ca = tf.keras.Input(shape=(1,), name="ca", dtype="int64")
# Categorical feature encoded as string
thal = tf.keras.Input(shape=(1,), name="thal", dtype="string")
# Numerical features
age = tf.keras.Input(shape=(1,), name="age")
trestbps = tf.keras.Input(shape=(1,), name="trestbps")
chol = tf.keras.Input(shape=(1,), name="chol")
thalach = tf.keras.Input(shape=(1,), name="thalach")
oldpeak = tf.keras.Input(shape=(1,), name="oldpeak")
slope = tf.keras.Input(shape=(1,), name="slope")
all_inputs = [
sex,
cp,
fbs,
restecg,
exang,
ca,
thal,
age,
trestbps,
chol,
thalach,
oldpeak,
slope,
]
# Integer categorical features
sex_encoded = encode_categorical_feature(sex, "sex", train_ds, False)
cp_encoded = encode_categorical_feature(cp, "cp", train_ds, False)
fbs_encoded = encode_categorical_feature(fbs, "fbs", train_ds, False)
restecg_encoded = encode_categorical_feature(restecg, "restecg", train_ds, False)
exang_encoded = encode_categorical_feature(exang, "exang", train_ds, False)
ca_encoded = encode_categorical_feature(ca, "ca", train_ds, False)
# String categorical features
thal_encoded = encode_categorical_feature(thal, "thal", train_ds, True)
# Numerical features
age_encoded = encode_numerical_feature(age, "age", train_ds)
trestbps_encoded = encode_numerical_feature(trestbps, "trestbps", train_ds)
chol_encoded = encode_numerical_feature(chol, "chol", train_ds)
thalach_encoded = encode_numerical_feature(thalach, "thalach", train_ds)
oldpeak_encoded = encode_numerical_feature(oldpeak, "oldpeak", train_ds)
slope_encoded = encode_numerical_feature(slope, "slope", train_ds)
all_features = tf.keras.layers.concatenate(
[
sex_encoded,
cp_encoded,
fbs_encoded,
restecg_encoded,
exang_encoded,
slope_encoded,
ca_encoded,
thal_encoded,
age_encoded,
trestbps_encoded,
chol_encoded,
thalach_encoded,
oldpeak_encoded,
]
)
x = tf.keras.layers.Dense(config["units"], activation=config["activation"])(
all_features
)
### START - NEW LINES
if config.get("dense_2", False):
x = tf.keras.layers.Dense(config["dense_2:units"], activation=config["dense_2:activation"])(x)
### END - NEW LINES
x = tf.keras.layers.Dropout(config["dropout_rate"])(x)
output = tf.keras.layers.Dense(1, activation="sigmoid")(x)
model = tf.keras.Model(all_inputs, output)
optimizer = tf.keras.optimizers.Adam(learning_rate=config["learning_rate"])
model.compile(optimizer, "binary_crossentropy", metrics=["accuracy"])
history = model.fit(
train_ds, epochs=config["num_epochs"], validation_data=val_ds, verbose=0
)
return history.history["val_accuracy"][-1]
To define conditional hyperparameters we use ConfigSpace. We define dense_2:units and dense_2:activation as active hyperparameters only when dense_2 == True. The EqualsCondition class helps us do that. Then we call problem_with_condition.add_condition(condition) to register each new condition with the HpProblem.
[21]:
from deephyper.problem import EqualsCondition
# Define the hyperparameter problem
problem_with_condition = HpProblem()
# Define the same hyperparameters as before
problem_with_condition.add_hyperparameter((8, 128), "units")
problem_with_condition.add_hyperparameter(ACTIVATIONS, "activation")
problem_with_condition.add_hyperparameter((0.0, 0.6), "dropout_rate")
problem_with_condition.add_hyperparameter((10, 100), "num_epochs")
problem_with_condition.add_hyperparameter((8, 256, "log-uniform"), "batch_size")
problem_with_condition.add_hyperparameter((1e-5, 1e-2, "log-uniform"), "learning_rate")
# Add a new hyperparameter "dense_2 (bool)" to decide if a second fully-connected layer should be created
hp_dense_2 = problem_with_condition.add_hyperparameter([True, False], "dense_2")
hp_dense_2_units = problem_with_condition.add_hyperparameter((8, 128), "dense_2:units")
hp_dense_2_activation = problem_with_condition.add_hyperparameter(ACTIVATIONS, "dense_2:activation")
problem_with_condition.add_condition(EqualsCondition(hp_dense_2_units, hp_dense_2, True))
problem_with_condition.add_condition(EqualsCondition(hp_dense_2_activation, hp_dense_2, True))
problem_with_condition
[21]:
Configuration space object:
Hyperparameters:
activation, Type: Categorical, Choices: {elu, gelu, hard_sigmoid, linear, relu, selu, sigmoid, softplus, softsign, swish, tanh}, Default: elu
batch_size, Type: UniformInteger, Range: [8, 256], Default: 45, on log-scale
dense_2, Type: Categorical, Choices: {True, False}, Default: True
dense_2:activation, Type: Categorical, Choices: {elu, gelu, hard_sigmoid, linear, relu, selu, sigmoid, softplus, softsign, swish, tanh}, Default: elu
dense_2:units, Type: UniformInteger, Range: [8, 128], Default: 68
dropout_rate, Type: UniformFloat, Range: [0.0, 0.6], Default: 0.3
learning_rate, Type: UniformFloat, Range: [1e-05, 0.01], Default: 0.0003162278, on log-scale
num_epochs, Type: UniformInteger, Range: [10, 100], Default: 55
units, Type: UniformInteger, Range: [8, 128], Default: 68
Conditions:
dense_2:activation | dense_2 == True
dense_2:units | dense_2 == True
We create a new evaluator evaluator_3 and start a fresh CBO search with this new problem problem_with_condition.
[22]:
evaluator_3 = get_evaluator(run_with_condition)
search_with_condition = CBO(problem_with_condition, evaluator_3)
/Users/romainegele/Documents/Argonne/deephyper/deephyper/evaluator/_evaluator.py:99: UserWarning: Applying nest-asyncio patch for IPython Shell!
warnings.warn(
Created new evaluator with 1 worker and config: {'num_cpus': 1, 'num_cpus_per_task': 1, 'callbacks': [<deephyper.evaluator.callback.TqdmCallback object at 0x341d4e970>]}
[23]:
results_with_condition = search_with_condition.search(max_evals=10)
[24]:
results_with_condition
[24]:
activation | batch_size | dense_2 | dropout_rate | learning_rate | num_epochs | units | dense_2:activation | dense_2:units | job_id | objective | timestamp_submit | timestamp_gather | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | gelu | 24 | False | 0.522131 | 0.001057 | 44 | 8 | NaN | NaN | 1 | 0.836066 | 3.763403 | 5.403222 |
1 | softsign | 47 | True | 0.377181 | 0.000019 | 47 | 127 | elu | 75.0 | 2 | 0.786885 | 5.769829 | 7.444927 |
2 | hard_sigmoid | 25 | False | 0.333931 | 0.000461 | 12 | 33 | NaN | NaN | 3 | 0.786885 | 7.735531 | 8.905337 |
3 | softsign | 8 | True | 0.203617 | 0.000153 | 77 | 24 | softplus | 20.0 | 4 | 0.836066 | 9.270190 | 12.828704 |
4 | tanh | 232 | False | 0.544339 | 0.000236 | 17 | 15 | NaN | NaN | 5 | 0.245902 | 13.121197 | 14.197206 |
5 | linear | 31 | True | 0.431446 | 0.002055 | 27 | 114 | tanh | 69.0 | 6 | 0.803279 | 14.490091 | 15.919840 |
6 | swish | 185 | True | 0.143728 | 0.000066 | 36 | 126 | relu | 88.0 | 7 | 0.819672 | 16.282540 | 17.669768 |
7 | selu | 85 | True | 0.513058 | 0.000975 | 20 | 24 | elu | 77.0 | 8 | 0.836066 | 17.957845 | 19.113598 |
8 | softplus | 256 | False | 0.541869 | 0.000032 | 44 | 123 | NaN | NaN | 9 | 0.213115 | 19.404429 | 20.872728 |
9 | hard_sigmoid | 57 | True | 0.221024 | 0.000014 | 77 | 101 | selu | 56.0 | 10 | 0.770492 | 21.233461 | 23.194847 |
Finally, let us print out the best configuration found in this conditioned search space.
[25]:
i_max = results_with_condition.objective.argmax()
best_config = results_with_condition.iloc[i_max][:-3].to_dict()
print(f"The default configuration has an accuracy of {objective_default:.3f}. "
f"The best configuration found by DeepHyper has an accuracy {results_with_condition['objective'].iloc[i_max]:.3f}, "
f"finished after {results_with_condition['timestamp_gather'].iloc[i_max]:.2f} seconds of search.")
best_config
The default configuration has an accuracy of 0.820. The best configuration found by DeepHyper has an accuracy 0.836, finished after 5.40 seconds of search.
[25]:
{'activation': 'gelu',
'batch_size': 24,
'dense_2': False,
'dropout_rate': 0.5221310827728567,
'learning_rate': 0.0010573740607372,
'num_epochs': 44,
'units': 8,
'dense_2:activation': nan,
'dense_2:units': nan,
'job_id': 1}
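As a final sanity check, this configuration can be re-evaluated with the run-function; a small sketch (training is stochastic, so the printed accuracy may differ slightly from the value found during the search):
# Drop the job_id entry and the inactive conditional hyperparameters (NaN values).
clean_config = {k: v for k, v in best_config.items()
                if k != "job_id" and not pd.isna(v)}
objective_best = run_with_condition(clean_config)
print(f"Re-evaluated accuracy: {objective_best:.3f}")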