Merge pull request #238 from automl/development
SMAC3 v0.5.0
mfeurer authored May 8, 2017
2 parents 94e83c9 + bad1ecf commit 855cfa3
Showing 54 changed files with 753 additions and 418 deletions.
8 changes: 4 additions & 4 deletions .travis.yml
@@ -4,11 +4,11 @@ matrix:

include:
- os: linux
env: PYTHON_VERSION="3.4" MINICONDA_URL="https://repo.continuum.io/miniconda/Miniconda2-latest-Linux-x86_64.sh"
env: PYTHON_VERSION="3.4" MINICONDA_URL="https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh"
- os: linux
env: PYTHON_VERSION="3.5" COVERAGE="true" MINICONDA_URL="https://repo.continuum.io/miniconda/Miniconda2-latest-Linux-x86_64.sh"
env: PYTHON_VERSION="3.5" COVERAGE="true" MINICONDA_URL="https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh"
- os: linux
env: PYTHON_VERSION="3.6" COVERAGE="true" MINICONDA_URL="https://repo.continuum.io/miniconda/Miniconda2-latest-Linux-x86_64.sh"
env: PYTHON_VERSION="3.6" COVERAGE="true" MINICONDA_URL="https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh"

# Disable OSX building because it takes too long and hinders progress
# Set language to generic to not break travis-ci
@@ -46,7 +46,7 @@ before_install:
- conda update --yes conda
- conda create -n testenv --yes python=$PYTHON_VERSION pip wheel nose
- source activate testenv
- conda install --yes gcc
- conda install --yes gcc swig
- echo "Using GCC at "`which gcc`
- export CC=`which gcc`

14 changes: 9 additions & 5 deletions README.md
@@ -1,6 +1,6 @@
# SMAC v3 Project

Copyright (C) 2016 [ML4AAD Group](http://www.ml4aad.org/)
Copyright (C) 2017 [ML4AAD Group](http://www.ml4aad.org/)

__Attention__: This package is under heavy development and subject to change.
A stable release of SMAC (v2) in Java can be found [here](http://www.cs.ubc.ca/labs/beta/Projects/SMAC/).
@@ -11,15 +11,15 @@ Status for master branch:

[![Build Status](https://travis-ci.org/automl/SMAC3.svg?branch=master)](https://travis-ci.org/automl/SMAC3)
[![Code Health](https://landscape.io/github/automl/SMAC3/master/landscape.svg?style=flat)](https://landscape.io/github/automl/SMAC3/master)
[![Coverage Status](https://coveralls.io/repos/automl/auto-sklearn/badge.svg?branch=master&service=github)](https://coveralls.io/github/automl/SMAC3?branch=master)
[![codecov Status](https://codecov.io/gh/automl/SMAC3/branch/master/graph/badge.svg)](https://codecov.io/gh/automl/SMAC3)

Status for development branch

[![Build Status](https://travis-ci.org/automl/SMAC3.svg?branch=development)](https://travis-ci.org/automl/SMAC3)
[![Code Health](https://landscape.io/github/automl/SMAC3/development/landscape.svg?style=flat)](https://landscape.io/github/automl/SMAC3/development)
[![Coverage Status](https://coveralls.io/repos/automl/SMAC3/badge.svg?branch=development&service=github)](https://coveralls.io/github/automl/SMAC3?branch=development)
[![codecov](https://codecov.io/gh/automl/SMAC3/branch/development/graph/badge.svg)](https://codecov.io/gh/automl/SMAC3)

#OVERVIEW
# OVERVIEW

SMAC is a tool for algorithm configuration
to optimize the parameters of arbitrary algorithms across a set of instances.
@@ -38,7 +38,11 @@ we refer to
SMAC v3 is written in python3 and continuously tested with python3.4 and python3.5.
Its [Random Forest](https://bitbucket.org/aadfreiburg/random_forest_run) is written in C++.

#Installation:
# Installation

Besides the listed requirements (see `requirements.txt`), the random forest used in SMAC3 requires SWIG.

apt-get install swig

cat requirements.txt | xargs -n 1 -L 1 pip install

35 changes: 35 additions & 0 deletions changelog.md
@@ -1,3 +1,38 @@
# 0.5

## Major changes

* MAINT #192: SMAC uses version 0.4 of the random forest library pyrfr. As a
side-effect, the library [swig](http://www.swig.org/) is necessary to build
the random forest.
* MAINT: random samples which are interleaved in the list of challengers are now
obtained from a generator. This reduces the overhead of sampling random
configurations.
* FIX #117: only round the cutoff when running a python function as the target
algorithm.
* MAINT #231: Rename the submodule `smac.smbo` to `smac.optimizer`.
* MAINT #213: Use log(EI) as default acquisition function when optimizing
running time of an algorithm.
* MAINT #223: updated example of optimizing a random forest with SMAC.
* MAINT #221: refactored the EPM module. The PCA on instance features is now
part of fitting the EPM instead of reading a scenario. Because of this
restructuring, the PCA can now take instance features which are external
data into account.
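
The generator-based interleaving described above can be illustrated with a minimal sketch. This is not SMAC's implementation: `random_configs`, `interleave`, and the use of plain floats as stand-ins for configurations are all hypothetical names and simplifications chosen for this example. The point is only that random samples are drawn lazily, when a challenger is actually consumed.

```python
import itertools
import random
from typing import Iterable, Iterator, List

# Records every random sample that is actually drawn (for illustration only).
draws: List[float] = []


def random_configs(rng: random.Random) -> Iterator[float]:
    """Hypothetical stand-in for sampling a configuration space, one sample per request."""
    while True:
        sample = rng.random()
        draws.append(sample)
        yield sample


def interleave(challengers: Iterable[float], randoms: Iterator[float]) -> Iterator[float]:
    """Alternate model-proposed challengers with lazily drawn random ones."""
    for challenger in challengers:
        yield challenger
        yield next(randoms)


rng = random.Random(42)
proposed = [10.0, 20.0, 30.0]  # assumed challengers proposed by the model

# Consume only the first four challengers; only two random samples are drawn.
first_four = list(itertools.islice(interleave(proposed, random_configs(rng)), 4))
```

Because the consumer stops after four items, the third random sample is never generated, which is where the reduced sampling overhead would come from in a setting like this.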

## Minor changes

* SMAC now outputs scenario options if the log level is `DEBUG` (2f0ceee).
* SMAC logs the command line call if invoked from the command line (3accfc2).
* SMAC explicitly checks that it runs in `python>=3.4`.
* MAINT #226: improve efficiency when loading the runhistory from a json file.
* FIX #217: adds milliseconds to the output directory names to avoid race
conditions when starting multiple runs on a cluster.
* MAINT #209: adds the seed or a pseudo-seed to the output directory name for
better identifiability of the output directories.
* FIX #216: replace broken call to in EIPS acquisition function.
* MAINT: use codecov.io instead of coveralls.io.
* MAINT: increase minimal required version of the ConfigSpace package to 0.3.2.
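
To illustrate why millisecond resolution and a seed help keep output directories distinct (#217, #209), here is a minimal sketch. The function `output_dir_name`, the `smac3-output` prefix, and the exact naming pattern are assumptions made for this example and are not taken from SMAC's source.

```python
from datetime import datetime


def output_dir_name(now: datetime, seed: int, prefix: str = "smac3-output") -> str:
    """Build a directory name with millisecond resolution plus the seed.

    The naming pattern is hypothetical and only demonstrates the idea.
    """
    millis = now.microsecond // 1000  # microseconds -> milliseconds
    stamp = now.strftime("%Y-%m-%d_%H-%M-%S")
    return "{}_{}_{:03d}_seed{}".format(prefix, stamp, millis, seed)


# Two runs started within the same second, one millisecond apart.
first = output_dir_name(datetime(2017, 5, 8, 12, 0, 0, 1000), seed=1)
second = output_dir_name(datetime(2017, 5, 8, 12, 0, 0, 2000), seed=1)
```

With second-level timestamps alone, both runs would map to the same directory; with the millisecond component (and, if the runs use different seeds, the seed suffix) the names differ.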

# 0.4

* ADD #204: SMAC now always saves runhistory files as `runhistory.json`.
19 changes: 19 additions & 0 deletions ci_scripts/circle_install.sh
@@ -0,0 +1,19 @@
#!/bin/bash

# on circle ci, each command runs with its own execution context so we have to
# activate the conda testenv on a per command basis. That's why we put calls to
# python (conda) in a dedicated bash script and we activate the conda testenv
# here.
source activate testenv

# install documentation building dependencies
pip install --upgrade numpy
pip install --upgrade matplotlib setuptools nose coverage sphinx pillow sphinx-gallery sphinx_bootstrap_theme cython numpydoc
# And finally, all other dependencies
cat requirements.txt | xargs -n 1 -L 1 pip install

python setup.py clean
python setup.py develop

# pipefail is necessary to propagate exit codes
set -o pipefail && cd doc && make html 2>&1 | tee ~/log.txt
44 changes: 14 additions & 30 deletions circle.yml
@@ -1,44 +1,28 @@
machine:
environment:
# The github organization or username of the repository which hosts the
# project and documentation.
USERNAME: "automl"

# The repository where the documentation will be hosted
DOC_REPO: "SMAC3"

# The base URL for the Github page where the documentation will be hosted
DOC_URL: ""

# The email is to be used for commits in the Github Page
EMAIL: "[email protected]"
PATH: /home/ubuntu/miniconda/bin:$PATH

dependencies:

# Various dependencies
pre:
# Get rid of existing virtualenvs on circle ci as they conflict with conda.
# From nilearn: https://github.com/nilearn/nilearn/blob/master/circle.yml
- cd && rm -rf ~/.pyenv && rm -rf ~/virtualenvs
# from scikit-learn contrib
- sudo -E apt-get -yq remove texlive-binaries --purge
- sudo apt-get update
- sudo apt-get install libatlas-dev libatlas3gf-base
- sudo apt-get install build-essential python-dev python-setuptools
# install numpy first as it is a compile time dependency for other packages
- pip install --upgrade numpy
# install documentation building dependencies
- pip install --upgrade matplotlib setuptools nose coverage sphinx pillow sphinx-gallery sphinx_bootstrap_theme cython numpydoc
# Installing required packages for `make -C doc check command` to work.
- sudo -E apt-get -yq update
- sudo -E apt-get -yq --no-install-suggests --no-install-recommends --force-yes install dvipng texlive-latex-base texlive-latex-extra
# Installing packages to build the random forest
# finally install the requirements of the package to allow autodoc
- pip install -r requirements.txt
# Conda installation
- wget http://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda.sh
- bash ~/miniconda.sh -b -p $HOME/miniconda
- conda update --yes conda
- conda create -n testenv --yes python=3.6 pip wheel nose gcc swig

# The --user is needed to let sphinx see the source and the binaries
# The pipefail is requested to propagate exit code
override:
- python setup.py clean
- python setup.py develop
- set -o pipefail && cd doc && make html 2>&1 | tee ~/log.txt
- source ci_scripts/circle_install.sh
test:
# Grep error on the documentation
override:
@@ -58,7 +42,7 @@ general:
- "doc/_build/html"
- "~/log.txt"
# Restrict the build to the branch master only
branches:
only:
- development
- master
#branches:
# only:
# - development
# - master
57 changes: 21 additions & 36 deletions examples/rf.py
@@ -3,8 +3,10 @@
import inspect

import numpy as np
from sklearn.model_selection import KFold
from sklearn.metrics import make_scorer
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import RandomForestRegressor
from sklearn.datasets import load_boston

from smac.configspace import ConfigurationSpace
from ConfigSpace.hyperparameters import CategoricalHyperparameter, \
@@ -14,7 +16,9 @@
from smac.scenario.scenario import Scenario
from smac.facade.smac_facade import SMAC

def rfr(cfg, seed):
boston = load_boston()

def rf_from_cfg(cfg, seed):
"""
Creates a random forest regressor from sklearn and fits the given data on it.
This is the function-call we try to optimize. Chosen values are stored in
@@ -36,7 +40,6 @@ def rfr(cfg, seed):
rfr = RandomForestRegressor(
n_estimators=cfg["num_trees"],
criterion=cfg["criterion"],
max_depth=cfg["max_depth"],
min_samples_split=cfg["min_samples_to_split"],
min_samples_leaf=cfg["min_samples_in_leaf"],
min_weight_fraction_leaf=cfg["min_weight_frac_leaf"],
@@ -45,36 +48,19 @@ def rfr(cfg, seed):
bootstrap=cfg["do_bootstrapping"],
random_state=seed)

rmses = []
for train, test in kf:
# We iterate over cv-folds
X_train, X_test = X[train], X[test]
y_train, y_test = y[train], y[test]

rfr.fit(X_train, y_train)

y_pred = rfr.predict(X_test)

# We use root mean square error as performance measure
rmse = np.sqrt(np.mean((y_pred - y_test)**2))
rmses.append(rmse)
return np.mean(rmses)
def rmse(y, y_pred):
return np.sqrt(np.mean((y_pred - y)**2))
# Create a root mean square error scorer for sklearn's cross-validation
rmse_scorer = make_scorer(rmse, greater_is_better=False)
score = cross_val_score(rfr, boston.data, boston.target, cv=11, scoring=rmse_scorer)
return -1 * np.mean(score) # Because cross_validation sign-flips the score


logger = logging.getLogger("RF-example")
logging.basicConfig(level=logging.INFO)
#logging.basicConfig(level=logging.DEBUG) # Enable to show debug-output

folder = os.path.realpath(
os.path.abspath(os.path.split(inspect.getfile(inspect.currentframe()))[0]))

# Load data
X = np.array(np.loadtxt(os.path.join(folder, "data/X.csv")), dtype=np.float32)
y = np.array(np.loadtxt(os.path.join(folder, "data/y.csv")), dtype=np.float32)

# Create cross-validation folds
kf = KFold(n_splits=4, shuffle=True, random_state=42)
kf = kf.split(X, y)
logger.info("Running random forest example for SMAC. If you experience "
"difficulties, try to decrease the memory-limit.")

# Build Configuration Space which defines all parameters and their ranges.
# To illustrate different parameter types,
@@ -88,28 +74,27 @@ def rfr(cfg, seed):

# Or we can add multiple hyperparameters at once:
num_trees = UniformIntegerHyperparameter("num_trees", 10, 50, default=10)
max_depth = UniformIntegerHyperparameter("max_depth", 20, 30, default=20)
max_features = UniformIntegerHyperparameter("max_features", 1, X.shape[1], default=1)
max_features = UniformIntegerHyperparameter("max_features", 1, boston.data.shape[1], default=1)
min_weight_frac_leaf = UniformFloatHyperparameter("min_weight_frac_leaf", 0.0, 0.5, default=0.0)
criterion = CategoricalHyperparameter("criterion", ["mse", "mae"], default="mse")
min_samples_to_split = UniformIntegerHyperparameter("min_samples_to_split", 2, 20, default=2)
min_samples_in_leaf = UniformIntegerHyperparameter("min_samples_in_leaf", 1, 20, default=1)
max_leaf_nodes = UniformIntegerHyperparameter("max_leaf_nodes", 10, 1000, default=100)

cs.add_hyperparameters([num_trees, max_depth, min_weight_frac_leaf, criterion,
cs.add_hyperparameters([num_trees, min_weight_frac_leaf, criterion,
max_features, min_samples_to_split, min_samples_in_leaf, max_leaf_nodes])

# SMAC scenario object
scenario = Scenario({"run_obj": "quality", # we optimize quality (alternative runtime)
"runcount-limit": 20, # maximum number of function evaluations
"cs": cs, # configuration space
scenario = Scenario({"run_obj": "quality", # we optimize quality (alternative runtime)
"runcount-limit": 50, # maximum number of function evaluations
"cs": cs, # configuration space
"deterministic": "true",
"memory_limit": 1024,
"memory_limit": 3072, # adapt this to reasonable value for your hardware
})

# To optimize, we pass the function to the SMAC-object
smac = SMAC(scenario=scenario, rng=np.random.RandomState(42),
tae_runner=rfr)
tae_runner=rf_from_cfg)

# Example call of the function with default values
# It returns: Status, Cost, Runtime, Additional Infos
7 changes: 3 additions & 4 deletions requirements.txt
@@ -2,10 +2,9 @@ setuptools
numpy>=1.7.1
scipy>=0.18.1
six
Cython
psutil
pynisher>=0.4.1
ConfigSpace>=0.3.1
pyrfr==0.2.0
ConfigSpace>=0.3.2
scikit-learn
typing
typing
pyrfr>=0.4.0
5 changes: 5 additions & 0 deletions smac/__init__.py
@@ -1,3 +1,8 @@
import sys

if sys.version_info < (3,4):
raise ValueError("SMAC requires Python 3.4 or newer.")

from smac.__version__ import __version__
AUTHORS = "Marius Lindauer, Matthias Feurer, Katharina Eggensperger, " \
"Aaron Klein, Stefan Falkner and Frank Hutter"
2 changes: 1 addition & 1 deletion smac/__version__.py
@@ -1,4 +1,4 @@
"""Version information."""

# The following line *must* be the last in the module, exactly as formatted:
__version__ = "0.4.0"
__version__ = "0.5.0"