[GSOC] hyperopt suggestion service logic update #2412

Open
wants to merge 28 commits into
base: master
Changes from 21 commits
Commits
28 commits
f615e3f
hyperopt suggestion logic update
shashank-iitbhu Aug 21, 2024
a8bc887
Merge upstream master and resolve conflicts in base_service.py and se…
shashank-iitbhu Aug 25, 2024
a67f373
fix
shashank-iitbhu Aug 25, 2024
365c2f5
DISTRIBUTION_UNKNOWN enum set to 0 in gRPC api
shashank-iitbhu Aug 26, 2024
caa2422
convert parameter method fix
shashank-iitbhu Aug 26, 2024
0f38a51
convert feasibleSpace func updated
shashank-iitbhu Sep 3, 2024
ae9fa34
renamed DISTRIBUTION_UNKNOWN to DISTRIBUTION_UNSPECIFIED
shashank-iitbhu Sep 3, 2024
910a46c
fix
shashank-iitbhu Sep 3, 2024
08b01ac
added more test cases for hyperopt distributions
shashank-iitbhu Sep 6, 2024
16dc030
added support for NORMAL and LOG_NORMAL in hyperopt suggestion service
shashank-iitbhu Sep 7, 2024
282f81d
added e2e tests for NORMAL and LOG_NORMAL
shashank-iitbhu Sep 7, 2024
b7d09a6
hyperopt-suggestion example update
shashank-iitbhu Sep 19, 2024
58ab1ac
updated logic for log distributions
shashank-iitbhu Sep 19, 2024
2b1932e
updated logic for log distributions
shashank-iitbhu Sep 19, 2024
2f1c355
e2e test fixed
shashank-iitbhu Sep 22, 2024
8391c29
added support for parameter distributions for Parameter type INT
shashank-iitbhu Sep 22, 2024
23fd30b
unit test fixed
shashank-iitbhu Sep 22, 2024
7f6deb5
Update pkg/suggestion/v1beta1/hyperopt/base_service.py
shashank-iitbhu Sep 22, 2024
b85b4bf
comment fixed
shashank-iitbhu Sep 22, 2024
dc36303
added unit tests for INT parameter type
shashank-iitbhu Sep 22, 2024
658daaf
completed param unit test cases
shashank-iitbhu Sep 22, 2024
5198ad1
handled default case for normal distributions when min or max are not…
shashank-iitbhu Sep 23, 2024
262912d
fixed validation logic for min and max
shashank-iitbhu Oct 4, 2024
81f5526
removed unnecessary test params
shashank-iitbhu Oct 6, 2024
748e4ba
fixes
shashank-iitbhu Oct 8, 2024
14f30a5
added comments
shashank-iitbhu Oct 10, 2024
4f35663
fix
shashank-iitbhu Oct 12, 2024
c35789b
set default distribution as uniform
shashank-iitbhu Jan 7, 2025
2 changes: 2 additions & 0 deletions .github/workflows/e2e-test-pytorch-mnist.yaml
@@ -41,5 +41,7 @@ jobs:
- "long-running-resume,from-volume-resume,median-stop"
# others
- "grid,bayesian-optimization,tpe,multivariate-tpe,cma-es,hyperband"
- "hyperopt-distribution"
- "file-metrics-collector,pytorchjob-mnist"
- "median-stop-with-json-format,file-metrics-collector-with-json-format"

69 changes: 69 additions & 0 deletions examples/v1beta1/hp-tuning/hyperopt-distribution.yaml
@@ -0,0 +1,69 @@
+---
+apiVersion: kubeflow.org/v1beta1
+kind: Experiment
+metadata:
+  namespace: kubeflow
+  name: hyperopt-distribution
+spec:
+  objective:
+    type: minimize
+    goal: 0.05
+    objectiveMetricName: loss
+  algorithm:
+    algorithmName: random
+  parallelTrialCount: 3
+  maxTrialCount: 12
+  maxFailedTrialCount: 3
+  parameters:
+    - name: lr
+      parameterType: double
+      feasibleSpace:
+        min: "0.01"
+        max: "0.05"
+        step: "0.01"
+        distribution: "normal"
+    - name: momentum
+      parameterType: double
+      feasibleSpace:
+        min: "0.001"
+        max: "1"
+        distribution: "uniform"
+    - name: batch_size
+      parameterType: int
+      feasibleSpace:
+        min: "32"
+        max: "64"
+        distribution: "logNormal"
+  trialTemplate:
+    primaryContainerName: training-container
+    trialParameters:
+      - name: learningRate
+        description: Learning rate for the training model
+        reference: lr
+      - name: momentum
+        description: Momentum for the training model
+        reference: momentum
+      - name: batchSize
+        description: Batch Size
+        reference: batch_size
+    trialSpec:
+      apiVersion: batch/v1
+      kind: Job
+      spec:
+        template:
+          spec:
+            containers:
+              - name: training-container
+                image: docker.io/kubeflowkatib/pytorch-mnist-cpu:latest
+                command:
+                  - "python3"
+                  - "/opt/pytorch-mnist/mnist.py"
+                  - "--epochs=1"
+                  - "--batch-size=${trialParameters.batchSize}"
+                  - "--lr=${trialParameters.learningRate}"
+                  - "--momentum=${trialParameters.momentum}"
+                resources:
+                  limits:
+                    memory: "1Gi"
+                    cpu: "0.5"
+            restartPolicy: Never
180 changes: 90 additions & 90 deletions pkg/apis/manager/v1beta1/api.pb.go

Large diffs are not rendered by default.

10 changes: 5 additions & 5 deletions pkg/apis/manager/v1beta1/api.proto
@@ -101,11 +101,11 @@ enum ParameterType {
  * Distribution types for HyperParameter.
  */
 enum Distribution {
-  UNIFORM = 0;
-  LOG_UNIFORM = 1;
-  NORMAL = 2;
-  LOG_NORMAL = 3;
-  DISTRIBUTION_UNKNOWN = 4;
+  DISTRIBUTION_UNSPECIFIED = 0;
+  UNIFORM = 1;
+  LOG_UNIFORM = 2;
+  NORMAL = 3;
+  LOG_NORMAL = 4;
 }

/**
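The renumbering in this hunk matters because proto3 reads an unset enum field as its zero value. With the old numbering, an unset distribution was indistinguishable from an explicit `UNIFORM`. Below is a minimal sketch of that effect, using plain Python `IntEnum` classes as stand-ins for the generated protobuf enum; the `read_distribution` helper and the dict-based message are hypothetical and not part of this PR.

```python
from enum import IntEnum


class OldDistribution(IntEnum):
    """Numbering before this change: UNIFORM occupies the zero value."""

    UNIFORM = 0
    LOG_UNIFORM = 1
    NORMAL = 2
    LOG_NORMAL = 3
    DISTRIBUTION_UNKNOWN = 4


class NewDistribution(IntEnum):
    """Numbering after this change: zero means 'nothing was specified'."""

    DISTRIBUTION_UNSPECIFIED = 0
    UNIFORM = 1
    LOG_UNIFORM = 2
    NORMAL = 3
    LOG_NORMAL = 4


def read_distribution(enum_cls, message):
    """Mimic proto3: a field absent from the message reads as the zero value."""
    return enum_cls(message.get("distribution", 0))


unset = {}  # stand-in for a FeasibleSpace that sets no distribution
print(read_distribution(OldDistribution, unset).name)
print(read_distribution(NewDistribution, unset).name)
```

With the new numbering, the suggestion service can tell "not set" apart from "uniform" and choose its own default.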
24 changes: 12 additions & 12 deletions pkg/apis/manager/v1beta1/python/api_pb2.py

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions pkg/apis/manager/v1beta1/python/api_pb2.pyi
@@ -16,11 +16,11 @@ class ParameterType(int, metaclass=_enum_type_wrapper.EnumTypeWrapper):

 class Distribution(int, metaclass=_enum_type_wrapper.EnumTypeWrapper):
     __slots__ = ()
+    DISTRIBUTION_UNSPECIFIED: _ClassVar[Distribution]
     UNIFORM: _ClassVar[Distribution]
     LOG_UNIFORM: _ClassVar[Distribution]
     NORMAL: _ClassVar[Distribution]
     LOG_NORMAL: _ClassVar[Distribution]
-    DISTRIBUTION_UNKNOWN: _ClassVar[Distribution]

class ObjectiveType(int, metaclass=_enum_type_wrapper.EnumTypeWrapper):
__slots__ = ()
@@ -39,11 +39,11 @@ DOUBLE: ParameterType
 INT: ParameterType
 DISCRETE: ParameterType
 CATEGORICAL: ParameterType
+DISTRIBUTION_UNSPECIFIED: Distribution
 UNIFORM: Distribution
 LOG_UNIFORM: Distribution
 NORMAL: Distribution
 LOG_NORMAL: Distribution
-DISTRIBUTION_UNKNOWN: Distribution
 UNKNOWN: ObjectiveType
 MINIMIZE: ObjectiveType
 MAXIMIZE: ObjectiveType
@@ -532,22 +532,12 @@ func convertParameterType(typ experimentsv1beta1.ParameterType) suggestionapi.Pa
 }

 func convertFeasibleSpace(fs experimentsv1beta1.FeasibleSpace) *suggestionapi.FeasibleSpace {
-	distribution := convertDistribution(fs.Distribution)
-	if distribution == suggestionapi.Distribution_DISTRIBUTION_UNKNOWN {
-		return &suggestionapi.FeasibleSpace{
-			Max: fs.Max,
-			Min: fs.Min,
-			List: fs.List,
-			Step: fs.Step,
-		}
-	}
-
 	return &suggestionapi.FeasibleSpace{
 		Max: fs.Max,
 		Min: fs.Min,
 		List: fs.List,
 		Step: fs.Step,
-		Distribution: distribution,
+		Distribution: convertDistribution(fs.Distribution),
 	}
 }

@@ -562,7 +552,7 @@ func convertDistribution(typ experimentsv1beta1.Distribution) suggestionapi.Dist
 	case experimentsv1beta1.DistributionLogNormal:
 		return suggestionapi.Distribution_LOG_NORMAL
 	default:
-		return suggestionapi.Distribution_DISTRIBUTION_UNKNOWN
+		return suggestionapi.Distribution_DISTRIBUTION_UNSPECIFIED
 	}
 }

@@ -618,7 +618,7 @@ func TestConvertDistribution(t *testing.T) {
 		},
 		{
 			inDistribution: experimentsv1beta1.DistributionUnknown,
-			expectedDistribution: suggestionapi.Distribution_DISTRIBUTION_UNKNOWN,
+			expectedDistribution: suggestionapi.Distribution_DISTRIBUTION_UNSPECIFIED,
 			testDescription: "Convert unknown distribution",
 		},
 	}
79 changes: 71 additions & 8 deletions pkg/suggestion/v1beta1/hyperopt/base_service.py
@@ -13,10 +13,12 @@
 # limitations under the License.

 import logging
+import math

 import hyperopt
 import numpy as np

+from pkg.apis.manager.v1beta1.python import api_pb2
 from pkg.suggestion.v1beta1.internal.constant import (
     CATEGORICAL,
     DISCRETE,
@@ -62,14 +64,75 @@ def create_hyperopt_domain(self):
         # hyperopt.hp.uniform('x2', -10, 10)}
         hyperopt_search_space = {}
         for param in self.search_space.params:
-            if param.type == INTEGER:
-                hyperopt_search_space[param.name] = hyperopt.hp.quniform(
-                    param.name, float(param.min), float(param.max), float(param.step)
-                )
-            elif param.type == DOUBLE:
-                hyperopt_search_space[param.name] = hyperopt.hp.uniform(
-                    param.name, float(param.min), float(param.max)
-                )
+            if param.type in [INTEGER, DOUBLE]:
+                if param.distribution == api_pb2.UNIFORM or param.distribution is None:
Comment on lines 65 to +68
Member

Should we make the uniform distribution the default?
We could set it in the Experiment defaulter: https://github.com/kubeflow/katib/blob/master/pkg/apis/controller/experiments/v1beta1/experiment_defaults.go
For example, Optuna will remove UniformDistribution in favour of FloatDistribution: https://optuna.readthedocs.io/en/stable/reference/generated/optuna.distributions.UniformDistribution.html

+                    if param.step:
+                        hyperopt_search_space[param.name] = hyperopt.hp.quniform(
+                            param.name,
+                            float(param.min),
+                            float(param.max),
+                            float(param.step),
+                        )
+                    else:
+                        if param.type == INTEGER:
+                            hyperopt_search_space[param.name] = hyperopt.hp.uniformint(
+                                param.name, float(param.min), float(param.max)
+                            )
+                        else:
+                            hyperopt_search_space[param.name] = hyperopt.hp.uniform(
+                                param.name, float(param.min), float(param.max)
+                            )
+                elif param.distribution == api_pb2.LOG_UNIFORM:
+                    if param.step:
+                        hyperopt_search_space[param.name] = hyperopt.hp.qloguniform(
+                            param.name,
+                            math.log(float(param.min)),
+                            math.log(float(param.max)),
+                            float(param.step),
+                        )
+                    else:
+                        hyperopt_search_space[param.name] = hyperopt.hp.loguniform(
+                            param.name,
+                            math.log(float(param.min)),
+                            math.log(float(param.max)),
+                        )
+                elif param.distribution == api_pb2.NORMAL:
+                    mu = (float(param.min) + float(param.max)) / 2
+                    # We consider the normal distribution based on the range of ±3 sigma.
+                    sigma = (float(param.max) - float(param.min)) / 6
Contributor Author
@shashank-iitbhu Sep 11, 2024

I followed this article to determine the value of sigma from min and max.
cc @tenzen-y @andreyvelich

Member

Maybe we should reference this article in the code comments. WDYT @tenzen-y @johnugeorge?

Member

I would rather not depend on an individual article. It would be better to add the actual mathematical description here as a code comment.
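One way to state the reasoning requested here, offered as a sketch and not as the wording the PR settled on: if [min, max] is treated as the interval mu ± 3 sigma, then mu = (min + max) / 2 and sigma = (max - min) / 6, and roughly 99.73% of the probability mass lies inside the range. Since `hp.normal` is an unbounded distribution, the remaining fraction of samples can still land outside [min, max]. The check below uses only the standard library; `normal_params` is a hypothetical helper name.

```python
from statistics import NormalDist


def normal_params(min_value, max_value):
    """Map a [min, max] range onto (mu, sigma) using the ±3 sigma convention."""
    mu = (min_value + max_value) / 2
    sigma = (max_value - min_value) / 6
    return mu, sigma


# Values taken from the `lr` parameter in the example Experiment.
mu, sigma = normal_params(0.01, 0.05)
dist = NormalDist(mu, sigma)

# Probability that a sample falls inside the requested range.
inside = dist.cdf(0.05) - dist.cdf(0.01)
print(f"mu={mu:.4f} sigma={sigma:.5f} P(min <= X <= max)={inside:.4f}")
```

The probability is the same for any range, because min and max always sit exactly three standard deviations from the mean.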


+                    if param.step:
+                        hyperopt_search_space[param.name] = hyperopt.hp.qnormal(
+                            param.name,
+                            mu,
+                            sigma,
+                            float(param.step),
+                        )
+                    else:
+                        hyperopt_search_space[param.name] = hyperopt.hp.normal(
+                            param.name,
+                            mu,
+                            sigma,
+                        )
+                elif param.distribution == api_pb2.LOG_NORMAL:
+                    log_min = math.log(float(param.min))
+                    log_max = math.log(float(param.max))
Comment on lines +130 to +131
Member

Shouldn't we use fixed values when min and max are scalars, as Nevergrad does?

Contributor Author

Yes, we can, but in Nevergrad that is an edge case for when min and max are not defined:

    elif isinstance(param, (p.Log, p.Scalar)):
        if (param.bounds[0][0] is None) or (param.bounds[1][0] is None):
            if isinstance(param, p.Scalar) and not param.integer:
                return hp.lognormal(label=param_name, mu=0, sigma=1)

For example,

    - name: batch_size
      parameterType: int
      feasibleSpace:
        min: "32"
        max: "64"
        distribution: "logNormal"

The above parameter will be sampled from the distribution shown in this graph:
(Image: Screenshot 2024-09-22 at 8 48 49 PM)
where mu=3.8123 and sigma=0.3465 are calculated by plugging min=32 and max=64 into our code, and E(X) is the mean, which is 48 in our case.
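As a sanity check on the numbers quoted in this comment (a sketch using only `math`; the variable and function names are illustrative): for min=32 and max=64, mu = (ln 32 + ln 64) / 2 ≈ 3.8123. The quoted sigma ≈ 0.3465 equals (ln 64 - ln 32) / 2, which suggests the comment refers to an earlier revision of the formula. The division by 6 shown in this diff gives sigma ≈ 0.1155 instead. The mean of a log-normal variable is exp(mu + sigma^2 / 2).

```python
import math

log_min, log_max = math.log(32), math.log(64)
mu = (log_min + log_max) / 2

sigma_half = (log_max - log_min) / 2   # reproduces the sigma quoted in the comment
sigma_sixth = (log_max - log_min) / 6  # formula shown in this diff


def lognormal_mean(mu, sigma):
    """Mean of X where ln(X) ~ Normal(mu, sigma)."""
    return math.exp(mu + sigma**2 / 2)


print(f"mu={mu:.4f}")
print(f"sigma=(range)/2 -> {sigma_half:.4f}, E(X)={lognormal_mean(mu, sigma_half):.2f}")
print(f"sigma=(range)/6 -> {sigma_sixth:.4f}, E(X)={lognormal_mean(mu, sigma_sixth):.2f}")
```

With the halved range the mean comes out at about 48, matching the comment; with the division by 6 it is closer to 45.6.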

Member

That makes sense.
In that case, could you handle the cases where min or max is not specified, as Nevergrad does?

https://github.com/facebookresearch/nevergrad/blob/a2006e50b068fe598e0f3d7dab9c9bcf6cf97e00/nevergrad/optimization/externalbo.py#L61-L64

Member

@shashank-iitbhu This is still pending.

Contributor Author

    if param.FeasibleSpace.Max == "" && param.FeasibleSpace.Min == "" {
        allErrs = append(allErrs, field.Required(parametersPath.Index(i).Child("feasibleSpace").Child("max"),
            fmt.Sprintf("feasibleSpace.max or feasibleSpace.min must be specified for parameterType: %v", param.ParameterType)))
    }

The webhook validator requires feasibleSpace.max or feasibleSpace.min to be specified.

Member

But when either min or max is empty, this validation does not reject the request, right?
So, shouldn't we implement the special case in the Suggestion Service?

Contributor Author

Yes, the validation webhook does not reject the request when either min or max is empty. But I created an example where:

    - name: batch_size
      parameterType: int
      feasibleSpace:
        min: "32"
        distribution: "logNormal"

For this, the Experiment is created, but the suggestion service does not sample any values, so the trials do not run, even though I handled this case (when either min or max is not specified) in pkg/suggestion/v1beta1/hyperopt/base_service.py.
Do we need to check the experiment_defaults.go file?
https://github.com/kubeflow/katib/blob/867c40a1b0669446c774cd6e770a5b7bbf1eb2f1/pkg/apis/controller/experiments/v1beta1/experiment_defaults.go
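A minimal sketch of the special case under discussion, assuming the Nevergrad fallback quoted earlier in this thread (mu=0, sigma=1) is the desired behaviour when a bound is missing. `lognormal_params` is a hypothetical helper, not the code in this PR, and it treats an empty string the same as an unset bound.

```python
import math


def lognormal_params(min_value, max_value):
    """Return (mu, sigma) for a log-normal prior from optional string bounds.

    Bounds arrive as strings (as in feasibleSpace); "" or None means "not set".
    Both bounds must be positive when present, because their logs are taken.
    """
    if not min_value or not max_value:
        # Assumed fallback, mirroring the Nevergrad snippet: standard log-normal.
        return 0.0, 1.0
    log_min = math.log(float(min_value))
    log_max = math.log(float(max_value))
    # Same ±3 sigma convention as the diff, applied in log space.
    return (log_min + log_max) / 2, (log_max - log_min) / 6


print(lognormal_params("32", "64"))  # both bounds set
print(lognormal_params("32", ""))    # max missing, falls back to (0.0, 1.0)
```

Whether a fixed fallback is appropriate for a parameter such as `batch_size` is a separate question, since a standard log-normal concentrates most of its mass on small values.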

+                    mu = (log_min + log_max) / 2
+                    sigma = (log_max - log_min) / 6
+
+                    if param.step:
+                        hyperopt_search_space[param.name] = hyperopt.hp.qlognormal(
+                            param.name,
+                            mu,
+                            sigma,
+                            float(param.step),
+                        )
+                    else:
+                        hyperopt_search_space[param.name] = hyperopt.hp.lognormal(
+                            param.name,
+                            mu,
+                            sigma,
+                        )
             elif param.type == CATEGORICAL or param.type == DISCRETE:
                 hyperopt_search_space[param.name] = hyperopt.hp.choice(
                     param.name, param.list
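For the `q*` variants used in this hunk, hyperopt documents the sampled value as `round(x / q) * q`, where `x` is drawn from the underlying distribution (for `qlognormal`, `x = exp(normal(mu, sigma))`). The sketch below reproduces that rule with the standard library only, so it illustrates the behaviour without calling hyperopt; `quantize` and `sample_qlognormal` are hypothetical names.

```python
import math
import random


def quantize(x, q):
    """Snap x to the nearest multiple of q, as the q* distributions do."""
    return round(x / q) * q


def sample_qlognormal(rng, mu, sigma, q):
    """Draw exp(Normal(mu, sigma)) and quantize it to a multiple of q."""
    return quantize(math.exp(rng.gauss(mu, sigma)), q)


# Parameters for batch_size (min=32, max=64) using the formulas in this diff.
mu = (math.log(32) + math.log(64)) / 2
sigma = (math.log(64) - math.log(32)) / 6

rng = random.Random(0)  # fixed seed so repeated runs give the same draws
samples = [sample_qlognormal(rng, mu, sigma, 1) for _ in range(5)]
print(samples)
```

Because INT parameters default to a step of 1, an `int` parameter with `logNormal` takes the `qlognormal` branch and yields whole numbers. As with `hp.normal`, the draws are not clipped to [min, max].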
5 changes: 5 additions & 0 deletions pkg/suggestion/v1beta1/internal/constant.py
@@ -19,3 +19,8 @@
 DOUBLE = "DOUBLE"
 CATEGORICAL = "CATEGORICAL"
 DISCRETE = "DISCRETE"
+
+UNIFORM = "UNIFORM"
+LOG_UNIFORM = "LOG_UNIFORM"
+NORMAL = "NORMAL"
+LOG_NORMAL = "LOG_NORMAL"
39 changes: 26 additions & 13 deletions pkg/suggestion/v1beta1/internal/search_space.py
@@ -82,25 +82,36 @@ def __str__(self):

     @staticmethod
     def convert_parameter(p):
+        distribution = (
+            p.feasible_space.distribution
+            if p.feasible_space.distribution != ""
+            and p.feasible_space.distribution is not None
+            and p.feasible_space.distribution != api.DISTRIBUTION_UNSPECIFIED
+            else None
+        )
+
         if p.parameter_type == api.INT:
             # Default value for INT parameter step is 1
-            step = 1
-            if p.feasible_space.step is not None and p.feasible_space.step != "":
-                step = p.feasible_space.step
+            step = p.feasible_space.step if p.feasible_space.step else 1
             return HyperParameter.int(
-                p.name, p.feasible_space.min, p.feasible_space.max, step
+                p.name, p.feasible_space.min, p.feasible_space.max, step, distribution
             )
+
         elif p.parameter_type == api.DOUBLE:
             return HyperParameter.double(
                 p.name,
                 p.feasible_space.min,
                 p.feasible_space.max,
                 p.feasible_space.step,
+                distribution,
             )
+
         elif p.parameter_type == api.CATEGORICAL:
             return HyperParameter.categorical(p.name, p.feasible_space.list)
+
         elif p.parameter_type == api.DISCRETE:
             return HyperParameter.discrete(p.name, p.feasible_space.list)
+
         else:
             logger.error(
                 "Cannot get the type for the parameter: %s (%s)",
@@ -110,33 +121,35 @@ def convert_parameter(p):


 class HyperParameter(object):
-    def __init__(self, name, type_, min_, max_, list_, step):
+    def __init__(self, name, type_, min_, max_, list_, step, distribution=None):
         self.name = name
         self.type = type_
         self.min = min_
         self.max = max_
         self.list = list_
         self.step = step
+        self.distribution = distribution

     def __str__(self):
-        if self.type == constant.INTEGER or self.type == constant.DOUBLE:
+        if self.type in [constant.INTEGER, constant.DOUBLE]:
             return (
-                "HyperParameter(name: {}, type: {}, min: {}, max: {}, step: {})".format(
-                    self.name, self.type, self.min, self.max, self.step
-                )
+                f"HyperParameter(name: {self.name}, type: {self.type}, min: {self.min}, "
+                f"max: {self.max}, step: {self.step}, distribution: {self.distribution})"
             )
         else:
             return "HyperParameter(name: {}, type: {}, list: {})".format(
                 self.name, self.type, ", ".join(self.list)
             )

     @staticmethod
-    def int(name, min_, max_, step):
-        return HyperParameter(name, constant.INTEGER, min_, max_, [], step)
+    def int(name, min_, max_, step, distribution=None):
+        return HyperParameter(
+            name, constant.INTEGER, min_, max_, [], step, distribution
+        )

     @staticmethod
-    def double(name, min_, max_, step):
-        return HyperParameter(name, constant.DOUBLE, min_, max_, [], step)
+    def double(name, min_, max_, step, distribution=None):
+        return HyperParameter(name, constant.DOUBLE, min_, max_, [], step, distribution)

     @staticmethod
     def categorical(name, lst):