Endless exploitation cycle #468
Minimal Example

```python
import numpy as np
import pandas as pd
import torch
from botorch.test_functions import Rastrigin
from matplotlib import pyplot as plt

from baybe import Campaign
from baybe.acquisition.acqfs import qLogExpectedImprovement
from baybe.objectives import SingleTargetObjective
from baybe.parameters import NumericalDiscreteParameter
from baybe.recommenders.meta.sequential import TwoPhaseMetaRecommender
from baybe.recommenders.pure.bayesian.botorch import BotorchRecommender
from baybe.searchspace import SearchSpace
from baybe.simulation.lookup import look_up_targets
from baybe.targets import NumericalTarget
from baybe.utils.random import set_random_seed

N_MC_ITERATIONS = 10
N_DOE_ITERATIONS = 15
BATCH_SIZE = 1
POINTS_PER_DIM = 10
DIMENSION = 1

TEST_FUNCTION = Rastrigin(DIMENSION)
BOUNDS = TEST_FUNCTION.bounds


def blackbox(df: pd.DataFrame, /) -> pd.DataFrame:
    """A callable whose internal logic is unknown to the algorithm."""
    df["Target"] = TEST_FUNCTION(torch.tensor(df.values))
    return df


# Discretize each dimension of the test function into equidistant points
parameters = [
    NumericalDiscreteParameter(
        name=f"x_{k+1}",
        values=list(np.linspace(BOUNDS[0, k], BOUNDS[1, k], POINTS_PER_DIM)),
        tolerance=0.01,
    )
    for k in range(DIMENSION)
]
searchspace = SearchSpace.from_product(parameters=parameters)
objective = SingleTargetObjective(target=NumericalTarget(name="Target", mode="MIN"))
AC = qLogExpectedImprovement

campaign = Campaign(
    searchspace=searchspace,
    objective=objective,
    recommender=TwoPhaseMetaRecommender(
        recommender=BotorchRecommender(
            allow_repeated_recommendations=True, acquisition_function=AC()
        )
    ),
)


def get_acqf_values(campaign: Campaign):
    """Evaluate the acquisition function on all candidates of the search space."""
    surrogate = campaign.get_surrogate()
    acqf_cls = AC
    acqf = acqf_cls().to_botorch(
        surrogate, searchspace, objective, campaign.measurements
    )
    return acqf(
        torch.tensor(
            searchspace.transform(searchspace.discrete.exp_rep).values
        ).unsqueeze(-2)
    )


set_random_seed(0)
x = parameters[0].values

for iter in range(N_DOE_ITERATIONS):
    measured = campaign.recommend(BATCH_SIZE)

    # Plot posterior, measurements, and acquisition values once a model exists
    if iter >= 1:
        p = campaign.posterior(searchspace.discrete.exp_rep)
        acqf = get_acqf_values(campaign).detach().numpy()
        m = campaign.measurements
        mean = p.mean.detach().squeeze().numpy()
        std = p.stddev.detach().numpy().squeeze()

        fig, ax1 = plt.subplots()
        ax2 = ax1.twinx()
        ax1.errorbar(x, mean, std, fmt="none", ecolor="gray")
        ax1.plot(m["x_1"], m["Target"], "ok")
        ax1.hlines(campaign.measurements["Target"].min(), min(x), max(x), color="g")
        ax2.plot(x, acqf, "r")
        plt.show()

    # Evaluate the recommendations and feed them back into the campaign
    look_up_targets(measured, campaign.targets, blackbox, "error")
    campaign.add_measurements(measured)
    print(measured)
```
@AdrianSosic we had a bug report more than a year ago about a similar observation. The conclusion back then was numerical artifacts that are exponentially less likely for batch sizes > 1 (see slide 8).
Yes, that's what I meant with

> I'm not so sure this has something to do with numerics. In fact, it makes sort of sense to me that larger batch sizes mitigate the effect, since they result in more exploration (due to batch diversification). So I still think there is a deeper issue that needs to be investigated.
Observation
Already a while ago, we sometimes noticed "strange" behavior (especially in discrete domains) where a campaign would keep recommending the same point over and over again. This was one of the motivations for setting the default of either `allow_recommending_already_measured` or `allow_repeated_recommendations` to `False`, both of which are an effective way to avoid getting stuck in the endless cycle / to force the algorithm to explore further.

Yesterday, @Hrovatin stumbled over the same effect again, noticing that two (otherwise identical) campaign settings can perform drastically differently depending on whether i) new measurements are added to the existing campaign after each iteration or ii) a new campaign is reinitialized with the extended dataset (see figure below). The reason is in fact the one described above: in the latter case, the flags cannot take effect because the campaign metadata is lost.
Investigation
A deeper investigation led me to the hypothesis that this is ultimately caused by the inherent mechanics of expected improvement (and other acquisition functions), which can simply happen to under-explore. I've quickly drafted a minimal example that consistently reproduces the effect (see post below).

In this example, the iteration loop quickly reaches a dead end where the recommendations keep cycling between two equal minima. This steady state is shown graphically below (black dots = data, green line = current minimum, gray = posterior, red = acqf values). According to EI, there is not enough benefit in exploring unseen points but, at the same time, re-observing the minima also doesn't change the situation, since their posterior variance is already effectively zero.
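To make the "not enough benefit" part concrete, here is a small standalone illustration (not BayBE code) of the standard analytic EI formula for minimization, EI(x) = (f_best − μ(x)) Φ(z) + σ(x) φ(z) with z = (f_best − μ(x)) / σ(x), evaluated at made-up posterior values that mimic the plotted steady state:

```python
import numpy as np
from scipy.stats import norm


def expected_improvement(mu, sigma, f_best):
    """Analytic EI for minimization, given posterior mean/std at a point."""
    sigma = np.maximum(sigma, 1e-12)  # guard against division by zero
    z = (f_best - mu) / sigma
    return (f_best - mu) * norm.cdf(z) + sigma * norm.pdf(z)


f_best = 1.0  # hypothetical current best (minimum) observation
# At an already-measured minimum: mean equals the best value, variance ~ 0.
print(expected_improvement(mu=1.0, sigma=1e-6, f_best=f_best))  # ~0
# At an unexplored point: mean far above the best value, moderate uncertainty.
print(expected_improvement(mu=25.0, sigma=5.0, f_best=f_best))  # also ~0
```

In both cases, EI is essentially zero, so the argmax can end up cycling between the already-measured minima instead of exploring.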
Interpretation and Way Forward
Overall, it seems that the issue is too much trust in the current model or, in other words, that the model uncertainty is not adequately estimated. In fact, there is evidence that this can cause BO with expected improvement to get stuck, as described in "A hierarchical expected improvement method for Bayesian optimization" [Chen et al.].

If this is the case, a potential avenue is to replace the point-estimate-based GP hyperparameter fitting with an approach that properly takes the posterior hyperparameter distribution into account, e.g. Bayesian model averaging or deterministic approximation schemes. A more flexible fitting approach is on the roadmap anyway...
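For illustration only, here is roughly what hyperparameter marginalization looks like in plain BoTorch (outside BayBE, using `SaasFullyBayesianSingleTaskGP` with NUTS-based fitting); this is just a sketch of the general idea on toy data, not a proposal for the concrete BayBE integration:

```python
import torch
from botorch.fit import fit_fully_bayesian_model_nuts
from botorch.models.fully_bayesian import SaasFullyBayesianSingleTaskGP

# Toy data standing in for campaign measurements (1D input, single target).
train_X = torch.rand(8, 1, dtype=torch.double)
train_Y = torch.sin(6 * train_X)  # placeholder objective values

# Instead of a single point estimate of the GP hyperparameters, draw posterior
# samples via NUTS and average predictions over them (a form of Bayesian model
# averaging), which tends to give less overconfident uncertainty estimates.
model = SaasFullyBayesianSingleTaskGP(train_X, train_Y)
fit_fully_bayesian_model_nuts(
    model, warmup_steps=128, num_samples=128, thinning=8, disable_progbar=True
)

test_X = torch.linspace(0, 1, 21, dtype=torch.double).unsqueeze(-1)
posterior = model.posterior(test_X)
print(posterior.mean.shape)  # one predictive mean per retained hyperparameter sample
```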
Until then, it's probably best to keep the `allow_*` flags active.