---
title: "Bayesian Ensembling for Contextual Bandit Models"
author: "Joseph Lawson"
date: "Oct 23, 2023"
---
## Abstract
Contextual bandit models are a primary tool for sequential decision making, with applications ranging from clinical trials to e-commerce. While there are numerous bandit algorithms that achieve optimal regret bounds and show strong performance on benchmark problems, algorithm selection and tuning in any given application remains a major open problem. We propose the Bayesian Basket of Bandits (B3), a meta-learning algorithm that automatically ensembles a set (basket) of candidate algorithms to produce an algorithm that dominates all those in the basket. The method works by treating the evolution of a bandit algorithm as a Markov decision process in which the states are posterior distributions over model parameters, and then applying approximate Bayesian dynamic programming to learn an optimal ensemble. We derive both Bayesian and frequentist convergence results for the cumulative discounted utility. In simulation experiments, the proposed method achieves lower regret than state-of-the-art algorithms including Thompson Sampling, upper confidence bound methods, and Information-Directed Sampling.
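To make the "posterior distributions as states" idea concrete, here is a minimal toy sketch. It is not the paper's B3 algorithm: the arms are context-free Bernoulli, the basket contains just Thompson Sampling and a UCB rule acting on the shared Beta posterior, and the meta-step is a simple epsilon-greedy choice over the basket based on realized reward, standing in for the approximate Bayesian dynamic programming step. All names (`run_basket`, the means `[0.2, 0.5, 0.8]`, epsilon = 0.1) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def thompson(alpha, beta):
    # Sample one draw from each arm's Beta posterior; play the argmax.
    return int(np.argmax(rng.beta(alpha, beta)))

def ucb(alpha, beta, t):
    # UCB on posterior means with a logarithmic exploration bonus.
    n = np.maximum(alpha + beta - 2, 1)  # pseudo-counts beyond the Beta(1,1) prior
    mean = alpha / (alpha + beta)
    return int(np.argmax(mean + np.sqrt(2 * np.log(t + 1) / n)))

def run_basket(true_means, horizon=2000, eps=0.1):
    k = len(true_means)
    alpha, beta = np.ones(k), np.ones(k)  # the Beta posterior is the "state"
    basket_reward = np.zeros(2)           # reward credited to each base algorithm
    basket_count = np.ones(2)
    total = 0.0
    for t in range(horizon):
        # Meta-step: epsilon-greedy over the basket (a crude stand-in for
        # the paper's approximate Bayesian dynamic programming).
        if rng.random() < eps:
            choice = int(rng.integers(2))
        else:
            choice = int(np.argmax(basket_reward / basket_count))
        arm = thompson(alpha, beta) if choice == 0 else ucb(alpha, beta, t)
        r = float(rng.random() < true_means[arm])
        alpha[arm] += r           # conjugate posterior update = state transition
        beta[arm] += 1.0 - r
        basket_reward[choice] += r
        basket_count[choice] += 1
        total += r
    return total / horizon

avg = run_basket([0.2, 0.5, 0.8])
```

With the best arm paying 0.8, the average per-round reward of the ensemble should approach 0.8 as the posterior state concentrates.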
### Advisor(s)
Eric Laber
---
title: "Minimax Mixing Time of the Metropolis-Adjusted Langevin Algorithm for Log-Concave Sampling"
author: "Keru Wu"
date: "Oct 23, 2023"
---
## Abstract
We study the mixing time of the Metropolis-adjusted Langevin algorithm (MALA) for sampling from a log-smooth and strongly log-concave distribution. We establish its optimal minimax mixing time under a warm start. Our main contribution is twofold. First, for a d-dimensional log-concave density with condition number kappa, we show that MALA with a warm start mixes in kappa times sqrt(d) iterations up to logarithmic factors. This improves upon previous work in the dependence on either the condition number kappa or the dimension d. Our proof relies on comparing the leapfrog integrator with the continuous Hamiltonian dynamics, where we establish a new concentration bound for the acceptance rate. Second, we prove a spectral-gap-based mixing time lower bound for reversible MCMC algorithms on general state spaces. We apply this lower bound to construct a hard distribution for which MALA requires at least kappa times sqrt(d) steps to mix. The lower bound for MALA matches our upper bound in terms of condition number and dimension. Finally, numerical experiments are included to validate our theoretical results.
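For readers unfamiliar with the algorithm being analyzed, here is a minimal sketch of a generic MALA update: a Langevin (gradient-plus-noise) proposal followed by a Metropolis-Hastings accept/reject correction. This illustrates only the standard sampler, not the paper's warm-start analysis or its mixing-time bounds; the standard-Gaussian target (condition number kappa = 1) and the step size are illustrative assumptions.

```python
import numpy as np

def mala(logpi, grad_logpi, x0, step, n_iters, rng):
    """Metropolis-adjusted Langevin: Langevin proposal + MH correction."""
    x = np.asarray(x0, dtype=float)
    samples = []

    def logq(a, b):
        # Log density (up to a constant) of proposing a from b.
        d = a - b - step * grad_logpi(b)
        return -float(np.dot(d, d)) / (4.0 * step)

    for _ in range(n_iters):
        # Langevin proposal: gradient drift plus Gaussian noise.
        y = x + step * grad_logpi(x) + np.sqrt(2.0 * step) * rng.standard_normal(x.shape)
        # Metropolis-Hastings correction makes the chain exactly pi-reversible.
        log_alpha = logpi(y) + logq(x, y) - logpi(x) - logq(y, x)
        if np.log(rng.random()) < log_alpha:
            x = y
        samples.append(x.copy())
    return np.array(samples)

# Target: standard Gaussian in d = 2 (log-smooth, strongly log-concave).
rng = np.random.default_rng(1)
logpi = lambda x: -0.5 * float(np.dot(x, x))
grad_logpi = lambda x: -x
chain = mala(logpi, grad_logpi, np.zeros(2), step=0.5, n_iters=5000, rng=rng)
```

After discarding burn-in, the empirical mean and standard deviation of the chain should be close to 0 and 1, the moments of the target.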
### Advisor(s)
Yuansi Chen