Written by KimRass
$$\pi^{*}(a|s)=\left\{
\begin{array}{c l}
1, & \text{if } a = \underset{a \in A}{\operatorname{argmax}}\,Q^{*}(s, a)\\
0, & \text{otherwise}
\end{array}\right.$$
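
A minimal sketch of the greedy policy above, assuming the action values for a single state are stored in a 1-D array indexed by action. The function name `greedy_policy` and the tie-breaking rule (lowest index wins) are illustrative assumptions, not part of the original notes.

```python
import numpy as np


def greedy_policy(q_values):
    """Return pi*(.|s) as a one-hot probability vector over actions.

    q_values: hypothetical 1-D array holding Q*(s, a) for every action a.
    """
    q_values = np.asarray(q_values, dtype=float)
    probs = np.zeros_like(q_values)
    # Probability 1 on the maximizing action, 0 elsewhere.
    # np.argmax breaks ties by picking the lowest index (an assumption here).
    probs[np.argmax(q_values)] = 1.0
    return probs
```

For example, `greedy_policy([1.0, 3.0, 2.0])` puts all probability mass on action 1.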
$$\pi(a|s) =
\left\{
\begin{array}{l l}
\dfrac{\epsilon}{|A|} + 1 - \epsilon, & \text{if } a = \underset{a \in A}{\operatorname{argmax}}\,Q^{\pi}(s, a)\\
\dfrac{\epsilon}{|A|}, & \text{otherwise}
\end{array}
\right.$$
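
The $\epsilon$-greedy probabilities above can be sketched as follows, again assuming a 1-D array of action values for one state. The names `epsilon_greedy_probs` and `sample_action` are hypothetical helpers introduced for illustration.

```python
import numpy as np


def epsilon_greedy_probs(q_values, epsilon):
    """Return pi(.|s) for an epsilon-greedy policy over |A| actions."""
    q_values = np.asarray(q_values, dtype=float)
    n_actions = q_values.shape[0]
    # Every action receives the exploration mass epsilon / |A|.
    probs = np.full(n_actions, epsilon / n_actions)
    # The greedy action additionally receives 1 - epsilon.
    probs[np.argmax(q_values)] += 1.0 - epsilon
    return probs


def sample_action(q_values, epsilon, rng):
    """Draw one action index from the epsilon-greedy distribution."""
    probs = epsilon_greedy_probs(q_values, epsilon)
    return int(rng.choice(len(probs), p=probs))


rng = np.random.default_rng(0)
action = sample_action([1.0, 3.0, 2.0], epsilon=0.3, rng=rng)
```

With $\epsilon = 0.3$ and three actions, the greedy action gets probability $0.1 + 0.7 = 0.8$ and each other action gets $0.1$; setting $\epsilon = 0$ recovers the greedy policy.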
$$\begin{align}
\mathbb{E}_{x \sim P}[f(x)] &= \sum_{x \in X}p(x)f(x)\\
&= \sum_{x \in X}q(x)\frac{p(x)}{q(x)}f(x)\\
&= \mathbb{E}_{x \sim Q}\left[\frac{p(x)}{q(x)}f(x)\right]
\end{align}$$

$$\prod_{k=t}^{T-1}\frac{\pi(A_{k}|S_{k})}{\mu(A_{k}|S_{k})}$$
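
A small numerical check of the importance-sampling identity and the trajectory ratio, as a sketch under stated assumptions: $X$ is a finite set indexed $0, \dots, n-1$, $q(x) > 0$ wherever $p(x) > 0$, and the distributions, function values, and helper names below are made up for illustration.

```python
import numpy as np


def expectation(p, f):
    """Exact E_{x~P}[f(x)] = sum_x p(x) f(x) over a finite set."""
    return float(np.sum(np.asarray(p) * np.asarray(f)))


def importance_sampling_estimate(p, q, f, n_samples, rng):
    """Monte Carlo estimate of E_{x~P}[f(x)] using samples drawn from Q."""
    p, q, f = (np.asarray(a, dtype=float) for a in (p, q, f))
    x = rng.choice(len(q), size=n_samples, p=q)  # x ~ Q
    # Reweight each sample by the ratio p(x) / q(x).
    return float(np.mean(p[x] / q[x] * f[x]))


def trajectory_ratio(pi_probs, mu_probs):
    """Product over k = t..T-1 of pi(A_k|S_k) / mu(A_k|S_k)."""
    pi_probs = np.asarray(pi_probs, dtype=float)
    mu_probs = np.asarray(mu_probs, dtype=float)
    return float(np.prod(pi_probs / mu_probs))


# Illustrative (made-up) target P, behavior Q, and function values f.
p = [0.5, 0.3, 0.2]
q = [0.2, 0.2, 0.6]
f = [1.0, 2.0, 3.0]

exact = expectation(p, f)
estimate = importance_sampling_estimate(
    p, q, f, n_samples=200_000, rng=np.random.default_rng(0)
)
ratio = trajectory_ratio([0.9, 0.5], [0.5, 0.5])
```

Here `exact` is $0.5 \cdot 1 + 0.3 \cdot 2 + 0.2 \cdot 3 = 1.7$, and `estimate` should land close to it even though every sample comes from $Q$. In off-policy learning, `pi_probs` and `mu_probs` would hold the target-policy and behavior-policy probabilities of the actions actually taken along one trajectory.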
\frac{1}{2}
\underset{a \in A}{\operatorname{argmax}}