diff --git a/README.md b/README.md
index 78fe7b5..75f4648 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,3 @@
-
 # ExpectationMaximization
 
 [![Docs](https://img.shields.io/badge/docs-dev-blue.svg)](https://dmetivie.github.io/ExpectationMaximization.jl/dev)
@@ -6,6 +5,8 @@
 This package provides a simple implementation of the Expectation Maximization (EM) algorithm used to fit mixture models.
 Due to [Julia](https://julialang.org/) amazing [dispatch](https://www.youtube.com/watch?v=kc9HwsxE1OY) systems, generic and reusable code spirit, and the [Distributions.jl](https://juliastats.org/Distributions.jl/stable/) package, the code while being very generic is both very expressive and fast! (Have a look at the [Benchmark section](https://dmetivie.github.io/ExpectationMaximization.jl/dev/benchmarks/))
 
+## What type of mixtures?
+
 In particular, it works on a lot of mixtures:
 
 - Mixture of Univariate continuous distributions
@@ -14,9 +15,13 @@ In particular, it works on a lot of mixtures:
 - Mixture of mixtures (univariate or multivariate and continuous or discrete)
 - More?
 
-Note that [Distributions](https://juliastats.org/Distributions.jl/stable/) *currently* does not allow `MixtureModel` to both have discrete and continuous components (but what does that? Rain).
+## What EM algorithm?
+
+So far, the classic EM algorithm and the Stochastic EM are implemented. Look at the [Bibliography section](https://dmetivie.github.io/ExpectationMaximization.jl/dev/biblio) for references.
+
+## How?
 
-Just define a [`mix::MixtureModel`](https://juliastats.org/Distributions.jl/stable/mixture/) and do `fit_mle(mix, y)` with your data `y` and that's it!
+Just define a [`mix::MixtureModel`](https://juliastats.org/Distributions.jl/stable/mixture/) and do `fit_mle(mix, y)` where `y` is your observation array (vector or matrix). That's it! For Stochastic EM, just do `fit_mle(mix, y, method = StochasticEM())`. **Have a look at the [Examples](https://dmetivie.github.io/ExpectationMaximization.jl/dev/examples/#Examples) section**.
 
 To work, the only requirements are that the components of the mixture `dist ∈ dists = components(mix)` considered (custom or coming from an existing package)
diff --git a/docs/src/benchmarks.md b/docs/src/benchmarks.md
index 7216aed..a6a5c00 100644
--- a/docs/src/benchmarks.md
+++ b/docs/src/benchmarks.md
@@ -1,33 +1,32 @@
-
 # Benchmarks
 
 I was inspired by [this benchmark](https://floswald.github.io/post/em-benchmarks/). I am not too sure how to do 100% fair comparisons across languages[^1]. There is a small overhead for using `PyCall` and `RCall`. I checked that it was small in my experimentation (~ few milliseconds?).
 
-I test only Gaussian Mixture since it is the most common type of mixture (remembering that this packages allow plenty of mixtures, and it should be fast in general).
+I test only the Gaussian Mixture case since it is the most common type of mixture (remember that this package allows plenty of other mixtures).
 
-In my code, I did not use fancy programming tricks, the speed only comes from Julia, `LogExpFunctions.jl` for `logsumexp!` function and `fit_mle` for each distribution's coming from `Distributions.jl` package.
+In the code, I did not use (too many) fancy programming tricks; the speed comes only from Julia, from `LogExpFunctions.jl` for the `logsumexp!` function, and from the `fit_mle` of each distribution coming from the `Distributions.jl` package.
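To illustrate the point above, here is a minimal, hypothetical sketch (not the package's actual source) of how `logsumexp` yields numerically stable E-step responsibilities; the function name `e_step_sketch` and the toy mixture are illustrative assumptions only:

```julia
using Distributions, LogExpFunctions

# Hypothetical E-step: compute log-responsibilities log γ[i, k] for each
# observation y[i] and component k without under/overflow.
function e_step_sketch(y, α, dists)
    LL = [log(α[k]) + logpdf(dists[k], y[i]) for i in eachindex(y), k in eachindex(dists)]
    c = logsumexp(LL, dims = 2)  # stable log-likelihood of each observation
    return LL .- c               # each row of exp.(result) sums to 1
end

logγ = e_step_sketch(randn(100), [0.5, 0.5], [Normal(-1, 1), Normal(2, 1)])
```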
 ## Univariate Gaussian mixture with 2 components
 
 I compare with [Sklearn.py](https://scikit-learn.org/stable/modules/generated/sklearn.mixture.GaussianMixture.html#sklearn.mixture.GaussianMixture)[^2], [mixtool.R](https://cran.r-project.org/web/packages/mixtools/index.html), [mixem.py](https://mixem.readthedocs.io/en/latest/index.html)[^3]. I wanted to try [mclust](https://cloud.r-project.org/web/packages/mclust/vignettes/mclust.html), but I did not manage to specify initial conditions
 
-Overall, [mixtool.R](https://cran.r-project.org/web/packages/mixtools/index.html) and [mixem.py](https://mixem.readthedocs.io/en/latest/index.html) were constructed in a similar spirit as this package hence easy to use for me. [Sklearn.py](https://scikit-learn.org/stable/modules/generated/sklearn.mixture.GaussianMixture.html#sklearn.mixture.GaussianMixture) is build to fit the Sklearn format (all in one). `GaussianMixturesModel.jl` is build with a similar vibe.
+Overall, [mixtool.R](https://cran.r-project.org/web/packages/mixtools/index.html) and [mixem.py](https://mixem.readthedocs.io/en/latest/index.html) were constructed in a similar spirit to this package, making them easy to use for me. [Sklearn.py](https://scikit-learn.org/stable/modules/generated/sklearn.mixture.GaussianMixture.html#sklearn.mixture.GaussianMixture) is built to match the Sklearn format (all in one). `GaussianMixturesModel.jl` is built with a similar vibe.
 
-If you have comments to improve these benchmarks, comments are welcomed.
+If you have comments to improve these benchmarks, they are welcome.
 
-You can find the benchmark code in [here](https://github.com/dmetivie/ExpectationMaximization.jl/tree/master/benchmarks/benchmark_v1_K2_unidim.jl).
+You can find the benchmark code [here](https://github.com/dmetivie/ExpectationMaximization.jl/tree/master/benchmarks/benchmark_v1_K2_unidim.jl).
 
-**Conclusion: for Gaussian mixture, `ExpectationMaximization.jl` is about 2 to 10 times faster than Python or R implementations** and about as fast as the specialized Julia package `GaussianMixturesModel.jl`.
+**Conclusion: for Gaussian mixtures, `ExpectationMaximization.jl` is about 2 to 10 times faster than Python or R implementations** and about as fast as the specialized Julia package `GaussianMixturesModel.jl`.
 
 ![timing_K_2_rudimentary_wo_memory_leak](https://user-images.githubusercontent.com/46794064/227195619-c75b9276-932b-4029-8b49-6cce919acc87.svg)
 
-[^1]: Note that `@btime` with `RCall` and `PyCall` might produce a small-time overhead compare to the true R/Python time see [here for example](https://discourse.julialang.org/t/benchmarking-julia-vs-python-vs-r-with-pycall-and-rcall/37308).
-I did compare with `R` `microbenchmark` and Python `timeit` and it produces very similar timing but in my experience `BenchmarkTools` is smarter and simpler to use, i.e. it will figure out alone the number of repetition to do in function of the run.
+[^1]: Note that `@btime` with `RCall` and `PyCall` might produce a small time overhead compared to the pure R/Python time; see [here for example](https://discourse.julialang.org/t/benchmarking-julia-vs-python-vs-r-with-pycall-and-rcall/37308).
+I did compare with `R` `microbenchmark` and Python `timeit` and they produced very similar timings, but in my experience `BenchmarkTools.jl` is smarter and simpler to use, i.e. it figures out on its own the number of repetitions to perform depending on the run.
 
-[^2]: There is a suspect triggers warning regarding K-means which I do not want to use here.
-I asked a question [here](https://github.com/scikit-learn/scikit-learn/discussions/25916). It lead to [this issue](https://github.com/scikit-learn/scikit-learn/issues/26015) and [that PR](https://github.com/scikit-learn/scikit-learn/pull/26021). Turns out even if intial condition were provided K-mean were still computed. However to this day 23-11-29 with `scikit-learn 1.3.2` it still get the warning. Maybe it will be in the next release? I also noted this recent [PR](https://github.com/scikit-learn/scikit-learn/pull/26416).
-Last, the step by step likelihood of `Sklearn` is not the same as outputted by `ExpectationMaximization.jl` and [mixtool.R](https://cran.r-project.org/web/packages/mixtools/index.html) (both agree), so I am a bit suspicious.
+[^2]: There is a suspicious warning triggered regarding K-means, which I do not want to use here. I asked a question [here](https://github.com/scikit-learn/scikit-learn/discussions/25916). It led to [this issue](https://github.com/scikit-learn/scikit-learn/issues/26015) and [that PR](https://github.com/scikit-learn/scikit-learn/pull/26021). It turns out that even if initial conditions were provided, K-means was still computed. However, to this day (23-11-29) with `scikit-learn 1.3.2` it still gets the warning. Maybe it will be fixed in the next release? I also noted this recent [PR](https://github.com/scikit-learn/scikit-learn/pull/26416).
+Last, the step-by-step likelihood of `Sklearn` is not the same as the one outputted by `ExpectationMaximization.jl` and [mixtool.R](https://cran.r-project.org/web/packages/mixtools/index.html) (both agree), so I am a bit suspicious.
 
-[^3]: It overflows very quickly for $n>500$ or so. I think it is because of naive implementation of [`logsumexp`](https://github.com/sseemayer/mixem/blob/2ffd990b22a12d48313340b427feae73bcf6062d/mixem/em.py#L5). So I eventually did not include the result in the benchmark.
+[^3]: It overflows very quickly for $n>500$ or so. I think it is because of the implementation of [`logsumexp`](https://github.com/sseemayer/mixem/blob/2ffd990b22a12d48313340b427feae73bcf6062d/mixem/em.py#L5). So I eventually did not include the result in the benchmark.
diff --git a/docs/src/biblio.md b/docs/src/biblio.md
index 9c350cf..908f2fa 100644
--- a/docs/src/biblio.md
+++ b/docs/src/biblio.md
@@ -1,4 +1,3 @@
-
 # Bibliography
 
 ## Theory
diff --git a/docs/src/examples.md b/docs/src/examples.md
index 5958e01..4f473d7 100644
--- a/docs/src/examples.md
+++ b/docs/src/examples.md
@@ -1,4 +1,3 @@
-
 # Examples
 
 ```@example 1
diff --git a/docs/src/fit_mle.md b/docs/src/fit_mle.md
index b4089c5..d83f4e3 100644
--- a/docs/src/fit_mle.md
+++ b/docs/src/fit_mle.md
@@ -1,4 +1,3 @@
-
 # [Instance vs Type version](@id InstanceVType)
 
 This package relies heavily on the types and methods defined in `Distributions.jl` e.g., `fit_mle` and `logpdf`.
diff --git a/docs/src/index.md b/docs/src/index.md
index 3c46a79..776d1a4 100644
--- a/docs/src/index.md
+++ b/docs/src/index.md
@@ -1,8 +1,10 @@
-
 # ExpectationMaximization.jl
 
 This package provides a simple implementation of the Expectation Maximization (EM) algorithm used to fit mixture models.
 Due to [Julia](https://julialang.org/) amazing [dispatch](https://www.youtube.com/watch?v=kc9HwsxE1OY) systems, generic and reusable code spirit, and the [Distributions.jl](https://juliastats.org/Distributions.jl/stable/) package, the code while being very generic is both very expressive and fast[^1]!
+
+## What type of mixtures?
+
 In particular, it works on a lot of mixtures:
 
 - Mixture of Univariate continuous distributions
@@ -11,7 +13,7 @@ In particular, it works on a lot of mixtures:
 - Mixture of mixtures (univariate or multivariate and continuous or discrete)
 - More?
 
-Note that [Distributions](https://juliastats.org/Distributions.jl/stable/) *currently* does not allow `MixtureModel` to both have discrete and continuous components (but what does that? Rain).
+## How?
 
 Just define a [`mix::MixtureModel`](https://juliastats.org/Distributions.jl/stable/mixture/) and do `fit_mle(mix, y)` with your data `y` and that's it! **Have a look at the [examples](@ref Examples) section**.
@@ -27,7 +29,11 @@ In case 2. is not explicit known, you can always implement a numerical scheme, i
 Or, when possible, represent your “difficult” distribution as a mixture of simple terms. (I had [this](https://stats.stackexchange.com/questions/63647/estimating-parameters-of-students-t-distribution) in mind, but it is not directly a mixture model.)
 
+!!! note
+    [Distributions.jl](https://juliastats.org/Distributions.jl/stable/) *currently* does not allow `MixtureModel` to have both discrete and continuous components[^2].
+
 [^1]: Have a look at the [Benchmark section](@ref Benchmarks).
+[^2]: Rain is a good example of a mixture having both a discrete component (a `Delta` distribution at `0`) and a continuous one (`Exponential`, `Gamma`, ...).
 
 ## Algorithms
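As a companion to the `fit_mle` usage described in the README and index changes above, a minimal runnable sketch; the true mixture, the initial guess, and the sample size are arbitrary choices for illustration:

```julia
using Distributions, ExpectationMaximization

# Simulated data from a known mixture, standing in for real observations.
y = rand(MixtureModel([Normal(-1.0, 0.5), Normal(2.0, 1.0)], [0.4, 0.6]), 10_000)

# Initial guess: EM starts from these components and weights.
mix_guess = MixtureModel([Normal(0.0, 1.0), Normal(1.0, 1.0)], [0.5, 0.5])

fitted = fit_mle(mix_guess, y)                               # classic EM
fitted_sem = fit_mle(mix_guess, y, method = StochasticEM())  # Stochastic EM
```

The initial guess matters: EM converges to a local maximum of the likelihood, so components should be started at distinct values.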