Update readme for plymi-package instructions; update html
rsokl committed Jan 18, 2020
1 parent 914db32 commit 6ec6eb7
Showing 33 changed files with 3,543 additions and 665 deletions.
16 changes: 12 additions & 4 deletions README.md
@@ -46,10 +46,17 @@ pip install sphinx-rtd-theme==0.4.3
pip install jupytext==1.3.0rc1
```

-Using this environment, you should now be able to run sphinx to build the html for this site from the source-code. To do this, navigate to the directory named Python in this repository, and then run:
+and install the `plymi` code base from this repo. Clone the present repository and run:

```shell
-python -m sphinx . _build -j4
+pip install .
+```
+
+Using this environment, you should now be able to run sphinx to build the html for this site from the source-code. To do this, run the following commands in your Python terminal:
+
+```python
+import plymi
+plymi.convert_src_to_html("./Python") # point to the dir containing `conf.py`
```

This will convert all of the "restructured text" (.rst) files to html via `sphinx`. `jupytext` is responsible for converting the markdown (.md) files to jupyter notebooks (.ipynb) and then `nbsphinx` converts these notebooks to html.
@@ -60,8 +67,9 @@ Note that, if you are introducing a new page to the site or are doing anything t
# Publishing HTML for this site
Once you have built the html and have verified that it looks good to you, navigate to the top level of the repository and run:

-```shell
-python build_to_doc.py
+```python
+import plymi
+plymi.build_to_doc(".") # point to the top-level dir (contains both `docs/` and `docs_backup`)
```

This will back up your current `docs` directory and move the html from `_builds` to `docs`. It will also ensure that two essential "meta" files, `.nojekyll` and `CNAME`, are present. The former is required for GitHub Pages to build the site correctly; the latter ensures that the canonical name for the site is `pythonlikeyoumeantit.com`.
Binary file added docs/.doctrees/Module6_Testing/Hypothesis.doctree
Binary file not shown.
Binary file not shown.
Binary file added docs/.doctrees/Module6_Testing/Pytest.doctree
Binary file not shown.
Binary file modified docs/.doctrees/changes.doctree
Binary file not shown.
Binary file modified docs/.doctrees/environment.pickle
Binary file not shown.
Binary file modified docs/.doctrees/index.doctree
Binary file not shown.
Binary file modified docs/.doctrees/intro.doctree
Binary file not shown.
Binary file added docs/.doctrees/module_6.doctree
Binary file not shown.
327 changes: 327 additions & 0 deletions docs/Module6_Testing/Hypothesis.html

Large diffs are not rendered by default.

640 changes: 640 additions & 0 deletions docs/Module6_Testing/Intro_to_Testing.html

Large diffs are not rendered by default.

797 changes: 797 additions & 0 deletions docs/Module6_Testing/Pytest.html

Large diffs are not rendered by default.

131 changes: 131 additions & 0 deletions docs/_sources/Module6_Testing/Hypothesis.md.txt
@@ -0,0 +1,131 @@
---
jupyter:
  jupytext:
    text_representation:
      extension: .md
      format_name: markdown
      format_version: '1.2'
      jupytext_version: 1.3.0
  kernelspec:
    display_name: Python [conda env:.conda-jupy] *
    language: python
    name: conda-env-.conda-jupy-py
---

<!-- #raw raw_mimetype="text/restructuredtext" -->
.. meta::
   :description: Topic: Writing tests for your code, Difficulty: Easy, Category: Section
   :keywords: test, automated, pytest, parametrize, fixture, suite, decorator, clean directory
<!-- #endraw -->

<!-- #region -->
# Describing Data with Hypothesis

It is often the case that the process of *describing our data* is by far the heaviest burden that we must bear when writing tests. This process of assessing "what variety of values should I test?", "have I thought of all the important edge-cases?", and "how much is 'enough'?" will crop up with nearly every test that we write.
Indeed, these are questions that you may have been asking yourself when writing `test_count_vowels_basic` and `test_merge_max_mappings` in the previous sections of this module.

[Hypothesis](https://hypothesis.readthedocs.io/) is a Python library that lets us write a _description_ (a specification, to be more precise) of the data that we want to use to exercise our test.
It then *generates* test cases that satisfy this description and runs our test on these cases.

Let's look at a simple example of Hypothesis in action.
In the preceding section, we learned to use pytest's parameterization mechanism to test properties of code over a set of values.
For example, we wrote the following trivial test:

```python
import pytest

# A simple parameterized test that only tests a few, conservative inputs.
# Note that this test must be run by pytest to work properly
@pytest.mark.parametrize("size", [0, 1, 2, 3])
def test_range_length(size):
    assert len(range(size)) == size
```

which tests the property that `range(n)` has a length of `n` for any non-negative integer value of `n`.
Well, it isn't *really* testing this property for all non-negative integers; clearly it is only testing the values 0-3.
We should probably also check much larger numbers and perhaps traverse various orders of magnitude (i.e. factors of ten) in our parameterization scheme.
No matter what set of values we land on, it seems like we will have to eventually throw our hands up and say "okay, that seems good enough."
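
To make the "orders of magnitude" idea concrete, here is a minimal sketch that hand-picks one size per power of ten. The names `SIZES` and `check_range_length` are illustrative only, and a plain loop stands in for pytest's parameterization so that the snippet runs on its own:

```python
# One hand-picked value per order of magnitude: 10**0 up to 10**10
SIZES = [10 ** exponent for exponent in range(11)]


def check_range_length(size):
    # The property under test: range(n) has a length of n
    assert len(range(size)) == size


# Evaluate the check once for each hand-picked size
for size in SIZES:
    check_range_length(size)
```

Even this wider scheme only exercises eleven values, so the question of "how much is enough?" remains.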

Instead of manually specifying the data to pass to `test_range_length`, let's use Hypothesis to simply describe the data:
<!-- #endregion -->

<!-- #region -->
```python
from hypothesis import given

# Hypothesis provides so-called "strategies" for us
# to describe our data
import hypothesis.strategies as st

# Using hypothesis to test any integer value in [0, 10 ** 10]
@given(size=st.integers(min_value=0, max_value=1E10))
def test_range_length(size):
    assert len(range(size)) == size
```
<!-- #endregion -->

<!-- #region -->
Here we have specified that the `size` value in our test should take on any integer value within $[0, 10^{10}]$.
We did this by using the `integers` "strategy" that is provided by Hypothesis: `st.integers(min_value=0, max_value=1E10)`.
When we execute the resulting test (which can simply be run within a Jupyter cell or via pytest), this will trigger Hypothesis to generate test cases based on this specification;
by default, Hypothesis will generate 100 test cases - an amount that we can configure - and will evaluate our test for each one of them.

```python
# Running this test once will trigger Hypothesis to
# generate 100 values based on the description of our data,
# and it will execute the test using each one of those values
>>> test_range_length()
```
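
As a rough sketch of how the number of generated cases might be configured, Hypothesis's `settings` decorator accepts a `max_examples` argument. The test name `test_range_length_more_examples` and the `seen` list are illustrative additions that record what Hypothesis generated; they are not part of the example above:

```python
from hypothesis import given, settings
import hypothesis.strategies as st

seen = []  # illustrative: records each value that Hypothesis passes to the test


# Ask Hypothesis to generate up to 500 cases instead of the default 100
@settings(max_examples=500)
@given(size=st.integers(min_value=0, max_value=10 ** 10))
def test_range_length_more_examples(size):
    seen.append(size)
    assert len(range(size)) == size


# Calling the decorated function runs the test on every generated case
test_range_length_more_examples()
```

Inspecting `seen` afterwards is a simple way to get a feel for the kinds of values that the `integers` strategy produces.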

With great ease, we were able to replace our pytest-parameterized test, which only very sparsely tested the property at hand, with a much more robust, hypothesis-driven test.
This will be a recurring trend: we will generally produce much more robust tests by _describing_ our data with Hypothesis, rather than manually specifying test values.

The rest of this section will be dedicated to learning about the Hypothesis library and how we can leverage it to write powerful tests.
<!-- #endregion -->

<!-- #region -->
<div class="alert alert-warning">

**Hypothesis is _very_ effective...**:

You may be wondering why, in the preceding example, I arbitrarily picked $10^{10}$ as the upper bound on the integer values fed to the test.
I actually didn't write the test that way initially.
Instead, I wrote the more general test:

```python
@given(size=st.integers(min_value=0))
def test_range_length(size):
    assert len(range(size)) == size
```

which places no formal upper bound on the integers that Hypothesis will generate.
However, this test immediately found an issue (I hesitate to call it an outright bug):

```python
Falsifying example: test_range_length(
    size=9223372036854775808,
)

----> 3 assert len(range(size)) == size

OverflowError: Python int too large to convert to C ssize_t
```

This reveals that the built-in `len` function can only report lengths smaller than $2^{63}$: as the error message indicates, it must convert its result to a C `ssize_t`, which on a 64-bit platform is a 64-bit signed integer (one bit stores the sign, so the largest representable value is $2^{63} - 1$).
Hypothesis revealed this by generating the failing test case `size=9223372036854775808`, which is exactly $2^{63}$.
I did not want this error to distract from what is otherwise merely a simple example, but it is very important to point out.

Hypothesis has a knack for catching these sorts of unexpected edge cases.
Now we know that `len(range(size)) == size` _does not_ hold for "arbitrary" non-negative integers!
(I wonder how many of the Python core developers know about this 😄).
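
The boundary can also be probed directly, without Hypothesis. The following is a small sketch that assumes CPython, where `sys.maxsize` is the largest value that a C `ssize_t` can hold; the helper name `range_length_fits` is made up for illustration:

```python
import sys


def range_length_fits(size):
    """Return True if `len(range(size))` can be computed on this platform."""
    try:
        return len(range(size)) == size
    except OverflowError:
        # Raised when the length is too large to convert to a C ssize_t
        return False


# Probe the largest supported length and the first length past it
at_limit = range_length_fits(sys.maxsize)
past_limit = range_length_fits(sys.maxsize + 1)
print(at_limit, past_limit)
```

Using `sys.maxsize` instead of a hard-coded $2^{63} - 1$ keeps the check meaningful on platforms where `ssize_t` is narrower than 64 bits.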


</div>
<!-- #endregion -->

## Links to Official Documentation

- [Hypothesis](https://hypothesis.readthedocs.io/)


## Reading Comprehension Solutions