This repository contains a Jupyter notebook for a workshop on k-means clustering. It uses the "120 years of Olympic history: athletes and results" dataset, available on Kaggle.
- You will need miniconda (or the full anaconda) for Python 3.7. Allow it to prepend the install location to your path.
  (Don't forget to source your .bash_profile so bash can find the conda binary!)
- Clone this repo.
- Using the environment.yml file, create a new conda environment: conda env create -f environment.yml
- To activate the environment, run source activate myenv.
- To test that everything works, run jupyter notebook and navigate to localhost:8888/ in your browser. You should see the Jupyter file browser listing the contents of this repository.
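If something in the setup does not work, a small pre-flight script can help narrow down which step was missed. The following is only a sketch, not part of this repository: the function name check_setup is hypothetical, and it assumes you run it from the root of the cloned repo using only the Python standard library.

```python
import shutil
from pathlib import Path


def check_setup(repo_dir="."):
    """Return a dict of simple pre-flight checks for the workshop setup.

    Hypothetical helper: it only inspects PATH and the given directory,
    and does not create or modify anything.
    """
    repo = Path(repo_dir)
    return {
        # True if the shell would be able to find the conda binary on PATH
        "conda_on_path": shutil.which("conda") is not None,
        # True if the file passed to `conda env create -f` is present
        "environment_yml_present": (repo / "environment.yml").is_file(),
        # True if a jupyter launcher is on PATH
        # (expected once the environment has been activated)
        "jupyter_on_path": shutil.which("jupyter") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_setup().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```

A MISSING entry for conda_on_path usually points back to the .bash_profile step, and a MISSING entry for jupyter_on_path suggests the environment has not been activated in the current shell.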
There are two versions of this notebook:
- olympic_kmeans_follow_along.ipynb lets you follow along, filling in the code as you go.
- olympic_kmeans.ipynb is the full notebook, with answers if you get stuck.
Click on the notebook you wish to run.
Inside each notebook are several cells. When interacting with a cell, you are in one of two modes:
- Edit Mode (green border) for editing cells. Selecting a cell and hitting ENTER will put you in Edit Mode.
- Command Mode (blue border) for running cells. Hitting ESCAPE on a cell in Edit Mode will put you back in Command Mode.
To run a selected cell, either click the "Run" button in the top menu bar or press Shift+Enter in Command Mode.
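To practise running a cell, you could paste something like the following into an empty cell and run it. This is a minimal sketch rather than a cell taken from the workshop notebooks, and describe_kernel is a hypothetical helper name. It reports which Python interpreter the notebook is using, which is a quick way to confirm that Jupyter was started from the conda environment created above.

```python
import platform
import sys


def describe_kernel():
    """Return a one-line summary of the Python interpreter running this cell."""
    return f"Python {platform.python_version()} at {sys.executable}"


# In a notebook, running the cell shows this line directly below it.
print(describe_kernel())
```

If the reported path does not point inside your conda environment, the notebook server was probably started before the environment was activated.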