Data cleaning, analysis, and visualization of Île-de-France rail network traffic with Python, Pandas, Seaborn, and Matplotlib
This project analyzes rail network traffic data for 2019, 2020, 2021, and the first half of 2022.
We will analyze the daily attendance data of the Île-de-France rail network by trying to answer the following questions:
- What changes in weekly/monthly patterns are observable before/during/after the COVID crisis?
- What was the impact of COVID from a statistical point of view?
- We will also explore some other questions about the data, and answer them with visualizations.
This project consists of 2 separate notebooks:
- Notebook 1/2: Initial data exploration and cleaning
- Notebook 2/2: Analysis and visualization
Île-de-France Mobilités (IdFM) is the organizing authority for sustainable mobility in Île-de-France. It designs, organizes, and finances public transport for all residents of the region. Every day, more than 10 million passengers use the IdFM bus network and rail network (train, metro, RER, funicular), operated by 75 OPTILE companies, RATP, and SNCF.
Since 2016, IdFM has provided access to dynamic services (route search, real-time timetables, etc.) and to some of its raw data through an open data portal.
IdFM provides detailed data about the rail network, such as daily traffic per station (number of check-ins per day and per ticket type). Daily traffic data is available for every year from 2015 to 2022.
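As a minimal sketch of how data shaped like this (one row per day, station, and ticket type) could be aggregated with Pandas, the example below sums check-ins per day and then averages them by day of the week. The column names (`day`, `station`, `ticket_type`, `validations`) and the sample rows are hypothetical and made up for illustration; they are not taken from the actual IdFM files.

```python
import pandas as pd

# Made-up sample rows for illustration only; the column names are
# assumptions and may differ from the real IdFM dataset.
records = [
    {"day": "2019-03-04", "station": "A", "ticket_type": "NAVIGO", "validations": 120},
    {"day": "2019-03-04", "station": "A", "ticket_type": "OTHER", "validations": 30},
    {"day": "2019-03-04", "station": "B", "ticket_type": "NAVIGO", "validations": 80},
    {"day": "2019-03-09", "station": "A", "ticket_type": "NAVIGO", "validations": 40},
    {"day": "2019-03-09", "station": "B", "ticket_type": "OTHER", "validations": 10},
]
df = pd.DataFrame(records)
df["day"] = pd.to_datetime(df["day"])

# Total check-ins per day, summed over all stations and ticket types.
daily_total = df.groupby("day")["validations"].sum()

# Average daily total for each day of the week (Monday, Saturday, ...).
weekday_mean = daily_total.groupby(daily_total.index.day_name()).mean()

print(daily_total)
print(weekday_mean)
```

Summing per day first and averaging by weekday afterwards keeps the weekday figures comparable even when the number of stations or ticket types reported varies from day to day.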
This project studies the impact of the COVID crisis on the Île-de-France rail network, so the analysis focuses on the years 2019 to 2022.
The data is available as open data on the IdFM website:
https://data.iledefrance-mobilites.fr/explore/dataset/histo-validations-reseau-ferre/information/