Learn Data Analysis with Python in this comprehensive tutorial for beginners, with exercises included!
NOTE: Check description for updated Notebook links.
Data Analysis has been around for a long time, but up until a few years ago, it was practiced using closed, expensive and limited tools like Excel or Tableau. Python, SQL and other open libraries have changed Data Analysis forever.
In this tutorial you'll learn the whole process of Data Analysis: reading data from multiple sources (CSVs, SQL, Excel, etc.), processing it with NumPy and Pandas, visualizing it with Matplotlib and Seaborn, and cleaning it to create reports.
Additionally, we've included a thorough Jupyter Notebook tutorial, and a quick Python reference to refresh your programming skills.
Course created by Santiago Basulto from DataWars
Check out all Data Science courses from DataWars: https://datawars.io/ref=fcc
Note: Instead of loading the notebooks on notebooks.ai, use Google Colab. Here are instructions on loading a notebook directly from GitHub into Google Colab: https://colab.research.google.com/github/googlecolab/colabtools/blob/master/notebooks/colab-github-demo.ipynb#scrollTo=K-NVg7RjyeTk
⭐️ Course Contents ⭐️
⌨️ Part 1: Introduction
What is Data Analysis? Why Python? What other options are there? What's the cycle of a Data Analysis project? What's the difference between Data Analysis and Data Science?
Slides for this section: https://docs.google.com/presentation/d/1XXhVx2a7z2GrG5qddIyLFk4T_5s5mmdqSptDGBD9hWk/edit?usp=sharing
⌨️ Part 2: Real Life Example of a Python/Pandas Data Analysis project (00:11:11)
A demonstration of a real-life data analysis project using Python, Pandas, SQL and Seaborn. Don't worry, we'll dig deeper in the following sections.
Notebooks: https://github.com/rmotr-curriculum/FreeCodeCamp-Pandas-Real-Life-Example
⌨️ Part 3: Jupyter Notebooks Tutorial (00:30:50)
A step-by-step tutorial on how to use Jupyter Notebooks.
Twitter Cheat Sheet: https://twitter.com/rmotr_com/status/1122176794696847361
Notebooks: https://github.com/rmotr-curriculum/ds-content-interactive-jupyterlab-tutorial
⌨️ Part 4: Intro to NumPy (01:04:58)
Learn why NumPy was such an important library for the data-processing world in Python. Learn about low level details of computations and memory storage, and why tools like Excel will always be limited when processing large volumes of data.
Notebooks: https://github.com/rmotr-curriculum/freecodecamp-intro-to-numpy
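To give a taste of the memory-storage point in Part 4, here is a minimal sketch (not taken from the course notebooks) that compares a plain Python list with a NumPy array holding the same 1,000 integers. The variable names are made up for illustration, and the list size is only a rough estimate.

```python
import sys

import numpy as np

# The same 1,000 integers, stored two different ways.
values = list(range(1000))
arr = np.arange(1000, dtype=np.int64)

# A Python list holds references to separate int objects, so a rough
# estimate of its footprint adds the list itself plus every element.
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

# A NumPy array keeps fixed-size elements in one buffer:
# 1,000 elements * 8 bytes each.
array_bytes = arr.nbytes

# Picking a smaller dtype shrinks the buffer (2 bytes per element here).
small = arr.astype(np.int16)

# Operations apply to the whole array at once, with no explicit loop.
squared = arr ** 2

print(list_bytes, array_bytes, small.nbytes)
```

The exact list size depends on your Python version, but it should come out several times larger than the array's buffer.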
⌨️ Part 5: Intro to Pandas (01:57:08)
Pandas is arguably the most important library for Data Processing in the Python world. Learn how it works and how its main data structure, the DataFrame, compares to other tools like spreadsheets or the DataFrames used for Big Data.
Notebooks: https://github.com/rmotr-curriculum/freecodecamp-intro-to-pandas
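If you have never seen a DataFrame, the sketch below shows the kind of spreadsheet-like work Part 5 is about: adding a computed column, filtering rows, and summarizing by group. The sales data and column names are invented for this example and do not come from the course notebooks.

```python
import pandas as pd

# A tiny, made-up sales table: one row per order.
df = pd.DataFrame({
    "country": ["Canada", "France", "Canada", "Japan"],
    "units": [3, 5, 2, 4],
    "unit_price": [10.0, 20.0, 10.0, 15.0],
})

# Add a computed column, much like a spreadsheet formula filled down.
df["revenue"] = df["units"] * df["unit_price"]

# Keep only the rows that match a condition (boolean filtering).
large_orders = df[df["units"] >= 4]

# Summarize by group, similar to a pivot table.
revenue_by_country = df.groupby("country")["revenue"].sum()

print(revenue_by_country)
```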
⌨️ Part 6: Data Cleaning (02:47:18)
Learn the different types of issues that we'll face with our data (null values, invalid values, statistical outliers, etc.) and how to clean them.
Notebooks: https://github.com/rmotr-curriculum/data-cleaning-rmotr-freecodecamp
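As a preview of Part 6, here is a small sketch that touches each of the three issues: a null value, invalid category codes, and a statistical outlier. The data is made up, and the choices below (the 1.5 * IQR rule for outliers, filling ages with the median, dropping rows with an unknown category) are just one reasonable approach assumed for this example, not necessarily the one used in the course.

```python
import numpy as np
import pandas as pd

# Made-up survey data with a missing age, an impossible age,
# and two invalid category codes.
df = pd.DataFrame({
    "age": [25, 31, np.nan, 290, 42, 38],
    "sex": ["F", "M", "F", "D", "M", "?"],
})

# 1. Null values: count them per column before changing anything.
missing_per_column = df.isna().sum()

# 2. Statistical outliers: flag ages outside 1.5 * IQR of the quartiles.
q1 = df["age"].quantile(0.25)
q3 = df["age"].quantile(0.75)
iqr = q3 - q1
is_outlier = (df["age"] < q1 - 1.5 * iqr) | (df["age"] > q3 + 1.5 * iqr)

# Treat outliers as missing, then fill all missing ages with the median.
df["age"] = df["age"].mask(is_outlier)
df["age"] = df["age"].fillna(df["age"].median())

# 3. Invalid values: anything outside the known categories becomes missing.
df["sex"] = df["sex"].where(df["sex"].isin(["F", "M"]))

# Drop the rows that still have no valid category.
clean = df.dropna(subset=["sex"])

print(clean)
```

Whether to fill, flag, or drop a bad value depends on the dataset, so treat these steps as options to pick from.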
⌨️ Part 7: Reading Data from other sources (03:25:15)
Notebooks: https://github.com/rmotr-curriculum/RDP-Reading-Data-with-Python-and-Pandas
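Here is a minimal sketch of what reading from more than one source can look like with Pandas. To keep it self-contained it reads CSV text from memory and queries an in-memory SQLite database; the table, columns, and values are all invented for this example. With real files you would pass a file path to pd.read_csv and connect to your own database.

```python
import io
import sqlite3

import pandas as pd

# Source 1: CSV. An in-memory string stands in for a file on disk.
csv_text = "id,name,score\n1,Ana,9.5\n2,Luis,7.0\n"
from_csv = pd.read_csv(io.StringIO(csv_text))

# Source 2: SQL. An in-memory SQLite database stands in for a real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (id INTEGER, team TEXT)")
conn.executemany(
    "INSERT INTO players VALUES (?, ?)",
    [(1, "Red"), (2, "Blue")],
)
conn.commit()
from_sql = pd.read_sql("SELECT id, team FROM players", conn)
conn.close()

# Once loaded, both sources are ordinary DataFrames and can be combined.
combined = from_csv.merge(from_sql, on="id")

print(combined)
```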
⌨️ Part 8: Python Recap (03:55:19)
If your Python or coding skills are rusty, check out this section for a quick recap of Python's main features and control flow structures.
Notebooks: https://github.com/rmotr-curriculum/ds-content-python-under-10-minutes
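For a quick self-check before watching the recap, the sketch below uses the basic building blocks in one place: a function, if/elif/else, a for loop over a dictionary, a list comprehension, and a while loop. The names and numbers are made up for illustration.

```python
def classify(score):
    """Return a label for a numeric score using if/elif/else."""
    if score >= 90:
        return "high"
    elif score >= 60:
        return "medium"
    else:
        return "low"


# A dictionary maps keys (names) to values (scores).
scores = {"Ana": 95, "Luis": 72, "Mia": 40}

# A for loop over the dictionary's key/value pairs.
labels = {}
for name, score in scores.items():
    labels[name] = classify(score)

# A list comprehension builds a filtered list in one expression.
passed = [name for name, score in scores.items() if score >= 60]

# A while loop repeats until its condition becomes false.
total = 0
n = 3
while n > 0:
    total += n
    n -= 1

print(labels, passed, total)
```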
--
Learn to code for free and get a developer job: https://www.freecodecamp.org
Read hundreds of articles on programming: https://freecodecamp.org/news