Sick of complaining about data wrangling? As part of the »PyCon.DE 2017 & PyData Karlsruhe« conference, taking place at ZKM from October 25–27, 2017, visitors to the »Open Codes« exhibition are invited to attend the workshop »Practical Data Cleaning 101« with Pythonista and data scientist Katharine Jarmul.
Unsure what libraries to even begin with? In this tutorial, we will highlight some practical examples of data cleaning, using tools to dedupe records, perform string matching and preprocess data for machine learning. This workshop features libraries, tools and strategies for data extraction and cleaning. You will be asked to run code and participate actively, so get ready to do some hands-on data wrangling.
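To give a flavor of the kind of task involved, here is a minimal sketch of fuzzy string matching used to dedupe records. It is not taken from the workshop materials: it uses only Python's standard-library `difflib`, and the helper names (`normalize`, `similarity`, `dedupe`), the sample records and the 0.85 threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio between 0.0 and 1.0 for two strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def dedupe(records, threshold=0.85):
    """Keep the first record of each group of near-duplicates.

    The threshold is an illustrative assumption, not a recommended value.
    """
    kept = []
    for record in records:
        # Drop the record if it closely matches one we already kept.
        if not any(similarity(record, seen) >= threshold for seen in kept):
            kept.append(record)
    return kept


# Hypothetical sample data with a spacing variant and a typo.
records = ["Ada Lovelace", "ada  lovelace", "Ada Lovelase", "Grace Hopper"]
unique = dedupe(records)
print(unique)
```

This pairwise comparison is quadratic in the number of records, so it only suits small datasets; dedicated libraries handle larger ones more efficiently.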
Prerequisites: you should feel comfortable using Pandas and Jupyter Notebook. If you don't, you can still attend, but you may find some parts hard to follow.
Please fork or clone the workshop repository before the course and reach out with any issues: https://github.com/kjam/data-cleaning-101. We will only cover the data-cleaning notebooks, but you are welcome to walk through the validation material at your leisure!