Katharine Jarmul on using Python for data analysis - a podcast by O’Reilly Media
from 2017-11-30T12:15
The O’Reilly Programming Podcast: Wrangling data with Python’s libraries and packages.
In this episode of the O’Reilly Programming Podcast, I talk with Katharine Jarmul, a Python developer and data analyst whose company, Kjamistan, provides consulting and training on topics surrounding machine learning, natural language processing, and data testing. Jarmul is the co-author (along with Jacqueline Kazil) of the O’Reilly book Data Wrangling with Python, and she has presented the live online training course Practical Data Cleaning with Python.
Discussion points:
- How data wrangling enables you to take real-world data and “clean it, organize it, validate it, and put it in some format you can actually work with,” says Jarmul.
- Why Python has become a preferred language for use in data science: Jarmul cites the accessibility of the language and the emergence of packages such as NumPy, pandas, SciPy, and scikit-learn.
- Jarmul calls pandas “Excel on steroids” and says, “it allows you to manipulate tabular data, and transform it quite easily. For anyone using structured, tabular data, you can’t go wrong with doing some part of your analysis in pandas.”
- She cites gensim and spaCy as her favorite NLP Python libraries, praising them for “the ability to just install a library and have it do quite a lot of deep learning or machine learning tasks for you.”
Other links:
- Check out the video Building Data Pipelines with Python, presented by Jarmul.
- Check out the video Data Wrangling and Analysis with Python, presented by Jarmul.
- Jarmul is one of the founders of the group PyLadies, which focuses on helping more women become active participants and leaders in the Python open source community.