PyCon is the largest conference for the Python developer community. It offers many interesting talks, tutorials and discussions. With over a hundred hours of video content to watch, it is hard to choose, so I selected my favourite breakout sessions for you! You can find all the keynotes, tutorials, and breakout sessions here, including a panel discussion with the newly appointed steering council. If you have more time and want a more hands-on experience, check out the tutorial section.
00. Problems well-defined are problems solved
Raymond Hettinger is a Python core developer. His talk is about the need for a paradigm shift in problem-solving. Instead of focussing on finding a solution, Hettinger encourages us to focus on the problem definition. The 'how' becomes an abstraction you can take for granted: he shows how, with significant increases in computational power and algorithmic strength, you can tackle complex problems with generic solutions.
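To make the idea concrete, here is a minimal sketch of my own (not an example taken from the talk). The puzzle TO + GO = OUT is only described as a constraint, and a generic brute-force search does the solving. The helper names `solve` and `to_go_out` are hypothetical.

```python
from itertools import permutations


def solve(variables, domain, constraint):
    """Generic solver: try every assignment of distinct values, keep the valid ones."""
    for values in permutations(domain, len(variables)):
        assignment = dict(zip(variables, values))
        if constraint(assignment):
            yield assignment


def to_go_out(a):
    """Problem definition only: TO + GO == OUT, with no leading zeros."""
    if 0 in (a["T"], a["G"], a["O"]):
        return False
    to = 10 * a["T"] + a["O"]
    go = 10 * a["G"] + a["O"]
    out = 100 * a["O"] + 10 * a["U"] + a["T"]
    return to + go == out


solutions = list(solve("TOGU", range(10), to_go_out))
print(solutions)  # [{'T': 2, 'O': 1, 'G': 8, 'U': 0}], i.e. 21 + 81 = 102
```

Nothing in `solve` knows about arithmetic or letters. Swapping in a different constraint function is enough to tackle a different puzzle, which is the point of spending the effort on the definition.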
01. How to think about data visualizations
Python is a jungle when it comes to tools for creating graphs: matplotlib, plotly, seaborn, and plotnine are just a few. As a follow-up to his talk on ‘which tool should I use?’, Jake VanderPlas explains what a good visualization should look like using Anscombe's quartet. He dives into the use of different colours, shapes and sizes, and how humans interpret them. In the end, he advocates for the use of a relatively new plotting library (Altair), which creates graphs using a declarative syntax that reads almost like plain language.
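Anscombe's quartet is easy to check yourself. The sketch below uses the published quartet values and only the standard library. It shows that the four datasets share almost the same means and correlation, even though they look completely different once plotted. The `pearson` helper is my own addition.

```python
from math import sqrt
from statistics import mean

X_SHARED = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]

# Anscombe's quartet: four (x, y) datasets.
ANSCOMBE = {
    "I": (X_SHARED, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_SHARED, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_SHARED, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


summary = {name: (mean(xs), mean(ys), pearson(xs, ys))
           for name, (xs, ys) in ANSCOMBE.items()}

for name, (x_mean, y_mean, r) in summary.items():
    print(f"{name:>3}: mean x = {x_mean:.2f}, mean y = {y_mean:.2f}, r = {r:.2f}")
```

Every row prints a mean x of 9.00, a mean y of 7.50 and a correlation of 0.82, which is why summary statistics alone are no substitute for looking at the data.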
02. Break the Cycle
As programmers, we like to automate as much as possible. Still, we sometimes find ourselves typing the same commands over and over. In her talk, Thea, a data engineer at Google, covers three tools to automate your workflow:
- Tox: a well-known tool primarily for automating your test suite;
- Nox: a Python library which offers a flexible, Python-based alternative to Tox;
- Invoke: a very cool and flexible library offering functionality similar to Makefiles, using decorated Python functions as targets.
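The "decorated functions as targets" idea can be sketched with the standard library alone. This is a toy illustration of the pattern, not Invoke's actual API (Invoke ships its own `task` decorator and command-line runner). The names `TASKS`, `task`, `hello`, `all_checks` and `main` are all hypothetical.

```python
import subprocess
import sys

# Registry mapping a target name to the function that implements it.
TASKS = {}


def task(func):
    """Register a function as a named target, much like a rule in a Makefile."""
    TASKS[func.__name__] = func
    return func


@task
def hello():
    """Run a trivial command in a child Python process."""
    subprocess.run([sys.executable, "-c", "print('hello from a task')"], check=True)


@task
def all_checks():
    """Compose targets by calling other tasks, like listing Makefile prerequisites."""
    hello()


def main(names):
    """Run each requested target in order; abort on an unknown name."""
    for name in names:
        if name not in TASKS:
            available = ", ".join(sorted(TASKS))
            raise SystemExit(f"unknown task: {name} (available: {available})")
        TASKS[name]()
```

Calling `main(["all_checks"])` runs the composed target. Wiring `main(sys.argv[1:])` into a script would give you a small command-line runner for the commands you keep retyping.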
03. Wily Python
The number one frustration of any Data Engineer or Scientist: complex and incomprehensible code. However, it is often hard to define what exactly makes code complex, and even harder to track complexity over time. Wily is designed to do just that. It uses a set of metrics ranging from simple statistics such as lines of code (LOC) and cyclomatic complexity to more advanced ones such as the Halstead metrics. Anthony Shaw advocates using Wily in your CI/CD pipeline to track complexity and reject pull requests that add too much of it.
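To give a feel for what such a metric measures, here is a simplified approximation of cyclomatic complexity built on the standard `ast` module. It is a sketch of the concept, not how Wily computes its numbers, and the helper `exceeds_budget` and its limit are hypothetical stand-ins for a CI gate.

```python
import ast

# Node types that each add one extra path through the code.
BRANCH_NODES = (ast.If, ast.For, ast.AsyncFor, ast.While,
                ast.ExceptHandler, ast.IfExp, ast.comprehension)


def cyclomatic_complexity(source):
    """Approximate cyclomatic complexity: 1 + number of decision points."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # 'a and b and c' adds two extra short-circuit paths.
            complexity += len(node.values) - 1
    return complexity


def exceeds_budget(source, limit):
    """Hypothetical CI check: True when the code is more complex than allowed."""
    return cyclomatic_complexity(source) > limit


SIMPLE = "def add(a, b):\n    return a + b\n"

BRANCHY = '''
def classify(n):
    if n < 0:
        return "negative"
    elif n == 0:
        return "zero"
    for d in range(2, n):
        if n % d == 0 and d > 1:
            return "composite"
    return "other"
'''

print(cyclomatic_complexity(SIMPLE))   # 1
print(cyclomatic_complexity(BRANCHY))  # 6
```

Tracking a number like this per commit is what makes "this pull request adds too much complexity" something you can check automatically instead of arguing about in review.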
04. A new era in Python Governance
In 2018 Guido van Rossum, the creator of Python, stepped down from his position as benevolent dictator for life (BDFL). The community had to figure out a new way to govern the evolution of Python and decide which Python Enhancement Proposals (PEPs) would be approved or rejected. Shauna Gordon-McKeon presents the different models that were considered as governance structures, ranging from a council of core developers to an independent group acting as an auditing authority.
05. Programmatic Notebooks with papermill
Notebooks are the go-to solution for exploration. They provide an interactive environment which combines code, documentation, and graphs in one place. Netflix wants to take notebooks to the next level with scheduling and templating. This talk covers Papermill, an open-source Python library which parameterizes, executes, and analyzes notebooks from either the CLI or Python.
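The parameterization idea can be sketched with plain dictionaries, since a notebook is just JSON. This is a minimal sketch of the concept rather than Papermill's own code: as far as I understand it, Papermill looks for a cell tagged `parameters` and injects a cell with the overriding values right after it. The helpers `code_cell` and `inject_parameters` and the example variables are hypothetical.

```python
import copy


def code_cell(source, tags=()):
    """Build a minimal notebook code cell as a plain dict."""
    return {"cell_type": "code", "metadata": {"tags": list(tags)},
            "source": source, "outputs": [], "execution_count": None}


def inject_parameters(notebook, parameters):
    """Return a copy of the notebook with an override cell after the 'parameters' cell."""
    nb = copy.deepcopy(notebook)
    override = "\n".join(f"{name} = {value!r}" for name, value in parameters.items())
    new_cell = code_cell(override, tags=["injected-parameters"])
    for index, cell in enumerate(nb["cells"]):
        if "parameters" in cell.get("metadata", {}).get("tags", []):
            nb["cells"].insert(index + 1, new_cell)
            break
    else:
        # No tagged cell found: put the overrides at the very top.
        nb["cells"].insert(0, new_cell)
    return nb


template = {
    "nbformat": 4, "nbformat_minor": 4, "metadata": {},
    "cells": [
        code_cell("region = 'EU'\nyear = 2018", tags=["parameters"]),  # defaults
        code_cell("print(f'report for {region} in {year}')"),
    ],
}

report = inject_parameters(template, {"region": "US", "year": 2019})

# Stand-in for real notebook execution: run the cells in order in one namespace.
namespace = {}
for cell in report["cells"]:
    exec(cell["source"], namespace)  # prints: report for US in 2019
```

The template keeps working on its own with its defaults, while each scheduled run gets its own copy with different values, which is what makes notebooks usable as templates.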
06. Machine learning model and dataset versioning practices
In the Software Engineering world we’ve become well acquainted with the benefits of versioning. In data science we encounter the same problems with our data sets, models, and model performance. Over time the data sets and models change, making it hard to keep track of results. Dmitry Petrov (the author of DVC) discusses tools that can help you apply versioning to your data and/or machine learning results. He covers MLflow, Git LFS and DVC.
Some talks that nearly made the list but are definitely worth watching:
- Algorithmic fairness discusses potential biases in your algorithms
- The Zen of Python Teams describes how teams should work together according to PEP 20
- Life is better painted black covers Black, an opinionated auto-formatting tool that produces PEP 8-compliant code and integrates nicely with your IDE.
- Type hinting (and mypy) covers the benefits of providing type hints in your code.
Am I missing a great talk that you think should have been mentioned? Please share it with me.
Chiel Peters, CTO at Xomnia.