
Issue 9: Code standardization, container orchestration, lakehouse, cats: concepts needed to productionalize machine learning models


Code standardization with Pylint, container orchestration with Kubernetes, the lakehouse with Delta Lake, working with cats ❤️: all of these concepts can be useful to productionalize machine learning. Here is what I came across recently and found interesting.

Why Pylint is both useful and unusable, and how you can actually use it

Pylint saves the day.

Pylint has a lot of useful errors and warnings… but also a whole lot of highly opinionated assumptions about how your code should look.

Luckily Pylint has some functionality that can help: you can configure it to only enable a limited list of lint checks.
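As a minimal sketch of that idea: Pylint's command line accepts `--disable=all` followed by `--enable=...`, which turns every check off and then switches a short allowlist back on. The helper name `build_pylint_args` and the particular list of checks below are my own illustrative choices, not something taken from the article.

```python
import shutil
import subprocess
import sys

# Hypothetical allowlist: a few Pylint checks that tend to catch real bugs.
# Adjust it to your own code base.
ENABLED_CHECKS = [
    "unused-import",
    "undefined-variable",
    "unreachable",
    "dangerous-default-value",
]


def build_pylint_args(enabled_checks, targets):
    """Return Pylint arguments that disable everything, then re-enable a few checks."""
    return [
        "--disable=all",  # start from a clean slate
        "--enable=" + ",".join(enabled_checks),  # opt back in, check by check
        *targets,
    ]


if __name__ == "__main__":
    targets = sys.argv[1:]
    args = build_pylint_args(ENABLED_CHECKS, targets)
    if not targets or shutil.which("pylint") is None:
        # Nothing to lint, or Pylint is not installed: only show the command.
        print("pylint " + " ".join(args))
    else:
        # Propagate Pylint's exit code so the script can be used in CI.
        sys.exit(subprocess.run(["pylint", *args], check=False).returncode)
```

Saved as a script (the file name is up to you), it can be called with the files or packages to lint as arguments. The same two options can also live in a Pylint configuration file, so the whole team shares one allowlist.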

Kubernetes basics

A series of videos about Kubernetes by Brendan Burns, one of the co-founders of Kubernetes.

Well explained!

Kubeflow

A series of videos about Kubeflow, a platform dedicated to data science and built on Kubernetes.

I’ve never used it, but it’s always good to get an overview of the ecosystem around you.

Building Reliable Data Lakes with Delta Lake | Virtual Hands-on Lab

If you’re looking for ACID transactions, time travel, and a mix of the capabilities of a data warehouse and a data lake, I recommend watching this webinar. It covers Delta Lake, an open-source storage layer developed by Databricks.

Your cat might be better than focus

I recently wrote an article about the limits of focus in our jobs.

Many books and people value focus and the Pomodoro technique. I value them too. At the same time, you’re not executors. You need creativity, imagination, and distance from what you do. You need them more than I would have thought at the beginning of my career. This is why you also need to let your mind wander.