
Learn about data science in real life and machine learning in production

Your cat might be better than focus

Feb 23, 2021 I would like to tell you a story. I was trying to understand a bug. I spent maybe 5 pomodoros on it. Pomodoro is a technique that helps you stay focused. I couldn't find the root cause of my bug. I was quite frustrated but still motivated. I decided to stop my pomodoro session and leave my work for the day. I went to my beautiful cat and started petting her. I was starting to relax when all of a sudden I got the solution to my bug. ...

Issue 8: Code standardisation and MLOps seen by Databricks

Feb 16, 2021 Code standardisation within a team is important. There are different ways to achieve this goal. In this newsletter, I would also like to talk about MLOps and the way Databricks presents it. ...

Issue 7: Recommendation systems and data engineering jobs

Feb 03, 2021 I once assessed different recommendation systems through A/B tests. They were black boxes to me. I decided to focus my recent tech watch on recommendation systems. I wanted to understand these black boxes better. There are many ways to build recommendation systems. The solutions depend on your context and needs. In this newsletter, I would also like to talk a bit about the roles of data engineers and machine learning engineers. ...

How can you connect to MLFlow registry remotely?

Jan 25, 2021 MLFlow is helpful when you are looking for reproducibility and MLOps. This tutorial shows you how to connect to the MLFlow model registry from outside. ...

Issue 6: The past, present, and future of AI and ML

Jan 18, 2021 This week, I'm sharing some resources I recently read about the past, present, and possible future of AI and ML. What have we done with AI? What can we do? And what can we hope to be able to do? ...

Comparison of different tools to do unit tests for data

Jan 08, 2021 Recently, I benchmarked different tools to do unit tests for data. Here are the results. ...

Issue 5: Resources to get started with Kubernetes

Jan 06, 2021 As a data scientist, data engineer, or machine learning engineer, you sometimes have to deal with Kubernetes, this strange tool to orchestrate containers. That is why it is the main subject of this issue. ...

A banal machine learning system where interpretability and explainability matter

Jan 06, 2021 As algorithmic systems become more prevalent, the need to understand them grows (Interpretability report from Cloudera). ...

Issue 4: Resources to get an overview of a machine learning project

Dec 23, 2020 It's not easy to get started with machine learning. This is why I would like to highlight some useful resources to get an idea of what a data science project looks like from end to end. ...

A bug in data science

Dec 13, 2020 Bugs in predictive systems are silent most of the time. Don't expect your users to raise their hands saying something is wrong. That won't happen ...

Issue 3: Feature store, weird chatbot, XGBoost for Spark and Airflow in Amazon

Dec 10, 2020 Issue 3: Feature store, weird chatbot, XGBoost for Spark and Airflow in Amazon ...

The hidden face of sexism

Dec 07, 2020 Have you ever been discriminated against? I have. Have you ever been discriminated against without being sure of it? I have. A story about unconscious sexism in IT ...