Code standardisation within a team is important, and there are different ways to achieve it. In this newsletter, I would also like to talk about MLOps and the way Databricks presents it.
If you don’t know where to start with code standardisation, the Python community maintains an official style guide named PEP 8. It is worth reading through.
To enforce a style guide in Python, you can use Pylint. Its default checks largely follow PEP 8, so you don’t have to know all the rules by heart: you can simply follow what Pylint reports. You can also integrate it into a CI/CD pipeline to get automatic checks.
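As an illustration of the CI/CD idea, here is a minimal sketch of a lint gate that a pipeline step could call. The helper names `build_pylint_command` and `run_lint`, the `src` path, and the threshold values are assumptions made for this example; only the `--fail-under` option comes from Pylint itself, and Pylint has to be installed in the CI environment.

```python
import subprocess
import sys


def build_pylint_command(paths, fail_under=8.0):
    """Build the Pylint command line for the given files or packages.

    Hypothetical helper: the 8.0 default is an arbitrary example threshold.
    """
    # --fail-under makes Pylint exit with a non-zero status when the
    # global score is below the threshold.
    return [sys.executable, "-m", "pylint", f"--fail-under={fail_under}", *paths]


def run_lint(paths, fail_under=8.0):
    """Run Pylint and return its exit code (0 means the check passed)."""
    completed = subprocess.run(build_pylint_command(paths, fail_under), check=False)
    return completed.returncode
```

A pipeline step could then call something like `sys.exit(run_lint(["src"]))`, so that the build fails whenever the score drops below the agreed threshold.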
Another way to get code standardisation is to build a single monorepo. I’m not sure the benefits outweigh the disadvantages, but I definitely believe that reusing code between projects is a path to standardisation and peace of mind.
I recently attended a session hosted by Databricks about MLOps. The recording is available on demand.
Here is what I’ve found interesting:
- One of the most evident benefits of the MLflow Model Registry is that it’s a central place for the models
- Reminder: it’s possible to autolog with MLflow (if you don’t need to log anything special)
- Distribution of TensorFlow and PyTorch is available on the Databricks platform
- The new workspace should make it easier to integrate CI/CD needs
- H&M uses the following development process:
  - In your IDE, write everything you need in a package, plus a notebook that acts as the glue between the different elements of your package
  - Run your unit tests
  - Send the package and the notebook to Databricks with a local script
  - Run your end-to-end test on the Databricks platform (manually)
  - Back in your IDE, commit and push
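
To make the "package plus glue notebook" idea from the H&M process more concrete, here is a minimal sketch in plain Python. Everything in it is hypothetical and only meant as an illustration: the function `add_ratio` stands for the package code, `run_pipeline` stands for what the notebook would do, and the column names are invented. It is not H&M’s actual code.

```python
import unittest


# --- Package code (would live in a module of your package) ---
def add_ratio(rows, numerator, denominator, name="ratio"):
    """Return copies of the rows with an extra ratio column."""
    result = []
    for row in rows:
        new_row = dict(row)
        denom = row[denominator]
        # Avoid ZeroDivisionError: None marks an undefined ratio.
        new_row[name] = row[numerator] / denom if denom else None
        result.append(new_row)
    return result


# --- Notebook glue (the notebook only wires package functions together) ---
def run_pipeline(rows):
    """Hypothetical glue step: compute a click-through rate column."""
    return add_ratio(rows, "clicks", "views", name="ctr")


# --- Unit test, run locally before sending anything to Databricks ---
class AddRatioTest(unittest.TestCase):
    def test_ratio_and_zero_denominator(self):
        rows = [{"clicks": 1, "views": 4}, {"clicks": 2, "views": 0}]
        result = run_pipeline(rows)
        self.assertEqual(result[0]["ctr"], 0.25)
        self.assertIsNone(result[1]["ctr"])
```

Because the logic sits in the package and the notebook stays thin, the unit tests can run in the IDE (for example with `python -m unittest`), and only the end-to-end test needs the Databricks platform.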