I’m glad to see more and more papers about deploying machine learning. The paper Challenges in Deploying Machine Learning: a Survey of Case Studies focuses on the challenges you can encounter along the way. They are numerous and show up everywhere, and the authors tried to categorise and name them. Let’s dive into a few of these challenges and into resources that discuss them.
The survey shows that practitioners face challenges at every stage of deployment. The goal of the paper is to lay out a research agenda for exploring approaches that address these challenges.
A categorisation of the issues and concerns is helpful:
Data management - Data augmentation - Labelling of large volumes of data
Labelling is a real problem in machine learning: it is expensive, and the work itself is tedious.
You can keep the costs of labelling down with automatic solutions. I wanted to highlight some methods to do that.
Model learning - Model selection - Model complexity
I didn’t know about association rules before reading this. They are useful for finding relationships between items.
In some use cases, they can be used in place of a recommender system.
Association rules are one of the most important machine learning concepts used in market basket analysis.
They help uncover relationships between items in huge databases.
Rules do not extract an individual’s preference; rather, they find relationships between sets of elements within each distinct transaction. This is what makes them different from collaborative filtering.
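To make the idea concrete, here is a minimal sketch of the three measures usually reported for a rule: support, confidence and lift. The transactions and the function names are made up for illustration and do not come from the article; a real project would more likely rely on a dedicated library.

```python
# Hypothetical toy transactions, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def rule_metrics(antecedent, consequent, transactions):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    Assumes both sides appear in at least one transaction; otherwise the
    divisions below would fail.
    """
    both = support(set(antecedent) | set(consequent), transactions)
    confidence = both / support(antecedent, transactions)
    lift = confidence / support(consequent, transactions)
    return both, confidence, lift


# Rule "bread -> milk", computed over whole transactions, not per customer.
s, c, l = rule_metrics({"bread"}, {"milk"}, transactions)
print(f"support={s:.2f} confidence={c:.2f} lift={l:.2f}")
```

A lift below 1, as in this toy data, suggests that the two items appear together less often than they would if they were independent. Nothing here looks at who bought what, which is the difference from collaborative filtering mentioned above.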
Model deployment - Integration - Software engineering anti-patterns
This guide is very useful if you want to understand type checking in Python.
You can add type annotations in Python. By default, they have no effect at runtime, but tools such as Enforce or Pydantic can check them and reject values of the wrong type.
It’s a way to get safer deployment and maintenance.
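A small sketch of what this means in practice, using only the standard library. The `checked` decorator below is a hypothetical, hand-rolled stand-in for what runtime checking tools do far more thoroughly; it is not the API of Enforce or Pydantic, and it only handles plain classes such as `str` and `int`.

```python
import functools
import inspect
from typing import get_type_hints


def repeat(text: str, times: int) -> str:
    return text * times


# Annotations are not enforced by default: this call runs without complaint
# and quietly returns the integer 6 instead of a string.
unchecked = repeat(3, 2)


def checked(func):
    """Hypothetical decorator: raise TypeError when an argument does not
    match its annotation. Only plain classes are supported, not generics."""
    hints = get_type_hints(func)
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = hints.get(name)
            if expected is not None and not isinstance(value, expected):
                raise TypeError(
                    f"{name} must be {expected.__name__}, "
                    f"got {type(value).__name__}"
                )
        return func(*args, **kwargs)

    return wrapper


safe_repeat = checked(repeat)

try:
    safe_repeat(3, 2)
except TypeError as error:
    print(f"rejected: {error}")
```

With the check in place, the mistake surfaces at the call site instead of propagating as a wrong value into the rest of the pipeline.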
Cross-cutting aspects - Security
This article made me realise that I don’t know much about the security of machine learning itself, so I think this blog post is a good place to start.
Machine learning systems are complex software systems in which code and data are interlaced, so they face not only computer security risks but also data threats.