This week, I think I’ve finally understood what a feature store is (thanks to Tecton). There is also some other news: better integration of the Airflow orchestrator on AWS, distributed XGBoost with Spark, and a chatbot from Google that is better than before but still weird.
I’m a big fan of Apache Airflow, which is a great tool for orchestrating your tasks, so I’m glad to hear about this:
A fully managed service that makes it easy to run open-source versions of Apache Airflow on AWS
A great article about what a feature store is concretely. In a nutshell:
A feature store is an ML-specific data system that:
- Runs data pipelines that transform raw data into feature values
- Stores and manages the feature data itself, and
- Serves feature data consistently for training and inference purposes
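To make those three responsibilities concrete for myself, here is a minimal in-memory sketch in Python. It is only a toy under my own assumptions, not how Tecton or any real feature store is implemented: the `ToyFeatureStore` class, its method names, and the `order_count` / `total_spent` features are all hypothetical names I made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical types for this sketch: a raw event is a dict of numeric fields,
# and a feature transform turns a list of raw events into one feature value.
RawEvent = Dict[str, float]
FeatureFn = Callable[[List[RawEvent]], float]


@dataclass
class ToyFeatureStore:
    """Toy illustration only; every name here is an assumption."""
    transforms: Dict[str, FeatureFn] = field(default_factory=dict)
    values: Dict[str, Dict[str, float]] = field(default_factory=dict)

    def register(self, name: str, fn: FeatureFn) -> None:
        # Declare how a feature is computed from raw data.
        self.transforms[name] = fn

    def run_pipeline(self, entity_id: str, raw_events: List[RawEvent]) -> None:
        # 1. Run the pipeline: transform raw data into feature values.
        row = {name: fn(raw_events) for name, fn in self.transforms.items()}
        # 2. Store and manage the feature data itself.
        self.values[entity_id] = row

    def get_features(self, entity_id: str, names: List[str]) -> List[float]:
        # 3. Serve the same stored values to both training and inference.
        row = self.values[entity_id]
        return [row[name] for name in names]


store = ToyFeatureStore()
store.register("order_count", lambda events: float(len(events)))
store.register("total_spent", lambda events: sum(e["amount"] for e in events))
store.run_pipeline("user_42", [{"amount": 10.0}, {"amount": 5.5}])

feature_names = ["order_count", "total_spent"]
training_row = store.get_features("user_42", feature_names)  # used to build a training set
serving_row = store.get_features("user_42", feature_names)   # used at prediction time
print(training_row, serving_row)
```

The point of the sketch is the last few lines: training and inference read from the same stored values through the same call, instead of each recomputing the features with its own code.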
Chatbots get better year after year, but they are still weird:
XGBoost is currently one of the most popular machine learning libraries and distributed training is becoming more frequently required to accommodate the rapidly increasing size of datasets.
XGBoost4J-Spark can now be quickly used to distribute training on big data for high performance and accuracy predictions
Thank you for reading. Feel free to contact me on Twitter if you want to discuss machine learning in real life.