
Issue 14: Market yourself, ACID transactions in DeltaLake, OpenSource and Bayesian ABTests


This issue covers marketing yourself, ACID transactions in Delta Lake, open source and Bayesian A/B tests. I’ve seen and worked on a lot of different topics lately. They are diverse, but I think they can all be useful.

Diving Into Delta Lake: Unpacking The Transaction Log

I have often presented Delta Lake as if its only purpose were time travel. That’s not its only feature. Another very important one is ACID transactions. Delta Lake resolves conflicts optimistically.

In general, the process proceeds like this:

  • Record the starting table version.
  • Record reads/writes.
  • Attempt a commit.
  • If someone else wins, check whether anything you read has changed.
  • Repeat.

Delta Lake lets several writers modify different parts of the data concurrently. If two programs modify the same portion of data, Delta Lake applies their modifications serially, one commit after the other. If the conflict cannot be reconciled, an error is raised.

In the vast majority of cases, this reconciliation happens silently, seamlessly, and successfully. However, in the event that there’s an irreconcilable problem that Delta Lake cannot solve optimistically (for example, if User 1 deleted a file that User 2 also deleted), the only option is to throw an error.
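To make the commit loop more concrete, here is a minimal sketch in Python. It is a toy model and not Delta Lake’s API: `ToyTable`, `ConflictError` and `commit_optimistically` are hypothetical names, and the transaction log is reduced to a list of sets of file names.

```python
from dataclasses import dataclass, field


class ConflictError(Exception):
    """Raised when a concurrent commit changed files this writer had read."""


@dataclass
class ToyTable:
    # Toy transaction log: log[i] holds the file names changed by commit i.
    log: list = field(default_factory=list)

    @property
    def version(self) -> int:
        return len(self.log)

    def try_commit(self, expected_version: int, changed_files: set) -> bool:
        """Append a commit only if nobody else committed since expected_version."""
        if self.version != expected_version:
            return False
        self.log.append(set(changed_files))
        return True


def commit_optimistically(table, read_version, read_files, written_files,
                          max_retries=3):
    """Retry a commit until it lands or a real conflict is detected."""
    version = read_version
    for _ in range(max_retries):
        if table.try_commit(version, written_files):
            return table.version
        # Someone else won: check the commits that landed in the meantime.
        latest = table.version
        for other_changes in table.log[version:latest]:
            if other_changes & set(read_files):
                raise ConflictError("a file we read was changed concurrently")
        version = latest  # nothing we read changed, so try again on top
    raise ConflictError("gave up after too many retries")


table = ToyTable()
start = table.version                  # both writers start from version 0
table.try_commit(start, {"part-1"})    # the other writer commits first
# Our writer only read part-2, so the retry succeeds on top of version 1.
new_version = commit_optimistically(table, start, {"part-2"}, {"part-2b"})
```

In this sketch, a writer that had read `part-1` at version 0 would get a `ConflictError` instead, which is the "only option is to throw an error" case.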

How to market yourself (without being a celebrity)

Let’s start with a recap of the presentation:

Why should you market yourself?

You have experience and skills. It’s normal to want recognition for that. It can help you find another job or get promoted. The talk then distinguishes between marketing yourself outside and inside your company.

Tips to market yourself outside:

  • Brand yourself
  • You can choose a logo
  • Keep the same good picture
  • Be consistent
  • Create content on your own platform (blogs or podcasts for instance)
  • Speak with people (for example, you can write a blog post in response to another one)
  • Choose a domain of expertise. It should be a niche, but not too narrow a one. You must find a balance.
  • You don’t have to be an expert at the beginning: just speak about the subject you want.

Tips to market yourself inside your company:

  • Always keep a quick summary (one page) of your accomplishments in your company at hand: it helps you to be ready to market yourself at any time. It also helps you on bad days to realise that you do bring value.
  • During stand-ups and demos, be joyful and present things as accomplishments.
  • Help and mentor people
  • Start another project to show that you’re able to commit yourself and to take initiative
  • Create a newsletter internally and share your knowledge

I was pleased to attend this talk. We often speak about outside marketing. It’s the first time I have seen something about inside marketing. It’s valuable.

I don’t totally agree with the last two points about marketing yourself inside your company. I haven’t researched this subject, but I have already started another project, created an internal newsletter and organised talks. My objective was not specifically to market myself, although it’s true that it helped me a lot. But there are drawbacks. It takes time and you don’t focus on the project you’re paid for. Now, I prefer to focus only on my project and on what surrounds it, to improve it and communicate about it. It’s less demanding and, I think, more useful for everyone.

Be aware that I’m only giving you my personal experience here. By definition, it is worth far less than a real study.

Overview of Bayesian AB Tests

I’m working on Bayesian AB Tests for a showcase project.

So I decided to write a quick memo on some principles of Bayesian A/B tests.
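The memo is not reproduced here. As an illustration only, here is a minimal sketch of one common way to run a Bayesian A/B test on conversion rates: a Beta-Binomial model with a uniform Beta(1, 1) prior, and a Monte Carlo estimate of the probability that variant B beats variant A. The function name, the prior and the numbers are my assumptions for this example, not results from a real experiment.

```python
import random


def prob_b_beats_a(conv_a, n_a, conv_b, n_b, samples=20_000, seed=0):
    """Estimate P(rate_B > rate_A) by sampling from both posteriors.

    With a Beta(1, 1) prior and binomial data, the posterior of each
    conversion rate is Beta(1 + conversions, 1 + non-conversions).
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(samples):
        rate_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        rate_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        wins += rate_b > rate_a
    return wins / samples


# Made-up data: 1000 visitors per variant, 48 vs 63 conversions.
p = prob_b_beats_a(conv_a=48, n_a=1000, conv_b=63, n_b=1000)
print(f"P(B beats A) is roughly {p:.2f}")
```

The output is a probability that can be read directly ("how likely is it that B is better?"), which is the main difference in interpretation compared to a p-value.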

Contributing to open source is not what I thought it was

When I was younger, at the beginning of my career, I was fascinated by the idea of open source projects. I thought that contributing was out of reach for me. I still think it is difficult to open a pull request that modifies the code of a complex project like Apache Spark. But I have also realised that open source is not only about modifying code.

Thank you for reading. Feel free to contact me on Twitter if you want to discuss any of this.