Simon Mudd is a Senior Database Administrator at Booking.com, based in Madrid, Spain.
He has been working with MySQL in large production environments for over 10 years. He has contributed heavily to orchestrator (https://github.
He has a degree in Computation from the University of Manchester Institute of Science and Technology (UMIST), now the University of Manchester.
Presentation: How Booking.com avoids and deals with replication lag
MySQL/MariaDB replication is asynchronous. You can make replication faster by using better hardware (faster CPU, more RAM, or quicker disks), or you can use parallel replication to remove its single-threaded limitation, but lag can still happen. This talk is not about making replication faster; it is about how to deal with its asynchronous nature, including the (in-)famous lag.
We will start by explaining the consequences of asynchronous replication and how and when lag can happen. Then, we will present the solution used at Booking.com both to avoid creating lag and to minimize the consequences of stale reads on slaves (hint: this solution does not mean reading from the master, because that does not scale).
Once all of the above is well understood, we will discuss how Booking.com’s solution can be improved: it was designed years ago, and we would do things differently if starting from scratch today. Finally, I will present an innovative way to avoid lag: the no-slave-left-behind MariaDB patch.