!geek

Code, learn & share

Visual simulation of consistent hashing

Caching is an important aspect of high-performance applications. As the data volume increases, the cached data needs to be distributed across multiple servers. We need to make sure the following objectives are met while doing so:

  • Maximize cache hits: This reduces the load on the primary data source and the overall latency.
  • Distribute data and traffic evenly: This ensures optimal use of the servers and avoids overloading a subset of them.

As the title of this post suggests, we will look into how consistent hashing can be used to achieve the above objectives. Before that, let’s look at a straightforward approach to solving the problem.
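As a preview of where we are headed, here is a minimal sketch of a consistent-hash ring in Python. The class name, node names, and the virtual-node count are illustrative assumptions, not taken from the simulation in the post.

import bisect
import hashlib

class ConsistentHashRing:
    """Minimal consistent-hash ring; details are illustrative."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes   # virtual nodes per server, smooths the key distribution
        self._ring = []        # sorted hash positions on the ring
        self._owner = {}       # hash position -> server name
        for node in nodes:
            self.add(node)

    def _hash(self, key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add(self, node):
        for i in range(self.vnodes):
            pos = self._hash(f"{node}#{i}")
            bisect.insort(self._ring, pos)
            self._owner[pos] = node

    def get(self, key):
        # Walk clockwise to the first virtual node at or after the key's position.
        idx = bisect.bisect(self._ring, self._hash(key)) % len(self._ring)
        return self._owner[self._ring[idx]]

ring = ConsistentHashRing(["cache-1", "cache-2", "cache-3"])
print(ring.get("user:42"))   # maps the key to one of the three cache servers

Because keys map to positions on the ring rather than to a fixed server index, adding or removing a server only remaps the keys adjacent to it instead of reshuffling everything.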

Cloju-re-alization

This is a story of how learning Clojure made me realize one of its selling points through subconscious thinking.

I had tried to learn Clojure a few times in the past and had dropped it because of the prefix notation for expressions and the parentheses black hole. Writing (def y (+ (* m x) c)) felt very weird after expressing it as y = m * x + c during many years of education. I wasn’t alone; many of my colleagues had a similar feeling about Clojure and pure functional programming languages.

Monitoring DB backups using prometheus

Ensuring correct database backups are taken at regular intervals is critical for disaster recovery. The recent GitLab database incident re-emphasizes this fact. GitLab was very transparent about it and documented approaches for preventing such failures.

The preventive measures include monitoring that:

  • A backup file is created every x interval: Catches backups not being uploaded due to a backup script error or a scheduling error
  • The size of the latest backup file is at least y bytes: Catches an erroneous backup file uploaded due to a script error
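A minimal sketch of exporting these two signals from the backup job, assuming the backup runs as a cron-style script and a Prometheus Pushgateway is reachable on localhost:9091. The metric names and file path are illustrative assumptions, not from the post.

import os
import time
from prometheus_client import CollectorRegistry, Gauge, push_to_gateway

BACKUP_FILE = "/backups/db-latest.dump"   # illustrative path

registry = CollectorRegistry()
last_success = Gauge("db_backup_last_success_timestamp_seconds",
                     "Unix time of the last successful backup", registry=registry)
backup_size = Gauge("db_backup_size_bytes",
                    "Size of the latest backup file in bytes", registry=registry)

# Run at the end of the backup script, after the dump file is written.
last_success.set(time.time())
backup_size.set(os.path.getsize(BACKUP_FILE))

push_to_gateway("localhost:9091", job="db_backup", registry=registry)

Alert rules can then fire when the timestamp is older than x or the reported size drops below y bytes.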

To cron or not to cron

Using crontab for scheduling background jobs like database backups and purging old data is a common practice in a lot of IT organizations. As these jobs run in the background, failures can easily get missed without proper monitoring. In most cases, failures occur due to an error in the job script, and sometimes due to mistakes in the cron configuration.

Docker swarm in production

This is an experience report on using Docker Swarm (17.06) in production for ~3 months.

Context

Prior to this, we were deploying services on VMs using Ansible. For a new project, we wanted to explore the benefits of running services as containers and decided to use Docker Swarm due to its simpler setup and consistency with the Docker Engine APIs.

Joy of finding a math series

I was trying to come up with a question for a preliminary logic round for interviews. I decided to form a tricky math series and tried to figure out two math series which look the same in the beginning and diverge after a while. Something along the lines of:

series x => x1, x2, x3, x4, x5
series y => y1, y2, y3, y4, y5

where x1=y1, x2=y2, x3=y3 and x4!=y4, x5!=y5

So the question could be

Write the missing number in the below series
x1, x2, x3, ? , y5
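One way to construct such a pair (an illustrative construction, not necessarily the series I ended up with) is to add a term to the base series that vanishes at the first three indices:

# y_n agrees with x_n for n = 1, 2, 3 and diverges from n = 4 onwards,
# because (n-1)(n-2)(n-3) is zero exactly at the first three indices.
def x(n):
    return n                                   # base series: 1, 2, 3, 4, 5

def y(n):
    return x(n) + (n - 1) * (n - 2) * (n - 3)

print([x(n) for n in range(1, 6)])   # [1, 2, 3, 4, 5]
print([y(n) for n in range(1, 6)])   # [1, 2, 3, 10, 29]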

Evolution of monitoring systems

Monitoring systems have evolved over time to support evolving architecture styles and deployment strategies.

Generation X+1: Monolithic applications, Bare metal servers

This was the era where you would have a single deployable unit for the whole application (labeled as monolithic now). The application and the database would be deployed on a known set of bare metal servers. Vertical scaling was the more favorable option for scaling.

Simple obstacle avoiding robot using arduino

Building an obstacle-avoiding robot is a simple & fun way to start learning Arduino and electronics. A lot of useful articles explain this, but you will be blocked if you can’t get the same parts in your region. In this post, I’ll explain how to build a simple and minimal robot using parts available online in India.

Watch the video below to get an idea of what you could build by following this article.

Soft delete and unique constraint

This post describes a robust solution and other alternatives for having a unique constraint at the DB level for a table with soft-deleted rows.

Problem Context

The system identifies users by their mobile number, and hence the mobile number must be unique across users. Users are soft deleted in the system by updating the column deleted = 1. A new user can register with the same mobile number as a previously deactivated user (since mobile numbers are recycled by telecoms). The unique check at the application level is susceptible to failure in case of concurrent requests, so a unique constraint is needed at the DB to ensure integrity of the data.
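One common way to get this guarantee at the DB level is a partial unique index that only covers active rows. A minimal sketch using SQLite follows; the table and column names are assumptions, and the robust solution described later in the post may use a different technique.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id            INTEGER PRIMARY KEY,
    mobile_number TEXT NOT NULL,
    deleted       INTEGER NOT NULL DEFAULT 0
);
-- Uniqueness is enforced only for active (non-deleted) rows.
CREATE UNIQUE INDEX uniq_active_mobile ON users (mobile_number) WHERE deleted = 0;
""")

conn.execute("INSERT INTO users (mobile_number) VALUES ('9999999999')")
conn.execute("UPDATE users SET deleted = 1 WHERE mobile_number = '9999999999'")
# The recycled number can register again because the old row is soft deleted...
conn.execute("INSERT INTO users (mobile_number) VALUES ('9999999999')")
# ...but a second active row with the same number raises sqlite3.IntegrityError.

Not every database supports partial indexes (PostgreSQL and SQLite do), which is one reason the alternatives are worth discussing.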

Working software is worth thousand assurances

At the beginning of a software project, everyone starts with a big list of features. If you ask the product owner, “What is the minimum set of features that we can go live with?”, most of the time you’ll hear a lot more than you expected.

It gets a lot harder to talk about minimum scope when you are rewriting existing software. We would usually want to go live with the minimum viable product (MVP) and build the rest of it incrementally. If people are new to agile methodologies, these questions on reducing scope might look stupid & annoying. Fortunately, there are ways to get these questions answered; I’m sharing my experience from a couple of projects in the past few years.

Let’s start with a story where we are writing a new version of a popular website.