As the title of this post suggests, we will look into how consistent hashing can be used to achieve the above objectives. Before that, let’s look at a straightforward approach to solving the problem.
In this approach, we hash the requests based on a key and use the formula hash(key) % number_of_servers to route the request to the appropriate cache server. For example, if the key “apple” hashes to 14 and we have 3 cache servers, the request for “apple” will be forwarded to server number 2, since 14 % 3 = 2.
Let’s simulate this for 3 cache servers, 100 unique keys, 300 random requests and see how it performs.
Let’s analyse the results
Modulo hashing works well for a fixed number of servers. But in many cases, we need to add or remove servers as traffic volume varies. And servers can crash sometimes. Let’s simulate the following dynamic-nodes scenario with modulo hashing and see how it performs.
Let’s analyse the results
This is because many keys map to a different server when the number of servers changes. For example, a key “orange” with hash value 11 is initially routed to server S2 when there are 3 servers (11 % 3 = 2), whereas it is routed to server S3 when there are 4 servers (11 % 4 = 3). This leads to ineffective use of the cache.
Consistent Hashing has a different approach to address the drawbacks of the modulo hashing with dynamic nodes. Let’s start with the basic concepts of consistent hashing.
Both servers and keys are hashed onto points on a circle, the hash ring. For example, if server S1 maps to 90, it will be placed as a point at 90 degrees on the circumference of the circle.
Now that we understand the basic concept, let’s run the simulation and observe the stats for 3 servers, 100 unique keys and 300 random requests.
We can observe that the cache hit ratio and load distribution are very similar to those of modulo hashing. This is expected, as the algorithm behaves almost the same for a fixed number of servers.
Let’s see how this basic concept of consistent hashing handles the addition and removal of nodes in the following scenario.
Let’s analyse the results
Server S4 doesn’t get many requests due to its proximity to node S3. Consistent hashing solves this load distribution problem by placing each node at multiple points on the ring. These points are called virtual nodes. For example, to represent node S1 as 4 points on the ring, we place virtual nodes S1-1 to S1-4 on the ring using the same logic as earlier. This allows multiple small fragments of the ring to be mapped to a single node.
Let’s simulate the previous elastic nodes scenario with 12 virtual nodes per node.
Let’s analyse the results
Server S4 gets a fair amount of traffic compared to earlier. This is because node S4 is mapped to multiple fragments of the ring, increasing its chance of getting a fair share of the traffic. If you would like to simulate your own scenarios, please modify this JSBin code and run your own experiments.
Consistent hashing has proven to be a useful technique since its inception in 1997, and it is used in many well-known distributed systems because of its simplicity and the benefits it offers. The optimization of consistent hashing does not end with what we have read so far. For example, check out this blog or the video by Vimeo engineering on their practical usage and adaptation.
Meta: You can find the code used for the above simulations here.
I had tried to learn Clojure a few times in the past and had dropped it because of the prefix notation for expressions and the parentheses black hole. Writing (def y (+ (* m x) c)) felt very weird after expressing it as y = m * x + c for many years of education. I wasn’t alone; many of my colleagues had similar feelings about Clojure and pure functional programming languages.
Some of us attended a training by a Clojure evangelist and expert, who recommended keeping an open mind in the beginning and learning the concepts by solving a few basic problems in Clojure. As part of this exercise, we wrote a function to find the factors of a number, which made me curious about extending it to find the prime factors of a number. My mind automatically started thinking of an imperative programming approach and translating it to functional style in Clojure.
After a few minutes of dabbling, it seemed like a very hard (close to impossible) problem to solve in Clojure. It made a dent in my confidence and I needed a fix. I decided to write it in imperative style first and quickly came up with a Python program to regain part of my confidence :)
As it was close to the end of the day (and week), my mind dropped the problem there. But my subconscious mind hadn’t let go of it, and handed me a hint at the end of a good night’s sleep: I needed to break the problem down into smaller abstractions, i.e. the prime factors of a number are the smallest prime factor of the number, followed by the prime factors of the quotient. The Clojure code I wrote after this sudden flash of thoughts followed exactly this decomposition.
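The original Python and Clojure listings aren’t preserved in this extract; the decomposition (emit the smallest prime factor, then recurse on the quotient) can be sketched like this, rendered here in JavaScript for illustration:

```javascript
// The smallest factor >= 2 of n is always prime, so emit it and recurse
// on the quotient until nothing is left.
function smallestFactor(n) {
  for (let f = 2; f * f <= n; f++) {
    if (n % f === 0) return f;
  }
  return n; // n itself is prime
}

function primeFactors(n) {
  if (n < 2) return [];
  const f = smallestFactor(n);
  return [f, ...primeFactors(n / f)];
}

console.log(primeFactors(360)); // [2, 2, 2, 3, 3, 5]
```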
This made me realise that the imperative code I had written earlier made the concept of prime factors harder to digest compared to the Clojure code. If I have to write this function again in any other language, I will definitely break it down into the abstractions defined in the Clojure function. In a way, Clojure was making it harder to write bad code!
Well, this was my #cloju-re-alization. What is yours? Leave a comment below or write your own blog post and share
]]>The preventive measures include monitoring:
In our case, DB backups are uploaded to Azure Blob Storage (similar to AWS S3), and Prometheus is used for monitoring.
High level design
- Export latest_file_timestamp and latest_file_size for each blob container where backup files are uploaded
- Alert when current_time - latest_file_timestamp > backup_interval or latest_file_size < expected_backup_file_size
As we couldn’t find any existing exporter, we wrote prometheus-azure-blob-exporter to capture these metrics (the latest file’s timestamp and size for each container).
Alerts are defined on top of these metrics: one fires when current_time - latest_file_timestamp exceeds the backup interval, another when latest_file_size falls below the expected backup file size.
Please check out the GitHub repo for more details.
]]>Monitoring short-lived cron jobs is not straightforward compared to monitoring long-running services like web services. These are a few well-known mechanisms for alerting on cron job failures.
If you have a CI server like Jenkins in your infrastructure, one approach that has worked well for us is to create a scheduled job in Jenkins with Slack/email notifications on failures.
Advantages of this approach
Whether you use scheduled jobs in crontab or Jenkins, you shouldn’t depend only on the job’s exit status for determining success [Refer]. It is important to have alerting based on the expected state of the system after the job has run, e.g. the timestamp of the latest backup file uploaded to backup storage, a minimum size for the backup file, etc.
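Such a state-based check can boil down to something like this (hypothetical thresholds and field names):

```javascript
// Alert on the expected state after the job ran, not on its exit status:
// the newest backup file must be both recent enough and large enough.
function backupIsHealthy(latestFile, now, maxAgeMs, minSizeBytes) {
  return (now - latestFile.timestamp) <= maxAgeMs &&
         latestFile.size >= minSizeBytes;
}

const DAY = 24 * 60 * 60 * 1000;
const now = Date.now();
console.log(backupIsHealthy({ timestamp: now - 2 * DAY, size: 5000 }, now, DAY, 1024)); // false (too old)
console.log(backupIsHealthy({ timestamp: now - 1000, size: 5000 }, now, DAY, 1024));    // true
```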
UPDATE: Check out Monitoring DB backups using Prometheus for more details on monitoring DB backups.
]]>Prior to this, we were deploying services on VMs using Ansible. For a new project, we wanted to explore the benefits of running services as containers, and decided to use Docker Swarm due to its simpler setup and consistency with the Docker Engine APIs.
Using a container orchestration engine to run services provided the following benefits.
Docker swarm specific
Overall, using a container orchestration engine in production has proven to be very productive and useful. Docker Swarm is maturing over time; with improved stability, it is a promising platform for running containers in production.
]]>
So the question could be
To figure out these series, my approach was to start with some series x and find a pattern which breaks down after a few numbers. After a few unsuccessful attempts, I tried my luck with the series of squares.
Voilà! The differences of consecutive squares were not only odd numbers, they were consecutive odd numbers. I tried it for a long series and found it true for all the squares!
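The observation is easy to verify, and follows from the identity (n + 1)^2 - n^2 = 2n + 1:

```javascript
// Differences of consecutive squares: (n + 1)^2 - n^2 = 2n + 1, so they are
// exactly the consecutive odd numbers 3, 5, 7, ...
function squareDifferences(count) {
  const diffs = [];
  for (let n = 1; n <= count; n++) {
    diffs.push((n + 1) * (n + 1) - n * n);
  }
  return diffs;
}

console.log(squareDifferences(6)); // [3, 5, 7, 9, 11, 13]
```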
From what I knew of squares of numbers, this wasn’t very obvious. I was very excited to check whether this was already known or I was the first one to discover it :). A simple web search presented a ton of articles on this observation.
That was disappointing, but reading this link about a mathematician appreciating a student’s effort on a similar finding was comforting, and encouraged me to carry on with my quest.
]]>This was the era where you would have a single deployable unit for the whole application (labeled as monolithic now). The application and the database would be deployed on a known set of bare-metal servers. Vertical scaling was the more favorable option for scaling.
Nagios was (and probably still is) the widely used open source monitoring software of this era. You would configure a list of known servers for the monitoring system to probe to determine the health of the system. This system was simple to configure and operate.
Image Credit: https://support.nagios.com/kb/article.php?id=141
Managing monolithic applications became painful as businesses grew. Service Oriented Architecture (SOA) became mainstream in this era. Cloud services like AWS made it very easy to launch new VMs for deploying services. Configuration management tools like Puppet and Ansible made it easy to deploy applications across a large number of servers. The immutable server pattern was evangelized by well-known companies.
Horizontal scaling started becoming the more favorable option. One could bring up new instances of a service by launching VMs from an image (e.g. AMIs) and bring down VMs easily depending on the load on the system. This deployment architecture demanded a monitoring system that could handle this dynamicity.
Monitoring systems like Sensu solved this issue with a different architecture style. Instead of a central server probing a static list of servers, Sensu had a publish/subscribe model using RabbitMQ. The monitoring agent running on the application server subscribes to the monitoring check messages relevant to the service and pushes the results back via the same messaging system. The monitoring server itself was horizontally scalable to handle varying load.
Image Credit: https://sensuapp.org/docs/1.0/overview/architecture.html
Microservices are becoming mainstream now. Deploying stateless services as containers has made the concept of immutable servers very easy and efficient compared to using VMs. Container orchestration engines like Kubernetes and Docker Swarm have made it easy to run and manage a large number of services running as containers inside a cluster of servers. These orchestration engines also provide benefits like auto-healing (restart on failure), easy scaling, and built-in service discovery and load balancing.
Monitoring systems like Sensu don’t fit well in this setup. Running a monitoring agent alongside the application process adds complexity to containers. And since service discovery is provided by the container orchestration engines, there is no need to add the complexity of running a messaging system to solve the discovery problem.
Around the time Google open sourced Kubernetes (the most popular container orchestration engine as of now), a new monitoring system, Prometheus (built by ex-Googlers at SoundCloud), started gaining a lot of traction. Prometheus leverages service discovery mechanisms for registering the services to be monitored. It has a much simpler setup and smaller resource requirements compared to a system like Sensu. Prometheus factored in the container ecosystem and fits the job very well. The scalability argument against the pull model of monitoring has also been addressed by the authors of the system.
Image Credit: https://prometheus.io/docs/introduction/overview/
If you are building a new system with the architecture patterns and deployment strategies of this era, Prometheus is a leading choice among open source monitoring systems.
PS: There are a lot of good things (and a few limitations) about Prometheus which deserve a separate blog post :)
]]>Watch the below video to get an idea of what you could build by following this article.
We’ll be using a sensor to detect an obstacle in front of the robot. Depending on the sensor input, we’ll control the motor wheels of the robot to either move forward or turn aside.
Follow the steps in the order below. If you get stuck at any point, refer to the troubleshooting section.
If you are new to Arduino, try out a few basic examples first.
I couldn’t create a Fritzing diagram for this circuit since I don’t have the SVG for the purchased part, so I’ll list the connections needed to get it running.
The connection should look somewhat like the image below
Please refer to the images on the seller’s website for additional technical details.
The loop function initially tests the move_forward function. Check that both wheels are moving in the forward direction. Then edit the loop function to replace move_forward with the other methods like drive_backward, turn_left and turn_right, and verify that they work as expected. Finally, upload the code which has the logic to move the robot depending on the distance of the obstacle. Feel free to change the obstacle distance, the delay, or the rotation angle as per your motor speed.
These are optional enhancements that I wanted to try but didn’t get an opportunity to.
The code and Fritzing diagram are also shared in the GitHub repo.
The system identifies users by their mobile number, and hence the mobile number must be unique across users. Users are soft deleted in the system by updating the column deleted = 1. A new user can register with the same mobile number as a previously deactivated user (since mobile numbers are recycled by telecoms). The uniqueness check at the application level is susceptible to failure under concurrent requests, so a unique constraint is needed at the DB level to ensure the integrity of the data.
We were able to find different flavors of solutions on the net, but they were incomplete for our case. They only served as a starting point to a solution that met all of the needs mentioned above.
The final solution:
- Add a new column deletion_token
- Add a unique constraint on mobile_number, deletion_token
- Active users always carry the same sentinel value in deletion_token. This is ensured by setting a default value of NA at the DB level and having the constructor of the User model (used by the ORM) initialize deletion_token to NA by default
- On soft delete, set deletion_token to a unique value such as a UUID
1. Add a unique constraint on the columns mobile_number, deleted. Drawback: this wouldn’t allow us to have more than one deleted user with the same mobile number.
2. Add a unique constraint with a WHERE clause, e.g. ADD CONSTRAINT .... WHERE deleted != 1;. Drawback: a WHERE clause in a constraint definition is not supported by all databases.
3. Instead of using only 0 or 1 as values for the deleted column, increment the number on each delete. Drawback: expensive, as it needs an extra DB call to retrieve previously soft-deleted rows, and also expensive to update the numbers for existing soft-deleted rows in a legacy system. It would theoretically fail for concurrent requests without a lock.
4. Add a new timestamp column called deleted_at and add a unique constraint on mobile_number, deleted_at. Drawback: the old rows in the legacy system didn’t have data for deleted_at, and populating them with dummy data wasn’t acceptable.
5. Add a new column called deletion_token and add a unique constraint on mobile_number, deletion_token, with a NULL value for new rows and a UUID for soft-deleted rows. Drawback: some databases don’t consider NULLs equal, and hence the unique constraint does not fail for two rows with the same mobile number and a NULL deletion_token.
A slight modification to point 5 leads to the final solution described at the beginning of the post.
It gets a lot harder to talk about minimum scope when you are rewriting existing software. We would usually want to go live with the minimum viable product (MVP) and build the rest of it incrementally. If people are new to agile methodologies, these questions about reducing scope might look stupid and annoying. Fortunately there are ways to get these questions answered; I’m sharing my experience from a couple of projects over the past few years.
Let’s start with a story where we are writing a new version of a popular website.
Team: According to the stats, features X & Y are used by only 5% of users. Can we deprioritize them for the first release? We can redirect users who need them to the old site.
Product owners: No! Everything goes live or nothing does.
Start by implementing a walking skeleton [1] [2] to validate the approach. After the first few iterations’ showcase:
Team: We have a thin slice of the end-to-end user journey. This is how it works.
Product owners: Wow, that’s great. What’s left then?
Team: This one does not handle some of the rare scenarios. Let’s prioritize what is needed.
Prioritize the backlog to do the most important features first. A few weeks before the planned release of all features, ask the question again:
Team: We have everything except features X & Y. It would take another month to implement X & Y. Can we go live, with users who need X & Y being redirected to the old site?
Product owners: Let’s go live!! We need to get there before our competitors.
You can’t get all the answers in the beginning. Prioritize your backlog to do the most important features first. Ask your unanswered question(s) again after each milestone; you’ll be surprised how easy it is to get the answers this time.
It is hard for people to understand the benefits of agile methodologies, kanban, etc. when it is just theory based on your past experiences, which they can’t relate to. Build something small and tangible; show the working software to build their confidence.
When you use method_missing in Ruby, you need to make sure to implement respond_to_missing? as well, otherwise bad things will happen to you. The minimal recommended approach for providing dynamic methods is to implement both hooks and keep their logic consistent with each other.
The same can be achieved in Python using __getattr__ with less code.
Even though it isn’t a lot of code in Ruby, people can forget to implement both methods, or implement them differently by mistake, leading to tricky bugs and a higher maintenance cost.
Maybe not. This is because Ruby functions are not first-class objects that could be returned from a single method_missing hook. Also, Ruby’s syntax of calling a method without parentheses (i.e. foo.bar_qux is the same as foo.bar_qux()) makes it hard to treat functions as callable objects.
Test Driven Development
]]>App objects are a natural extension of the page objects recommended for writing functional tests. An app encapsulates all the pages and coarse-grained actions in the application.
A simple implementation of an app class would be:
The test using this simple implementation would look like
The test code is structured as
apps
  registration
    app.rb
    patient_page.rb
    visit_details_page.rb
  clinical
    app.rb
    patient_search_page.rb
features
  new_patient_visit.rb
framework
  app.rb   # Base class for other apps
  page.rb  # Base class for other pages
We are using Capybara and RSpec. The lambda syntax and metaprogramming constructs in Ruby, along with convention-based programming, allowed us to implement the DSL shown below. The complete code can be found here.
I didn’t go into the details of implementing the DSL. If people are interested, I can write a part 2 of this post.
]]>A couple of bad ways to achieve this are using sleep(xSeconds) or wait_for_ajax, like this:
The sleep(xSeconds) approach makes your tests nondeterministic, and wait_for_ajax makes them dependent on the JavaScript framework used in the application.
Frameworks like Capybara have an implicit wait mechanism which eliminates the need for wait_for_ajax:
But what happens when
In these cases, don’t go back to the wait_for_ajax or sleep solutions. Instead of depending on the technical details of the app, you need to think in terms of:
How does the user know the app is done loading or saving data?
This will give you a hint on any missing usability requirements in the app.
When you think like a user, you will realize there must be a visual clue in the app to indicate progress. The functional tests should also depend on this indicator (a spinner, an overlay, etc.) to figure out when to assert on the data.
When your tests are user centric, they provide valuable feedback on the user experience.
]]>If you are using an MVC framework, you need to make sure the controllers are very thin and the domain logic lies in small, framework-independent, composable models - Wise People
In AngularJS, you need to make sure that a lot of data is not defined directly on $scope and that the domain logic does not depend on Angular’s digest cycle. If you follow this mantra, unit testing the models will be a lot simpler, which in turn indicates that your code is in good shape.
Alright, let’s get to some code. Consider a simple example where we have a form to capture a person’s information such as firstName, lastName, age or dateOfBirth. The age or dateOfBirth should be auto-populated based on its counterpart.
If you want to test the logic that computes fullName, or the age<->dateOfBirth logic, you will have to use angular-mock and inject $scope in your tests. This leads to a lot of unnecessary boilerplate code. Let’s look at how to refactor this code.
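The refactored listing isn’t preserved here, but a framework-independent model in that spirit might look like this (hypothetical Person model, not the post’s original code):

```javascript
// A plain model owning the domain logic, independent of $scope.
function Person(firstName, lastName) {
  this.firstName = firstName;
  this.lastName = lastName;
}

Person.prototype.fullName = function () {
  return this.firstName + ' ' + this.lastName;
};

// The controller's only job is to wire the model to the view, e.g.
//   $scope.person = new Person('Ada', 'Lovelace');
console.log(new Person('Ada', 'Lovelace').fullName()); // Ada Lovelace
```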
Now you can simply instantiate a person object and test the fullName method.
In this step we will use Object.defineProperty, an ES5 API which works in most browsers.
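That listing isn’t preserved here either; a sketch of keeping age and dateOfBirth in sync with accessors could look like this (illustrative, with a year approximated as 365.25 days):

```javascript
// Keep age and dateOfBirth in sync with ES5 accessors instead of $watch.
const MS_PER_YEAR = 365.25 * 24 * 60 * 60 * 1000;

function Person() {
  let dateOfBirth = null;
  Object.defineProperty(this, 'dateOfBirth', {
    get() { return dateOfBirth; },
    set(value) { dateOfBirth = value; }
  });
  Object.defineProperty(this, 'age', {
    get() {
      if (dateOfBirth === null) return null;
      return Math.floor((Date.now() - dateOfBirth.getTime()) / MS_PER_YEAR);
    },
    set(years) { dateOfBirth = new Date(Date.now() - years * MS_PER_YEAR); }
  });
}

const p = new Person();
p.age = 30;
console.log(p.age);                         // 30
console.log(p.dateOfBirth instanceof Date); // true
```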
After this step, your domain logic can be tested without having to use angular-mock or injectors etc.
One of the boasted features of AngularJS is using POJOs for data binding, compared to the special observables or models in Knockout, Ember, etc. If this is one of the reasons you are using AngularJS, it is very important to make sure your domain logic doesn’t leak into the controllers.
]]>To construct a function from a string, you can use eval() or new Function(). The basic differences between the two are:
- eval() works within the current execution scope. It can access or modify local variables.
- new Function() runs in a separate scope. It cannot access or modify local variables.

These samples show how the JSON would differ in the two cases.
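The original samples aren’t preserved in this extract; a minimal sketch of the difference, with a hypothetical payload:

```javascript
// A function stored as a string in JSON, revived with new Function,
// which runs in its own scope rather than the caller's.
const json = JSON.stringify({
  name: 'double',
  expression: 'return price * 2;'
});

const spec = JSON.parse(json);
const double = new Function('price', spec.expression);
console.log(double(100)); // 200

// eval, by contrast, evaluates in the current scope and can read locals:
const factor = 0.5;
const half = eval('(function (price) { return price * factor; })');
console.log(half(100)); // 50
```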
You can use either based on the use case in your application. In Bahmni, we went with new Function() for a couple of reasons.
If you prefer the eval syntax, try vkiryukhin/jsonfn.
The above examples work fine for single-line expressions. If you need multi-line functions, you need to tweak things a bit: JSON does not support multiline strings, so the workaround is to define an array of strings as shown below.
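A sketch of the array-of-strings workaround (hypothetical payload):

```javascript
// JSON has no multiline strings, so store the function body as an array of
// lines and join them before constructing the function.
const json = JSON.stringify({
  params: ['a', 'b'],
  body: [
    'var sum = a + b;',
    'return sum;'
  ]
});

const spec = JSON.parse(json);
const add = new Function(...spec.params, spec.body.join('\n'));
console.log(add(2, 3)); // 5
```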
There is not much difference in performance between defining a function using a function() {} expression, eval('function() {}'), or new Function(). Have a look at this benchmark on jsPerf.
To solve this, one might add an option in the interceptor to not show the spinner for certain calls. This leads to complicated code due to the initial wrong assumption. A better solution is to have simple reusable code to show/hide the spinner and use it explicitly for the calls which need it.
If you are using a library which returns a promise for an AJAX call (or an object like the xhr returned by jQuery.ajax), the API and implementation would look like this:
If multiple components of the page use the same spinner, we need to enhance the code to make sure the spinner is hidden only after all the components have completed their async calls. This can be implemented by keeping a spinner count as shown below.
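The counting listing isn’t preserved here; a framework-free sketch of the idea, with the show/hide callbacks injected and all names illustrative:

```javascript
// Reference-count the spinner: show on the first in-flight call, hide only
// when the last one settles.
function createSpinner(show, hide) {
  let pending = 0;
  return {
    forPromise(promise) {
      if (pending === 0) show();
      pending += 1;
      const done = () => {
        pending -= 1;
        if (pending === 0) hide();
      };
      promise.then(done, done); // hide on success and on failure alike
      return promise;
    }
  };
}

const spinner = createSpinner(() => console.log('show'), () => console.log('hide'));
spinner.forPromise(Promise.resolve('data')); // prints "show" now, "hide" after it settles
```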
In Bahmni, we use a spinner with animation which can be found here.
]]>In Bahmni EMR we needed to support customizable HTML templates for printing patient registration cards and other printable documents. We needed a print API which looks like this:
As the app is built using AngularJS, we decided to use Angular as the templating engine for rendering these templates as well. This also helped us reuse filters and other templating features of Angular. The implementation consists of the following steps.
The code for print function looks like this
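The Angular listing isn’t preserved in this extract; the rendering step can be sketched framework-free as simple placeholder interpolation (a stand-in for Angular’s $compile, names illustrative):

```javascript
// Interpolate {{placeholder}} tokens in an HTML template with document data.
function renderTemplate(template, data) {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (match, key) =>
    key in data ? String(data[key]) : '');
}

const template = '<div class="card"><b>{{name}}</b> ({{id}})</div>';
console.log(renderTemplate(template, { name: 'John', id: 'GAN1234' }));
// <div class="card"><b>John</b> (GAN1234)</div>
```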
The complete solution is available on github.
]]>AngularJS does not raise any event to notify you of this. The suggested simple solution is to use $timeout to queue your work to run after the current digest cycle (it also waits for DOM rendering to be completed by the browser).
The above solution works only for views which don’t have ng-include or directives with a template URL. In that case you have to wait for all the templates to be loaded (they load asynchronously) and then run your code. This can be achieved by waiting for $http.pendingRequests to drop to zero. The enhanced solution is:
Hopefully AngularJS will come up with an easier solution in a future release.
$http.pendingRequests is supposed to be used for debugging purposes only. If the Angular team decides to remove it, you can implement the same using HTTP interceptors as suggested in this link.
]]>
The issues with the above solution:
The first issue can be addressed by using an iframe instead of a new window. The second and third issues are addressed by making sure the print happens after the page has loaded the CSS files and images. The working solution we use in Bahmni for printing patient registration cards looks like this:
For printing the contents of an element, you can use this: