Let’s say you are using some kind of application performance tool (HP Diagnostics, New Relic, etc) and you see a graph that looks like this:
What do we know by looking at this? That the reason the request is taking 20 seconds is due to four linear, synchronous http requests. I see this kind of thing frequently when assessing the performance of customer applications, where subsystems are tested and considered to have acceptable performance in isolation. Then someone comes along and wants to tie several services together and doesn’t stop to reflect upon how 4 synchronous calls of 4 seconds each is automatically 16 seconds. How do you fix this? You have two basic choices:
1) Find a way to significantly improve the performance of each subsystem.
2) Find a way to call the subsystems asynchronously such that the overall execution time is reduced.