In the architectures I usually work with, concurrency issues arise when multiple users view and edit the same information at the same time, and when multiple external systems call the same services simultaneously.
The goal is to allow maximum concurrency as far into each flow as possible, because this helps achieve higher performance, increased capacity, and better scalability.
Avoiding concurrency collisions requires protecting critical sections with programming techniques such as locks supported by the operating system running on the servers.
Some technologies automatically protect critical resources against corruption due to concurrency.
An example of this is database software that may be designed or configured to automatically lock records while writing data to tables. If parallel threads attempt to update the same record at the same time, only one will succeed, and all others will fail.
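This behavior can be sketched in Python. The dictionary and `threading.Lock` below are hypothetical stand-ins for a database row and its engine-level record lock; the non-blocking acquire mimics an engine that rejects a second concurrent writer instead of queuing it.

```python
import threading
import time

# Hypothetical in-memory "record", standing in for a database row that
# the engine locks automatically during a write.
record = {"balance": 100}
row_lock = threading.Lock()
results = []
start = threading.Barrier(2)  # make both writers attempt at the same moment

def update_record(delta):
    start.wait()
    # acquire(blocking=False) mimics a database that rejects a second
    # concurrent writer instead of queuing it: only one update succeeds.
    if row_lock.acquire(blocking=False):
        try:
            time.sleep(0.2)  # simulate the write taking some time
            record["balance"] += delta
            results.append("ok")
        finally:
            row_lock.release()
    else:
        results.append("failed")

threads = [threading.Thread(target=update_record, args=(10,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

After both threads finish, one update has been applied and the other has been rejected, which is exactly the situation the calling code must be prepared to handle.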
Even though built-in locks protect data integrity, you still need to catch and handle the failing transactions. You shouldn’t push the problem back to the calling external system, expecting it to happily retry its call later.
Either catch these errors and retry the pending transactions on the calling external system’s behalf, or, even better, implement locking before calling the subsystem that could otherwise fail due to internal locking. Doing this essentially establishes queuing in front of critical sections, which minimizes delays caused by locking situations.
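A minimal sketch of this queuing approach, again using a `threading.Lock` as a hypothetical gate in front of the fallible subsystem: callers block and line up instead of failing, so no retries are needed.

```python
import threading

# Hypothetical gate placed *before* the subsystem whose internal locking
# could otherwise make concurrent calls fail. A blocking acquire queues
# callers in front of the critical section instead of rejecting them.
gate = threading.Lock()
record = {"balance": 100}
outcomes = []

def call_subsystem(delta):
    with gate:  # callers line up here; every one eventually succeeds
        record["balance"] += delta
        outcomes.append("ok")

threads = [threading.Thread(target=call_subsystem, args=(10,)) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

All five updates succeed, serialized by the gate, at the cost of callers briefly waiting their turn.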
Resorting to retrying steps blocked by locking always involves waiting time between retries. Much of this waiting time ends up being wasted, because the blocking lock is likely to be released long before the next retry takes place.
At the same time, retries can’t be allowed to happen at too short intervals, because each retry comes with added overhead.
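The retry pattern described above can be sketched as a small helper. The function and operation names are hypothetical; the fixed pause between attempts is the waiting time that is partly wasted when the lock is released earlier.

```python
import time

def with_retries(operation, attempts=3, delay=0.1):
    # Hypothetical helper: retry a fallible operation with a fixed pause.
    # The pause is often longer than the lock actually stays held, so
    # part of each wait is wasted; making it too short instead adds
    # overhead from excessive retry attempts.
    for attempt in range(attempts):
        try:
            return operation()
        except RuntimeError:
            if attempt == attempts - 1:
                raise  # out of attempts; surface the failure
            time.sleep(delay)

calls = []

def flaky_update():
    # Simulated subsystem call that fails twice with a lock error,
    # then succeeds on the third attempt.
    calls.append(1)
    if len(calls) < 3:
        raise RuntimeError("record locked")
    return "committed"
```

Calling `with_retries(flaky_update)` returns `"committed"` after three attempts, having slept twice in between.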
Ideally, you should use both techniques. Protect critical sections using locks, and catch timeouts and retry the affected operations.
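Combining the two techniques might look like the following sketch (the helper name and parameters are hypothetical): callers queue on a lock with a timeout, and a timed-out attempt is retried rather than failed outright.

```python
import threading

gate = threading.Lock()  # queue in front of the critical section

def guarded_call(work, lock_timeout=0.5, attempts=3):
    # Hypothetical combined approach: wait in line for the lock, but
    # give up after lock_timeout and retry, so a slow or stuck holder
    # can't block a caller forever.
    for attempt in range(attempts):
        if gate.acquire(timeout=lock_timeout):
            try:
                return work()
            finally:
                gate.release()
        # Timed out waiting for the lock; loop around and retry.
    raise TimeoutError("critical section unavailable after retries")
```

When the lock is free, `guarded_call` runs the work immediately; when another holder never releases it, the caller eventually gives up with a `TimeoutError` instead of waiting indefinitely.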