Microservices – Handling Distributed Data

Hope you are safe in this pandemic.

Since from few years, I’m working on Microservices and related ecosystem. Though to write down a series of blogs based on my understanding and experience. As continuation to my previous post related to Cloud Native way of application development, we talked about what is mean by cloud native and designing microservices based on 12-factor principles.

In this post will see one of pain point of distributed data, as we have seen design pattern database per microservice. It will be challenging to address the ACID nature if transactions and so integrity of data.

We were looking into e-commerce web app where we do have Pricing service, Product service and Order service and these services communicate with each other to fetch the data e.g. data related to Price or Product information required while creating a order. This needs to ensure that transactions should complete as single unit or in case of fault/exceptions should rollback the transactions.

but why One Database Per Service ?

  • easy to scale and maintain the domain data
  • schema can be changed without impacting the other services (entire application as compared to monolithic)
  • database or service failure will not impact other services (will see how)
  • one application can use multiple database or store solutions (like Relational database, NoSQL etc.) base on workload, data volume, read/write operations

Lets have a look on below techniques/common patterns to handle distributed data challenges:

  1. Direct Http Call
  2. Materialized Views
  3. Saga Pattern
  4. Command Query Responsibility Segregation – CQRS
  1. Direct Http Call: one microservice can call another one using direct HTTP call but its said that synchronous HTTP calls couple microservices together.
  2. One way to overcome this nature is to use Materialized View. So that order service can you local replica of product catalogue plus pricing details. If so it also reduces dependency on Product catalogue and Pricinig services. It also reduce the network calls between services. having local copy of this data makes Ordering service more resilient if either of the services ( product or Pricing) not available.

This requires materialized view needs to be sync with all latest product and pricing data. Data needs to be replicated across services as each service has its own database. Remember replication of data is not anti-pattern in microservices architectural style. Every time this read models need to be updated if record is changed. This leads to eventual-consistency of data and synchronization of data can be achieved through publish/subscribe model.

For large volume of data this become inconvenient task to perform or this eventual consistency approach does not fit for some business transactions e.g. account balance credit or debit etc. e.g. if customer purchased items then her credit card limit or account balance need to be updated immediately and latest balance should be available for subsequent transactions/checks.

3. Saga Pattern: I believe this is most useful pattern where business transactions propagated/spans through multiple services, i.e. distributed transactions where local ACID transactions can’t be applied we should consider Saga pattern.

In Saga Pattern, a developer is responsible for handling data consistency programmatically. It’s implemented by grouping local transactions together programmatically and sequentially invoking each one.  If any of the local transactions fails, the transaction been aborted by calling rollback/undo set of transactions to restore data consistency.

There are two ways of coordination sagas:

  1. Choreography – each local transaction publishes domain events that trigger local transactions in other services e.g Amazon SQS, Azure service Bus
  2. Orchestration – an orchestrator (object) tells the participants what local transactions to execute.

4. Command Query Responsibility Segregation (CQRS): This pattern used and helps to increase system performance across distributed transactions having large volume and reporting needs.

In this pattern, normally read (Query) operation queried against the read-replicas which are nothing but highly denormalized database for read operations. Its eliminates joins and nested queries and so table locks and improves performance.

The write operation, known as a command, would update against a fully normalized representation of the data that would guarantee consistency. Here also need to publish event to sync read tables whenever write operation modified data.

Most of time write operations are clubbed together as events and stored ( in event stored) and then at final submission the data is written to persistent store and events get published to updates other data stores. This is knows as Event Sourcing.

Next will look into some more patterns related to microservices communication.

References:

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s