Stateful microservices pattern

What are stateful microservices?

Microservices holding state while performing some longer-than-normal execution time type tasks. They have the following characteristics

  1. They have an API to start a new instance and an API to read the current state of a given instance
  2. They orchestrate a bunch of actions that may be part of a single end-to-end transaction. It is not necessary to have these steps as a single transaction
  3. They have tasks which wrap callouts to external APIs, DBs, messaging systems etc.
  4. Their Tasks can define error handling and rollback conditions
  5. They store their current state and details about completed tasks

Screen Shot 2020-03-13 at 7.52.57 pm

Why stateful?

Stateless microservice requests are generally optimised for short-lived request-response type applications.  There are scenarios where long-running one-way request handling is required along with the ability to provide the client with the status of the request and the ability to perform distributed transaction handling and rollback (because XA sucked!)

So you need stateful because

  • there are a group of tasks that need to be done together as a step that is asynchronous with no guaranteed response-time or asynchronous one-way with a response notification due later
  • or there are a group of tasks where each step individually may have a short response time but  aggregated response-time is large
  • or there are a group of tasks which are part of a single distributed transaction if one fails you need to rollback all

Stateful microservice API

Microservices implementing this pattern generating provide two endpoints

  1. An endpoint to initiate: for example, HTTP POST which responds with a status code of “Created” or “Accepted” (depending on what you do with the request) and responds back with a location
  2. An endpoint to query request state: for example, HTTP GET using the process id from the initiate process response. The response is then the current state of the process with information about the past states

Sample use case: User Signup

  1. The process of signing-up or registering a new user requires multiple steps and interaction looks like this [Command]
  2. The client can then check the status of the registration periodically [Query]


POST /registrations HTTP/1.1Content-Type: application/jsonHost:

{ "firstName": "foo","lastName":"bar",email:"" }
HTTP/1.1 201 Created  
Location: /registrations/12345


GET /registrations/12345 HTTP/1.1Content-Type: application/jsonHost:

{ "firstName": "foo","lastName":"bar",email:"" }
HTTP/1.1 200 Ok  

{ "id":"12345", "status":"Pending", "data": { "firstName": "foo","lastName":"bar",email:"" }}

Screen Shot 2020-03-13 at 7.38.41 pm


While the pattern is simple, I have seen the implementation vary with some key anti-patterns. These anti-patterns make the end solution brittle over time leading to issues with stateful microservice implementation and management

  1. Enterprise business process orchestration: Makes it complex, couples various contexts. Keep it simple!
  2. Hand rolling your own orchestration solution: Unlike regular services, operating long-running services requires additional tools for end-to-end observability and handling errors
  3. Implementing via a stateless service platform and bootstrapping a database: The database can become the bottleneck and prevent your stateful services from scaling. Use available services/products as they optimised their datastores to make them highly scalable and consistent
  4. Leaking internal process id: Your end consumer should see some mapped id not the internal id of the stateful microservice. This abstraction is necessary for security (malicious user cannot guess different ids and query them) and dependency management
  5. Picking a state machine product without “rollback”: Given that distributed transaction rollback and error-handling are two big things we are going need to implement this pattern, it is important to pick a product that lets you do this. A lightweight BPM engine is great for this otherwise you may need to hack around to achieve this in other tools
  6. Using stateful process microservices for everything: Just don’t! Use the stateless pattern as they are optimal for the short-lived request/responses use cases. I have, for example, implemented request/response services with a BPEL engine (holds state) and lived to regret it
  7. Orchestrate when Choreography is needed: If the steps do not make sense within a single context, do not require a common transaction boundary/rollback or the steps have no specific ordering with action rules in other microservices then use event-driven choreography


Stateful microservices are a thing! Welcome to my world. They let you orchestrate long-running or a bunch of short-running tasks and provide an abstraction over the process to allow clients to fire-and-forget and then come back to ask for status

Screen Shot 2020-03-13 at 8.37.14 pm

Like everything, it is easy to fall into common traps when implementing this pattern and the best-practice is to look for a common boundary where orchestration makes sense

Screen Shot 2020-03-13 at 8.33.59 pm