Domain Centric Product Teams: Pulling a Reverse Conway Manoeuvre to deliver better Software Products

Modern software engineering is oriented towards building networked, distributed features for a highly connected and web-savvy customer base in varying contexts. Traditional team structures within the enterprise evolved from cliques of technical subject-matter experts (SMEs): engineers who “ate lunch together, wrote software together”

Good product strategy requires thinking about how product features are built by engineering teams, because Conway’s Law drives the outcome: to build, maintain and change great functional products, we need to deliberately fix the team organisation

Conway’s Law and Layered Architecture

Melvin Conway described a law stating that the structure of a software system mirrors the communication structure of the organisation that builds it

  • “Organizations which design systems are constrained to produce designs which are copies of the communication structures of these organizations.” – Melvin Conway [1]
  • “If you have four groups working on a compiler, you’ll get a 4-pass compiler” – Eric S Raymond [2]

We can think of modern teams (circa 2020) as oriented around the customer and end-systems in order to deliver integrated products

Thus, broadly speaking, teams orient around technical layers due to the cohesiveness of the “technical domain” expertise & customer’s (software client) needs.

If you have been part of similar teams in the past, you will be familiar with the “layered architectural style” that gives rise to this team structure

Figure: SOA layers

Technical Teams = Technical Product

While layered teams do a great job of grouping people by technical expertise, business needs for new end-to-end features are contextual in the real world, especially in a large enterprise

While the layered team structure may work initially for one context, we find that adding new features and domain contexts makes it harder to maintain software components across a layer. This slows software delivery, because it becomes increasingly difficult for a single team to know and own every aspect of a technical layer that is used in various contexts

Conway’s Law at work: with layered teams, we build layered software components focussed on technical correctness rather than end-to-end functionality

Techno-functional Teams = Functional Products

Applying the “reverse Conway manoeuvre”, that is, fixing our team organisation to shape our software, we break the technical teams into smaller functional teams. This process naturally builds a more “Domain Centric Product Team”, which can focus on functional needs and stay true to the end-to-end feature

This organisation allows for a closer relationship across technology layers, as we group experts in different technical domains to deliver a common outcome. Done well, it allows a smallish technical team to own a contextual solution

Pitfalls of Functional Teams

Functional teams are not the perfect solution. They are highly agile and can own a specific end-to-end solution through the entire software lifecycle, but they operate in their own bubble

When an organisation has different “technical practices” within its IT, each with its own software engineering standards (one for the Portal team, one for the Integration team, one for the CRM team, etc.), it becomes harder to enforce the standards of a technical practice once a member is “outsourced” to another team

Summary

Modern software engineering is oriented towards building networked, distributed features for a highly connected and web-savvy customer base in varying contexts. Traditional team structures within the enterprise evolved from cliques of technical subject-matter experts (SMEs): engineers who “ate lunch together, wrote software together”

Good product strategy requires thinking about breaking up the technical groups into highly effective functional teams and keeping the “band together” through the lifecycle of the software (the end-to-end product)

Better Digital Products using Domain Oriented APIs: The Shopping Mall Metaphor

APIs are abstractions over technical services. Good APIs mirror strategic thinking in an organisation and lead to a better customer experience by enabling a high degree of connectivity via secure mechanisms

Too much focus goes on writing protocols and semantics in the desire to design good APIs, and too little on business objectives. Not enough questions are asked early on, and the focus is always on system-to-system integration. I believe that thinking about what a business does and aligning services to it leads us to product-centric thinking with reusable services

History

As an ardent student of software design and engineering principles, I have been keen on Domain Driven Design (DDD) and have had the opportunity to apply its principles in the enterprise business context to build reusable and decoupled microservices. I believe the best way to share this experience is through a metaphor, and I use a “Shopping Mall” metaphor, with “Shops”, to represent a large enterprise with multiple lines of business and teams

Like all metaphors – mine breaks beyond a point but it helps reason about domains, bounded contexts, APIs, events and microservices. This post does not provide a dogmatic point-of-view or a “how to guide”; rather it aims to help you identify key considerations when designing solutions for an enterprise and is applicable upfront or during projects

I have been designing APIs and microservices in the Health and Insurance domains across multiple lines of business and varying contexts over the past 5-8 years. Through this period, I have seen architects (especially those without Integration domain knowledge) struggle to deliver strategic, product-centric, business-friendly APIs. The solutions handed to us always dealt with an “enterprise integration” context with little to no consideration for future “digital contexts”, leading to brittle, coupled services and frustration from business teams around the cost of doing integration (I reckon this is why IT transformation is hard)

This realisation led me to question some of our solution architecture practices and to support them through better understanding and application of domain modelling and DDD (especially strategic DDD). Through this practice, I was able to design and deliver platforms for our client which were reusable and yet not coupled


Domain Queries 

In one implementation, my team delivered around 400 APIs and after 2 years the client has been able to make continuous changes & add new features without compromising the overall integrity of the connected systems or their data

Through my journey with DDD in the Enterprise, I discovered some fundamental rules about applying these software design principles in a broader enterprise context. But first we had to step into our customer’s shoes and ask some fundamental questions about their business and the way they function

The objective is to identify key aspects of the API ecosystem you are designing for. Below are some of the questions you need to answer through your domain queries

  • What are your top-level resources leading to a product centric design?
  • When do you decide what they are? Way up front or in a project scrum?
  • What are the interactions between these domain services?
  • How are the quality and integrity of your data impacted by your design choices?
  • How do you measure all of this “Integration entropy” – the complexity introduced by our integration choices between systems?

The Shopping Mall example

Imagine being asked to implement the IT system for a large shopping complex or shopping mall. This complex has a lot of shops which want to use the system for showing product information, selling products, shipping them, etc.

There are functions that are common to all the shops but with nuanced differences in the information they capture. For example, the Coffee Shop performs its “Customer Management” function through its staff, the big clothes retail store needs to run its own rewards points and store each customer’s clothing preferences, and the electronics retailer does its customer management through its own points system

You have to design the core domains for the mall’s IT system to provide services they can use (and reuse) for their shops and do so while being able to change aspects of a shop/business without impacting other businesses
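
As a rough illustration of this idea, the sketch below models one shared customer identity with a separate, shop-specific view per context. It is a minimal sketch, not a prescribed design: the class names and fields such as `usual_order` and `reward_points` are assumptions made up for the metaphor.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Customer:
    """Core identity shared across the mall (hypothetical shared model)."""
    customer_id: str
    name: str


@dataclass
class CoffeeShopCustomer:
    """Coffee shop context: staff only remember a usual order (assumed field)."""
    customer: Customer
    usual_order: str = ""


@dataclass
class ClothingStoreCustomer:
    """Clothing store context: rewards points plus clothing preferences."""
    customer: Customer
    reward_points: int = 0
    preferences: dict[str, str] = field(default_factory=dict)

    def add_points(self, points: int) -> None:
        self.reward_points += points


# The same shopper appears in two contexts, each with its own model.
shopper = Customer(customer_id="c-1", name="Alex")
coffee_view = CoffeeShopCustomer(shopper, usual_order="flat white")
clothing_view = ClothingStoreCustomer(shopper, preferences={"size": "M"})
clothing_view.add_points(50)
```

Changing the clothing store's rewards rules touches only `ClothingStoreCustomer`; the coffee shop's view and the shared `Customer` stay as they are.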

Asking Domain and Context questions

  • What are your top-level “domains”, so that you can build APIs to link the Point-of-Sale (POS), CRM, Shipping and other systems?
  • Where do you draw the line? Is a service shared by all businesses or to businesses of a certain type or not shared at all?
  • Bounded contexts? What contexts do you see as the businesses do their business?
  • APIs or events? How do you share information across the networked systems to achieve an optimal flow of information while providing the best customer experience? Where the networked systems force a trade-off, do you pick consistency or availability?
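
To make the “APIs or events” question concrete, here is a small sketch contrasting the two styles. The in-memory `EventBus`, the `order.placed` topic and the quote function are all hypothetical stand-ins; a real system would use an actual broker and HTTP services.

```python
from collections import defaultdict
from typing import Callable

Handler = Callable[[dict], None]


class EventBus:
    """Tiny in-memory publish/subscribe bus (stand-in for a real broker)."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        for handler in self._subscribers[topic]:
            handler(event)


# API style: the POS asks Shipping a question and waits for the answer.
def get_shipping_quote(postcode: str) -> dict:
    return {"postcode": postcode, "cost": 9.95}


# Event style: the POS announces a fact; Shipping and CRM react on their own.
bus = EventBus()
shipments: list[str] = []
crm_purchases: list[str] = []
bus.subscribe("order.placed", lambda e: shipments.append(e["order_id"]))
bus.subscribe("order.placed", lambda e: crm_purchases.append(e["customer_id"]))

quote = get_shipping_quote("3000")
bus.publish("order.placed", {"order_id": "o-1", "customer_id": "c-1"})
```

With the API call the caller depends on Shipping being available right now; with the event the publisher does not know or care who is listening, at the cost of the subscribers catching up a little later.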

Summary:

Through my journey with DDD in the Enterprise, I discovered some fundamental rules about applying these software design principles in a broader enterprise context. I found it useful to apply the Shopping Mall metaphor to a Business Enterprise when designing system integrations

It is important to understand the core business lines, capabilities (current and target state), business products, business teams and terminologies, and then to analyse any polysemy (the same term carrying different meanings) across domains and within domain contexts. This analysis leads to the domains, contexts and interactions
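
One lightweight way to start the polysemy analysis is to collect a glossary per context and flag the terms whose meaning differs between contexts. The sketch below assumes a hand-written glossary; the contexts, terms and meanings are invented for the shopping mall metaphor.

```python
from collections import defaultdict

# Hypothetical glossary: context -> {term: meaning in that context}.
glossary = {
    "coffee_shop": {
        "customer": "walk-in buyer known to staff",
        "order": "drinks for one visit",
    },
    "clothing_store": {
        "customer": "rewards member with clothing preferences",
        "order": "items to ship",
    },
    "shipping": {
        "order": "items to ship",
        "parcel": "packed consignment",
    },
}


def find_polysemes(glossary: dict[str, dict[str, str]]) -> dict[str, dict[str, str]]:
    """Return terms that carry more than one distinct meaning across contexts."""
    meanings_by_term: dict[str, dict[str, str]] = defaultdict(dict)
    for context, terms in glossary.items():
        for term, meaning in terms.items():
            meanings_by_term[term][context] = meaning
    return {
        term: by_context
        for term, by_context in meanings_by_term.items()
        if len(set(by_context.values())) > 1
    }


polysemes = find_polysemes(glossary)
```

Here `customer` and `order` are flagged, which hints that each context needs its own model of them, while `parcel` has a single meaning and is not flagged.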

We then use this analysis to design our solution with APIs, events and microservices to maximise reuse and reduce crippling coupling

Stateful microservices pattern

What are stateful microservices?

Stateful microservices hold state while performing tasks whose execution time is longer than normal. They have the following characteristics

  1. They have an API to start a new instance and an API to read the current state of a given instance
  2. They orchestrate a bunch of actions that may be part of a single end-to-end transaction. It is not necessary to have these steps as a single transaction
  3. They have tasks which wrap callouts to external APIs, DBs, messaging systems etc.
  4. Their tasks can define error handling and rollback conditions
  5. They store their current state and details about completed tasks
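
The characteristics above can be sketched roughly as follows. This is a simplified illustration under stated assumptions: `StatefulProcess`, `ProcessInstance` and the task names are hypothetical, the state store is an in-memory dictionary, and the tasks run inline, whereas a real service would persist state durably and run tasks in the background.

```python
from dataclasses import dataclass, field
from typing import Callable
import uuid


@dataclass
class ProcessInstance:
    """State kept for one run of a long-running process."""
    process_id: str
    status: str = "Pending"
    completed_tasks: list[str] = field(default_factory=list)


class StatefulProcess:
    def __init__(self, tasks: list[tuple[str, Callable[[dict], None]]]) -> None:
        self._tasks = tasks                            # ordered (name, callable) pairs
        self._store: dict[str, ProcessInstance] = {}   # stand-in for a durable store

    def start(self, payload: dict) -> str:
        """Create a new instance, run its tasks, and return the instance id."""
        instance = ProcessInstance(process_id=str(uuid.uuid4()))
        self._store[instance.process_id] = instance
        instance.status = "Running"
        for name, task in self._tasks:
            task(payload)                              # wraps a call-out (API, DB, queue)
            instance.completed_tasks.append(name)      # record progress after each task
        instance.status = "Completed"
        return instance.process_id

    def state(self, process_id: str) -> ProcessInstance:
        """Read the current state of a given instance."""
        return self._store[process_id]


audit: list[str] = []
process = StatefulProcess([
    ("create_account", lambda p: audit.append("account:" + p["email"])),
    ("send_welcome_email", lambda p: audit.append("email:" + p["email"])),
])
pid = process.start({"email": "foo@bar.com"})
```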

Why stateful?

Stateless microservices are generally optimised for short-lived request-response type applications. There are scenarios where long-running, one-way request handling is required, along with the ability to provide the client with the status of the request and the ability to perform distributed transaction handling and rollback (because XA sucked!)

So you need stateful because

  • there is a group of tasks that need to be done together as a step that is asynchronous with no guaranteed response time, or asynchronous one-way with a response notification due later
  • or there is a group of tasks where each step individually may have a short response time but the aggregated response time is large
  • or there is a group of tasks which are part of a single distributed transaction: if one fails, you need to roll back all of them
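
For the last case, one common approach (an assumption here, not something the pattern mandates) is to pair every step with a compensating action and, on failure, run the compensations for the completed steps in reverse order. The step names and the `run_with_rollback` helper below are hypothetical.

```python
from typing import Callable

# (name, action, compensation)
Step = tuple[str, Callable[[], None], Callable[[], None]]


def run_with_rollback(steps: list[Step]) -> tuple[str, list[str]]:
    """Run steps in order; on failure, undo the completed ones in reverse order."""
    done: list[Step] = []
    log: list[str] = []
    for step in steps:
        name, action, _ = step
        try:
            action()
        except Exception:
            log.append(f"failed:{name}")
            for undone_name, _, compensate in reversed(done):
                compensate()
                log.append(f"undone:{undone_name}")
            return "RolledBack", log
        done.append(step)
        log.append(f"done:{name}")
    return "Completed", log


def fail() -> None:
    raise RuntimeError("payment gateway unavailable")


ledger: list[str] = []
status, log = run_with_rollback([
    ("reserve_stock", lambda: ledger.append("stock"), lambda: ledger.remove("stock")),
    ("create_order", lambda: ledger.append("order"), lambda: ledger.remove("order")),
    ("take_payment", fail, lambda: None),
])
```

Because the third step fails, the first two are undone and the ledger ends up empty again.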

Stateful microservice API

Microservices implementing this pattern generally provide two endpoints

  1. An endpoint to initiate: for example, HTTP POST which responds with a status code of “Created” or “Accepted” (depending on what you do with the request) and returns a Location header pointing to the new process
  2. An endpoint to query request state: for example, HTTP GET using the process id from the initiate process response. The response is then the current state of the process with information about the past states
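
The two endpoints can be sketched as plain handler functions that return a status code, headers and body. This is a framework-free illustration: the function names, the in-memory `_processes` store and the `history` field are assumptions, and the identifier in the URL is simply a random UUID.

```python
import json
import uuid

_processes: dict[str, dict] = {}  # stand-in for the service's state store


def post_registration(body: dict) -> tuple[int, dict, str]:
    """Initiate endpoint: accept the request and point the client at its status."""
    process_id = str(uuid.uuid4())
    _processes[process_id] = {
        "id": process_id,
        "status": "Pending",
        "data": body,
        "history": [],  # past states would be appended here as the process moves on
    }
    # 202 Accepted: the work has been taken on but is not finished yet.
    return 202, {"Location": f"/registrations/{process_id}"}, ""


def get_registration(process_id: str) -> tuple[int, dict, str]:
    """Query endpoint: return the current state of the process."""
    process = _processes.get(process_id)
    if process is None:
        return 404, {}, ""
    return 200, {"Content-Type": "application/json"}, json.dumps(process)


status, headers, _ = post_registration(
    {"firstName": "foo", "lastName": "bar", "email": "foo@bar.com"}
)
process_id = headers["Location"].rsplit("/", 1)[-1]
query_status, _, query_body = get_registration(process_id)
```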

Sample use case: User Signup

  1. The process of signing up or registering a new user requires multiple steps, so the client starts it with a single request [Command]
  2. The client can then check the status of the registration periodically [Query]

Command

POST /registrations HTTP/1.1
Content-Type: application/json
Host: myapi.org

{ "firstName": "foo", "lastName": "bar", "email": "foo@bar.com" }

HTTP/1.1 201 Created
Location: /registrations/12345

Query

GET /registrations/12345 HTTP/1.1
Accept: application/json
Host: myapi.org

HTTP/1.1 200 OK
Content-Type: application/json

{ "id": "12345", "status": "Pending", "data": { "firstName": "foo", "lastName": "bar", "email": "foo@bar.com" } }
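
On the client side, checking the status periodically might look like the sketch below. It is a minimal sketch with assumptions: `fetch_state` stands in for the GET request, the interval and attempt limit are arbitrary, and the terminal status `Completed` is invented for illustration (only `Pending` appears in the example above).

```python
import time
from typing import Callable


def wait_for_registration(
    fetch_state: Callable[[], dict],
    interval_seconds: float = 2.0,
    max_attempts: int = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll the query endpoint until the status leaves "Pending" or attempts run out."""
    state = fetch_state()
    for _ in range(max_attempts - 1):
        if state["status"] != "Pending":
            break
        sleep(interval_seconds)
        state = fetch_state()
    return state


# Simulated responses from GET /registrations/12345 (no network needed).
responses = iter([
    {"id": "12345", "status": "Pending"},
    {"id": "12345", "status": "Pending"},
    {"id": "12345", "status": "Completed"},
])
final_state = wait_for_registration(
    lambda: next(responses),
    sleep=lambda _seconds: None,  # skip real waiting in this demonstration
)
```

Passing `fetch_state` and `sleep` in as parameters keeps the polling loop independent of any particular HTTP client.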

Anti-patterns

While the pattern is simple, I have seen the implementation vary with some key anti-patterns. These anti-patterns make the end solution brittle over time leading to issues with stateful microservice implementation and management

  1. Enterprise business process orchestration: This makes the service complex and couples various contexts. Keep it simple!
  2. Hand rolling your own orchestration solution: Unlike regular services, operating long-running services requires additional tools for end-to-end observability and error handling
  3. Implementing via a stateless service platform and bootstrapping a database: The database can become the bottleneck and prevent your stateful services from scaling. Use available services/products, as they have optimised their datastores to make them highly scalable and consistent
  4. Leaking the internal process id: Your end consumer should see a mapped id, not the internal id of the stateful microservice. This abstraction is necessary for security (a malicious user cannot guess different ids and query them) and for dependency management
  5. Picking a state machine product without “rollback”: Given that distributed transaction rollback and error handling are the two big things we are going to need to implement this pattern, it is important to pick a product that lets you do both. A lightweight BPM engine is great for this; otherwise you may need to hack around to achieve it in other tools
  6. Using stateful process microservices for everything: Just don’t! Use stateless services for short-lived request/response use cases, as they are optimal for them. I have, for example, implemented request/response services with a BPEL engine (which holds state) and lived to regret it
  7. Orchestrating when choreography is needed: If the steps do not make sense within a single context, do not require a common transaction boundary/rollback, or have no specific ordering and are triggered by action rules in other microservices, then use event-driven choreography
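
For the fourth anti-pattern, one possible way to avoid leaking the internal id is to hand out an opaque random token and keep the mapping on the server side. The `PublicIdMap` class and the id `engine-process-42` below are hypothetical; a real service would persist the mapping rather than hold it in memory.

```python
import secrets
from typing import Optional


class PublicIdMap:
    """Maps opaque, hard-to-guess public ids to internal process ids."""

    def __init__(self) -> None:
        self._internal_by_public: dict[str, str] = {}

    def issue(self, internal_id: str) -> str:
        """Create a random public id for an internal process id."""
        public_id = secrets.token_urlsafe(16)  # 16 random bytes, URL-safe text
        self._internal_by_public[public_id] = internal_id
        return public_id

    def resolve(self, public_id: str) -> Optional[str]:
        """Return the internal id, or None if the public id is unknown."""
        return self._internal_by_public.get(public_id)


ids = PublicIdMap()
public_id = ids.issue("engine-process-42")
```

The client only ever sees `public_id` in the Location header, so the orchestration product behind the service can be swapped without changing the ids consumers hold.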

Summary

Stateful microservices are a thing! Welcome to my world. They let you orchestrate long-running or a bunch of short-running tasks and provide an abstraction over the process to allow clients to fire-and-forget and then come back to ask for status

Like everything, it is easy to fall into common traps when implementing this pattern, and the best practice is to look for a common boundary where orchestration makes sense
