Microservices with Layers
by Jeff Vroom
StrataCode offers a new strategy for managing your code more efficiently. It reduces the upfront design load on the architect and improves options for incremental updates to scale systems management.
Microservices has become a hot design focus for architects, particularly in recent years. This is no doubt in reaction to the time we've felt has been wasted in managing monolithic code bases. But it seems that we'll only really save time, if we wisely choose the boundaries between pieces. Due to additional complexity of implementing and managing remote interfaces throughout the software lifecycle, it also seems important that we consider patterns that support "design evolution" - where we can start monolithic and gracefully break out parts of our system at the code level, or RPC level as the system evolves.
Programmers use the term "Microservices" to refer to the pattern of software development which emphasises deploying smaller services which communicate with each other using web service APIs (RPC or messaging based), instead of deploying larger services which use internal APIs (i.e. direct procedure calls which require coupling of the code of both interfaces).
There are lots of advantages to smaller services: smaller teams, separate source control repositories, more agile, independently scalable but some pitfalls when you apply this approach: more error checking, runtime overhead, coordination and availability requirements across services, and coordination across teams when interfaces change.
The architect of a large system balances many factors to make the right decision and is faced with high costs when a change is required.
Are the databases of the services easy to decouple, or would two fields benefit from being updated together with transactional integrity? What redundant data will need to be created and how often will it need to be updated? How will we manage integrity or mediate conflicts that might arise?
How easy will it be to test the API in an automated way so we can reliably support it without running all of the code that uses it? Will the services API need to change in an incompatible way? Are two candidate services destined to upgrade in lock-step more often than not due to feature overlap? Do we really need a remote service boundary between these services, or is a normal code-module boundary we can use with separate source-trees to provide a more efficient application with the same team separation? Will the API boundary be used too often for efficient RPC? Will there be too much data copying? How will failures in one service be handled in the other?
In support of Microservices, consider how much benefit of scale will be achieved by having independent development streams of the two services - i.e. separated in such a manner that they could be operated at arm's length from each other. For some applications, even though there's more work to build them as separate pieces, it's an important design boundary that should be enforced for maximum agility. Like separating a payment service which has different integrations and SLAs.
But if you split tightly coupled logic across services, you will spend more time managing them as separate services than as one. Splitting a database into two pieces without a good reason could just add more overhead for each insert, update and more potential data integrity problems.
These are not easy decisions, leading to a design challenge to pick the right pieces and design the interfaces carefully. It's more difficult when you are starting a new project and can't answer these questions with actual domain model experience. In that case, it would be fastest to just write code so we get it working quickly, knowing we have an easy way to repackage later based on experience. We could start out with services based on different classes and then easily run them as separate services with a seamless transition. We could also split the git repository into pieces based on how that module was evolving or who needed to manage it. We could do it all as a change to an internal project, splitting one published layer into a few layer parts, all without breaking any upstream dependencies or changing API contracts. When you learn to think of code in terms of layers, sharing types across prosses boundaries, and augmenting RPC with 'synchronization', you get these benefits in how you build and evolve systems.
Evolutionary Code Management
I'm the kind of programmer, that when starting on a new project, wants to deliver 'working code' for the core business-domain model as quickly as possible. Let the stakeholders assess and improve it when it's simple and still in the rapid development stage.
I'm also the kind of programmer that never wants to interrupt the incremental flow of business driven changes for an 'architectural rewrite'.
So let me show you how StrataCode has helped me reconcile these two demands in myself.
The prototype starts out in a single git repo, running in a single process, maybe using the file system for save/restore just to get things going. Once we have solidified the domain model somewhat, we'll build out a database schema and storage strategy and start the automated tests. If there's some obvious service that should be decoupled, we'll move that code into a new set of layers that we can optionally configure to run in a separate process.
When StrataCode builds these two processes now, it detects all of the methods which cross process boundaries and injects a "remoteMethod" call, or it may detect new errors for calls to remote methods with objects that are not serializable or the runtime does not support synchronous remote methods. Once we've cleaned those errors up, the code should just work without changes. And we have a quick way to inspect the service API and adjust the code to clean up the protocol. We'll run the unit tests both in local and remote mode, so we can detect when there's a problem caused by the remote api.
Eventually we add more developers and create a separate team for the remote service. Now it's just a matter of moving those layers into the new git repo and including that in the list of repos required for the application. When programmers do a new sync of the application, they are in sync automatically.
Down the road, when major feature changes are required - perhaps an incompatible domain model change is required, you can use layers to isolate the deprecated types, properties, and methods. Again, no contracts change but now you can separately build the old compatible version, and the new incompatible version separately. Clients can opt-in to the new version. Static typing would let you discover any references to the old version quickly. Code processing could even be used to find and alter references to code that refers to incompatible features. Until the migration is complete, you can dispose of the deprecated layer and any tests that refer to it.
More Freedom with Layers
Layers give you this freedom, but to use them effectively requires thinking in a new way. Fortunately, you can start with what you know - use one layer or one layer for each module. You should only use layers when dependencies get more complicated or when things need to be refactored. In the early stages of development, it's fastest to build things in as few pieces as possible - e.g. one file per class, or even everything in one file. As it evolves you split pieces out to avoid any one piece from getting too large. You keep a piece of logic together if it implements a logical unit - things that belong together because they are created or updated at the same time, and/or because the code is likely to be changed at the same time. Properties stay near the methods which manipulate them for example. Parts of the code which have the same dependencies will also be near each other in the file in this approach when different parts of the same type have different structural dependencies (e.g. UI and database).
When one file is getting too large, you can split it into pieces that helps you improve efficiency, readability, reusability etc. You might split off a layer to reduce conflicts between developers in source control, or to evolve a specific part of a class using a different git repo with different security policy (e.g. license keys, etc.)
When a Microservice is Needed
When a remote service boundary in your code is needed, you have to review the APIs and make sure they are easily remoteable. This means the argument types can be serialized. After that's done, you reconfigure the layers so they are split into different processes. Now your application stack of layers is separated into two overlapping runtime layer stacks. Since StrataCode has both stacks at compile time, it can detect the remote method calls - those that cross the process boundary by calling a method that only lives in the other stack (or is marked via an annotation to be run in a specific process). It modifies the code appropriate based on the frameworks used. For Java to Java, this turns into a method call like: DynUtil.invokeRemoteMethod(...). For JS, your remote method calls must be in a data binding expression, so they can be run asynchronously. As part of the synchronization protocol, any arguments referenced in the remote call are lazily serialized before the call itself. If the object was received from the remote side or already sent, it's sent by reference using it's id. For basic uses cases, domain model code can be independent of the remote procedure boundary. For more complex use cases (e.g. flexible paging of list oriented data structures), we can improve the declarative sync model, or you can fall back to using RPC when a custom pattern is actually needed..
For example, many applications have a "User" which stores data about a user in the system. It would be logical to split the user's profile data (i.e. address, name, etc.) from the behavioral history data (e.g. the last time they logged in, the last pages they visited, etc.). Now there's probably some code in our User file which is used by both pieces. Let's say the user's id and login name. We can split this into three layers: user.core, user.profile, and user.history. When we combine all three, we have the same code we started with, but when we run core and profile in one process and core and history in another, we have a simple micro-services deployment.
If the two need to communicate using APIs, we can detect that it's a remote method call (i.e. it's calling a method that's part of the same bundle of layers, but not in the same process) and hide that notion from the program if the client can use synchronous remote calls. If not, they can put the remote method call in a data binding expression, or code it up to process the results in a callback. Either way, we can still run that code using a local method call so we can still provide an incremental way to run both configurations.
To deploy microservices, it's common to have a central dispatcher service which delegates the service call to the appropriate service. Because StrataCode knows the entire process connectivity diagram at build time, framework layers can be easily added to generate the configuration for these dispatchers so it stays in sync with the current configuration.
Let's say that today you have a huge code base that is only deployable as a monolithic service and you'd like to break it up into pieces. While you could start making major surgery to the code, you'll be touching so many files and the timeframe will be large that you'll need to do it on a branch. In this scenario, if you have "many changes" going into the code base you are refactoring, you might never complete your project or suffer a major hiccup during the transition.
With StrataCode's robust, type-safe splitting and merging, you can use the IntelliJ plugin, or just your regular old editor to break apart your code into layers by cutting and pasting fields, methods, etc back and forth. You can compare the merged output with the original each step of the way. You'll know immediately when you have made two layers independent and can collect the set of APIs that they use from each other. All along the way, you are building the original system so there's no need to branch. Before you launch, run some A/B tests between the old and new systems to roll it out slowly and cut over when the new system works better.