Service Fabric: Stateless Services

Prolog

I admit that initially everything about stateful services was counterintuitive. Storing data with the compute node contradicts what we have been hardwired to do for the last couple of decades, especially on Microsoft-powered platforms. We are trained to isolate persisted data from compute in highly available storage, either at the block level with SAN-backed clusters such as SQL Server, or at the byte level with services such as Azure Storage, be it tables or blobs. As far as I can tell, Service Fabric is currently the only platform that works this way (data is stored with the compute processes).

Stateless services, however, are what we are used to. A stateless service is code we build that is activated one or more times as process instances on one or more server machines. Because there is no persisted state, once the process stops or crashes all of its memory is scrapped and no data is stored anywhere.

Service Fabric Powered Applications Categorized by Lifetime

If you think about Service Fabric powered applications from the perspective of who controls their lifetime, you end up with the following:

  1. Applications whose lifetime you, as the developer, control. They start once deployed and keep running until they are undeployed, or they are moved around the cluster as its state changes (in response to node availability changes, load rebalancing, etc.). Those are Service Fabric Services, either Stateful or Stateless.
  2. Applications whose lifetime the caller/client controls. They are started (the correct term is: Activated) when the first call arrives and terminated when no more calls are coming (after a grace period). Those are Service Fabric Actors, either Stateful or Stateless.

We have talked about Stateful Services (here) with a focus on partitions. Today we will talk about Stateless Services, and later we will cover Actors.

Creating Stateless Services

To create a stateless service you inherit from the StatelessService base class, which gives you:

  1. A means to create a communication listener (by overriding the CreateCommunicationListener method) that allows you to listen to incoming requests to your service. It works exactly as it does for a stateful service, as described here.

            protected override ICommunicationListener CreateCommunicationListener()

  2. A single call issued to the following method:

            protected override async Task RunAsync(CancellationToken cancellationToken)

You are expected to keep processing in a loop until the token is signaled. Again, as with a stateful service, returning from this method will not terminate the service instance.
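The loop described above might look like the following minimal sketch. It deliberately does not reference the Service Fabric assemblies, so it runs as a plain console program. `WorkerLoop`, `DoWorkAsync`, the delay values and the `int` return value are all hypothetical, added for illustration only; the real override returns a plain `Task`, and the token is signaled by the runtime rather than by a timer.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class WorkerLoop
{
    // Hypothetical unit of work; a real service might poll a queue or run a periodic job.
    private static Task DoWorkAsync(int iteration)
    {
        Console.WriteLine($"iteration {iteration}");
        return Task.CompletedTask;
    }

    // Same shape as the RunAsync override: keep looping until the token is signaled.
    // Returns the number of iterations only so the demo has something to print.
    public static async Task<int> RunAsync(CancellationToken cancellationToken)
    {
        int iterations = 0;
        while (!cancellationToken.IsCancellationRequested)
        {
            await DoWorkAsync(iterations++);
            try
            {
                // Passing the token lets the wait end early when cancellation is requested.
                await Task.Delay(TimeSpan.FromMilliseconds(100), cancellationToken);
            }
            catch (OperationCanceledException)
            {
                break; // cancellation arrived during the delay: leave the loop
            }
        }
        return iterations;
    }

    public static async Task Main()
    {
        // Stand-in for the runtime signaling the token: cancel after 350 ms.
        using (var cts = new CancellationTokenSource(TimeSpan.FromMilliseconds(350)))
        {
            int count = await RunAsync(cts.Token);
            Console.WriteLine($"stopped after {count} iterations");
        }
    }
}
```

Checking the token at the top of the loop and also passing it to the delay means the loop reacts to cancellation both between and during waits.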

Stateless Services Use Cases

Stateless Services, in addition to supporting partitions (one primary per partition, as there is no need for secondaries), support “Instances” as well. Instead of describing how partitioning/instances differ for stateless services, I decided to use architecture use cases to describe them, for the following reasons:

  1. The architecture use cases published with the documentation and samples are really good, but they sometimes work, unintentionally, as an anchor: we templatize our architecture based on them and miss all the other great possibilities the platform can offer.
  2. To introduce a few other possibilities as baby steps toward taking the platform beyond boilerplate designs.
  3. It is more fun this way.

The following are a few use cases showing how stateless services can be used (most also apply to stateful services).

The Gateway / Web Site

We have discussed the gateway in the previous post (here); I have borrowed the diagram again below.

GW Architecture Overview

I will not add more to it. Yet it is worth mentioning that the gateway pattern serves well for on-ramping your system onto Service Fabric bit by bit. Your gateway might be talking to services already built on different types of systems. To enable the gateway pattern you will need a set of instances, typically one on each node of your cluster. For this, modify ApplicationManifest.xml as follows:


<!-- default services XML element -->
<Service Name="StatelessSvc01">
  <StatelessService ServiceTypeName="StatelessSvc01Type" InstanceCount="-1">
    <SingletonPartition />
  </StatelessService>
</Service>

With the instance count set to -1, the Fabric runtime will activate one instance on each node of the cluster. Because all your instances listen on the same URL, it is easy to configure a load balancer on top of them that exposes the service as a single URL to your external clients.

You can use a smaller number of instances (fewer than the total number of cluster nodes). In this case your load balancer should proactively update its internal routing table so that it won't route to nodes that don't host an instance.

If you use more instances than nodes, service activation will fail after it has activated as many instances as there are nodes. Instances of a singleton partition are not meant to live next to each other on the same node (this acts as a placement constraint). Hence, once the number of instances reaches the number of nodes, Service Fabric cannot find any other location that meets this constraint, and it fails. More on this in later posts.

The current version of the WordCount sample uses this approach.

Backend Worker

Imagine a typical workflow-based backend process that is characterized as follows:

  1. Highly sequential.
  2. Each step is long running. It either succeeds from beginning to end, is retried from beginning to end, or fails. State only needs to be persisted before and after each step.
  3. Coordination is needed to ensure no step is started before the one preceding it.

We can lay out a system like that on Service Fabric as follows:

Background Worker

 

The process described above is bread and butter for financial institutions. It is used to close fiscal periods, produce statements, update credit reports and so on.

A system like that will consist of a Stateful Service X with a number of partitions. I might decide to partition by operation type (for example: Account Statements and Credit Reports). Each instance of the account statement process requires steps, which I can run by calling out to a highly partitioned stateless service. The system only needs to persist state before and after each step (in the stateful service or in an external store). The stateful service is essentially a bookkeeper for the processes. I can choose to partition the stateless services by step, by operation type and so on. A system like that is typically designed to fan out and max out hardware resources to get the biggest bang for the buck. My partitioning scheme (for the stateless services) will look like this:


<!-- default services element -->
<Service Name="StatelessSvc01">
  <StatelessService ServiceTypeName="StatelessSvc01Type">
    <UniformInt64Partition HighKey="100" LowKey="1" PartitionCount="100" />
  </StatelessService>
</Service>

I have 100 partitions on a uniform key range (we talked about this here). I can partition according to step type and so on, or I can treat them all as equal and use my partition resolution (when calling them from my stateful service) to route calls to them on a round-robin basis.
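As a rough illustration of the round-robin option, the caller only needs to cycle through the partition keys 1 to 100 and hand each key to whatever partition resolution it uses. The sketch below covers just the key selection. `RoundRobinKeys` is a hypothetical helper, not part of the Service Fabric SDK, and the actual call into the resolved partition is left out because it depends on the SDK version and the communication stack in use.

```csharp
using System;
using System.Threading;

// Hypothetical helper: hands out partition keys in round-robin order.
public sealed class RoundRobinKeys
{
    private readonly long lowKey;
    private readonly long range;
    private long counter = -1;

    public RoundRobinKeys(long lowKey, long highKey)
    {
        if (highKey < lowKey)
        {
            throw new ArgumentException("highKey must be >= lowKey");
        }
        this.lowKey = lowKey;
        this.range = highKey - lowKey + 1;
    }

    // Thread safe: concurrent callers each get the next key in the cycle.
    public long Next()
    {
        long n = Interlocked.Increment(ref counter);
        // Normalize so the result stays in range even if the counter wraps around.
        long offset = ((n % range) + range) % range;
        return lowKey + offset;
    }
}

public static class RoundRobinDemo
{
    public static void Main()
    {
        // Matches LowKey="1" HighKey="100" from the manifest fragment above.
        var keys = new RoundRobinKeys(1, 100);
        for (int i = 0; i < 3; i++)
        {
            Console.WriteLine($"route call {i} to partition key {keys.Next()}");
        }
    }
}
```

With 100 partitions over the range 1 to 100 each key falls into its own partition, so cycling the keys spreads calls evenly across all of them.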

I can also use actors instead of the stateless services if I can ensure that each instance will handle one call at a time (more on actors in later posts).

In addition to the throughput gains from a design like that, I can now upgrade my stateless services (which represent parts of the logic) without having to upgrade my bookkeepers (the stateful services) as well.
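The bookkeeping role described in this section can be sketched roughly as follows. This is an assumption-laden illustration, not how the post's services are implemented: `BookKeeper`, `StepState` and the step names are hypothetical, the dictionary is an in-memory stand-in for whatever the stateful service or external store would persist, and each step delegate stands in for a call out to a stateless worker.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public enum StepState { Started, Succeeded, Failed }

// Hypothetical bookkeeper: runs steps strictly in order and records state
// before and after each one.
public sealed class BookKeeper
{
    // In-memory stand-in for persisted state.
    public Dictionary<string, StepState> Ledger { get; } =
        new Dictionary<string, StepState>();

    public async Task<bool> RunAsync(
        IList<KeyValuePair<string, Func<Task>>> steps, int maxAttempts)
    {
        foreach (var step in steps)
        {
            bool done = false;
            for (int attempt = 1; attempt <= maxAttempts && !done; attempt++)
            {
                Ledger[step.Key] = StepState.Started;       // record before the step
                try
                {
                    await step.Value();                      // stand-in for the worker call
                    Ledger[step.Key] = StepState.Succeeded;  // record after the step
                    done = true;
                }
                catch (Exception)
                {
                    Ledger[step.Key] = StepState.Failed;     // next attempt restarts the step
                }
            }
            if (!done)
            {
                return false; // a later step never starts before this one succeeds
            }
        }
        return true;
    }
}

public static class BookKeeperDemo
{
    public static async Task Main()
    {
        int calls = 0;
        var steps = new List<KeyValuePair<string, Func<Task>>>
        {
            new KeyValuePair<string, Func<Task>>("collect", () => Task.CompletedTask),
            new KeyValuePair<string, Func<Task>>("statement", () =>
            {
                // Simulated transient failure on the first attempt only.
                if (++calls == 1) throw new InvalidOperationException("transient failure");
                return Task.CompletedTask;
            }),
        };

        var keeper = new BookKeeper();
        bool ok = await keeper.RunAsync(steps, maxAttempts: 3);
        Console.WriteLine($"completed: {ok}");
    }
}
```

Because the ledger is the only thing the bookkeeper owns, the step delegates (the stateless side) can change independently of it.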

Singleton Monitor

Systems, especially complex ones, often require a singleton of some sort: a single instance of a process that monitors events and reacts accordingly. For example:

  1. A SaaS-based system will have a monitor application that scales the system up and down based on the load.
  2. A log collector application that periodically moves logs from all servers to unified storage for analysis and audit.

Service Fabric will ensure that a singleton service always has one and only one instance on the cluster. To enable this you will need to change your ApplicationManifest.xml as follows:

<!-- Default services element -->
<Service Name="StatelessSvc01">
  <StatelessService ServiceTypeName="StatelessSvc01Type">
    <SingletonPartition />
  </StatelessService>
</Service>

Epilog

This part of the journey was focused on Stateless services.

If you noticed, I have been using the terms “Activation” and “Instance” a lot in this post. I am preparing a post on how Instance/Service/Application/Host are related to each other. Stay tuned.

Till next time!

@khnidk
