Azure: Cloud Service Models

Since joining Lokad this September, I have finally had the chance to dive into cloud computing. We chose Windows Azure as the platform for our very computation-intensive business, and built a neutral open-source framework on top of it: Lokad.Cloud.

Cloud Services

Lokad.Cloud is described as a .NET object-to-cloud persistence mapper, but it's actually much more. This post concentrates on one aspect only: its notion of Cloud Services as horizontally scalable workers.

In essence, cloud services are managed and executed as follows:

  1. The Lokad.Cloud management infrastructure (for now essentially a web role) allows you to upload one or more assemblies containing a set of cloud services and, optionally, a configuration file.
  2. Every Azure worker role instance loads all these services in an isolated AppDomain.
  3. Each Azure worker then executes these services one at a time according to some scheduling algorithm and execution policy.
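The third step can be illustrated with a minimal Python sketch. It is not Lokad.Cloud code: the `Service` class and `run_worker` function are hypothetical names, and plain round robin is only an assumed stand-in for "some scheduling algorithm and execution policy".

```python
from itertools import cycle
from typing import Callable, Iterable, List


class Service:
    """Hypothetical stand-in for a cloud service: a name plus one unit of work."""

    def __init__(self, name: str, work: Callable[[], None]):
        self.name = name
        self.work = work


def run_worker(services: Iterable[Service], steps: int) -> List[str]:
    """Execute the services one at a time, round robin, for a fixed number of steps."""
    services = list(services)
    executed: List[str] = []
    if not services:
        return executed
    # zip stops as soon as range(steps) is exhausted, so cycle() never runs away.
    for _, service in zip(range(steps), cycle(services)):
        service.work()
        executed.append(service.name)
    return executed


log: List[str] = []
services = [
    Service("A", lambda: log.append("a")),
    Service("B", lambda: log.append("b")),
]
order = run_worker(services, 5)
print(order)
```

A real worker would loop indefinitely and apply a richer policy; the fixed step count only keeps the example finite.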

We provide specialized base classes that simplify implementing two common kinds of services: those that process items from a shared queue, and those that are to be called at regular intervals.
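To give an idea of what such base classes could look like, here is a hedged Python sketch. All class and method names (`CloudService`, `QueueService`, `ScheduledService`, `run_once`, `process`, `tick`) are assumptions made for illustration and do not reflect the actual Lokad.Cloud API; an in-memory `deque` stands in for the shared queue.

```python
from abc import ABC, abstractmethod
from collections import deque


class CloudService(ABC):
    """Hypothetical common base: a scheduler calls run_once repeatedly."""

    @abstractmethod
    def run_once(self, now: float) -> bool:
        """Do at most one unit of work; return True if work was done."""


class QueueService(CloudService):
    """Base for services that process items from a queue, one item per call."""

    def __init__(self, queue: deque):
        self.queue = queue

    def run_once(self, now: float) -> bool:
        if not self.queue:
            return False
        self.process(self.queue.popleft())
        return True

    @abstractmethod
    def process(self, item) -> None:
        """Handle a single queue item."""


class ScheduledService(CloudService):
    """Base for services that run at most once per interval."""

    def __init__(self, interval: float):
        self.interval = interval
        self.next_due = 0.0

    def run_once(self, now: float) -> bool:
        if now < self.next_due:
            return False
        self.next_due = now + self.interval
        self.tick(now)
        return True

    @abstractmethod
    def tick(self, now: float) -> None:
        """Perform the periodic work."""


class Squarer(QueueService):
    def __init__(self, queue: deque):
        super().__init__(queue)
        self.results = []

    def process(self, item) -> None:
        self.results.append(item * item)


class Heartbeat(ScheduledService):
    def __init__(self, interval: float):
        super().__init__(interval)
        self.beats = []

    def tick(self, now: float) -> None:
        self.beats.append(now)


squarer = Squarer(deque([2, 3]))
heartbeat = Heartbeat(interval=10.0)
for now in (0.0, 5.0, 10.0):
    squarer.run_once(now)
    heartbeat.run_once(now)
```

With this shape, a concrete service only implements `process` or `tick`, and the scheduler treats every service uniformly through `run_once`.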

We treat all Azure workers as equal and therefore execute every cloud service on each Azure worker from time to time. In other words, we map all cloud services to all Azure workers, forming a complete bipartite graph between cloud services and Azure workers, as shown in the following figure.

[Figure: Cloud Services To Azure Workers]

This is a fundamental concept. It yields a very simple design with the potential for ideal horizontal scaling, and it is resilient to failing Azure workers: every service keeps running as long as at least one worker remains intact.
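The resilience argument can be checked with a small Python sketch. The service and worker names (`S1`, `W1`, ...) and both helper functions are hypothetical; the mapping is modelled as a plain set of (service, worker) edges.

```python
from itertools import product


def complete_mapping(services, workers):
    """Complete bipartite graph: one edge for every (service, worker) pair."""
    return set(product(services, workers))


def covered_services(mapping, alive_workers):
    """Services that still have at least one living worker to run on."""
    return {service for service, worker in mapping if worker in alive_workers}


services = {"S1", "S2", "S3"}
workers = {"W1", "W2", "W3"}
mapping = complete_mapping(services, workers)

# Two of the three workers fail; every service is still covered by W2.
still_covered = covered_services(mapping, {"W2"})
print(sorted(still_covered))
```

Only when the set of living workers is empty does any service lose coverage, and then all of them do at once.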

Cloud Service Models and Deployments

The only object that is aware of this mapping is the service scheduler. Yet from the management and diagnostics perspective, it would be interesting to represent the cloud services as first-class objects. I'm therefore introducing the notion of Cloud Service Models for Lokad.Cloud (not part of the current release, and it is open whether it ever will be).

In Azure, web and worker roles are explicitly defined and configured in two XML files. Since the latest update of the Azure tools for Microsoft Visual Studio, these are together referred to as the Azure Service Model. Using the Azure management website, one can upload an assembly plus the two XML files to create a unique Azure deployment. A deployment can be stopped or running, in either production or staging mode.

The same concepts can also be applied to Cloud Services, on a slightly higher level of abstraction and orthogonal to the Azure terms.

A Cloud Service Model is a unique entity associated with a set of assemblies, the cloud services defined in them, and their configuration (if applicable). Using the Lokad.Cloud management tools, an administrator can upload such a model and create a unique Cloud Service Deployment. A deployment can be stopped or running, and of course be removed when no longer needed. A failing or malfunctioning deployment can be diagnosed and dealt with directly in the management UI.
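As a rough illustration of this proposal, the following Python sketch models the two entities and the deployment lifecycle. Nothing here exists in Lokad.Cloud: the class names, the `Management` helper, and the sample values `Services.dll` and `ForecastService` are all hypothetical.

```python
import uuid
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Tuple


class State(Enum):
    STOPPED = "stopped"
    RUNNING = "running"


@dataclass(frozen=True)
class CloudServiceModel:
    """An identity plus assemblies, the services defined in them, and optional config."""
    model_id: str
    assemblies: Tuple[str, ...]
    services: Tuple[str, ...]
    configuration: str = ""


@dataclass
class CloudServiceDeployment:
    """A uniquely identified deployment of one model; starts out stopped."""
    deployment_id: str
    model: CloudServiceModel
    state: State = State.STOPPED


class Management:
    """Hypothetical management facade: deploy, start, stop, remove."""

    def __init__(self) -> None:
        self.deployments: Dict[str, CloudServiceDeployment] = {}

    def deploy(self, model: CloudServiceModel) -> CloudServiceDeployment:
        deployment = CloudServiceDeployment(uuid.uuid4().hex, model)
        self.deployments[deployment.deployment_id] = deployment
        return deployment

    def start(self, deployment_id: str) -> None:
        self.deployments[deployment_id].state = State.RUNNING

    def stop(self, deployment_id: str) -> None:
        self.deployments[deployment_id].state = State.STOPPED

    def remove(self, deployment_id: str) -> None:
        del self.deployments[deployment_id]


model = CloudServiceModel("model-1", ("Services.dll",), ("ForecastService",))
management = Management()
deployment = management.deploy(model)
management.start(deployment.deployment_id)
```

Because each deployment carries its own identity, diagnostics in a management UI could refer to it directly rather than to whatever happens to be loaded in the workers.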

Note that the currently implemented option to upload a zip file containing assemblies and optional configuration is already very close to such a model, but it lacks identity and other metadata.

[Figure: Cloud Services Deployments]

In each Azure worker, our scheduler will load the current service model, load the services and schedule them accordingly. From time to time the scheduler will check whether the deployed service model has changed, and update if necessary.
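The periodic check could work roughly as in the following Python sketch. It assumes a version counter as the change signal and a fixed number of steps between checks; `ModelStore`, `WorkerScheduler`, and `check_every` are hypothetical names, and the store is an in-memory stand-in for shared cloud storage.

```python
from typing import Callable, Dict, List, Optional, Tuple


class ModelStore:
    """Hypothetical shared store holding the currently deployed service model."""

    def __init__(self) -> None:
        self.version = 0
        self.services: Dict[str, Callable[[], str]] = {}

    def publish(self, services: Dict[str, Callable[[], str]]) -> None:
        self.services = dict(services)
        self.version += 1


class WorkerScheduler:
    """Runs services round robin and polls the store every check_every steps."""

    def __init__(self, store: ModelStore, check_every: int = 3) -> None:
        self.store = store
        self.check_every = check_every
        self.loaded_version: Optional[int] = None
        self.services: List[Tuple[str, Callable[[], str]]] = []
        self.steps = 0
        self.position = 0

    def _reload_if_changed(self) -> None:
        if self.store.version != self.loaded_version:
            self.services = list(self.store.services.items())
            self.loaded_version = self.store.version
            self.position = 0

    def step(self) -> Optional[str]:
        if self.steps % self.check_every == 0:
            self._reload_if_changed()
        self.steps += 1
        if not self.services:
            return None
        _, work = self.services[self.position % len(self.services)]
        self.position += 1
        return work()


store = ModelStore()
store.publish({"A": lambda: "a1"})
scheduler = WorkerScheduler(store, check_every=2)

results = [scheduler.step()]          # step 0: checks the store, loads version 1
store.publish({"A": lambda: "a2"})    # a new model is deployed
results.append(scheduler.step())      # step 1: no check yet, old model still runs
results.append(scheduler.step())      # step 2: checks again and picks up version 2
print(results)
```

The second result shows the inherent delay of polling: a worker keeps running the old model until its next check.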

Technically, this design would also allow running multiple different deployments in parallel, e.g. by breaking the complete bipartite graph between Cloud Services and Azure workers into a non-complete bipartite one, where each Azure worker is assigned to a single Cloud Service Deployment:

[Figure: Cloud Services Deployments 2]

Or by sharing the Azure workers among Cloud Service Deployments in one way or another (e.g. in parallel, or round robin):

[Figure: Cloud Services Deployments 3]

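Remember, however, that some of these scenarios violate the fundamental concept mentioned above: once a worker no longer runs every service, a single worker failure can take a deployment offline. Hence, as usual, there's a tradeoff between flexibility and robustness.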
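A small Python sketch makes the tradeoff tangible by comparing a dedicated assignment with a shared one when a worker fails. The worker names, deployment labels, and the `surviving_deployments` helper are hypothetical.

```python
def surviving_deployments(assignment, alive_workers):
    """Return the deployments still served by at least one living worker.

    assignment maps each worker name to the set of deployments it runs.
    """
    served = set()
    for worker, deployments in assignment.items():
        if worker in alive_workers:
            served |= deployments
    return served


# Dedicated: each worker serves exactly one deployment (non-complete graph).
dedicated = {"W1": {"A"}, "W2": {"B"}, "W3": {"B"}}
# Shared: every worker serves every deployment (complete graph).
shared = {"W1": {"A", "B"}, "W2": {"A", "B"}, "W3": {"A", "B"}}

alive = {"W2", "W3"}  # W1 has failed

print(sorted(surviving_deployments(dedicated, alive)))
print(sorted(surviving_deployments(shared, alive)))
```

In the dedicated layout, deployment A disappears with its only worker, while the shared layout keeps both deployments available.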

Update

It seems there's a better way to differentiate between cloud service models and deployments:

  • Model: An identity, a set of (named) cloud services, their assemblies and optionally some configuration.

  • Deployment: An identity, a set of models and their mapping to (Azure) worker nodes.

That is, only one deployment can run at a time, but a deployment can be configured with multiple models. There is also a trivial empty deployment in which no models are loaded at all.
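The revised definitions could be expressed as in the following Python sketch. It is only one possible reading: the class names, the `worker_models` mapping, and the sample service and assembly names are assumptions, not part of Lokad.Cloud.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Optional


@dataclass(frozen=True)
class Model:
    """An identity, a set of named services, their assemblies, optional config."""
    model_id: str
    services: FrozenSet[str]
    assemblies: FrozenSet[str]
    configuration: Optional[str] = None


@dataclass
class Deployment:
    """An identity, a set of models, and their mapping to worker nodes."""
    deployment_id: str
    models: Dict[str, Model] = field(default_factory=dict)
    # worker name -> ids of the models that worker should run
    worker_models: Dict[str, FrozenSet[str]] = field(default_factory=dict)

    def services_for(self, worker: str) -> FrozenSet[str]:
        names = set()
        for model_id in self.worker_models.get(worker, frozenset()):
            names |= self.models[model_id].services
        return frozenset(names)


# The trivial empty deployment: no models, so no worker runs anything.
EMPTY = Deployment("empty")


class Cluster:
    """Holds the single active deployment; activating one replaces the previous."""

    def __init__(self) -> None:
        self.active = EMPTY

    def activate(self, deployment: Deployment) -> None:
        self.active = deployment


model_a = Model("A", frozenset({"Forecast"}), frozenset({"A.dll"}))
model_b = Model("B", frozenset({"Report", "Cleanup"}), frozenset({"B.dll"}))
deployment = Deployment(
    "d1",
    models={"A": model_a, "B": model_b},
    worker_models={"W1": frozenset({"A"}), "W2": frozenset({"A", "B"})},
)
cluster = Cluster()
cluster.activate(deployment)
```

Here the earlier "dedicated" and "shared" layouts become two settings of the same `worker_models` mapping inside one deployment.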

Hence, the labels in the figures above should read "Cloud Service Model A" instead of "Cloud Service Deployment A", etc.