Windows Azure

Windows Azure is Microsoft's cloud-based platform-as-a-service offering. It currently offers three groups of elements: Compute, Storage, and App Fabric.


Compute

There are three types of computing you can use: VM, Web, and Worker.

Virtual Machine (VM)

This is just what it sounds like: a dedicated virtual instance running Windows Server 2008 R2. It's comparable to Amazon's EC2. (Source)


Web

This role deploys a web app to a stateless, load-balanced set of front-end servers running IIS 7. Because any request may land on any server, session state has to be persisted outside the web server's memory.


Worker

Think of a worker as a chunk of code that does something, probably a task that can be parallelized; otherwise you wouldn't be trying to deploy it to the cloud. Note, however, that your worker can be interrupted or killed at any point, so you have to handle all the state management yourself. (Arguably, state management and synchronization are a harder part of scalability than spawning new threads or processes.)
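One common way to cope with a worker being killed mid-task is to checkpoint progress outside the process, so a replacement worker resumes where the last one stopped. The following is a minimal Python sketch of that idea, not Azure SDK code: `CheckpointStore`, `run_worker`, and the `fail_at` parameter are hypothetical names, and the dict-backed store stands in for durable storage.

```python
class CheckpointStore:
    """Hypothetical stand-in for durable storage (e.g. a table row or a blob)."""

    def __init__(self):
        self._data = {}

    def load(self, job_id):
        # Index of the next unprocessed item; 0 if the job has never run.
        return self._data.get(job_id, 0)

    def save(self, job_id, next_index):
        self._data[job_id] = next_index


def run_worker(job_id, items, store, results, fail_at=None):
    """Process items in order, saving a checkpoint after each one.

    fail_at simulates the worker being killed just before that index.
    """
    start = store.load(job_id)
    for i in range(start, len(items)):
        if fail_at is not None and i == fail_at:
            raise RuntimeError("worker interrupted")
        # The "work". Results are keyed by index, so redoing an item is harmless.
        results[i] = items[i] * 2
        store.save(job_id, i + 1)


store, results = CheckpointStore(), {}
items = [1, 2, 3, 4]
try:
    run_worker("job-1", items, store, results, fail_at=2)
except RuntimeError:
    pass  # the first worker died after finishing items 0 and 1
run_worker("job-1", items, store, results)  # a replacement resumes at index 2
```

Keying the results by item index is the design choice that matters here: if a worker dies after doing the work but before saving the checkpoint, the item is simply processed again with the same outcome.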


Storage

Three types of storage come with Windows Azure: Table, Blob, and Queue. You can also use SQL Server for traditional relational storage, via SQL Azure.


Table

This is a horizontally partitioned table whose rows and columns look much like those of a relational database table. You can specify columns of the following datatypes: binary (up to 64KB), bool, datetime, double, guid, int32, int64, and UTF-16 string (up to 64KB). You choose one column to be your Partition Key and another to be your Row Key. Together those two columns form the unique identifier (primary key) for any particular record. Rows with different Partition Key values may be stored on different nodes, so a query that spans partitions is slower than one that stays within a single partition; choose the partition key values with care. You can access this data using ADO.NET and LINQ. (Source)
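To make the Partition Key / Row Key idea concrete, here is a toy in-memory model in Python. It is only an illustration under simplifying assumptions, not the Table service API: the `PartitionedTable` class and its method names are hypothetical, and each dict entry stands in for a partition that could live on a different node.

```python
class PartitionedTable:
    """Toy model: one inner dict per partition, keyed by row key."""

    def __init__(self):
        self._partitions = {}  # partition key -> {row key -> entity}

    def insert(self, partition_key, row_key, **properties):
        rows = self._partitions.setdefault(partition_key, {})
        if row_key in rows:
            # (partition key, row key) together act as the primary key.
            raise KeyError("duplicate (PartitionKey, RowKey)")
        rows[row_key] = dict(properties)

    def get(self, partition_key, row_key):
        # Point lookup: touches exactly one partition.
        return self._partitions[partition_key][row_key]

    def query_partition(self, partition_key, predicate):
        # Scans a single partition only.
        rows = self._partitions.get(partition_key, {})
        return [e for e in rows.values() if predicate(e)]

    def query_all(self, predicate):
        # Cross-partition scan: has to visit every partition.
        return [e for rows in self._partitions.values()
                for e in rows.values() if predicate(e)]


table = PartitionedTable()
table.insert("customer-1", "order-001", total=25.0)
table.insert("customer-1", "order-002", total=310.0)
table.insert("customer-2", "order-001", total=99.0)

one_customer = table.query_partition("customer-1", lambda e: e["total"] > 100)
everyone = table.query_all(lambda e: e["total"] > 50)
```

Partitioning orders by customer keeps the common query ("this customer's orders") inside one partition, while a question about all customers has to fan out across every partition.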


Blob

"Blob" is a general term, short for "Binary Large OBject". In this implementation, you create named containers (think of a container as a folder) and named blobs within each container. Permissions are set at the container level (private or public read). Each blob may be up to 50GB. Metadata (string key/value pairs) may be stored with each container and with each blob (8K max total size for each). The API supports sub-blob pieces (blocks), which enables parallel upload and download as well as resuming an interrupted transfer. The API lets you put/get/delete a blob or put/get a list of blocks, and you can check the ETag to see whether a blob has changed since you last retrieved it. (Source)
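The block and ETag mechanics described above can be sketched with a small in-memory model in Python. This is a simplified illustration, not the Blob service API: the `BlockBlob` class is hypothetical, it only tracks uncommitted blocks, and deriving the ETag from a SHA-256 hash of the content is an assumption made purely for the example.

```python
import hashlib


class BlockBlob:
    """Toy model of a blob assembled from separately uploaded blocks."""

    def __init__(self):
        self._staged = {}   # block id -> bytes, uploaded but not yet committed
        self._content = b""
        self.etag = None

    def put_block(self, block_id, data):
        # Blocks may arrive in any order, or in parallel, or after a retry.
        self._staged[block_id] = data

    def put_block_list(self, block_ids):
        # Committing the ordered list of block ids defines the blob's content.
        self._content = b"".join(self._staged[b] for b in block_ids)
        self._staged.clear()
        self.etag = hashlib.sha256(self._content).hexdigest()

    def get(self, if_none_match=None):
        # Return None when the caller's ETag still matches (nothing changed).
        if if_none_match is not None and if_none_match == self.etag:
            return None
        return self._content


blob = BlockBlob()
blob.put_block("b2", b"world")      # uploaded out of order
blob.put_block("b1", b"hello ")
blob.put_block_list(["b1", "b2"])   # the commit fixes the final order
first_etag = blob.etag
unchanged = blob.get(if_none_match=first_etag)

blob.put_block("b3", b"new content")
blob.put_block_list(["b3"])
changed = blob.get(if_none_match=first_etag)
```

Because nothing becomes visible until the block list is committed, an interrupted upload only needs to resend the blocks that are missing.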


Queue

The queue is nominally a FIFO (first-in, first-out) structure. There is no maximum number of messages in the queue. A message may remain in the queue for at most 1 week (messages older than a week are deleted automatically). Metadata may be stored for each queue (8K max string key/value pairs). Each message in the queue is a maximum of 8K and may be stored in any format, but it is always returned as base64-encoded XML. Unfortunately, there are some significant limitations. Message order is NOT guaranteed, despite the FIFO design. The same message may be returned more than once. As usual, you need to code retry logic, because you may get a timeout under extremely high load. Finally, the GET command can optionally mark a message invisible so that a different worker won't grab it, but this behavior is not guaranteed to work either. These limitations are so significant that it's difficult to imagine a scenario where this structure would be preferred over one with a more robust locking mechanism that guarantees one message per worker. (Source)
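The duplicate-delivery caveat is usually handled by making message processing idempotent. The Python sketch below models the behavior with an in-memory queue; it is an illustration under stated assumptions, not the Queue service API. `VisibilityQueue`, `handle`, the 30-second timeout, and the explicit `now` argument (used instead of a real clock so the example is deterministic) are all hypothetical.

```python
import itertools


class VisibilityQueue:
    """Toy queue: a fetched message is hidden for a while, not removed."""

    def __init__(self, visibility_timeout=30):
        self._timeout = visibility_timeout
        self._messages = {}             # message id -> [body, visible_at]
        self._ids = itertools.count(1)

    def put(self, body):
        self._messages[next(self._ids)] = [body, 0]

    def get(self, now):
        for msg_id, entry in self._messages.items():
            if entry[1] <= now:
                entry[1] = now + self._timeout   # hide it from other workers
                return msg_id, entry[0]
        return None

    def delete(self, msg_id):
        # Only an explicit delete removes the message for good.
        self._messages.pop(msg_id, None)


processed = set()


def handle(msg_id, body):
    """Idempotent handler: a redelivered message is recognized and skipped."""
    if msg_id in processed:
        return False
    processed.add(msg_id)
    return True


queue = VisibilityQueue(visibility_timeout=30)
queue.put("resize image 42")

first = queue.get(now=0)     # worker A takes the message, then stalls
hidden = queue.get(now=10)   # still invisible to everyone else
second = queue.get(now=31)   # timeout expired: worker B gets the same message
did_a = handle(*first)       # worker A finally finishes
did_b = handle(*second)      # worker B's copy is detected as a duplicate
queue.delete(second[0])
```

In a real deployment the `processed` set would have to live in shared durable storage, since the two workers would be separate processes.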

App Fabric

The "App Fabric" is Microsoft's overall name for their Service Bus, Access Control, and Caching features. (Source)

Service Bus

Microsoft's service bus is (like most service bus offerings) a messaging platform. It allows you to send a message from one computer to another. It gives you a pull and a pseudo-push from behind a NAT or firewall, and for performance reasons it may attempt to open a direct peer-to-peer connection instead of relaying through the central server. In this way, it's pretty much the same architecture as Skype or BitTorrent, but generalized for you to use. It's mostly for tying together apps or services running on your corporate LAN with apps or services you have deployed to the cloud. (Source)

Access Control

Access control is marketed as including both identity (authentication) and access (authorization). The identity part is pretty clear. They support OpenID and, of course, integration with Active Directory, so you can get an authentication token (such as an SWT, or "Simple Web Token") using those technologies. For authorization (securing resources), the picture is fuzzier. As of April 2011, OAuth2 is supported, meaning that if another site has data your Windows Azure app needs to access, and that site already supports OAuth2 as a mechanism for granting permission to access that data, you can do it. But what about setting up roles or targeting resources within an application? The sorts of fields that are built into the standard user data structure, such as name, email, or zip code, are not terribly useful in and of themselves for authorization. ("Yup, we only allow people from this area of San Francisco in our application screens." Right.) For now, you'll want to roll your own application authorization, tying it to the authentication store via some sort of unique string (like an email address or username).
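As a rough illustration of what rolling your own authorization might look like, the Python sketch below maps an authenticated identity string to roles and roles to permissions. Everything in it is hypothetical: the role names, the permission names, the example addresses, and the `is_authorized` helper. It assumes authentication has already happened elsewhere and produced a trusted email address.

```python
# Hypothetical application-side tables; in practice these would live in a database.
ROLE_PERMISSIONS = {
    "admin": {"read_report", "edit_report", "manage_users"},
    "analyst": {"read_report"},
}

USER_ROLES = {
    "alice@example.com": {"admin"},
    "bob@example.com": {"analyst"},
}


def is_authorized(identity, permission):
    """Return True if any role held by the identity grants the permission.

    identity is the unique string taken from the authentication token
    (here, an email address), normalized before lookup.
    """
    roles = USER_ROLES.get(identity.strip().lower(), set())
    return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in roles)


can_alice_edit = is_authorized("Alice@Example.com", "edit_report")
can_bob_edit = is_authorized("bob@example.com", "edit_report")
can_stranger_read = is_authorized("mallory@example.com", "read_report")
```

Unknown identities and unknown roles both fall through to "not authorized", which is the safer default.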


Caching

Caching is pretty straightforward. You keep data handy (usually in memory) if it is likely to be needed again soon, at the possible price of it being out of date. The two pre-built caching providers are web-oriented. One is for session state (which is less about performance than about consistency across multiple web servers). The other is for web output (like the output caching available with IIS 7). If you use the caching API directly for your own purposes, then aside from issues like authenticating securely to the cache service, the API is what you'd expect: a Set and a Get for a named object. There is a size limit of 8MB per serialized object. (Source)
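The Set/Get shape, the staleness trade-off, and the per-object size limit can be sketched in a few lines of Python. This is an in-memory illustration, not the caching service's API: `SimpleCache`, the 60-second expiry, the use of `pickle` for serialization, and the explicit `now` argument (passed in so the example is deterministic) are assumptions.

```python
import pickle

MAX_OBJECT_BYTES = 8 * 1024 * 1024  # the 8MB per-object limit noted above


class SimpleCache:
    """Toy cache: named objects with an expiry time and a size check."""

    def __init__(self, ttl_seconds=60):
        self._ttl = ttl_seconds
        self._items = {}  # key -> (serialized value, expires_at)

    def set(self, key, value, now):
        blob = pickle.dumps(value)
        if len(blob) > MAX_OBJECT_BYTES:
            raise ValueError("serialized object exceeds the 8MB limit")
        self._items[key] = (blob, now + self._ttl)

    def get(self, key, now):
        entry = self._items.get(key)
        if entry is None or entry[1] <= now:
            self._items.pop(key, None)   # drop expired entries lazily
            return None                  # a miss: the caller reloads from the source
        return pickle.loads(entry[0])


cache = SimpleCache(ttl_seconds=60)
cache.set("user:42", {"name": "Ada"}, now=0)
fresh = cache.get("user:42", now=10)
expired = cache.get("user:42", now=61)
```

The expiry time is the knob that trades freshness for speed: a longer value means fewer trips to the underlying store but a longer window in which cached data may be out of date.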

Note: this data was gleaned from a variety of pages, mostly noted above, in May 2011. This is meant as a practitioner's introduction, not the last word. Let us know if any aspects are out of date or incorrect.