It’s just data

Distributed State Machines

State machines are a fundamental concept in computer science.  The concept is simple: sequentially feed a series of inputs, in a particular order, to a black box in order to obtain a desired result. 

One way to view this is by focusing on the series of messages.  Consider a typical ftp session: open, user, passwd, chdir, get, close.  In this world view, “state” is the hidden, encapsulated, abstraction of everything that is inside that black box.  It is the input and output messages that are important.
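To make the message-centric view concrete, here is a minimal sketch of a session as a black box that accepts messages only in a valid order. The state names and the transition table are my own simplifications for illustration; they are not the real FTP protocol.

```python
# Hypothetical, simplified model of an FTP-like session as a state machine.
# State names and transitions are illustrative assumptions, not real FTP.
TRANSITIONS = {
    ("disconnected", "open"): "connected",
    ("connected", "user"): "awaiting_password",
    ("awaiting_password", "passwd"): "logged_in",
    ("logged_in", "chdir"): "logged_in",
    ("logged_in", "get"): "logged_in",
    ("connected", "close"): "disconnected",
    ("awaiting_password", "close"): "disconnected",
    ("logged_in", "close"): "disconnected",
}


class Session:
    """The black box: callers see messages and replies, never the state."""

    def __init__(self):
        self._state = "disconnected"  # hidden, encapsulated state

    def send(self, message):
        key = (self._state, message)
        if key not in TRANSITIONS:
            raise ValueError(f"{message!r} is not valid in state {self._state!r}")
        self._state = TRANSITIONS[key]
        return "ok"


def run(messages):
    """Feed a series of messages, in order, to a fresh session."""
    session = Session()
    return [session.send(m) for m in messages]
```

Feeding it open, user, passwd, chdir, get, close in that order succeeds at every step, while sending get to a fresh session raises an error, because the order of the messages matters.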

If you look at mechanisms like JAX-RPC, messages too get abstracted away into operations/method calls.  While certainly a convenience, this convenience sometimes comes at a price.

From an architect’s point of view, a system is composed of a number of boxes.  Sometimes multiple logical boxes will be put on a single physical box.  Sometimes a single logical box may span multiple physical boxes.  But what is comparatively rare is for a logical box to be composed of fractal portions of several different physical boxes.

In other words, the network is generally not an irrelevant detail to be abstracted away.  See also The Eight Fallacies of Distributed Computing.

Making State Explicit

There is another way to rearrange these pieces that makes the overall system more robust and less brittle.  Instead of abstracting away the state, expose it.

From Roy Fielding’s doctoral dissertation, hypermedia is the engine of application state.  Each page on the web represents a single state of a single resource.  You can obtain that state in a uniform manner (via GET).  The data you receive contains hypertext “links” to other resources.  By walking the links you traverse the state machine.
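A rough sketch of what walking the links looks like, using an in-memory dictionary as a stand-in for the web. The URIs, the link names, and the `get` and `walk` helpers are all made up for illustration; a real client would issue HTTP GET requests and parse hypertext.

```python
# Hypothetical in-memory "web": each URI maps to a representation of one state.
# The URIs and link names below are invented for this example.
WEB = {
    "/": {"title": "Home", "links": {"books": "/books"}},
    "/books": {"title": "Books", "links": {"first": "/books/1", "home": "/"}},
    "/books/1": {"title": "A Book", "links": {"up": "/books"}},
}


def get(uri):
    """Stand-in for an HTTP GET: return the representation found at this URI."""
    return WEB[uri]


def walk(start, rels):
    """Follow the named links in order, returning every URI visited."""
    uri = start
    visited = [uri]
    for rel in rels:
        # The next state is chosen from the links in the current representation.
        uri = get(uri)["links"][rel]
        visited.append(uri)
    return visited
```

Here `walk("/", ["books", "first"])` visits "/", "/books", and "/books/1". The client never needs a map of the whole machine up front; each representation tells it where it can go next.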

Instead of concepts like a “current working directory” being a shared concept between client and server, with all the attendant lifecycle, synchronization, and error recovery concerns that managing such a state implies, state becomes explicit and part of every request.
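The difference can be sketched by putting the two designs side by side. Everything here (the file table, `StatefulServer`, `stateless_get`) is a hypothetical toy, not a real server API.

```python
# Hypothetical file store shared by both designs.
FILES = {"/pub/readme.txt": "hello", "/pub/docs/guide.txt": "guide"}


class StatefulServer:
    """Remembers a working directory per client between requests."""

    def __init__(self):
        self.cwd = {}  # client id -> current directory (shared, hidden state)

    def chdir(self, client, directory):
        self.cwd[client] = directory

    def get(self, client, name):
        # Raises KeyError if the server has lost this client's state.
        return FILES[self.cwd[client] + "/" + name]


def stateless_get(path):
    """Every request carries the full path; nothing is remembered between calls."""
    return FILES[path]
```

If the stateful server forgets its `cwd` table (a restart, a failover to another machine), the client's next `get` fails until it repeats the `chdir`. The stateless version has nothing to lose, which is why any server can answer any request.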

Characteristics you often find in such applications are shared-nothing architectures where there are no long-running server processes beyond whatever database you are using; and a preference for fewer, relatively coarse-grained messages as opposed to designs that require clients to pelt the server with a comparatively larger number of smaller requests.

Authentication still needs to be orthogonal, and the HTTP solutions are more motivated by pragmatism than theory.  “https” is a second URI scheme that is overlaid over the same “http” namespace - something that a number of theorists are uncomfortable with.  Basic and digest security are based on challenge/response, and also surface through the abstraction that is an HTTP GET.

To be clear, Roy’s thesis does not put this forward as a panacea or a silver bullet.  But for an amazing range of applications to which this technique applies, the result is servers that are much more scalable due to the fact they no longer need to manage state.  As an added bonus, clients can safely store away Uniform Resource Identifiers for later use, and even share them with friends via IM or by placing them on business cards and billboards.

Applications

Given the focus on GET, there is one point that often confuses people.  Roy Fielding stated it this way: “There’s no basis for ‘everything must use GET’ in Web architecture. There is for ‘use an URI for everything that’s important’.”  In particular, one should not use GET for operations such as TrackBack or CartModify.  The resources being operated on in those cases are a weblog entry and a cart respectively.  The URI should identify the resource.  The HTTP method in both such cases should be POST, not GET.
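As a sketch of that division of labor: the URI names the cart, and the method says what to do with it. The cart URI, the `handle` function, and the status codes chosen here are illustrative assumptions, not any particular framework's API.

```python
# Hypothetical cart store, keyed by the URI that identifies each cart.
CARTS = {"/carts/42": ["book"]}


def handle(method, uri, body=None):
    """Toy dispatcher: the URI picks the resource, the method picks the action."""
    if uri not in CARTS:
        return 404, None
    if method == "GET":
        # Safe: reading the cart any number of times changes nothing.
        return 200, list(CARTS[uri])
    if method == "POST":
        # Unsafe: modifies the cart, so it must not be reachable via GET.
        CARTS[uri].append(body)
        return 200, list(CARTS[uri])
    return 405, None
```

Note that there is no "/CartModify" URI anywhere. The same URI, "/carts/42", is used both to look at the cart and to change it; only the method differs.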

The core set of verbs defined by HTTP are GET, POST, PUT, and DELETE.  These cover the needs of a wide range of distributed applications.  If you find you need more, HTTP is extensible, but it would make sense to look first to WebDAV as they may already have standardized (or be standardizing) the method that you need.  Subversion and Microsoft Office are examples of applications that use WebDAV today.


[link]...

Excerpt from del.icio.us/donpdonp at

Hey Sam,

You may find SSDL interesting ([link]). It’s a description language for services that allows one to describe a protocol state machine so that the sequence of expected messages in a protocol can be expressed as part of the service’s metadata.

Regards,
.savas.

Posted by Savas Parastatidis at

Savas: what I am really looking for is something that can’t be described by WSDL 2.0, i.e., distributed hypermedia.

Posted by Sam Ruby at

Hey Sam,

I guess I fail to understand what you are after. Distributed hypermedia is based on the idea that all possible state transitions are encoded in the state that currently exists at the processor.

.savas.

Posted by Savas Parastatidis at

Savas: perhaps an analogy would help.

Have you ever played an Adventure game?  One where you walk around a world, and at each location you pick from one of n possible alternatives?

At each place you stop, there is a document which describes your choices.  Associated with each document is a schema which helps you decipher the document.  Along the way, you may find items that you can pick up.

Some of those schemas don’t simply contain pointers, but permit form input consisting of primitive data types which you are expected to combine with the pointer to find the next location.

Sound like an unusual game?

Now, go to amazon.com.  Along the left of the page you will find a bunch of links.  Pick one.  There you will find another page with links, and perhaps a form.  Continue to either click on links or buttons that contain words like “Continue”.

Along the way you may find that you have picked up items.

Wander around this application long enough, and you may end up at the “check out”.

Question: Can you describe this application using WSDL?  Or SSDL?

Posted by Sam Ruby at

Hey Sam,

There is a difference between the resource-oriented Web and the service-oriented applications that SSDL is meant for. Having said that, SSDL was created to describe exactly the kind of scenarios you describe.

With SSDL (but not WSDL) you can describe a protocol in terms of sequences or choices of messages (or any other operations that your formal model requires) supported by a service. A service may support multiple such protocols at the same time.

So, with SSDL you can describe the entire interaction (all the messages sent/received) that are necessary to go from the initial message to amazon.com all the way to the checkout.

There is a small difference between the above and what you want to achieve (there is a lot of discussion on the differences between service-orientation and resource-orientation but I won’t get into the details). In the Web world, state is assumed throughout, along with direct references to state. Integration happens through the explicit sharing of state (or better, state representations). In the service-oriented world I am talking about, integration takes place through protocol (or 'messaging behaviour') sharing. Resource state doesn’t play a key role. It’s all about the behaviour that is placed on the wire. (Of course this is a big discussion and not everyone agrees... we are writing a paper on exactly this topic).

Regards,
.savas.

PS: Please note that I am travelling today to Korea so I may not be able to see a possible follow on. You can email me if you want.

Posted by Savas Parastatidis at

Savas - this is an interesting discussion, and one that need not be rushed.

My tagline ("It’s just data") shows my biases.  I have a tendency to believe that focusing on representation of resources is a more effective strategy than focusing on messaging behavior.

One can certainly map out all of the messaging behavior of a small and unchanging application.  I need to be convinced that such an approach works for Internet scale applications that are constantly changing.

Posted by Sam Ruby at

“In particular, one should not use GET for operations such as TrackBack or CartModify.”

Or ‘install’, ‘reload’, ‘start’, ‘stop’, 'remove': [link]

Posted by Bill de hÓra at

Recent bookmarks

These are my recent bookmarks. XML/SWF Charts: an SWF that displays charts from an XML source. Cool! xml-dev - Re: [xml-dev] RELAX NG Marketing (was RE: [xml-dev] Do NamesMatter?): the name is not an important issue; pronouncing it either “relaxing” or “relax N G” is fine, by jjc. Sam Ruby: Distributed State Machines: Sam Ruby on state machines and...

Excerpt from 傭兵日記 at

Random thought: Savas’s approach seems geared towards designating the optimal path to the “checkout” node (send bad input and you will pass through more nodes). If each resource included information on its proximity to the checkout or goal node in its representation, could you describe a service-oriented application by designating a starting node, goal nodes, and a heuristic? Something like the Atom Protocol has no goal as such, so it doesn’t include this information.

Posted by Robert Sayre at

A REST Intervention

Responding to James Snell’s article, Resource-oriented vs. activity-oriented Web services, a dissent in 4 parts and a contribution to the ongoing (never-ending?) debate on REST, web services and SOAP A Snap Judgment Sometimes you read or hear...

Excerpt from Koranteng's Toli at

I guess I fail to understand what you are after. Distributed hypermedia is based on the idea that all possible state transitions are encoded in the state that currently exists at the processor.
.savas.

While certainly all possible states must be generated from the processor, the current state is not encoded in the processor at all. Instead, the current state is “out there” in the returned HTTP response (and resides at the client). The processor does not maintain state at all.

Posted by anonymous at

Hi, I do a fair amount of work with Finite State Machines and web services in terms of designing business systems - take a look here for my thoughts:

[link]

Hope it’s of interest, I obviously have a whole bunch more details...

- David

Posted by David Ing at

Moment of clarity - REST, the Browser model and vs AJAX

Hopefully not the first I will post here ;) And I intend to flesh this one out, but thought I should get it down first.. I’m big on the concept on unique URI’s - every resource having a unique identifier and being addressable. It’s the basis of any...

Excerpt from thinkingBlogged at

Web Architecture Roundup

Some notes on recent activity by the web architecture regulars......

Excerpt from lesscode.org at

links for 2005-08-10

Sam Ruby: Distributed State Machines (tags: web http programming webservices) CocoaMySQL “The Open Source MySQL Database Manager for Mac OS X” (tags: mysql osx utilities tools database dev programming cocoa client)......

Excerpt from Full Speed at

IDLs for REST?

First of all, a warning for anyone who uses that piece of junk Blogger: the “auto-save” does not work! As the reader has probably already guessed, I discovered this problem in a rather unpleasant way, losing the result of a few hours of typing. Let’s...

Excerpt from Rafael divagando at

IDLs for REST?

[Edit: Aristotle Pagaltzis covered this same topic in this post.] First of all, a warning for anyone who uses that piece of junk Blogger: the “auto-save” does not work! As the reader has probably already guessed, I discovered this problem in a rather...

Excerpt from Rafael rambling at

Sam Ruby: Distributed State Machines

[link]...

Excerpt from Delicious/bokmann/rest at

Why understanding REST is hard and what we should do about it – systematization, models and terminology for REST

This is going to be another long post, so I’m using the introduction as an overview again. Introduction This post is about understanding REST, the software architectural style behind the World Wide Web. My Ph.D. research, which I’ll...

Excerpt from Ivan Zuzak - blog at

Why understanding REST is hard and what we should do about it - systematization, models and terminology for REST

This is going to be another long post, so I’m using the introduction as an overview again. Introduction This post is about understanding REST , the software architectural style behind the World Wide Web . My Ph.D. research, which I’ll write about...

Excerpt from Ivan Zuzak at
