Paresh Suthar's Radio Weblog
And that's all I have to say about that - Forrest Gump






 

 

Thursday, March 31, 2005
 

General information for writing Groove applications
Over the past few years, many developers have embraced Groove as a robust platform for building collaboration applications.  Almost all of them love the built-in firewall traversal, always-on security, self-synchronizing data models, and so on, and would like to leverage this "plumbing" when building their purpose-built solutions.  Here are a few points that should be read multiple times, and perhaps out loud, just to make sure that they sink in before you begin writing a Groove application:
  • Groove's data model is replicated to all members within a workspace
  • There is no single authoritative data store for a Groove application
  • Always write code that assumes that the data you are about to read/change/delete no longer exists
  • Never rely on knowing when all members of the workspace have performed a data change operation(s)
  • Ordering of disseminated data changes is guaranteed
  • Lost data changes are automatically retrieved
So let's examine each point in a little more detail.
Groove's data model is replicated to all members within a workspace
Groove stores all data for a workspace in an encrypted Groove database that manifests itself as an .xss file on disk.  When a user is invited to a workspace, they obtain a complete copy of the workspace from the inviter (with the possible exception of documents/folders configured to download on demand).  Each time a data change is made, it is always executed and committed locally first, and then queued and disseminated to the other members, who apply the change to their local copy of the workspace.
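
The "commit locally first, then disseminate" flow can be sketched in a few lines of Python.  This is only an illustration of the idea, not Groove's API: the Replica class, its method names and the in-memory outbox are all made up for the example, and encryption and transport are left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One member's local copy of a workspace (hypothetical model)."""
    name: str
    data: dict = field(default_factory=dict)
    outbox: list = field(default_factory=list)  # deltas waiting to be sent

    def change(self, key, value):
        # Step 1: the change is committed to the local copy first.
        self.data[key] = value
        # Step 2: the delta is queued for dissemination to other members.
        self.outbox.append((key, value))

    def disseminate(self, peers):
        # Hand every queued delta to every peer, then empty the queue.
        for key, value in self.outbox:
            for peer in peers:
                peer.apply_remote(key, value)
        self.outbox.clear()

    def apply_remote(self, key, value):
        # Each member applies incoming deltas to its own local copy.
        self.data[key] = value


alice = Replica("alice")
bob = Replica("bob")
alice.change("title", "Q2 plan")
bob_before = dict(bob.data)   # bob's copy before the delta reaches him
alice.disseminate([bob])
```

Between change() and disseminate(), alice already sees her change while bob does not, which is exactly the window your application code has to tolerate.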
 
There is no single authoritative data store for a Groove application
When you write an application that performs changes in a relational database you typically start a write transaction, perform your changes, and then commit the transaction.  While the write transaction is open, no other write transactions for the same region(s) of data can be started.  Once the transaction is committed, then anyone who reads the database will get the latest changes.
 
In Groove, each member of a workspace can simultaneously start a write transaction for the same region(s) of data, perform changes and commit the transaction (i.e. there are no distributed locks in Groove).  A change to data is often referred to as a "delta".  The idea is that as deltas are assimilated from other members, Groove will determine if the changes are in conflict with other members' changes.  If a conflict is detected, Groove will perform one of the following actions:
 1. If changes are for document types (e.g. Groove Folder Sharing, Groove Files tool), then one member's changes will be determined to be the "latest" version and every other member's changes will spawn a copy of the document (i.e. a conflict version).
 
2a.  If changes are for structured data types (e.g. Groove Forms tool, Groove Discussion tool) and the changes are made to different (heterogeneous) fields within a record, Groove will merge the field changes.
 
2b.  If changes are for structured data types (e.g. Groove Forms tool, Groove Discussion tool) and the changes are made to the same (homogeneous) field within a record, Groove will deterministically pick one member's changes to be the "latest" version.  By the way, the algorithms used to pick the winner are not based on timestamps.
Note that addition-type operations (e.g. create new document, add new record) cannot conflict with each other.
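
To make cases 2a and 2b concrete, here is a minimal Python sketch of a field-level merge.  It is not Groove's algorithm: merge_record and its parameters are hypothetical, and the tie-break used here (the lower member id wins) is just an assumed stand-in for whatever deterministic rule Groove really applies.  The only property it shares with the description above is that it does not look at timestamps.

```python
def merge_record(base, mine, theirs, my_id, their_id):
    """Merge two concurrent edits of one record, field by field.

    base   -- the record as both members last saw it
    mine   -- the record after my edit; theirs -- after the other edit
    my_id / their_id -- member ids, used only to break ties
    """
    merged = dict(base)
    for name in set(mine) | set(theirs):
        old = base.get(name)
        my_val = mine.get(name, old)
        their_val = theirs.get(name, old)
        i_changed = my_val != old
        they_changed = their_val != old
        if i_changed and they_changed:
            # Same field edited on both sides: pick one deterministically.
            # Assumed rule for this sketch: the lower member id wins.
            merged[name] = my_val if my_id < their_id else their_val
        elif i_changed:
            merged[name] = my_val      # different fields: keep both edits
        elif they_changed:
            merged[name] = their_val
    return merged


base = {"status": "open", "owner": "pat"}

# Case 2a: the two members changed different fields, so both edits survive.
on_a = merge_record(base, {"status": "closed", "owner": "pat"},
                    {"status": "open", "owner": "sam"}, "A", "B")

# Case 2b: both changed "status"; each side must pick the same winner.
edit_a = {"status": "closed", "owner": "pat"}
edit_b = {"status": "deferred", "owner": "pat"}
seen_by_a = merge_record(base, edit_a, edit_b, "A", "B")
seen_by_b = merge_record(base, edit_b, edit_a, "B", "A")
```

The point of a deterministic rule is that every member, merging independently and with no one in charge, ends up with the same record.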
 
Always write code that assumes that the data you are about to read/change/delete no longer exists
Since Groove is almost always processing deltas from other members on a background thread, it is possible that the data you are about to access no longer exists.  This is why the first few lines of code after starting a transaction should validate that the data still exists before continuing.
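
A rough sketch of that habit in Python follows.  The Workspace class, its methods and the lock standing in for a local transaction are all assumptions made for the example; they are not Groove interfaces.

```python
import threading


class Workspace:
    """Hypothetical local workspace; the lock stands in for a transaction."""

    def __init__(self):
        self._lock = threading.Lock()
        self.records = {}

    def update_record(self, record_id, changes):
        with self._lock:                      # "start transaction"
            record = self.records.get(record_id)
            if record is None:
                # A delta from another member deleted the record while
                # we were not looking, so bail out instead of crashing.
                return False
            record.update(changes)
            return True

    def apply_remote_delete(self, record_id):
        # What a background thread might do while processing a delta.
        with self._lock:
            self.records.pop(record_id, None)


ws = Workspace()
ws.records["r1"] = {"title": "draft"}
first = ws.update_record("r1", {"title": "final"})
ws.apply_remote_delete("r1")
second = ws.update_record("r1", {"title": "final v2"})
```

Here the first update succeeds and the second one quietly reports failure, because the record vanished in between.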
 
Never rely on knowing when all members of the workspace have performed a data change operation(s)
When a delta is disseminated to other members, there is (currently) no foolproof way of detecting if they have processed those changes or not.  Deltas are sent to the other members in one of the following ways:
1a.  If source and target members are on local LAN, same subnet and both online, direct communication will occur
 
1b.  If source and target members are on local LAN, same subnet and target member is offline, changes will be queued via target member's Groove Relay Server until target member comes back online
 
1c.  If source and target members are on local LAN, same subnet, target member is offline and target member's Groove Relay Server is inaccessible, changes will be queued on source member's device until target member comes back online or target member's Groove Relay Server becomes accessible
 
2a.  If source and target members are on local LAN, different subnets and both online, indirect communication will occur via Groove Relay Server
 
2b.  If source and target members are on local LAN, different subnets and target member is offline, changes will be queued via target member's Groove Relay Server until target member comes back online
 
2c.  If source and target members are on local LAN, different subnets, target member is offline and target member's Groove Relay Server is inaccessible, changes will be queued on source member's device until target member comes back online or target member's Groove Relay Server becomes accessible
 
3a.  If source and target members are on disparate LANs and both online, indirect communication will occur via Groove Relay Server
 
3b.  If source and target members are on disparate LANs and target member is offline, changes will be queued via target member's Groove Relay Server until target member comes back online
3c.  If source and target members are on disparate LANs, target member is offline and target member's Groove Relay Server is inaccessible, changes will be queued on source member's device until target member comes back online or target member's Groove Relay Server becomes accessible
So it should be clear that you can't rely on knowing if everyone has received your changes, and you should note that the converse holds true as well - you can't rely on knowing that you have received everyone else's changes.
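
The nine cases above boil down to three questions, which can be written as a small decision function.  This is a sketch of the list, not of Groove's transport code: the Route names and the choose_route signature are invented here.  One branch (target online on a different subnet or LAN, but the relay unreachable) is not covered by the list, so queuing locally in that case is purely an assumption.

```python
from enum import Enum


class Route(Enum):
    DIRECT = "direct"
    VIA_RELAY = "via relay server"
    QUEUED_ON_RELAY = "queued on target's relay server"
    QUEUED_LOCALLY = "queued on source member's device"


def choose_route(same_subnet, target_online, relay_reachable):
    """Pick a delivery route; members on disparate LANs are never same_subnet."""
    if target_online:
        if same_subnet:
            return Route.DIRECT            # case 1a
        if relay_reachable:
            return Route.VIA_RELAY         # cases 2a, 3a
        # Not described in the list above; assumed for this sketch.
        return Route.QUEUED_LOCALLY
    if relay_reachable:
        return Route.QUEUED_ON_RELAY       # cases 1b, 2b, 3b
    return Route.QUEUED_LOCALLY            # cases 1c, 2c, 3c


case_1a = choose_route(same_subnet=True, target_online=True, relay_reachable=True)
case_2b = choose_route(same_subnet=False, target_online=False, relay_reachable=True)
case_3c = choose_route(same_subnet=False, target_online=False, relay_reachable=False)
```

Notice that none of the routes returns anything to the sender about whether the target has processed the delta.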
 
For those of you thinking about sending some type of signal to indicate successful processing of changes (e.g. data changes) - it's not recommended, because you quickly wind up with a star formation of network traffic for each member's change, and that's just not a good idea.
 
Ordering of disseminated data changes is guaranteed
Because connectivity between members is always a variable, there needs to be a way to ensure that data changes are applied in the same sequence as they were created.  Imagine the following steps occurring sequentially in time: 
1.  Member A adds a new discussion entry in a Groove Discussion tool - we'll call this change 'A'.  At the time Member A's changes are disseminated to member B, member B is not online, so change 'A' is sent to member B's Relay Server.
 
2.  Member A adds a response entry to the discussion entry from step 1 - we'll call this change 'B'.  At the time Member A's changes are disseminated to member B, member B is online and on the same local LAN and subnet, so change 'B' is sent directly to member B.
 
3.  Member B connects to his/her relay server and receives change 'A'.
In this scenario, change 'B' is received before change 'A' is - doh!  Luckily Groove draws a distinction between receiving changes and applying them.  So while the changes have been received in a different order than they were generated, Groove guarantees that they will be applied in the same order as they were created.  Groove understands that change 'B' is dependent upon change 'A' having first been executed, and orders their execution accordingly.  It also means that if change 'A' is never received, then change 'B' will never be applied.
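
The difference between receiving and applying can be sketched like this.  It is a toy model only: DeltaLog, receive and the single depends_on link are made-up names, and nothing here describes how Groove actually tracks dependencies.

```python
class DeltaLog:
    """Hypothetical inbox that applies a delta only after its dependency."""

    def __init__(self):
        self.applied = []   # delta ids, in the order they were applied
        self.pending = {}   # delta id -> id of the delta it depends on

    def receive(self, delta_id, depends_on=None):
        # Receiving only parks the delta; applying happens separately.
        self.pending[delta_id] = depends_on
        self._apply_ready()

    def _apply_ready(self):
        # Keep sweeping until no parked delta can be applied any more.
        progressed = True
        while progressed:
            progressed = False
            for delta_id, dep in list(self.pending.items()):
                if dep is None or dep in self.applied:
                    self.applied.append(delta_id)
                    del self.pending[delta_id]
                    progressed = True


log = DeltaLog()
log.receive("B", depends_on="A")     # arrives first, directly over the LAN
after_b = list(log.applied)          # nothing applied yet, 'B' is waiting
log.receive("A")                     # arrives later, from the relay server
```

Once 'A' shows up, both changes are applied in creation order; if 'A' never shows up, 'B' simply stays parked.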
 
Lost data changes are automatically retrieved
So what happens in the event that deltas are waiting for you on your Groove Relay Server, and the unthinkable happens - the Groove Relay Server blows up?  Once again Groove comes in to save the day by having built in an automatic detection mechanism for missing deltas.  When missing deltas are identified, there is an automatic request mechanism in Groove that issues a request to all other members of the space to supply one, several or all of the missing changes.
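
How Groove spots a missing delta isn't described here, so the following Python sketch rests on an assumption of its own: that each member numbers its deltas 1, 2, 3, ... so that a gap in the numbers received from that member reveals a loss.  The function name and the shape of the request are equally hypothetical.

```python
def missing_sequence_numbers(received):
    """Return the gaps in the sequence numbers received from one member.

    Assumes (for this sketch only) that the member numbers deltas from 1.
    """
    if not received:
        return []
    highest = max(received)
    return [n for n in range(1, highest + 1) if n not in received]


def build_fetch_request(sender, received):
    # A request that could be sent to all other members of the space,
    # asking whoever still holds the missing deltas to supply them.
    return {"sender": sender, "missing": missing_sequence_numbers(received)}


request = build_fetch_request("member-a", {1, 2, 5})
```

Only a later delta exposes the gap in this scheme: deltas 3 and 4 are known to be missing because delta 5 arrived.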
The last point I would make is that everything described here is applicable to both the Groove Virtual Office client software and Groove Enterprise Data Bridge server software - after all they're both built on top of the same platform.

11:48:59 AM


© Copyright 2005 Paresh Suthar.
Last update: 8/19/2005; 3:24:31 PM.