Ted Neward has also discovered Prevayler:
The part that has me sort of wondering how widespread this idea goes is the "everything stored in-memory", coupled with persisting the objects stored there to disk every 24 hours or so. For certain systems (hey, like maybe this weblog), I can accept that kind of risk--for others, no way. If anybody using (or developing) Prevayler's reading this, can I control the periodic time between disk-persists?
Okay, you missed the fine print. I did too at first, and initially ruled out the whole idea as insanity. The morning after (I stumbled in late and hungover, actually) my pair-programming mate had read all about it on the Prevayler site I had left open in the browser and had already ripped out our OJB persistence layer and replaced it with Prevayler. Later that day we experienced the sensational 10k performance gain and we're probably never going back again. (Remember, our system was very performant from the start.)
Okay, so this is the stuff that makes you go "Damn, why didn't I figure that out myself": every modification to the system is serialized as a command object in a command log. At system startup the latest snapshot is deserialized and the command log is replayed. This way you don't lose anything even if the system goes down between the 24-hour snapshots. If you couple this with a nice bit of AOP you actually get totally transparent persistence. (For example, Nanning's Prevayler demo.)
Remember: every modification to the system must be completely deterministically reproducible or you're done for. You'd be surprised at how easy this actually is.
PS. You have to take the snapshots manually, so if you want them at 24-hour intervals, you do that; if you want them more often, you do that. You make the call... (...to prevayler.takeSnapshot(), that is.) DS. [jutopia]
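The recovery scheme described above (load the latest snapshot, then replay the command log) can be sketched in plain Java. This is a toy under stated assumptions, not Prevayler's actual API: `ToyPrevalence`, `Command`, `Bank` and `Deposit` are hypothetical names made up for illustration, the log is a simple length-prefixed file, and the snapshot step is not crash-safe.

```java
import java.io.*;
import java.util.HashMap;
import java.util.Map;

/** A change to the system. It must be deterministic so that replay rebuilds the same state. */
interface Command<S> extends Serializable {
    void executeOn(S system);
}

/** Hypothetical business object; everything reachable from it must be Serializable. */
class Bank implements Serializable {
    final Map<String, Long> balances = new HashMap<>();
}

/** Hypothetical command. All inputs travel inside the command, so replay is repeatable. */
class Deposit implements Command<Bank> {
    private final String account;
    private final long amount;

    Deposit(String account, long amount) {
        this.account = account;
        this.amount = amount;
    }

    @Override
    public void executeOn(Bank bank) {
        bank.balances.merge(account, amount, Long::sum);
    }
}

/** Toy prevalence layer (not Prevayler's API): RAM state + command log + manual snapshots. */
class ToyPrevalence<S extends Serializable> implements Closeable {
    private final File snapshotFile;
    private final File logFile;
    private final S system;
    private DataOutputStream log;

    ToyPrevalence(S initialSystem, File dir) throws IOException, ClassNotFoundException {
        snapshotFile = new File(dir, "snapshot.bin");
        logFile = new File(dir, "commands.log");
        system = snapshotFile.exists() ? readSnapshot() : initialSystem;
        replayLog();                                     // redo everything since the snapshot
        log = new DataOutputStream(new FileOutputStream(logFile, true)); // append mode
    }

    S system() { return system; }

    /** Log the command first, then apply it to the in-memory system. */
    synchronized void execute(Command<S> command) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(command);
        }
        log.writeInt(buffer.size());                     // length-prefixed record
        buffer.writeTo(log);
        log.flush();                                     // a real log would also force to disk
        command.executeOn(system);
    }

    /** Write the whole system to disk and start a fresh log. Not crash-safe in this sketch. */
    synchronized void takeSnapshot() throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(snapshotFile))) {
            out.writeObject(system);
        }
        log.close();
        log = new DataOutputStream(new FileOutputStream(logFile, false)); // truncate
    }

    @Override
    public synchronized void close() throws IOException { log.close(); }

    @SuppressWarnings("unchecked")
    private S readSnapshot() throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(snapshotFile))) {
            return (S) in.readObject();
        }
    }

    @SuppressWarnings("unchecked")
    private void replayLog() throws IOException, ClassNotFoundException {
        if (!logFile.exists()) return;
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(new FileInputStream(logFile)))) {
            while (true) {
                int length;
                try { length = in.readInt(); } catch (EOFException endOfLog) { break; }
                byte[] record = new byte[length];
                in.readFully(record);
                try (ObjectInputStream obj = new ObjectInputStream(new ByteArrayInputStream(record))) {
                    ((Command<S>) obj.readObject()).executeOn(system);
                }
            }
        }
    }
}
```

Reads go straight to `system()` in RAM; only `execute` and `takeSnapshot` touch the disk. The determinism rule shows up in `Deposit`: it carries its own inputs, so a command that needed the current time would have to carry the timestamp as a field rather than read the clock during replay.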
This is certainly interesting. From what I've understood so far, Prevayler can only work in a single process. I don't see how it can possibly work, for example, across a 10-machine server farm sharing the same data; at the very least there will be synchronization and recovery issues across machines. If you have a single process, though, this seems fine (indeed Prevayler seems like a Java object database in itself that your other machines could talk to).
One of the great things an RDBMS gives you is cross-language and long-lived access to your data, together with a gazillion tools to query, analyse, produce reports, import, export, convert etc. Only being able to access your business data from Java code, using a certain version of your codebase, seems pretty scary for most enterprise systems. For example, imagine radically changing the schema of your database: how do you do it? I worked with OODBMSs for years, and not being able to use all those standard database tools out there can feel like developing enterprise apps with a hand tied behind your back.
Having said all that, I do agree that persistence code can be a pain to write and can often be slow, and the performance benefits of Prevayler are impressive. From my cursory look at Prevayler, its main performance benefits seem to be:
- when reading, use objects that are in RAM so no database access is required
- when writing, rather than synchronously updating a database which can be slow (especially when lots of concurrent processes are doing the same), just write to a local transaction log and update objects in RAM
So it's mostly all about caching really. There's no reason why you couldn't get all the benefits of an RDBMS and support a large cluster of machines (say 100 machines in a server farm) while still having the same performance benefits as Prevayler. The trick is to use a distributed write-through RAM cache.
Using a distributed cache you can cache in RAM all the information you want, which provides all the same massive read optimisations as Prevayler. When writing you can send Command objects on a JMS Queue (which is usually at least as fast as writing to a local transaction log). Then the Command asynchronously gets applied to the relational database(s) and then gets distributed over a JMS Topic to all your caches.
This is neat for a number of reasons:
- all the same benefits as Prevayler (very fast reads as it's all in RAM; writes are also very fast as it's just a quick write of a Command object to a JMS queue)
- you still get all the benefits of a RDBMS being able to do arbitrary queries with SQL, reports etc
- it can support a massive number of clients sharing and working on the same data
- it maintains the order of update commands across a cluster of machines so everyone is in the same consistent state
- good JMS providers can also cluster the queue so that if a machine just dies you still have full fault tolerance and failover. i.e. no central point of failure.
- you don't have to keep everything in RAM; you can use efficient eviction policies to keep what you need in RAM, and you can work with data sets (gigabytes and terabytes) which are larger than the available RAM in each process
- you can use N-tier caches - typically application servers such as servlet engines don't have huge amounts of spare RAM so a higher tier of cache servers can provide that extra RAM
- you can also use this technique to replicate your database across locations and database providers
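The write path described above (Command onto a queue, applied to the database, then fanned out to every cache) can be sketched in a single process. This is only an illustration under stated assumptions: `PutCommand`, `FakeDatabase`, `NodeCache` and `CommandBus` are hypothetical names, a `LinkedBlockingQueue` stands in for the JMS Queue, a subscriber list stands in for the JMS Topic, and a map stands in for the relational database. No real JMS or JCache API is used.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

/** An update shipped as data, so every node can apply it in the same order. */
final class PutCommand {
    final String key;
    final String value;

    PutCommand(String key, String value) {
        this.key = key;
        this.value = value;
    }
}

/** Stand-in for the relational database (the system of record). */
class FakeDatabase {
    final Map<String, String> rows = new ConcurrentHashMap<>();

    void apply(PutCommand command) { rows.put(command.key, command.value); }
}

/** One node's RAM cache; reads never leave the process. */
class NodeCache {
    private final Map<String, String> entries = new ConcurrentHashMap<>();

    String get(String key) { return entries.get(key); }

    void onCommand(PutCommand command) { entries.put(command.key, command.value); }
}

/** Stand-in for the JMS Queue plus Topic: one ordered stream of commands. */
class CommandBus {
    private final BlockingQueue<PutCommand> queue = new LinkedBlockingQueue<>();
    private final List<NodeCache> subscribers = new CopyOnWriteArrayList<>();
    private final FakeDatabase database;

    CommandBus(FakeDatabase database) { this.database = database; }

    void subscribe(NodeCache cache) { subscribers.add(cache); }

    /** Fast write path: enqueue the command and return immediately. */
    void send(PutCommand command) { queue.add(command); }

    /**
     * Applies queued commands in arrival order: database first, then every cache.
     * In a real deployment this loop would run on a consumer thread or separate server.
     * Returns the number of commands applied.
     */
    int drain() {
        int applied = 0;
        PutCommand command;
        while ((command = queue.poll()) != null) {
            database.apply(command);
            for (NodeCache cache : subscribers) {
                cache.onCommand(command);
            }
            applied++;
        }
        return applied;
    }
}
```

Because a single consumer drains the queue, every cache sees the same commands in the same order. The trade-off is visible too: between `send` and `drain`, a reader can still see the old value, so writes are fast but only eventually visible.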
Finally, there's a standard API to these kinds of distributed caches called JCache (which I hope will be public soon!), so you should be able to write your application once and then plug and play your JCache provider. The API is not really much different from working with regular Map objects, so it's very easy to use.
7:50:38 AM