Martin Probst's weblog


Sunday, January 9, 2005, 15:05

I stumbled across a persistence framework for Java called "Prevayler". It's a rather interesting approach to persistence.

The basic idea is to keep all information in RAM, in ordinary Java objects. These objects are persisted using some arbitrary Java serialization technology; implementing "Serializable" will do. All operations on the objects are performed through objects that represent transactions. These transaction objects are passed through the system and serialized to a log. If something crashes the application, the serialized transactions are replayed against an initial state of the system. To keep the logs small, the system saves the current state to disk (using serialization) at regular intervals, like once a day. This is done by keeping a hot standby server that runs synchronized with the real server. If a backup is requested, the standby server stops syncing with the real server, dumps its objects and then resyncs.
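To make the mechanism concrete, here is a minimal sketch of the log-and-replay idea in plain Java. It does not use Prevayler's actual API: the names `Transaction`, `Bank`, `Deposit` and `PrevalenceSketch` are made up for illustration, and the hot standby server is left out (the snapshot is simply written by the same process).

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical interface: a serializable command applied to the in-memory state. */
interface Transaction<S> extends Serializable {
    void executeOn(S state);
}

/** Example business state: account balances kept in ordinary Java collections. */
class Bank implements Serializable {
    final Map<String, Long> balances = new HashMap<>();
}

/** Example transaction: add an amount to one account. */
class Deposit implements Transaction<Bank> {
    private final String account;
    private final long amount;

    Deposit(String account, long amount) {
        this.account = account;
        this.amount = amount;
    }

    @Override
    public void executeOn(Bank bank) {
        bank.balances.merge(account, amount, Long::sum);
    }
}

class PrevalenceSketch<S extends Serializable> {
    private final S state;
    private final ObjectOutputStream log;

    PrevalenceSketch(S initialState, OutputStream logSink) throws IOException {
        this.state = initialState;
        this.log = new ObjectOutputStream(logSink);
    }

    /** Write the transaction to the log first, then apply it; one transaction at a time. */
    synchronized void execute(Transaction<S> tx) throws IOException {
        log.writeObject(tx);
        log.flush();
        tx.executeOn(state);
    }

    /** Dump the whole state so that older log entries are no longer needed. */
    synchronized void snapshot(OutputStream out) throws IOException {
        try (ObjectOutputStream snap = new ObjectOutputStream(out)) {
            snap.writeObject(state);
        }
    }

    /** Recovery: replay every logged transaction against a starting state. */
    static <S> S replay(S start, InputStream logSource)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(logSource)) {
            while (true) {
                @SuppressWarnings("unchecked")
                Transaction<S> tx = (Transaction<S>) in.readObject();
                tx.executeOn(start);
            }
        } catch (EOFException endOfLog) {
            return start; // reaching the end of the log means replay is complete
        }
    }
}
```

After a crash, calling `PrevalenceSketch.replay(new Bank(), logStream)` on the saved log would rebuild the balances. Starting from the latest snapshot instead of an empty `Bank` is what keeps the replayed log short.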

This is a very smart way of achieving quite good persistence for objects without much hassle. While it is nice, it has a lot of limitations. The developers themselves seem to think they have found a universal solution that replaces relational databases in general. Does it really do that?

With Prevayler, programmers have to be quite smart to really get their objects to be persistent. While they are completely unrestricted in their use of Java features, they have to pay attention to certain things, like not keeping references to objects. Also, the speed of the system relies on the ability of the programmer to create a sensible data structure to access her data. If she chooses the wrong data structure, things will get really slow. SQL and RDBMS were invented to solve these problems, and they are quite good at it, even though converting the data from its relational representation back into your business application takes some (mostly trivial) effort.
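A small illustration of the data structure point, using hypothetical `Customer` and `CustomerIndexSketch` classes: in a pure in-memory system the programmer has to build and maintain by hand what a database gives her as an index.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical business object. */
class Customer {
    final int id;
    final String name;

    Customer(int id, String name) {
        this.id = id;
        this.name = name;
    }
}

class CustomerIndexSketch {
    private final List<Customer> all = new ArrayList<>();
    private final Map<Integer, Customer> byId = new HashMap<>();

    /** Every insert has to keep the hand-written index up to date. */
    void add(Customer c) {
        all.add(c);
        byId.put(c.id, c);
    }

    /** O(n): looks at every customer, like a table scan without an index. */
    Customer findByScan(int id) {
        for (Customer c : all) {
            if (c.id == id) {
                return c;
            }
        }
        return null;
    }

    /** O(1) on average: the manual equivalent of a primary key index. */
    Customer findByIndex(int id) {
        return byId.get(id);
    }
}
```

Both lookups return the same object, but only the second one stays fast as the list grows. With an RDBMS, this would be a matter of declaring an index rather than writing and maintaining the code for it.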

Next, Prevayler doesn't do anything in parallel. No transaction may run in parallel with another unless the user takes care of the synchronization himself. This is a major drawback. Programming concurrent applications isn't trivial, and one of the biggest benefits of an RDBMS is that it provides concurrency control, different levels of transaction isolation, etc.
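As a rough sketch of what "synchronizing it yourself" means, assume a hypothetical `GuardedAccount` that is updated from several threads. The `synchronized` keyword is the only thing that keeps concurrent deposits from getting lost, and it is the programmer who has to remember to put it there.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical shared object; every access goes through its monitor lock. */
class GuardedAccount {
    private long balance = 0;

    synchronized void deposit(long amount) {
        balance += amount;
    }

    synchronized long balance() {
        return balance;
    }
}

class ManualSyncSketch {
    /** Runs the given number of threads, each making single-unit deposits. */
    static long runDeposits(int threads, int depositsPerThread)
            throws InterruptedException {
        GuardedAccount account = new GuardedAccount();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < depositsPerThread; i++) {
                    account.deposit(1);
                }
            });
        }
        pool.shutdown();
        if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
            throw new IllegalStateException("deposits did not finish in time");
        }
        return account.balance();
    }
}
```

With the lock in place, `ManualSyncSketch.runDeposits(4, 10000)` always returns 40000. Without `synchronized` on `deposit`, updates could be lost. A database hands you this kind of guarantee, plus rollback and isolation levels, without any of this code.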

Also, Prevayler won't help you with large amounts of data. The authors claim that falling RAM prices will solve that problem, but that does not seem plausible. Real business applications can have several hundred GB of "live" data, and that's not an area you will reach with cheap RAM anytime soon. These amounts of data can be managed with a (rather) cheap x86 computer and big hard drives, even if it might get slow. Prevayler simply fails.

I think Prevayler is a smart idea to solve a limited problem: persistence of data in small applications. If they can add intelligent means of swapping objects to disk, it might get really useful. But this would again put limitations on the programmer, like forcing him to inherit from special classes, implement certain interfaces, etc.

The end of the story is that database management systems, relational or not, provide a lot of features Prevayler doesn't give you. Prevayler is suited for applications that are written by really good programmers, won't produce too much data and don't require any concurrency.

I can't really see how the programmers of Prevayler come to the conclusion that they have made DBMS obsolete. And why do they think that thousands of capable programmers and scientists have just overlooked the Prevayler approach? That seems quite arrogant to me.
