Martin Probst's weblog

Dependency Injection

Friday, August 20, 2004, 08:15

Rickard Oberg writes about Dependency Injection. This is completely new to me but it looks like an interesting approach. Oberg is using a Container called Pico to manage his Java objects, their lifecycle and their dependencies. He writes about new design goals he adopted after writing code using Pico.

Pico can be used to group objects together in a container. If you have dependencies between your objects (especially objects which need other objects in their constructors) Pico will try to resolve them using reflection. After adding several classes to a PicoContainer you can start the whole container, and Pico will create your objects and start those which are capable of doing something.

By grouping objects together in nested containers you can manage a whole cloud of dependent objects. This looks nice, but it obviously needs a lot of thinking in the design phase as well as a really different approach to OO software design. Pico (and possibly other similar frameworks) hides the dependencies, object configuration and lifecycle issues from the programmer. This might be a good idea in many situations but I’m pretty sure that you can blow an impressively big hole into your own foot with this.
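To make the idea concrete, here is a minimal sketch of constructor injection via reflection. It is written in Python rather than Java for brevity, and it is not PicoContainer's API: the class TinyContainer, its methods and the two example classes are hypothetical names made up for illustration. It assumes every constructor parameter is annotated with a registered class.

```python
import inspect


class TinyContainer:
    """Hypothetical constructor-injection container (not PicoContainer's API)."""

    def __init__(self):
        self._registered = []   # classes in registration order
        self._instances = {}    # class -> its single instance

    def register(self, cls):
        self._registered.append(cls)

    def get(self, cls):
        """Return the single instance of cls, building its dependencies first."""
        if cls not in self._instances:
            if cls not in self._registered:
                raise LookupError(f"{cls.__name__} is not registered")
            # Reflection step: read the constructor signature and resolve
            # each annotated parameter through the container itself.
            params = inspect.signature(cls).parameters.values()
            deps = [self.get(p.annotation) for p in params]
            self._instances[cls] = cls(*deps)
        return self._instances[cls]

    def start(self):
        """Create all objects, then start those that have a start() method."""
        started = []
        for cls in self._registered:
            obj = self.get(cls)
            if callable(getattr(obj, "start", None)):
                obj.start()
                started.append(cls.__name__)
        return started


class Database:
    def __init__(self):
        self.running = False

    def start(self):
        self.running = True


class UserService:
    def __init__(self, db: Database):
        self.db = db


container = TinyContainer()
container.register(UserService)  # registration order does not matter
container.register(Database)
started = container.start()
```

Note that UserService never says where its Database comes from. That is exactly the hidden wiring that makes this approach both convenient and a little scary.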

Recovering overview of classes with lots of methods

Thursday, August 5, 2004, 22:10

I think everybody knows this. Editing a fairly large class within Eclipse can get tedious because one loses the overview of its methods. Eclipse provides an outline of the methods, but this isn’t sorted in a sensible manner.

The Eclipse Protocols Plug-In helps with this by grouping methods into categories (marked by comments) and letting the user browse them. The grouping is quite easy using drag & drop. This is a nice tool to enhance the readability of your code, which is in turn a good idea as code is usually read a lot more often than it is written …

[via Manageability]

IBM’s Common Public License

Tuesday, July 20, 2004, 12:28

As posted before, I took a look at the CPL, IBM’s open source license.

It seems to be some kind of a “mid-way” between the BSD-style licenses (e.g. Apache) and the GPL (or similar licenses). With a BSD-style license one may use and modify the source in one’s own projects and distribute the resulting applications in binary form without providing the full source code, and under one’s own licensing terms. One only has to give credit to the authors. This is IMHO a nice thing as it might help Open Source software to spread. Company lawyers are probably less afraid of such licensing terms than of those of the GPL, which require full redistribution of source code.

On the other hand, this might result in some Evil Company™ taking over your nice open source project and making big bucks while suppressing their user base. There is nothing wrong with companies making big bucks but I wouldn’t really feel happy with that. Look for example at TransGaming’s WineX - if I were one of the Wine developers I would be fairly pissed. They took most of the Wine work, only added features for gaming support, and didn’t give their improvements back to the community. This seems somewhat unfair as their product seems to be heavily dependent on Wine.

With the CPL it’s possible to have some kind of a middle way. If others modify your source and redistribute it, they have to provide the source code with it. But if they only use the program (e.g. link against it, use it like one would use an XML parser) it’s ok for them to include it in closed source distributions while providing appropriate credit to the original authors. This might be a way to go.

The only obstacle is that the definition of “uses the source code” versus “is a derivative of the source code” is rather weak or non-existent. One would also have to look at the compatibility of the CPL with other licenses. This matters if one ever needs to change the license of one’s product.

Bug hunting with AOP and Eclipse

Tuesday, July 20, 2004, 12:04

Today I stumbled across an Eclipse plugin called Bugdel. Bugdel provides an aspect weaver to weave debugging code into your existing Java applications. This results in a clear separation of debugging/logging code and real application logic, which is generally regarded to be a Good Thing ™.

Bugdel supports the common set of points to weave aspects into, such as method calls, method executions, field gets and sets, etc. It’s also possible to weave aspects to line numbers. Bugdel supports wildcards in method and class names so you can easily weave certain debugging aspects into a lot of methods.

What looks really good is the integration into Eclipse. Bugdel adds its own editor to Eclipse (based on the standard Java editor) where you can easily add the “pointcuts” (Bugdel term for AOP join points).
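As a rough illustration of wildcard-based weaving, here is a small Python sketch. It is not Bugdel and does not use Bugdel's syntax: the function weave, the Account class and the "set_*" pattern are all hypothetical. It wraps every public method whose name matches a wildcard with a piece of "before" advice, leaving the class source itself untouched.

```python
import fnmatch
import functools


def weave(cls, pattern, advice):
    """Wrap each public method of cls whose name matches the wildcard pattern.

    advice(name, args) runs before the original method is executed.
    Returns the names of the methods that were woven.
    """
    woven = []
    for name, member in list(vars(cls).items()):
        if name.startswith("_") or not callable(member):
            continue
        if not fnmatch.fnmatchcase(name, pattern):
            continue

        def make_wrapper(original, method_name):
            @functools.wraps(original)
            def wrapper(self, *args, **kwargs):
                advice(method_name, args)  # the debugging aspect
                return original(self, *args, **kwargs)
            return wrapper

        setattr(cls, name, make_wrapper(member, name))
        woven.append(name)
    return woven


class Account:
    def __init__(self):
        self.balance = 0
        self.owner = None

    def set_balance(self, amount):
        self.balance = amount

    def set_owner(self, owner):
        self.owner = owner

    def get_balance(self):
        return self.balance


log = []
woven = weave(Account, "set_*", lambda name, args: log.append((name, args)))

account = Account()
account.set_balance(42)   # logged, because the name matches "set_*"
account.get_balance()     # not logged
```

The Account class contains no logging code at all; the debugging concern lives entirely in the advice passed to weave.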

I have to look into this to find out whether the Bugdel pointcuts can only be used from within Eclipse. It would be great if they could also be compiled into the app using Ant or something similar.

Bugdel is distributed under the terms of the CPL (IBM’s open source license). I only took a glimpse at the CPL but it looks kind of strange to me. I’ll read some more on it.

(via Eclipse Plugins)

Open Source License Madness

Wednesday, July 14, 2004, 20:35

When I checked my email today after having ignored it for 2 days I first thought there was an email problem with the MonoDevelop mailing list. There were 47 new emails in the last two days which is a lot more than usual.

It wasn’t a misconfiguration of the mailing list tool though, but a discussion about MonoDevelop’s license and possible compatibility issues with SharpDevelop’s license.

MonoDevelop is an IDE developed for UNIXoid platforms (afaik mainly used on Linux & MacOS X) and it’s based on SharpDevelop to quite some extent. SharpDevelop is an IDE running under Mono and MS .NET but it’s currently limited to Windows because of its dependency on Windows.Forms.

The discussion started with Todd Berman, one of the main MonoDevelop guys, announcing that he will license all his own contributions under the MIT X11 license as opposed to SharpDevelop’s GPL policy. The MIT X11 license is a BSD/Apache-style license which allows mixing free and non-free software.

The SharpDevelop guys (namely Christoph Wille) then remarked that this is not possible as the MonoDevelop modules depend on and extend GPLed SharpDevelop code. This effectively means that the authors cannot decide about the license of their own code because of the GPL limitations. This kind of annoyed some MonoDevelop guys as they wanted a more liberal license, especially to allow third parties to provide non-free plugins for MonoDevelop. One of the major developers, John Luke, decided to let go of MonoDevelop because of that.

I’m not really sure what the definitive conclusion of the rather interesting debate was, but in the end (at least, in the current state) Todd Berman announced that he will relicense his contributions to be MIT X11 / GPL dual licensed. When being distributed with SharpDevelop parts they have to be licensed under the GPL though.

What does this mean to other developers? Be careful about license issues with your Open Source project. No one wants to struggle with strange legal issues instead of writing cool code. Especially when including source code from other projects you might irrevocably bork up your project in terms of what you can use it for.

After all, I don’t really understand the position of the SharpDevelop guys - they do not really seem very happy with the existence of MonoDevelop. Otherwise they might go ahead and try to find a compromise, e.g. LGPLing the plug-in interfaces or the parts MonoDevelop depends on. I’m not a lawyer, but this might make non-GPL plug-ins and enhancements possible.

It’s their source code and their copyright so they can of course freely decide what should be possible and what shouldn’t. But I don’t consider it a good idea to enforce the use of your favorite license on other people. It doesn’t sound very “free” (freedom, not beer, German “frei”) either.

DocSynch, Collaborative Editing and XML Databases

Friday, June 11, 2004, 17:09

Today I received an email from my friend and fellow student Alexander Klimetschek. He is developing a system called DocSynch, which allows multiple people to work synchronously on one document. It’s implemented on top of the IRC protocol and there’s a working plugin for the open source Java editor jEdit.

While this seems to be completely unrelated to my current XML database project, the scientific background of collaborative editing is probably very interesting. Multiple users editing one semi-structured hierarchical document is exactly what an XML database is supposed to provide. I have to check on that …

dbXML, XML:DB, XMLDB and friends

Thursday, June 10, 2004, 11:00

Today I googled for a mirror site of (which seems to be down) containing more information about the XML:DB API. One would think this shouldn’t be all too hard, but as Google removes all non-text characters from queries (even if they are quoted) this becomes “XML DB”. And apparently everyone doing anything XML related with databases - including peeps storing their adressbook.xml in MySQL BLOBs - is calling it XMLDB or DBXML.

This is also always funny when discussing such things with other developers.

Is it so hard to find a name for a tool that is not a mere description of the technologies used? Giving things an unambiguous name is not only marketing, it’s really necessary.

The dark side of Gentoo

Monday, June 7, 2004, 17:37

I’m using Gentoo Linux, which is great because it builds packages directly from source. This means you’ll have packages optimized for your machine, and most packages also get updated quite often. Lately I found out that this is something I consider really annoying too.

I just typed a rather common command for me:

    # sudo emerge -vpuU world

which means “search for updates to all software I have installed manually” (i.e. not as a dependency). This is what it spits out:

    [ebuild U ] app-office/openoffice-ximian-1.1.59 [1.1.57] +gnome -kde -ooo-kde 1,013 kB
    [ebuild U ] net-www/apache-2.0.49-r3 [2.0.49-r1] +berkdb -doc +gdbm -ipv6 -ldap +ssl -static +threads 0 kB
    [ebuild U ] dev-util/eclipse-sdk-3.0.0_rc1 [3.0.0_pre8-r2] +gnome +gtk -jikes -kde +motif +mozilla 0 kB
    [ebuild U ] sys-apps/module-init-tools-3.0-r2 [3.0] -debug 0 kB
    [ebuild U ] sys-apps/baselayout-1.9.4-r2 [1.8.12] -bootstrap -build -livecd -(selinux) -static 0 kB

The important lines are those with eclipse and openoffice. I would like to keep up to date, but I don’t like to compile the whole of Eclipse or even OpenOffice for an “m8 to m9” or some similarly small step. Especially as Java packages gain nothing from being compiled locally.

While I also like the configuration system and baselayout of Gentoo, I think I will install Debian on the next machine I set up.

More thoughts about eXist style XML storage

Thursday, May 27, 2004, 16:34

I’ve come to think a little bit more about the way eXist stores XML. Actually it’s more what I imagine it’s supposed to be because this is only based on what Lars Trieloff told me about the phantom nodes they use in XML trees - I am too lazy to really look up how they do it.

If you insert phantom nodes and number your XML nodes as mentioned before, you can store a whole XML tree without any pointers within the nodes. You just have to maintain a table, indexed by the node numbers, containing references to the objects. This table could also contain fields for locks and maybe even Access Control Lists.

This is generally a Great Thing ™ as it leads to nodes with a fixed size, which would make storage a lot easier and faster. I’m not yet really sure if it’s a good idea because you need to take care of the attributes of elements and the text within text nodes. This could be done by storing only references to them within the nodes themselves, but that would mean another dereference when doing comparisons within XPath/XQuery queries.
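A small sketch of how such pointer-free navigation could work. The earlier post with the numbering is not reproduced here, so this assumes one particular scheme: the tree is padded with phantom nodes until every node has exactly K child slots, and the slots are numbered in level order starting with 1 at the root. The names K, parent, children and real_children are made up for illustration; this is not taken from eXist's source.

```python
K = 2  # assumed fan-out: every node is padded to K child slots


def parent(n, k=K):
    """Number of the parent of node n (the root, node 1, has none)."""
    return None if n == 1 else (n - 2) // k + 1


def children(n, k=K):
    """Numbers of all k child slots of node n, phantom or not."""
    first = k * (n - 1) + 2
    return list(range(first, first + k))


# Table for <a><b/><c><d/></c></a>: slot n-1 describes node n,
# and None marks a phantom node.
table = ["a", "b", "c", None, None, "d", None]


def real_children(n):
    """Child numbers of node n that refer to real (non-phantom) nodes."""
    return [c for c in children(n)
            if c <= len(table) and table[c - 1] is not None]
```

Here real_children(1) yields nodes 2 and 3 (b and c), and parent(6) leads from d back to node 3 (c), all by arithmetic on the numbers alone. The nodes themselves carry no pointers.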

Regarding the table of nodes, there are two styles to keep it. The first one can be considered somewhat of a “dense” index. This would mean one entry for one node, even if it’s a phantom node. The access to this table would be very fast - O(1). The drawback is a potentially massive overhead if your XML tree is very unbalanced. Imagine a tree that is generally very thin but has one node with several hundred children. The array would get extremely big because of all the phantom nodes.
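To get a feeling for the overhead, here is a back-of-the-envelope calculation. It assumes the padding scheme where every node gets as many child slots as the widest node in the document, so the table needs one slot per position of a complete tree. The function name dense_slots and the example document (four levels deep, 303 real nodes, one node with 300 children) are hypothetical.

```python
def dense_slots(fan_out, levels):
    """Slots in a dense table for a tree padded to a complete fan_out-ary tree."""
    return sum(fan_out ** level for level in range(levels))


# A thin document, four levels deep, where a single node has 300 children:
slots = dense_slots(300, 4)   # 1 + 300 + 90,000 + 27,000,000
real_nodes = 303              # hypothetical count of non-phantom nodes
overhead = slots / real_nodes
```

That is 27,090,301 table slots for a document of a few hundred nodes, so a single wide node really can blow up the whole array.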

The other style would be a sparse index. The list would only contain references for non-phantom nodes, even though phantom nodes would still be counted in the numbering. This index would be slower as an entry cannot be accessed directly via its number. If it is implemented as an ordered tree structure, accesses would be something like O(log n) - in which case the storage tradeoff of a pointered node tree might look less bad. Keep in mind that frequently traversed nodes would be in main memory all the time anyway, so disk IO should be rare. Tree management shouldn’t be a big issue as the complete index has to be rebuilt anyway if something gets changed.

The dense index would be great but the space tradeoff might be heavy. The sparse index might be a bad idea too - every access to a node would be O(log n), which is a lot compared to simple pointer dereferencing in a DOM tree. Sounds like this should be user configurable, as only the user can know how unbalanced his trees might get.
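A sketch of what a sparse index could look like. Instead of an ordered tree it uses a sorted list with binary search, which gives the same O(log n) lookup and is simpler to show. The class SparseIndex and the example entries are hypothetical.

```python
import bisect


class SparseIndex:
    """Maps node numbers to payloads, storing real nodes only."""

    def __init__(self, entries):
        # entries: iterable of (node_number, payload) pairs for real nodes
        pairs = sorted(entries, key=lambda entry: entry[0])
        self._numbers = [number for number, _ in pairs]
        self._payloads = [payload for _, payload in pairs]

    def get(self, number):
        """Binary search for the node number; None means phantom or unknown."""
        i = bisect.bisect_left(self._numbers, number)
        if i < len(self._numbers) and self._numbers[i] == number:
            return self._payloads[i]
        return None

    def __len__(self):
        return len(self._numbers)


# Real nodes 1, 2, 3 and 6; numbers 4, 5 and 7 are phantoms and take no space.
index = SparseIndex([(6, "d"), (1, "a"), (3, "c"), (2, "b")])
```

Here index.get(6) finds d and index.get(4) returns None, while the index holds only four entries instead of seven.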

Storing XML

Tuesday, May 25, 2004, 12:17

I’m currently doing research on how to store XML with object-oriented means. It’s of course rather trivial to store XML somehow. The interesting question is which system makes most sense if the XML is to be queried using XPath or XQuery expressions. Just to write down some of the ideas:

Also a lazy update of pointer lists might be nice. Every XML node keeps a list of pointers to the nodes on its axes, like siblings, predecessors, children, successors etc., which is kind of virtual. It’s only created when needed and updated if certain timestamps (which have to be somehow kept in the data dictionary) are updated.
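A minimal sketch of such a lazily maintained list, under a few assumptions: a simple version counter on the document stands in for the timestamps in the data dictionary, only one axis (descendants) is cached, and the child lists are kept as ordinary pointers rather than virtual lists. The classes Document and Node are hypothetical.

```python
class Document:
    def __init__(self):
        self.version = 0  # stands in for the timestamp in the data dictionary


class Node:
    def __init__(self, doc, name, parent=None):
        self.doc = doc
        self.name = name
        self.children = []
        self._cache = {}  # axis name -> (version it was built at, node list)
        if parent is not None:
            parent.children.append(self)
            doc.version += 1  # any structural change outdates cached lists

    def descendants(self):
        """Build the list only on demand; rebuild only if the document changed."""
        cached = self._cache.get("descendants")
        if cached is not None and cached[0] == self.doc.version:
            return cached[1]
        result = []
        for child in self.children:
            result.append(child)
            result.extend(child.descendants())
        self._cache["descendants"] = (self.doc.version, result)
        return result


doc = Document()
root = Node(doc, "a")
b = Node(doc, "b", root)
first = root.descendants()    # computed now and cached
again = root.descendants()    # served from the cache
c = Node(doc, "c", b)         # bumps the version
second = root.descendants()   # recomputed because the cache is stale
```

A single global counter invalidates far more than necessary. Finer-grained timestamps, for example one per subtree, would keep more of the cached lists alive after an update.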
