Martin Probst's weblog

Eclipse PDE builds

Tuesday, June 13, 2006, 08:02 — 1 comment

Something that really leaves me wondering is the Eclipse Plugin Development Environment (PDE). The folks at Eclipse certainly know how to make a great IDE, they show it with the Java Development Tools, but the PDE?

They provide some nice GUI editors for the various XML files used by plugins, but some very basic things fail, consistently. One of the most annoying is building plugins. It just never works. There is a long-standing bug that plugins don't build if they are not in the exact same folder hierarchy; there is apparently a bug where it doesn't pick up the right versions for feature projects; there is a bug where having a plugin installed in some version conflicts with building it; and there is a bug where changes in one project don't show up in another if you've also got another plugin of that name installed. Every time I upgrade my feature to a new version, I first change the version number in the GUI editor, then I manually edit the file because the editor never picks it up, then I try building, which always fails, then I restart Eclipse, and then building usually works. And that only because I found out about the bugs mentioned above by painfully searching Bugzilla and implemented workarounds.

Additionally, the current release candidate sprinkles generated build.xml files and temporary folders all over your projects if you're unwise enough to press the build button. But never think that these build.xml files might be usable with a standalone, plain Ant; they seem to be there only for added user annoyance.

After building the update site, installing the newer version fails mysteriously and silently: everything claims to have worked, but the code running is obviously still the old one, even after a restart.

This is really making Eclipse development a huge pain. Add to that the bad documentation of many things (most of the time you have to browse example code from somewhere on the net) and the over-complicated, cryptic API, and I sometimes really wonder why everyone is so enthusiastic about Eclipse's great extensibility and blah. Maybe it's just the glamour of the Java IDE in Eclipse that makes people think its foundation must be great.

Translation of domain-specific terms - don't do it

Saturday, June 3, 2006, 09:37 — 1 comment

pitosalas from BlogBridge wonders how to translate domain-specific words like "Feed" or "Reading List" into German. I would always opt not to translate very specific terms like "Feed". You can't come up with a German word that encompasses the whole meaning, as there is nothing like a feed yet. "Nachrichtenticker" is itself an adoption from English (I think), and it's not really spot-on. People will have to learn the meaning of "Feed" anyway, so why bother translating? It's not going to help them.

I would always try to go this way. I heard that early IBM manuals would correctly translate the term "stack pointer" as "Kellerspeicherrücksprungszeiger". That is indeed the correct translation - Friedrich L. Bauer called it that when he invented it. But translating it at all is wrong. If I read such a term, it takes me quite some time to translate it back to English and figure out what it's supposed to mean.

These non-translatable words are, in my personal opinion, actually a benefit of being a non-English-speaking computer user or programmer. E.g. if someone uses the term "map", I always know he's talking about the data structure, not about an actual road map. The same goes for many words, and again the word itself - "map" or "Karte" - doesn't really help you much in understanding the data structure; you have to learn it anyway. This is of course only true if the word is indeed domain-specific and doesn't have a well-known translation.

"Reading lists" is different, as there may actually be a parallel in real life. That means there is a "natural" translation that has a meaning on its own, without any further explanation.

Learned Ruby today

Tuesday, May 30, 2006, 21:41 — 0 comments

Well, not quite, but at least I wrote my first Ruby script, something using the IMAP API and walking the folder tree in my mail account. I also debugged a tool called RExchange, which retrieves calendar items from Exchange servers. I've not been successful at that, still.
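For the record, the folder-walking part of such a script is tiny. A minimal sketch: the actual net/imap connection is shown only as a comment (host and credentials are placeholders, not the ones I used), and the helper just turns a flat list of delimiter-separated mailbox names into an indented tree, so it runs without a live server.

```ruby
# With Ruby's net/imap you would fetch the mailbox names like this
# (host and login are placeholders):
#
#   require 'net/imap'
#   imap = Net::IMAP.new('imap.example.com', ssl: true)
#   imap.login('user', 'secret')
#   names = imap.list('', '*').map(&:name)
#
# The part below does the tree-walking: one indented line per mailbox.
def mailbox_tree(names, delimiter = '/')
  names.sort.map do |name|
    parts = name.split(delimiter)
    '  ' * (parts.length - 1) + parts.last
  end
end

puts mailbox_tree(%w[INBOX INBOX/Work INBOX/Work/Eclipse Sent])
```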

This language is ridiculously easy and straightforward. I'm not sure if one should be allowed to call this programming ;-)

Tomorrow: Ruby on Rails.

Mail and server woes

Tuesday, May 30, 2006, 16:14 — 0 comments

Quite some time ago I lost the login password to my hoster's configuration area. It didn't really bother me, as I didn't have anything to set up, but yesterday I actually called them and received a new password (positive: they will only ever give you a password if they call you back on the previously set-up telephone number).

Strange business practice

After that, I noticed they have a new offer which is significantly cheaper, so I upgraded my "Hosteurope WebPack L (r1)" to a "Hosteurope WebPack L (r2)". What I failed to note was the warning text telling me that I would lose all my data. So when I called support today asking why my email and webpage didn't work, I was quite surprised. Hosteurope is a good hoster; I've never had any problems with them. But this is pretty annoying - I can't think of a technical reason not to just copy the data over (if they even have to migrate to a different server) when someone changes their contract.

So my guess is that they do this only to keep customers from upgrading to cheaper contracts. Which I really don't like, and the way they do it is also quite dangerous: there is a single dialogue comparing prices in the two systems, and below that a button that says "upgrade". As far as I can remember, there was no 40pt red-letter warning confirmation or anything; the warning must have been in the small text above. Which is totally ridiculous - an operation that can cost you your whole email archive, guarded by nothing but some fine print? And that only for the small commercial benefit of increasing the opportunity cost of upgrading?

Luckily I noticed soon enough that they still had the original email server running, so I could just copy my mail over IMAP. I would also have had a local backup of my ~/Library/Mail dir, but still.


I had quite some trouble creating a backup of the MySQL database on the server; surprisingly, phpMyAdmin refused to "send" a backup. After some time I found out that an old webpage I keep around had a public comment function which was heavily abused by spammers. The corresponding SQL tables had grown to 75 MB. I wonder how beginners create webpages nowadays - you have to become an expert in anti-spam technology before you can put anything online.

If you’ve tried emailing me yesterday or today and didn’t receive a response, try contacting me again, some stuff might have been lost.

XQuery as a web scripting language

Tuesday, May 16, 2006, 08:17 — 1 comment

In my last post, I threw some XQuery together to provide a browsable outline of an XML document in a browser. I used XQuery as a web language by embedding it into JSP pages and using our X-Hive/DB tag library. This somewhat sucks, because I generally don't like JSP and our tag library is somewhat under-maintained. It would probably be quite easy to get it up to speed again, as it's not really much code, and of course customers can modify it, as we deliver the source code to them.

I'd really like to use XQuery exclusively. At first thought it's a perfect fit for a web language, being functional and such, but on second thought you run into quite a few problems. Just returning the query result as a web page is nice, but that doesn't get you very far. To provide a useful web interface you need to set arbitrary HTTP headers (most importantly response codes).

MarkLogic uses custom functions to do that, e.g. xdmp:set-response-code() and xdmp:add-response-header().

eXist goes with the same approach, though I can't find anything about headers.

We used to do that with our debugging capabilities, e.g.

xhive:queryplan-debug('stdout'), …
But the problem with that method is that it’s highly un-functional. It doesn’t fit the language to have side-effect functions that always return the empty sequence. And it’s not only ugly, it might get you into real trouble, e.g.
  let $doc := doc($uri)  (: set-content-type below is an illustrative name :)
  return if ($doc/type = "text")
    then (set-content-type("text/html"), …)
    else (set-content-type("application/pdf"), …)
This is a bit contrived, and you could work around it in this case (construct the content type first, then call the method), but in the general case the query processor is allowed to evaluate both function calls in any order it finds suitable. So you might end up first setting the content type to HTML, then to PDF, and then delivering an HTML document. We now use a syntax like "declare option xhive:queryplan-debug …;" in the query header, but that of course doesn't work for HTTP headers. (I'm not writing this to bash MarkLogic - I can understand their decision very well, and it's simply an ugly problem; I just took them as an example because their documentation is readily available.)

The only XQuery-ish solution would probably be to provide a custom document format to encapsulate those web specific results. Have a document type called HTTP response (and one called HTTP request) and have the query return this document, e.g.

<response xmlns="http://http/what/ever">
  <code>HTTP/1.1 404 Not found</code>
  <entry name="Location" value="…" />
  <body xmlns="">{
    for $x in …
  }</body>
</response>
This would probably work, though it puts a bit more work on the user. It should be possible with some library functions, though one would need to pay attention to some escaping issues etc. And there is probably quite a lot I didn't think of …
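To make the idea concrete, here is a sketch (in Ruby, not X-Hive code) of how a server-side wrapper might unwrap such a response document into status, headers and body before handing them to the HTTP layer. The element names follow the sketch above and are not any standard format; the namespace is omitted to keep the REXML XPath expressions simple.

```ruby
require 'rexml/document'

# Unwrap the hypothetical <response> document: status line from <code>,
# header entries from <entry>, and the payload from <body>.
def unwrap_response(xml)
  doc = REXML::Document.new(xml)
  status = doc.root.elements['code'].text.strip
  headers = {}
  doc.root.elements.each('entry') do |e|
    headers[e.attributes['name']] = e.attributes['value']
  end
  body = doc.root.elements['body']
  [status, headers, body]
end
```

A real implementation would of course stream the body instead of materializing it, but the point is only that the web-specific parts can be plain query output.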

Now, does anyone know such a format? I didn't really research it, and "xml http format" doesn't make a good Google query. Having a consistent format across different XQuery implementations would be nice, too, as this stuff goes into the query, and the whole point of XQuery is to have portable queries. Or maybe other people have completely different ideas on how to do this - I'd like feedback either way.

Update: Lars Trieloff comments that he would rather use processing instructions on the document root level, e.g.

<?http.header Location: /other/path ?>
where the name is open to discussion. This raises two minor issues: the user might already use <?http.header?> (or whatever the name ends up being - PIs don't support namespaces), and the user might include line breaks in the processing instruction. The latter is probably just an error. The benefit of this technique is that it would be a lot less invasive to user code.
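A sketch of what consuming that PI convention could look like, again in Ruby rather than inside the query processor; the PI target "http.header" is the name proposed above and explicitly open to discussion.

```ruby
require 'rexml/document'

# Collect <?http.header Name: value ?> processing instructions from the
# query result into a header hash. PIs in the document prolog (before
# the root element) are children of the REXML::Document itself.
def collect_header_pis(xml)
  headers = {}
  REXML::Document.new(xml).children.each do |node|
    next unless node.is_a?(REXML::Instruction) && node.target == 'http.header'
    name, value = node.content.split(':', 2)
    headers[name.strip] = value.strip
  end
  headers
end
```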

Create a document browser with XQuery in 50 SLOC

Saturday, April 22, 2006, 15:49 — 0 comments

Some time ago I was asked to create a quick demo of how to create a documentation browser with X-Hive/DB. The idea was to present one huge document with a tree navigation on the left and the content of a selected tree node on the right. An XSL stylesheet to transform the document into HTML already existed.

AMM Browser

For the curious: AMM stands for "aircraft maintenance manual"; it documents the procedures necessary for different defects or routine maintenance on airplanes. And no, the engineers don't do Lorem ipsum all the time ;-)

This is something XQuery was designed for. I took an existing JavaScript tree library that eats a trivial XML format, plus the X-Hive/DB JSP tag library. The whole application consists of three JSP files:


Some quirky HTML and my misguided attempts to create CSS, plus this script code:

  tree = new dhtmlXTreeObject("treeboxbox_tree", "100%", "100%", 0);
  // function to call on node select
  function onNodeSelect(nodeId) {
    ajaxpage("content.jsp?id=" + nodeId, "content_box");
  }
… which just tells the JavaScript library to use tree.jsp for the tree and content.jsp for the content. Can’t get much easier.


Probably the most trivial query possible:

<?xml version="1.0" encoding="utf-8" ?>
<%@ taglib uri="" prefix="xhtags" %>
<xhtags:session>
  <xhtags:transaction>
    <xhtags:contextnode contextNodePath="/amm"/>
    <xhtags:xquery>
      //*[@KEY = "<jsp:expression>request.getParameter("id")</jsp:expression>"]
    </xhtags:xquery>
    <xhtags:foreach>
      <xhtags:transform styleUrl="amm-common.xsl" />
    </xhtags:foreach>
  </xhtags:transaction>
</xhtags:session>
This JSP simply imports the X-Hive taglib, opens a session with the database and within that a transaction, takes a context node in the library, runs an XQuery for nodes with a specific ID (the parameter passed as "id"), and runs a stylesheet on each of the returned nodes. The necessary configuration (X-Hive database path, DB user and password, cache size) is taken from web.xml.


A simple XQuery to produce the XML format the JavaScript library requires from the document in the database. The library expects something like this:

<tree id="0">
  <item id="1" text="foo" child="false"/>
  <item id="2" text="bar" child="true"/>
</tree>

Which simply means: the tree with the ID "0" (an opaque string, only expected to be unique) has two children, one with the description "foo" and the other called "bar", with IDs 1 and 2, respectively. The latter has children of its own. The library also supports fancy stuff like special icons etc., but I left that out for simplicity.
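Producing this format is trivial in any language. A Ruby sketch for illustration (the field names are the ones described above; note there is no XML escaping, so this is only safe for trusted text):

```ruby
# Serialize a list of item hashes into the tree format the JavaScript
# library expects: <tree id="…"> wrapping one <item/> per entry.
def tree_xml(tree_id, items)
  rows = items.map do |i|
    %(  <item id="#{i[:id]}" text="#{i[:text]}" child="#{i[:child]}"/>)
  end
  (["<tree id=\"#{tree_id}\">"] + rows + ['</tree>']).join("\n")
end

puts tree_xml('0', [{ id: '1', text: 'foo', child: 'false' },
                    { id: '2', text: 'bar', child: 'true' }])
```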

The document which is queried roughly looks like this:

<?xml version="1.0" encoding="utf-8" ?>
<AMM>
  <CHAPTER KEY="…" CHAPNBR="3">
    <TITLE>…</TITLE>
    <SECTION KEY="…" CHAPNBR="3" SECTNBR="1">…</SECTION>
    …
  </CHAPTER>
</AMM>
Levels also include SUBJECT and PGBLK. The KEY attribute is always unique, each level element has a TITLE child, and structural elements are always direct children of their parent. This is all we need to know about our source.

tree.jsp looks like this:

      1 <?xml version="1.0" encoding="utf-8"?>
      2 <% response.setContentType("text/xml"); %>
      3 <%@ taglib uri="" prefix="xhtags" %>
      4 <xhtags:session>
      5   <xhtags:transaction>
      6     <xhtags:contextnode contextNodePath="/amm"/>
      7     <xhtags:xquery>
      8       declare function local:tree($root as element()) as element()
      9       {
     10         let $rootId := if ($root/@KEY) then $root/@KEY/string() else "0"
     11         return
     12         <tree id="{ $rootId }">
     13           {
     14             for $child in $root/(CHAPTER | SECTION | SUBJECT | PGBLK)
     15             let $numStr := string-join( ($child/@CHAPNBR,
     16                                          $child/@SECTNBR,
     17                                          $child/@SUBJNBR,
     18                                          $child/@PGBLKNBR),
     19                                         '-')
     20             let $hasChildren := exists($child/(CHAPTER | SECTION | SUBJECT | PGBLK))
     21             return
     22               <item id="{ $child/@KEY/string() }" text="{ $numStr, ' ', $child/TITLE/string() }" child="{ if ($hasChildren) then 1 else 0 }"/>
     23           }
     24         </tree>
     25       };
     26       let $root := //*[@KEY = "<jsp:expression>request.getParameter("id")</jsp:expression>"]
     27       return
     28         if (empty($root)) then
     29           (: empty root, return main doc :)
     30           local:tree(/AMM)
     31         else
     32           (: children of the node with the given ID :)
     33           local:tree($root)
     34     </xhtags:xquery>
     35     <xhtags:foreach>
     36       <xhtags:tostring/>
     37     </xhtags:foreach>
     38   </xhtags:transaction>
     39 </xhtags:session>

The query retrieves the element with the given ID if it exists, and the root node (/AMM) otherwise. It then calls the function local:tree to create an XML document that matches the format. local:tree first checks whether there is a KEY attribute on the root (in AMM documents the root node doesn't have an ID) and makes sure we always have something. It then creates the root of the tree in line 12 and, for each child of the root node, an <item/> with the proper text, ID and child flag. It only iterates over CHAPTER, SECTION, SUBJECT and PGBLK nodes (line 14); everything else is considered non-structural and shouldn't show up in the tree. In lines 15-19 it creates the chapter number that is prepended to the title of each element. This could also be calculated dynamically from the XML, but that wouldn't work when handling fragments of the document. The "child" attribute (which should have been named "hasChildren") is just an optimization so that the client doesn't have to ask for the children of each node just to decide whether to display a "+" in front of it.
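Stripped of the XQuery and JSP machinery, the algorithm is just this. A Ruby sketch, where nested hashes with made-up keys (:tag, :key, :title, the number attributes, :children) stand in for the AMM elements described above:

```ruby
# The four element names local:tree treats as structural.
STRUCTURAL = %w[CHAPTER SECTION SUBJECT PGBLK]

def structural_children(node)
  (node[:children] || []).select { |c| STRUCTURAL.include?(c[:tag]) }
end

# Mirror of local:tree: one item per structural child, with the
# chapter-number string joined from whichever number attributes exist,
# and a child flag so the client can draw the "+" without a round trip.
def tree(root)
  items = structural_children(root).map do |c|
    num = [c[:chapnbr], c[:sectnbr], c[:subjnbr], c[:pgblknbr]].compact.join('-')
    { id: c[:key],
      text: "#{num} #{c[:title]}",
      child: structural_children(c).empty? ? 0 : 1 }
  end
  { id: root[:key] || '0', items: items }
end
```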

While this certainly has shortcomings (no proper escaping of the parameters to the queries, which should be query parameters; the XSL stylesheet runs on the server), it shows how easy it is to query XML with XQuery. I wrote the whole thing in less than a day; most of the time was spent debugging the JavaScript library and setting up Tomcat and the build environment. As a bonus, the interface between the server and client components couldn't be more trivial. Using the ID attribute we can easily drop our browser-based editing component into the application to make the whole thing read/write.

To make the whole thing run fast I added an index on the KEY attribute of any element. The application can also easily benefit from HTTP caching. If it needs to scale up, you can simply change the path to the database file (e.g. /var/xhivedb/data/XhiveDatabase.bootstrap) in your web.xml to a remote machine running X-Hive/DB (e.g. xhive://server:1235/). This way, all requests to the application servers use a local data cache, and the server is only queried if pages have been modified. Using that technique you can serve a lot of users.

AMM Browser

Saturday, April 22, 2006, 14:47 — 0 comments

Screenshot of the AMM browser

First MacOS X impressions

Thursday, April 13, 2006, 06:22 — 0 comments

I'm just in the process of getting used to MacOS and my new MacBook. The MacBook itself simply rocks - it's fast, looks good, runs acceptably long (3.5 hours or something), etc.

In MacOS, some things are different in a pretty strange way; I definitely need to get used to that. Plus I don't know some of the very basic things; I'm just learning how to install applications, especially Java applications, atm. Something pretty difficult for me is text editing, for example. On Windows and Linux there are text-editing shortcuts that are supported virtually everywhere. I guess I just don't know these on MacOS yet, though I've also already noticed some differences between e.g. text editing in SubEthaEdit vs. Eclipse.

Regarding "looks good": many people say MacOS is extremely beautiful because of the window theme or the animated effects etc. I don't think they are right - the major difference is font rendering. On MacOS all text looks really good: proper anti-aliasing where you want it, no anti-aliasing if the font is too small, etc. I don't know exactly how they do it, but the effect is that everything looks very professional and is very usable. Nice.

MacBook shipping time

Wednesday, March 22, 2006, 17:39 — 0 comments

I ordered one of those nice MacBook Pros, the 2 GHz variant, with an additional GB of RAM (making it 2 GB), the 7200 rpm hard drive, and everything in English (keyboard/OS). The webpage initially said it was going to take 3-5 days to ship. Now the shipping information says it's going to be sent on April 18th, making it roughly 4 weeks.

I feel like a child who got its candy stolen …

UPDATE: 4 weeks in Apple time are significantly shorter than expected:

Verzonden op Mar 27, 2006 via TNT International Express

… ‘Verzonden’ is the Dutch equivalent of ‘sent’.

Functional programming

Tuesday, February 14, 2006, 17:24 — 0 comments

Some articles/books about functional programming I came across:

Probably nothing new to the reader, I just note them so I won’t forget them.
