Martin Probst's weblog

Safari auto-completion magic?

Tuesday, October 17, 2006, 06:42 - 0 comments

A few seconds ago I filled out a form on some website that had one of those (not really) Turing tests. In this case the question was: “Math question, take (5 - 1) and divide by two”.

To my great surprise, when I entered my name, Safari's auto-completion kicked in and filled out the rest of the fields (telephone etc.), including the test question, which it answered with “2”!

Probably totally random luck, as this auto-completion feature is actually quite buggy: I find myself deleting wrongly filled-out entries more often than using it. But I was really shocked :-)

Programming language features

Friday, October 13, 2006, 09:49 - 0 comments

Joel Spolsky reviews “Beyond Java”:

Programming consists of overcoming two things: accidental difficulties, things which are difficult because you happen to be using inadequate programming tools, and things which are actually difficult, which no programming tool or language is going to solve. An example of an accidental difficulty is manual memory management, e.g. “malloc” and “free,” or the singleton classes people create in Java because they don’t have top level functions. An example of something which is actually difficult is dealing with the subtle interactions between different parts of a program, for example, figuring out all the implications of a new feature that you just added.

I’m not so sure this is correct; at least the example is quite bad. Managing the subtle interactions between different parts of a program can be made a lot easier by programming language features. That, and particular programming styles (aka patterns), but most of those are only enabled by programming language features.

For example, the Model-View-Controller pattern eases managing the interaction between parts of your application, and it’s nearly impossible without an object-oriented programming language. Or encapsulation of data types, which reduces the impact that modifying the actual (physical) data representation has on the rest of the program. Or polymorphism, which eases adding new data types.
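To make the encapsulation and polymorphism points a bit more concrete, here is a minimal sketch in Python. The names `Shape`, `Circle`, `Square` and `total_area` are invented for this illustration and not taken from any particular library.

```python
import math


class Shape:
    """Common interface: callers only rely on area()."""

    def area(self) -> float:
        raise NotImplementedError


class Circle(Shape):
    def __init__(self, radius: float) -> None:
        # Encapsulated representation: it could be changed to store a
        # diameter instead without touching any caller of area().
        self._radius = radius

    def area(self) -> float:
        return math.pi * self._radius ** 2


class Square(Shape):
    def __init__(self, side: float) -> None:
        self._side = side

    def area(self) -> float:
        return self._side ** 2


def total_area(shapes) -> float:
    # Polymorphism: works for any Shape subclass, including ones added later.
    return sum(shape.area() for shape in shapes)
```

Adding a new shape means adding one class; `total_area` and every other caller stay untouched, which is exactly the kind of "subtle interaction" the language feature takes off your hands.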

I think this list goes on, and also spreads into other areas of programming. For example, generators (“yield”) in Python.
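A small sketch of what a generator buys you, assuming a made-up `key = value` line format (the function name `read_records` is hypothetical): the parsing loop is written once, and callers pull records one at a time without the function ever building a full list.

```python
def read_records(lines):
    """Yield one (key, value) pair at a time instead of returning a list."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        yield key.strip(), value.strip()


# Any iterable of lines works: a list here, an open file in real use.
config = dict(read_records(["# settings", "name = demo", "", "depth=3"]))
```

Without `yield`, the same separation between "how to walk the input" and "what to do with each record" needs an explicit iterator class or a callback.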

Programming language features enable us to use/invent new patterns, which in turn make programming easier. There are hard problems and they will probably stay hard, but certain programming techniques can make them at least a bit easier (e.g. look at the new java.util.concurrent libraries for threading).

Time for new HTML Validator?

Wednesday, August 16, 2006, 09:35 - 0 comments

Sam Ruby (author of the excellent feed validator) asks this question.

I think the problem is not really that the validator fails to detect bugs; it’s more that nobody really cares to look at it. I don’t think the situation can be fixed by providing a better validator. There is just no short-term, visible benefit in making pages conformant (except maybe bragging in front of your friends).

Nothing will make people invest time (= money) into fixing their pages if they don’t get anything in return. There is no use case for conforming and valid pages; everything we do today works with invalid pages.

I think the only thing that would make web pages better would be the advent of a new technology that crucially requires valid web pages. However, no one will design such a technology, as everyone knows that all the pages on the web are invalid …

The only thing that can be done is to avoid making the same mistakes with XML and e.g. Atom feeds. Don’t consume incorrect feeds, so that people notice if their feeds are broken. Otherwise we’ll end up with something like what HTML is today: building any app that consumes HTML is unbelievably expensive and tedious, because all of the available HTML is broken. We should avoid this for XML, even if it means a little more expense for the producers at the beginning.
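As a rough sketch of what "don’t consume incorrect feeds" could look like in a consumer, here is a Python example using the standard library XML parser. The function name `entry_titles` and the error message are made up for this illustration. The point is only that a feed which is not well-formed gets rejected loudly instead of being repaired by guesswork.

```python
import xml.etree.ElementTree as ET

# Namespace of Atom 1.0 elements, in ElementTree's {uri}name notation.
ATOM_NS = "{http://www.w3.org/2005/Atom}"


def entry_titles(feed_text: str) -> list:
    """Return the entry titles, or raise ValueError if the feed is not well-formed."""
    try:
        root = ET.fromstring(feed_text)
    except ET.ParseError as err:
        # Refuse the feed outright rather than guessing what was meant.
        raise ValueError(f"rejecting malformed feed: {err}") from err
    return [
        entry.findtext(f"{ATOM_NS}title", default="")
        for entry in root.iter(f"{ATOM_NS}entry")
    ]
```

Note that this only checks well-formedness, not validity against the Atom format itself; a stricter consumer would also check the required elements.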

The opposite of a Virtual File System - the Dumb FS

Wednesday, August 9, 2006, 08:57 - 0 comments

Many development platforms implement a virtual file system. The basic idea is that you abstract away the actual location, access methods etc. of files to provide a coherent view to applications on top of that. These virtual file systems then provide some sort of hooks or plugin mechanism to allow others to extend this VFS.

Now the great extensibility platform Eclipse takes a surprising approach to this. It kind of goes in the opposite direction and implements a DumbFS. Everything useful needs to be a Resource in Eclipse, and a Resource not only has to be a local file, it also needs to be in a project, which needs to be in the workspace.

But wait, there are non-local resources! See the great level of abstraction! From the JavaDocs of IResource:

* Phantom resources represent incoming additions or outgoing deletions
* which have yet to be reconciled with a synchronization partner.

Did someone say CVS? Anyways, non-local resources have the interesting property that they are like resources, except completely useless because no IO is possible on them. So if you want to work on files, they need to be local, and in a project, in your workspace. Anything else is black magic and evil, so it got abstracted away.

Now, this is kind of sad, especially as Eclipse tries to reach beyond IDE stuff with the Rich Client Platform (RCP), and it’s really sad if you can’t do anything with files in there. But no problem, there is an example of how you create an RCP text editor. Of course, actually using the Text Editor component (part of RCP!) is such a special use case that you will have to do some coding on your own. In the example, this is ~500 LOC, not counting anything Text Editor or application related, only the file access stuff, spread over 3 classes and one XML file (they didn’t expect that people might want to open files, so you have to extend the platform to provide that advanced feature). Oh, and of course you don’t end up with a plain Eclipse text editor: several features won’t work because it doesn’t have its beloved Resources.

I’ve always wondered why it’s so difficult in Java to open and read a file, but these guys play in a whole different league. Provide a text editor component, but then require every user of it to write his or her own file handler. Provide an advanced application development framework, but require anyone who wants to open a file to do some black magic adding an “Open file” command to the platform. Great stuff.

Should developers write their own tests?

Wednesday, August 9, 2006, 07:47 - 0 comments

Dare Obasanjo wonders if having your development team do the testing and operations, too, is actually such a great thing. I’m not sure about operations/deployment; I’ve never worked in a company that does any serious deployment.

However, regarding test teams, the major problem is that a lot of people don’t make a clear distinction between source-level unit tests (as in Test Driven Development), which are something only the developer can do, and the completely different thing called end-to-end tests or functional tests. There you test your application (not the source code!) against a specification, either an explicit, proper one or an implicit, informal one. This is probably best done by real test engineers.
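The difference can be sketched with a toy example in Python. Everything here is hypothetical: a tiny word-counting "application" with an internal helper `word_count` and an entry point `main`. The unit tests know about the helper and its edge cases; the end-to-end test only uses the documented command line and output format. A real end-to-end test would normally drive the installed application from the outside; calling `main` in-process just keeps the sketch short.

```python
import io
import unittest


def word_count(text: str) -> int:
    """Internal helper: the kind of unit a developer tests directly."""
    return len(text.split())


def main(argv, stdin, stdout) -> int:
    """Entry point of the (hypothetical) application; returns the exit status."""
    if argv[1:] != ["--words"]:
        stdout.write("usage: count --words\n")
        return 2
    stdout.write(f"{word_count(stdin.read())}\n")
    return 0


class WordCountUnitTest(unittest.TestCase):
    # Source level: written against word_count() and its edge cases.
    def test_empty_string(self):
        self.assertEqual(word_count(""), 0)

    def test_repeated_whitespace(self):
        self.assertEqual(word_count("a  b\n c"), 3)


class EndToEndTest(unittest.TestCase):
    # Black box: only relies on the specified arguments and output format.
    def test_counts_words_from_stdin(self):
        out = io.StringIO()
        status = main(["count", "--words"], io.StringIO("hello big world"), out)
        self.assertEqual((status, out.getvalue()), (0, "3\n"))
```

Saved as a module, both classes can be run with `python -m unittest`. The unit tests break as soon as `word_count` is renamed or restructured, while the end-to-end test keeps passing as long as the behaviour promised to the user stays the same.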

If you can afford to do both, you should certainly do so. Unit tests are a great development and productivity tool: they make you write better code faster. End-to-end tests are what your customers actually want and need. They ensure the product is working as advertised, so you certainly have to do that, one way or the other.

Telekom rant

Monday, August 7, 2006, 14:27 - 0 comments

I’m currently trying to get a telephone/DSL connection for my new flat in Potsdam. There are three real providers available: Telekom (ex-monopolist), Arcor and Alice. Both Arcor and Alice offer unlimited calls to fixed networks in Germany and unlimited DSL access (6 MBit) for about 50 Euros. Arcor requires a 24-month contract, so I’ll probably go with Alice.

The funny thing is I wanted to compare this with Telekom’s prices, and I just can’t. Their website has five different fixed line contracts, plus three options for those (one with a monthly fee, two with minimum charges), plus four different DSL connections (some with a setup price, some without), some of which include VoIP, plus the DSL flatrate. All of these include some sort of teaser things for their hotspot network and other services.

Plus, there is this brand new thing called T-One, which somehow integrates a mobile phone with a fixed line and DSL. No idea how much that is, or how much calls cost etc., because the prices are apparently secret: you may only see them if you have registered with the Telekom website.

Speaking of the website, Telekom has decided to split up into several business sectors. This means that if you want to get a telephone line and find out how much it will be, you typically visit 3-4 different domains (telekom.de, t-com.de, t-online.de plus various t-com-special-offer.de etc.). So every second click opens a new window. The motivation is probably that the managers of the individual divisions feel more important that way. Completely ridiculous. It seems they just don’t want to have customers. And they even make fun of you: every second tag line on their website says “Easy and fast”. They’ve probably had one hell of a time writing that.

In addition to that (I’m not sure, but this is my guess), it seems that at Telekom I’d end up paying 60-something Euros for an analogue telephone line, a 2 MBit DSL flat and no telephone flat, for an unknown setup price. Now that’s a competitive price.

There is a VoIP product available from GMX, but it requires a Telekom fixed line (requiring setup) and ends up at 51.94 Euros including the telephone flat. So basically it’s about the same price as a normal line with Alice/Arcor, but you also have the hassle of dealing with two companies and the new technology. No thanks.

More proof that XML Schema is A Bad Thing (tm)

Wednesday, July 12, 2006, 09:41 - 0 comments

I’m currently converting the XQuery Update Facility Use Cases into our XML format to use them as preliminary test cases for X-Hive’s implementation.

The use cases include two test documents that have schema information. Both are invalid in my schema editor (<oXygen/> XML editor): one because of the order of elements (attribute definitions after sequence in a complexType), the other because something doesn’t work with qualified element names and “ref”erences to elements in the schema. I’m not an expert on XML Schema (thank God) so I don’t know whether it’s my processor that’s wrong or the documents.

The error messages of the tool were either something like “there is something wrong here” or misleading (“components without a namespace cannot be imported from document ….xsd”, but it’s not at all about the document, apparently), so it’s been quite annoying to fix.

The same document set includes three files with a DTD. All of them work.

X-Hive 8.0 will probably have support for RELAX NG, so there is some bright light at the end of the tunnel.

The ability to say no

Tuesday, June 20, 2006, 07:11 - 0 comments

Some time ago I commented on Uche Ogbuji describing XQuery as too complex. Now Dare Obasanjo writes this:

The lessons listed above seem rather self evident and obvious yet it’s a sad fact of the software industry that the mistakes of CORBA keep getting made all over again. Core XML technologies like W3C XML Schema and XQuery are ‘standards’ without a reference implementation which invented new features by committee instead of standardizing best practice.

I won’t comment on XML Schema, but of course I’ll have to defend XQuery :-)

Dare’s criticism is partially right: XQuery apparently had some problems with the innovation-by-committee thing, which has bloated the spec significantly and delayed it a lot. However, XQuery (and XSLT 2.0 and the related specs) does have an unofficial reference implementation: Michael Kay’s Saxon. It also has a very good test suite (which just got even better thanks to the tests from the KDE guys), and the results show that there are at least three implementations taking very different approaches that manage to cover most of XQuery correctly. XQuery is an existing, working technology with support from major industry players and a lot of implementations.

I think the major problem with CORBA was its huge complexity for the user. This really doesn’t compare with XQuery, which may be complicated to implement but is really trivial to use. Dare quotes a list of important traits a spec should have, originally from Michi Henning:

5. To create quality software, the ability to say “no” is usually far more important than the ability to say “yes.”

This is something XQuery gets exactly right. You can use it directly, without XML Schema, without static, strict or any typing at all, without validation, without functions etc. Every feature is optional to the user and invisible if unused.

It’s quite important to look at XQuery as it is today: use one of the implementations, or at least take a look at the spec. If you’re just looking at the time that has passed since the first WD (which is indeed a legit complaint), you might get the impression of failure all over, but I don’t think XQuery is going to be the next XML Schema.

Eclipse PDE builds

Tuesday, June 13, 2006, 08:02 - 1 comment

Something that really leaves me wondering is the Eclipse Plugin Development Environment (PDE). The folks at Eclipse certainly know how to make a great IDE, they show it with the Java Development Tools, but the PDE?

They provide some nice GUI editors for the various XML files used by plugins, but some very basic things fail, consistently. One of the most annoying things is building plugins. It just never works. There is a long-standing bug that plugins don’t build if they are not in the exact same folder hierarchy, there is apparently a bug where it doesn’t pick up the right versions for feature projects, there is a bug where having a plugin installed in some version conflicts with building it, and there is a bug where changes in one of the projects don’t show up in another if you’ve also got another plugin of that name installed.

Every time I upgrade my feature to a new version, I first change the version number in the GUI editor, then I manually edit the file because it never picks it up, then I try building, which always fails, then I restart Eclipse, and then building usually works. Of course, that is only because I found out about the bugs I mentioned before by painfully searching Bugzilla and implemented workarounds.

Additionally, the current release candidate sprinkles generated build.xml files and temporary folders all over your projects if you’re unwise enough to press that build button. But never think that these build.xml files might be usable with a standalone, plain Ant; they seem to be there only for added user annoyance.

After building the update site, installing the newer version fails mysteriously and silently: everything claims to have worked, but the code running is obviously still the old one, even after a restart.

This is really making Eclipse development a huge pain. Add to that the bad documentation of many things (most of the time you have to browse example code from somewhere on the net), the over-complicated and cryptic API, and I sometimes really wonder why everyone is so enthusiastic about Eclipse’s great extensibility and blah. Maybe it’s just the glamour of the Java IDE in Eclipse that makes people think the foundation of that must be great.

Translation of domain-specific terms - don't do it

Saturday, June 3, 2006, 09:37 - 1 comment

pitosalas from BlogBridge wonders how to translate domain-specific words like "Feed" or "Reading List" to German. I would always opt not to translate the very specific terms like "Feed". You can't come up with a word in German that encompasses the whole meaning, as there is nothing like a feed, yet. "Nachrichtenticker" is in itself an adoption from English (I think), and it's not really spot on. People will have to learn the meaning of "Feed" anyways, so why bother with translating? It's not going to help them.

I would always try to go this way. I heard that early IBM manuals would correctly translate the term "stack pointer" to "Kellerspeicherrücksprungszeiger". This is indeed the right way to translate it: Friedrich L. Bauer called it that when he invented it. But translating it at all is wrong. If I read such a term, it takes me quite some time to translate it back to English and find out what it's supposed to mean.

These non-translatable words are, in my personal opinion, actually a benefit of being a non-English-speaking computer user or programmer. E.g. if someone uses the term "map", I always know he's talking about the data structure, not about an actual road map. The same goes for many words, and the same argument applies: the word "map" or "Karte" doesn't really help you much in understanding the data structure, you have to learn it anyways. This is of course only true if the word is indeed domain-specific and doesn't have a well-known translation.

"Reading lists" is different, as there may actually be a parallel in real life. That means there is a "natural" translation that has a meaning on its own, without any further explanation.

