Martin Probst's weblog

Registration process from hell

December 8, 2005 at 23:02 #

I just had to sign up for a new ICQ account, as my old account somehow got lost. I suspect someone took it over as the password was weak, but who knows. I have not used any ICQ product beyond the network service for over 8 years, using various other clients instead, as ICQ itself is just too much torture. Their website just reminded me how awful ICQ can be.

First of all, all of their websites contain at least 3 blinking, jumping and sometimes noisy Flash ads. Apparently they don't care if potential customers die of an epileptic seizure before finishing registration. Then the website contains incredible amounts of garbage, but nowhere are the things people actually might want to do: get an ICQ number, log in to an account on the webpage.

After finding the registration form, you'll first be surprised that "only" a nick, an email address and security related stuff are needed. Of course they have a captcha. After filling out the page, it returns, stating the password must be 6-8 characters and may include some special chars. I re-filled out the page with password and captcha three times until I realised that they actually limit passwords to 8 characters. Everything else would be too secure or what?

The answers to the special questions are also a pain: again, at least 6 characters. What if my answers are shorter? Plus, why do I actually need this, if I have an email address to come back to? It's not as if I'd forget my email address but remember a 10 digit number ... and every time you type something wrong on that page, you get another chance at the captcha plus re-entering your password twice.

If a company has been doing online stuff for well over 10 years, how did they manage to learn nothing about how to do it right? Just a tiny bit better than the average PHP coding "my homepage" guy?


XQuery too complex?

December 6, 2005 at 09:22 #

I keep reading that XQuery is 'too complex', e.g. from Uche Ogbuji on xml-dev or in this article. That may be true if you're implementing an XQuery engine, but the surprising thing is that this is mostly claimed by XQuery users (not implementors!).

I kind of wonder why. Sure, the spec took much too long, it's quite complicated, and with all its references to static typing, XML Schema etc. it can be quite scary. But which user reads the spec? And why should they?

What you have to know about XQuery are three things:

  • XPath is a part of the language, it directly works
  • XML is part of the language, just type ahead
  • The FLWOR statement is useful for XML aggregation

What you don't need to know (but may find interesting) is: there is a lot more to XQuery (functions, typing, ...), but it won't get in your way unless you explicitly ask for it. The whole syntax a newbie who knows XPath and XML, e.g. the typical XSLT 1.0 user, has to understand is this:

  let $foo := "bar"
  for $x in doc('a.xml')/my/xpath[@attr = $foo]
  for $y in doc('b.xml')/more/xpath
  where $x/@id = $y/@id
  order by $x/name
  return
    <newtag>{ $x/name, $y/address }</newtag>

There's a lot more out there, but this is everything you really have to know. for loops and let statements to aggregate XML, where for easy-to-use joins, XPath to actually get the data, XML literals to group it together. If that's too complex for you, I don't know. Anyone who was able to twist his mind around XSLT should be able to use that. Plus, in the areas where XSLT 1.0 fails horribly (strings, grouping, ...) XQuery provides the functionality without forcing you to write 20 lines of stylesheet code for "replace($haystack, $needle, $replacement)". This is just a subset of what XQuery has to offer, but it's the most important part, and it's the part that gives the user direct, large benefits over other XML technologies (XSLT, XPath) in querying XML.
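For readers who think more easily in a general-purpose language, here is a rough Python sketch of what the query above computes. The two documents are made-up stand-ins for a.xml and b.xml (element contents, ids and addresses are assumptions for illustration), and the function name join is hypothetical. It only shows the nested-loop join semantics, not how an XQuery engine would actually evaluate the query.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for a.xml and b.xml.
A_XML = """<my>
  <xpath attr="bar" id="2"><name>Zoe</name></xpath>
  <xpath attr="bar" id="1"><name>Adam</name></xpath>
  <xpath attr="other" id="3"><name>Skipped</name></xpath>
</my>"""

B_XML = """<more>
  <xpath id="1"><address>1 Main St</address></xpath>
  <xpath id="2"><address>2 Side St</address></xpath>
</more>"""


def join(a_xml, b_xml, foo="bar"):
    """Return a list of <newtag> elements, mirroring the FLWOR query."""
    a_root = ET.fromstring(a_xml)
    b_root = ET.fromstring(b_xml)
    pairs = []
    # for $x in doc('a.xml')/my/xpath[@attr = $foo]
    for x in a_root.findall("xpath"):
        if x.get("attr") != foo:
            continue
        # for $y in doc('b.xml')/more/xpath
        for y in b_root.findall("xpath"):
            # where $x/@id = $y/@id
            if x.get("id") == y.get("id"):
                pairs.append((x, y))
    # order by $x/name
    pairs.sort(key=lambda pair: pair[0].findtext("name", default=""))
    result = []
    for x, y in pairs:
        # return <newtag>{ $x/name, $y/address }</newtag>
        newtag = ET.Element("newtag")
        newtag.extend(copy.deepcopy(x.findall("name")))
        newtag.extend(copy.deepcopy(y.findall("address")))
        result.append(newtag)
    return result


for element in join(A_XML, B_XML):
    print(ET.tostring(element, encoding="unicode"))
```

The Python version needs explicit loops, a sort call and manual element copying for what the FLWOR expression says in seven lines.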

The working group took a long time, actually too long, but this is something they really did right. There is a subset, a 'core' as Uche Ogbuji calls it, that gives all the important benefits and is really easy to learn.


Gentoo to Ubuntu

November 20, 2005 at 23:53 #

Lars:

Ubuntu is an ancient african word, which means : "I'm sick of compiling Gentoo all the time" -- Jeff Waugh

Maven 2

October 22, 2005 at 11:38 #

I wanted to try out Apache Maven 2 today. So I started off with the tutorial and created a default project. After some playing with targets and plugins, a connection to http://repo1.maven.org/maven2 timed out and Maven reported that this repository had been blacklisted.

And that's it. I grepped through all the files in ~/.m2; lots of POM files mention the given URL, but nothing looks like blacklisting. So basically the tool errored out and I don't have any clue how to fix it. Plus, there is of course no documentation about this "feature" at all. Every attempt to download a plugin or a dependency now fails.


Java 6 Preview

October 21, 2005 at 23:44 #

Lars Trieloff is happy that Java 6 Swing picks up GTK themes. While that is certainly nice, what I like a lot more is that Java 6 runs a certain native XML database about 15% faster in our benchmarks.

Of course it's quite hard to tell why, but it seems that Java 6 is bringing several new optimisations to the HotSpot virtual machine, most notably the ability to have small, local objects created on the stack, and not on the heap. That saves general allocation overhead plus the time needed for GC, so this may or may not be a reason. Anyway, it's cool, and quite surprising that they can still get such big improvements.


Atom XML Schema

October 8, 2005 at 20:53 #

Does anyone know of an up-to-date XML Schema definition for Atom 1.0? I only found this one, which is nice, but it doesn't fit the current spec very well, and I'm too lazy to fix it ;-). There must be a (non-RELAX NG) schema out there, or not?

I'm currently playing around with the Atom Publishing Protocol, Atom itself and this idea of an Atom-based web storage facility. I'm not completely convinced that it's useful; I mainly wanted to see how hard it would be to implement something like that using X-Hive/DB. Or maybe it's just that XQuery is being finalized, and I need a new quick moving target to complain about changes in the spec ...

Speaking of that, we just released X-Hive/DB 7.0, which is really cool. I will probably write some stuff about it later, when the website is properly updated.


Collaborative Editing with Gobby

September 28, 2005 at 12:47 #

There is a text editor for Macs called SubEthaEdit, allowing multiple users to edit files collaboratively. Quite cool, but while the editor is free you have to get yourself at least the smallest hardware dongle at $500 (Mac mini).

Now there is Gobby, an editor doing roughly the same for Linux, Windows and Mac. I just tried it out and it does work on a local machine. Unfortunately I didn't have a second box to try the advertised Zeroconf support etc., but it looks very promising!

Now all we need is a generic protocol for all realtime collaborative editors ...


Media-less Linux installation

September 7, 2005 at 14:11 #

Install Linux without any media. If I had known of this slightly earlier (I don't know how long it has existed, though) it would have saved me a lot of trouble. Installing Linux on a ThinkPad X30 without any external drive can get quite difficult.

When installing Gentoo on it I managed to get there by booting a kernel which had its root filesystem on an NFS share on a second box. It works, but setting up the server is quite a hassle. Plus you learn a lot of things about tftp, NFS etc. that you really never wanted to know.

When installing Ubuntu I found out that you just need to have the kernel + initrd. I formatted my USB key, marked the primary partition as bootable using fdisk, installed grub and the Ubuntu kernel on it, and it actually worked, pulling the whole installer from the net. Except that sometimes my wireless LAN card was recognized in the installer, sometimes not. This probably works better by now.

The method described by Marc Herbert seems a little more difficult than the USB key approach, but if you don't have a Linux system to set up the key drive or don't have a notebook that supports booting from key drives, it's definitely the way to go.

[via Ben Maurer]


Java Unit Test Coverage

September 4, 2005 at 11:16 #

I spent one and a half days last week setting up a Java unit test code coverage system. This was somewhat surprising to me; I don't think something like that should take that long. The major problem was the state of the available tools. I wanted to find out first whether any usable open source tools exist, so I avoided Clover, JCover & Co. Instead I tried:

* jcoverage GPL - http://www.jcoverage.com/
  Doesn't work with Java 1.5, not updated for ages.
* Quilt - http://quilt.apache.org
  Doesn't work, not quite sure why. Not updated for ages.
* ucovered - http://jxcl.sourceforge.net/
  Doesn't work, Javadoc even states the error in question, but seems to be abandoned, too.
* cobertura - http://cobertura.sourceforge.net/
  Does work (hurray!) and seems to be in active development. Creates really nice coverage reports, but has quite some overhead problems. Even with some tricks the overhead seemed to be something like factor 4 to factor 5, and that just doesn't work if you have a testsuite of >10,000 tests that already takes several hours.
* EMMA - http://emma.sourceforge.net/
  This is what we are running now. EMMA doesn't test real line coverage, but rather code block coverage. A code block is some Java code that is not broken up by flow logic but rather a simple sequence of statements. I'm not sure if it's because of that, but EMMA is a lot faster than cobertura. The drawback is that the results are not displayed as nicely.
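To make the notion of a code block a bit more concrete, here is a toy Python sketch that groups a flat list of statements into blocks. It is only an illustration of the idea, not how EMMA itself is implemented; the statement format, the PROGRAM listing and the function name basic_blocks are all assumptions made up for this example. Each statement carries two flags: whether control flow can jump to it (it starts a new block) and whether it branches (it ends the current block).

```python
def basic_blocks(statements):
    """Group (text, is_jump_target, is_branch) tuples into basic blocks."""
    blocks = []
    current = []
    for text, is_jump_target, is_branch in statements:
        # A jump target always starts a fresh block.
        if is_jump_target and current:
            blocks.append(current)
            current = []
        current.append(text)
        # A branch always ends the block it belongs to.
        if is_branch:
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks


# Hypothetical flattened loop that sums the numbers below n.
PROGRAM = [
    ("total = 0", False, False),
    ("i = 0", False, False),
    ("loop: if i >= n goto done", True, True),
    ("total = total + i", False, False),
    ("i = i + 1", False, False),
    ("goto loop", False, True),
    ("done: return total", True, False),
]

for number, block in enumerate(basic_blocks(PROGRAM), start=1):
    print(number, block)
```

In this toy listing, seven statements collapse into four blocks, so a block-based tool would need fewer probes than a statement-based one. Whether that explains the speed difference between the two real tools is a guess.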

So now we have something that is somewhat working. Somewhat because I ran into (presumably) a bug in Ant (1.6.3) where custom junit task result formatters don't get their extension passed along if the <junit/> task is set to forkmode='once'. This currently makes it impossible to view the results of the unit tests if they are run with code coverage enabled, and that makes it quite difficult to hunt down errors. I still have to check if that bug is fixed in a later version of Ant.

The forkmode='once' also led to quite a number of errors on our side, as our test machinery relies on static class fields in several places, and those might be set to something wrong after a test. That's probably an error on our side, but annoying nonetheless. The forkmode='once' is necessary though, as anything else slows down the testing horribly.
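The failure mode is easy to reproduce in miniature. The sketch below uses Python's unittest instead of JUnit, with a class attribute standing in for a Java static field. Settings, LeakyTest, IsolatedTest and run_case are hypothetical names and not taken from the actual test suite. When all tests share one process, whatever the first test leaves in the shared field is what the next test sees, unless each test restores it.

```python
import unittest


class Settings:
    # Shared by everything in the process, much like a Java static field.
    locale = "en"


class LeakyTest(unittest.TestCase):
    def test_changes_locale(self):
        # Modifies shared state and never restores it.
        Settings.locale = "de"
        self.assertEqual(Settings.locale, "de")


class IsolatedTest(unittest.TestCase):
    def setUp(self):
        # Remember the current value and restore it after the test.
        self.addCleanup(setattr, Settings, "locale", Settings.locale)

    def test_changes_locale(self):
        Settings.locale = "fr"
        self.assertEqual(Settings.locale, "fr")


def run_case(case):
    """Run one TestCase class in this process and return the result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case)
    result = unittest.TestResult()
    suite.run(result)
    return result


run_case(IsolatedTest)
print("after IsolatedTest:", Settings.locale)
run_case(LeakyTest)
print("after LeakyTest:", Settings.locale)
```

Running every test class in its own process hides the leak, which is presumably why such errors only show up once the tests share a single JVM.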

In the end coverage testing is quite nice, and the results are not as horrible as I expected. In most packages we have a coverage of over 90%. Most of the untested code is in generated classes. I presume most of it is untestable and not used at all. Code coverage in terms of lines or blocks is of course a very bad criterion for test completeness, and path coverage wouldn't be that much better either, but it can at least give you good pointers to areas that are under- or untested. Another step to better software development ;-)

PS: Another plus for EMMA is that it's self contained, only two jars, as opposed to other projects which require 6-8 libraries to be on your classpath. This is generally just a little more work to do when setting up, but wait until tool A requires a different version of a library than tool B. DLL hell for Java, but that's another story ...


BOM of death

August 4, 2005 at 16:40 #

Note to self: next time you get really strange XML parse and comparison errors, try running this before looking at XML and Java files, cursing at XSLT, JUnit, Eclipse & the world in general for an hour:

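find . -type f -not -path '*/.svn/*' -print0 | xargs -0 sed -i.from-bom '1s/^\xEF\xBB\xBF//'

(Unix shell one-liner, assuming GNU find and sed, to remove UTF-8 byte order marks from the start of all files below the current directory. It leaves a .from-bom backup copy next to each file.)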

Afterwards start cursing about Notepad, Windows, and Microsoft's use of the BOM in general.
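If GNU sed is not at hand, roughly the same cleanup can be sketched in Python. This is a minimal sketch under a few assumptions: the function names are made up, only a BOM at the very start of a file is removed, .svn directories are skipped, and unlike the shell version no backup copies are written.

```python
import os

UTF8_BOM = b"\xef\xbb\xbf"


def strip_bom_bytes(data):
    """Return data without a leading UTF-8 byte order mark."""
    if data.startswith(UTF8_BOM):
        return data[len(UTF8_BOM):]
    return data


def strip_bom_in_tree(root):
    """Rewrite every file below root that starts with a UTF-8 BOM.

    Returns the list of paths that were changed.
    """
    changed = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into Subversion metadata directories.
        dirnames[:] = [name for name in dirnames if name != ".svn"]
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            with open(path, "rb") as handle:
                data = handle.read()
            cleaned = strip_bom_bytes(data)
            if cleaned != data:
                with open(path, "wb") as handle:
                    handle.write(cleaned)
                changed.append(path)
    return changed
```

Calling strip_bom_in_tree(".") would then process the current directory and report which files it touched.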