Martin Probst's weblog

Writing strategy

July 15, 2005 at 22:41 #

Bennaco tells us How to Ruin a Writing Project in 10 Easy Steps. After that, he writes how to really do it, step 1:

1. Decide that you're not going to really "do it". Which is to say, decide that you are not going to approach the whole big, terrifying, thing in one go. Instead, you're going to do some noodling around, some very small, easy, graspable, low-intensity, and non-threatening things, one at a time, until the project gets done.

While he is talking about writing in the literary sense, I feel this applies a lot to programming, too. If you're starting to write a big piece of complex software, do not try to approach the whole big thing at once. Just start writing all the little parts that glue together into the whole.

Start off by dissecting the problem into smaller building blocks. This is the most important step and requires quite some time. Keep dissecting the blocks until you can describe what each block really does in two sentences, without "and then something magic happens". Really figure out how the single parts work together, otherwise you'll be screwed afterwards. Discuss every detail with your team members, if any, to really make sure it works this way.

Then start to write all the smaller parts. Don't start with your "public static void main(String[] args)", but rather with the smaller helper routines, the data model you're working on, conversion etc. If you have a proper development environment, you can test those parts using your favorite unit testing framework. The important thing is not to implement something that does the full job partially, but rather to do a small part completely. Otherwise you will end up with a codebase that is completely cluttered from adding feature after feature without a bigger plan. That results in rewrite after rewrite and lots of bugs, not to forget the maintenance nightmare.
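To make "do a small part completely" a bit more concrete, here is a minimal sketch of one such building block. The class name XmlEscaper and the choice of XML escaping as the example are my own assumptions for illustration, not something taken from a real project. The point is only that the helper is finished and checked on its own, long before any real main() exists.

```java
// Hypothetical helper: one small building block, written completely.
final class XmlEscaper {

    private XmlEscaper() {
        // static helper, not meant to be instantiated
    }

    /** Replaces the five characters that XML has predefined entities for. */
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    /** Throws if the helper does not produce the expected result. */
    private static void check(String input, String expected) {
        String actual = escape(input);
        if (!actual.equals(expected)) {
            throw new AssertionError(input + " -> " + actual + ", expected " + expected);
        }
    }

    // Stand-in for a unit test; in a real project this would live in your test framework.
    public static void main(String[] args) {
        check("plain text", "plain text");
        check("a < b & c", "a &lt; b &amp; c");
        check("say \"hi\"", "say &quot;hi&quot;");
        System.out.println("all checks passed");
    }
}
```

The main() here only plays the role of a test runner, so that the sketch does not depend on any particular testing framework.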

If you just continue to do so, at some point you will start using these components and gluing them together, more or less automatically. At this point, you should be finished with all the low-level stuff and can just put the system together in a bigger sense.

Finally putting the blocks together and seeing how it takes off can be quite rewarding. On the XMLDB project I did at the end of my bachelor studies, the senior developers supervising us recommended that we just get something working as fast as possible. We did not go that way but rather spent 2 of the 7 months on planning. Then we started writing components, testing them and slowly putting them together. The system didn't run a single query until after 4 1/2 months. But at that point, most of the hard work was done and we managed to deliver a working XQuery and XUpdate implementation including a persistent storage backend on time. And that with seven students, of whom one was busy writing a GUI for the server and one was doing documentation, infrastructure and other related work.

I just cited Bennaco's first point, but the rest is quite similar to what I wrote: lots of refinements of an abstract plan until it's really trivial to write the single steps and glue them together.


XML Editor for Eclipse

July 8, 2005 at 11:55 #

I just installed the Eclipse Web Tools Project stuff. It's not that I'm doing web development, but these tools include something I've been looking for for ages:

### A decent XML Editor

Finally. I tried about 8 different tools, open source and commercial alike. All of them sucked in one or more ways - some were merely text editors with highlighting, a lot were simply defunct, and something that not a single one got right was simple editing (proper indentation, proper cursor placement, etc.). The only one that was tolerable was the <oXygen/> editor, but well over $1000 * is far too much if you're just using the XML editor.

It's still a little bit strange to install a full blown web development environment just to get something as basic as an XML editor (shouldn't this be provided by the editing platform by default?), but whatever.

* Update: I stand corrected, <oXygen/> is indeed a lot cheaper. I must have confused it with some other tool. Anyway, judging from a first glimpse, I prefer the WTP XML Editor over <oXygen/>, mainly because editing seems smoother.


Discussion of Apple's RSS extensions

July 6, 2005 at 12:26 #

Sam Ruby asks people to link to the discussion of Apple's RSS extensions on his blog. It's a worthwhile read on how to (and especially how not to) extend existing XML formats.

The topic is quite interesting. I'd be interested in a more general discussion of non-breaking extensions to existing XML formats - it might be worthwhile reading.


Evolution & Spam filtering

June 29, 2005 at 10:50 #

After quite a long and annoying hunt I think I have found out why Evolution refuses to filter spam for me. Evolution uses SpamAssassin as its backend, and SpamAssassin has a certain feature called bayes_auto_learn.

It basically means that everything that gets classified as definitely spam (>15) or definitely not spam (<=0.1) is also automatically sent to train the Bayesian filter.

I really wonder what use this is. If I'm not mistaken, the Bayesian filter will just learn the same rules that are already implemented in SpamAssassin that way.

Apart from that, for me this was a nice bug. When you mark a message as spam in Evolution, it's supposed to train the filter. But the spam I'm getting (advertisements for stock options and such) always gets rated as 0.1 by SpamAssassin and is then automatically trained as not spam. Because SpamAssassin tries to avoid training on the same message multiple times, Evolution would have to call sa-learn with the --forget option to force the message to be trained as spam.

So basically the spam filtering worked, but all the spam I got was automatically trained to be ham, no matter what I clicked. I wish spam filtering in Evolution was as easy and helpful as in Thunderbird...
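The auto-learn rule described above can be sketched in a few lines. This is only an illustration of the behaviour as I described it, not SpamAssassin's actual code: the class and method names are made up, and the thresholds are the ones quoted in this post, so a real installation may well use different values or comparisons.

```java
// Sketch of the auto-learn decision as described in the post (not SpamAssassin source).
final class AutoLearnSketch {

    enum Decision { LEARN_AS_SPAM, LEARN_AS_HAM, NO_AUTO_LEARN }

    // Thresholds as quoted in the post; treat them as assumptions.
    static final double SPAM_THRESHOLD = 15.0;
    static final double HAM_THRESHOLD = 0.1;

    /** Decides what the Bayesian filter is automatically trained with for a given score. */
    static Decision autoLearn(double score) {
        if (score > SPAM_THRESHOLD) {
            return Decision.LEARN_AS_SPAM;
        }
        if (score <= HAM_THRESHOLD) {
            return Decision.LEARN_AS_HAM;
        }
        return Decision.NO_AUTO_LEARN;
    }

    public static void main(String[] args) {
        // The stock spam from the post scores 0.1, so it lands on the ham side.
        System.out.println("score 0.1  -> " + autoLearn(0.1));
        System.out.println("score 5.0  -> " + autoLearn(5.0));
        System.out.println("score 20.0 -> " + autoLearn(20.0));
    }
}
```

With these numbers a message scoring 0.1 is learned as ham before the user ever gets to click "mark as spam", which is exactly the situation I ran into.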


Off to XIME-P

June 15, 2005 at 04:23 #

I'll be off to XIME-P, the International Workshop on XQuery Implementation, Experience and Perspectives. There will be a number of talks about directions and future development of XQuery. I'm especially interested in the upcoming Update language.

Also, I'll be spending 4 days in Baltimore, so I have two free days. Everybody told me Baltimore is not that interesting so I will try to get to Washington and do some tourism.


Beagle

June 10, 2005 at 18:39 #

I've got a new hobby - watching beagle index all the data that has accumulated in my homedir.

The installation on Ubuntu is pretty straightforward, except that some libraries/symlinks don't seem to be created correctly.

$ sudo apt-get install libgsf-cil libgmime-cil libebook1.2-3 
$ sudo ln -s /usr/lib/libebook1.2.so /usr/lib/libebook1.2.so.0

I think there were some more libraries missing, but executing "beagled --fg --debug" will probably tell you about that. Every time it spits out a DllNotFoundException naming some .so or complains about a missing .dll, just install those and everything works fine.

Pretty amazing that it runs so smoothly, at least up till now. Kudos to the developers.

PS: Yes, this is about beagle version 0.0.11.1 for Hoary. 0.0.12 has not been backported yet so I won't install it, even though 0.0.11.1 has serious issues for me (memory consumption with blam! is insane).


External functions in XQuery

June 4, 2005 at 12:20 #

I recently implemented a (IMHO) much handier way to provide external functions to XQueries in X-Hive/DB.

External functions can be declared in XQuery like this:

declare function myfunc($a, $b, $c) external;

In X-Hive, you can now create a statement on an arbitrary XML node, register functions, and execute the query (this is from memory and will probably not compile like that):

  XhiveNodeIf node = ...;
  // Note the trailing spaces: the string pieces are concatenated into one query.
  // Namespace declarations come before function declarations in the XQuery prolog.
  XhiveXQueryQuery statement = node.createXQuery(
    "declare namespace dc = 'http://purl.org/dc/elements/1.1/'; " +
    "declare namespace content = 'http://purl.org/rss/1.0/modules/content/'; " +
    "declare function extract-post($author, $title, $content, $time) external; " +
    "for $item in /rss/channel/item " +
    "return extract-post($item/dc:creator, $item/title, $item/content:encoded, $item/pubDate)");
  // final, so that the anonymous class below can use it
  final ArrayList posts = new ArrayList();
  statement.setExternalFunction(null, "extract-post", new XhiveExtensionFunctionIf() {
    public Object[] call(Iterator<? extends XhiveXQueryValueIf>[] params) {
       // one iterator per function argument
       String author = params[0].next().toString();
       ...
       posts.add(new RSSPost(author, title, content, date));
       return null;
    }
  });
  statement.execute();

While in general you have to be very careful with functions having side effects, this is a pretty handy way to extract Java objects from a given XML source. As long as you do not make any assumptions about the order in which the function calls happen, it should also not break.
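The "no assumptions about the order" part can be shown without X-Hive at all. The following is a plain Java sketch in which everything (PostCallback, Post, runQuery) is a hypothetical stand-in invented for this example and not part of the X-Hive API. The callback collects objects as a side effect, and any ordering the caller needs is imposed afterwards instead of being inherited from the call order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

final class OrderIndependentCollector {

    /** Hypothetical stand-in for an external-function callback (not the X-Hive interface). */
    interface PostCallback {
        void call(String author, String title);
    }

    /** Hypothetical value object, playing the role of RSSPost above. */
    static final class Post {
        final String author;
        final String title;

        Post(String author, String title) {
            this.author = author;
            this.title = title;
        }
    }

    /** Pretend query engine: calls back once per row, in whatever order the rows arrive. */
    static void runQuery(List<String[]> rows, PostCallback callback) {
        for (String[] row : rows) {
            callback.call(row[0], row[1]);
        }
    }

    /** Collects posts through the side-effecting callback, then sorts them explicitly. */
    static List<String> collectTitles(List<String[]> rows) {
        final List<Post> posts = new ArrayList<Post>();
        runQuery(rows, new PostCallback() {
            public void call(String author, String title) {
                posts.add(new Post(author, title)); // the side effect
            }
        });
        // Sort afterwards instead of trusting the order of the calls.
        Collections.sort(posts, new Comparator<Post>() {
            public int compare(Post a, Post b) {
                return a.title.compareTo(b.title);
            }
        });
        List<String> titles = new ArrayList<String>();
        for (Post post : posts) {
            titles.add(post.title);
        }
        return titles;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
            new String[] {"martin", "Beagle"},
            new String[] {"martin", "Apple's RSS extensions"});
        System.out.println(collectTitles(rows));
    }
}
```

Whether the rows are delivered forwards or backwards, collectTitles returns the same list, so the result does not depend on how the engine schedules the function calls.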

There are quite a lot of other projects for converting your XML into Java objects (e.g. Apache XMLBeans or DAX). Using XQuery has the advantage of putting a real XML query language at your disposal for value extraction, and in combination with an XML database you can also handle really large documents very efficiently.


Microsoft good at competing

May 20, 2005 at 06:54 #

Dare Obasanjo writes:

The main problem is that Microsoft is good at competing but not good at caring for customers. The focus of the developer division at Microsoft is the .NET Framework and related technologies which is primarily a competitor to Java/JVM and related technologies. However when it comes to areas where there isn't a strong, single competitor that can be focused on (e.g. RAD development, scripting languages, web application development) we tend to flounder and stagnate. Eventually I'm sure customer pressure will get us of our butts, it's just unfortunate that we have to be forced to do these things instead of doing them right the first time around.

That is probably a very insightful comment. Also, I can't remember Microsoft ever creating a whole new market sector to compete in. Microsoft always seems to enter markets very late, then take over the whole market by producing arguably quite good products after some time, and then not much happens anymore. The stagnation is probably due to the complete lack of any serious competition. Does anyone remember a really innovative feature in MS Office since it left its competition behind?


Visited Countries

May 16, 2005 at 13:31 #

This is cool:


create your own visited country map

[via Daniel Holbach].


Firefox Extensions

May 16, 2005 at 13:02 #

So it seems the Firefox extensions webpage is very smart and checks if you're using the latest Firefox version. Great.

And if you or your Linux distribution somehow suck and do not install the latest Firefox update 10 seconds after it has been released, you are thereby sentenced to the "no extensions" penalty.

Hello? It's nice to add something like that, but a "no I don't want to upgrade, take me to the extensions" button would be quite nice. This somewhat reminds me of the old Windows installers that insisted you reboot your system after installation, no matter what. I can remember using my computer with unclosed but finished installers for longer periods because I didn't want to reboot ...

btw I only ran into this because some extensions must have interfered with each other, and as the net result middle-click-open-tab-in-background stopped working, which is - at least for me - one of the most important features of a tabbed browser there is...

Update: killing all extensions and reinstalling didn't help. Instead, if I uncheck the "Open middle clicked links in background" option in tabbrowser-preferences it works. A magic inverted checkbox. Gnarf.