July 14, 2004, 8:35 p.m. —
When I checked my email today after ignoring it for two days, I first thought there was a problem with the MonoDevelop mailing list. There were 47 new emails from the last two days, which is a lot more than usual.
It wasn't a misconfiguration of the mailing list tool, though, but a discussion about MonoDevelop's license and possible compatibility issues with SharpDevelop's license.
MonoDevelop is an IDE developed for UNIX-like platforms (afaik mainly used on Linux & Mac OS X), and it's based on SharpDevelop to quite some extent. SharpDevelop is an IDE running under Mono and MS .NET, but it's currently limited to Windows because of its dependency on Windows.Forms.
The discussion started with Todd Berman, one of the main MonoDevelop guys, announcing that he will license all his own contributions under the MIT X11 license as opposed to SharpDevelop's GPL policy. The MIT X11 license is a permissive BSD/Apache-style license which allows free and non-free software to be mixed.
The SharpDevelop guys (namely Christoph Wille) then remarked that this is not possible, as the MonoDevelop modules depend on and extend GPLed SharpDevelop code. This effectively means that the authors cannot freely choose the license of their own code because of the GPL's requirements for derived works. This annoyed some of the MonoDevelop guys, as they wanted a more liberal license, especially to allow third parties to provide non-free plugins for MonoDevelop. One of the major developers, John Luke, decided to let go of MonoDevelop because of that.
I'm not really sure what the definitive conclusion of the rather interesting debate was, but in the end (or at least as things currently stand) Todd Berman announced that he will relicense his contributions to be dual licensed under MIT X11 and the GPL. When they are distributed together with SharpDevelop parts, they have to be licensed under the GPL, though.
What does this mean for other developers? Be careful about license issues with your Open Source project. No one wants to struggle with strange legal issues instead of writing cool code. Especially when including source code from other projects, you might irrevocably bork up your project in terms of what you can use it for.
After all, I don't really understand the position of the SharpDevelop guys; they do not seem very happy with the existence of MonoDevelop. Otherwise they might go ahead and try to find a compromise, e.g. LGPLing the plug-in interfaces or the parts MonoDevelop depends on. I'm not a lawyer, but this might make non-GPL plug-ins and enhancements possible.
It's their source code and their copyright, so they can of course freely decide what should be possible and what shouldn't. But I don't consider it a good idea to enforce the use of your favorite license on other people. It doesn't sound very "free" (as in freedom, not beer; German "frei") either.
June 11, 2004, 5:09 p.m. —
Today I received an email from my friend and fellow student Alexander Klimetschek. He is developing a system called DocSynch, which allows multiple people to work synchronously on one document. It's implemented via an IRC protocol, and there's a working plugin for the open source Java editor jEdit.
While this seems to be completely unrelated to my current XML database project, the scientific background of collaborative editing is probably very interesting. Multiple users editing one semi-structured hierarchical document is exactly what an XML database is supposed to provide. I have to check on that ...
June 10, 2004, 11 a.m. —
Today I googled for a mirror site of http://www.xmldb.org (which seems to be down) containing more information about the XML:DB API. One would think this shouldn't be all too hard, but as Google removes all non-text characters from queries (even if they are quoted), this becomes "XML DB". And apparently everyone doing anything XML-related with databases - including peeps storing their adressbook.xml in MySQL BLOBs - is calling it XMLDB or DBXML.
This is also always funny when discussing such things with other developers.
Is it so hard to find a name for a tool that is not a mere description of the technologies used? Giving things an unambiguous name is not only marketing, it's really necessary.
June 7, 2004, 5:37 p.m. —
I'm using Gentoo Linux, which is great because it builds packages directly from source. This means you get packages optimized for your machine, and most packages get updated quite often. Lately I found out that this is also something I find really annoying.
I just typed a rather common command for me:
# sudo emerge -vpuU world
which means "search for updates to all software I have installed manually" (i.e. not as a dependency). This is what it spits out:
[ebuild U ] app-office/openoffice-ximian-1.1.59 [1.1.57] +gnome -kde -ooo-kde 1,013 kB
[ebuild U ] net-www/apache-2.0.49-r3 [2.0.49-r1] +berkdb -doc +gdbm -ipv6 -ldap +ssl -static +threads 0 kB
[ebuild U ] dev-util/eclipse-sdk-3.0.0_rc1 [3.0.0_pre8-r2] +gnome +gtk -jikes -kde +motif +mozilla 0 kB
[ebuild U ] sys-apps/module-init-tools-3.0-r2 [3.0] -debug 0 kB
[ebuild U ] sys-apps/baselayout-1.9.4-r2 [1.8.12] -bootstrap -build -livecd -(selinux) -static 0 kB
The important lines are those with eclipse and openoffice. I would like to keep up to date, but I don't like compiling the whole of Eclipse or even OpenOffice for an "m8 to m9" or some similarly small step. Especially as they are Java packages and compiling by hand doesn't give you anything.
While I also like the configuration system and baselayout of Gentoo I think I would install Debian on the next machine I set up.
May 27, 2004, 4:34 p.m. —
I've come to think a little bit more about the way eXist stores XML. Actually it's more what I imagine it's supposed to be because this is only based on what Lars Trieloff told me about the phantom nodes they use in XML trees - I am too lazy to really look up how they do it.
If you insert phantom nodes and number your XML nodes as mentioned before, you can store a whole XML tree without any pointers within the nodes. You just have to maintain a table, indexed by node number, that contains references to the node objects. This table could also contain fields for locks and maybe even Access Control Lists.
This is generally a Great Thing (tm), as it leads to nodes with a fixed size, which would make storage a lot easier and faster. I'm not yet really sure if it's a good idea, because you need to take care of the attributes of elements and the text within text nodes. This could be done by storing only references to them within the nodes themselves, but that would mean another dereference when doing comparisons within XPath/XQuery queries.
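A minimal sketch of how such a pointer-free numbering could work. It assumes a level-order numbering over a complete k-ary tree with the root numbered 1, which is my guess at the scheme rather than a description of what eXist actually does. The names `parent_id`, `child_id` and `Entry` are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


def parent_id(node_id: int, k: int) -> Optional[int]:
    """Number of the parent node, or None for the root (assumed to be 1)."""
    if node_id == 1:
        return None
    return (node_id - 2) // k + 1


def child_id(node_id: int, k: int, j: int) -> int:
    """Number of the j-th child, with j counting from 1 to k."""
    return k * (node_id - 1) + 1 + j


@dataclass
class Entry:
    """One table row: the node object plus optional lock and ACL fields."""
    node: str
    locked: bool = False
    acl: list = field(default_factory=list)


# The table is keyed by node number. Numbers without a row are phantom nodes.
# With k = 2, node 3 has the child numbers 6 and 7; only 6 is a real node here.
table = {
    1: Entry("book"),
    2: Entry("title"),
    3: Entry("chapter"),
    6: Entry("para"),
}

# Navigation needs arithmetic only, no pointers stored inside the nodes.
parent_of_para = table[parent_id(6, 2)].node
```

Parent and child lookups are pure arithmetic on the number, which is why the nodes themselves can stay fixed-size.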
Regarding the table of nodes, there are two styles to keep it. The first one can be considered somewhat of a "dense" index. This would mean one entry for one node, even if it's a phantom node. The access to this table would be very fast - O(1). The drawback is a potentially massive overhead if your XML tree is very unbalanced. Imagine a tree that is generally very thin but has one node with several hundred children. The array would get extremely big because of all the phantom nodes.
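To get a feeling for the overhead, here is a rough sketch that counts the slots a dense table would need. It assumes every level is padded with phantom nodes up to the largest fanout found anywhere in the tree; `dense_table_size` is a hypothetical helper, not part of any real library.

```python
def dense_table_size(fanout: int, depth: int) -> int:
    """Number of slots in a dense table for a complete tree.

    `fanout` is the largest number of children any node has, `depth` is the
    number of levels below the root. Every level is padded with phantom nodes.
    """
    return sum(fanout ** level for level in range(depth + 1))


# A thin tree: at most 2 children per node, 3 levels below the root.
thin = dense_table_size(2, 3)        # 1 + 2 + 4 + 8 = 15 slots

# The same depth, but one node somewhere has 300 children.
bushy = dense_table_size(300, 3)     # 1 + 300 + 90000 + 27000000 slots

# The lookup itself is a plain list access by node number, hence O(1).
table = [None] * (thin + 1)          # slot 0 unused; None marks a phantom node
table[1] = "root"
```

A single wide node blows the table up from 15 slots to about 27 million under these assumptions, almost all of them phantom entries.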
The other style would be a sparse index. The table would only contain entries for non-phantom nodes, even though phantom nodes are still counted when numbering. This index would be slower, as an entry cannot be addressed directly by its number. If it is implemented as an ordered tree structure, accesses would be something like O(log n) - in which case the storage tradeoff for a pointered node tree might be less bad. Keep in mind that frequently traversed nodes would be in main memory all the time anyway, so disk I/O should be rare. Tree management shouldn't be a big issue, as the complete index has to be rebuilt anyway if something gets changed.
The dense index would be great, but the space tradeoff might be heavy. The sparse index might be a bad idea too - every access to a node would be O(log n), which is a lot compared to simple pointer dereferencing in a DOM tree. Sounds like this should be user configurable, as only the user can know how unbalanced their trees might get.
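A sparse index could be sketched like this. Instead of an ordered tree it uses a sorted list with binary search, which gives the same O(log n) lookup and is good enough if the index gets rebuilt on every change anyway. `SparseIndex` and its methods are illustrative names only.

```python
from bisect import bisect_left


class SparseIndex:
    """Keeps entries for real nodes only; phantom numbers are simply absent."""

    def __init__(self, entries):
        # entries: iterable of (node_number, node) pairs for real nodes
        pairs = sorted(entries, key=lambda pair: pair[0])
        self._numbers = [number for number, _ in pairs]
        self._nodes = [node for _, node in pairs]

    def get(self, number):
        """Binary search for a node number; returns None for phantom nodes."""
        pos = bisect_left(self._numbers, number)
        if pos < len(self._numbers) and self._numbers[pos] == number:
            return self._nodes[pos]
        return None

    def __len__(self):
        return len(self._numbers)


# Only four real nodes are stored, although the numbering goes up to 6.
index = SparseIndex([(6, "para"), (1, "book"), (3, "chapter"), (2, "title")])
chapter = index.get(3)
phantom = index.get(4)
```

The storage is proportional to the number of real nodes, and each lookup pays for that with a search instead of a direct array access.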
May 25, 2004, 12:17 p.m. —
I'm currently doing research on how to store XML with object-oriented means. It's of course rather trivial to store XML somehow. The interesting question is which system makes most sense if the XML is to be queried using XPath or XQuery expressions.
Just to write down some of the ideas:
- Standard DOM - not really efficient because of the awful lot of pointers. If one element is changed, lots of others have to be updated too.
- AST/TA developed at the Humboldt University in Berlin - interpret XML nodes as a tree-structured index above a simple text array with the contents of the XML file. This can probably be implemented quite efficiently using a doubly linked list of strings or whatever.
Might be useful if your XML application is very text-centered, e.g. true XHTML files where the markup can be considered eye-candy.
- eXist style with pseudo nodes inflating every partial tree to be symmetric. This makes sibling queries very easy, but every update requires a full rebuild of the tree.
A lazy update of pointer lists might also be nice. Every XML node keeps a list of pointers to the nodes on each of its axes (siblings, predecessors, children, successors etc.), which is kind of virtual: it's only created when needed, and it's refreshed if certain timestamps (which have to be kept somehow in the data dictionary) have changed.
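A rough sketch of the lazy idea for one axis (siblings). It assumes a simple version counter on the parent in place of real timestamps in a data dictionary, and the `Node` class is purely hypothetical.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []
        self.version = 0                 # bumped whenever the child list changes
        self._siblings = None            # cached sibling list, built on demand
        self._siblings_version = None    # parent version the cache was built at

    def append(self, child):
        """Attach a child and mark this node's child list as changed."""
        child.parent = self
        self.children.append(child)
        self.version += 1
        return child

    def siblings(self):
        """Return the sibling list, rebuilding it only if it is stale."""
        if self.parent is None:
            return []
        stale = self._siblings_version != self.parent.version
        if self._siblings is None or stale:
            self._siblings = [c for c in self.parent.children if c is not self]
            self._siblings_version = self.parent.version
        return self._siblings


root = Node("root")
a = root.append(Node("a"))
b = root.append(Node("b"))

first = a.siblings()          # built on first access
again = a.siblings()          # served from the cache, same list object
c = root.append(Node("c"))    # bumps the version, so the cache is now stale
names = [n.name for n in a.siblings()]
```

Nothing is computed for nodes whose axes are never asked for, and an update only costs a counter increment until somebody reads the list again.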
April 15, 2004, 9:01 p.m. —
I found this article comparing C# with Java in the Gentoo forums. It promises to be quite interesting, though I haven't had the time to read it yet - I should read it tomorrow.
April 11, 2004, 8:59 p.m. —
I have attended a lecture called DISCOURSE, which is sponsored by Microsoft and aims at promoting the .NET framework. It was quite interesting, and a friend of mine, Lars Trieloff, and I wrote a sample C# .NET application for the university afterwards.
It's a Visual Tail -f using .NET. The application core (I know it's ridiculous to use that term in conjunction with about 2500 LOC) is portable in terms of running under MS .NET as well as Mono. We also have GUIs for both Windows.Forms and GTK#.
I wonder if it might be of any use except to us for learning the language (and getting some ECTS credits at the Uni). At least the syslog tool included in GNOME seems to suck: it doesn't support reading from e.g. a remote SSH session or via sudo.
April 8, 2004, 9:39 p.m. —
Well, I have finally managed to create a weblog for myself. Maybe it's of some use; otherwise I have at least learned how to do this ;-)