Martin Probst's weblog

GNOME Online Desktop

Oct. 9, 2008, 6:41 a.m. — 0 comments

GNOME Online Desktop, via Silvan, with screenshots and a tour available at RedHat Magazine.

This basically looks like a nice idea, a step towards really integrating desktop functionality with web-based apps. However, it feels somewhat backwards, with the desktop developers implementing lots of connectors to various web applications. Shouldn't it be the other way around?

What I'd envision is a desktop that tightly integrates with the web browser and provides a set of hooks for web applications to integrate with, something like a one-click installation of a small plugin that augments the desktop with functionality related to the web app.

This could be small JavaScript pieces or maybe even only XML configuration that tells the desktop where to search for documents/calendar events/IM conversations/..., how to integrate with IM, pull notifications and so on. That would make the system much more open - any website developer could nicely integrate their application, without relying on the GNOME developers to add their webapp to the desktop. Of course there are security issues with that, but they should be fixable.

I generally think there is much value in extended JavaScript access to the desktop. It is certainly dangerous and needs to be done right ™, but the possibilities are really cool - like access to calendars and address books. Mac OS nicely shows how this can work - it provides basic, central services like the address book and the calendar, and allows other applications to re-use the functionality, which is of huge value to users. This would also make it viable to write real applications for mobile devices (iPhone, Android) just through web pages and JavaScript. Users would need to be asked for permission, just as they are with HTML5's openDatabase offline storage, and a good user interface for that is crucial.

My guess is that even if the desktop fails to deliver such integration, the web applications will, sooner or later. Google already has all the right APIs in place; it's just lacking a proper model to share some information with some applications without exposing your whole online life to some foreign app (giving away your Gmail username/password is not an option). So the question isn't whether this tight integration over services is happening, but rather whether the desktop will be part of it.

Taleo E-Recruitement

Oct. 9, 2008, 5:54 a.m. — 2 comments

Some time ago I applied at a company that uses the "Taleo E-Recruitement" software. Taleo provides an ASP solution where - basically - HR people can post jobs and applicants can submit resumes. Gartner puts Taleo "in the leaders quadrant", as their website boasts. Once again, I have no idea what Gartner bases its judgment on (though there are some hints via Lars), but it's probably not related to the quality of the product.

I've rarely seen such a sucky web app; I thought they had died out sometime around 2001. Search is pretty much broken, the back button is broken, you can't open pages in tabs, you can't post links to jobs, it's integrated into the company's site in an iframe so that it takes up at most 1/3 of your screen, you need to create an account and will then be spammed with irrelevant job postings (you need to log in to turn them off, and the password recovery appears to be broken, too), etc. Once you've made it through the broken search and registration, you can submit your resume. First you upload the CV, then you have to fix the automatically extracted name and address (why not just type it?!), then the system expects you to manually type your CV again in plain text, but please with the formatting fixed. Then you have to re-enter all the information (work experience, education) from the CV in awkward HTML forms.

What could be simply writing a proper cover letter and sending it in an email including your CV is magically turned into a 1hr+ task, full of broken, annoying software and the constant fear that all your work will be eaten by another browser incompatibility on the last "wizard" page.

Taleo's website states that there is "Heightened competition for skilled workers." Yes, very much indeed. But why are you writing software that actively tries to keep people from applying to jobs then? It's like a specially designed filter that will drive off all the good people that don't need to put up with this stuff.

How to look for applicants

Sept. 20, 2008, 8:54 a.m. — 3 comments

I'm currently finishing my Master's thesis (finally!), so I'm looking for a job. It's a bit weird though: there are apparently a gazillion books on the market telling you how to apply properly for job offerings, but from the job offerings I see, we desperately need literature on how to properly write a job offering.

You'll find hundreds of boring "J2EE/Hibernate/Spring/JUnit/$DB" ads, all listing a long list of technologies you ought to be familiar with (hint: good people will have no problems learning any technology!), and telling you effectively nothing about the company, the domain, or anything else. Sucks.

And then you have all the ads that don't even meet the minimum standards - typos, duplicate copy/paste content, bad formatting, completely broken HTML. What are they thinking? Is that the impression you want to give, "we can't even produce proper job ads, come work for us"?

Memory accounting in Linux

Sept. 19, 2008, 9:07 a.m. — 0 comments

Once again random processes on my virtual server running this blog were killed, due to out of memory errors. This time, it actually led to some outage, because Apache was killed.

Normally, Apache serves the whole blog from a static cache, and only comments and new posts are handled by Rails, which makes the whole thing pretty fast - otherwise it would be unusable. Rendering single requests takes something close to seconds on my MacBook Pro, and thus much longer on the vserver. I have no idea why, and it's probably some bug in my own implementation, but introducing efficient caching was both easier and more interesting than performance tracing the Rails code.

This is getting quite annoying. I have 128 MB guaranteed memory on the vserver, and this is simply not enough for Apache, SVN, Rails, and Tomcat. Interestingly, Tomcat is much less of a culprit than Rails, which consumes a lot more memory through the several process instances, as I found out earlier.

The bad thing is that there really is currently no way under Linux to find out which of your applications is actually using your memory, and how much of it. RSS, VSZ, "size" and others simply don't give any relevant insight into memory usage - the only thing you can probably do is look at memory consumption, start the app, and compare. Which is a pretty bad state, IMHO.

Lately I found that the /proc filesystem provides a 'smaps' file for each process, which contains its memory mappings, each listed with its segment, RSS, and private and shared pages. This might actually lead to a useful memory analysis. One could probably write a simple tool that reports the private memory usage of each process, and attributes the memory used by shared libraries to the apps using them. This should probably separate fixed costs (i.e., the shared memory consumed by running anything Ruby based) from dynamic costs (i.e., the added private memory for each Ruby/Rails instance).

I've started playing around a bit with a small Ruby script, and maybe if I have some spare time, I can turn it into something useful, though this will probably take a lot of learning about virtual memory in Linux.
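To make the idea concrete, here is a minimal sketch of such a tool in Python. It is not the Ruby script mentioned above, just an illustration under a few assumptions: the function names are made up for this example, the sample text is hand-written in the smaps layout rather than real output, and the counters it sums (Rss, Private_Clean, Private_Dirty, Shared_Clean, Shared_Dirty) are the per-mapping fields that smaps reports in kB.

```python
import re
from collections import defaultdict

# Matches smaps counter lines such as "Private_Dirty:        12 kB".
FIELD_RE = re.compile(r"^(\w+):\s+(\d+) kB$")


def summarize_smaps(text):
    """Sum the private and shared resident memory (in kB) over all mappings."""
    totals = defaultdict(int)
    for line in text.splitlines():
        match = FIELD_RE.match(line.strip())
        if match:
            totals[match.group(1)] += int(match.group(2))
    return {
        "private_kb": totals["Private_Clean"] + totals["Private_Dirty"],
        "shared_kb": totals["Shared_Clean"] + totals["Shared_Dirty"],
        "rss_kb": totals["Rss"],
    }


def summarize_pid(pid):
    """Read /proc/<pid>/smaps (Linux only; needs permission to read the process)."""
    with open(f"/proc/{pid}/smaps") as handle:
        return summarize_smaps(handle.read())


# Hand-written sample in the smaps layout, for illustration only (not real output).
SAMPLE = """\
08048000-080bc000 r-xp 00000000 03:02 13130 /usr/bin/ruby
Size:                464 kB
Rss:                 424 kB
Shared_Clean:        400 kB
Shared_Dirty:          0 kB
Private_Clean:        24 kB
Private_Dirty:         0 kB
080bc000-080bf000 rw-p 00000000 00:00 0 [heap]
Size:                 12 kB
Rss:                  12 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:        12 kB
"""

if __name__ == "__main__":
    print(summarize_smaps(SAMPLE))
```

In this reading, private_kb approximates the dynamic cost of one process instance, while shared_kb is the part that would have to be split among all processes mapping the same pages. How to do that split fairly is exactly the open question.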

Maybe I'm just using the wrong tools or operating systems, but I find it somehow depressing that we don't have a proper way of accounting memory to applications. It's really annoying that even today a badly written application that does something like while (true) malloc(...); can effectively bring down your whole system...

Higher order functions in XQuery

Aug. 1, 2008, 7:39 a.m. — 0 comments

The requirements for XQuery 1.1 contain a MAY item called "higher order functions". I'm really fond of the idea of higher order functions in XQuery, and as there are currently no use cases for that, I'll contribute some in here:

Simple predicates

A simple use case is to pass a predicate to another function, as shown below:

(: Selectively copy only those elements for which $pred returns true.
    TODO works recursively, but copies all attributes regardless of $pred
  :)
declare function local:selective-copy($nodes as element()*, $pred as function($node as element()) -> xs:boolean)
{
	for $n in $nodes
	return 
	  (: call $pred :)
	  if ($pred($n)) then element { node-name($n) } { $n/@*, local:selective-copy($n/*, $pred) }
	  else ()
};

declare function local:mypred($node as element()) as xs:boolean
{
  let $name := node-name($node)
  return namespace-uri-from-QName($name) = ('http://www.example.com/example', 'http://www.foo.com/bar')
};

let $xml :=
  <user xmlns="http://www.example.com/example">
    <name><first>Fritz</first> <last>Müller</last></name>
    <password xmlns="http://www.example.com/secure">vj/b5ZaUYQ6kU</password>
    <preferences> ... </preferences>
  </user>
(: use the curry operator & to get a function "handle" :)
let $predicate := &local:mypred
return local:selective-copy($xml, $predicate)

Currying

The next use case would be the natural extension - real currying of functions. local:selective-copy is as above, but we'll generalize our predicate a bit:

(: this function compares the namespace of a given node against a set of legal namespaces :)
declare function local:namespace-matches($namespaces as xs:anyURI*, $node as element()) as xs:boolean
{
  let $name := node-name($node)
  return namespace-uri-from-QName($name) = $namespaces
};

let $xml :=
  <user xmlns="http://www.example.com/example">
    <name><first>Fritz</first> <last>Müller</last></name>
    <password xmlns="http://www.example.com/secure">vj/b5ZaUYQ6kU</password>
    <preferences> ... </preferences>
  </user>
(: use the curry operator & to get a function "handle";
   we pass in a namespace to check against, and get a unary function in return :)
let $predicate := &local:namespace-matches(xs:anyURI("http://www.example.com/example"))
return local:selective-copy($xml, $predicate)

The & operator

The & operator ("curry operator") returns a handle to the function given by name, possibly setting values for parameters by specifying them in parentheses. This provides the reification for functions, i.e., the way from a function to a value.

Currying is only allowed from left to right, i.e., one cannot curry the second argument, but leave the first argument to a function unbound. I don't think this is a large restriction, and it makes the syntax much easier.
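For comparison, here is a rough analog in Python, a language that already has first-class functions. This is only a sketch to illustrate left-to-right binding, not an implementation of the proposed XQuery syntax: functools.partial binds leading positional arguments much like the & operator would, and the helper names (namespace_of, namespace_matches, selective_copy) are hypothetical mirrors of the XQuery functions above.

```python
from functools import partial
import xml.etree.ElementTree as ET


def namespace_of(element):
    """Return the namespace URI of an ElementTree element ('' if none)."""
    tag = element.tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def namespace_matches(namespaces, element):
    """Two-argument predicate, mirroring local:namespace-matches."""
    return namespace_of(element) in namespaces


def selective_copy(element, pred):
    """Copy element and its descendants, keeping only elements accepted by pred.

    Unlike the XQuery version, this sketch also keeps each element's text.
    """
    if not pred(element):
        return None
    copy = ET.Element(element.tag, dict(element.attrib))
    copy.text = element.text
    for child in element:
        kept = selective_copy(child, pred)
        if kept is not None:
            copy.append(kept)
    return copy


DOC = """
<user xmlns="http://www.example.com/example">
  <name><first>Fritz</first> <last>Müller</last></name>
  <password xmlns="http://www.example.com/secure">secret</password>
</user>
"""

# Bind the first (leftmost) argument; the result is a unary predicate.
predicate = partial(namespace_matches, {"http://www.example.com/example"})
result = selective_copy(ET.fromstring(DOC.strip()), predicate)
print([child.tag for child in result])
```

The password element lives in a different namespace, so the bound predicate rejects it and only the name subtree survives the copy.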

Calling function handles

The $someval(...) syntax allows straightforward calling of a function handle, so it is something of an inverse of the & operator.

Types for higher order functions

I think the static typing feature of XQuery never really gained much traction. However, it would of course be possible to introduce a new type, called "function()", and specify parameters and return values as shown above. I think it should be possible to type-check that, but I'm not an expert.

Implementation

I didn't implement this yet (I currently don't have access to an XQuery implementation), but it should not clash with any existing syntax, so from a grammar point of view it should be ok. It does require some changes to the runtime system, but that shouldn't be too difficult, IMHO.

The nice thing about higher order functions is that they allow a form of dynamic dispatch. That is, they make it possible to write programs that decide at runtime which code is going to be executed, in an elegant way.

This is of course not complete. It doesn't support a terse syntax for lambdas, which would also be nice (not having to declare all those pesky one-line functions). XQuery should also have something like a "fn:resolve-function($name, $argc)" that provides dynamic access to the curry operator by specifying the QName and argument count of the function.
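What such a lookup by name and argument count might do can be sketched in Python. This is purely illustrative: resolve_function, the FUNCTIONS table and is_even are hypothetical names for this example, and a plain dict stands in for the set of declared functions that an XQuery runtime would consult.

```python
import inspect


def resolve_function(functions, name, argc):
    """Return the callable registered as `name` if it takes exactly `argc`
    positional parameters, otherwise None."""
    candidate = functions.get(name)
    if not callable(candidate):
        return None
    positional_kinds = (
        inspect.Parameter.POSITIONAL_ONLY,
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
    )
    params = inspect.signature(candidate).parameters.values()
    arity = sum(1 for p in params if p.kind in positional_kinds)
    return candidate if arity == argc else None


def is_even(n):
    return n % 2 == 0


# Hypothetical stand-in for the functions declared in a query.
FUNCTIONS = {"is_even": is_even}

handle = resolve_function(FUNCTIONS, "is_even", 1)
print(handle(4))
```

Matching on both name and argument count mirrors the fact that XQuery identifies a function by its QName together with its arity.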

I think the example shows that this little extension can get you a lot of nice functionality and a lot less typing. Please leave a comment with your opinion!

XQuery Scripting Extensions and Use Cases

July 9, 2008, 7:48 a.m. — 0 comments

In March, the W3C published a first draft of the XQuery Scripting Extensions use cases and a working draft.

XQSE proposes extending XQuery with a defined expression evaluation order, variable assignments, a while loop, and some other control flow statements. This worries me a lot - I've spent considerable time implementing and using XQuery, and I really feel that this extension will break XQuery as a language.

The problem is that XQuery was intended to be a functional, declarative language. This allows implementations to reorder statements, executing them in a (hopefully) optimal order, benefiting from index lookups and the like. Now that they add side effects and state to the language, this is no longer possible in the general case. It will also greatly complicate XQuery implementations.

Of course, the question is, what benefits might this extension of the language give. The use cases document provides some insight.

Use cases section R Q1-3 define queries that perform some modification of a persistent document and at the same time return a result (the new bid, number of deleted accounts, ...).

Use case Q4 describes the use of a while loop to constantly poll the current highest bidder on some item and perform an action if it changes. I'm not sure what this seemingly strange scenario is supposed to solve, and the editors appear to be wary of it, too.

Q1-3 can easily be solved by allowing queries to perform modifications and return values at the same time. I've implemented this in X-Hive/DB, and it was actually simpler than what the W3C prescribed. I've since argued against this arbitrary limitation of the XQuery Updates specification, and maybe now would be a good time to fix that issue? Simply drop the separation between modifying and non-modifying queries and be done with it. About Q4, I'm not really sure. It looks like something that could be easily solved with some event based or message sending system. The fact that the editors are unsure how to implement this should probably be taken as a warning sign. Are we sure someone really needs this?

The use cases XHTML / AJAX both describe a scenario where a script first has to show a "busy" notice to the user, and then look up some data. I'm not sure if they envision XQuery running on the client, but otherwise this is perfectly solvable today. Client side JavaScript handles this trivially, and I really see no reason to mix this client side GUI stuff with the server side operation. It breaks the MVC pattern for no really good reason.

Use case WS is again a variation of the "I want to perform changes and report results" theme. Drop the limitation in the XQuery Updates spec and be done with it.

In summary, it seems like they want to solve an actual problem, but approach it from a really complex angle. 5 use cases can trivially be solved by modifying the updates spec, 2 use cases (XHTML) should probably be considered harmful anyways, and one (R-Q4) I fail to understand. The collateral damage in complexity caused by the XQSE spec is not worth the net effect, if the same can also be achieved by simplifying another spec!

I think we should rather concentrate on XQuery 1.1 with the grouping and windowing functionality that many people really need. Which, by the way, could also really use some simplification.

BlogImpl implements Blog

May 20, 2008, 3:27 p.m. — 3 comments

There's a weird thing you'll see all over the place in Java frameworks and applications of all kinds. One might call it Implitis.

The symptoms are that each and every class implements some interface, and it's always the sole implementor of the interface. So you have a Blog and a BlogImpl, a Window and a WindowImpl, and so forth.

While in some cases there may be reasons for this pattern (e.g. Java's limited visibility rules force you to make some methods public that really shouldn't be), I think in many cases this is just someone who has read some book, and now wants to decouple everything and everyone.

If there is only one thing that implements your Foo interface, and you can't even give it a better name than FooImpl, I declare this a code smell. If the class is conceptually any different from its interface, there should be a qualifying name addition for that, like a DatabaseFoo vs. a FileFoo, or a MailFoo vs. an HTMLFoo. If not, you're probably just complicating your application for no apparent reason.

I was reminded of this by the Spring criticism, see last post. This really ticks me off: people talking with glazed eyes of decoupled everything (do they realize that decoupled means no connection whatsoever? how does that work then?), and in practice only writing horribly complicated, ugly, implitis-infected code.

Spring criticism

May 20, 2008, 3:06 p.m. — 0 comments

A nice collection of criticisms on Spring, via Stefan Tilkov. The Spring part is highly interesting, it reproduces a lot of my frustration and annoyances with it.

The part about PHP isn't aggressive and derogatory enough. When discussing PHP, one should always swear.

And finally, the part about Rails appears dubious; I feel it slides into marketing. The SQL injection problems and the complaints about coupling model objects to the database appear highly contrived.

Java 3

May 20, 2008, 7:23 a.m. — 0 comments

Ola Bini has posted an interesting list of features he'd like to have in a hypothetical Java 3.

There are some points I definitely wouldn't like, such as "No primitive arrays." - I know they are a pain to have in the type system, but getting rid of them and thus making it nearly impossible to implement custom data structures is really not the way forward. One should try to come up with a primitive collection-like data structure that better fits into the language and type system, but dropping them altogether is not a good idea.

What I'd really like to see in Java 3 are a few things: more syntactic sugar for common things, some important platform features, identifiers as first class citizens, a structured way out of the type system, and named parameters. Which is roughly the order of difficulty in implementation and unlikeliness of having these features actually appear :-)

Maybe one should start a project to actually try this stuff out. I know that many people have stated that Ola's list is just a description for a subset of Scala (or some other language), and they are right, in some sense. But this is an important point: I don't want all the complexity of Scala. And I don't like their implicits, the "object" feature - which is syntactic sugar for something that shouldn't be there at all, static must die -, the sometimes cryptic syntax, the weird rules about operator/method precedence, and so forth. This deserves further qualification (a lot), but I'm just not happy with Scala.

I guess one should also look at the work done by Gilad Bracha (Newspeak) and possibly Ian Piumarta/Alan Kay in their COLA stuff. The former, because Bracha introduces some really nice and useful features, the latter just because it's totally awesome :-)

Google Groups being spammed

May 19, 2008, 6:17 a.m. — 1 comment

I'm getting lots of comment spam attempts for some Google Groups pages, e.g. "http://groups.google.us/group/dyt-cheaptickets". The mentioned groups actually exist, and the spam content is being hosted by Google.

There is no "report abuse" link, so I have no idea how to tell the Google guys about this.

Also, I wonder how the spammers make it past the CAPTCHA?