Stephane’s thoughts corner…
Stephane Bailliez's thoughts on everything

May 21, 2006

Maven chaos

Filed under: OpenSource,Software — stephane @ 8:27 pm

I apologize to my fellow Apache friends and Maven committers/PMC members, but let me state this clearly: Maven is absolutely unusable for any project making heavy use of open source projects. It supposedly makes you more productive by managing the dependencies, but it just hides the fact that you are building with junk.

It could be said that such a mess is the fault of all the people writing invalid POMs or using totally stupid naming conventions for their projects and stuffing them into a public repository such as ibiblio or mergere. It certainly is. A look at the disaster in any Maven repository will horrify anyone.

Maven is also guilty: its dependency mechanism and conflict strategy are so eager that, by default, it will try to pick up as much as it can online, including bad POMs of versions it does not even need. This is very hard to detect. It will then fail every now and then with cryptic messages, leaving no option other than trial and error with -X (the debug switch) and reading questionable log messages to figure it out. And I’m not even mentioning the fact that you can find different POMs of the same project on different repositories.

As most people may not know (it is not documented anywhere other than on a wiki page far away from any place where you would expect to find docs), Maven 2 includes a so-called version range with a mathematical syntax, such as [1.3,) to say x >= 1.3. By default, when you write a plain version on a dependency, it is a ‘soft version’, which is bound to change as it resolves to the ‘nearest‘ version found in the graph.
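
To make the range syntax concrete, here is a rough sketch in Python of how a single interval such as [1.3,) can be read. This is not Maven's parser: the function names are made up for illustration, versions are assumed to be purely numeric and dot-separated, and only one interval with one comma is handled.

```python
def parse_version(text):
    # Assumption: versions are dot-separated integers, e.g. "1.3" -> (1, 3).
    return tuple(int(part) for part in text.split("."))


def in_range(version, spec):
    """Check a version against one interval such as "[1.3,)" or "[1.0,2.0)".

    "[" / "]" mean the bound is included, "(" / ")" mean it is excluded,
    and an empty bound means "unbounded on that side".
    """
    lower_inclusive = spec[0] == "["
    upper_inclusive = spec[-1] == "]"
    lower_text, upper_text = (part.strip() for part in spec[1:-1].split(","))
    candidate = parse_version(version)

    if lower_text:
        lower = parse_version(lower_text)
        if candidate < lower or (candidate == lower and not lower_inclusive):
            return False
    if upper_text:
        upper = parse_version(upper_text)
        if candidate > upper or (candidate == upper and not upper_inclusive):
            return False
    return True
```

With this reading, in_range("2.0", "[1.3,)") is true, which is exactly the problem: nothing in the range says whether 2.0 is still compatible.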

The not-so-funny thing is that POMs need to be maintained over time since, as you would expect, a component A depending on B 1.1 may not work anymore with B 2.0. And 2.0 > 1.1, right ? It may still work, but it may not. How do you know ? Well, for that you would need metadata in B to say that 2.0 breaks compatibility with the previous 1.1 version. But that does not exist, AFAIK. You will have to find out yourself.

Case in point of what you can find in the central repository (the one used by EVERY Maven user): look at http://repo.mergere.com/maven2/commons-collections/commons-collections, and you will find, alongside ‘normal’ versions such as 2.1 and 3.1, some obviously ‘timestamped’ snapshots with version numbers like 20040102.233541.

As you can expect, 20040102.233541 > 3.1, and it will stay that way for quite a few centuries.
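
A quick sketch of why the timestamp wins, assuming a naive comparison where each dot-separated part is compared as a number. This is not Maven's real comparison code, and version_key is a name I made up for the example.

```python
def version_key(text):
    # Assumption: compare versions part by part as integers.
    return tuple(int(part) for part in text.split("."))


# Versions taken from the repository listing mentioned above.
versions = ["2.1", "3.1", "20040102.233541"]

# The "latest" version under this ordering is the timestamped one,
# because 20040102 is far bigger than 3.
latest = max(versions, key=version_key)
```

Anything asking for "3.1 or newer" under such an ordering is happily served the 2004 timestamped build.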

Then it all depends on whether you are lucky or not with the mass of existing invalid POMs. With transitive dependencies, you may not realize that you are actually using a totally invalid version such as the one above. Great news for the build manager out there who tries to make sense of the dependencies.
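
Here is a small sketch of how a ‘nearest’ strategy can silently pick such a version through transitive dependencies. It is only an illustration of the idea, not Maven's implementation: the graph, the lib-a/lib-b/lib-c names and the resolve_nearest function are all hypothetical.

```python
from collections import deque


def resolve_nearest(root, graph):
    """Breadth-first walk: the first (shallowest) version seen for a name wins.

    graph maps (name, version) to the list of (name, version) it depends on.
    Dependencies of a version that lost are never visited.
    """
    chosen = {root[0]: root[1]}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for name, version in graph.get(node, []):
            if name not in chosen:
                chosen[name] = version
                queue.append((name, version))
    return chosen


# Hypothetical project: lib-a drags in the timestamped version at depth 2,
# while the sane 3.1 only shows up at depth 3 through lib-b -> lib-c.
graph = {
    ("app", "1.0"): [("lib-a", "1.0"), ("lib-b", "1.0")],
    ("lib-a", "1.0"): [("commons-collections", "20040102.233541")],
    ("lib-b", "1.0"): [("lib-c", "1.0")],
    ("lib-c", "1.0"): [("commons-collections", "3.1")],
}

resolved = resolve_nearest(("app", "1.0"), graph)
```

In this made-up graph the build ends up with commons-collections 20040102.233541, and nothing in the app's own descriptor ever mentions it.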

Next, we have the multiproject case. A pattern that often arises is a project made of multiple artifacts (jars). In Maven this is better handled by duplicating the structure in the repository so that it becomes even less manageable. Now the worst case I have seen is the following: module-option1-2.7.8.jar and module-option2-7.5.4.jar, which are options of module-core-1.0.jar and are part of the release of module 1.0… uh ? Why this naming scheme ?

To make clear what was done, imagine the following: there is a product, which is split into sub-modules: core, modulex, moduley. Pretty common, right ? You are releasing product 1.0, so you expect to have product-core-1.0.jar, product-modulex-1.0.jar, product-moduley-1.0.jar, right ? Especially as those adapters are part of the release and obviously depend on the core.

Not so fast ! Some think that having product-core-1.0.jar, product-modulex-2.4.2.jar and product-moduley-7.5.2.jar is much better because product-modulex-2.4.2 is an adapter over productx 2.4.2 and product-moduley-7.5.2 an adapter to producty 7.5.2 ! I have yet to figure out how this will work later if there are 2 releases of product while productx or producty has not changed, but…. you tell me what the difference is when you have in your classpath a product-modulex-2.4.2 that works and… a product-modulex-2.4.2 that does not…. because one depends on product-core-1.1 and the other on product-core-1.3… Good luck finding that out !

Maven people used to say that Ivy should avoid duplicating the maven2 repository, but what’s the point ? Most of it is totally invalid anyway. It looks like anyone can actually submit any kind of POM without any validation process and, even more sneaky, it is known that in the past (and it apparently is recurring) some people replaced existing releases with new jars coming out of nowhere for whatever reason. So if you want control over which release you use, the first thing to do is certainly to avoid using any online repository.
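
One cheap way to notice a silently replaced jar, sketched in Python: record a SHA-1 of each jar when you first vet it, and compare on every later build. The function names are mine, and where you keep the recorded checksum is up to you; this is an assumption-laden sketch, not a feature of Maven or Ivy.

```python
import hashlib


def sha1_of(path):
    # Hash the file in chunks so large jars do not need to fit in memory.
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path, recorded_sha1):
    # recorded_sha1 is the checksum you stored when you first vetted the jar.
    return sha1_of(path) == recorded_sha1.strip().lower()
```

If is_unchanged returns False for a release version, someone swapped the jar under you.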

As for scope management, which could be a good idea in itself, it obviously does not work ‘sometimes’, making things even more like black magic. Identifying a consistent behavior in the dependency logic is a challenge, and bug reporting is not an option.

Spending countless hours managing such a mess and having to deal with so many undocumented ‘features’ is NOT a time saver in any way. It may give you a sense of productivity at first by making things magically work without too much bootstrap effort, but you’ll pay 10 times the price later. In this respect, it is similar to the dreadfully (in)famous JBoss Unified ClassLoader (UCL), which was initially introduced to ‘make it easier‘ for developers. In the end it opened such a massive Pandora’s box in production environments and with legacy code that Wells’ ‘War of the Worlds’ vision looks like paradise.

I don’t like the fanaticism of some people who blindly think, without acknowledging any flaws, that Maven is the best thing since sliced bread. I have never seen a project other than a pet project run without any problems. Looking at Apache Directory and Apache Geronimo, which both handle a reasonable number of modules, it seems obvious that there are major deficiencies. And yet they benefit from the heavy support of Maven core developers to solve any issue, and they still break every now and then with messages such as “build failure”, “can someone build this ?” etc….

Since Maven promotes working in total isolation thanks to the cache on the user’s machine, any change in the availability of some dependencies will break everything, and people who already have the dependencies in their cache for some reason will simply NOT realize it until someone fresh and clean comes to the project and builds it (if you have a continuous build environment, ALWAYS clean the cache between builds !)
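
For a continuous build box, wiping the cache can be as dumb as the sketch below. The helper name is made up; the directory to pass is whatever your local repository is (Maven 2 defaults to ~/.m2/repository unless you changed it in your settings).

```python
import shutil
from pathlib import Path


def clean_local_repository(repo_dir):
    """Delete the local repository directory so the next build starts cold.

    Returns True if something was removed, False if the directory was absent.
    """
    repo = Path(repo_dir)
    if repo.is_dir():
        shutil.rmtree(repo)
        return True
    return False


# Example call before each build (not run here on purpose):
# clean_local_repository(Path.home() / ".m2" / "repository")
```

It makes builds slower, but it is the only way to see what a newcomer with an empty cache would see.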

While Ivy is far from perfect it provides a much much much better dependency management. What I don’t like with Ivy is that they make the classic mistakes of doing opensource the wrong way:

  • Using a forum rather than a mailing list
  • Documentation available only on their (slow) website, which comes out of nowhere, instead of at least a wiki. And as the documentation is loaded with outdated, invalid or incomplete information, it severely slows down adoption and makes it even more difficult to contribute documentation fixes back.
  • The code is devoid of any comments, making it even more difficult to document.
  • Development is not truly open
  • Features over features without documentation.

Jayasoft are responsive with bug reports, but the problems above make it really really hard to follow this project and contribute back effectively without doing 3 times the necessary amount of work.

Back to fixing and syncing Ivy files and POMs.

Sorry for the rant ! But damn, I’m sooooo pissed off I needed to express it ! @#$* !

4 Comments

  1. And remember that Ivy takes care of defining the metadata by themselves, so they have a huge scalability problem. We could allow only good metadata in ibiblio, but it would be unmanageable

    Comment by Carlos Sanchez — May 23, 2006 @ 9:46 pm

  2. We do not define the metadata ourselves, we provide a repository in which we only put validated metadata, and also provide a sandbox in which anybody can push metadata, without any validation (more similar to ibiblio). The problem is that Ivy hasn’t reached sufficient adoption to find ivy files for everything easily. But most of our users build their own repository, using ivy files from our repositories and adjusting them to their needs if necessary.

    Concerning the mistakes quoted, I strongly agree with the lack of comments in the code, but most users don’t need to dig into it. Documentation is not good in all areas (we should have more tutorials and examples), I agree, but most users don’t complain about its outdatedness… maybe you could quote some areas where the doc is outdated? And considering site slowness, you know it’s difficult to maintain an open source tool without ANY support. So please be kind with our small infrastructure, or give some money to help us provide a better performing site :-)

    Using a forum is mainly a matter of taste, we prefer forums, some prefer mailing lists, we don’t consider that as a mistake…

    Development is not truly open… I don’t see why, except for the problem of comments in code and technical documentation. If you submit a patch we often integrate it quickly, and if you submit often we can consider giving you commit rights (we just welcomed Maarten Coene to the Ivy committer list).

    Features over features without documentation… Every released feature is documented in the reference documentation, if something is missing please tell us where, we really want to improve it.

    That’s all for me, thanks for your comments and for trying ivy!

    Comment by Xavier Hanin — May 29, 2006 @ 6:30 pm

  3. Xavier, thanks for your answer.

    Concerning the scalability of Ivy, as you already answered, it is a non-issue. There’s no point in having a repository loaded with invalid content and descriptors such as the one from Maven.

    The Ivy documentation being totally inaccessible in its raw form makes it impossible to fix. Stick it into svn and that will be a start for anyone to actually report, fix and contribute docs. That will also allow people to actually get the docs on their computer, pretty useful if your site is going berserk, the internet connection is down, or you are on a train or a plane.

    If you want to go into the donation thing and even joke about it, then make it truly obvious that it is independent rather than under a blatant corporate umbrella. Otherwise it basically sends the message to potential contributors that they are working for free while there is a commercial entity behind it… this cannot give any good feeling.

    Forums vs mailing lists is basically visiting a list of websites every day vs RSS subscriptions. If you have a very slow site, your users are basically losing an absurd amount of time just browsing the messages, so the ratio of time to information is far worse than with a mailing list or usenet. Why don’t you get hosted by sourceforge, codehaus, tigris, etc.. ? They provide a whole infrastructure for a project; it’s not 100% perfect, but it is more than you can offer already.

    Concerning the bugs/patches/etc.. I created quite a few issues a few months ago, so I know for sure that you are responsive, and I don’t quite get why you feel it necessary to underline this point.

    I feel like you may be missing an important point regarding open source projects and the importance of a community and how you actually create one. Resources need to be easily accessible. Decisions need to be transparent. Everybody must feel involved. Particularly when the project is led by a company. Otherwise the project will hardly get any traction, no matter how cool and revolutionary (and no matter if it is French :)

    Please do make your documentation available on svn ! It must be part of the project !

    Comment by stephane — May 30, 2006 @ 12:41 am

  4. Thanks Stephane for your comments, I think I now better understand your position, and admit you’re right on many points. We will try to take them into account for Ivy development.

    I see two main interesting points:
    * documentation & forum
    I understand that documentation should be accessible offline, especially because our site is slow… we will try to find time to address this point soon. Putting the documentation in svn is more difficult (it’s a lot more work), but what we try to encourage is to post a comment when a page has a problem, then we integrate the “patch” ourselves. If you submit good things often, we can also give you rights to contribute directly to the documentation. But maybe this process is still not a good solution for you? Concerning the forum vs mailing list, we will also try to allow the use of a mailing list synchronized with the forum. We investigated this area before without success, but drupal has evolved since then.

    * lack of openness
    You’re right, Ivy is not a totally open project in the sense that decisions are not transparent, and it is led by a commercial company. But it’s that commercial company, with only 4 co-founders who often have difficulty getting paid, that decided to make part of its work freely accessible with the sources and a very open license. So I guess you can understand that this small company tries to get some benefit from this donation: get some visits on our web site, sell some services, keep the lead over the product, and yes, get some donations to improve the product and its documentation…

    So thanks again for your comments, we’ll do our best to make Ivy better than it is!

    Comment by Xavier Hanin — May 30, 2006 @ 11:11 am
