Remote repositories and reproducibility
I cannot help but shiver when I read the TheServerSide discussion about version control and come across comments such as:
Maven, for example, discourages storage of dependencies in revision control, preferring to grab them from a third party repository
And a comment mentioning:
Since I use Maven 2 : pom.xml + src directory. That’s all I need [...]
and then later in another thread:
I have students doing internship where I work and they complain when they download some source code and it comes with an Ant build. They are more use to Maven (since this is what we use) and prefer it.
We obviously have conflicting schools of thought here about the software industry: long term and predictability versus short term and ‘works for me’. The ‘works for me’ attitude has been the rationale behind flawed designs and processes, the original JBoss UCL design for example. Some users just don’t know how their build happens to run; it just does, and they are happy with that. The Pragmatic Programmers refer to this as Programming by coincidence, so by extension we could call this technique Build by coincidence.
All that is enough to explain why some users actually fail to build any decent project that uses Maven that way and end up with comments such as:
#1 I’ve yet to be able to get an external Maven project to build by simply checking it out[...]For some reason – maybe bad luck or maybe because I tend to consult at larger corporations – Maven can never download all the jars. Either it can’t find it or errors out or something.
Reasonably large projects that use Maven with external dependencies in remote repositories, and that run into problems on a regular basis, are well known: Apache Geronimo, Apache Cocoon, Apache DS. Browsing their mailing lists regularly is enough to understand how it gets in your way and slows down development when you have many dependencies and a remote repository (or worse, several).
Konstantin Ignatyev is also right on target when he says:
Maven dependencies management is really really bad.[...]Ranges are especially bad: they cause build unpredictability and non repeatability because they make build to depend on server repo content.
A simple example to illustrate how things can be totally wrong:
- Assume MyProject depends on JasperReport 1.2.4
- Which itself depends on commons-collections [2.1,) as can be seen from the POM (meaning 2.1+)
Now if we take a look at commons-collections in the Mergere repository, what is available as 2.1+? Well, among other things: 2.1.1, 3.0, 3.1, 3.2… and 20030418.083655, 20031027.000000, 20040102.233541, 20040616.
So what do you think is the most recent (i.e. greatest) version for Maven? 3.2 or 20040616?
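To make the ordering question concrete, here is a minimal sketch of what happens when versions are compared segment by segment as numbers. This is an assumption made for illustration, not Maven's actual resolution code; the class `VersionRangeSketch` and its methods are hypothetical names. The version list is the one quoted above.

```java
import java.util.Arrays;
import java.util.List;

public class VersionRangeSketch {

    // Simplified rule (an assumption, NOT Maven's real algorithm):
    // split on dots and compare each segment numerically, padding with 0.
    static int compareVersions(String a, String b) {
        String[] pa = a.split("\\.");
        String[] pb = b.split("\\.");
        int n = Math.max(pa.length, pb.length);
        for (int i = 0; i < n; i++) {
            long x = i < pa.length ? Long.parseLong(pa[i]) : 0L;
            long y = i < pb.length ? Long.parseLong(pb[i]) : 0L;
            if (x != y) {
                return Long.compare(x, y);
            }
        }
        return 0;
    }

    // Picks the greatest available version satisfying the open range [lowerBound,).
    static String highestAtLeast(List<String> available, String lowerBound) {
        String best = null;
        for (String v : available) {
            boolean inRange = compareVersions(v, lowerBound) >= 0;
            if (inRange && (best == null || compareVersions(v, best) > 0)) {
                best = v;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> available = Arrays.asList(
                "2.1", "2.1.1", "3.0", "3.1", "3.2",
                "20030418.083655", "20031027.000000",
                "20040102.233541", "20040616");
        // The date-stamped snapshot wins over the real latest release, 3.2.
        System.out.println(highestAtLeast(available, "2.1")); // prints 20040616
    }
}
```

Under this naive numeric rule, a date-stamped build from 2004 outranks 3.2, and the answer depends entirely on what the remote repository happens to contain on the day you build.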
Relying on uncontrolled remote repositories is evil at best.
Never trust the online repositories for your project; that’s OK for a prototype but nothing more.
The irony is that some helpful hands may fix this problem if they read this entry… but the thousands of users who were actually depending on these dependencies will not notice until they clean their cache, download the new dependencies again, and possibly break something in their project. So you get non-reproducibility.
Put every dependency in source control, download the archives yourself, rewrite the POM yourself (most of the time it is incorrect, but you sure get the list of developers, which is 350 lines long) and be in control. Clean up your cache. Take your machine offline and build. If it does not build right away, you are in trouble anyway; it is just a matter of time before this WMD blows up your product.
Final advice: Use Ivy for dependencies and store all your dependencies under source control.











