OSGi at LinkedIn – Bundle repositories
February 17, 2009
Code Alert! This is a part of our continuing series on Engineering at LinkedIn. If this isn't your cup of Java, check back tomorrow for regular LinkedIn programming. In the meanwhile, check out some of our recent announcements, tips and tricks, or success stories.
When you start using OSGi, the very first problem you will face is that OSGi requires bundles. A bundle is nothing more than a jar file with extra manifest information. Here is a 'typical' example of a manifest for an OSGi bundle (the entries in bold are the OSGi-specific headers).
Tool: Bnd-<unknown version>
This is how you instruct the OSGi container about your dependencies (Import-Package), what you provide (Export-Package), how you become active (Bundle-Activator), etc...
So why did I start this post by saying it was a problem? The answer is actually two-fold:
- you need to generate those headers for your own jar files, and it can be quite challenging if you want to do it manually (our biggest bundle currently has over 760 Import-Package entries!)
- all external libraries that you require also need to be bundles (libraries that you do not control, like log4j, xerces, ...)
In this post I will be concentrating on problem #2 and I will come back to problem #1 in a later post.
Let's start with some numbers. As of this writing (January 2009), our repository of external libraries contains 200 jar files. Only 8 of them are bundles out of the box (4%). I believe that this small sample reflects the harsh reality out there: over 95% of the available libraries are not bundles.
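A count like the one above can be reproduced with a small check. The following is only a minimal sketch, assuming that a jar counts as a bundle when its manifest declares a non-empty Bundle-SymbolicName header; the class name BundleCheck is hypothetical and is not part of our tooling.

```java
import java.io.File;
import java.io.IOException;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

// Hypothetical helper, for illustration only.
final class BundleCheck {

    // Assumption: a manifest describes a bundle when it declares
    // a non-empty Bundle-SymbolicName header.
    static boolean isBundle(Manifest manifest) {
        if (manifest == null) {
            return false; // a jar without any manifest cannot be a bundle
        }
        String name = manifest.getMainAttributes().getValue("Bundle-SymbolicName");
        return name != null && !name.trim().isEmpty();
    }

    // Convenience overload that opens a jar file and inspects its manifest.
    static boolean isBundle(File jar) throws IOException {
        try (JarFile jarFile = new JarFile(jar)) {
            return isBundle(jarFile.getManifest());
        }
    }
}
```

Running isBundle over every jar in a repository and counting the true results gives the ratio of bundles to plain jar files.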
So what is the solution? The answer is unfortunately not that easy. For starters, you should definitely check the bundle repository that SpringSource offers for free. It contains a good list of libraries that have been converted to bundles (they even have a full-time employee just for this ongoing task!). One of the big issues is that it is hard to keep up, as new libraries are popping up on a daily basis, including snapshots. The benefits of snapshots are debatable, but in practice, sometimes you just don't have a choice! In our case, we just cannot afford to rely solely on the availability of bundles. So here is the approach that we took:
What we are trying to achieve is to convert a repository of libraries (96% jar files (blue)) into a repository of bundles (100% bundles (red)). For this we use bnd, ivy and some custom code.
Our repository of external libraries uses ivy for dependency management (note that the process would be very similar with maven). Using ivy resolution, it is relatively easy to build the (acyclic) graph of dependencies between all the libraries. All the leaves represent libraries that do not have dependencies on other libraries (Step 1).
bnd is a tool that analyzes a jar file and can create OSGi manifest headers. In Step 2, we iterate over each leaf and feed it to bnd to generate a bundle. We use some custom code (ant tasks) to have more control over what is provided as input to bnd and over the errors/warnings that we need. In Step 3, we repeat the same process one level up the dependency graph. This time we know we are dealing with libraries that have dependencies, but we also know that those dependencies have already been properly converted into bundles, so the classpath (which is one of the inputs to bnd) will contain only proper bundles. With the proper classpath, bnd will be able to generate the proper manifest entries (the version and resolution attributes of the Import-Package entries will be correct). And recursively we go all the way up the chain of dependencies until we have converted the entire repository.
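The leaf-first traversal in Steps 1 to 3 can be sketched as follows. This is only an illustration under stated assumptions, not our actual ant tasks: the dependency graph is assumed to be available as a map from library name to its direct dependencies (something ivy resolution could provide), and the class name BundleOrder and the library names used with it are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, for illustration only.
final class BundleOrder {

    // Groups libraries into levels: level 0 holds the leaves (no dependencies),
    // and every later level holds libraries whose dependencies all sit in earlier levels.
    static List<List<String>> levels(Map<String, Set<String>> deps) {
        Map<String, Set<String>> remaining = new TreeMap<>();
        for (Map.Entry<String, Set<String>> e : deps.entrySet()) {
            remaining.put(e.getKey(), new TreeSet<>(e.getValue()));
            // A library that only appears as a dependency is treated as a leaf.
            for (String dep : e.getValue()) {
                remaining.putIfAbsent(dep, new TreeSet<>());
            }
        }
        List<List<String>> result = new ArrayList<>();
        Set<String> done = new HashSet<>();
        while (!remaining.isEmpty()) {
            List<String> level = new ArrayList<>();
            for (Map.Entry<String, Set<String>> e : remaining.entrySet()) {
                if (done.containsAll(e.getValue())) {
                    level.add(e.getKey()); // all dependencies were handled in earlier levels
                }
            }
            if (level.isEmpty()) {
                throw new IllegalStateException("Cyclic dependencies among " + remaining.keySet());
            }
            done.addAll(level);
            remaining.keySet().removeAll(level);
            result.add(level);
        }
        return result;
    }
}
```

Each level can then be fed to bnd in turn, with the bundles produced for the earlier levels placed on the classpath.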
Overall this process works quite well but there are several issues that I want to point out:
- The result clearly depends on the quality of the original repository in terms of dependencies. If the dependencies are wrong or missing, then the end result will be of lesser quality, with the "resolution:=optional" attribute being set, which can lead to the dreaded NoClassDefFoundError problem when deploying in the OSGi container. To fix this issue, we need a clean repository, and thanks to this process (and the error reporting mentioned earlier) we can now detect where it is not clean.
- The only change this process really makes is adding manifest headers to the jar file; the content of the jar file itself is not modified. If the jar file was signed, then changing the manifest breaks the overall signature even if you do not touch any of the headers containing the signatures of individual classes. To fix this issue, in our case it is ok to simply remove the signature entirely.
- This process does not fix libraries that are simply not OSGi compatible. For example, OSGi does not support classes in the default package, which the jdom library exposes, and some libraries have class loading issues (the famous Class.forName() OSGi issue). To fix this problem (which in my experience has been very rare), we have been using the SpringSource versions.
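For the signed jar issue in the list above, removing the signature comes down to two things: skipping the signature files under META-INF when copying the jar, and dropping the per-entry digest attributes from the manifest. The following is a minimal sketch of those two checks, assuming the standard JAR signing layout (.SF, .DSA, .RSA, .EC and SIG-* files directly under META-INF); the class name SignatureStripper is hypothetical and this is not our actual ant task.

```java
import java.util.Iterator;
import java.util.Locale;
import java.util.Map;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Hypothetical helper, for illustration only.
final class SignatureStripper {

    // Returns true for entries that carry the jar signature and should not be copied.
    static boolean isSignatureEntry(String entryName) {
        String upper = entryName.toUpperCase(Locale.ROOT);
        if (!upper.startsWith("META-INF/")) {
            return false;
        }
        String file = upper.substring("META-INF/".length());
        if (file.contains("/")) {
            return false; // signature files live directly in META-INF
        }
        return file.endsWith(".SF") || file.endsWith(".DSA") || file.endsWith(".RSA")
                || file.endsWith(".EC") || file.startsWith("SIG-");
    }

    // Removes the digest attributes from the per-entry sections of the manifest
    // and drops the sections that end up empty.
    static void dropDigests(Manifest manifest) {
        Iterator<Map.Entry<String, Attributes>> it = manifest.getEntries().entrySet().iterator();
        while (it.hasNext()) {
            Attributes attrs = it.next().getValue();
            attrs.keySet().removeIf(k -> k.toString().toUpperCase(Locale.ROOT).contains("-DIGEST"));
            if (attrs.isEmpty()) {
                it.remove();
            }
        }
    }
}
```

The main section of the manifest is left untouched, so the OSGi headers generated by bnd are preserved.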
The last point I wanted to raise is my concern that there isn't a 'one-size-fits-all' repository. Even with the amazing work that SpringSource is doing with the free repository, you get their interpretation of dependencies. For example, the jdom bundle (version 1.0) has the following entry: Import-Package: org.jaxen;version="[1.1.1, 2.0.0)";resolution:="optional"
The above entry basically means that jdom optionally depends on the org.jaxen package, from version 1.1.1 up to but not including 2.0.0. This may or may not work for you, depending on your needs. In our case we like tighter version ranges ("[1.1.1, 1.1.2)"). What if jaxen 1.2.3 ends up having a show-stopper bug when used in conjunction with jdom, but you still need it for other parts of your code and you end up deploying it in the same container? Stay tuned for a separate post entirely dedicated to version management soon.
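To make the difference between the two ranges concrete, here is a minimal sketch of how such a range can be evaluated. It is a simplified stand-in written for this post, not the OSGi framework's own implementation: it assumes plain numeric major.minor.micro versions, ignores qualifiers, and the class name SimpleVersionRange is hypothetical.

```java
// Hypothetical helper, for illustration only.
final class SimpleVersionRange {
    private final int[] low;
    private final int[] high;
    private final boolean lowInclusive;
    private final boolean highInclusive;

    // Parses a range such as "[1.1.1, 2.0.0)": '[' and ']' are inclusive, '(' and ')' exclusive.
    SimpleVersionRange(String range) {
        String s = range.trim();
        lowInclusive = s.charAt(0) == '[';
        highInclusive = s.charAt(s.length() - 1) == ']';
        String[] bounds = s.substring(1, s.length() - 1).split(",");
        low = parse(bounds[0]);
        high = parse(bounds[1]);
    }

    // Parses "major.minor.micro"; missing fields default to 0.
    static int[] parse(String version) {
        String[] fields = version.trim().split("\\.");
        int[] v = new int[3];
        for (int i = 0; i < 3 && i < fields.length; i++) {
            v[i] = Integer.parseInt(fields[i]);
        }
        return v;
    }

    static int compare(int[] a, int[] b) {
        for (int i = 0; i < 3; i++) {
            if (a[i] != b[i]) {
                return Integer.compare(a[i], b[i]);
            }
        }
        return 0;
    }

    boolean includes(String version) {
        int[] v = parse(version);
        int lo = compare(v, low);
        int hi = compare(v, high);
        return (lowInclusive ? lo >= 0 : lo > 0) && (highInclusive ? hi <= 0 : hi < 0);
    }
}
```

With this sketch, new SimpleVersionRange("[1.1.1, 2.0.0)").includes("1.2.3") is true, while new SimpleVersionRange("[1.1.1, 1.1.2)").includes("1.2.3") is false, which is exactly the difference between the two ranges discussed above.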
Also, check out our series on OSGi at LinkedIn.