OSGi at LinkedIn: Java Compilation in OSGi
In this post I will describe how I was able to make LinkedIn’s JSP compiler work within an OSGi container.
I guess the first question I need to answer is why on earth does LinkedIn have its own JSP compiler? The answer is partly for historical reasons and partly for feature reasons. The JSP compiler that we have (I am the author of it) has enhanced the JSP standard in the following manner:
- Escaping HTML (e.g. ‘<’ gets turned into ‘&lt;’) is on by default so that web developers don’t even have to think about it (this is actually a pretty big deal because escaping HTML helps protect your web site against XSS attacks).
- Expression language (EL) is more powerful and allows you to pass arguments to method calls.
- JSP files can be located anywhere: inside the war, on the file system, inside a local jar file, inside a remote jar file or a combination of all those. This is quite nice in development because JSPs can be located in the source tree (instead of packaged in a WAR) and developers can be much more productive.
- The compiler uses strong typing and does not rely on reflection at runtime (only at compile time). Depending on how you look at it, this could be considered as a downgrade instead of an enhancement :).
- It handles non-web oriented content. For example, we use the JSP compiler to generate the emails that LinkedIn sends.
- It has a built-in extensible pipeline process which allows us to plug in any kind of processing before compiling the JSP. For example, we recently added an i18n processing step which allows us to inject all localized strings directly as local variables inside the compiled code, as opposed to doing runtime lookups into resource bundles.
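To make the first bullet above concrete, here is a minimal sketch of what escaping on output amounts to. The class HtmlEscaper and its escape method are hypothetical names made up for this example; this is not the actual code of the LinkedIn JSP compiler, only an illustration of the idea.

```java
// Hypothetical helper for illustration only (not the LinkedIn implementation).
final class HtmlEscaper
{
  private HtmlEscaper()
  {
  }

  // Replaces the characters that are significant in HTML with their entities.
  static String escape(String input)
  {
    if(input == null)
      return "";
    StringBuilder sb = new StringBuilder(input.length() + 16);
    for(int i = 0; i < input.length(); i++)
    {
      char c = input.charAt(i);
      switch(c)
      {
        case '<':  sb.append("&lt;");   break;
        case '>':  sb.append("&gt;");   break;
        case '&':  sb.append("&amp;");  break;
        case '"':  sb.append("&quot;"); break;
        case '\'': sb.append("&#39;");  break;
        default:   sb.append(c);
      }
    }
    return sb.toString();
  }

  public static void main(String[] args)
  {
    // prints &lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;
    System.out.println(escape("<script>alert(\"xss\")</script>"));
  }
}
```

When this kind of escaping is applied by default to every expression, a page author has to opt out explicitly to emit raw markup, which is the safer default.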
The second question that comes to mind is why it is challenging to run the JSP compiler within OSGi. To answer this question, there are two things to understand:
- The first is the steps involved in the JSP processing:
- The second is how OSGi handles class loading: to achieve dependency management at the package level and have the same class with multiple versions and dynamic reloading of classes, OSGi needs to control the class loading pretty tightly. In other words the concept of "classpath" in an OSGi container becomes pretty much nonsensical.
This is where there is a strong disconnect: step 3 needs to compile Java source code into byte code. If you are familiar with the standard Java compiler provided with the JDK (javac), you know that you must provide a classpath for the compilation, and that there are many limitations on what that classpath can be (it cannot contain arbitrary URLs, for example).
For efficiency reasons, we do not want to fork a new external process to run javac every time we invoke the Java compiler. Unfortunately, with Java 5 there is no standard way of doing this. There is an unsupported way of doing it, which is what we use:
String[] options = {
  "-source", "1.5",
  // coming from ServletContext.getAttribute("org.apache.catalina.jsp_classpath")
  "-classpath", _classPath,
  "-d", _baseClassDir.getPath()
};
StringWriter sw = new StringWriter();
PrintWriter out = new PrintWriter(sw);
// compile() returns 0 on success; compiler messages are written to 'out'
if(com.sun.tools.javac.Main.compile(options, out) != 0)
{
  out.close();
  throw new CompileException(sw.toString());
}
Java 6 offers a standard way of invoking the compiler from within Java itself through the javax.tools API. I was very excited when I saw this, especially the fact that there is a JavaFileManager.getClassLoader(location) method. Since OSGi has the concept of a Bundle, which offers an API very similar to a ClassLoader, it is fairly easy to write an adapter:
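import java.io.IOException;
import java.net.URL;
import java.util.Enumeration;
import org.osgi.framework.Bundle;
// Adapts an OSGi Bundle to the ClassLoader API by delegating every lookup to the bundle.
public class BundleClassLoader extends ClassLoader
{
  private final Bundle _bundle;
  public BundleClassLoader(Bundle bundle)
  {
    _bundle = bundle;
  }
  public BundleClassLoader(ClassLoader parent, Bundle bundle)
  {
    super(parent);
    _bundle = bundle;
  }
  // loadClass(String) is public in ClassLoader, so the override must be public as well
  @Override
  public Class<?> loadClass(String name) throws ClassNotFoundException
  {
    return _bundle.loadClass(name);
  }
  @Override
  protected URL findResource(String name)
  {
    return _bundle.getResource(name);
  }
  @Override
  protected Enumeration<URL> findResources(String name) throws IOException
  {
    return _bundle.getResources(name);
  }
}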
Unfortunately my excitement was shattered when I realized that the ClassLoader is not used during the compilation, but only, as stated in the JavaDoc, "for loading plug-ins (ex: annotation processors) from the given location". I really thought for a minute that you could use the ClassLoader instead of the classpath. It would have been too nice. The only method that looked potentially promising was:
public Iterable<JavaFileObject> list(Location location, String packageName, Set<JavaFileObject.Kind> kinds, boolean recurse) throws IOException;
This method gets called for every single package that is declared in the source code and is expecting in return a list of all the classes that the package contains. Unfortunately, it is impossible to get this from a Bundle. There may be convoluted ways to get to it using the PackageAdmin service but it was starting to get very hairy and seemed like a lot of work.
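To see this behavior for yourself, here is a minimal sketch that wraps the standard file manager and records every package name passed to list() during a compilation. The class ListCallRecorder and the method compileAndRecord are hypothetical names invented for this example, the source is compiled from a string, and the generated class files are discarded. It assumes it runs on a JDK (not a bare JRE), since it needs the system Java compiler.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;
import javax.tools.FileObject;
import javax.tools.ForwardingJavaFileManager;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileManager;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.StandardJavaFileManager;
import javax.tools.ToolProvider;

// Hypothetical demo class: records which packages javac asks the file manager to list.
final class ListCallRecorder extends ForwardingJavaFileManager<StandardJavaFileManager>
{
  final Set<String> requestedPackages = new TreeSet<String>();

  ListCallRecorder(StandardJavaFileManager delegate)
  {
    super(delegate);
  }

  @Override
  public Iterable<JavaFileObject> list(JavaFileManager.Location location, String packageName,
                                       Set<JavaFileObject.Kind> kinds, boolean recurse)
    throws IOException
  {
    requestedPackages.add(packageName); // remember the package, then delegate
    return super.list(location, packageName, kinds, recurse);
  }

  @Override
  public JavaFileObject getJavaFileForOutput(JavaFileManager.Location location, String className,
                                             JavaFileObject.Kind kind, FileObject sibling)
  {
    // discard the generated class files so that the demo writes nothing to disk
    URI uri = URI.create("mem:///" + className.replace('.', '/') + kind.extension);
    return new SimpleJavaFileObject(uri, kind)
    {
      @Override
      public OutputStream openOutputStream()
      {
        return new ByteArrayOutputStream();
      }
    };
  }

  // Compiles one in-memory source file and returns the package names passed to list().
  static Set<String> compileAndRecord(String className, final String source)
  {
    JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
    if(compiler == null)
      throw new IllegalStateException("no system Java compiler (a JDK is required)");
    URI uri = URI.create("string:///" + className.replace('.', '/') + ".java");
    JavaFileObject unit = new SimpleJavaFileObject(uri, JavaFileObject.Kind.SOURCE)
    {
      @Override
      public CharSequence getCharContent(boolean ignoreEncodingErrors)
      {
        return source;
      }
    };
    ListCallRecorder fm = new ListCallRecorder(compiler.getStandardFileManager(null, null, null));
    try
    {
      boolean ok = compiler.getTask(null, fm, null, null, null, Collections.singletonList(unit)).call();
      if(!ok)
        throw new IllegalStateException("compilation failed");
      return fm.requestedPackages;
    }
    finally
    {
      try { fm.close(); } catch(IOException e) { throw new UncheckedIOException(e); }
    }
  }

  public static void main(String[] args)
  {
    String source = "import java.util.List; public class Hello { List<String> names; }";
    System.out.println(compileAndRecord("Hello", source));
  }
}
```

The recorded set shows that the compiler asks for whole packages rather than individual classes, which is exactly the question a Bundle cannot easily answer.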
I then switched my focus away from the JDK as I was not getting anywhere and decided to explore the JDT compiler (org.eclipse.jdt.internal.compiler.Compiler). After all, Eclipse is built on top of OSGi so there had to be a way to compile Java code with the compiler. Thankfully I found the source code for Jasper, the Apache implementation of the standard JSP compiler and this is exactly what is being used. If you look at the org.apache.jasper.compiler.JDTCompiler class, you can see a very good example of how to use the Eclipse compiler (and trust me you need an example… as it is over 300 lines of code to invoke the compiler!). Using this example, I was able to implement the compilation and get everything working. The big advantage of this solution is that, unlike javac which expects the content of a package, you only need to locate a class which is totally possible with a ClassLoader. Below is my INameEnvironment implementation:
private NameEnvironmentAnswer findType(String className)
{
  // note that try/finally error handling has been removed for brevity of the example...
  String resourceName = className.replace('.', '/') + ".class";
  InputStream is = _classLoader.getResourceAsStream(resourceName);
  if(is != null)
  {
    // read bytes from input stream
    byte[] classBytes = ...;
    ClassFileReader classFileReader =
      new ClassFileReader(classBytes, className.toCharArray(), true);
    return new NameEnvironmentAnswer(classFileReader, null);
  }
  else
    return null;
}
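The lookup half of that method only needs a ClassLoader, which is easy to try in isolation. Below is a minimal sketch of turning a class name into a resource name and reading the class bytes, with the stream handling spelled out. The class ClassBytes and its read method are hypothetical names for this example and are not part of my actual implementation; the sketch also leaves out the JDT-specific ClassFileReader step.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper: locates a single class through a ClassLoader and reads its bytes.
final class ClassBytes
{
  private ClassBytes()
  {
  }

  // Returns the bytes of the class file, or null if the loader cannot find the class.
  static byte[] read(ClassLoader loader, String className)
  {
    // binary name (e.g. java.lang.String) to resource name (java/lang/String.class)
    String resourceName = className.replace('.', '/') + ".class";
    InputStream is = loader.getResourceAsStream(resourceName);
    if(is == null)
      return null;
    try
    {
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      byte[] buffer = new byte[8192];
      int n;
      while((n = is.read(buffer)) != -1)
        out.write(buffer, 0, n);
      return out.toByteArray();
    }
    catch(IOException e)
    {
      throw new UncheckedIOException(e);
    }
    finally
    {
      try { is.close(); } catch(IOException ignored) { /* nothing useful to do on close */ }
    }
  }

  public static void main(String[] args)
  {
    byte[] bytes = read(ClassLoader.getSystemClassLoader(), "java.lang.String");
    System.out.println(bytes == null ? "not found" : bytes.length + " bytes");
  }
}
```

Because only one class is requested at a time, any ClassLoader (including one backed by a Bundle) can serve as the source, and no package listing is ever required.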
It took me about a full day to investigate the Java 6 approach (which did not conclude successfully) and I had the code up and running using the Eclipse compiler in about 4 hours. To conclude, I would just like to say that I am happy to see that the JDK is finally offering a standard way of invoking the Java compiler from within Java code. However, it feels like there is more work to do. They need to offer the ability to not use the concept of classpath anymore, but instead use a ClassLoader. Also it’d be great if the concept of classpath was expanded to support the concept of URL instead of restricting the classpath to be a bunch of jar files or classes located on the file system.
I hope you enjoy this post and stay tuned for more posts on OSGi at LinkedIn. The next topic will be about extending Spring-DM using a fragment host.
Joe June 13th, 2008
Have you considered publishing the JSP compiler that includes XML escaping? This has been a problem we’ve been dealing with as well. I imagine others have the same problem.
Matt Raible June 13th, 2008
If you want to modify Tomcat to escape EL by default, you might check out the following patch I contributed.
http://raibledesigns.com/rd/entry/proposed_tomcat_enhancement_add_flag
Neil Bartlett June 16th, 2008
Hi Yan,
Just a note about your BundleClassLoader, although I’m aware you didn’t end up using it. You really need to override the loadClass() method of ClassLoader rather than just the findClass() method. This is because loadClass() implements the default Java behaviour of delegating first to the parent classloader, before calling the classloader’s own findClass() method. However in OSGi, parent-first delegation is the exception rather than the rule… we only do it for the java.* packages (which MUST be defined by the boot classloader; this is enforced by the JVM) and for any packages listed on the org.osgi.framework.parentdelegation system property, which should usually be a small list.
Anyway, keep up the OSGi posts, they’re very interesting.
Regards
Neil
Yan Pujante June 16th, 2008
@Torsten: Thanks for the link. It is an interesting project to look at to ‘hide’ the complexity of invoking various compilers!
@Neil: Thank you very much for pointing out the flaw in the BundleClassLoader implementation. I totally understand the problem and I am glad you saw it. I am sure it will save us (and other people reading this blog) several headaches!
Martin Cooper June 16th, 2008
Given your list of enhancements to JSP, I wonder why you chose JSP at all, and didn’t elect to use something like Apache Velocity instead. Velocity doesn’t have some of the limitations of JSP that were a hindrance to you; for example, the source files can be located anywhere, and it can be used to generate e-mail as well as web pages (or pretty much anything else).
Yan Pujante June 17th, 2008
@Martin: That is a good question. I think it is all a matter of timing. Very honestly, if I had to make the choice today, I would not decide to take JSP and enhance it to suit our needs. At the time we made the decision, things were different. I don’t believe that at that time Velocity had the features that we wanted. And there is always the problem of choosing something that will die eventually, especially when there are multiple choices (even today there are a lot of choices… Velocity, Freemarker, JSP, Tapestry…), all with advantages and drawbacks.
Jon Stevens September 25th, 2008
@Yan, I call bullhonky. Velocity has been around since I started the project in 1998 (yes, ten years ago). None of the projects you list are anywhere near dead. The only template engine project that died was Webmacro and that is because Velocity replaced it. At the time, WM had a GPL license and the author refused to change it, so I started the Velocity project with Geir and Jason. Eventually, they dual licensed WM, but it was too late.
In the end though, Velocity was expressly started to deal with the various limitations in JSP. Freemarker which came before Velocity had syntax that was/is too complicated/cryptic. Velocity has also been mostly feature complete since the day it was started and I can solidly say that the amount of work to add any missing features to velocity would have been far less than writing your own JSP compiler!?
Anyway, it doesn’t matter now, the path you chose is the one you are stuck with. I just wanted to point out that your stated reasoning seems a bit off to me. These days, I use JSP with taglibs (@see the tagonist framework) and am mostly happy with that. Velocity has been relegated to just generating RSS/XML, emails and the occasional dynamic javascript file.
Good luck.