(Java) Hosting Recommendation

Waaaay back in June I asked if anyone had any recommendations for where to host some Java applications. I also promised to publish the list of recommendations I had, so here I am.
In typical “comparison shopping crisis” mode it’s taken me ages to deliberate about where to move my kit, so I’ve only just completed the intended move.
The original list of recommendations I had was:

I was also pointed at a JavaLobby discussion which has some additional recommendations.
The two places that had the most votes were Kattare and RimuHosting. I also considered Bytemark. RimuHosting seemed to have the best overall word of mouth, particularly on their customer support. The folk I contacted there were friendly and willing to set up the server to any specification I wanted. They also offer a small discount for folk hosting open source software.
I signed up to one of the RimuHosting MiroVPS plans a few weeks ago, and have been very happy with the decision so far. The site has good documentation and I’ve been able to follow their guidelines and handy scripts to get the environment set up how I want it. So far everything has Just Worked, which is great. They get my recommendation.

Java Hosting Recommendations?

I’m looking to move the hosting of my sites and applications to a new provider and am interested to hear if anyone has any recommendations.
Ideally I want an environment that provides me with ssh access, Java and JSP/Servlet hosting through a private VM. I really want the freedom to be able to easily manage libraries and configure the web server myself. Access to a MySQL instance or other database is also essential.
If they’re Rails friendly too, then that would be another bonus; I’m playing with several applications currently and while Java is still my main development environment, I want to do some Ruby on Rails based applications too.
If anyone has any recommendations (or warnings!) please drop me a line.
Update: thanks to everyone who has sent in suggestions. I’ve had a lot of great feedback and there are a couple of services that stand out. I’ll be sure to post a list of the suggestions. Thanks again.

Using Jena in an Application Server

I’ve been lurking on the jena-dev mailing list for a while now, and I’m constantly impressed with the level of patience displayed by the Jena team at handling repeated questions and queries. This is despite the comprehensive documentation which covers all aspects of the toolkit.
Often these queries stray outside the realm of RDF and Jena into basic questions such as “how do I write a JSP or a web application”. Makes me wonder if there’s been a sudden increase in the number of undergraduate semantic web projects. Anyway, one question I’ve seen quite often recently is “How do I use Jena within an Application Server?”
Here are some notes and pointers that may help answer that particular question. I don’t have time for a complete tutorial, but hopefully the following pointers will be sufficient to get you oriented.

The Database

I’ll assume that you’re going to be working with data held in a relational database. In Jena terminology this is known as a “persistent model”.
The Jena team have created a HOWTO on using persistent models. See that page for detailed database configuration options and pointers to database specific documentation.
You don’t have to worry about creating the relational database structure into which your RDF data will be stored. Jena will do that for you automatically once you create your first persistent model. This makes it very simple to get up and running.
The persistent model HOWTO contains example code that shows how to create and configure a persistent model.
However within an application server the code you’ll write is going to be slightly different: you’re going to need a connection pool.

Connection Pooling

All Java application servers allow you to configure a database connection pool. The specifics vary from server to server, so you’ll need to consult your server documentation to find out how to do that. Here’s the Tomcat 5.5 JDBC data source documentation. You should be able to find similar documentation for JBoss, Weblogic, et al.
Once correctly configured, a connection pool will allow you to do a JNDI lookup to obtain a DataSource from which you can create a Connection.
Creating a Jena Model is then simply a matter of wrapping that Connection in a DBConnection and opening the model. Here’s a code snippet which illustrates this:

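import java.sql.Connection;
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.sql.DataSource;
import com.hp.hpl.jena.db.DBConnection;
import com.hp.hpl.jena.db.IDBConnection;
import com.hp.hpl.jena.db.ModelRDB;
import com.hp.hpl.jena.rdf.model.Model;

// Obtain our JNDI context
Context initialContext = new InitialContext();
Context env = (Context) initialContext.lookup("java:comp/env");
// Look up our data source
DataSource dataSource = (DataSource) env.lookup("jdbc/MyDataSource");
// Allocate a connection from the pool
Connection connection = dataSource.getConnection();
// Create a Jena IDBConnection; the second argument names the database type
IDBConnection jenaConnection = new DBConnection(connection, "MySQL");
// Use open for an existing model, or createModel to create a new one
Model model = ModelRDB.open(jenaConnection);
// Do some useful work, then tidy up by closing the model and the connection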

Business Logic

So far we’ve looked at creating connections and opening a Model to get access to the persistent data. What you do with that Model is up to your application: for example, you may navigate through it using the Jena API, or query it using ARQ, the SPARQL query engine built upon Jena. More information on how to do that can be found in Phil McCarthy’s “Search RDF data with SPARQL” tutorial.
The context within which this code lives will depend on the overall architecture of your application.
If you’re just writing a simple Java web application that uses servlets and/or JSPs then you’ll want to structure your code so that the logic is in a servlet, or in utility code accessed from a JSP, ideally via a tag library. This avoids mixing up your user interface code with your application logic. To ensure that your connection pool is available to your web application you’ll need to configure a resource reference in its web.xml.
However if you’re writing a full J2EE application that uses EJBs, then you’ll want to do all of your Jena manipulation from within a bean. As J2EE Container Managed Persistence is designed for relational databases and not triple stores, you’ll have to use Bean Managed Persistence. In other words, write the database manipulation code yourself.
Personally I’d suggest going with a Session bean that delegates to a Data Access Object to do the real work. Your Jena specific code will then be relegated to a small manageable layer in your application. In this scenario you’ll need to configure the bean’s deployment descriptor to ensure that it has a resource reference to your connection pool.
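As a rough sketch of that layering: every name below is hypothetical, and an in-memory map stands in for the Jena-backed store so that the example runs on its own. The point is only that the service class (playing the part of the session bean) never sees how the data is stored.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical DAO interface: the only place the rest of the application
// learns anything about how the RDF data is stored.
interface PersonDao {
    void addNickname(String personUri, String nickname);
    List<String> findNicknames(String personUri);
}

// Stand-in implementation backed by a map. A real implementation would hold
// the Jena code: look up the DataSource, open the Model, query it, close it.
class InMemoryPersonDao implements PersonDao {
    private final Map<String, List<String>> nicknames = new HashMap<>();

    public void addNickname(String personUri, String nickname) {
        List<String> values = nicknames.get(personUri);
        if (values == null) {
            values = new ArrayList<>();
            nicknames.put(personUri, values);
        }
        values.add(nickname);
    }

    public List<String> findNicknames(String personUri) {
        List<String> values = nicknames.get(personUri);
        // Return a copy so callers cannot modify the stored list
        return values == null ? new ArrayList<String>() : new ArrayList<>(values);
    }
}

// Plays the role of the session bean: business methods only, with all
// storage work delegated to the DAO.
class PersonService {
    private final PersonDao dao;

    PersonService(PersonDao dao) {
        this.dao = dao;
    }

    String describe(String personUri) {
        List<String> nicks = dao.findNicknames(personUri);
        return nicks.isEmpty()
                ? personUri + " has no nicknames"
                : personUri + " is also known as " + nicks;
    }
}
```

Swapping InMemoryPersonDao for a Jena-backed class should then leave PersonService untouched, which is the whole reason for the extra layer.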
Hopefully those are some useful pointers that’ll help get you started.


The developers of Jaikoz, a Java MP3 tag editor, mailed me yesterday to say that their latest release is now live on their site. I’m mentioning this because Jaikoz bundles my MusicBrainz API for doing metadata lookups using MusicBrainz.
Jaikoz is payware although there’s a free trial available. I should note that I’m not getting any kickbacks from this: the API is Creative Commons licensed so they’re free to do what they want with it. They did check in with me first though, which was very friendly. I did suggest that they may want to consider donating money to MusicBrainz if they get enough sales.
I’m just pleased that they found it useful enough to include it in their application.

MusicBrainz Java API beta-2

I’ve just uploaded beta-2 of my Java API to the MusicBrainz RDF web service.
The API is Creative Commons licensed and is built around the Jena 2 Semantic Web toolkit.
The API provides raw access to the RDF returned from the service, but also a simple JavaBean layer for developers wanting a simpler interface to the data. You can read the Javadoc and view the changes since the last beta; these mainly consist of some bug fixes and support for a few new properties (including Amazon ASINs).
The API doesn’t aim to mimic everything in the C/C++ API, e.g. track id calculation or submission, it’s merely a read-only version suitable for embedding in Java applications.
I’ve included a trivial demo in this release: a simple command-line application that reads in a list of album names, looks them up in the service and aggregates the basic metadata into a new RDF document which is dumped to the console.

Slug: A Simple Semantic Web Crawler

Back in March I was tinkering with writing a Scutter. I’d never written a web crawler before, so was itching to give it a go as a side project. I decided to call it Slug because I was pretty sure it’d end up being slow and probably icky; crafting a decent web crawler is an art in itself.

I got as far as putting together a basic framework that did the essential stuff: reading a scutter plan, fetching the documents using multi-threaded workers, etc. But I ended up getting sucked into a work project that ate up all my time so didn’t get much further with it.
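
For anyone who hasn’t seen the pattern, the multi-threaded worker part can be sketched with a fixed thread pool. This is not Slug’s actual code: the class name and method are hypothetical, and the fetch step is passed in as a function so the sketch runs without touching the network.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

class WorkerPoolSketch {

    // Fetch every URL in the plan using a fixed number of worker threads.
    // Returns url -> fetched content, in the same order as the plan.
    static Map<String, String> fetchAll(List<String> plan,
                                        Function<String, String> fetcher,
                                        int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            // Submit one task per URL; the pool shares them among the workers
            List<Future<String>> pending = new ArrayList<>();
            for (String url : plan) {
                Callable<String> task = () -> fetcher.apply(url);
                pending.add(pool.submit(task));
            }
            // Collect results; get() blocks until that task has finished
            Map<String, String> results = new LinkedHashMap<>();
            for (int i = 0; i < plan.size(); i++) {
                results.put(plan.get(i), pending.get(i).get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a fetch task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

A real crawler would also need to feed newly discovered links back into the plan and be polite to the servers it visits; the sketch only covers handing a fixed list of URLs to a pool of workers.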

Anyway, because the world is obviously sorely in need of another half-finished Scutter implementation, I’ve spent a few hours this evening tidying up some of the code so that it’s suitable for sharing.

Continue reading “Slug: A Simple Semantic Web Crawler”

foaf-beans 0.1

I’m pleased to announce the first iteration of a Java API for FOAF based around the Jena semantic web toolkit.
The API, which I’ve dubbed “foaf-beans”, is an attempt to provide a number of convenience classes that will allow Java developers to quickly get to grips with reading and writing FOAF data. With this in mind the API provides a thin layer of abstraction which hides much of the RDF processing, instead presenting the user with simple factory classes that create FOAFGraph and FOAFWriter objects for reading and writing respectively. These objects generate and process simple Java Beans that should play nicely with other Java APIs and toolkits (particularly JSP, JSTL, etc).

Continue reading “foaf-beans 0.1”

Bayesian Agents

Classifier4J is a Java text classification library that includes a text summariser and a Bayesian classifier. It was my interest in the latter that led me to play with the API recently, as I wanted to demonstrate to some colleagues the ease with which one can use Bayesian classification to create a content filter/recommender. Well, it’s easy if all the hard work is done for you in a library!

The Classifier4J API is very easy to use, and you can plug a Bayesian classifier into an application with very few lines of code.

One of the things that intrigued me about the API design was that it separates out the Classifier from the storage of the words and their probabilities. The API comes with a simple in-memory implementation and a JDBC Words Data Source which stores the data in a database table.
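
To make that separation concrete, here’s a rough sketch. It is not Classifier4J’s actual API: the interface, class and method names are all made up for illustration, and the scoring is a bare-bones naive Bayes combination rather than whatever the library really does.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical storage interface: the classifier only ever asks for a
// word's probability, so the backing store could be memory, JDBC or RDF.
interface WordStore {
    // Probability that a document containing this word is a match
    double probabilityFor(String word);
}

class InMemoryWordStore implements WordStore {
    private final Map<String, Double> probabilities = new HashMap<>();

    // Probabilities are assumed to lie strictly between 0 and 1
    void put(String word, double probability) {
        probabilities.put(word, probability);
    }

    public double probabilityFor(String word) {
        Double p = probabilities.get(word);
        return p == null ? 0.5 : p;   // unknown words are treated as neutral
    }
}

class TinyBayesClassifier {
    private final WordStore store;

    TinyBayesClassifier(WordStore store) {
        this.store = store;
    }

    // Combine per-word probabilities: prod(p) / (prod(p) + prod(1 - p))
    double classify(String text) {
        double match = 1.0;
        double nonMatch = 1.0;
        for (String word : text.toLowerCase().split("\\s+")) {
            double p = store.probabilityFor(word);
            match *= p;
            nonMatch *= (1.0 - p);
        }
        return match / (match + nonMatch);
    }
}
```

Because TinyBayesClassifier only depends on the WordStore interface, a different storage backend is just another implementation of that one method.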

It occurred to me that it’d be an interesting experiment to create an implementation of the data source interface that stored the data as RDF.

Why RDF? Because then we’d have the ability to share and aggregate the results of training classifiers.

For example I could export and share a classifier trained to spot spam, semantic web topics, or any number of other categories. The classifiers could be imported into desktop applications (e.g. Thunderbird) as well as web applications. For example I might train a classifier to spot articles that I’m interested in, and then upload that configuration into a content management system and have it mine that data for material I may be interested in: hence “Bayesian agents”.

By tying my exported Bayesian probabilities to my FOAF file, an aggregator may merge my data with that of others known to share similar interests. Trust is another aspect that may affect whether my data is shared.

Anyone have any comments on this? Is anyone doing anything similar already? (They must be…)

I’ll try and hack something up when I get a few minutes.

For the RDF I was thinking of something like the following:

Continue reading “Bayesian Agents”

How to make RDF and JSP play nicely together?

Via Gavin (via the chumpologica): An application architecture that should yield superior productivity.
Interesting stuff. I’ve been pondering something similar myself, mainly because I have a slice of an application I’m working on that I want to replace with an RDF data model and storage. To achieve this successfully I need to make sure that the data nicely dovetails with the JSP 2.0/JSTL templating environment we’ve built on top. However I don’t want to model everything as objects if I can help it, because by doing so I’m going to sacrifice some of the flexibility I gain from using RDF.
Ideally I want to gut the current Data Access Objects and replace them with code that navigates the underlying RDF graph, perhaps using an RDF query language, and then returns a subset of that graph in a form that is suitable for traversing with JSTL. There’s not a great deal of business logic in that slice of the application so there’s little else to change.
I had been wondering whether the technique used in RDF Twig could be generalized to creation of simple object hierarchies (Lists and Maps). Rx4RDF might be another useful place to mine for ideas.
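One way that flattening into Lists and Maps might look is sketched below. It uses plain string arrays as stand-in triples rather than a real RDF API, and the class and method names are hypothetical. Each subject becomes a Map from property name to a List of values.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GraphFlattener {

    // Each triple is {subject, predicate, object}. The result maps
    // subject -> predicate -> list of object values, preserving input order.
    static Map<String, Map<String, List<String>>> flatten(List<String[]> triples) {
        Map<String, Map<String, List<String>>> bySubject = new LinkedHashMap<>();
        for (String[] triple : triples) {
            bySubject
                .computeIfAbsent(triple[0], s -> new LinkedHashMap<>())
                .computeIfAbsent(triple[1], p -> new ArrayList<>())
                .add(triple[2]);
        }
        return bySubject;
    }
}
```

Since JSTL’s expression language can index into Maps and Lists, a page handed one of the inner maps as `person` could read a value with something like `${person['foaf:name'][0]}`. The trade-off is that everything becomes a string, so links between resources are only followed if the calling code looks the object value up as a subject in turn.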
Suggestions for other useful APIs or techniques to explore will be gratefully received.
btw, if you find that you start extending your object model to allow arbitrary property annotation, and some of those properties are actually pointers to other objects in your graph, then that’s probably a sign that you may be better off using an RDF based model. And possibly Python too but I’ve not explored that angle yet.


Via Cafe con Leche I notice that Saxon 7.9 has been released. The interesting thing is that Mike Kay has founded Saxonica Limited which will offer professional services and additional modules, including a schema-aware processor as a commercial offering.
I’ve used Saxon for a long time now. It’s my XSLT processor of choice. I’ve never bothered with Xalan or other processors as Saxon has always Just Worked.
Like any good tool Saxon is adjustable enough to help you solve any particular problem. Just recently I’ve benefited from both the saxon:preview which helped me deal with a large transform and the very easy extension mechanism that allowed me to invoke some Java code during a transformation (generating a SHA1 sum for an email address).
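The Java side of an extension like that can be very small. Here’s a sketch that assumes nothing beyond the JDK; the class and method names are made up, and the Saxon-specific wiring that makes the method callable from a stylesheet is left out.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class Sha1Sketch {

    // Returns the SHA-1 digest of the input as 40 lowercase hex characters.
    static String sha1Hex(String input) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-1");
            byte[] hash = digest.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                // Mask to an unsigned value so each byte prints as two hex digits
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is one of the algorithms every Java platform must provide
            throw new IllegalStateException(e);
        }
    }
}
```

As a sanity check, `Sha1Sketch.sha1Hex("abc")` should give the standard test vector a9993e364706816aba3e25717850c26c9cd0d89d.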
I think it’s good news that Mike is intending to continue offering the basic product for free and wish him well in the commercial venture.