Posts Tagged ‘jobcentreproplus’
JobcentreProPlus, tricky geocoding and unreliable datasets 26th Mar 09
One of the problems with working with large datasets — especially when you’re scraping them — is that they don’t always work the way one might think.
We’ve recently had reports that JobcentreProPlus.com turns up jobs that aren’t close to the postcode that the user entered when they started their search. We’ve done a bit of digging, and turned up two problems. Unfortunately, neither is easily fixable.
The first problem is that JobcentrePlus’s website doesn’t expose very good location data. It’s often as little as “Camden Town, London” or “Sevenoaks, Kent”. For this to be useful, we need to convert it to a latitude and longitude, so we can see if it’s near the postcode you enter when you start a search.
This process is called geocoding, and it’s inherently error-prone. There’s often no way to tell the difference between places with similar names. Usually it works well enough, but sometimes it’ll generate a result that’s unexpected: in real terms, you see a search result for a job in Glasgow when you were searching for things in London.
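To make the ambiguity concrete, here is a minimal sketch of the problem. It is not our actual geocoder: the gazetteer, the function names and the rounded coordinates are all made up for illustration. It just shows how a bare place name can match more than one location, and how a lookup that takes the first match can land you in the wrong part of the country.

```python
# Toy gazetteer: hypothetical data, with approximate coordinates,
# used only to illustrate why place names alone are ambiguous.
GAZETTEER = {
    "richmond": [
        ("Richmond, London", 51.46, -0.30),
        ("Richmond, North Yorkshire", 54.40, -1.74),
    ],
    "sevenoaks": [
        ("Sevenoaks, Kent", 51.27, 0.19),
    ],
}


def candidates(place_text):
    """Return every gazetteer entry matching the first part of the text."""
    key = place_text.split(",")[0].strip().lower()
    return GAZETTEER.get(key, [])


def naive_geocode(place_text):
    """Pick the first candidate and ignore the rest (the error-prone step)."""
    matches = candidates(place_text)
    return matches[0] if matches else None


if __name__ == "__main__":
    # "Richmond" matches two places; the naive lookup silently picks one.
    print(candidates("Richmond"))
    print(naive_geocode("Richmond"))
```

A real geocoder is far more sophisticated than this, but it faces the same basic choice whenever the listing gives it nothing better than a name.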
There’s not a lot we can do about this. If JobcentrePlus included better geographical information in their listings — like a postcode, or a latitude/longitude — we wouldn’t have to geocode things, which would be a great improvement.
Unfortunately, in this case, it gets more complicated. The second problem is that the JobcentrePlus database (which also drives their service!) doesn’t store good location data. Sometimes the location refers to the address of the Jobcentre shop. Sometimes, it’s the agency advertising the job. Sometimes, it’s the employer’s head office, but not the actual building you’d be working in if you took the job.
In summary: the way we’re forced to gather data introduces errors, and the underlying dataset has quite a few errors to begin with.
Despite this, we still think JobcentreProPlus.com is useful. Most of the time, the job will in fact be near the jobcentre, the employer’s head office or the job agency. That’s why our “distance from postcode” field defaults to 10 miles — we’re confident that that’ll be right, most of the time.
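For the curious, a “distance from postcode” filter can be sketched roughly like this. It is an illustration rather than the code behind the site: it assumes the postcode and each job have already been turned into a latitude and longitude, it uses the standard haversine great-circle formula, and the names (haversine_miles, jobs_within, the job dictionaries) and coordinates are invented for the example.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (
        sin((lat2 - lat1) / 2) ** 2
        + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def jobs_within(jobs, origin, max_miles=10):
    """Keep only the jobs whose geocoded point is within max_miles of origin."""
    lat0, lon0 = origin
    return [
        job
        for job in jobs
        if haversine_miles(lat0, lon0, job["lat"], job["lon"]) <= max_miles
    ]


if __name__ == "__main__":
    # Hypothetical jobs with approximate coordinates.
    jobs = [
        {"title": "Barista", "lat": 51.539, "lon": -0.1426},    # Camden Town
        {"title": "Engineer", "lat": 55.8642, "lon": -4.2518},  # Glasgow
    ]
    central_london = (51.5074, -0.1278)
    print(jobs_within(jobs, central_london))
```

The filter is only as good as the coordinates it is given: a job that was geocoded to the wrong place will be kept or dropped in error, whatever radius you choose.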
The bottom line is that the quality of our site is completely dependent on the quality of the underlying data. Until that data is better, there’s not much we can do to improve things — but we’re not too worried. From a plain reading of search results, we think we’re doing ok. This search for stuff in London returns mostly stuff that, according to the job ad, is in London.
We think it’s good enough to be useful, and that’s really our only goal.
Rewired State: JobcentreProPlus 8th Mar 09
On Saturday I was at Rewired State, where a bunch of geeks got together to build things. We wanted to show government how it’s done!
At the end of the day, we each got two minutes to present what we’d done to each other, and an assemblage of government types. People did some really cool stuff, from Rob McKinnon & co’s Compani.es, which is the website that Companies House ought to have, to a reimplementation of ActivePlaces. They scraped this multimillion pound website, got all their data, and then did with it in an afternoon what the site hasn’t managed to do with a massive budget and years of time. Great stuff. Emma Mulqueeny’s written some more about the day, and the other hacks.
Sam Smith and I got together to do a project. Given the current economic malaise, it’s quite important for people to be able to find jobs, and a little birdy turned us on to the fact that the JobCentre Plus site really isn’t good. In fact, it’s quite painful. To get any jobs out of it at all, you have to fill in 4 reasonably large forms. Once you have some jobs to look at, you can’t do anything with them. There’s no RSS, you can’t get email alerts for new jobs, and you can’t bookmark jobs you’re interested in, because their URLs don’t work properly. The next time you want to find jobs, you have to go through the whole ordeal again. Bleh.
Our task was to make this better. Sam wrote some scrapers to pull down Jobcentre’s data — which was no mean feat in itself — and I made a website to display it. It’s a bit rough and ready, but it works. You can go to www.jobcentreproplus.com, search for jobs in your area, view them, bookmark them, get email alerts, subscribe in your feed reader and use the API to search and display jobs on your own site. Everything that the real site should do and doesn’t.
We didn’t realise it at the time, but there were prizes for the hacks that the organisers liked the most. Rather surprisingly, given the very high quality of all the other projects, Sam and I won!
We’re really glad that they liked it, and we hope you will too. Have a look, and let us know what you think.

