<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>The Dextrous Web &#187; reusability</title>
	<atom:link href="http://thedextrousweb.com/tag/reusability/feed/" rel="self" type="application/rss+xml" />
	<link>http://thedextrousweb.com</link>
	<description>Just another WordPress weblog</description>
	<lastBuildDate>Mon, 26 Jul 2010 12:31:57 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>ConsultationXML is now Open Source</title>
		<link>http://thedextrousweb.com/2009/02/consultationxml-is-now-open-source/</link>
		<comments>http://thedextrousweb.com/2009/02/consultationxml-is-now-open-source/#comments</comments>
		<pubDate>Fri, 27 Feb 2009 16:01:39 +0000</pubDate>
		<dc:creator>Harry Metcalfe</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[consultations]]></category>
		<category><![CDATA[consultationxml]]></category>
		<category><![CDATA[dius]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[public sector information]]></category>
		<category><![CDATA[reusability]]></category>
		<category><![CDATA[ukgovOSS]]></category>

		<guid isPermaLink="false">http://www.thedextrousweb.com/?p=138</guid>
		<description><![CDATA[We&#8217;re terribly, fantastically pleased to announce that after a bit of wrangling, Steph Gray and I are able to release ConsultationXML as open source software under the GNU Affero license. The recent report on open source software in Government hinted that departments ought to try to release source code for the software they commission, and [...]]]></description>
			<content:encoded><![CDATA[<p>We&#8217;re terribly, fantastically pleased to announce that after a bit of wrangling, <a href="http://blog.helpfultechnology.com/2009/02/consultationxml-goes-open-source/">Steph Gray</a> and I are able to release <a href="http://www.thedextrousweb.com/2009/02/consultation-xml-reusable-data-dfs-dius/">ConsultationXML</a> as open source software under the <a href="http://www.gnu.org/licenses/agpl-3.0.html">GNU Affero</a> license. The recent <a href="http://www.cio.gov.uk/transformational_government/open_source/index.asp">report on open source software in Government</a> hinted that departments ought to try to release source code for the software they commission, and we&#8217;re delighted to be (we think!) the first to do so.</p>
<p>We&#8217;re not sure who will want to play with it yet. We hope that other departments will want to deploy and use the tool to improve their consultation offerings. It may be that people in the private sector will find some use for it. People have already used ConsultationXML for really <a href="http://www.thedextrousweb.com/2009/02/consultation-xml-mashups-wordle/">neat things</a> that we didn&#8217;t expect, so anything could happen: which is, of course, the point.</p>
<p>We ran this by the renowned geek-cum-blogger-cum-minister, <a href="http://www.tom-watson.co.uk/">Tom Watson</a>, who had good things to say:</p>
<blockquote><p>I think this is a great tool. We&#8217;ve just announced the <em>Open Source, Open Standards and Re–Use</em> report on the use of open source software in government, an element of which was to encourage government to contribute to the world of open source software, and this is the first practical expression of that goal.</p></blockquote>
<p>For more information about ConsultationXML, and to download it, <a href="/labs/consultationxml">head over here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://thedextrousweb.com/2009/02/consultationxml-is-now-open-source/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Scraping Civil Service Vacancies</title>
		<link>http://thedextrousweb.com/2009/02/scraping-civil-service-vacancies-rdfa/</link>
		<comments>http://thedextrousweb.com/2009/02/scraping-civil-service-vacancies-rdfa/#comments</comments>
		<pubDate>Thu, 05 Feb 2009 09:30:37 +0000</pubDate>
		<dc:creator>Harry Metcalfe</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[cabinet office]]></category>
		<category><![CDATA[coi]]></category>
		<category><![CDATA[jobs]]></category>
		<category><![CDATA[mashing]]></category>
		<category><![CDATA[php]]></category>
		<category><![CDATA[prototype]]></category>
		<category><![CDATA[reusability]]></category>
		<category><![CDATA[scraping]]></category>

		<guid isPermaLink="false">http://www.thedextrousweb.com/?p=13</guid>
		<description><![CDATA[Back in July, we were asked to make a prototype system for the Central Office of Information and the Cabinet Office. For some time, they have wanted to put civil service job vacancies together in one place so people can find them more easily and reuse the data in their own applications, much as we [...]]]></description>
			<content:encoded><![CDATA[<p>Back in July, we were asked to make a prototype system for the Central Office of Information and the Cabinet Office.</p>
<p>For some time, they have wanted to put civil service job vacancies together in one place so people can find them more easily and reuse the data in their own applications, much as <a href="http://www.tellthemwhatyouthink.org">we have already done for central government consultations</a>. Because of our experience with consultations, we were asked to make a prototype that uses <a href="http://en.wikipedia.org/wiki/Web_scraping">scraping</a> to gather data about job vacancies. Some <a href="http://code.google.com/p/argot-hub/">fantastic work is underway</a> to make this really easy by embedding RDFA into departmental websites: our part in this project was to get our hands on some data, check out what departmental websites are doing now and see if scraping could be a useful part of the solution.</p>
<p>We put a <a href="http://civiscrape.dev.thedextrousweb.com/">prototype</a> together over a couple of months last year &#8212; altogether, it took about three weeks of development time &#8212; and I&#8217;m very happy to say that <a href="http://civiscrape.dev.thedextrousweb.com">it&#8217;s now been unveiled, and you can play with it</a>. Though the site is live, the data isn&#8217;t current: it&#8217;s only there as an example. These were all real vacancies once, but they may have been filled by now!</p>
<p>The site is fairly simple. Several departmental websites were scraped to get information about their current vacancies. We took that data, cleaned it up a bit and added it to a database that can be searched. Users can look for jobs by keyword (like &#8216;<a href="http://civiscrape.dev.thedextrousweb.com/index.php?terms=assistant">assistant</a>&#8216;), location (for example, a <a href="http://civiscrape.dev.thedextrousweb.com/index.php?terms=SW1A%201AA">post code</a> or <a href="http://civiscrape.dev.thedextrousweb.com/index.php?terms=London">place name</a>), or <a href="http://civiscrape.dev.thedextrousweb.com/advanced.php?terms=manager&amp;distance=20&amp;location=London&amp;salary=40%2C000&amp;action=go">all of the above plus salary</a>.</p>
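<p>As a rough illustration of the clean-up and keyword search steps (this is not the prototype&#8217;s code), here is a minimal Python sketch. The record layout, the sample vacancies and the function names <code>clean_vacancy</code> and <code>search</code> are all assumptions made for the example.</p>

```python
import re


def clean_vacancy(raw):
    """Normalise one scraped record: collapse whitespace and parse the salary."""
    title = " ".join(raw["title"].split())
    location = " ".join(raw["location"].split())
    # Keep only the digits, so "28,500 per annum" becomes 28500.
    # A salary range would need more careful handling than this.
    digits = re.sub(r"[^0-9]", "", raw.get("salary", ""))
    salary = int(digits) if digits else None
    return {"title": title, "location": location, "salary": salary}


def search(vacancies, terms, min_salary=0):
    """Return vacancies whose title or location contains every search term."""
    words = terms.lower().split()
    results = []
    for v in vacancies:
        haystack = (v["title"] + " " + v["location"]).lower()
        salary_ok = v["salary"] is not None and v["salary"] >= min_salary
        if all(w in haystack for w in words) and (min_salary == 0 or salary_ok):
            results.append(v)
    return results


# Hypothetical scraped records, messy in the way scraped data tends to be.
SCRAPED = [
    {"title": "  Executive   Assistant ", "location": "London", "salary": "28,500 per annum"},
    {"title": "Project Manager", "location": " Leeds ", "salary": "40,000"},
    {"title": "Project Manager", "location": "London", "salary": ""},
]

VACANCIES = [clean_vacancy(record) for record in SCRAPED]
```

<p>Calling <code>search(VACANCIES, "manager", min_salary=40000)</code> would return only the Leeds post, because the London one has no salary that could be parsed. That is the kind of gap where scraped data needs manual intervention.</p>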
<p style="text-align: center;"><a href="http://www.thedextrousweb.com/wp-content/uploads/2009/02/civiscrape-map.png"><img class="size-thumbnail wp-image-104 aligncenter" title="civiscrape-map" src="http://www.thedextrousweb.com/wp-content/uploads/2009/02/civiscrape-map-150x150.png" alt="Google Maps &amp; Civiscrape Mashup" width="150" height="150" /></a></p>
<p>If we can automatically identify the vacancy&#8217;s location, we geocode it using RDFA on the site and GeoRSS in the Atom feed. We did this because it permits users to <a href="http://civiscrape.dev.thedextrousweb.com/advanced.php">search for jobs by proximity to a location</a>, and to import the feed into Google Maps and get an <a href="http://www.google.co.uk/maps?f=q&amp;hl=en&amp;geocode=&amp;q=http:%2F%2Fciviscrape.dev.thedextrousweb.com%2Ffeed.php%3F&amp;ie=UTF8&amp;t=h&amp;z=6">insta-mashup of vacancies plotted on a map</a> &#8212; neat!</p>
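<p>To show why geocoding makes a proximity search possible, here is a small Python sketch using the haversine great-circle formula. It is an illustration under stated assumptions, not the prototype&#8217;s code: the vacancies, their coordinates and the <code>within_miles</code> helper are made up.</p>

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean radius of the Earth


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two latitude/longitude points, in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def within_miles(vacancies, lat, lon, distance):
    """Return (miles, vacancy) pairs inside the radius, nearest first."""
    hits = []
    for v in vacancies:
        miles = haversine_miles(lat, lon, v["lat"], v["lon"])
        if distance >= miles:
            hits.append((miles, v))
    return sorted(hits, key=lambda pair: pair[0])


# Hypothetical geocoded vacancies (coordinates are approximate).
VACANCIES = [
    {"title": "Policy Assistant", "lat": 51.5034, "lon": -0.1276},  # central London
    {"title": "Office Manager", "lat": 53.4808, "lon": -2.2426},    # Manchester
]

# Everything within 20 miles of a point in central London.
NEARBY = within_miles(VACANCIES, 51.5074, -0.1278, 20)
```

<p>With these sample records, only the London vacancy falls inside the 20 mile radius. A real service would more likely push the distance filter into the database than loop over every record.</p>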
<p>We think that the prototype has done rather well. It suffers from the same kinds of problems that systems relying on scraped data generally encounter: occasionally, data is missing, incomplete or in the wrong place. It would need some manual intervention if it were ever to become a real service. Thankfully, the work that&#8217;s happening at the moment to produce an RDFA vocabulary to define vacancies means that this approach shouldn&#8217;t be needed in the future.</p>
<p>We wrote up some recommendations as a result of doing this project: hopefully, we&#8217;ll be able to publish them at some point. We&#8217;ll definitely be helping to get departments on board when the time comes for them to start embedding RDFA in their web pages.</p>
]]></content:encoded>
			<wfw:commentRss>http://thedextrousweb.com/2009/02/scraping-civil-service-vacancies-rdfa/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>ConsultationXML: the mashups have landed</title>
		<link>http://thedextrousweb.com/2009/02/consultation-xml-mashups-wordle/</link>
		<comments>http://thedextrousweb.com/2009/02/consultation-xml-mashups-wordle/#comments</comments>
		<pubDate>Wed, 04 Feb 2009 12:04:15 +0000</pubDate>
		<dc:creator>Harry Metcalfe</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[consultations]]></category>
		<category><![CDATA[dius]]></category>
		<category><![CDATA[mashing]]></category>
		<category><![CDATA[reusability]]></category>
		<category><![CDATA[wordle]]></category>

		<guid isPermaLink="false">http://www.thedextrousweb.com/?p=93</guid>
		<description><![CDATA[People have already started doing interesting things with ConsultationXML. I have to admit &#8212; I couldn&#8217;t be more pleased! Richard Goodwin took PDF attachments from the London Gazette, uploaded them to ConsultationXML, got the HTML preview output and fed it into Wordle &#8212; and voila! A Wordle map of the London Gazette&#8217;s honours list was [...]]]></description>
			<content:encoded><![CDATA[<p>People have already started doing interesting things with <a href="http://www.thedextrousweb.com/2009/02/consultation-xml-reusable-data-dfs-dius/">ConsultationXML</a>. I have to admit &#8212; I couldn&#8217;t be more pleased!</p>
<p style="text-align: center;"><a href="http://www.thedextrousweb.com/wp-content/uploads/2009/02/wordle-consultationxml-gazette.png"><img class="size-medium wp-image-94 aligncenter" title="wordle-consultationxml-gazette" src="http://www.thedextrousweb.com/wp-content/uploads/2009/02/wordle-consultationxml-gazette-300x163.png" alt="wordle-consultationxml-gazette" width="300" height="163" /></a></p>
<p>Richard Goodwin took PDF attachments from the <a href="http://www.london-gazette.co.uk/">London Gazette</a>, uploaded them to ConsultationXML, got the HTML preview output and fed it into Wordle &#8212; and voila! A <a href="http://www.wordle.net/gallery/wrdl/501494/London_Gazette_New_Year_Honours_List_31_Dec_2008">Wordle map of the London Gazette&#8217;s honours list</a> was born.</p>
<p>Has anyone else done interesting things? Do <a href="mailto:contact@thedextrousweb.com">let us know</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://thedextrousweb.com/2009/02/consultation-xml-mashups-wordle/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>ConsultationXML: getting reusable data out of horrid PDFs</title>
		<link>http://thedextrousweb.com/2009/02/consultation-xml-reusable-data-dfs-dius/</link>
		<comments>http://thedextrousweb.com/2009/02/consultation-xml-reusable-data-dfs-dius/#comments</comments>
		<pubDate>Mon, 02 Feb 2009 19:48:49 +0000</pubDate>
		<dc:creator>Harry Metcalfe</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[consultations]]></category>
		<category><![CDATA[dius]]></category>
		<category><![CDATA[pdf]]></category>
		<category><![CDATA[reusability]]></category>
		<category><![CDATA[xml]]></category>

		<guid isPermaLink="false">http://www.thedextrousweb.com/?p=67</guid>
		<description><![CDATA[Over the last few months, we&#8217;ve been working with Steph Gray of the Department for Innovation, Universities and Skills on making consultation documents easier to reuse. DIUS are doing some fantastic things with consultations. Typically, a formal consultation is a pretty tedious process: a department will write up a big PDF document, print it, send [...]]]></description>
			<content:encoded><![CDATA[<p>Over the last few months, we&#8217;ve been working with <a href="http://blog.helpfultechnology.com/">Steph Gray</a> of the <a href="http://interactive.dius.gov.uk/">Department for Innovation, Universities and Skills</a> on making consultation documents easier to reuse.</p>
<p>DIUS are doing some fantastic things with consultations. Typically, a formal consultation is a pretty tedious process: a department will write up a big PDF document, print it, send it to some people, stick it on their website and wait for people to respond. The whole process is pretty dated: it doesn&#8217;t really take advantage of the web, and is pretty inaccessible to most people.</p>
<p>DIUS have started to make this process better. In July last year, they launched a consultation that tried a bit harder to involve people. They used a WordPress plugin, CommentPress, to allow people to comment on individual paragraphs in the consultation. They published a nice HTML version of the consultation document, with links and all. They even made a widget generator, so that people could embed questions from the consultation in their blogs.</p>
<p>Doing these things doubled the number of people who responded to the consultation, with very little extra marketing. Unfortunately, they were also pretty time-consuming: turning a PDF into nice HTML is pretty laborious. They wanted to automate as much of this process as possible, to make it cheaper to deploy similar consultations in the future, and they asked us to help.</p>
<p>Creating all these consultation tools would be quite easy, if the data existed in a format that could easily be reused. Unfortunately, PDF is certainly not that format. It is designed for print, and is difficult to repurpose. To make this easier, we wrote some tools to convert PDFs into very basic XML, and to allow people to extend that XML into something useful.</p>
<p>This human intervention is really important. It allows semantic information to be added to these documents: questions and their possible answers can be identified, and explanatory paragraphs can be linked to questions. It also allows formatting and images lost during conversion to be added back into the document, and extra formatting like links to be added.</p>
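<p>The two stages described above, a basic conversion followed by human mark-up, can be sketched in Python. This is only an illustration: the element and attribute names (<code>consultation</code>, <code>paragraph</code>, <code>question</code>, <code>answer</code>, <code>explained-by</code>) are invented for the example and are not the real ConsultationXML schema, and the paragraphs stand in for text already extracted from a PDF.</p>

```python
import xml.etree.ElementTree as ET


def paragraphs_to_basic_xml(paragraphs):
    """Stage 1: wrap each extracted paragraph in a plain, numbered element."""
    root = ET.Element("consultation")
    for number, text in enumerate(paragraphs, start=1):
        para = ET.SubElement(root, "paragraph", id="p%d" % number)
        para.text = text
    return root


def mark_as_question(root, para_id, answers, explained_by=None):
    """Stage 2: record an editor's decision that a paragraph is a question."""
    para = root.find("paragraph[@id='%s']" % para_id)
    para.tag = "question"
    if explained_by:
        # Link the question to the paragraph that explains it.
        para.set("explained-by", explained_by)
    # Move the question wording into its own child, then list the answers.
    wording = para.text
    para.text = None
    ET.SubElement(para, "text").text = wording
    for answer in answers:
        ET.SubElement(para, "answer").text = answer
    return para


# Hypothetical paragraphs, as if already pulled out of a PDF.
PARAGRAPHS = ["Background to the proposal.", "Do you agree with the proposal?"]

doc = paragraphs_to_basic_xml(PARAGRAPHS)
mark_as_question(doc, "p2", ["Yes", "No"], explained_by="p1")
xml_text = ET.tostring(doc, encoding="unicode")
```

<p>The first stage can run automatically. The second needs a person to decide which paragraphs are questions, and other tools can then read that decision straight from the XML.</p>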
<p>So, with that in mind, we produced a web-based XML editor for staff in web publishing departments. The idea was to create an editor customised to the XML schema we&#8217;re using, so that people who are only just XML-literate can still use it. The editor automatically converts PDF documents to basic XML and then presents it for marking up, tweaking and generally-making-better. The result is awesome XML, usable by other tools to do neat things.</p>
<p>ConsultationXML is about to be deployed within DIUS, where it&#8217;ll be used by real people so we can get feedback and make it better. We&#8217;re hosting an installation here, so that you can play with it and give us your thoughts. We hope to make it better &#8212; it&#8217;s not quite finished yet &#8212; but it&#8217;s finished <em>enough</em>, so we&#8217;re getting it out there for people to try. It&#8217;ll be open source just as soon as the lawyers have done their thing.</p>
<p><a href="http://consultationxml.labs.thedextrousweb.com/">Have a play with the beta ConsultationXML editor here</a>.<br />
Update: <a href="http://blog.helpfultechnology.com/2009/02/freeing-data-reducing-pain/">Steph has posted his writeup</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://thedextrousweb.com/2009/02/consultation-xml-reusable-data-dfs-dius/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
		<item>
		<title>DCSF Statistical Releases, the BBC and Better Data Formats</title>
		<link>http://thedextrousweb.com/2009/01/dcsf-statistical-releases-bbc-better-data-formats/</link>
		<comments>http://thedextrousweb.com/2009/01/dcsf-statistical-releases-bbc-better-data-formats/#comments</comments>
		<pubDate>Thu, 15 Jan 2009 14:04:17 +0000</pubDate>
		<dc:creator>Harry Metcalfe</dc:creator>
				<category><![CDATA[Comment]]></category>
		<category><![CDATA[bbc]]></category>
		<category><![CDATA[dcsf]]></category>
		<category><![CDATA[public sector information]]></category>
		<category><![CDATA[reusability]]></category>

		<guid isPermaLink="false">http://www.thedextrousweb.com/?p=47</guid>
		<description><![CDATA[Simon Dickson picks up an interesting story from the BBC&#8217;s Editors&#8217; blog about official releases of statistics. Usually, when the Department for Children, Schools and Families releases new statistics, they&#8217;re given to the media in advance. The media need this lead time to be able to format all their articles and tables and make sure [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://puffbox.com/2009/01/14/bbc-school-league-table-data/">Simon Dickson</a> picks up an interesting story from the <a href="http://www.bbc.co.uk/blogs/theeditors/2009/01/school_league_tables_data.html">BBC&#8217;s Editors&#8217; blog</a> about official releases of statistics.</p>
<p>Usually, when the Department for Children, Schools and Families releases new statistics, they&#8217;re given to the media in advance. The media need this lead time to be able to format all their articles and tables and make sure everything is correct and works properly: this is fair enough, given that the data they&#8217;re working with lives in lots of Excel spreadsheets, with multiple sections, differing layouts and everything else you really don&#8217;t want if you&#8217;re tasked with this kind of job.</p>
<p>Given what they have to work with, the BBC&#8217;s anger is understandable, but perhaps misplaced: why are we still dealing with a bunch of spreadsheets in the information age? Why isn&#8217;t there an API that allows this data to be queried, or at the very least, a standard data format that doesn&#8217;t change from year to year, and doesn&#8217;t rely on proprietary technologies that are hard to work with?</p>
<p>An API or standard data format would allow media organisations to write code which generates the statistics they need <em>every year</em>. They wouldn&#8217;t have to create new tools to be tweaked and tested every time there are new statistics. Better still, it would create a market for someone to create a tool that did this for them, saving them money. Even better than that, it would allow anyone who wants to do something innovative with these statistics to do so far more easily.</p>
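<p>To make that concrete, here is a minimal Python sketch of the kind of code a stable format would allow. Everything in it is hypothetical: no such official format exists in this post, and the column names, school names and figures are invented for the example.</p>

```python
import csv
import io


def league_table(csv_text, measure, top=3):
    """Rank schools by one numeric column, highest first."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    ranked = sorted(rows, key=lambda row: float(row[measure]), reverse=True)
    return [(row["school"], float(row[measure])) for row in ranked[:top]]


# A made-up release in a made-up fixed layout. If the layout stayed the same,
# next year's release could be fed to the same function without any changes.
RELEASE_2008 = """school,local_authority,pupils,pct_five_good_gcses
Example High,Anytown,180,61.5
Sample Academy,Anytown,210,72.0
Test Grammar,Othertown,150,98.0
"""

TOP_TWO = league_table(RELEASE_2008, "pct_five_good_gcses", top=2)
```

<p>The function never needs to know which year it is looking at, only that the column names stay the same. Differing spreadsheet layouts do not offer that guarantee.</p>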
<p>I think I&#8217;m not alone in saying that the case for releasing data properly &#8212; in reusable formats, to everyone, for free, whenever it is possible &#8212; has been made, has been heard and has been widely accepted as valid.</p>
<p>Why are we still fiddling with messy spreadsheets, and bemoaning the fact that we only have days to do what should take hours?</p>
]]></content:encoded>
			<wfw:commentRss>http://thedextrousweb.com/2009/01/dcsf-statistical-releases-bbc-better-data-formats/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
