Muddy Boots

Introduction

The Semantic Web Project was commissioned by the BBC to look at ways of unambiguously identifying the main actors in a BBC news story and using open data sources as a controlled vocabulary to describe them. The project was born out of the original Muddy Boots project, which the BBC commissioned in 2007 as part of the BBC Innovation Labs process. That first project looked at ways the commons (e.g. Wikipedia) could be used to enhance BBC news stories (and the archive) and create new paths into and out of the content. It produced informative results, but its success proved difficult to measure because of the subjective nature of the metadata created (how do you measure whether a link is interesting?).

Semantic Web Project Aims

The Semantic Web Project was initiated to create a measurable and testable system that can be run against a corpus of BBC news articles to test its accuracy. The system's main aim is to 'unambiguously identify the main actors in a BBC news story'. In the first instance, this means being able to recognise the people and companies mentioned in any BBC news story. Once the main actors/entities have been identified in a story, each needs a unique reference to describe it (to ensure that the entities are unambiguous). The original Muddy Boots project used Wikipedia as its main data source; this project uses DBpedia instead, as it provides an easier query interface and a richer, semantically linked dataset. The system relates the identified actors to the most likely candidates in DBpedia. In doing so, it uses DBpedia as a controlled vocabulary, providing a URI that unambiguously describes each entity identified in the BBC news article (thus creating a linked dataset).
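As a rough illustration of relating an identified actor to its most likely DBpedia candidate, the Python sketch below builds a SPARQL label lookup and then ranks candidates by word overlap between each candidate's abstract and the story text. The project's actual matching method is not described here, so the overlap heuristic, the function names (`candidate_query`, `most_likely`) and the candidate dictionary layout are all assumptions made for illustration.

```python
import re


def candidate_query(name, limit=10):
    """Build a SPARQL query that finds DBpedia resources by exact English label."""
    # Escape backslashes and double quotes so the name is a valid SPARQL literal.
    escaped = name.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
        "SELECT DISTINCT ?entity ?abstract WHERE {\n"
        f'  ?entity rdfs:label "{escaped}"@en .\n'
        "  ?entity dbo:abstract ?abstract .\n"
        '  FILTER (lang(?abstract) = "en")\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def tokens(text):
    """Lower-case a text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def most_likely(candidates, story_text):
    """Return the URI of the candidate whose abstract shares the most words
    with the story, or None when there are no candidates.

    `candidates` is a list of {"uri": ..., "abstract": ...} dicts
    (a hypothetical shape, e.g. parsed from the query results).
    """
    if not candidates:
        return None
    story = tokens(story_text)
    best = max(candidates, key=lambda c: len(tokens(c["abstract"]) & story))
    return best["uri"]
```

The query text could be sent to DBpedia's public SPARQL endpoint, with the ranking step applied to whatever candidate list comes back. Word overlap is only a stand-in for a real disambiguation measure, but it shows where the story context enters the decision.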

What does it look like?

There are three main ways to access the project.

The Microformat visualisation component marks up a BBC news story with Microformats for the main actors. It also adds a 'Featured Actors' sidebar component that uses data from DBpedia to provide extra information about the actors in the story (including picture, abstract and homepage). A sample preview is shown below:

The stories are best viewed using a microformat-enabled browser, such as Firefox with the Operator add-on.
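To give a feel for what marking up the main actors might involve, the Python sketch below wraps each identified name in hCard markup that links to its DBpedia URI. It assumes hCard is the microformat in use (the page does not say which one), and the helper names `hcard` and `mark_up` are hypothetical rather than part of the Muddy Boots code.

```python
import re
from html import escape


def hcard(name, uri):
    """Wrap a name in minimal hCard markup pointing at its DBpedia URI."""
    return '<span class="vcard"><a class="fn url" href="{}">{}</a></span>'.format(
        escape(uri, quote=True), escape(name)
    )


def mark_up(text, actors):
    """Return HTML for plain `text` with every known actor name marked up.

    `actors` maps a name to its URI. Names are matched in a single pass,
    longest first, so a short name never rewrites markup already emitted
    for a longer one.
    """
    if not actors:
        return escape(text)
    names = sorted(actors, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(n) for n in names))
    parts, last = [], 0
    for match in pattern.finditer(text):
        parts.append(escape(text[last:match.start()]))  # untouched story text
        parts.append(hcard(match.group(0), actors[match.group(0)]))
        last = match.end()
    parts.append(escape(text[last:]))
    return "".join(parts)
```

For example, `mark_up("Gordon Brown spoke today.", {"Gordon Brown": "http://dbpedia.org/resource/Gordon_Brown"})` yields the sentence with the name wrapped in a `vcard` span. A tool such as Operator can then pick up that span as a contact.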

The BBC Music Beta visualisation component looks at the stories the Muddy Boots system has processed and links each one to the relevant artist page, using the MusicBrainz GUID as the linked dataset identifier. A sample preview is shown below:
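The linking step can be pictured as turning a MusicBrainz GUID into an artist page address. The Python sketch below assumes artist pages live under `http://www.bbc.co.uk/music/artists/` followed by the GUID. That URL pattern, the function name `artist_page_url` and the sample GUID are assumptions for illustration, not details confirmed by this page.

```python
import uuid

# Assumed base address for BBC Music artist pages (not stated in this document).
ARTIST_BASE = "http://www.bbc.co.uk/music/artists/"


def artist_page_url(mbid):
    """Build an artist page URL from a MusicBrainz GUID.

    Raises ValueError if `mbid` is not a well-formed GUID, so malformed
    identifiers are caught before a link is published.
    """
    canonical = str(uuid.UUID(mbid))  # normalises case and hyphenation
    return ARTIST_BASE + canonical


# Placeholder GUID, used only to show the shape of the result.
sample = artist_page_url("12345678-1234-5678-1234-567812345678")
```

Because the same GUID identifies the artist in both datasets, no name matching is needed once a story has been tied to a MusicBrainz entry.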

I want to try it!

It's really easy:

  • To submit a new story, click the 'Submit story' link in the sidebar.
  • To view existing stories, click the 'View Stories' link in the sidebar.
  • To view music artists and their related stories, click here.

How do I find out more?

Contact Rattle Research. We'd love to hear any comments you have, or suggestions for improvement.

Credits

Design and build by Rattle Research.