Category Archives: Semantic Web

Canon Flux

Blue Box – by Brainless Angel, Creative Commons, via Flickr

Well, that’s been an interesting couple of weeks. I’ve made progress on both the RDF/Ontology and the Ruby/Rails front – although one much more significantly than the other. We’ll deal with both in a moment, but first a few encouraging signs.

Thanks, as ever, to those who have commented on my posts so far – some interesting questions have been posed, which I’ll get onto later, and ideas have been discussed. One thing that has encouraged me a great deal has been this article by Tom Scott and Michael Smethurst on coherence at bbc.co.uk – most of which I’m familiar with from work, but the references to non-linear narratives, the BBC as a story-telling organisation, and how to adapt that for the web, are of particular interest and encouragement, because narrative and story-telling are at the heart of what I’m trying to achieve. Michael also wrote this article over at the Radio Labs blog, which is gathering plenty of praise (not least from Tim Berners-Lee himself!) – from my perspective, it’s given me a useful focus on the steps I’d need to take to move these ideas from blog posts to a working prototype, and hopefully beyond.

So, how have I been getting on with a) developing a front-end website for exploration and administration of a fictional universe (by administration, I mean the creation of new elements in the ‘toy-box’ and the links between them), and b) an ontology (and accompanying RDF examples) to describe the narrative content of episodes?

Ruby/Rails – A Web Front End

One half of this project has always been focused on providing a front end application/web site. One in which users (I won’t limit or define *who* they might be at the moment) could navigate and explore a fictional universe, in a wider, more open format than the current focus strictly on the episodes. A suggested approach to the project as a whole has been to create the web app first, use that to produce and store the data in a MySQL (or similar) database, and then expose the data as RDF etc. Unfortunately, my lack of technical expertise has severely hindered my progress on this front. Over a week and a half has been spent on just getting Ruby/Rails up and running properly, and learning the basics (for which, thanks must go to Anthony Green and Craig Webster in particular for being patient and offering help whenever possible). Although it’s still a strand which I think is important, and would like to develop, I’ve been worried that concentrating solely on the Ruby/Rails side is taking me away from the semantic web/linked data roots of the idea, which I’d prefer to get sorted out first. I’ve also realised that before I can really begin to develop the front end properly, I need to know the scope and domain model inside out. Both of these I had a fair idea of, but the domain model in particular was very much a work in progress – and so I felt there was less value in developing the application until I had it sorted out. Note my use of the past tense to describe the domain model – which leads us nicely on to…

OntoMedia – An Ontology for Describing the Narrative Content of Media

This is a story of serendipity (which reminds me, of course, of the Jon Pertwee story ‘The Green Death’, aka ‘The One with the Giant Maggots’, in which the concept of serendipity plays an important role – anyway, where was I?). I’ve previously described in detail my frustrations with the tantalising prospect of the SUDS ontology – something which several people have helpfully mentioned as a good starting place, but for which an actual ontology specification has been lacking. I’m still pursuing the SUDS material, thanks to Kim in the comments, but I’ve managed to get my hands on an ontology which might just be what I need – OntoMedia. A chance meeting with Mike Jewell at the last OpenSoho (see, networking can be useful) led to a discussion of this project – and it turns out that whilst at the University of Southampton, Mike and Faith Lawrence (amongst others) developed an ontology called OntoMedia for doing just as described in the heading. It has its roots in an exploration of online fan fiction, and is extremely detailed and flexible. The fan fiction roots also mean that it has been designed with geeky subjects like Doctor Who in mind, which is a bonus. However, being so detailed and tailored to the fan fiction roots means that, speaking personally, it sometimes focuses a little too much on fantasy genre elements (detailed descriptions of clans, bonds, blood oaths, woods and coppices etc), whilst seemingly lacking a couple of minor basics (although I’m still getting to grips with it, so it’s possible that I’m just missing the obvious bits…!). But that’s not to knock it at all – it’s a highly accomplished piece of work, and allows all kinds of narratives to be described. Since our initial meeting, I’ve been discussing the possibility of developing and improving the ontology – I truly believe that with a little more work, it brings me a huge step closer to my goal, and could end up being widely used throughout the BBC. To be honest, I’m just surprised that no one else had picked up on its potential yet.

I think I’ll leave a detailed description of how to go about implementing stuff in OntoMedia for another blog post, but what I can do is give you a flavour of the basic principles. Essentially you establish the existence of (at least) two universes – reality, and the fictional universe. Within the fictional universe you establish a timeline, your characters, locations etc, and link your characters to defined actors in the ‘real world’. Here, we can deal with characters and elements which are of dubious or multiple origins – we can essentially define concepts that are shared between media, and their provenance as part of a universe (or context). I’ve also then defined episodes as being things existing in the real world, with their own timelines – the episodes are then linked into the bbc.co.uk/programme equivalents. Finally, you establish events which can occur in multiple timelines (and in different orders within those timelines). That’s the principle, at least. For me, it all harks back quite nicely to that ‘toy-box’ analogy. You set the scene, choose your characters, then tell the story. It’s also important to bear in mind that we’re not trying to restrict creativity and lay down the law for what happened and when – to use the analogy from within Doctor Who, some things are fixed points in time (i.e. the stuff shown on screen), others are in flux.
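As a rough illustration of those principles, here is a minimal Python sketch. It does not use OntoMedia's real classes or property names: `Entity`, `Event`, `Timeline` and the universe labels are hypothetical names made up for this example. It links one actor in reality to one character in the fictional universe, and places the same two events in different orders on two timelines.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    """Anything that exists within a given universe (context)."""
    name: str
    universe: str  # hypothetical labels: "reality" or "fiction"


@dataclass(frozen=True)
class Event:
    """Something that happens; it may sit on several timelines."""
    label: str


@dataclass
class Timeline:
    """An ordered sequence of events belonging to one universe."""
    name: str
    universe: str
    events: list[Event] = field(default_factory=list)

    def position(self, event: Event) -> int:
        """Index of the event on this timeline."""
        return self.events.index(event)


# One actor in reality, one character in the fictional universe.
actor = Entity("Example Actor", universe="reality")
character = Entity("Example Character", universe="fiction")
portrayed_by = {character: actor}

arrives = Event("Character arrives")
leaves = Event("Character leaves")

# The same events, ordered differently in the story and in the episode.
story_time = Timeline("Fictional chronology", "fiction", [arrives, leaves])
episode_time = Timeline("Episode running order", "reality", [leaves, arrives])

print(story_time.position(arrives), episode_time.position(arrives))
```

The point of keeping `Event` separate from `Timeline` is that an event is defined once and then placed wherever it occurs, rather than being copied per timeline.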

As for my progress so far, I’ve been helped by Yves Raimond in particular, who reminded me of the benefits of writing N3 triples, and by Patrick Sinclair and Nicholas Humphrey, who offered other guidance. I’ve been working to two case studies. The first is to eventually show the benefits of linking characters and events across several episodes – for this, I’ve defined the scope as the 2005 series of Doctor Who (including The Christmas Invasion), with the intention to show the Bad Wolf arc (I can then extend this to cover the second, third and fourth series). The results can be seen here and here. (You’ll need an RDF extension like Tabulator for Firefox to navigate the links properly).

The second case study is designed to highlight the benefits of exploring events in the fictional universe and comparing them with the order in which they occur within a given episode – so that the skill with which the writer has constructed the story can be fully appreciated, and the enjoyment of the story can be increased. For this case study, I’ve chosen to concentrate on the award-winning story from the 2007 series, ‘Blink’ – famous for its use of multiple, interconnected timelines – very ‘timey-wimey’, as they say. Results so far, which just set up the timeline, the episode, the characters, actors and locations, can be found here.
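To make the comparison concrete, here is a small Python sketch that finds the pairs of events whose relative order differs between the fictional chronology and the episode's running order. The event labels are placeholders rather than the actual plot of the episode, and `reordered_pairs` is a hypothetical helper name.

```python
from itertools import combinations


def reordered_pairs(story_order, episode_order):
    """Return pairs (a, b) where a precedes b in the story but follows it on screen."""
    shown_at = {event: index for index, event in enumerate(episode_order)}
    return [
        (a, b)
        for a, b in combinations(story_order, 2)  # a is before b in the story
        if a in shown_at and b in shown_at and shown_at[a] > shown_at[b]
    ]


# Placeholder events, not taken from any real episode.
story_order = ["message recorded", "message found", "message understood"]
episode_order = ["message found", "message recorded", "message understood"]

print(reordered_pairs(story_order, episode_order))
```

Each pair returned is a place where the writer has chosen to show an effect before its cause, which is exactly the sort of thing a 'story order versus screen order' page could highlight.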

Events and occurrences are, by their nature, a little more complex, and I’m currently trying to get my head around how best to represent them – the OntoMedia ontology allows extremely detailed representations of the data, but I’m trying to stick to simple representations for the moment – the achievement of which is my current challenge.

The ontology allows, essentially, the description of any narrative. Which leads me to a potential further case study. Obviously for the moment I’ve been concentrating on fictional universes – but this could easily apply to the real world. Could this be a way to describe events and blend the semantic web into other areas of the BBC’s output in an easier and more subtle way? For instance, coverage of a football match – again, define the teams, the players, the timeline of the match and the various events. Then, again, we would have permanent, stable URIs for each team, player, event – I think the possibilities and potential are huge.

Finally, in terms of my overall approach – my current thinking is to continue with writing the RDF, then load it into a triple store. An application would then be written to allow the querying of data in the triple store, and its representation in a well designed, user facing front end. If there are standard patterns in the RDF for creating characters, events etc using OntoMedia, then ideally the application would take these recipes and allow the user to input the data without having to interact directly with writing RDF.
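A 'recipe' of that kind might look something like the Python sketch below: the user supplies two names, and the function fills in a fixed pattern of triples. This is only an assumption about what such a pattern could be. The `http://example.org/` namespaces and the `Character` and `portrayedBy` terms are hypothetical stand-ins, not OntoMedia terms; only the RDF and FOAF URIs are real. The example names are the lead character of ‘Blink’ and the actor who played her.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF = "http://xmlns.com/foaf/0.1/"
BASE = "http://example.org/universe/"  # hypothetical base for minted URIs
EX = "http://example.org/terms/"       # hypothetical vocabulary, not OntoMedia


class Literal(str):
    """A plain text value, as opposed to a URI."""


def slug(text):
    """Turn a display name into a URI-friendly fragment."""
    return "-".join(text.lower().split())


def character_recipe(name, played_by):
    """Return the triples describing one character and the actor who plays them."""
    character = BASE + "character/" + slug(name)
    actor = BASE + "actor/" + slug(played_by)
    return [
        (character, RDF_TYPE, EX + "Character"),
        (character, FOAF + "name", Literal(name)),
        (actor, RDF_TYPE, FOAF + "Person"),
        (actor, FOAF + "name", Literal(played_by)),
        (character, EX + "portrayedBy", actor),
    ]


def to_ntriples(triples):
    """Serialise the triples as N-Triples lines, ready to load into a triple store."""
    lines = []
    for s, p, o in triples:
        if isinstance(o, Literal):
            escaped = o.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        else:
            obj = f"<{o}>"
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


triples = character_recipe("Sally Sparrow", "Carey Mulligan")
print(to_ntriples(triples))
```

A web form in the front end would only need to collect the two names; the person filling it in never sees the RDF.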

So there we are – a great deal of progress – not all the way there, but a huge step forward – although the phrase ‘Standing on the Shoulders of Giants’ does come to mind… Once I’ve worked out how to represent events and occurrences, the triple store will be next, then the Ruby/Rails application, and then some design magic. Wish me luck!

Tuning Fork

Tuning Fork, by Toby Esterhase, via Flickr – Creative Commons

Part three of my investigation into fictional content modelling. See the previous two posts for the background to the project. Thanks to those who’ve been discussing the ideas – I think it’s coming along nicely. I’ve been playing around with writing some RDF, trying to link up various ontologies, and explaining what I’m trying to do as I go along. Here’s a plain text file of quasi-RDF within comments – see what you think…(UPDATE: Now here in beautiful RDF format 🙂 )

One thing that has come up in the discussions, though, is that there’s perhaps two elements to what I’m trying to achieve. The first is to link existing ontologies and, if needed, build a new one, to help describe the narrative content of ‘stories’ within the context of television and radio programmes. The second is to experiment (and for me to learn) with existing ontologies, again, linking them up, to build dynamic and interesting webpages that work on linked data principles.

So I’m interested in the ontology *and* what kind of cool stuff we could build on top of it (which includes ideas around remixing narrative, and audience story-telling). I haven’t got any definite plans on top of that at the moment, but I think the key is to see where it takes us. Well, I have an image in my mind of the types of things we could do, but again, it will be easier to describe them by prototypes. Something that might help is if I was to link to this diagram, from the aforementioned Tristan Ferne’s Radio Labs blog, describing similar things to do with the Archers – except linking that up with linked data/ontology work…

Which would lead to something like the diagram below. Again, it isn’t a complete set of what I want to do, but it shows the types of objects we’re talking about, the relationships between them, and where they link to ontologies:

Contextual Data Model


Actors – Using FOAF, with possible extensions, this would be a URL for each actor who appears in a BBC show. This page could pull in a biography from Wikipedia, for instance, but mainly it will show the audience all the programmes that the actor has appeared in. Linking Actors to Characters, all the way through to Episodes, would allow us to auto-generate the cast lists for the /programmes episode pages. However, one problem in an early implementation might be that if we only record ‘significant’ events within an episode, the cast lists won’t represent everyone – but over time, this could be improved (the rest of the cast could possibly be listed manually against the episode, greyed-out, until they have their own URL).

Portrayal – This would allow an Actor to play many Characters, and a Character to be played by many Actors. Here I’m thinking more of ‘flashback’ scenes where you see a character as a child, but as Tom pointed out in the comments, this could be used to handle the different actors playing the Doctor. But how then would you deal with the different ‘characterisations’ of the same character?

This is where the recursive relationship around ‘Character’ comes in – I haven’t worked out exactly what to call this yet, but it would allow both the foaf:knows relationship, and potentially use the owl:sameAs to link different Doctors? (Perhaps not – but something along those lines).

Again, a many-to-many resolver is needed between Characters and Events, which I’ve called ‘Action’ – I’m not sure whether these many-to-many objects would need to be made explicit and have their own URLs, but the main objects certainly would, as they could have useful pages for the audience to explore.
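One way to think about whether a resolver deserves to be an explicit object is to ask whether it carries any information of its own. The Python sketch below treats Portrayal as a small object with an optional note (for instance, a flashback casting). The identifiers and the `note` field are hypothetical, chosen only to illustrate the many-to-many shape.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Portrayal:
    """Resolver object joining one actor to one character."""
    actor: str
    character: str
    note: str = ""  # information that belongs to the pairing itself


portrayals = [
    Portrayal("actor-a", "character-x"),
    Portrayal("actor-b", "character-x", note="flashback, as a child"),
    Portrayal("actor-a", "character-y"),
]


def actors_for(character):
    """Every actor who has played the given character."""
    return sorted({p.actor for p in portrayals if p.character == character})


def characters_for(actor):
    """Every character the given actor has played."""
    return sorted({p.character for p in portrayals if p.actor == actor})


print(actors_for("character-x"), characters_for("actor-a"))
```

If a resolver only ever holds the two identifiers, it could probably stay implicit; once it gains something like `note`, giving it its own URL starts to look more reasonable.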

Events would be pages that would describe a significant event in the episode, something that would be worth describing, for instance an event which is part of a wider story arc – we would then need a URL to link these together, so you could say that ‘Someone points out that Donna has something on her back’ is part of the ‘Donna/Time-Beetle’ story arc (apologies for the random example!). This is, though, where the main value of the project would be for the audience. By giving an event a URL, the user could trace storylines throughout the episodes, outside of the confines of the episode structure – making the fictional universe more cohesive, rather than restricting our view to the episodes, which are like ‘windows’ onto the fictional universe.
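Tracing a storyline in this way amounts to filtering events by arc and ordering them by episode. Here is a minimal Python sketch, with made-up event, arc and episode identifiers; `trace_arc` is a hypothetical helper, and a real version would use URLs rather than short strings.

```python
# Each event records the episode it appears in and the arcs it belongs to.
events = [
    {"id": "event-c", "episode": 11, "arcs": {"arc-1"}},
    {"id": "event-a", "episode": 4, "arcs": {"arc-1"}},
    {"id": "event-b", "episode": 4, "arcs": {"arc-2"}},
]


def trace_arc(arc):
    """List (episode, event) pairs for one story arc, in episode order."""
    in_arc = [e for e in events if arc in e["arcs"]]
    return [(e["episode"], e["id"]) for e in sorted(in_arc, key=lambda e: e["episode"])]


print(trace_arc("arc-1"))
```

Because `arcs` is a set, a single event can belong to more than one arc without being duplicated.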

Similarly, if a user then wanted to write a story featuring some of the characters, they could refer to the character’s URL (which would then allow us to have something on the character’s page to say ‘others have written stories using this character’ – linking out onto the web, and promoting new writers and stories). The users could equally refer to events, perhaps building events into their own stories, taking them as cues for new stories etc. Again, it all fits in with the idea of giving our audience the tools to be creative, whilst using the advantages of the BBC website’s exposure to promote audience creativity.

There’s one many-to-many resolver which I’m not sure about at the moment – between Events and Episodes – what if the same event was shown, or even just referred to, in more than one episode? We would need some way of defining this – but I’m not sure of the correct term for it yet, hence the ‘???’ object.
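Whatever the ‘???’ object ends up being called, it is the last link in the chain that would let cast lists be generated automatically. The Python sketch below calls it `depictions` purely as a placeholder name, and uses made-up identifiers; it walks Episode to Event to Character to Actor.

```python
# Resolver tables as sets of pairs (all identifiers are made up).
portrayals = {("actor-a", "character-x"), ("actor-b", "character-y")}  # (actor, character)
actions = {("character-x", "event-1"), ("character-y", "event-2")}     # (character, event)
depictions = {                                                         # (event, episode)
    ("event-1", "episode-1"),
    ("event-2", "episode-1"),
    ("event-2", "episode-2"),  # the same event shown in a second episode
}


def episodes_showing(event):
    """Every episode in which the event is shown or referred to."""
    return sorted(ep for ev, ep in depictions if ev == event)


def cast_list(episode):
    """Derive (character, actor) pairs for an episode from its recorded events."""
    events = {ev for ev, ep in depictions if ep == episode}
    characters = {ch for ch, ev in actions if ev in events}
    return sorted((ch, actor) for actor, ch in portrayals if ch in characters)


print(episodes_showing("event-2"))
print(cast_list("episode-2"))
```

This also shows the limitation mentioned under Actors: anyone who takes part in no recorded event never reaches the derived cast list.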

So – events could be described using the Event Ontology. Actors and Characters would use the FOAF ontology. Episodes would use the Programmes Ontology. We therefore just need a way of tying them together, and then once we have some examples, it would be good to start thinking about what new things we might need from a new ontology.

On the subject raised in the comments about expressing a person in FOAF as fictional or real – I’d side with Tom in saying that it would be better to label the individual people as fictional, so that it was explicit which FOAF people were characters or not – and then you’d also have the issue of characters being used to represent, for instance, historical figures such as Charles Dickens…

Anyway, that’s enough for this entry. I hope I’ve got a little further in both clarifying the two strands of the idea, and exploring the breadth and potential of it. Comments, discussion, etc. encouraged! I’m hoping to present the idea in a meeting this coming Tuesday as a possible 10% time project, so I will keep you posted…

Baby Steps

Photo by strollerdos, via Flickr, Creative Commons

This is the second post in a series covering my exploration, experimentation and musings in the area of fictional modelling. In short, can we use the recent developments in semantic web technologies to represent elements of fictional content, and what does this allow us to do? For my introduction to the topic, see my previous post here. In this entry, I’ll talk about my first practical steps, and their implications. Thanks also go to Tom Scott, Dan Brickley and Anthony Green, amongst others, who responded to the first post with helpful comments.

Before I go any further, as pointed out by Chris Sizemore, it’s worth noting that work has been done in a similar area before. Previous IAs at the BBC, including Celia Romaniuk, worked on an ontology to describe the content of soap operas, known as SUDS. From what I have seen, it was an extension to FOAF in order to describe further relationships between people, the nature of people ‘playing’ characters, and various events that could take place between the characters in a show. This was done to tie in with an Eastenders website relaunch. I won’t go into much more detail here, but if you’re interested in seeing the original work, there’s a short article here and a great presentation here. Unfortunately, apart from a few example XML fragments, I have so far been unable to find a document that defines the SUDS ontology. This is a shame, because it would have been an extremely useful starting point for my experiments. One option might be to gather the examples together and try to reverse-engineer a schema, but for the moment, and partly as a way for me to learn as much as possible, I’ve decided to start from scratch. Hopefully at some point we can find the SUDS ontology and see how it compares to what I come up with.

So, where to start? Well, as the title suggests, I’m going to start small. Sort of. Readers of the blog, and others who know me, will probably have guessed that I’m a bit of a, shall we say, ‘fan’ of the BBC’s Doctor Who (currently in the news for apparently appointing a 12-year-old as the Eleventh Doctor). So much so, that in my sad little way, most things that I’m presented with in the course of my BBC IA work make me think “How could/would this apply to Doctor Who?”. As a programme that originally ran for 26 years, and has been enjoying an overdue renaissance, its rich history, and sheer refusal to ever completely conform to most IA domain models, make it both a source of frustration and inspiration. So when I read Tristan Ferne’s blog post over at BBC Radio Labs, shortly before joining the Beeb, I began to wonder. Have a read, it’s a good example of a similar idea.

Tristan’s article concerns fictional modelling for another hugely successful BBC show, The Archers. He talks about being able to break an episode down into scenes, characters, plots etc. and, for instance, potentially being able to build pages that allow the user to follow a story through multiple episodes, rather than being tied to the traditional episode format. Of course, to paraphrase Jack Bauer, events within The Archers occur in linear time. If we were able to build dynamic and interesting websites from a show like that, centred around a small English village, how about a show that goes forward, back and sideways in time and space? Harking back to my ‘toy box’ analogy from last time, with the imagination of the writers of a show like Doctor Who, and the imagination of our audiences, the potential to create some fantastic websites would be huge.

Sorry, where was I? Oh yes, starting small. So, yes, obviously I couldn’t hope to cover the whole scope of the show in one go. However, to show the potential of the semantic web and linked data approach, I’d want to start off by experimenting not only with characters who are linked together, but with a plot that is threaded through several episodes. I still haven’t quite decided what I’m going to choose for this, but I’m thinking that the story arc from either the first or fourth series of the current show would be good to try. But before all that, I had to learn how to create some linked data.

So I went even smaller, even simpler. I chose the first ever episode of the show, from 1963. This featured four main characters, and thanks to the workshop from Yves and the others, I had an inkling of an understanding of how to create FOAF profiles. The results can be seen here (best viewed if you use a Firefox plugin like Tabulator). So far so good. I then linked each character to the other, using the simple ‘knows’ relationship. Finally, to get my linked open data brownie points, I linked each character to its DBpedia equivalent, using the OWL ‘same as’ relationship. And that’s basically it. Except…
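For anyone who wants to see the shape of that data, here is a Python sketch that builds the same three kinds of statement as plain triples: each character is a `foaf:Person`, each `foaf:knows` every other, and each is `owl:sameAs` a DBpedia resource. It is a reconstruction, not the original RDF file. The `http://example.org/who/` base is hypothetical, and the DBpedia resource names for the episode's four regulars are assumptions that would need checking against DBpedia itself.

```python
from itertools import permutations

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF_PERSON = "http://xmlns.com/foaf/0.1/Person"
FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
BASE = "http://example.org/who/"  # hypothetical home for the character URIs
DBPEDIA = "http://dbpedia.org/resource/"

# Local identifier -> assumed DBpedia resource name (unverified).
characters = {
    "the-doctor": "First_Doctor",
    "susan": "Susan_Foreman",
    "ian": "Ian_Chesterton",
    "barbara": "Barbara_Wright_(Doctor_Who)",
}

triples = []
for local_id in characters:
    triples.append((BASE + local_id, RDF_TYPE, FOAF_PERSON))
for a, b in permutations(characters, 2):  # every ordered pair: a knows b
    triples.append((BASE + a, FOAF_KNOWS, BASE + b))
for local_id, dbpedia_name in characters.items():
    triples.append((BASE + local_id, OWL_SAME_AS, DBPEDIA + dbpedia_name))

print(len(triples))
```

Since `foaf:knows` is stated in both directions here, four characters produce twelve 'knows' statements.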

Except even this small experiment (which I eventually got working after help from Yves!) raises some interesting points. Firstly, the pernickety part of my brain is saying that we’re mixing two distinct things here. We’re using FOAF, which, I guess, and am happy to be corrected, is primarily intended to represent real people, to model fictional things. Crucially, nowhere, at the moment, are we explicitly stating that these resources are fictional characters.  So I’m wondering whether FOAF is the correct ontology to use. Of course, like SUDS, the ontology that results from these experiments will probably be an extension of FOAF, as it is true to say that we’re still modelling the same sort of ‘thing’, the relationship between ‘people’. But the point still stands – that somehow we need some way of indicating the ‘fictional’ nature of the FOAF person, if applicable.
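One possible way of indicating that fictional nature is to give the resource a second type alongside `foaf:Person`. In the Python sketch below, `FictionalCharacter` is a hypothetical class in a made-up namespace, not part of FOAF or any published ontology, and the two resources are invented.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF_PERSON = "http://xmlns.com/foaf/0.1/Person"
FICTIONAL = "http://example.org/terms/FictionalCharacter"  # hypothetical class

character = "http://example.org/who/some-character"  # made-up resource
actor = "http://example.org/people/some-actor"       # made-up resource

triples = {
    (character, RDF_TYPE, FOAF_PERSON),
    (character, RDF_TYPE, FICTIONAL),  # the explicit 'this is fiction' flag
    (actor, RDF_TYPE, FOAF_PERSON),
}


def is_fictional(resource):
    """True if the resource has been explicitly typed as fictional."""
    return (resource, RDF_TYPE, FICTIONAL) in triples


print(is_fictional(character), is_fictional(actor))
```

Anything that only understands FOAF still sees two people, while anything that knows about the extra class can tell them apart.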

Secondly, and perhaps more importantly, as Anthony Green pointed out, and as I discovered when I linked the characters to their DBpedia equivalents, there’s a lot of detailed information out there already. When I linked each character to DBpedia, I got back information which was extremely detailed and fairly well structured. Which, to be honest, depressed me a little bit. Was it worth me continuing? It’s clear that others had done a lot of similar work already, and I knew that ultimately it would be silly to reinvent the wheel.

However, then I remembered what data I was trying to link. Of course I should still link to the DBpedia equivalents, but the linked data I am thinking of is more to do with linking between characters, plots etc within my own domain. I’m still slightly uneasy with this, because I know that obviously the main thrust of the whole linked data movement is to link external sources together, and that creating silos of data is not good. However, I’m still definitely in favour of linking to DBpedia – if we were to make our ‘internal’ linked data semantically rich, and then link to external sources, then everyone would benefit, and in a way, we would be regarded as the ‘master’ source in the same way that, in my small experiment, I used DBpedia as my ‘master’ source.

So that’s it. A long, rambling blog post, and a small, simple experiment. Baby steps. Apologies for the rambling, and I’m not sure that I *quite* explained myself properly in that last part – but there are definitely some interesting issues coming up already, and I’m hoping that the advantages of my position will be borne out in future experiments. Finally, I’ve adapted the RDF file that I used to create the FOAF profiles to temporarily remove the OWL ‘same as’ relationship – just to ease the page loading time, and to, for the moment, give me a cleaner space to work in. The adapted version is here, the original version here. Linking back *in* to DBpedia will be a task for later…

Again, comments, queries and advice are more than welcome – comment, twitter or email me.

Content Modelling and Storytelling

BBC Television Centre in panoramic view – by strollerdos, from Flickr (Creative Commons license)

During the past year or so, the team at bbc.co.uk/programmes have been putting together a resource which allows people to access information about the BBC’s output in a structured way. This has been done using the principles of the semantic web, and of Linking Open Data. I’ll not go into great detail here, other than to explain the basics. When people think of the content of the web today, they think of ‘websites’ and ‘webpages’. These sites and pages have addresses which are unique, so that when you type the address into your browser, you’ll know where to expect to be directed to. On these websites and webpages, people can write about all sorts of things, any number of topics. And these topics can be repeatedly discussed on various different websites and pages.

What’s missing is the links between these sites and pages. Usually, it takes a search via Google (other search engines are available, folks) to get a good example of this. You type in the topic that you’re looking for, and the search engine will return all the pages it can find where the words that form that topic are mentioned. But there’s no single webpage which represents the topic itself. If there was, and if it had its own, unique address, then everyone could, on their own websites, link to that ‘topic’ as a way of saying “This is exactly the thing I’m talking about.”. If several people did this, their websites would be automatically linked together by the fact they share the same topic – rather than by the fact that someone has created a physical link between one page and another. The former makes more sense, and is much more useful in the long run. Indeed, if these ‘topics’ or ‘concepts’ had their own, unique, permanent addresses on the Internet, then the web would become not only a store of ‘pages’, but of ‘concepts’ which happen to have ‘pages’ related to them – not everything on the web would have to be a ‘page’. Only things that were ‘pages’ (say, of a book…) would then be referred to as a page. We could reclaim the word page, people!

Ahem, I think I’m getting slightly off topic. Anyway, the point is, that the good folks at /programmes are taking these ideas on board, and are creating addresses (i.e. URLs) for each programme that the BBC produces. For instance, an address like this: http://www.bbc.co.uk/programmes/b00gfzhq is an address that uniquely identifies that programme. People all over the web can use that address when discussing that particular programme, so everyone will know exactly what they are referring to. In effect, therefore, although if you type in that address into your browser, you are presented with a ‘page’ about that programme, the address itself represents the programme, rather than the page about the programme. (Because I could post one blog entry saying ‘I loved watching this particular programme’, and another saying ‘I hate this particular page about the programme’).
All this is well and good, but it relies on a solid foundation. These foundations have been constructed over a number of years by people who have been thinking about the structure of what the BBC produces, and the way in which it produces and distributes its content. These structures, or models, include thinking about how a programme is organised into brands (e.g. Blackadder), series (e.g. Series 2), episodes (e.g. ‘Bells’), even versions of the episode (e.g. pre- and post-watershed versions), and then broadcasts of a particular version on a particular channel at a particular time.
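The brand, series, episode, version and broadcast structure can be pictured as a simple nesting. The Python sketch below is only an illustration of that nesting, not the actual /programmes data model; the class names follow the terms in the paragraph above, while the channel names and broadcast times are invented.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Broadcast:
    channel: str
    start: datetime


@dataclass
class Version:
    label: str
    broadcasts: list[Broadcast] = field(default_factory=list)


@dataclass
class Episode:
    title: str
    versions: list[Version] = field(default_factory=list)


@dataclass
class Series:
    title: str
    episodes: list[Episode] = field(default_factory=list)


@dataclass
class Brand:
    title: str
    series: list[Series] = field(default_factory=list)


def broadcasts_of(brand):
    """Walk brand -> series -> episode -> version -> broadcast."""
    return [
        (series.title, episode.title, version.label, broadcast.channel)
        for series in brand.series
        for episode in series.episodes
        for version in episode.versions
        for broadcast in version.broadcasts
    ]


# Channels and times below are invented for the example.
bells = Episode("Bells", [
    Version("pre-watershed", [Broadcast("Channel One", datetime(2009, 1, 1, 19, 0))]),
    Version("post-watershed", [Broadcast("Channel Two", datetime(2009, 1, 1, 22, 0))]),
])
blackadder = Brand("Blackadder", [Series("Series 2", [bells])])

print(broadcasts_of(blackadder))
```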

The modelling around this is by no means complete, and is being refined and improved all the time. However, I would suggest that, for the most part, the modelling around these sorts of things is approaching maturity, in that those structures are fairly well accepted (that’s not to say they won’t change, but I think most people working on these things now agree that there’s areas of the model which are pretty stable). As I have mentioned, I think these structures represent the production and distribution of the BBC’s content. But what about the content itself?

I think what we haven’t yet looked at modelling is the structure of the content – this applies particularly to fiction content, but also, perhaps, to sport. Taking the semantic web ideas into account, if we have unique URLs for each episode, each series, each brand, why not have a URL for each character within a programme, for each event that connects characters, for each place within the programme? Brands, series and episodes are, in effect, building blocks on top of which we and others can create all sorts of semantically interlinked websites (using SPARQL, the semantic web equivalent of SQL queries – querying concepts and the links between them, using the web as a huge datastore). In the same way, if we were to give characters, events and places their own unique addresses, we could mash them up to create sites (and new content) such as timelines from a particular character’s point of view, pages that follow story arcs, etc.
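To give a feel for what 'querying concepts and the links between them' means, here is a Python sketch that imitates the pattern matching at the heart of a SPARQL query over a handful of in-memory triples. It is not SPARQL and not a triple store; the `http://example.org/` URIs and the `involves` and `shownIn` properties are hypothetical.

```python
EX = "http://example.org/"  # hypothetical namespace

triples = [
    (EX + "event/1", EX + "involves", EX + "character/x"),
    (EX + "event/1", EX + "shownIn", EX + "episode/1"),
    (EX + "event/2", EX + "involves", EX + "character/y"),
    (EX + "event/2", EX + "shownIn", EX + "episode/1"),
    (EX + "event/3", EX + "involves", EX + "character/x"),
    (EX + "event/3", EX + "shownIn", EX + "episode/2"),
]


def match(s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def timeline_for(character):
    """Roughly: find every ?event that involves the character, and the ?episode it is shown in."""
    rows = []
    for event, _, _ in match(p=EX + "involves", o=character):
        for _, _, episode in match(s=event, p=EX + "shownIn"):
            rows.append((episode, event))
    return sorted(rows)


print(timeline_for(EX + "character/x"))
```

A real query engine does the same kind of joining, but across data published at many different addresses rather than one list in memory.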

I like to imagine that the ultimate would be something where every character, event and place in the fictional universe of a programme has an address – and then just like taking toy models of those things, we and the audience could make our own stories from them. You want to add your own characters to the mix? Sure, give them a unique address (for instance, in your own webspace), and start linking them to other characters, events etc. One final point to bear in mind as a caution, however. Although essentially all this structure is a good thing, one thing we should be wary of is creating too much structure, and limiting what we can do with it. If we fix things down too much, saying that this is the ‘official’ version of a particular event, or background or character, in too rigid a way, this will limit creativity and stimulate only arguments about who’s right or wrong. We want to give people the building blocks and toys to create new stories, but we don’t want to restrict the stories they tell.

So, that’s the theory. Now for the practice. I’ve made it my mission to explore these ideas, and experiment with making them a reality. Thanks to a great presentation the other day by Yves Raimond, Nicholas Humphrey and Patrick Sinclair, I’m getting to grips with ontologies such as FOAF. I’m going to be using FOAF and the Events Ontology in particular to try and express stories in a semantic way, and see whether we need a new ontology for storytelling, and what we need from it. As I say, it’s going to be very much trial and error. I could sit here with an extremely detailed plan, working out all my structures and linking everything up straight away, or even not getting started until I’ve got a new ontology in place and working just right. But I think it would be more useful to get in there, try things, discuss them and come back with new ideas. In short, I may not do things correctly the first, second or even third time, but it’s a case of experimenting, and seeing what works, how else it could work, and what’s the best way of doing things.

With that out of the way, I invite you to join me. Please give me feedback on what I discuss here, and what I construct. It’ll be extremely welcome and will hopefully help us build a greater way of doing things even faster. And it’s already begun – I’ll write a separate blog post soon on my first forays into the world of Fictional FOAF modelling….