MAVIR Corpus: video

Transcription conventions

@Title: Multimedia Retrieval and Evaluation
@File: mavir10.xml
@Participants: PAU, Paul Clough, (man, C, 3, associate professor, lecturer, Sheffield)	
@Date: 27/11/2008
@Place: Madrid
@Situation: Conference (III Jornadas MAVIR), conference room, not hidden, researcher as observer
@Topic: evaluation and development of multimedia information retrieval systems
@Source: MAVIR 
@Class: formal in natural context, conference, monologue
@Length: 1h 27' 24"
@Words: 15659
@Acoustic_quality: A
@Transcriber: Sergio Calvo Páez
@Revisor: L. Campillos, M. Garrote
@Comments:

PAU  so as [/] as Anselmo &uh just told you / my talk today // my lecture today is gonna / consist mainly of thinking about or talking about trends / in multimedia retrieval / and &uh today I'm really going to be focusing on access to visual material  00:15

PAU  so not really gonna be talking about audio / &uh so that would be a kind of another / &uh lecture in its own right  00:22

PAU  &um but I also wanna then talk / so first we're gonna summarize and just &uh give you an overview / of how do we access &uh visual media / that not &on [/] only includes images but also includes videos as well  00:33

PAU  &uh then I want to talk a little bit about evaluation / so evaluation is key / if we want to keep developing &uh / good systems  00:40

PAU  ok? // so we must be able to evaluate the systems // the multimedia systems that we're building  00:44

PAU  &uh and in particular as Anselmo said I'd like to talk about ImageCLEF / which is one of the evaluations I've been involved with for a few years now  00:52

PAU  &uh if / &uh we have any time left at the end / what I’d also like to do is just give you a few examples of some of the research I've been doing / over the past year or two / &uh related to &uh multimedia retrieval / just to give you a flavour / for some of the things that we do at Sheffield  01:07

PAU  so first of all let me just &uh / make sure that you all &uh / understand what &uh information retrieval is  01:14

PAU  &uh so information retrieval [/] I work in an information retrieval IR group // and &uh typically information retrieval / &um / you might say is / &uh / the [/] the aim of finding information relevant to a user need / so that includes for example storage / retrieval / presentation / user interaction / and all the other aspects / of &uh searching / &uh for information  01:35

PAU  &uh what type of information ? 01:37

PAU  well we can include texts / &uh images / audio / ASR transcripts / OCR documents // a whole range of [/] of different media types  01:47

PAU  what is a [/] a user need ? 01:49

PAU  well the aim is to find information relevant to a user need / but typically in information retrieval what we're trying to do is find information / about / or on a given topic  01:59

PAU  so normally we kind of assume that the &uh aim or the goal is a kind of thematic / that's a thematic need  so find me documents / on the same topic / as my query  02:09

PAU  &um / that involves typically searching / and browsing as well so browsing xxx [/] browsing is xxx [/] also an important part / of information retrieval  02:18

PAU  &uh / so the goal is to find the information relevant / to a user need  02:23

PAU  &uh relevance itself is a concept  02:25

PAU &uh it's absolutely core to information retrieval  02:28

PAU  there have been numerous papers / books published on the whole concept of relevance  02:32

PAU  &uh because relevance as typically information retrieval people understand it // is about thematic / or topical similarity / but of course it's not just about that  02:42

PAU  ok? // for &some [/] for something to be relevant to a need / &uh it also depends on things such as context // such as some subjective / &um measures / as well  02:50

PAU  so is a [/] is a &docum [/] is a document by a certain author ?  02:54

PAU  is the authority if this is the best document  02:56

PAU  &uh and those aspects which not only are about &uh the theme or &uh topic of a document // they are about other things as well  03:04

PAU  &uh over the &f [/] past few years I guess we might say there've been general trends in information retrieval  03:10

PAU  &uh so information retrieval is &uh / not a new field // &uh it's been around for quite a long time // and the kind of classical research &uh areas in information retrieval were things like / information retrieval models / &um so xxx [/] various types of models for example / language models / &um vector space models and so on  03:27

PAU  &uh evaluation typically on smaller collections / so in information retrieval evaluation / &uh started out / on quite small / &uh collections  03:36

PAU  &uh there was quite a focus in the early years on various query languages / and on document indexing / and &uh typically everything was done on quite small / &uh document collections  03:46

PAU  &uh then of course / &um things changed // and / the kind of information retrieval / &uh area that we are at the moment / and field that we're in / involves a &hume [/] an absolutely massive range of different types / of research areas  03:59

PAU  as of course with the growth of the Internet / and the &um [///] we have things like Internet search engines / web search / &uh we have / markup languages / semi-structured search / not xxx structured search  04:10

PAU  we then have multimedia contents or multimedia search  04:13

PAU  we have distributed collections / there's quite a lot of research there  04:16

PAU  we now have cloud computing // so how do you do IR / over a cloud computing ?  04:21

PAU  &uh we have user interaction as a big / area / &um that is / &uh taking off / still / a lot of work to be done there  04:27

PAU  we have now multilingualism / because of globalisation  04:30

PAU  &uh / we have social search / Web 2.0 / and so on  04:34

PAU  all of these fields now are part of information retrieval as we know it today  04:38

PAU  &uh and / I guess the modern information retrieval / and the retrieval we know today is really being driven by several things such as technology / &uh such as research / &uh / the environments / and the needs of users as well  04:49

PAU  so those have been the driving factors to make retrieval what we have / today  04:53

PAU  so specifically I want to just talk about multimedia information retrieval  04:57

PAU  and &um / there's quite a good [///] this is a quote just from one of the &uh / ACM / grand challenge papers / and this was back in 2005 and two quite well known figures in &uh / multimedia retrieval &uh / Rowe and Jain / &uh basically said this was a grand challenge at the time for multimedia / retrieval / that is make / &uh capturing / storing finding and using / digital media an everyday occurrence in / &uh our computer environments  05:23

PAU  &uh now I &s [/] still believe that that is a grand challenge because we still have a [/] a lot of personal multimedia / &uh and / often it is quite difficult to be able to manage / even our own photo collections for example  05:35

PAU  so there's a long way to go  05:37

PAU  &uh so this I think is a challenge that still exists for researchers today  05:40

PAU  so there's a long way to go / in multimedia retrieval  05:43

PAU  so let me just talk a little bit about / &uh / multimedia retrieval  05:48

PAU  &uh first of all &uh digital multimedia  05:50

PAU  &uh I guess / most of us know that digital multimedia content is growing / pretty rapidly / &uh not only in the commercial sector but the domestic sector as well  05:59

PAU  &uh / and that's driving new forms of interaction // the &inter [/] interaction that we have / with images / speech / video / texts and other forms of unstructured data  06:09

PAU  so we’re meeting multimedia data not only at work / but also we're meeting &multi &uh [/] multimedia data at home as well  06:15

PAU  so just think about how many multimedia devices do you own ? 06:19

PAU  do you own a mobile phone which can take photos / video clips ? 06:22

PAU  do you have digital voice recorders ? 06:24

PAU  do you have / digital video cameras / &uh digital cameras ? 06:26

PAU  &uh interactive TVs and so on ?  06:29

PAU  &uh / we / ourselves / just personally / we have an awful lot of digital &uh capture devices  06:35

PAU  &uh applications of course of digital multimedia are / absolutely vast / so we mustn't just restrict it to the kind of academic domain // but of course entertainment / &uh is a very big market / big field xxx // &uh digital photo software / &uh management / music downloads / e-learning / &uh mobile media and so on  06:54

PAU  these all are applications of effective multimedia retrieval  06:57

PAU  so &multi [/] multimedia information // this kind of xxx [/] &um quite diverse / quite multi-faceted  07:05

PAU  &uh and this is quite a useful diagram just illustrating the various media types that you have  07:10

PAU  so typically the multimedia / &uh we would have sounds / &uh text and image // with sound we might have music / as opposed to spoken document annotations  07:20

PAU &uh and they are quite different because / both of those require quite different forms of / &um / processing  07:27

PAU  and then with text of course we might have captions or subtitles // we could have ASR transcripts from the audio  07:34

PAU  &um / then also we can have &ob [/] &um / &uh obviously &uh bigger texts as well // so we might have an image embedded in a webpage  07:42

PAU  and the text surrounding the image gives us some context / &uh and tells us something about the image  07:47

PAU  &uh then of course with the images we can have &stim [/] still images // there's an awful lot of work to be done there // we can have black and white / we can have colour / we can have various quality and so on... 07:56

PAU   and then we have moving images // we can have animations / which are moving images xxx xxx xxx xxx// and then / going right to the other extreme / we have videos / &uh movies / videos which typically have both moving images together with sounds / together with potentially &uh texts as well // because we can transcribe the audio / &uh to give us texts  08:14

PAU  &uh that gives a very very rich / &uh set of data for which we can then provide search and browse / &uh techniques and functionality  08:21

PAU  so let me just talk a little bit about how we access &uh visual media  08:25

PAU  &uh mainly I want to focus first of all on image retrieval // so how do we / &uh retrieve images ?  08:31

PAU  so first of all the kind of &uh assumption is users want to retrieve / &um a document // so a visual document rather than text  08:38

PAU  that's what I'm assuming // so you are a user / and you &uh go [/] you go to Google images / and the assumption is that you are looking for a [/] an image / to satisfy your need  08:46

PAU  so // a search requests typically xxx expressed &uh either using &uh example images // so / giving an image / find me other images like this one // or more commonly // so that doesn't happen like that when you go to Google images // on Google images you would actually type in &uh [/] &uh [/] &uh [/] &um [/] &uh [/] a written interpretation of your need // so you'd just type in images of / whatever / Madrid / of / &um / whatever you're looking for  09:13

PAU  &um / there's an awful lot of application areas / &uh for image retrieval // &uh for &examp [/] example you might be a researcher / &um searching / digital archives // for example you might be working &uh for the BBC  09:25

PAU  you could be a curator in a library // in digital library // &um looking for images  09:30

PAU  you could be an illustrator looking to &um [/] finding example photographs to illustrate an article  09:35

PAU  &uh you could be a professional / accessing for example a science database // for example in medicine / so a health professional // maybe you are looking for a particular radiograph / or set of radiographs  09:47

PAU  &um / or you could just be a domestic user // &look [/] trying to find that picture of / holiday last year / and the children standing on the beach // and you cannot find it anywhere // it seems to have / been lost somewhere in your multimedia collection  10:00

PAU  &um / there's a very &uh useful paper / &um by &uh &Pete &uh [/] by Armitage and &uh [/] xxx Armitage and / a guy called Peter Enser // who's a retired professor now / but he used to be at the university at Southampton  10:13

PAU  he's very very well known / in kind of pictorial information retrieval / because he's done quite a lot of studying / of what users want / and the information needs / so a kind of pictorial or visual information needs of users  10:24

PAU  and actually I think we have to &s [/] first to recognise that this is the place where we have to start // so if you are a student and you're a kind of more / perhaps technical student // &uh do come back to this  10:34

PAU  &uh why // ok? // are you providing a technology ? 10:37

PAU  what do your users want ? 10:39

PAU ok? // we do need to understand what types of different search requests / &um could / &uh a user have / for the different media / &um that you're trying to provide  10:47

PAU  &um so the Armitage and Enser paper is very useful  10:50

PAU  &um they analysed a number of &uh user needs from looking at search logs / &uh from a number of libraries // &uh digital libraries and &uh physical libraries  10:59

PAU  and it's just interesting they came up with this kind of chart // if you like goodish / &uh table // ok? // various types of queries // you could classify the queries into these various types  11:08

PAU  so what people were searching for  11:10

PAU  and all of these databases that people were searching were pictorial databases // so they were typically a sort of databases from museums / from libraries / &uh and so on  11:19

PAU  and &uh basically they would say that there're a / sort of three main things / if you like / or three main types of query  11:25

PAU  so there were queries / &uh / which basically are where people are looking for something quite specific // either the query might be / David Beckham // ok? // so the [/] the image that you want to &s [/] of something that is quite specific  11:37

PAU  but then perhaps you have a more generic type of request / so a picture of something  11:43

PAU  so find me pictures of &um / famous footballers for example  11:47

PAU  or find me pictures of the Madrid &uh squad / or something like that  11:51

PAU  and then you have the kind of abstract level / so in the library or in the &uh analysis of the logs / they did find that you sometimes get quite abstract queries  11:59

PAU  so find me a kind of fun pictures of people at a party  12:03

PAU  &uh find me pictures of &um people enjoying themselves at the beach  12:07

PAU  you know / quite abstract concepts  12:10

PAU  ok? // so &uh as information retrieval people we have to think how might we answer / that type of query // how might we build the system / to answer that type of query  12:18

PAU  perhaps it means that we don't just have a very simple search box anymore  12:22

PAU  perhaps we now need to allow the users be able to browse the collection // create some kind of multi-faceted search  12:28

PAU  perhaps we need to make use of semantics information extraction technology and so on / to be able to address an information need like that  12:36

PAU  by xxx so / if you are a student and you're interested in this / go and have a look at that paper // you can access it online  12:40

PAU  and I think it's a good place to understand [/] &uh to start [/] to understand what the user actually &uh wants // what they're looking for  12:47

PAU  if we were to kind of classify image retrieval and the various image retrieval techniques / &um John Eakins / he is from the University of Northumbria / he's again quite well known in the information retrieval community // and &uh he proposed a three &uh level framework for image retrieval  13:02

PAU  so you can do image retrieval / at level one // that's retrieval based upon very primitive features  13:08

PAU  for example features of colour / of texture / &um of shape and layout  13:15

PAU  that's [/] you can do retrieval at that level / so that's level one  13:18

PAU  &uh level two // &uh Eakins suggests that you can do retrieval by derived / &um / &er features  13:24

PAU  for example it might be that you have your images tagged in some way // so if I for example was &er / accessing images from a / &uh [/] a news agency / or some kind of photo / &um / database // it's quite likely that the images come with captions / generated by the people who archived and stored the images  13:43

PAU  and so there for example it might just be that what you're doing is trying to / answer a query such as find me pictures of Tony Blair  13:49

PAU  &uh he also identified a kind of third level / of retrieval / which is retrieval by abstract &uh / attributes  13:57

PAU  &uh and this is much much harder // if you like // so here what you're doing is actually abstracting / and there's a bit of a gap / between the low level features such as colour / spatial layout / texture / and getting to the kind / if you like / the semantics of an image // 14:12

PAU  that is there're images that might depict death / images depicting war // ok?  14:16

PAU  you look at an image / yourself and you can definitely work that out  14:21

PAU  but how on earth would a computer / or an / image processing technique be able to infer that / from / the basic primitive content / colour / textures / spatial layout / or some of the tags that have been assigned ? 14:33

PAU   and the problem is a little bit like this // so find me images similar to this   14:39

PAU  well it kind [/] it gets harder as you move up here // ok? // so find me images just similar to this based upon xxx xxx / I want images / with the same colour composition / same texture / same shapes  ok? // that's reasonably ok / we can do that quite well  14:53

PAU  find me other images depicting death  14:56

PAU  ok? // it starts to become a little bit harder  14:58

PAU  so it's a bit like this / the little penguin is looking for &uh / this little fish / ok? // that is an example image  15:05

PAU  which ones [/] which one of these is the correct image ? 15:08

PAU  because they all seem to be roughly the same shape // some of them have &sim [/] similar colour &com [/] composition // but is a fish hhh {%act: laugh}  15:17

PAU  ok? // so there are no relevant images because there are no other fishes / ok?  15:21

PAU  and the problem is just like that / there's a massive what we call semantic gap / between the low level / &uh primitive features / and the kind of high level concepts  15:30

PAU  &um / there are / I guess two main retrieval methods // &uh two ways of how you might retrieve &uh visual media or images  15:41

PAU  &er the first one is kind of traditional information retrieval // that is text based information retrieval // &uh and so we would normally call that description-based  15:50

PAU  and that is we would typically use the abstracted features which should have been assigned to the image  15:54

PAU  so most images don't exist on their own // most images / come with some / kind of context // perhaps they come with the caption // perhaps they come with some other metadata which is embedded in the image / in the EXIF data for example // &uh and some of that can then be used / &um for retrieval // &um we can just use / off-the-shelf standard / &er text based retrieval techniques // so you can index a bunch of images if they have captions using xxx  // and very effectively / you could answer &uh queries &uh written &uh [/] &er verbal queries // ok? // and you could do that pretty easily / that's &uh [/] that's very very simple  16:30
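
The description-based approach above can be sketched as a small inverted index over captions. This is only an illustration under stated assumptions: the image identifiers, the second and third captions, the crude tokenizer and the function names `build_index` and `search` are all hypothetical, and the indexing software named in the talk is unintelligible in the recording, so nothing here stands in for it.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on anything that is not a letter (a crude tokenizer)."""
    return [t for t in re.split(r"[^a-z]+", text.lower()) if t]


def build_index(captions):
    """Map each caption term to the set of image ids whose caption contains it."""
    index = defaultdict(set)
    for image_id, caption in captions.items():
        for term in tokenize(caption):
            index[term].add(image_id)
    return index


def search(index, query):
    """Return the ids of images whose caption contains every query term."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# The first caption follows the description read out later in the talk;
# the other two are invented for illustration.
captions = {
    "img1": "Red and white striped lighthouse on coastal cliff with harbour and town beyond",
    "img2": "Fishing boats in harbour at low tide",
    "img3": "Castle ruins on cliff above the sea",
}
index = build_index(captions)
print(sorted(search(index, "striped lighthouse")))
print(sorted(search(index, "harbour")))
```

A real system would add stemming, stop-word handling and ranking; the point is only that once captions exist, images can be found with ordinary text retrieval machinery.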

PAU  the other way that you could retrieve images is called content based / &uh image retrieval / CBIR // and the idea here is &uh [/] actually what you're doing is not making use of the abstracted features or not making use of anything assigned / &uh to the image  16:43

PAU  what you're actually trying to do is make use of the image content itself / the visual content  16:47

PAU  &um / and so here what you're doing is basically / &um making use of the pixels // you're making use / &er of the colour / or greyscale intensities of the pixels themselves / and then what you're trying to do is abstract from that  17:00

PAU  and from this you can work out // you can generate histograms of colour // you can work out &uh a / set of grids of texture // &uh shapes // you can identify shapes and so on  17:10

PAU  and xxx &uh [/] where [/] as opposed to a kind of &um / things like the description / being xxx [/] &er being added manually  17:18

PAU  &uh typically the content-based stuff is all done automatically // so it's kind of no manual intervention  17:23

PAU  &uh of course in / these days we do still have [///] you know // we have now techniques where you can automatically label images as well // and they're sort of now starting to make use of both of these / &um techniques  17:34

PAU  of course you don't just have to use these techniques on their own // you can use them in combination and in fact that's where you have the most powerful / &uh retrieval system  17:44

PAU  the retrieval system where you can actually make use of the visual features together with the / associated text as well // because you could start the query / in &uh [/] &uh with text // so find me pictures of / whatever // you get back a set of results // and then on &uh / each retrieved cycle you can then say ah! // well find me images more like this one // and by more like this I mean images of the same kind of / shape / layout / &uh colour content  18:08

PAU  here is an example of description &uh based image retrieval  18:14

PAU  so the images are annotated using texts // and then you just use traditional text based &uh retrieval techniques  18:19

PAU  &uh the approach is very very popular // so you [/] you only have to look online // look at Flickr / Google / Youtube and so on  18:26

PAU  &um [/] &uh &w [/] why is it so popular ? 18:29

PAU  well basically you can &um generate a retrieval system which should be efficient  18:33

PAU  so as soon as you start involving the actual content of the images / and you start generating features from the content / the retrieval often gets a bit slower // &um there's a lot more processing to do // if you can just make use of the texts / with other systems // ok? // &uh which are very efficient at / accessing texts  18:49

PAU  &um so here for example what we have [/] and this comes from &uh a &his [/] historic library // &um it comes from St Andrews &um / library / from the university library  18:58

PAU  &uh and they've given us a bunch of images which we’ve used in an evaluation called ImageCLEF which I'll mention in a moment  19:04

PAU  and here is an example image // ok?  19:07

PAU  here is a lovely lighthouse / &uh a striped lighthouse  19:09

PAU  and this is actually a postcard // it's been a scanned postcard  19:14

PAU  and here is some of the &uh description that you have // ok? // so we have &uh semi-structured data / which is already quite useful // because we don't have to look for the location / it actually tells us / that this is the location  19:26

PAU  so for example the short title / is the Smeaton Tower in Plymouth  19:30

PAU  &uh the longer title / is Plymouth Hoe / the Smeaton / &uh lighthouse tower // and then there's a description / and the description then is a // if you like // &uh an interpretation of the image which has been added by the historian  19:44

PAU  &uh the interesting thing is that if / different people wrote the description it is likely to also be / different // because we all see something a little bit different // ok? // in the picture  19:53

PAU  however the historians or the librarians have been trained / to write these descriptions in quite structured ways  19:59

PAU  so there a [/] there is a certain manner of consistency / &um going across these descriptions   20:04

PAU  so here the description is / red and white striped lighthouse / on coastal cliff / with harbour and town beyond / and substantial building on cliff terrace below / which is there  20:16

PAU  so such a quite detailed description / of what somebody sees / &uh in an image and what they / &uh think is important / for actually then being able to index the image and retrieve it later  20:26

PAU  and then we have the information such as the date of when it was registered / 1904 / the photographer / &uh and so on  20:33

PAU  so you might just think that oh! this is pretty simple // all we do is just index this // ok? // using a text based retrieval system and then we can find stuff  20:41

PAU  however there are some problems // &uh it doesn't always &uh work quite so easily  20:46

PAU  first of all you have to add the annotations  20:49

PAU  so this library / &um spend [///] well they employ somebody fulltime / to add these annotations  20:56

PAU  they're under pressure / because of financial cost / to get rid of that person  21:00

PAU  what happens then ? 21:01

PAU  who's gonna annotate them ? 21:02

PAU  because if the images don't have annotations the current system / only works on texts / so the images will be lost // they would not be able to be found  21:10

PAU  ok? // what happens ? 21:11

PAU  &um so annotation is typically very expensive // it's also subjective as well // so different people add the xxx [/] different labels / &uh often we’ll add different texts  21:22

PAU  the other big problem is the vocabulary used / and if you read it carefully / it's a very British English vocabulary // &uh it's also a &uh historical vocabulary // it's in the vocabulary in the style of that particular library  21:35

PAU  and the problem there is that you might have a casual use of the xxx xxx // because they want to provide this as a / service to the general public  21:42

PAU  perhaps they want to provide it to &uh / a global audience as well  21:46

PAU  so you come along &uh [/] &uh [/] a kind of Spanish &uh [/] &uh member of the general public // and you type in [///] well maybe you don't have a word for lighthouse // maybe you translate it into English into something quite different // but actually even for somebody in English / the &uh text is / very historic  22:02

PAU  &um so all the words that are used [/] they’re actually words used / a few years ago  22:06

PAU  ok? // so general public today / wouldn't necessarily even be able to find the images // even though they are typing in English // so there's a vocabulary mismatch // what you have to do is come up with the technique to overcome that  22:18

PAU  &uh the other / thing is that sometimes is quite difficult to express more kind of abstract / &uh aspects of this image  22:26

PAU  so actually if we had an image which was quite emotive // xxx what words would you use ? 22:31

PAU  so should we add some words which describe emotion ? 22:34

PAU  should we add some words which describe for example speed / the activity / the action that's happening ? 22:39

PAU &um because it might be that somebody &uh is trying to illustrate the texts  22:43

PAU  you might be a journalist and you want emotive pictures  22:46

PAU  ok? // so you want a kind of funny pictures of lighthouses  22:49

PAU  you want tall &light [/] lighthouses / well-known lighthouses / and so on  22:54

PAU  what words would you add ?  22:54

PAU  what scope of words do you add ?  22:56

PAU  and this isn't a problem that's new // librarians have thought about this and they're with this problem for many many years  23:02

PAU  however as information retrieval people / as technical people we just have to understand / that this is &uh what we're dealing with // ok?  23:09

PAU  this is the type of content we're dealing with  23:11

PAU  it looks easy / on first glance / but it's not always quite as easy / as it first appears  23:16

PAU  just to mention another problem that you sometimes get / and this was a real problem when we first used this collection / is that you have a kind of notes field here or background field // and this notes field is kind of useful for the historian because it just adds a lot of kind of / peripheral information / historical information  23:32

PAU  however / that text does not describe what is in the image // so when you go / searching using a system which just indexes everything // and you [/] you should get back some funny results / and you think / why do they get that result ? 23:44

PAU because the image / doesn't match my query / and you're a little bit dissatisfied / and the reason is that you've actually matched some text which is [/] which is matching in the background // ok? // which is some kind of peripheral / historical text / just gives some context / &um extra additional context to the image 24:00

PAU  so a simple solution is not to index the notes / and you perform a little bit better  24:05
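
The effect of leaving the notes field out of the index can be sketched as below. The record layout, the notes text, the second record and the `fields` parameter are assumptions made for illustration; only the lighthouse title and description follow what is read out in the talk.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on anything that is not a letter."""
    return [t for t in re.split(r"[^a-z]+", text.lower()) if t]


def build_index(records, fields):
    """Index only the named fields of each semi-structured record."""
    index = defaultdict(set)
    for image_id, record in records.items():
        for field in fields:
            for term in tokenize(record.get(field, "")):
                index[term].add(image_id)
    return index


def search(index, query):
    """Return the ids of images matching every query term."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


records = {
    "lighthouse": {
        "title": "Plymouth Hoe, the Smeaton lighthouse tower",
        "description": "Red and white striped lighthouse on coastal cliff "
                       "with harbour and town beyond",
        # Hypothetical background note: context, not something visible in the picture.
        "notes": "The tower originally stood on the Eddystone rocks",
    },
    # Hypothetical second record.
    "schooner": {
        "title": "Schooner aground on rocks",
        "description": "Sailing ship on rocks with crowd watching from shore",
        "notes": "",
    },
}

with_notes = build_index(records, ["title", "description", "notes"])
without_notes = build_index(records, ["title", "description"])

print(sorted(search(with_notes, "rocks")))
print(sorted(search(without_notes, "rocks")))
```

With the notes indexed, a query for rocks also returns the lighthouse postcard, which shows no rocks; restricting the index to title and description removes that match.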

PAU  ok? // but you just have to understand some of these things about the content that you're dealing with  24:09

PAU  if we now turn to content based image retrieval how does that work ?  24:14

PAU  &uh we typically what we do is we &uh [/] we &uh / go through a process of &e [/] &uh / extracting features / so a feature extraction process  24:22

PAU  so for example here / this is a histogram of the &uh colours used in the image  24:27

PAU  and there are various techniques for &uh representing colour space / for representing the kind of &um histograms … 24:33

PAU  this effectively is a kind of colour composition if you like / of our image / and we can do this for various features / so you can do it for texture / we can do it for shape / &uh layout and so on  24:43

PAU  we then index these features in a / index similar to the way that we might index &uh words / from a standard text based retrieval system we can apply very similar / inverted indexing techniques // &um and then once we've done this / then the user has a search request which in this system would be / here is &g [/] &uh an example image / and what the user wants / what their requests translate into / is basically find me images similar / visually similar to this one // that's the only type of search that they can perform / using this kind of system  25:14

PAU  yes you could trick [///] &uh so you [/] you could change it slightly / so you could allow the user to control // well I want images which are more / like this in terms of colour / in terms of texture / you could have those sliders perhaps on the interface / in terms of shape // but you can only basically return back images which are visually similar in some way  25:32

PAU  &uh typically in this type of system there's also a relevance feedback commonly used / where users can refine // ok? // the process // so you can kind of get a better / hopefully solution towards the end  25:44
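
One common way to realise relevance feedback is a Rocchio-style update of the query feature vector. The talk does not name a specific method, so the technique, the weights `alpha`, `beta` and `gamma`, and the three-bin feature vectors below are all assumptions chosen for illustration.

```python
def rocchio_update(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query vector towards images marked relevant and away from the rest."""

    def mean(vectors):
        # Component-wise mean; an empty list contributes a zero vector.
        if not vectors:
            return [0.0] * len(query)
        return [sum(column) / len(vectors) for column in zip(*vectors)]

    rel_mean = mean(relevant)
    non_mean = mean(non_relevant)
    # Negative weights are clipped to zero, a frequent practical choice.
    return [
        max(0.0, alpha * q + beta * r - gamma * n)
        for q, r, n in zip(query, rel_mean, non_mean)
    ]


# Hypothetical feature vectors (for instance coarse colour histograms).
query = [0.5, 0.5, 0.0]
relevant = [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]]
non_relevant = [[0.0, 0.0, 1.0]]

updated = rocchio_update(query, relevant, non_relevant)
print(updated)
```

After one round the first component, which dominates the images the user liked, carries more weight than the second, so the next search is biased towards that part of the feature space.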

PAU  &um so just to say then / normally you would have &uh colour // that's one of the features that you can have  25:54

PAU  &uh so colour described in the pixel intensity // there're various colour space models HSV / RGB is a common one  26:01

PAU  &uh and typically what you would do is have a histogram which defines a distribution of colour pixels in an image / or greyscale if you're working with black and white  26:10
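
A colour histogram of the kind described here can be sketched in plain Python. The bin count, the toy pixel lists and the use of histogram intersection as the similarity measure are assumptions; the talk does not say which measure is used.

```python
from collections import Counter


def colour_histogram(pixels, bins_per_channel=4):
    """Normalised histogram over quantised (r, g, b) values in the range 0-255."""
    if not pixels:
        return {}
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bin_id: n / total for bin_id, n in counts.items()}


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: the overlap between two normalised histograms."""
    return sum(min(value, h2.get(bin_id, 0.0)) for bin_id, value in h1.items())


# Toy "images" given as flat lists of RGB pixels (invented for illustration).
red_and_white = [(255, 0, 0)] * 50 + [(255, 255, 255)] * 50
mostly_red = [(250, 10, 10)] * 80 + [(255, 255, 255)] * 20
all_blue = [(0, 0, 255)] * 100

query_hist = colour_histogram(red_and_white)
print(histogram_intersection(query_hist, colour_histogram(mostly_red)))
print(histogram_intersection(query_hist, colour_histogram(all_blue)))
```

The two red and white images share most of their colour mass and score high, while the blue image shares none and scores zero, regardless of what any of them actually depicts.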

PAU  &uh sometimes it's been shown to be very effective that actually your first stage / in a technique like this / is to convert your colour image into greyscale // and then you do everything in greyscale // because you reduce the space if you like / the feature space // that often works quite well  26:24

PAU  &uh texture is another thing  26:27

PAU  &uh texture typically &uh defines things like smoothness / contrast / it looks at regularity / directionality  26:34

PAU &uh however it's been shown that on its own texture is often of limited value // so you need to use it together with / something like colour  26:41

PAU  and then shape as well or segmentation / you can take an image and you can segment it // so in the lighthouse we could see that there's a tall object // that could be one thing // there's another little object / &uh on the left hand side which is another building // that would be something else // and each of those form / separate objects / which we can index  26:58

PAU  &uh it has also been shown that / &um [//] xxx &uh there [/] there's a difference between for example &um / basically computing a histogram across the whole image // ok? // versus what we could do is chop the images / into little squares // ok? // turn it into a grid and then have &um colour histograms for each of those little squares or each of those little grid [/] &uh grid squares // and it's found [///] that's called the local indexing technique / the other one is global // and it's found that local indexing / does seem to work better / &uh than global indexing  27:29
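
The difference between global and local indexing can be illustrated with a small sketch. It assumes greyscale images stored as lists of rows, a 2 x 2 grid, and histogram intersection as the similarity measure; all names are hypothetical.

```python
from collections import Counter

def histogram(values, bins=4):
    """Normalised histogram of grey levels (0..255)."""
    step = 256 // bins
    counts = Counter(v // step for v in values)
    return {b: n / len(values) for b, n in counts.items()}

def intersection(h1, h2):
    return sum(min(v, h2.get(k, 0.0)) for k, v in h1.items())

def grid_cells(image, rows, cols):
    """Yield the pixel values of each cell of a rows x cols grid, row by row."""
    height, width = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            yield [image[y][x]
                   for y in range(r * height // rows, (r + 1) * height // rows)
                   for x in range(c * width // cols, (c + 1) * width // cols)]

def global_similarity(a, b):
    """Compare one histogram computed over the whole image."""
    flat_a = [v for row in a for v in row]
    flat_b = [v for row in b for v in row]
    return intersection(histogram(flat_a), histogram(flat_b))

def local_similarity(a, b, rows=2, cols=2):
    """Compare histograms cell by cell and average the scores."""
    scores = [intersection(histogram(ca), histogram(cb))
              for ca, cb in zip(grid_cells(a, rows, cols), grid_cells(b, rows, cols))]
    return sum(scores) / len(scores)

bright_top = [[250, 250], [250, 250], [10, 10], [10, 10]]
bright_bottom = [[10, 10], [10, 10], [250, 250], [250, 250]]

print(global_similarity(bright_top, bright_bottom))  # 1.0
print(local_similarity(bright_top, bright_bottom))   # 0.0
```

The two toy images contain exactly the same grey levels, so the global histograms match perfectly; only the grid-based comparison notices that the bright region has moved.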

PAU  &uh here [/] so here is an example output from the CBIR system  27:33

PAU  &uh / so this is &uh built by one of the &uh co-organisers of the ImageCLEF / &uh Henning Müller / &uh and Stéphane Marchand-Maillet and they work at the University of Geneva // and they do an awful lot of work // they really are &uh a kind of &um / very deep / and very heavy into content based [/] visual content based image retrieval // and they've created the system which you can download and use called Viper  27:54

PAU  so you can download and index your own set of images and have a go for yourself  27:58

PAU  &um / and here is an example of the Viper working // so they've created the system called medGIFT // so the tool is called &uh [/] it was [/] it used to be Viper and is now called GIFT // and they've created a version called medGIFT which works on radiological images / &uh in a clinical application // so this works in a hospital at the University of Geneva  28:16

PAU  and the idea was that the &uh radiologist comes along / with that example / &uh image  28:22

PAU  ok? // &um and here we go / the system then basically has indexed previous // ok? // &uh images // and it ranks them // basically produces // if you like // the most similar // ok? // and just ranks some by similarity  28:36

PAU  and you can [/] you look at this and you think / well this seems to work reasonably well // ok? // we seem to be doing quite well here  28:41

PAU  &uh so let’s try another example  28:43

PAU  this time we have a patient here with the &er / hands up // ok? // there’s a certain condition on the &er [///] I think it was on the skin // and you can see that the results are a little bit more variable now // so there's some kind of false hits if you like  28:55

PAU  so the first few results seem pretty good // but then as we get along we seem to have got some x-rays here // &uh we seem we've got some kind of arms / and other bits and pieces // so the results aren't quite so satisfying  29:08

PAU  so one thing we notice at content based retrieval is often [///] well it's domain xxx [/] often works better on some domains than others // it also tends to be quite query dependent as well  29:17

PAU  &uh here is another example / &uh using the same system  29:22

PAU  this time what we've done is gone for [///] well let's not just use medical images because that's a specialized domain / let's just use general images // let's use those images from that historic photo collection  29:32

PAU  so here for example we have a lovely kind of sunset / in black and white // you can't really see it // and here are the / &um [/] here are the matching images / or the similar images // and I guess as a user / if your need [///] and we're kind of assuming that your need is find me other sunsets // ok? // so find me &ima [/] images &s [/] visually similar to this  29:51

PAU  you would probably be reasonably satisfied // because they kind of look quite similar // ok? // so you're quite happy  29:57

PAU  however / &uh if we use a slightly / &uh trickier picture … 30:01

PAU  so here this is a picture of York Minster  30:04

PAU  it's a famous cathedral / &uh in England  30:07

PAU  and actually what the user wanted was / well find me more pictures of York Minster  30:11

PAU  I've given you a picture of York Minster / why is the system so stupid ? 30:14

PAU  because it’s returning back not York Minster / but all sorts of different pictures // it's not even got / pictures of the same building / of the same type of structure  30:22

PAU  and part of the reason here is that actually it's just that they're very very difficult tasks / to do this reliably and to do it very well  30:29

PAU  we can then optimise the system and we could perform a lot better // it would be even better if we could also combine not only the / content / but also the / &uh features / of the captions / the text from the caption as well / and then we could start to do // &uh perhaps something which is a little bit more successful  30:45

PAU  but what you will find if you play around with that [///] so download for yourself GIFT // &uh have a play with it index your own photo collection // and you will see that sometimes the results are great // other times the results really aren’t quite so good  30:56

PAU  so let me just talk about some of the other things going on / &um in image retrieval  31:04

PAU  one of the &uh perhaps more interesting areas at the moment / and this has been going on for a few years / &uh is automatic image indexing  31:11

PAU  that is what you're trying to do is learn a mapping / between the low level &uh primitive features / and high level semantic concepts  31:19

PAU  ok? // and the aim will then be that once we train up a system / we should be able to assign new words // ok? // or new concepts / to the existing images or new unseen images  31:28

PAU  and that's really useful 'cause imagine the process of labelling images // &uh manually / it's quite a tiresome / it's quite a slow process // but we could partly automate the process if we could &pro [/] &pro [/] provide for example a set of xxx [/] of key concepts // ok? // for the annotator / just to say oh! yeah let's to say hhh {%act: onomatopoeia} // and just check them // ok? // rather than silly have to generate them  31:49

PAU  if I [/] we could even do away with any kind of manual checking and we can just allow the system / to provide the &um concepts or the key words / itself / if we wanted to  31:58

PAU  and here is the system / &uh which is &uh / reasonably good // you can access this online // it's called ALIPR  32:06

PAU  and although you probably can't see / &um some of the &um / words which have been ascribed here // &um some of them look pretty good / so here for example we have words such as plant / and it is [/] it looks like a sort of cucumber // &uh that kind of plants // it's got indoor / it's got fruit / it's got food // and that seems pretty good / so you're quite satisfied  32:27

PAU  some of these other ones look quite good as well / there's an old building / a castle // we've got historical building // we've got Scotland even / which actually is correct because it's a Scottish castle // we also have predator / we have dogsled / we have eyes / which are wrong // they shouldn't be in there  32:43

PAU  &uh one of the most / interesting ones / the best one I think for &uh people from England // so there's only two of us from England so [/] so only me and my wife would appreciate this  32:52

PAU  but this one is the picture of the queen mother // and although we have some tags such as industry / maybe / fashion // well she was very fashionable // &uh we have fish / not quite sure where that comes from // &uh female / modern / she was always termed the modern grandmother / she was very modern // and I think the best one / comes out first / as the most / appropriate keyword is gem / she was a British gem / she really was / and I don't think it necessarily &interpretate [/] interpreted that and meant / to put the word gem  33:21

PAU  but it's interesting when you play around with the &techn [/] technique like this [/] again / you get quite variable results // but sometimes it works very well // &uh other times it really is a little bit surprising / and &uh / you're not quite sure what's going on  33:33

PAU  but that's really a big area for like information retrieval / visual information retrieval at the moment  33:37

PAU  that is learning the mapping between the low level primitive features / and the high level concepts / and there're various techniques &uh / for doing that  33:44

PAU  &uh another &uh &te [/] or an area which is quite popular / &um is effectively / &uh classifying images  33:52

PAU  so for example we might want to classify a bunch of our own personal photos / as either indoor outdoor  33:58

PAU  well that would be so kind of quite useful because [/] 'cause then we could say &um my query is just find me &uh pictures from parties // ok? // my own photos // but then you perhaps on the interface you could have a filter or drop-down box / which is able to filter either they're indoor ones or the outdoor ones  34:13

PAU  this is also very useful for example &uh they use a lot in &uh things like intelligence communities // so scanning // you know // a sort of &uh various pictures / find me indoor ones and these are outdoor and so on  34:24

PAU  a lot of this technology is even built into our digital cameras now // so / a lot of digital cameras today have face recognition // well they can identify where there's a face / because often that's where you want to focus / the picture  34:35

PAU  they can also detect whether you're indoors or taking a photo outdoors and know whether the flash should be on and what / &um kind of / &uh level of flash to apply  34:43

PAU  but typically you do [/] do this type of task like this / you take an image / you divide it into &uh a grid // you could then represent / &um the colour / and the texture / and for each of those grid squares / you can &rem [/] represent the colours [/] colour histogram / &uh and so on ... 34:59

PAU  you then can generate basically from some training data // ok? // for each of these little grid squares / &uh we know that / for example we can define whether / part of the image is in or out // so you can do that from training data / for colour and for texture // and then we basically combine all this / together all this information together to give us an overall judgement  35:20

PAU  do we decide overall whether the image is / in or out  35:23

PAU  but a lot of these grid squares here / &uh seem to be implying that parts of this image are inside // so overall it averages out / so we classify the image as being inside and we’d be correct  35:34

PAU  and there's a lot of work going on on how best to / classify inside outside / and other types of &um classes as well  35:41
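
The grid-based indoor/outdoor decision described in this passage can be sketched as below. The per-square probabilities are assumed to come from classifiers trained separately on colour and on texture (not shown), and simple averaging with a 0.5 threshold is an assumption made for illustration rather than the method of any particular system.

```python
def classify_image(cell_scores, threshold=0.5):
    """Combine per-grid-square evidence into one indoor/outdoor decision.

    `cell_scores` holds one (colour_p_indoor, texture_p_indoor) pair per
    grid square, each a probability in [0, 1] from a hypothetical trained
    classifier.
    """
    # First combine the colour and texture evidence inside each square...
    per_cell = [(colour + texture) / 2 for colour, texture in cell_scores]
    # ...then average over all squares for the overall judgement.
    overall = sum(per_cell) / len(per_cell)
    label = "indoor" if overall >= threshold else "outdoor"
    return label, overall

# Three of the four squares look like "inside", one looks like "outside".
squares = [(0.9, 0.7), (0.8, 0.6), (0.3, 0.4), (0.7, 0.9)]
label, score = classify_image(squares)
print(label, round(score, 4))
```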

PAU  it's probably worth mentioning that &um / although overall perhaps on general photographs / the problem is very difficult // ok? // for content based retrieval  35:51

PAU  &uh the results aren't always very good and very appealing  35:54

PAU  however there are certain domain / specific applications / which do seem to work very very well / they seem to be quite visually if you like / in terms of applications / they tend to [///] they seem to appeal very well / to the visual features or properties of the images  36:06

PAU  &uh one of these areas is &uh face detection / and recognition  36:10

PAU  &uh face recognition is a much much harder task than face detection // so the task of face detection is just to say in an image like this // can we basically distinguish [/] let's draw little boxes around where faces are ?  36:23

PAU  that's your kind of first task / that's face detection // and that is kind of useful because we can have a lot of our own personal photographs / and it might be that we just wanna say find me photographs / which have people in them  36:34

PAU  ok? // a very very simple task // just find me [///] we're not worried about / which people // just find me all my own photos which have people in them  36:42

PAU  &uh however / &uh face &uh recognition / &uh is much much harder [///] actually identifying that that is xxx xxx // ok? // &um is much much harder problem  36:54

PAU  but here is an example output / from a system // here is our query image // and here are the / example images or here are the retrieved images  37:03

PAU  and actually you could see actually the system is pretty good // and the system is good enough that it can actually even deal with when somebody is rotating or moving their head slightly  37:12

PAU  and this of course is important because if this technology is being used in an airport // for example where people are moving and not always are gonna be exactly in the same position // you need to perhaps to deal with slight changes / and variations / and so on  37:25

PAU  and so this technology has a couple of false hits so there's one here // but overall &uh is reasonably good // and it works well because it's domain &uh specific  37:33

PAU  there is &uh [/] there is &uh [///] if you go to Polar Rose / Polar Rose is a / &uh application / &uh with this aim of trying to basically / &um / highlight or pinpoint all the faces and all the photos on the web // that's its kind of goal if you like  37:48

PAU  &um so have a look at that / it's kind of interesting [/] interesting idea  37:52

PAU  &uh other domain specific applications &uh / seem to be work [/] &uh working quite well / &um [/] &uh medical retrieval / or &uh clinical applications  38:02

PAU  for example retrieving x-rays / retrieving radiographs / &uh retrieving kind of &uh / images like these / &um photographs / &uh so kind of / &um various / &um organs ... 38:15

PAU  and a lot of this does seem to be working quite well // and part of the reason that works often very well is that the features that you use / for example in identifying certain types of x-ray / are quite specific to that domain  38:26

PAU  so one of the features that seems to work well is if you have a kind of direction feature // ok? // because often you're either looking at something sort of standing up or on the side like that hhh {%act: he moves his hand horizontally} or on &uh a plane like that hhh {%act: he moves his hand diagonally}  38:38

PAU  and that's used often in medical retrieval  38:41

PAU  also a lot of images tend to be greyscale as opposed to colour so that / seems to help things quite well // &um  38:48

PAU  &uh another interesting &uh tool // &uh and you can have a play around with this // this came out from Microsoft // it's actually constructing 3D photographs from separate / individual &uh 2D photos  39:01

PAU  &uh and they have an application where you're able to do this  39:04

PAU  so you have lots of different photos taken of the same object or same place // and what it aims to do is / find salient points / in the various photos // and then it kind of stitches the photos together in a kind of 3D representation  39:16

PAU  and it's really good if you go and have a xxx [/] go and have a play with it you can actually try the application out for yourself  39:21

PAU  it uses typically photos from Flickr // that is because the Flickr community have uploaded lots of photos about specific &uh places / or [/] or regions  39:30

PAU  &um and once you have a lot of data // this relies on having a lot of photos / lots of photos from / different &um positions of the same objects // &uh it can stitch things together and create quite realistic / and quite xxx / well pretty cool &uh 3D representations // so it's worth having a look at that  39:47

PAU  &uh there're many many examples of content based image retrieval systems  39:52

PAU  &uh IBM produced one called QBIC which is one of the first ones that used &uh colour histograms  39:57

PAU  &uh there are / huge [/] there are a mass [/] a mass of other ones as well // and so what I would suggest is &uh just go away &uh and have a look at some of these  40:06

PAU  &uh it has been shown / &uh over the [/] over the years / that one of the key algorithms is using a colour histogram  40:11

PAU  colour seems to be quite a dominant feature / in terms of find me images like this one // &uh colour seems to be quite dominant / even greyscale / &uh seems to be quite dominant  40:20

PAU  of course on content based image retrieval there are / &um / like text based retrieval lots and lots of problems  40:27

PAU  &uh it's not a particularly easy task // it's a very challenging task // &uh [/] and some of the &uh [/] some of the problems of this // &uh there's a problem with the sensory gap // that is typically when you digitalize something you lose something from the real world // &um / so there's a problem already if we’ve already lost some information  40:44

PAU  &uh there is &uh [/] the classic semantic gap // and that is the gap between / basically you're dealing with pixels / you're dealing with very low low level features // but actually they're representing very high level semantic concepts // &uh and somehow you need to bridge that gap // and there're various techniques to bridge the gap / for example relevance feedback could be one way / automatically labelling images making use of text features and so on / would help to bridge the gap  41:08

PAU  &um often &um when you look at some of the content based image retrieval systems produced / they're not that interactive for the user // &uh so perhaps the user interaction / &um / &uh has not really been / &um designed from the point of view of the general user // so perhaps you couldn't get your mum or your dad to use a content based retrieval system / because it involves / changing properties / of the colour or the &um / the texture and so on  41:32

PAU  that is not necessarily true of all systems / there are some systems &whi [/] which are very good / &uh and do certainly work quite well / but overall / I would say that's the case  41:40

PAU  evaluation there're very &uh few common datasets  41:44

PAU  &um so there's little comparison // &um proof of performance of various algorithms // although again that has changed over recent years // and there are some classic datasets that you could use / for evaluating your content based retrieval system  41:57

PAU  &um actually / &um / there are / at the time I wrote the slide I said very few commercial systems / mostly academic // that probably is &uh [/] is still true // but with examples such as the Microsoft / where you're stitching together the photos to create something in 3D // there are systems and tools starting to come along // &um but actually they're not still picked up / it's surprising but content based retrieval systems / are still not being / picked up and heavily used / &uh in some domains where you would expect them to be used  42:25

PAU  so why don't we see the &uh this type of technology used more / in digital libraries ?  42:30

PAU  it still seems to be very academic / &um very kind of research orientated  42:34

PAU  why is it &no [/] xxx used more for example in / &uh large xxx photographic collections ?  42:38

PAU  ok? // if you look at a lot of these collections online they're still very simple / very text based  42:43

PAU  why not ? 42:43

PAU well part of the reason is that the text based approach works quite well // ok? // as it's quite difficult to necessarily beat it  42:49

PAU  &um so some of the open issues / &um areas of future research / obviously bridging the semantic gap is a big one  42:57

PAU  so automatically labelling pictures / providing relevance feedback [/] effective relevance feedback / providing browse search functionality to enable a user to use it to explore collections in a more interactive way  43:07

PAU  &um somehow thinking about how you put humans in the loop ?  43:10

PAU  so how do you put a human in the loop in a content based retrieval system ?  43:14

PAU  where in the loop do you put them ?  43:16

PAU  &uh what kind of interaction &uh functionality do you provide ?  43:19

PAU  &uh we need to be developing &uh interactive systems that meet / &uh [/] &uh [/] the needs of real users / in real scenarios / in real situations / across different domains / because typically the technology / works best when you tune it to specific domains  43:33

PAU  &uh we need to be carrying on developing efficient retrieval methods // &um that is being able to scale the systems up to millions and millions of images  43:41

PAU  we also need to keep on [/] on creating standardized evaluation benchmarks / on which to evaluate our systems and various algorithms / and thinking about the evaluation method [/] measures  43:52

PAU  so which measures correlate best with human &uh performance and effectiveness ? 43:56

PAU  and of course standards are continuing / and these’re issues which are ongoing  44:00

PAU  &uh MPEG standards are great but // you know // we need to &uh have a &sta [/] standardized / a set of standards as well for the metadata  44:07

PAU  &uh collaborative systems // so how could we develop effective ways of sharing / &uh multimedia ?  44:14

PAU  so for example Flickr is a great &ex &uh [/] great tool // &uh a good example / YouTube / ways of sharing  44:21

PAU  &um how could we exploit user participation / to enhance multimedia ?  44:26

PAU so one classic example is the ESP game / &uh used to automatically label / &uh images which has been bought up by Google // so the Google Image Labeler tool  44:35

PAU  well basically you're relying on the power of people / human computation // you're basically relying on people labelling images for you / to then be able to use that as training data for your / &er system  44:46

PAU  &uh semantic search / so there's &uh [/] that's another ongoing &uh topic / that is focusing more on extracting for example objects / labelling objects within images / detecting concepts within the media // and particularly in / images where you have very complex backgrounds // &um so that is the general photograph for example is still very very difficult / &um image to deal with // images which tend to be a kind of you know / an apple sitting on a table / classic kind of image analysis type of evaluation benchmark / so not so realistic  45:15

PAU  you know // the photos that we take are very very complex / consist of many many interacting objects  45:20

PAU  &uh multimodal analysis // so we've seen that the best way to perform efficient retrieval / &uh / and successful retrieval would be to combine the media // ok? // so to have combined / image and &uh text based retrieval methods working together // but somehow you need to combine this xxx you do xxx  45:36

PAU  you have a separate image retrieval system or &co [/] content based / retrieval system / a separate text based retrieval system // you perform separate searches somehow / and then you combine the results list in some way // or do you have a system which actually combines visual and textual features into the whole model ?  45:51

PAU  ok? // that's [/] yeah you have to think where [/] whereabouts would you do that ? 45:55
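
The first of the two options raised here, running separate text and image searches and then combining the result lists, is often called late fusion. A minimal sketch follows, assuming min-max score normalisation and a weighted sum; the weight of 0.7 for text and the document identifiers are arbitrary choices for illustration. Both score dictionaries are assumed to be non-empty.

```python
def min_max(scores):
    """Rescale a {doc_id: score} dict to the range [0, 1]."""
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - low) / (high - low) for doc, s in scores.items()}

def late_fusion(text_scores, image_scores, text_weight=0.7):
    """Merge two independently produced result lists into one ranking."""
    text_n, image_n = min_max(text_scores), min_max(image_scores)
    fused = {
        doc: text_weight * text_n.get(doc, 0.0)
             + (1 - text_weight) * image_n.get(doc, 0.0)
        for doc in set(text_n) | set(image_n)
    }
    # Highest fused score first; ties broken by identifier.
    return sorted(fused, key=lambda doc: (-fused[doc], doc))

text_hits = {"minster_1": 12.0, "minster_2": 9.0, "sunset_4": 3.0}
image_hits = {"minster_2": 0.9, "castle_7": 0.8, "sunset_4": 0.1}
ranking = late_fusion(text_hits, image_hits)
print(ranking)
```

The alternative mentioned in the talk, a single model over visual and textual features together, would instead merge the feature vectors before any ranking is produced.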

PAU  and also there's a big / move to what's experiential multimedia systems / multimedia is very rich // it would enable us all kinds of kind of interactive and &um quite interesting and cool interactive approaches // &um that is [/] could we use a kind of experiential or &expe [/] or provide more experience for the users to enable them to explore our collections ?  46:15

PAU  so I think here about 3D collections / &uh art / museum &uh / collections / galleries and so on  46:21

PAU  &um so now I just wanna turn to video retrieval  46:25

PAU  so &um we've dealt / mainly with images // and you've already seen that dealing with just still images is / kind of quite difficult / quite complex xxx xxx lots of issues / &uh lots of challenges  46:35

PAU  &uh video retrieval unfortunately is even harder // &um because you're dealing with moving images / so lots and lots of still images / &um together in some kind of &uh / context  46:44

PAU  &uh video [/] video &exhi [/] exhibit very similar properties / visual properties to images // but the main difference is the kind of spatio-temporal {%alt: tempo} aspects // that is a video is composed of a / &uh lot of still images  46:56

PAU  so you have to remember that // ok? // there's a kind of temporal context / the temporal ordering / to these images  47:01

PAU  &um so to access video material / means that you have to index the videos in some way  47:08

PAU  &uh typically what you would do is take &uh / long video clip and break it / into segments / or into bits // and then you would index / perhaps those bits  47:17

PAU  &uh so for example / &uh you could index / from a video / the still images // so you can take each of the images // &uh you could index the audio transcripts // you could divide the image into segments / and &um index those // you could index any metadata that comes with the video / if it's for example from &uh a broadcasting corporation / or from a / library // ok? // you could access the [/] &uh / the video on that basis  47:41

PAU  &uh most videos can be &um / hierarchically organised which kind of helps us  47:46

PAU  so most / &uh videos would typically consist of a clip // ok? // so it might be the whole video / xxx it's just a short video / or part of the video // &uh typically the clip consists of / &uh a story or a scene // &um the scene consists of / &uh shots / which then consist of a number of individual frames // and then the frames can be handled very similar to the way that you would handle still images  48:11

PAU  &uh there are / automatic techniques for dividing a video // so one of the most popular techniques is called shot boundary detection  48:18

PAU  basically we're trying to identify shots // and a shot is just defined where there's a kind of change in the camera angle  48:24

PAU  there are also techniques for scene and story / &uh boundary detection as well  48:31

PAU  that's slightly harder because often the scene or the story / is perhaps more semantic // ok? // it's more related to the kind of narrative of either a news story / or some kind of broadcast  48:41

PAU  and the way that you typically do that is you actually look [//] here is one still image for example if you're doing shot boundary detection // here is another still image // let's [/] let's look for the change // ok? // the transition between them // and when we see the kind of shot transition so a big change in colour / big change in texture and so on // we can then chop / at that point  49:02
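
The frame-to-frame comparison described here can be sketched as follows. It assumes greyscale frames given as flat lists of grey levels (0..255), an L1 distance between normalised histograms, and a threshold of 1.0; these are illustrative choices, and real systems also have to handle gradual transitions such as fades.

```python
from collections import Counter

def histogram(frame, bins=8):
    """Normalised grey-level histogram of one frame."""
    step = 256 // bins
    counts = Counter(v // step for v in frame)
    return [counts.get(b, 0) / len(frame) for b in range(bins)]

def shot_boundaries(frames, threshold=1.0):
    """Return the indices i at which a new shot is assumed to start."""
    boundaries = []
    previous = histogram(frames[0])
    for i in range(1, len(frames)):
        current = histogram(frames[i])
        # L1 distance between normalised histograms lies in [0, 2].
        distance = sum(abs(a - b) for a, b in zip(previous, current))
        if distance > threshold:
            boundaries.append(i)  # big change: chop at this point
        previous = current
    return boundaries

bright = [200, 210, 220, 205]
bright_shifted = [198, 212, 224, 205]  # small change within the same shot
dark = [20, 30, 25, 15]
cuts = shot_boundaries([bright, bright_shifted, bright, dark, dark])
print(cuts)  # [3]
```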

PAU  &uh now that's quite naive / and is mainly using signal intensity  49:06

PAU  you can also make use of for example the associated metadata / so the ASR transcript  49:11

PAU  we can look for a pause / for example in the transcript // that might indicate an actual break // so we can combine features together  49:18

PAU  &uh again / I'll be pointing to &uh / very / useful paper again by Peter &En [/] Enser / &uh and a lady called Christine Sandom  49:28

PAU  &uh this is back in 2002 // and again they ask the same question // great to have this kind of technology available to us // but / what do people want / when they're accessing video collections ?  49:39

PAU  &uh and it was just interesting that they broke down / they did a similar kind of analysis // and here they were finding that basically / people typically were looking for a safe [/] they're looking for / &uh certain people / so the who dimension  49:50

PAU  they're mainly looking for example for / examples of people  49:55

PAU  so find me / for example / fishermen / find me academics // or here // this is a &speci [/] specific // so find me pictures of / David Beckham or video shots sorry / video clips of David Beckham / and so on  50:07

PAU  so it is again interesting just to have a look through the numbers / just to give you an idea on what does the user want / when they're accessing a &uh video retrieval system  50:15

PAU  one of the hardest things when you're accessing videos it's designing good user interfaces / because you're dealing with the complex media / &uh complex media type  50:24

PAU  &uh video libraries are / &um typically / &um should support various user interactions  50:30

PAU  &um this is just standard video libraries // that is the [/] the &um [/] the people working at the library / those accessing the / collections xxx to browse // and select the videos [/] a specific video program / &uh from the collection / so that might be an episode of Neighbours / an episode of &uh / the Simpsons or whatever  50:47

PAU  they must at least be able to carry out that function  50:49

PAU  &uh they should be able to query the content of the collections // so for example have a search box and be able to type in / Simpsons and get all the video clips or all the videos of the Simpsons  50:58

PAU  &uh they must be able to browse / ideally the content / of a video program  51:04

PAU  now either you provide a / some kind of browsing functionality or they have to sit and watch the video from beginning to end  51:10

PAU  ok? // &uh they must be able to watch the video program either all of them / or one of them  51:16

PAU  and ideally they must be able to requery / &uh within the video library // ideally within a program as well // so you want to be able to kind of search / parts of / &uh a video for example  51:26

PAU  &um one of the things you can do / &uh is generate surrogates  51:32

PAU  &um after you do your segmentation so let’s say we have a big video clip // we divide it into parts // ok? // which make it a little bit easier to scan and to index // that is we could allow the user to jump in / at certain parts / or certain segments  51:46

PAU  what you might want to do is actually visually represent each of those segments  51:50

PAU  so why don't we select the representative image // or what we called keyframe // to represent each of those segments or each of those parts ?  51:57

PAU  that's something that we could do  51:58
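
One simple way to pick a representative keyframe for a segment is to take the frame whose histogram is closest to the average histogram of that segment. The sketch below assumes each frame has already been summarised as a normalised histogram; this selection rule is one common heuristic, not necessarily what any system in the talk does.

```python
def select_keyframe(histograms):
    """Return the index of the frame closest to the segment's mean histogram."""
    count = len(histograms)
    mean = [sum(column) / count for column in zip(*histograms)]

    def distance_to_mean(hist):
        return sum(abs(a - b) for a, b in zip(hist, mean))

    return min(range(count), key=lambda i: distance_to_mean(histograms[i]))

# Four frames of one segment, each a 3-bin normalised histogram.
segment = [
    [0.9, 0.1, 0.0],
    [0.6, 0.3, 0.1],
    [0.5, 0.3, 0.2],
    [0.1, 0.2, 0.7],
]
keyframe = select_keyframe(segment)
print(keyframe)  # 2
```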

PAU  &uh we could also represent certain words for example from the segment  52:03

PAU  we could generate something like a tag cloud / which should be a kind of linguistic or &uh verbal / &um expression if you like / of that video clip  52:11
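
A tag cloud of this kind can be approximated by counting the most frequent content words in the transcript of a segment. The stopword list and the sample sentence below are invented for illustration.

```python
import re
from collections import Counter

STOPWORDS = {"the", "as", "a", "an", "and", "of", "to", "in", "is"}

def tag_cloud(transcript, top_n=5):
    """Return the top_n (word, count) pairs for a segment's transcript."""
    words = re.findall(r"[a-z']+", transcript.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return counts.most_common(top_n)

segment_text = ("The storm hit the coast. Coast guard crews rescued "
                "fishermen as the storm grew.")
tags = tag_cloud(segment_text, top_n=2)
print(tags)
```

In an interface, the counts would typically control the font size of each word shown next to the segment's keyframe.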

PAU  so various things that we could do // we [/] we could generate various types of surrogate / to represent each of those segments // and that would allow or would help the user / be able to kind of jump in / to the video &uh / and get a feel for the video / and various segments / within a video  52:25

PAU  &uh there aren’t so many / &um video retrieval systems available that you can just download off-the-shelf  52:32

PAU  mainly because it's a quite complex task / to actually build / and then set up / and run a video retrieval system  52:38

PAU  &um IBM do have one &um / that they produced the IBM multimedia analysis and retrieval system / &um MARVEL // and &uh / it's an interesting system / ‘cause it really is a state of the art / multimedia management  52:50

PAU  &uh it's shown / on a comparative system evaluation at TREC / or TRECVID  52:55

PAU  &uh basically the components are shown to achieve world class performance // so really is / &um worth looking at the various states of the art  53:02

PAU  &uh uses various multimodal techniques / so it makes use of the text features / of an image / so the ASR transcripts / the speech / the audio / &uh together with the visual  53:12

PAU  it uses multimodal techniques / and various machine learning techniques // &uh to xxx model semantic concepts // so when the video / when it divides it into bits / &uh and the frames / and the individual keyframes / they even assign concepts to it  53:23

PAU  so this part of the video is a happy bit / it is a xxx xxx or so on  53:27

PAU  &um / and they have &uh / a library of around a hundred concepts / so they say for example that &uh / they're able to / &uh classify and assign for example that this part is about / and it involves or has a sky / it has people / beaches and so on  53:41

PAU  &uh humans are required to create the training data / so &um // you need quite a lot of training data // but they have techniques for minimising / &um the amount of human effort involved / &um quite clever techniques  53:52

PAU  then it [/] then the system assigns confidence scores for classification // so you can roughly see how accurately / &uh &er or how confident it is assigning these labels  54:02

PAU  &um and the system is also able to exploit relationships between &um the various concepts  54:08

PAU  &um so beach for example is more likely to be related to the concept / sand // &um than perhaps for example sky / and water  54:17

PAU  and then the concepts can be used as part of the search and browse &functional [/] &uh search and browse functionality // that is you can just look for video clips which have sky in it / video clips which have beaches in it / for example  54:29

PAU  so the interface is sort of interesting / do have a look at xxx [/]  some of the papers on this if you're interested / &um because they have quite &uh / a nice interface  54:36

PAU  and this is the sort of thing that you end up with / they index huge amounts of &uh [/] lots and lots of &uh / video clips // and here is just an example // the type of thing that we are indexing  54:48

PAU  so it's quite nice because actually what they're playing around with typically / are quite real world things // perhaps video clips on YouTube / that kind of thing  54:55

PAU  and they tend to be a lot harder to deal with / they're not just kind of academic / &uh examples / they're real life examples  55:01

PAU  &um challenges are very similar / &uh to those for image retrieval // it's not &uh an easy task  55:07

PAU  &uh robust shot boundary detection  55:10

PAU  &uh shot boundary detection is definitely getting more advanced / it's getting better / more accurate // &uh but it's still difficult / and ideally what you want to do to get [/] get good shot boundary detection / &er is basically make use of not &on [/] only the visual content / so not only the kind of visual features // so the transitions between images in terms of colour / shape / texture  55:29

PAU  you also wanna make use of for example the audio / the ASR if you have it / so if there's a pause and there's a big change in [/] &uh big change in the colour content and so on // that might indicate more of a / good shot / or more of a boundary / where you want to break the &uh video / than just using any of the &uh / features on their own  55:46
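
A minimal sketch of combining a visual cue with an audio cue for shot boundary detection, assuming each frame has already been reduced to a normalised colour histogram and a silent/not-silent flag. The thresholds, the function names and the toy data are assumptions for illustration, not values from the talk or from any particular system.

```python
def histogram_distance(h1, h2):
    """L1 distance between two normalised colour histograms (0 = identical)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

def detect_boundaries(histograms, silences, visual_threshold=0.5, audio_bonus=0.2):
    """Return the frame indices at which a new shot is assumed to start.

    histograms: one normalised colour histogram per frame
    silences:   one bool per frame, True if there is a pause in the audio
    A pause lowers the visual threshold, so a smaller colour change
    is enough to call a boundary when the audio agrees.
    """
    boundaries = []
    for i in range(1, len(histograms)):
        distance = histogram_distance(histograms[i - 1], histograms[i])
        threshold = visual_threshold - (audio_bonus if silences[i] else 0.0)
        if distance > threshold:
            boundaries.append(i)
    return boundaries

# Toy data: a moderate colour change at frame 2 that coincides with a pause.
frames = [[0.8, 0.1, 0.1], [0.8, 0.1, 0.1],
          [0.6, 0.3, 0.1], [0.6, 0.3, 0.1], [0.6, 0.3, 0.1]]
pause = [False, False, True, False, False]

with_audio = detect_boundaries(frames, pause)
visual_only = detect_boundaries(frames, [False] * len(frames))
print(with_audio, visual_only)
```

In this toy example the colour change alone is too small to pass the threshold, but together with the pause it is reported as a boundary.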

PAU  &uh story segmentation is definitely a challenge // that is trying to &um / divide / &uh video / a kind of high level / a sort of more semantic type of / things if you like / &uh running through a &narra [/] narrative // that's just very difficult // typically &um / because the context is very normally application dependent  56:04

PAU  so a lot of the work so far has been applied on news videos / on news broadcasts / so TRECVID for example // &uh and news broadcasts typically follow a standard structure  56:13

PAU  ok? // and you can exploit that structure / as you're kind of [/] trying to do things like shot boundary detection or story segmentation // other types of videos such as documentaries don't do that // don't have that  56:23

PAU  &uh one of the other challenges / is this whole notion of kind of labelling / parts of the images / so trying to apply kind of quite high level semantic concepts // which is difficult  56:33

PAU  for example especially when you want to apply more than say a hundred / you know concepts maybe one thousand // if your library has ten thousand // why have some kind of ontology that xxx ?  56:42

PAU  how would you do that ?  56:43

PAU  providing interactive search / &uh browse interfaces is just difficult // it's quite a hard / &uh medium  56:49

PAU  &uh providing personalised search / is also quite tough in this domain as well  56:53

PAU  &um if you want to have a look at some kind of current state of the art or research going on / in &um image and video retrieval // then have a look at the CIVR / series of conferences // &um they run every year  57:08

PAU  &um this unfortunately is only from two thousand and seven // but this is just a tag cloud / of &um / just the &um / call for papers // and you can see some of the things that they're looking at / mainly image and video retrieval // but they're addressing now things such as the web // so perhaps dealing with &um YouTube videos / images from Flickr / &uh indexing / thinking about applications / in particular // where can we apply this / technology // thinking about systems // there's also heritage / so cultural heritage // looking at ontologies / and &on [/] ontological structure // summarisation // so how can we effectively summarise for example &uh a video / into something much shorter // and so on  57:44

PAU  so it's a very good source for [/] for seeing where the state of the art is at the moment  57:47

PAU  ok now just &uh / in the kind of final part of the lecture / or the talk / I just wanna talk a little bit about evaluation  57:57

PAU  &um so we've [/] had a very kind of quick / and rough introduction to image and video retrieval // that's a very very [/] very very brief introduction / there's an awful lot more to it than that // &uh than we can cover in just kind of one hour [/] one hour and a half  58:12

PAU  but let me just briefly talk a little bit about evaluation / because I feel that this is one of the / kind of the / important challenges if you like / when importing images / for developing good / &uh image and video retrieval systems  58:22

PAU  &um why is evaluation important ?  58:25

PAU  well evaluating the performance of a system / &uh is an important part of the development process  58:30

PAU  so if you're developing a system you normally want to be evaluating parts of it or the whole thing // ok? // to see how you're improving it  58:36

PAU  you try various tweaks of the parameters / changing the parameters // so continually trying to improve your system  58:41

PAU  &um you want to establish / to what extent the system / &uh being developed / meets the needs of the end-users  58:48

PAU  that also assumes that you have quite a good understanding of what the [/] the end-user actually wants // you might have an understanding of the typical tasks that they perform // the typical tasks and queries that they perform // which your system should serve / and which your system should meet  59:01

PAU  so ideally when you evaluate you want to evaluate against real / life / kind of scenarios / or realistic scenarios // ok?  59:08

PAU  &um you also evaluate to show the effects of changing the underlying system / &um or its functionality  59:15

PAU  you wanna be able to say if I change the algorithm / or this algorithm / or if I change it like this // what would be the effect on the effectiveness or the performance of the system overall ?  59:23

PAU  &um you also perhaps wanna be able to &um compare different systems  59:27

PAU  so is the &um / IBM MARS system better than / the system by somebody else ?  59:32

PAU  so we need to be able to do some kind of comparative performance  59:35

PAU  so evaluation is very very important  59:37

PAU  &um typically evaluation of IR systems / does have quite a long history  59:43

PAU  &um we'd normally tend to focus on either / the system / or the algorithms or the user  59:48

PAU  &uh there are some &um / perhaps &um / large scale evaluation campaigns which tend to focus on both / but we'll come back to that in a moment  59:55

PAU  &um Tefko {%alt: Tesco} Saracevic / &uh in ninety five / &uh distinguishes six levels of evaluation for IR systems / or information systems in general / but you can include IR systems within this  1:00:07

PAU  and &uh / he identifies / the engineering level / the input level / the processing level / output level / use and user / and the social level // and basically says that we should be [/] be evaluating our systems / our information systems at all these different levels  1:00:22

PAU  now I would say that so far a lot of the focus in IR evaluation has really been more on the kind of algorithms / that's been more on evaluating the technologies and the techniques / perhaps less so / on evaluating the output / on evaluating the use / the user // particularly what would be the effect on the social [/] you know the kind of social implication  1:00:40

PAU  that is to carry out that sort of evaluation normally &come [/] normally involves some kind of longer term evaluation // of actually employing the system / in say / a realistic setting in an organisation // studying the use of that over a long time  1:00:54

PAU  so of course that's something that is not appropriate and possible for every / kind of evaluation campaign if you like  1:00:59

PAU  &um but certainly there has tended to be more of a focus on just evaluating / &um the technologies / or the algorithms rather than &uh / use  1:01:07

PAU  &um so now I just wanna mention &uh very briefly &um / one of the evaluation campaigns I've been involved with for a few years now  1:01:17

PAU  &um the evaluation campaign is called CLEF // &um stands for the Cross-Language Evaluation Forum  1:01:24

PAU  and &um CLEF has &uh / been going for &uh [/] well it's gonna be its tenth year / &uh next year // and &uh its focus is [/] is mainly on cross language information access  1:01:33

PAU  so evaluating &um various systems that provide &uh cross-language or multilingual / information access  1:01:40

PAU  and &uh CLEF runs a whole / range of different types of tasks // so &er you're able to evaluate all different kinds of information systems  1:01:48

PAU  &uh for example you can just evaluate multilingual ad-hoc / retrieval // that is just given a query / retrieve a bunch of documents from a multilingual collection // and they have &um [/] they have tracks / which test that  1:02:01

PAU  there is [/] there is domain specific / &um multilingual retrieval // there's a kind of a task for that  1:02:07

PAU  there's interactive / &um information retrieval // &uh which Julio is gonna talk a bit about tomorrow I think  1:02:13

PAU  then we have question answering / &um which again &um the guys from UNED / are heavily involved with  1:02:21

PAU  we have cross-language retrieval &uh in web documents / or from web documents  1:02:26

PAU  we have cross-language retrieval &uh &geo [/] geographical / &uh cross-language retrieval // so that's where the &er focus is a little bit more on / exploiting the spatial / &uh properties of documents  1:02:37

PAU  &uh cross-language video retrieval / that was new for this year // &uh that's a really interesting task // &uh very very interesting  1:02:44

PAU  and then also multilingual information filtering as well  1:02:47

PAU  &um so CLEF / has a huge range of different tracks // and they're trying to evaluate all kinds of information systems across all different kinds of contexts and domains  1:02:57

PAU  so that seems to be that we’re heading in the right direction here // this seems to be pretty good  1:03:01

PAU  so let me just talk about ImageCLEF  1:03:05

PAU  &uh so &I [/] ImageCLEF was set up in &um two thousand and three // it was the first time we &uh ran it // it's still running now // and &uh hopefully it will run next year as well  1:03:14

PAU  &uh it's a part of CLEF // and &uh we have a number of different tasks  1:03:17

PAU  &um this year the tasks that we had were &uh retrieval / of general photographs from a general photographic collection  1:03:23

PAU  &um you might think that's kind of easy but they're general photographs // so that tends to be a little bit harder / they tend to be kind of touristic &pho [/] photos // &um so the type of photos that you might find on / Flickr for example  1:03:34

PAU  &uh so they are just / a little bit harder to work with  1:03:37

PAU  &um those photos so that photo collection &uh had bilingual / or had multilingual captions as well // so you could exploit either the visual properties of the image or the text / &uh maybe interesting  1:03:47

PAU  &uh this year we xxx [/] specifically focused on what we call the diversity task  1:03:51

PAU  so the aim is basically / &uh to try and &uh / retrieve &uh diverse groups or groups of images  1:03:57

PAU  &uh I'll show you an example of that in a moment  1:03:59

PAU  and we also have a classification task as well  1:04:02

PAU  &uh so could you label images / &uh from this general photographic collection with a very simple / set of categories or concepts ?  1:04:09

PAU  &uh we also had &uh / a Wikipedia task this year  1:04:13

PAU  so this was &uh a collection of images &um taken from Wikipedia  1:04:17

PAU  &uh not only did you have images but of course Wikipedia has semi-structured data as well  1:04:22

PAU  so &uh the &uh participants can make use of that  1:04:25

PAU  and that's kind of interesting because again it's quite realistic task / it's quite a realistic setting  1:04:29

PAU  and this task evolved from INEX // and basically as [/] as INEX which has been a large scale evaluation campaign / for &uh evaluating &s [/] &uh semi-structured or structured information retrieval systems  1:04:40

PAU  INEX &uh has closed now / and stopped running // &uh but the people around the &uh Wikipedia task have moved over to ImageCLEF // and so now that's kind of part of / &uh ImageCLEF // so that's where it comes from  1:04:51

PAU  &uh for a number of years now / we've been running a specific / &uh medical image retrieval track  1:04:56

PAU  &uh that's been very very interesting / we've had &um / a good set of realistic collections / of medical images // &uh which include / not only the images but also the case notes as well / the case histories  1:05:07

PAU  that's a very very specific challenge // and so here for example we tend to &uh recruit or [/] or we get a lot of participants from &um / medical institutions and hospitals for example // or medical research institutions // and that is because they often [/] you are dealing with quite specific clinical terminology // and they wanna make use of / &uh some of that terminology in their / retrieval  1:05:27

PAU  &uh there's also &uh [///] on the medical images we also had a &um / automatic image and &uh / &um annotation task as well  1:05:34

PAU  &um and that also in conjunction with I-CLEF / &uh we've also been running an interactive / &um cross-language image retrieval task as well  1:05:43

PAU  so we've been quite busy hhh {%act: laugh}  1:05:46

PAU  &uh so ImageCLEF itself // &um what we've tried to do is mainly promote &uh / system / evaluation // but we are aware / that is very important to evaluate the user as well // so we did try or we have tried to perform / &uh user-centered evaluation as well / on this kind of large scale setting // and just to say with this large scale setting what you're trying to do / is basically attract &uh participants / from groups around the world / where you're trying to then compare various systems  1:06:13

PAU  so it's not just like a single group // ok? // &uh working on &uh their own system // you're trying to compare various systems  1:06:19

PAU  &uh what we typically provide / are a number of resources to help you evaluate and test your systems  1:06:25

PAU  so over the years ImageCLEF has provided a number of document collections // that includes both the images and the text  1:06:32

PAU  those collections / we have &uh / as far as possible tried to make sure / that you can get access to those // they are publicly accessible  1:06:39

PAU  ok? // that's been one of our goals / to create &um image collections and sets of collections which anybody can get hold of for research purposes  1:06:46

PAU  and it is very hard // ok? // to do / kind of realistic / image retrieval evaluation // ideally you want a set of images / for example from a large news agency / or something like that // but copyright / problems get in the way  1:06:59

PAU  &um but we have secured a number of &um / collections / for use in ImageCLEF  1:07:03

PAU  so not only do we provide the document collection / we provide some example search tasks // &uh typical kind of search tasks // &uh typically [/] &uh normally ad-hoc retrieval tasks or search tasks  1:07:14

PAU  we then also provide the relevance judgements / for each of those tasks  1:07:18

PAU  so we tell you out of the collection // which of the documents are relevant to a user need or not  1:07:23

PAU  and that is exactly what you need to then be able to evaluate / &um the system effectiveness  1:07:28

PAU  and we also provide other resources as well // so we supply participants with example content-based retrieval systems / if they want to use them  1:07:36

PAU  we provide for example for the medical task access to / &uh clinical terminology lists / &uh gazetteers and so on  1:07:42

PAU  &uh an annual workshop is held every year in conjunction with CLEF / and the ECDL  1:07:48

PAU  and &um [/] xxx [/] xxx [/] so at the workshop then people are able to / &uh basically tell us // ok? // how have they achieved / &um &er their performance  1:07:57

PAU  so we normally select the best performing systems // and they present / &uh their algorithms / or how they've achieved their / &uh results  1:08:03

PAU  just to mention to say that &s [/] &um [///] this is mainly focused on image retrieval  1:08:10

PAU  if you're trying to evaluate your video retrieval systems // TRECVID is the obvious / &um place to go to // &um but also now / &um we have CLEF video as well // ok? // so that's another place that you can go / to evaluate video retrieval systems  1:08:22

PAU  and they operate in / pretty much the same way  1:08:25

PAU  &um so I'm not gonna run through this but just to say from two thousand and three we started / &uh quite small // so we just had one / image retrieval task / which is based upon that / historic photographic collection // if you remember that lighthouse  1:08:39

PAU  so we had twenty thousand images from Saint Andrews Library  1:08:42

PAU  &uh a set of about fifty / &um search tasks / or queries derived / from a query log / from Saint Andrews University // &uh and then basically the goal was to find // you know // the system had to find the relevant images // and they were a kind of diverse query tasks to test in xxx very different / &uh aspects // and we had four participants / including ourselves // so we didn't seem to attract too much  1:09:04

PAU  so at that point we were thinking about / killing the track  1:09:06

PAU  &uh however we decided to carry on  1:09:08

PAU  in two thousand and four we added a medical retrieval task  1:09:11

PAU  &uh things got slightly better because we had seventeen participants / &uh which is great  1:09:16

PAU  in two thousand and five we then had four tasks // we added two image classification tasks / because people from the community said that they would be interested in them // so we should add that / and we did  1:09:26

PAU  &uh then in two thousand and six / we had thirty participants for four tasks  1:09:30

PAU  we also had an object classification task using the &uh data / provided to us from LTU / which is excellent that's very real world data  1:09:38

PAU  two thousand and seven thirty five participants  1:09:40

PAU  classification task was made harder // because it was turned into a hierarchical classification task // &uh which is a harder task  1:09:47

PAU  and then in two thousand and eight this year we managed to &uh get forty five &um participants &um to take part in our track // &um which is great I mean it's really good to have sixty three people registered / forty five people take part  1:09:58

PAU  and I think part of it / is because we introduced this new / task as well // we had the Wikipedia task // and we had a new collection for example for the ad-hoc tasks that people &uh seemed to quite like that  1:10:07

PAU  &uh so sixty three groups registered for five tasks  1:10:11

PAU  they're kind of xxx [/] the photo retrieval tasks from an [/] &uh general collections that's kind of just to search from a general photographic collection / still seemed to prove very popular  1:10:19

PAU  &uh the medical &uh retrieval tasks are quite popular as well // the &me [/] the Wikipedia task was also / &um popular as well  1:10:26

PAU  &um / I will say some of the highlights from this year  1:10:30

PAU  &uh we had really good participation  1:10:32

PAU  it's always quite difficult to / attract people to take part in / a competition / or evaluation that you run // because there's a lot of work // and there's no kind of / immediate benefits // if you like  1:10:42

PAU  so it's very encouraging that we had in the end forty five people actually submit [/] &uh submissions to &um / to the evaluation campaign  1:10:49

PAU  &um we had this interesting and I'll show an example in a moment / but this interesting task of being able to &uh / promote diversity / in your retrieval system / in the image retrieval system // that seemed to work very well  1:11:01

PAU  the Wikipedia task was great / that seemed to attract &uh a lot of interest  1:11:05

PAU  the &um [/] I'll tell you about this in a moment / but the medical retrieval task was particularly interesting / because the challenge there was actually what you had was a bunch of / &uh medical literature // ok? // &uh journals for example / medical journals // and so there you had / an awful lot of text / in the &uh [/] around the image which was not associated with the image itself  1:11:23

PAU  so actually what you needed to do is select the / bit of relevant text that you would associate / with the image // so filter out the / background or the noise if you like // &uh which proved to be quite challenging  1:11:34

PAU  &uh Quaero / which is a big &uh / EU-funded &uh project sponsored / &uh a workshop that we ran before / &um we had &uh ImageCLEF / before the ImageCLEF event / where we really focused on multimedia retrieval evaluation issues  1:11:48

PAU  and we've run this now for a couple of years and we've attracted people / Alan Smeaton / &uh Donna Harman and people like that come along and talk a bit about evaluation  1:11:55

PAU  and the goal is great / just to get people together and discuss / some of the issues which / we feel are facing us today / &uh to evaluate multimedia retrieval systems  1:12:03

PAU  &um so the various tasks // &um I'm not gonna run through this / in detail but just to say we had a photographic retrieval task / aimed at promoting diversity  1:12:12

PAU  we had an automatic concept detection task / which had very simple hierarchy of objects  1:12:17

PAU  a Wikipedia retrieval task / and then two medical retrieval tasks  1:12:21

PAU  &um so the photo retrieval task looked to be like this hhh {%act: pointing to the screen}  1:12:26

PAU  the aim was to promote diversity in your retrieval system  1:12:29

PAU  &uh what does that mean ?  1:12:30

PAU  &uh well imagine that we do a search like this / and &uh we want images of typical Australian &amages [/] &uh Australian animals // we do a search // ok? // and we get back this / set of results // ok? // so images of typically called Australian animals // ok? // and we get a lot of / kangaroos and wallabies  1:12:48

PAU  &uh actually the precision at ten in this case is one // so / it's a perfect result // isn't it ?  1:12:54

PAU  well let's see  1:12:55

PAU  ok? // everything is relevant so / all seems to be good  1:13:00

PAU  now what happens then if we had a retrieval system // &uh which did this ?  1:13:04

PAU  ah! / this is interesting // because now what we have / are different / Australian &an [/] &uh &um animals  1:13:13

PAU  so actually what we've done here / we have a system that now promotes diversity  1:13:17

PAU  that is what we have done is / cluster together the results / into groups // select representatives / from those clusters // and put those in the top ten // ok? // and that's what we call a diverse result  1:13:30
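
A minimal sketch of the re-ranking step just described, assuming the clustering has already been done and every result carries a cluster label. The function name `diversify`, the image identifiers and the cluster labels are hypothetical; the talk does not say which clustering or selection method was used.

```python
def diversify(ranked, cluster_of, k=10):
    """Re-rank so that the best-ranked image of each cluster comes first.

    ranked:     result identifiers in their original rank order
    cluster_of: dict mapping each identifier to its cluster label
    The first image seen from each cluster acts as its representative;
    the remaining images keep their original order after those.
    """
    seen_clusters = set()
    representatives, rest = [], []
    for item in ranked:
        cluster = cluster_of[item]
        if cluster not in seen_clusters:
            seen_clusters.add(cluster)
            representatives.append(item)
        else:
            rest.append(item)
    return (representatives + rest)[:k]

# Made-up result list for a query on Australian animals.
ranked = ["kangaroo1", "kangaroo2", "wallaby1", "kangaroo3", "koala1", "emu1"]
cluster_of = {"kangaroo1": "kangaroo", "kangaroo2": "kangaroo",
              "kangaroo3": "kangaroo", "wallaby1": "wallaby",
              "koala1": "koala", "emu1": "emu"}

top4 = diversify(ranked, cluster_of, k=4)
print(top4)
```

With a cut-off of four, the original list would show three kangaroos and one wallaby, whereas the re-ranked list shows one image of each animal.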

PAU  that's promoting diversity  1:13:32

PAU  so now what we find xxx [/] xxx xxx [/] I haven't got it  1:13:36

PAU  but here the precision is also one // but which result / is more satisfying ?  1:13:41

PAU  xxx you probably prefer this result and we can show some empirical evidence where we've tried this with users / and users would prefer this result / as have some other people as well  1:13:50

PAU  so it seems to be that promoting diversity is a good thing to do // and so what we want to do [/] do in this task / is to actually encourage people to do that / and then provide evaluation resources / so that people could assess their systems  1:14:02

PAU  and that involves for us thinking about the evaluation measures // 'cause precision and recall don't work any more / so we have to come up with something else  1:14:09
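
A minimal sketch of why precision alone cannot separate the two result lists, and of one measure that can: cluster recall (also called subtopic recall), the fraction of known relevant clusters that appear in the top k. Whether this is exactly the measure the speaker has in mind is an assumption; the identifiers and cluster labels below are made up, and a cut-off of four is used instead of ten to keep the example short.

```python
def precision_at_k(ranked, relevant, k=10):
    """Fraction of the top k results that are relevant."""
    return sum(1 for doc in ranked[:k] if doc in relevant) / k

def cluster_recall_at_k(ranked, cluster_of, k=10):
    """Fraction of all relevant clusters represented in the top k.

    cluster_of maps every relevant document to its cluster label.
    """
    all_clusters = set(cluster_of.values())
    found = {cluster_of[doc] for doc in ranked[:k] if doc in cluster_of}
    return len(found) / len(all_clusters)

# Made-up relevance data for a query on Australian animals.
cluster_of = {"k1": "kangaroo", "k2": "kangaroo", "k3": "kangaroo",
              "k4": "kangaroo", "w1": "wallaby", "ko1": "koala", "e1": "emu"}
relevant = set(cluster_of)

homogeneous = ["k1", "k2", "k3", "k4"]   # all kangaroos
diverse = ["k1", "w1", "ko1", "e1"]      # one image per animal

p_homogeneous = precision_at_k(homogeneous, relevant, k=4)
p_diverse = precision_at_k(diverse, relevant, k=4)
cr_homogeneous = cluster_recall_at_k(homogeneous, cluster_of, k=4)
cr_diverse = cluster_recall_at_k(diverse, cluster_of, k=4)
print(p_homogeneous, p_diverse, cr_homogeneous, cr_diverse)
```

Both lists have a precision of one, but only the diverse list covers all four clusters.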

PAU  &um so it's a very interesting task / &uh around  1:14:15

PAU  hhh {%act: click} &uh the visual concept detection task  1:14:17

PAU  basically people had to take the images and assign these concepts // and these concepts were on a kind of hierarchical &fa [/] &uh fashion  1:14:24

PAU  so we had the classic indoor and outdoor concepts // you must &uh label an image and tell us whether it's indoor or outdoor  1:14:30

PAU  but you &m [/] must also tell us whether there are people in the images / whether there are animals in the images // water / sky / whether it's day or night / whether there's a road or pathway in the image  1:14:40

PAU  and again we provided the resources so the people could evaluate the systems // and so with the &uh outcome of this  1:14:46

PAU  you can look for the papers and you can actually see how well can this task be done  1:14:51

PAU  and why is this kind of useful ? // well a lot of the images we use here / are the types of images you might have in your own personal collections  1:14:57

PAU  so actually you could apply this technology to your own stuff // and add these labels automatically  1:15:03

PAU  and that actually might help you an awful lot / when it comes to retrieving your own [///] and browsing through your own personal photos  1:15:09

PAU  &uh the Wikipedia task / very simple / just looked like this // so people were given this kind of information  1:15:17

PAU  &er you had a number of queries that you had to perform // &uh and this kind of semi-structured retrieval  1:15:22

PAU  so you had to know and understand / something about Wikipedia first / to be able to exploit / &uh the text  1:15:27

PAU  again it's kind of challenging // because sometimes / there's very little text // sometimes the text doesn't relate to the image // and you're kind of [/] you know // you have to work out // &uh the best text to use  1:15:36

PAU  &uh and then we had the medical annotation task / &uh in two thousand eight  1:15:42

PAU  ok? // and this is very interesting // had to use a hierarchy of classes  1:15:44

PAU  &uh in the first task that we had / the aim was to use a coding scheme / which was for radiological &uh images  1:15:50

PAU  so that is that you might have an image and you have to say / well [/] basically there's about a twelve digit code // and this code represents different aspects of the image // that is the image involves an x-ray // xxx [/] xxx somebody standing up like this hhh {%act: shaking right hand vertically} / and so on // so you [/] you classify the image in &uh quite a complex way  1:16:06

PAU  again / some kind of evaluation run // and that was very interesting in this task / local features again // we've seen this before but local features were outperforming the global ones  1:16:15

PAU  that is dividing an image into bits and &um basically computing / features / on the segments of the image or the bits of the image rather than taking the features over the whole image // seemed to be working quite well  1:16:26
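
A minimal sketch of the difference between a global feature and local features, assuming a greyscale image stored as a list of rows and using mean intensity as a stand-in for a real descriptor. The function names, the two-by-two grid and the toy images are assumptions for illustration; the participating systems used far richer features.

```python
def global_feature(image):
    """Mean intensity over the whole image, as a one-element feature vector."""
    pixels = [p for row in image for p in row]
    return [sum(pixels) / len(pixels)]

def local_features(image, grid=2):
    """Mean intensity of each block in a grid x grid division of the image."""
    height, width = len(image), len(image[0])
    block_h, block_w = height // grid, width // grid
    features = []
    for gy in range(grid):
        for gx in range(grid):
            block = [image[y][x]
                     for y in range(gy * block_h, (gy + 1) * block_h)
                     for x in range(gx * block_w, (gx + 1) * block_w)]
            features.append(sum(block) / len(block))
    return features

# Two toy 4x4 images: bright top half versus bright left half.
top_bright = [[200] * 4, [200] * 4, [0] * 4, [0] * 4]
left_bright = [[200, 200, 0, 0] for _ in range(4)]

print(global_feature(top_bright), global_feature(left_bright))
print(local_features(top_bright), local_features(left_bright))
```

The global feature is the same for both images, while the block-wise features tell them apart, which is the intuition behind local features doing better.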

PAU  and it did seem to be very much that machine learning techniques are the keys to success in that task // so selecting the [/] the right machine learning technique / &uh seem to be very very key  1:16:35

PAU  and then the medical retrieval task // just to say // as I said // it was just very very interesting // that is it was a task where you're not retrieving now images from just a kind of image collection with short captions // you are basically given a set of scientific articles / or medical articles // and your goal is basically to find a set of images / &um from those articles // but it's just tough // because not all of the text in the article relates to the image so xxx  1:17:02

PAU  again and it's also very realistic // so this is &uh / quite a [/] a good example of &um / trying to do a task / &uh which basically / &uh deals with something in the real world  1:17:11

PAU  the case in a kind of education or research setting in a hospital for example  1:17:15

PAU  people need to train // &uh they need to look up literature &uh and so on  1:17:18

PAU  and so they want to search for this kind of information  1:17:21

PAU  it's kind of very real world  1:17:22

PAU  &uh for two thousand and nine / what are we gonna be doing ? 1:17:26

PAU  well / we hope that basically we'll be &uh / running again // so &uh at the moment we're trying to &uh organise between ourselves / it's about / eight different people involved in organising this track / is [/] is quite big  1:17:36

PAU  &uh the medical retrieval task we want to continue // &uh but this time the goal / is going to be to &uh actually retrieve cases instead of images  1:17:44

PAU  so you might have ten different images related to somebody's case  1:17:47

PAU  somebody goes into a hospital / and has a condition / a case [/] set of case notes / so a case file is generated for that person  1:17:54

PAU  that consists of lots of text about the condition / perhaps about the outcome / about the individual  1:17:59

PAU  this is all being &uh anonymised xxx [/]  1:18:02

PAU  &um and then you might have a number of different images taken from [///] you might have an x-ray / you might have &uh something else // which are all collected together in this thing called the case note  1:18:10

PAU  typically what we've done is just said // well let's just retrieve an image // ok? // that's a little bit simpler  1:18:16

PAU  but it could be / a more realistic thing to do / and / a much harder thing to do // is actually retrieve the whole case / set of case notes / which would sort of be very useful / particularly in the / clinical setting  1:18:25

PAU  &uh the Wikipedia task / we want &que [/] more queries in different languages  1:18:29

PAU  &uh the object / &uh recognition task / and somebody's approached us where they have a robot vision database / that they'd like to be explored  1:18:36

PAU  &uh photographic retrieval task // we're &um at the moment talking to a big news agency / to see whether they let us &um have a sample of about a million / of their images // so that we can &te [/] then &te [/] test diversity on a much larger / collection // which should be very interesting  1:18:50

PAU  &uh and I just wanted to end up by just saying that &um // lab-style evaluation is good // which effectively is what I'm talking about here // but // &um evaluation resources xxx provided by various communities such as TREC and by &CLE [/] and by CLEF / have really shown a lot of positive effects on the IR community  1:19:10

PAU  they have been very very useful // they have promoted / &um research / in information retrieval // they've been very very beneficial  1:19:18

PAU  and the results've even gone on to inform commercial IR systems  1:19:22

PAU  so // you know // these evaluation campaigns are very very important  1:19:26

PAU  &uh however / although the benchmarks are important / there are still many questions / which we face with / which perhaps this type of benchmark or this type of evaluation / doesn't necessarily answer  1:19:36

PAU  &uh that is / how do we measure the accuracy ?  1:19:40

PAU  what measure do we need to use / to measure [/] measure system effectiveness ?  1:19:44

PAU  do we just use precision and recall ?  1:19:46

PAU  do we use precision at ten ?  1:19:47

PAU  do we use a kind of &um / binary preference ?  1:19:50

PAU  do we use / any of the other / million different types of evaluation techniques and measures ?  1:19:56

PAU  which one do we use ?  1:19:57

PAU  is precision and call [/] &uh recall good enough ?  1:19:59
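
The set-based measures named in these questions (precision, recall, precision at ten) can be illustrated with a minimal sketch. The image identifiers, the relevance judgements and the function name `precision_recall_at_k` are invented for illustration and do not come from the talk or from any ImageCLEF tooling.

```python
def precision_recall_at_k(ranked_ids, relevant_ids, k=10):
    """Return (precision@k, recall@k) for a single query.

    ranked_ids   -- system output, best match first (hypothetical data)
    relevant_ids -- items judged relevant by assessors (hypothetical data)
    """
    relevant = set(relevant_ids)
    top_k = ranked_ids[:k]
    # Count how many of the top-k retrieved items were judged relevant.
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Invented example: ten retrieved images, four judged relevant,
# one of which ("img11") the system never retrieved.
ranked = ["img7", "img2", "img9", "img4", "img1",
          "img8", "img3", "img5", "img6", "img0"]
relevant = {"img2", "img4", "img5", "img11"}

p10, r10 = precision_recall_at_k(ranked, relevant, k=10)
print(f"P@10 = {p10:.2f}, recall@10 = {r10:.2f}")
```

Both numbers describe the same ranked list but answer different questions (how clean the first page is versus how much of the relevant material was found), which is one reason a single measure may not be enough.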

PAU  another important thing is actually which of those measure actually correlates well / with human effectiveness / or user / &um performance // user effectiveness or user's success ?  1:20:09

PAU  so if we do an &evalu [/] &uh do an experiment / where we get people / to say whether they're satisfied with the results or not // ok?  1:20:16

PAU  do those results correlate with system effectiveness / whether the system has performed well or not ?  1:20:21

PAU  'cause ideally when we try to evaluate the systems in this kind of setting // we do wanna be making use of measures which correlate with human / &uh success / and satisfaction  1:20:30
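
One hedged way to check whether a system measure tracks user satisfaction is a rank correlation over systems. The sketch below uses Kendall's tau (the simple tau-a variant, without tie correction), which is a choice made here and not one named in the talk. The scores and satisfaction ratings are invented.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # Pairs tied on either variable count as neither.
    n_pairs = len(xs) * (len(xs) - 1) / 2
    return (concordant - discordant) / n_pairs


# Invented data: one effectiveness score (e.g. P@10) and one mean
# user-satisfaction rating per system, in the same system order.
system_scores = [0.10, 0.25, 0.40, 0.55, 0.70]
user_satisfaction = [2.0, 3.1, 2.8, 4.0, 4.5]

tau = kendall_tau_a(system_scores, user_satisfaction)
print(f"Kendall tau-a = {tau:.2f}")
```

A value near 1 would suggest the measure orders systems much as users do, and a value near 0 would suggest it does not. With real data, ties and sample size would need more careful treatment than this sketch gives them.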

PAU  &uh the other key thing is well what is the role of the user / in the evaluation process of multimedia retrieval techniques ?  1:20:37

PAU  it is very difficult to involve the user / to create a large scale / user evaluation campaign  1:20:42

PAU  &uh Julio will / probably explain this tomorrow // but attempts have been done / in CLEF / as part of I-CLEF // but it's difficult [/] it's difficult to try to get people to participate // difficult trying to think of the right task // difficult trying to think of the right way of doing the evaluation and so on // it's just / harder / than kind of setting up a system orientated evaluation  1:21:00

PAU  the other big thing is how much does multimedia retrieval depend upon the context ?  1:21:04

PAU  'cause you don't often make use of context // ok?  1:21:07

PAU  so it doesn't matter where you are // it doesn't matter what time of day it is / and so on  1:21:11

PAU  &uh I guess the other thing to think about then is / this notion of user judgements versus / system effectiveness or technical &accurateness [/] accuracy  1:21:21

PAU  &uh there is some research that shows that a good user / with a bad system / &uh is usually better than a bad user with a good system  1:21:29

PAU  &um so research by Hersh / Allan / Turpin and Hersh and so on  1:21:35

PAU  &um so it seems to be that what you want to do / is effectively give users &um [/] &um [///] well / actually what the research shows / it doesn't really matter what type of system you give a user // give him a bad one or &uh a good one // they will adapt // and they will basically turn out performing the same  1:21:50

PAU  &uh however &um // as part of the paper published at SIGIR this year which contradicted / a lot of the previous research  1:21:58

PAU  and in this paper what we did was actually make use of a much larger set of users / and much larger set of topics  1:22:04

PAU  so in one of these previous experiments they only used five users  1:22:08

PAU  we used &uh nearly sixty users  1:22:11

PAU  and basically we found &um / that although even when system effectiveness ok? [///] so you have a very small difference between system effectiveness // even then / there were significant differences between / user / performance and user satisfaction  1:22:24

PAU  that is we would say that actually system effectiveness // a &s [/] a user / will be affected / by the performance of the system // that is what we would conclude  1:22:33

PAU  so there's a bit of conflict here / so more research needs to be done  1:22:37

PAU  &um multimedia [///] the other big problem we have / is that multimedia retrieval &um / researchers are not usually  / &uh user interface people or human &inte [/] &uh human computer interaction people // and for good reason because &of [/] often we're in different camps // ok?  1:22:50

PAU  it's very [/] very rare to get people who are a kind of // you know // across-the-board if you like  1:22:54

PAU  &um so that is &um / often researchers have little experience / with real user tests / and setting up user evaluations  1:23:01

PAU  &um probably very little interest in investing time and effort in &put [/] carrying out user evaluations because they don't necessarily see the benefit  1:23:09

PAU  &um also a lot of the communities / for example the computer vision communities / want hard ground truth // ok? // they're not worried about this soft / fluffy kind of user stuff  1:23:20

PAU  &um but I would argue and say that is important // and we do need to be thinking about that  1:23:25

PAU  but the question / how do we / put that inside an evaluation / campaign ?  1:23:29

PAU  I'm gonna / leave that 'cause we have / almost run out of time  1:23:34

PAU  just one thing to say on context-driven / &uh retrieval // certainly in the medical task that we run // ok? // retrieval typically defines the [/] depends on the context  1:23:45

PAU  ok? // that is // &uh many domains require viewing images in a specific context  1:23:50

PAU  for example in the medical domain // no medical doctor would be analysing the images without some kind of clinical context  1:23:57

PAU  they would need [/] they would know something about the age of [/] of [/] &uh &um / the participant in the photo / the sex / &uh lab results and so on  1:24:05

PAU  that is we need that contextual information which is why in the medical tasks / we want to move to using case notes // because that provides the kind of clinical context / to these individual photos which we believe provides a much more / realistic type of evaluation  1:24:19

PAU  so we need to keep testing on different domains // &um that's pretty important  1:24:26

PAU  &um so in conclusion then // &um I think it's still unclear whether a direct relationship exists between IR effectiveness and user satisfaction with the search results  1:24:36

PAU  we seem to have some slightly contradictory / &um results here but we've got to explore that further  1:24:41

PAU  this is important // &uh because previous experiments confirm strong relationship between the performance of the &uh [/] &uh the user / and the system // &um but / it's contradictory  1:24:52

PAU  so it just seems to be that some research says that it doesn't matter how well a system performs / the user can adapt / and perform equally well with a good or bad system  1:25:00

PAU  other research says well actually that's not true  1:25:02

PAU  so now we need some more evaluation to be confirming this  1:25:06

PAU  &uh measures of system effectiveness  1:25:09

PAU  that's a big problem  1:25:10

PAU  which measure do we choose ?  1:25:12

PAU  which measure do we evaluate our systems with ?  1:25:14

PAU  ok? // we need to be / having more experiments where we can correlate / system effectiveness with user performance  1:25:19

PAU  that's an important aspect / xxx an important area  1:25:22

PAU  &uh other important areas / I think another measure is [/] perhaps we need to think of / we don't necessarily think of xxx these large scale evaluations / include not only user satisfaction  1:25:31

PAU  what about system speed ?  1:25:32

PAU  ok? // that's quite a hard thing to / perhaps incorporate / 'cause everybody is working on / different systems // but how we might include that ?  1:25:39

PAU  obviously system speed / if your algorithm takes / a year to index and run / over that set of queries / mine takes five seconds // ok? // which one is better ?  1:25:47

PAU  xxx yours performs better / &uh precision at ten // but it takes a year  1:25:51

PAU  &um what about user confidence ?  1:25:54

PAU  what about task interestingness // what about task difficulty ?  1:25:56

PAU  these are all things that perhaps we could take / into account / &uh as part of our evaluation measures  1:26:01

PAU  &um so / a way forward // &um let's see // let's do a little bit more research in which measures of system / effectiveness correlate best with human satisfaction ... 1:26:12

PAU  &uh let's continue then with large scale evaluations / &um across domains and tasks  1:26:18

PAU  &uh I think we need a combination of measures perhaps to successfully evaluate IR systems // because each of the measures does tell us something different // and does give us some interestingness  1:26:27

PAU  &uh we probably wanna be thinking how could we include some kind of performance measure / or some measure that correlates with &um &use [/] user &um [/] &um interaction / satisfaction  1:26:38

PAU  &um some kind of measure that indicates the effort that's been involved / in terms of building the system that's then being used / to &uh / &uh run and &par [/] and participate in a competition  1:26:48

PAU  how many man-hours did you have // I mean xxx members of the staff and so on ?  1:26:52

PAU  &um we need to continue constructing realistic benchmarks // so we've been trying to do this in ImageCLEF // &um but you know / wouldn't it be lovely to be extending the domains ?  1:27:01

PAU  &um moving more to video retrieval / and moving more to audio retrieval and so on ?  1:27:05

PAU  and we must // I end on this // we must conduct more user experiments  1:27:09

PAU  what do people really want from image retrieval and video retrieval systems ?  1:27:12

PAU  and I'll hope / we need more of I-CLEF  1:27:15

PAU  so // keep going with I-CLEF  1:27:17

PAU  and that's it  1:27:20

PAU  sorry for running over  1:27:23