Transcription conventions

@Title: Multimedia Retrieval and Evaluation
@File: mavir10.xml
@Participants: PAU, Paul Clough, (man, C, 3, associate professor, lecturer, Sheffield)	
@Date: 27/11/2008
@Place: Madrid
@Situation: Conference (III Jornadas MAVIR), conference room, not hidden, researcher as observer
@Topic: evaluation and development of multimedia information retrieval systems
@Source: MAVIR 
@Class: formal in natural context, conference, monologue
@Length: 1h 27' 24"
@Words: 15659
@Acoustic_quality: A
@Transcriber: Sergio Calvo Páez
@Revisor: L. Campillos, M. Garrote

PAU so as [/] as Anselmo &uh just told you / my talk today // my lecture today is gonna / consist mainly of thinking about or talking about trends / in multimedia retrieval / and &uh today I'm really going be focusing on access in visual material  00:15

PAU so not really gonna be talking about audio / &uh so that would be a kind of another / &uh lecture in its own right 00:22

PAU &um but I also wanna then talk / so first we're gonna summarize and just &uh give you an overview / of how do we access &uh visual media / that not &on [/] only includes images but also includes videos as well 00:33

PAU &uh then I want to talk a little bit about evaluation / so evaluation is key / if we want to keep developing &uh / good systems 00:40

PAU ok? // so we must be able to evaluate the systems // the multimedia systems that we're building 00:44

PAU &uh and in particular as Anselmo said I'd like to talk about ImageCLEF / which is one of the evaluations I've been involved with for few years now 00:52

PAU &uh if / &uh we have any time left at the end / what I’d also like to do is just give you a few examples of some of the research I've been doing / over the past year or two / &uh related to &uh multimedia retrieval / just to give you a flavour / for some of the things that we do at Sheffield 01:07

PAU so first of all let me just &uh / make sure that you all &uh / understand what &uh information retrieval is 01:14

PAU &uh so information retrieval [/] I work in a information retrieval IR group // and &uh typically information retrieval / &um / you might say is / &uh / the [/] the aim of finding information relevant to a user need / so that includes for example storage / retrieval / presentation / user interaction / and all the other aspects / of &uh searching / &uh for information 01:35

PAU &uh what type of information ? 01:37

PAU well we can include texts / &uh images / audio / ASR transcripts / OCR documents // a whole range of [/] of different media types 01:47

PAU what is a [/] a user need ? 01:49

PAU well the aim is to find information relevant to a user need / but typically in information retrieval what we're trying to do is find information / about / or on a given topic 01:59

PAU so normally we kind of assume that the &uh aim or the goal is a kind of thematic / that's a thematic need so find me documents / on the same topic / as my query 02:09

PAU &um / that involves typically searching / and browsing as well so browsing xxx [/] browsing is xxx [/] also an important part / of information retrieval 02:18

PAU &uh / so the goal is to find the information relevant / to a user need 02:23

PAU &uh relevance itself is a concept 02:25

PAU &uh it's absolutely core to information retrieval 02:28

PAU there have been numerous papers / books published on the whole concept of relevance 02:32

PAU &uh because relevance is typically information retrieval people understand it // is about thematic / or topical similarity / but of course it's not just about that 02:42

PAU ok? // for &some [/] for something to be relevant to a need / &uh it also depends on things such as context // such as some subjective / &um measures / as well 02:50

PAU so is a [/] is a &docum [/] is a document by a certain author ? 02:54

PAU is the authority if this is the best document 02:56

PAU &uh and those aspects which not only are about &uh the theme or &uh topic of a document // they are about other things as well 03:04

PAU &uh over the &f [/] past few years I guess we might say there've been general trends in information retrieval 03:10

PAU &uh so information retrieval is &uh / not a new field // &uh it's been run for quite a long time // and the kind of classical research &uh areas in information retrieval were things like / information retrieval models / &um so xxx [/] various types of models for example / language models / &um vector space models and so on 03:27

PAU &uh evaluation typically on smaller collections / so in information retrieval evaluation / &uh started out / on quite small / &uh collections 03:36

PAU &uh there was quite a focus in the early years on various query languages / and on document indexing / and &uh typically everything was done on quite small / &uh document collections 03:46

PAU &uh then of course / &um things changed // and / the kind of information retrieval / &uh area that we are at the moment / and field that we're in / involves a &hume [/] an absolutely massive range of different types / of research areas 03:59

PAU as of course with the growth of the Internet / and the &um [///] we have things like Internet search engines / web search / &uh we have / markup languages / semi-structured search / not xxx structured search 04:10

PAU we then have multimedia contents or multimedia search 04:13

PAU we have distributed collections / there's quite a lot of research there 04:16

PAU we now have cloud computing // so how do you do IR / over a cloud computing ? 04:21

PAU &uh we have user interaction as a big / area / &um that is / &uh taking off / still / a lot of work to be done there 04:27

PAU we have now multilingualism / because of globalisation 04:30

PAU &uh / we have social search / Web 2.0 / and so on 04:34

PAU all of these fields now are part of information retrieval as we know it today 04:38

PAU &uh and / I guess the modern information retrieval / and the retrieval we know today is really being driven by several things such as technology / &uh such as research / &uh / the environments / and the needs of users as well 04:49

PAU so those have been the driving factors to make retrieval what we have / today 04:53

PAU so specifically I want to just talk about multimedia information retrieval 04:57

PAU and &um / there's quite a good [///] this is a quote just from one of the &uh / ACM / grand challenge papers / and this was back in 2005 and two quite well known figures in &uh / multimedia retrieval &uh / Rowe and Jain / &uh basically said this was a grand challenge at the time for multimedia / retrieval / that is make / &uh capturing / storing finding and using / digital media and everyday occurrence on / &uh our computer environments 05:23

PAU &uh now I &s [/] still believe that that is a grand challenge because we still have a [/] a lot of personal multimedia / &uh and / often is quite difficult to be able to manage / even our own photo collections for example 05:35

PAU so there's a long way to go 05:37

PAU &uh so this I think is a challenge that still exists for researchers today 05:40

PAU so there's a long way to go / in multimedia retrieval 05:43

PAU so let me just talk a little bit about / &uh / multimedia retrieval 05:48

PAU &uh first of all &uh digital multimedia 05:50

PAU &uh I guess / most of us know that digital multimedia content is growing / pretty rapidly / &uh not only in the commercial sector but the domestic sector as well 05:59

PAU &uh / and that's driving new forms of interaction // the &inter [/] interaction that we have / with images / speech / video / texts and other forms of unstructured data 06:09

PAU so we’re meeting multimedia data not only at work / but also we're meeting &multi &uh [/] multimedia data at home as well 06:15

PAU so just think about how many multimedia devices do you own ? 06:19

PAU do you own a mobile phone which can take photos / video clips ? 06:22

PAU do you have digital voice recorders ? 06:24

PAU do you have / digital video cameras / &uh digital cameras ? 06:26

PAU &uh interactive TVs and so on ? 06:29

PAU &uh / we / ourselves / just personally / we have an awful lot of digital &uh capture devices 06:35

PAU &uh applications of course of digital multimedia are / absolutely vast / so we mustn't just restrict it to the kind of academic domain // but of course entertainment / &uh is a very big market / big field xxx // &uh digital photo software / &uh management / music downloads / e-learning / &uh mobile media and so on 06:54

PAU these all are applications of effective multimedia retrieval 06:57

PAU so &multi [/] multimedia information // this kind of xxx [/] &um quite diverse / quite multi-faceted = 07:05

PAU &uh and this is quite a useful diagram just illustrating the various media types that you have 07:10

PAU so typically the multimedia / &uh we would have sounds / &uh text and image // with sound we might have music / as opposed to spoken document annotations 07:20

PAU &uh and they are quite different because / both of those require quite different forms of / &um / processing 07:27

PAU and then with text of course we might have captions or subtitles // we could have ASR transcripts from the audio 07:34

PAU &um / then also we can have &ob [/] &um / &uh obviously &uh bigger texts as well // so we might have an image embedded in a webpage 07:42

PAU and the text surrounding the image gives us some context / &uh and tells us something about the image 07:47

PAU &uh then of course with the images we can have &stim [/] still images // there's an awful lot of work to be done there // we can have black and white / we can have colour / we can have various quality and so on... 07:56

PAU and then we have moving images // we can have animations / which are moving images xxx xxx xxx xxx // and then / going right to the other extreme / we have videos / &uh movies / videos which typically have both moving images together with sounds / together with potentially &uh texts as well // because we can transcribe the audio / &uh to give us texts 08:14

PAU &uh that gives a very very rich / &uh set of data for which we can then provide search and browse / &uh techniques and functionality 08:21

PAU so let me just talk a little bit about how we access &uh visual media 08:25

PAU &uh mainly I want to focus first of all on image retrieval // so how do we / &uh retrieve images ? 08:31

PAU so first of all the kind of &uh assumption is users want to retrieve / &um a document // so a visual document rather than text 08:38

PAU that's what I'm assuming // so you are a user / and you &uh go [/] you go to Google images / and the assumption is that you are looking for a [/] an image / to satisfy your need 08:46

PAU so // a search requests typically xxx expressed &uh either using &uh example images // so / giving an image / find me other images like this one // or more commonly // so that doesn't happen like that when you go to Google images // on Google images you would actually type in &uh [/] &uh [/] &uh [/] &um [/] &uh [/] a written interpretation of your need // so you'd just type in images of / whatever / Madrid / of / &um / whatever you're looking for 09:13

PAU &um / there's an awful lot of application areas / &uh for image retrieval // &uh for &examp [/] example you might be a researcher / &um searching / digital archives // for example you might be working &uh for the BBC 09:25

PAU you could be a curator in a library // in digital library // &um looking for images 09:30

PAU you could be an illustrator looking to &um [/] finding example photographs to illustrate an article 09:35

PAU &uh you could be a professional / accessing for example a science database // for example in medicine / so a health professional // maybe you are looking for a particular radiograph / or set of radiographs 09:47

PAU &um / or you could just be a domestic user // &look [/] trying to find that picture of / holiday last year / and the children standing on the beach // and you cannot find it anyway // it seems to have / been lost somewhere in your multimedia collection 10:00

PAU &um / there's a very &uh useful paper / &um by &uh &Pete &uh [/] by Armitage and &uh [/] xxx Armitage and / a guy called Peter Enser // who's a retired professor now / but he used to be at the university at Southampton 10:13

PAU he's very very well known / in kind of pictorial information retrieval / because he's done quite a lot of studying / of what users want / and the information needs / so a kind of pictorial or visual information needs of users 10:24

PAU and actually I think we have to &s [/] first to recognise that this is place where we have to start // so if you are a student and you're a kind of mover / perhaps technical student // &uh do come back to this 10:34

PAU &uh why // ok? // are you providing a technology ? 10:37

PAU what do your users want ? 10:39

PAU ok? // we do need to understand what types of different search requests / &um could / &uh a user have / for the different media / &um that you're trying to provide 10:47

PAU &um so the Armitage and Enser paper is very useful 10:50

PAU &um they analyse the number of &uh user needs from looking at search logs / &uh from a number of libraries // &uh digital libraries and &uh physical libraries 10:59

PAU and it's just interesting they came up with this kind of chart // if you like goodish / &uh table // ok? // various types of queries // you could classify the queries into these various types 11:08

PAU so what people were searching for 11:10

PAU and all of these databases that people were searching were pictorial databases // so they were typically a sort of databases from museums / from libraries / &uh and so on 11:19

PAU and &uh basically they would say that there're a / sort of three main things / if you like / or three main types of query 11:25

PAU so there were queries / &uh / which basically are where people are looking for something quite specific // either the query might be / David Beckham // ok? // so the [/] the image that you want to &s [/] of something that is quite specific 11:37

PAU but then perhaps you have a more generic type of request / so a picture of something 11:43

PAU so find me pictures of &um / famous footballers for example 11:47

PAU or find me pictures of the Madrid &uh squad / or something like that 11:51

PAU and then you have the kind of abstract level / so in the library or in the &uh analysis of the logs / they did find that you sometimes get quite abstract queries 11:59

PAU so find me a kind of fun pictures of people at a party 12:03

PAU &uh find me pictures of &um people enjoying themselves at the beach 12:07

PAU you know / quite abstract concepts 12:10

PAU ok? // so &uh as information retrieval people we have to think how might we answer / that type of query // how might we build the system / to answer that type of query 12:18

PAU perhaps it means that we don't just have a very simple search box anymore 12:22

PAU perhaps we now need to allow the users be able to browse the collection // create some kind of multi-faceted search 12:28

PAU perhaps we need to make use of semantics information extraction technology and so on / to be able to address an information need like that 12:36

PAU by xxx so / if you are a student and you're interested in this / go and have a look at that paper // you can access it online 12:40

PAU and I think it's a good place to understand [/] &uh to start [/] to understand what the user actually &uh wants // what they're looking for 12:47

PAU if we were to kind of classify image retrieval and the various image retrieval techniques / &um John Eakins / he is from the University of Northumbria / he's again quite well known in the information retrieval community // and &uh he proposed the three &uh level framework for image retrieval 13:02

PAU so you can do image retrieval / at level one // that's retrieval based upon very primitive features 13:08

PAU for example features of colour / of texture / &um of shape and layout 13:15

PAU that's [/] you can do retrieval at that level / so that's level one 13:18

PAU &uh level two // &uh Eakins suggests that you can do retrieval by derived / &um / &er features 13:24

PAU for example it might be that you have your images tagged in some way // so if I for example was &er / accessing images from a / &uh [/] a news agency / or some kind of photo / &um / database // it's quite likely that the images come with captions / generated by the people who archived and stored the images 13:43

PAU and so there for example it might just be that what you're doing is trying to / answer a query such as find me pictures of Tony Blair 13:49

PAU &uh he also identified a kind of third level / of retrieval / which is retrieval by abstract &uh / attributes 13:57

PAU &uh and this is much much harder // if you like // so here what you're doing is actually abstracting / and there's a bit of a gap / between the low level features such as colour / spatial layout / texture / and getting to the kind / if you like / the semantics of an image // 14:12

PAU that is there're images that might depict death / images depicting war // ok? 14:16

PAU you look at an image / yourself and you can definitely work that out 14:21

PAU but how on earth would a computer / or an / image processing technique be able to infer that / from / the basic primitive content / colour / textures / spatial layout / or some of the tags that have been assigned ? 14:33

PAU and the problem is a little bit like this // so find me images similar to this 14:39

PAU well it kind [/] it gets harder as you move up here // ok? // so find me images just similar to this based upon xxx xxx / I want images / with the same colour composition / same texture / same shapes ok? // that's reasonably ok / we can do that quite well 14:53

PAU find me other images depicting death 14:56

PAU ok? // it starts to become a little bit harder 14:58

PAU so it's a bit like this / the little penguin is looking for &uh / this little fish / ok? // that is a example image 15:05

PAU which ones [/] which one of these is the correct image ? 15:08

PAU because they all seem to be roughly the same shape // some of them have &sim [/] similar colour &com [/] composition // but is a fish hhh {%act: laugh} 15:17

PAU ok? // so there are no relevant images because there are no other fishes / ok? 15:21

PAU and the problem is just like that / there's a massive what we call semantic gap / between the low level / &uh primitive features / and the kind of high level concepts 15:30

PAU &um / there are / I guess two main retrieval methods // &uh two ways of how you might retrieve &uh visual media or images 15:41

PAU &er the first one is kind of traditional information retrieval // that is text based information retrieval // &uh and so we would normally call that description-based 15:50

PAU and that is we would typically use the abstracted features which should have been assigned to the image 15:54

PAU so most images don't exist on their own // most images / come with some / kind of context // perhaps they come with the caption // perhaps they come with some other metadata which is embedded in the image / in the EXIF data for example // &uh and some of that can then be used / &um for retrieval // &um we can just use / off-the-shelf standard / &er text based retrieval techniques // so you can index a bunch of images if they have captions using xxx // and very effectively / you could answer &uh queries &uh written &uh [/] &er variable queries // ok? // and you could that pretty easily / that's &uh [/] that's very very simple 16:30
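[Editor's note: the description-based retrieval the speaker describes, indexing image captions with standard text techniques, can be sketched roughly as below. This is a minimal illustration with made-up captions and image ids, not the system discussed in the talk.]

```python
# Minimal sketch of description-based image retrieval: build an inverted
# index over image captions and answer keyword queries against it.
# Captions and image ids are invented for illustration.
from collections import defaultdict

def tokenize(text):
    return [t for t in text.lower().split() if t.isalpha()]

def build_index(captions):
    """captions: dict of image_id -> caption text."""
    index = defaultdict(set)
    for image_id, caption in captions.items():
        for term in tokenize(caption):
            index[term].add(image_id)
    return index

def search(index, query):
    """Return ids of images whose captions contain every query term."""
    results = None
    for term in tokenize(query):
        postings = index.get(term, set())
        results = postings if results is None else results & postings
    return results or set()

captions = {
    "img1": "red and white striped lighthouse on coastal cliff",
    "img2": "harbour with fishing boats at dawn",
    "img3": "lighthouse tower above the harbour",
}
index = build_index(captions)
hits = search(index, "lighthouse")
```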

PAU the other way that you could retrieve images is called content based / &uh image retrieval / CBIR // and the idea here is &uh [/] actually what you're doing is not making use of the abstracted features or not making use of anything assigned / &uh to the image 16:43

PAU what you actually trying to do is make use of the image content itself / the visual content 16:47

PAU &um / and so here what you're doing is basically / &um making use of the pixels // you're making use / &er of the colour / or greyscale intensities of the pixels themselves / and then what you're trying to do is abstract from that 17:00

PAU and from this you can work out // you can generate histograms of colour // you can work out &uh a / set of grids of texture // &uh shapes // you can identify shapes and so on 17:10

PAU and xxx &uh [/] where [/] as opposed to a kind of &um / things like the description / being xxx [/] &er being added manually 17:18

PAU &uh typically the content by stuff is all done automatically // so it's kind of no manual intervention 17:23

PAU &uh of course in / these days we do still have [///] you know // we have now techniques where you can automatically label images as well // and they're sort of now starting to make use of both of these / &um techniques 17:34

PAU of course you don't just have to use these techniques on their own // you can use them in combination and in fact that's where you have the most powerful / &uh retrieval system 17:44

PAU the retrieval system where you can actually make use of the visual features together with the / associated text as well // because you could start the query / in &uh [/] &uh with text // so find me pictures of / whatever // you get back a set of results // and then on &uh / each retrieved cycle you can then say ah! // well find me images more like this one // and by more like this I mean images of the same kind of / shape / layout / &uh colour content 18:08

PAU here is an example of description &uh based image retrieval 18:14

PAU so the images annotated using texts // and then you just use traditional text based &uh retrieval techniques 18:19

PAU &uh the approach is very very popular // so you [/] you only have to look online // look at Flickr / Google / Youtube and so on 18:26

PAU &um [/] &uh &w [/] why is it so popular ? 18:29

PAU well basically you can &um generate a retrieval system which should be efficient 18:33

PAU so as soon as you start involving the actual content of the images / and you start generating features from the content / the retrieval often gets a bit slower // &um there's a lot more processing to do // if you can just make use of the texts / with other systems // ok? // &uh which are very efficient at / accessing texts 18:49

PAU &um so here for example what we have [/] and this comes from &uh a &his [/] historic library // &um it comes from St Andrews &um / library / from the university library 18:58

PAU &uh and they've given us a bunch of images which we’ve used in evaluation called ImageCLEF which I'll mention in a moment 19:04

PAU and here is an example image // ok? 19:07

PAU here is a lovely lighthouse / &uh a striped lighthouse 19:09

PAU and this is actually a postcard // it's been a scanned postcard 19:14

PAU and here is some of the &uh description that you have // ok? // so we have &uh semi-structured data / which is already quite useful // because we don't have to look for the location / it actually tells us / that this is the location 19:26

PAU so for example the short title / is the Smeaton Tower in Plymouth 19:30

PAU &uh the longer title / is Plymouth Hoe / the Smeaton / &uh lighthouse tower // and then there's a description / and the description then is a // if you like // &uh an interpretation of the image which has been added by the historian 19:44

PAU &uh the interesting thing is that if / different people wrote the description is likely to also be / different // because we all see something a little bit different // ok? // in the picture 19:53

PAU however the historians or the librarians have been trained / to write these descriptions in quite structured ways 19:59

PAU so there a [/] there is a certain manner of consistency / &um going across these descriptions 20:04

PAU so here the description is / red and white striped lighthouse / on coastal cliff / with harbour and town beyond / and substantial building on cliff terrace below / which is there 20:16

PAU so such a quite detailed description / of what somebody sees / &uh in an image and what they / &uh think is important / for actually then being able to index the image and retrieve it later 20:26

PAU and then we have the information such as the date of when it was registered / 1904 / the photographer / &uh and so on 20:33

PAU so you might just think that oh! this is pretty simple // all we do is just index this // ok? // using a text based retrieval system and then we can find stuff 20:41

PAU however there are some problems // &uh it doesn't always &uh work quite so easily 20:46

PAU first of all you have to add the annotations 20:49

PAU so this library / &um spend [///] well they employ somebody fulltime / to add these annotations 20:56

PAU they're under pressure / because of financial cost / to get rid of that person 21:00

PAU what happens then ? 21:01

PAU who's gonna annotate them ? 21:02

PAU because if the images don't have annotations the current system / only works on texts / so the images will be lost // they would not be able to be found 21:10

PAU ok? // what happens ? 21:11

PAU &um so annotation is typically very expensive // is also subjective as well // so different people add the xxx [/] different labels / &uh often we’ll add different texts 21:22

PAU the other big problem is the vocabulary used / and if you read it carefully / is a very British English vocabulary // &uh is also a &uh historical vocabulary // is in the vocabulary in the style of that particular library 21:35

PAU and the problem there is that you might have a casual use of the xxx xxx // because they want to provide this is a / service to the general public 21:42

PAU perhaps they want to provide it to &uh / a global audience as well 21:46

PAU so you come along &uh [/] &uh [/] a kind of Spanish &uh [/] &uh member of the general public // and you type in [///] well maybe you don't have a word for lighthouse // maybe you translate it into English into something quite different // but actually even for somebody in English / the &uh text is / very historic 22:02

PAU &um so all other words that used [/] they’re actually words used / a few years ago 22:06

PAU ok? // so general public today / wouldn't necessarily even be able to find the images // even though they are typing in English // so there's a vocabulary mismatch // what you have to do is come up with the technique to overcome that 22:18
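[Editor's note: one simple family of techniques for the vocabulary mismatch just described is query expansion with a synonym table, sketched below. The synonym table is a toy example, not the library's actual solution.]

```python
# Sketch of softening vocabulary mismatch by expanding the user's query
# with synonyms before matching against historical captions.
SYNONYMS = {
    "lighthouse": {"lighthouse", "beacon", "pharos"},
}

def expand_query(query):
    """Replace each query term with its synonym set (or itself)."""
    terms = set()
    for term in query.lower().split():
        terms |= SYNONYMS.get(term, {term})
    return terms

def matches(caption, expanded_terms):
    """True if the caption shares any term with the expanded query."""
    return bool(set(caption.lower().split()) & expanded_terms)

caption = "the pharos tower above the harbour"
hit = matches(caption, expand_query("lighthouse"))
```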

PAU &uh the other / thing is that sometimes is quite difficult to express more kind of abstract / &uh aspects of this image 22:26

PAU so actually if we had an image which was quite emotive // xxx what words would you use ? 22:31

PAU so should we add some words which describe emotion ? 22:34

PAU should we add some words which describe for example speed / the activity / the action that's happening ? 22:39

PAU &um because it might be that somebody &uh is trying to illustrate the texts 22:43

PAU you might a journalist and you want emotive pictures 22:46

PAU ok? // so you want a kind of funny pictures of lighthouses 22:49

PAU you want tall &light [/] lighthouses / well-known lighthouses / and so on 22:54

PAU what words would you add ? 22:54

PAU what scope of words do you add ? 22:56

PAU and this isn't a problem that's new // librarians have thought about this and they've been dealing with this problem for many many years 23:02

PAU however as information retrieval people / as technical people we just have to understand / that this is &uh what we're dealing with // ok? 23:09

PAU this is the type of content we're dealing with 23:11

PAU it looks easy / on first glance / but it's not always quite as easy / as it first appears 23:16

PAU just to mention another problem that you sometimes get / and this was a real problem when we first used this collection / is that you have a kind of notes field here or background field // and this notes field is kind of useful for the historian because it just adds a lot of kind of / peripheral information / historical information 23:32

PAU however / that text does not describe what is in the image // so when you go / searching using a system which just indexes everything // and you [/] you should get back some funny results / and you think / why do they get that result ? 23:44

PAU because the image / doesn't match my query / and you're a little bit dissatisfied / and the reason is that you've actually matched some text which is [/] which is matching in the background // ok? // which is some kind of peripheral / historical text / just gives some context / &um extra additional context to the image 24:00

PAU so a simple solution is not to index the notes / and you perform a little bit better 24:05

PAU ok? // but if you just have to understand some of these things about the content that you're dealing with 24:09

PAU if we now turn to content based image retrieval how does that work ? 24:14

PAU &uh we typically what we do is we &uh [/] we &uh / go through a process of &e [/] &uh / extracting features / so a feature extraction process 24:22

PAU so for example here / this is a histogram of the &uh colours used in the image 24:27

PAU and there are various techniques for &uh representing colour space / for representing the kind of &um histograms … 24:33

PAU this effectively is a kind of colour composition if you like / of our image / and we can do this for various features / so you can do it for texture / we can do it for shape / &uh layout and so on 24:43
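[Editor's note: the colour-histogram feature just described can be sketched as below: quantise each RGB pixel into a coarse bin and count the bins. The tiny pixel list stands in for a real image.]

```python
# Sketch of colour-histogram feature extraction: map each (r, g, b)
# pixel (0-255 per channel) to a coarse bin and count occurrences,
# then normalise so images of different sizes are comparable.
from collections import Counter

def colour_histogram(pixels, bins_per_channel=4):
    step = 256 // bins_per_channel
    hist = Counter()
    for r, g, b in pixels:
        hist[(r // step, g // step, b // step)] += 1
    total = len(pixels)
    return {bin_: count / total for bin_, count in hist.items()}

# Four made-up pixels: two reddish, one blue, one white.
pixels = [(250, 20, 30), (240, 10, 25), (10, 10, 200), (255, 255, 255)]
hist = colour_histogram(pixels)
```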

PAU we then index these features in a / index similar to the way that we might index &uh words / from a standard text based retrieval system we can apply very similar / inverted indexing techniques // &um and then once we've done this / then the user has a search request which in this system would be / here is &g [/] &uh an example image / and what the user wants / what their requests translate into / is basically find me images similar / visually similar to this one // that's the only type of search that they can perform / using this kind of system 25:14

PAU yes you could trick [///] &uh so you [/] you could change it slightly / so you could allow the user to control // well I want images which are more / like this in terms of colour / in terms of texture / you could have those sliders perhaps on the interface / in terms of shape // but you can only basically return back images which are visually similar in some way 25:32

PAU &uh typically in this type of system there's also a relevance feedback commonly used / where users can refine // ok? // the process // so you can kind of get a better / hopefully solution towards the end 25:44
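[Editor's note: the "find me images visually similar to this one" step can be sketched with histogram intersection, a common similarity measure for normalised histograms; the feature vectors here are invented for illustration.]

```python
# Sketch of query-by-example ranking: score each indexed image against
# the query histogram with histogram intersection, most similar first.
def histogram_intersection(h1, h2):
    """Similarity of two normalised histograms (dicts of bin -> weight)."""
    return sum(min(h1.get(b, 0.0), h2.get(b, 0.0)) for b in h1)

def rank_by_similarity(query_hist, indexed):
    """indexed: dict of image_id -> histogram; returns ids by score."""
    scores = {img: histogram_intersection(query_hist, h)
              for img, h in indexed.items()}
    return sorted(scores, key=scores.get, reverse=True)

indexed = {
    "sunset": {"red": 0.7, "blue": 0.1, "white": 0.2},
    "sea":    {"blue": 0.8, "white": 0.2},
    "snow":   {"white": 0.9, "blue": 0.1},
}
query = {"red": 0.6, "blue": 0.2, "white": 0.2}
ranking = rank_by_similarity(query, indexed)
```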

PAU &um so just to say then / normally you would have &uh colour // that's one of the features that you can have 25:54

PAU &uh so colour described in the pixel intensity // there're various colour space models HSV / RGB is a common one 26:01

PAU &uh and typically what you would do is have a histogram which defines a distribution of colour pixels in an image / or greyscale if you're working with black and white 26:10

PAU &uh sometimes it's been shown to be very effective that actually your first stage / in a technique like this / is to convert your colour image into greyscale // and then you do everything in greyscale // because you reduce the space if you like / the feature space // that often works quite well 26:24
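[Editor's note: the greyscale-first step just mentioned is often done with the common ITU-R BT.601 luma weights, sketched below on a few made-up pixels.]

```python
# Sketch of collapsing RGB pixels to greyscale with the standard
# BT.601 luma weights before any further feature extraction.
def to_greyscale(pixels):
    return [round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in pixels]

# Pure red, green, and blue pixels for illustration.
pixels = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]
grey = to_greyscale(pixels)
```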

PAU &uh texture is another thing 26:27

PAU &uh texture typically &uh defines things like smoothness / contrast / it looks at regularity / directionality 26:34

PAU &uh however it's been shown that on its own texture is often of limited value // so you need to use it together with / something like colour 26:41

PAU and then shape as well or segmentation / you can take an image and you can segment it // so in the lighthouse we could see that there's a tall object // that could be one thing // there's another little object / &uh on the left hand side which is another building // that would be something else // and each of those form / separate objects / which we can index 26:58

PAU &uh it has also been shown that / &um [//] xxx &uh there [/] there's a difference between for example &um / basically computing histogram across the whole image // ok? // versus what we could do is chop the images / into little squares // ok? // turn it into a grid and then have &um colour histograms that each of those little squares or each of those little grid [/] &uh grids squares // and it's found [///] that's called the local indexing technique / the other one is global // and it's found that local indexing / does seem to work better / &uh than global indexing 27:29
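[Editor's note: the local indexing technique just described (a grid of per-cell histograms rather than one global histogram) can be sketched like this; the grid size and helper names are illustrative assumptions.]

```python
def grid_cells(image, rows=2, cols=2):
    """Split a 2-D list of greyscale intensities into rows*cols sub-images."""
    h, w = len(image), len(image[0])
    cells = []
    for r in range(rows):
        for c in range(cols):
            cells.append([row[c * w // cols:(c + 1) * w // cols]
                          for row in image[r * h // rows:(r + 1) * h // rows]])
    return cells

def cell_histograms(image, bins=4, rows=2, cols=2):
    """One small intensity histogram per grid cell: the 'local' index."""
    result = []
    for cell in grid_cells(image, rows, cols):
        hist = [0] * bins
        for row in cell:
            for v in row:
                hist[min(v * bins // 256, bins - 1)] += 1
        result.append(hist)
    return result

# A 4x4 image whose left half is dark and right half is bright: the
# per-cell histograms preserve that spatial layout, a global one would not.
img = [[0, 0, 255, 255]] * 4
print(cell_histograms(img))
```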

PAU &uh here [/] so here is an example output from the CBIR system 27:33

PAU &uh / so this is &uh built by one of the &uh co-organisers of the ImageCLEF / &uh Henning Müller / &uh and Stéphane Marchand-Maillet and they work at the University of Geneva // and they do an awful lot of work // they really are &uh a kind of &um / very deep / and very heavy into content based [/] visual content based image retrieval // and they've created the system which you can download and use called Viper 27:54

PAU so you can download and index your own set of images and have a go for yourself 27:58

PAU &um / and here is an example of the Viper working // so they've created the system called medGIFT // so the tool is called &uh [/] it was [/] it used to be Viper and is now called GIFT // and they've created a version called medGIFT which works on radiological images / &uh in a clinical application // so this works in a hospital at the University of Geneva 28:16

PAU and the idea was that the &uh radiologist comes along / with that example / &uh image 28:22

PAU ok? // &um and here we go / the system then basically has indexed previous // ok? // &uh images // and it ranks them // basically produces // if you like // the most similar // ok? // and just ranks some by similarity 28:36

PAU and you can [/] you look at this and you think / well this seems to work reasonably well // ok? // we seem to be doing quite well here 28:41

PAU &uh so let’s try another example 28:43

PAU this time we have a patient here with the &er / hands up // ok? // there’s a certain condition on the &er [///] I think it was on the skin // and you can see that the results are a little bit more variable now // so there's some kind of false hits if you like 28:55

PAU so the first few results seem pretty good // but then as we get along we seem to have got some x-rays here // &uh we seem we've got some kind of arms / and other bits and pieces // so the results aren't quite so satisfying 29:08

PAU so one thing we notice about content based retrieval is often [///] well it's domain xxx [/] often works better on some domains than others // it also tends to be quite query dependent as well 29:17

PAU &uh here is another example / &uh using the same system 29:22

PAU this time what we've done is gone for [///] well let's not just use medical images because that's a specialized domain / let's just use general images // let's use those images from that historic photo collection 29:32

PAU so here for example we have a lovely kind of sunset / in black and white // you can't really see it // and here are the / &um [/] here are the matching images / or the similar images // and I guess as a user / if your need [///] and we're kind of assuming that your need is find me other sunsets // ok? // so find me &ima [/] images &s [/] visually similar to this 29:51

PAU you would probably be reasonably satisfied // because they kind of look quite similar // ok? // so you're quite happy 29:57

PAU however / &uh if we use a slightly / &uh trickier picture … 30:01

PAU so here this is a picture of York Minster 30:04

PAU it's a famous cathedral / &uh in England 30:07

PAU and actually what the user wanted was / well find me more pictures of York Minster 30:11

PAU I've given your picture of York Minster / why is the system so stupid ? 30:14

PAU because it’s returning back not York Minster / but also so different pictures // it's not even got / pictures of the same building / of the same type of structure 30:22

PAU and part of the reason here is that actually it's just that they're very very difficult tasks / to do this reliably and to do it very well 30:29

PAU we can then optimise the system and we could perform a lot better // it would be even better if we could also combine not only the / content / but also the / &uh features / of the captions / the text from the caption as well / and then we could start to do // &uh perhaps something which is a little bit more successful 30:45

PAU but what you will find if you play around with that [///] so download for yourself GIFT // &uh have a play with it index your own photo collection // and you will see that sometimes the results are great // other times the results really aren’t quite so good 30:56

PAU so let me just talk about some of the other things going on / &um in image retrieval 31:04

PAU one of the a &uh perhaps more interesting areas at the moment / and this has been going on for a few years / &uh it's automatic image indexing 31:11

PAU that is what you're trying to do is learn a mapping / between the low level &uh primitive features / and high level semantic concepts 31:19

PAU ok? // and the aim will then be that once we train up a system / we should be able to assign new words // ok? // or new concepts / to the existing images or new unseen images 31:28

PAU and that's really useful 'cause imagine the process of labelling images // &uh manually / it's quite a tiresome / it's quite a slow process // but we could partly automate the process if we could &pro [/] &pro [/] provide for example a set of xxx [/] of key concepts // ok? // for the annotator / just to say oh! yeah let's to say hhh {%act: onomatopoeia} // and just check them // ok? // rather than still have to generate them 31:49

PAU if I [/] we could even do away with any kind of manual checking and we can just allow the system / to provide the &um concepts or the key words / itself / if we wanted to 31:58
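[Editor's note: the mapping from low-level features to words described in the last few turns can be illustrated with a toy nearest-neighbour rule; real annotation systems such as the one shown next use far richer models, and all data and names below are invented for illustration.]

```python
def annotate(features, training):
    """Return the keywords of the closest training image (squared Euclidean)."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    _, words = min(training, key=lambda item: dist(item[0], features))
    return words

# Toy training data: (low-level feature vector, manually assigned keywords).
training = [
    ([0.9, 0.1], ["sunset", "sky"]),
    ([0.1, 0.9], ["building", "cathedral"]),
]
print(annotate([0.8, 0.2], training))  # closest to the sunset example
```

The output keywords could then be offered to a human annotator to check, or used directly, exactly as the speaker suggests.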

PAU and here is the system / &uh which is &uh / reasonably good // you can access this online // it's called LI [/] LI / PR 32:06

PAU and although you probably can't see / &um some of the &um / words which have been subscribed here // &um some of them look pretty good / so here for example we have words such as plant / and it is [/] it looks like a sort of cucumber // &uh that kind of plants // it's got indoor / it's got fruit / it's got food // and that seems pretty good / so you're quite satisfied 32:27

PAU some of these other ones look quite good as well / there's an old building / a castle // we've got historical building // we've got Scotland even / which actually is correct because it's a Scottish castle // we also have predator / we have dogsled / we have eyes / which are wrong // they shouldn't be in there 32:43

PAU &uh one of the most / interesting ones / the best one I think for &uh people from England // so there's only two of us from England so [/] so only me and my wife would appreciate this 32:52

PAU but this one is the picture of the queen mother // and although we have some tags such as industry / maybe / fashion // well she was very fashionable // &uh we have fish / not quite sure where that comes from // &uh female / modern / she was always termed the modern grandmother / she was very modern // and I think the best one / comes out first / as the most / appropriate keyword is gem / she was a British gem / she really was / and I don't think it necessarily &interpretate [/] interpreted that and meant / to put the word gem 33:21

PAU but it's interesting when you play around with the &techn [/] technique like this [/] again / you get quite variable results // but sometimes it works very well // &uh other times it really is a little bit surprising / and &uh / you're not quite sure what's going on 33:33

PAU but that's really a big area in information retrieval / visual information retrieval at the moment 33:37

PAU that is learning the mapping between the low level primitive features / and the high level concepts / and there're various techniques &uh / for doing that 33:44

PAU &uh another &uh &te [/] or an area which is quite popular / &um is effectively / &uh classifying images 33:52

PAU so for example we might want to classify a bunch of our own personal photos / as either indoor outdoor 33:58

PAU well that would be so kind of quite useful because [/] 'cause then we could say &um my query is just find me &uh pictures from parties // ok? // my own photos // but then you perhaps on the interface you could have a filter or drop-down box / which is able to filter either they're indoor ones or the outdoor ones 34:13

PAU this is also very useful for example &uh they use a lot in &uh things like intelligence communities // so scanning // you know // a sort of &uh various pictures / find me indoor ones and these are outdoor and so on 34:24

PAU a lot of this technology is even built into our digital cameras now // so / a lot of digital cameras today have face recognition // well they can identify where there's a face / because often that's where you want to focus / the picture 34:35

PAU they can also detect whether you're indoors or taking a photo outdoors and know whether the flash should be on and what / &um kind of / &uh level of flash to apply 34:43

PAU but typically you do [/] do this type of task like this / you take an image / you divide it into &uh a grid // you could then represent / &um the colour / and the texture / and for each of those grid squares / you can &rem [/] represent the colours [/] colour histogram / &uh and so on ... 34:59

PAU you then can generate basically from some training data // ok? // for each of these little grid squares / &uh we know that / for example we can define whether / part of the image is in or out // so you can do that from training data / for colour and for texture // and then we basically combine all this / together all this information together to give us an overall judgement 35:20

PAU do we decide overall whether the image is / in or out 35:23

PAU but a lot of these grid squares here / &uh seem to be implying that parts of this image are inside // so overall the average comes out that we classify the image as being inside and we'd be correct 35:34

PAU and there's a lot of work going on on how best to / classify inside outside / and other types of &um classes as well 35:41
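[Editor's note: the grid-based indoor/outdoor scheme just described combines per-cell judgements into one overall decision; a minimal majority-vote sketch is below, assuming a per-cell classifier has already been trained on colour and texture features, as the speaker outlines.]

```python
def classify_image(cell_labels):
    """Combine per-grid-square judgements into one overall image label."""
    indoor = sum(1 for label in cell_labels if label == "indoor")
    return "indoor" if indoor > len(cell_labels) / 2 else "outdoor"

# Suppose a trained per-cell model produced these 4x4 = 16 judgements:
cells = ["indoor"] * 10 + ["outdoor"] * 6
print(classify_image(cells))  # the majority of cells say indoor
```

Real systems weight the votes, for example by each cell's classifier confidence, rather than counting them equally as this toy version does.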

PAU it's probably worth mentioning that &um / although overall perhaps on general photographs / the problem is very difficult // ok? // for content based retrieval 35:51

PAU &uh the results aren't always very good and very appealing 35:54

PAU however there are certain domain / specific applications / which do seem to work very very well / they seem to be quite visually if you like / in terms of applications / they tend to [///] they seem to appeal very well / to the visual features or properties of the images 36:06

PAU &uh one of these areas is &uh face detection / and recognition 36:10

PAU &uh face recognition is a much much harder task than face detection // so the task of face detection is just to say in an image like this // can we basically distinguish [/] let's draw little boxes around where faces are ? 36:23

PAU that's your kind of first task / that's face detection // and that is kind of useful because we can have a lot of our own personal photographs / and it might be that we just wanna say find me photographs / which have people in them 36:34

PAU ok? // a very very simple task // just find me [///] we're not worried about / which people // just find me all my own photos which have people in them 36:42

PAU &uh however / &uh face &uh recognition / &uh is much much harder [///] actually identifying that that is xxx xxx // ok? // &um is much much harder problem 36:54

PAU but here is an example output / from a system // here is our query image // and here are the / example images or here are the retrieved images 37:03

PAU and actually you could see actually the system is pretty good // and the system is good enough that it can actually even deal with when somebody is rotating or moving their head slightly 37:12

PAU and this of course is important because if this technology is being used in an airport // for example where people are moving and not always are gonna be exactly in the same position // you need to perhaps to deal with slight changes / and variations / and so on 37:25

PAU and so this technology has a couple of false hits so as one here // but overall &uh is reasonably good // and it works well because it's domain &uh specific 37:33

PAU there is &uh [/] there is &uh [///] if you go to Polar Rose / Polar Rose is a / &uh application / &uh with this aim of trying to basically / &um / highlight or pinpoint all the faces and all the photos on the web that's its kind of goal if you like 37:48

PAU &um so have a look at that / it's kind of interesting [/] interesting idea 37:52

PAU &uh other domain specific applications &uh / seem to be work [/] &uh working quite well / &um [/] &uh medical retrieval / or &uh clinical applications 38:02

PAU for example retrieving x-rays / retrieving radiographs / &uh retrieving kind of &uh / images like these / &um photographs / &uh so kind of / &um various / &um organs ... 38:15

PAU and a lot of this does seem to be working quite well // and part of the reason that works often very well is that the features that you use / for example in identifying certain types of x-ray / are quite specific to that domain 38:26

PAU so one of the features that seems to work well is if you have a kind of direction feature // ok? // because often you're either looking at something sort of standing up or on the side like that hhh {%act: he moves his hand horizontally} or on &uh a plane like that hhh {%act: he moves his hand diagonally} 38:38

PAU and that's used often in medical retrieval 38:41

PAU also a lot of images tend to be greyscale as opposed to colour so that / seems to help things quite well // &um 38:48

PAU &uh another interesting &uh tool // &uh and you can have a play around with this // this came out from Microsoft // it's actually constructing 3D photographs from separate / individual &uh 2D photos 39:01

PAU &uh and they have an application where you're able to do this 39:04

PAU so you have lots of different photos taken of the same object or same place // and what it aims to do is / find salient points / in the various photos // and then it kind of stitches the photos together in a kind of 3D representation 39:16

PAU and it's really good if you go and have a xxx [/] go and have a play with it you can actually try up the application up for yourself 39:21

PAU it uses typically photos from Flickr // that is because the Flickr community have uploaded lots of photos about specific &uh places / or [/] or regions 39:30

PAU &um and once you have a lot of data // this relies on having a lot of photos / lots of photos from / different &um positions of the same objects // &uh it can stitch things together and create quite realistic / and quite xxx / well pretty cool &uh 3D representations // so it's worth having a look at that 39:47

PAU &uh there're many many examples of content based image retrieval systems 39:52

PAU &uh IBM produced one called QBIC which is one of the first ones that used &uh colour histograms 39:57

PAU &uh there are / huge [/] there are a mass [/] a mass of other ones as well // and so what I would suggest is &uh just go away &uh and have a look at some of these 40:06

PAU &uh it has been shown / &uh over the [/] over the years / that one of the key algorithms is using a colour histogram 40:11

PAU colour seems to be quite a dominant feature / in terms of find me images like this one // &uh colour seems to be quite dominant / even greyscale / &uh seems to be quite dominant 40:20

PAU of course on content based image retrieval there are / &um / like text based retrieval lots and lots of problems 40:27

PAU &uh it's not a particular easy task // it's very challenging task // &uh [/] and some of the &uh [/] some of the problems of this // &uh there's a problem with the sensory gap // that is typically when you digitalize something you lose something from the real world // &um / so there's a problem already if we’ve already lost some information 40:44

PAU &uh there is &uh [/] the classic semantic gap // and that is the gap between / basically you're dealing with pixels / you're dealing with very low low level features // but actually they're representing very high level semantic concepts // &uh and somehow you need to bridge that gap // and there're various techniques to bridge the gap / for example relevance feedback could be one way / automatically labelling images making use of text features and so on / would help to bridge the gap 41:08

PAU &um often &um when you look at some of the content based image retrieval systems produced / they're not that interactive for the user // &uh so perhaps the user interaction / &um / &uh has not really been / &um designed from the point of view of the general user // so perhaps you couldn't get your mum or your dad to you use a content based retrieval system / because it involves / changing properties / of the colour or the &um / the texture and so on 41:32

PAU that is not necessarily true of all systems / there are some systems &whi [/] which are very good / &uh and do certainly work quite well / but overall / I would say that's the case 41:40

PAU evaluation there're very &uh few common datasets 41:44

PAU &um so there's little comparison // &um proof of performance of various algorithms // although again that has changed over recent years // and there are some classic datasets that you could use / for evaluating your content based retrieval system 41:57

PAU &um actually / &um / there are / at the time I wrote the slide I said very few commercial systems most are academic // that probably is &uh [/] is still true // but with examples such as the Microsoft / where you're stitching together the photos to create something in 3D // there are systems and tools starting to come along // &um but actually they're still not picked up / it's surprising but content based retrieval systems / are still not being / picked up and heavily used / &uh in some domains where you would expect them to be used 42:25

PAU so why don't we see the &uh this type of technology used more / in digital libraries ? 42:30

PAU it still seems to be very academic / &um very kind of research orientated 42:34

PAU why is it &no [/] xxx use more for example in / &uh large xxx photographic collections ? 42:38

PAU ok? // if you look at a lot of these collections online they're still very simple / very text based 42:43

PAU why not ? 42:43

PAU well part of the reason is that the text based approach works quite well // ok? // as it's quite difficult to necessarily beat it 42:49

PAU &um so some of the open issues / &um areas of future research / obviously bridging the semantic gap is a big one 42:57

PAU so automatically labelling pictures / providing relevance feedback [/] effective relevance feedback / providing browse search functionality to enable a user to use it to explore collections in a more interactive way 43:07

PAU &um somehow thinking about how you put humans in the loop ? 43:10

PAU so how do you put a human in the loop in a content based retrieval system ? 43:14

PAU where in the loop do you put them ? 43:16

PAU &uh what kind of interaction &uh functionality do you provide ? 43:19

PAU &uh we need to be developing &uh interactive systems that meet / &uh [/] &uh [/] the needs of real users / in real scenarios / in real situations / across different domains / because typically the technology / works best when you tune it to specific domains 43:33

PAU &uh we need to be carrying on developing efficient retrieval methods // &um that is being able to scale the systems up to millions and millions of images 43:41

PAU we also need to keep on [/] on creating standardized evaluation benchmarks / on which to evaluate our systems and various algorithms / and thinking about the evaluation method [/] measures 43:52

PAU so which measures correlate best with human &uh performance and effectiveness ? 43:56

PAU and of course standards are continuing / and these’re issues which are ongoing 44:00

PAU &uh MPEG standards are great but // you know // we need to &uh have a &sta [/] standardized / a set of standards as well for the metadata 44:07

PAU &uh collaborative systems // so how could we develop effective ways of sharing / &uh multimedia ? 44:14

PAU so for example Flickr is a great &ex &uh [/] great tool // &uh a good example / Youtube / ways of sharing 44:21

PAU &um how could we exploit user participation / to enhance multimedia ? 44:26

PAU so one classic example is the ESP game / &uh used to automatically label / &uh images which is being bought up by Google // so the Google image labeler tool 44:35

PAU well basically you're relying on the power of people / human computation // you're basically relying on people labelling images for you / to then be able to use that as training data for your / &er system 44:46

PAU &uh semantic search / so there's &uh [/] that's another ongoing &uh topic / that is focusing more in extracting for example objects / labelling objects within images / detecting concepts within the media // and particularly in / images where you have very complex backgrounds // &um so that is the general photograph for example is still very very difficult / &um image to deal with // images which tend to be a kind of you know / a apple sitting on a table / classic kind of image analysis type of evaluation benchmark / so not so realistic 45:15

PAU you know // the photos that we take are very very complex / consist of many many interacting objects 45:20

PAU &uh multimodal analysis // so we've seen that the best way to perform efficient retrieval / &uh / and successful retrieval would be to combine the media // ok? // so as to have combined / image and &uh text based retrieval methods working together // but somehow you need to combine this xxx you do xxx 45:36

PAU you have a separate image retrieval system or &co [/] content based / retrieval system / a separate text based retrieval system // you perform separate searches somehow / and then you combine the results list in some way // or do you have a system which actually combines visual and textual features into the whole model ? 45:51
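[Editor's note: the first option just described, running separate text and visual searches and merging the ranked result lists, is often done with a weighted score combination, sometimes called late fusion. The weights and image ids below are illustrative assumptions, not from any system in the talk.]

```python
def late_fusion(text_scores, visual_scores, w_text=0.6, w_visual=0.4):
    """Merge two {image_id: score} dicts into one fused ranking (best first)."""
    ids = set(text_scores) | set(visual_scores)
    fused = {i: w_text * text_scores.get(i, 0.0)
                + w_visual * visual_scores.get(i, 0.0) for i in ids}
    return sorted(fused, key=fused.get, reverse=True)

text = {"img1": 0.9, "img2": 0.2}    # scores from a caption/text search
visual = {"img2": 0.8, "img3": 0.5}  # scores from a content-based search
print(late_fusion(text, visual))
```

The alternative the speaker raises, combining visual and textual features inside a single model, is early fusion; the choice between the two is exactly the "whereabouts would you do that" question.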

PAU ok? // that's [/] yeah you have to think where [/] whereabouts would you do that ? 45:55

PAU and also there's a big / move to what's called experiential multimedia systems / multimedia is very rich // it would enable all kinds of kind of interactive and &um quite interesting and cool interactive approaches // &um that is [/] could we use a kind of experiential or &expe [/] or provide more experience for the users to enable them to explore our collections ? 46:15

PAU so I think here about 3D collections / &uh art / museum &uh / collections / galleries and so on 46:21

PAU &um so now I just wanna turn to video retrieval 46:25

PAU so &um we've dealt / mainly with images // and you've already seen that dealing with just still images is / kind of quite difficult / quite complex xxx xxx lots of issues / &uh lots of challenges 46:35

PAU &uh video retrieval unfortunately is even harder // &um because you're dealing with moving images / so lots and lots of still images / &um together in some kind of &uh / context 46:44

PAU &uh video [/] video &exhi [/] exhibit very similar properties / visual properties to images // but the main difference is the kind of spatio-temporal {%alt: tempo} aspects // that is a video composes of a / &uh lot of still images 46:56

PAU so you have to remember that // ok? // there's a kind of temporal context / the temporal ordering / to these images 47:01

PAU &um so to access video material / means that you have to index the videos in some way 47:08

PAU &uh typically what you would do is take &uh / long video clip and break it / into segments / or into bits // and then you would index / perhaps those bits 47:17

PAU &uh so for example / &uh you could index / from a video / the still images // so you can take each of the images // &uh you could index the audio transcripts // you could divide the image into segments / and &um index those // you could index any metadata that comes with the video / if it's for example from &uh a broadcasting corporation / or from a / library // ok? // you could access the [/] &uh / the video on that basis 47:41

PAU &uh most videos can be &um / hierarchically organised which kind of helps us 47:46

PAU so most / &uh videos would typically consist of a clip // ok? // so it might be the whole video / xxx it's just a short video / or part of the video // &uh typically the clip consists of / &uh a story or a scene // &um the scene consists of / &uh shots / which then consist of a number of individual frames // and then the frames can be handled very similar to the way that you would handle still images 48:11

PAU &uh there are / automatic techniques for dividing a video // so one of the most popular techniques is called shot boundary detection 48:18

PAU basically we're trying to identify shots // and a shot is just defined by a kind of change in the camera angle 48:24

PAU there is also techniques for scene and story / &uh boundary detection as well 48:31

PAU that's slightly harder because often the scene or the story / is perhaps more semantic // ok? // it's more related to the kind of narrative of either a new story / or some kind of broadcast 48:41

PAU and the way that you typically do that is you actually look [//] here is one still image for example if you're doing shot boundary detection // here is another still image // let's [/] let's look for the change // ok? // the transition between them // and when we see the kind of shot transition so a big change in colour / big change in texture and so on // we can then chop / at that point 49:02

PAU &uh now that's quite naive / and is mainly using signal intensity 49:06

PAU you can also make use of for example the associated metadata / so the ASR transcript 49:11

PAU we can look for a pause / for example in the transcript // that might indicate an actual break // so we can combine features together 49:18
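[Editor's note: the frame-difference approach to shot boundary detection described above can be sketched as below, using the intensity histograms from earlier in the talk; the threshold and tiny "frames" are illustrative, and real systems also use texture, motion and, as the speaker says, the audio track.]

```python
def hist(frame, bins=4):
    """Intensity histogram of a frame given as a flat list of 0-255 values."""
    h = [0] * bins
    for v in frame:
        h[min(v * bins // 256, bins - 1)] += 1
    return h

def shot_boundaries(frames, threshold=0.5):
    """Indices i where a cut between frame i-1 and frame i is detected."""
    cuts = []
    for i in range(1, len(frames)):
        h1, h2 = hist(frames[i - 1]), hist(frames[i])
        # Normalised histogram difference: 0.0 identical, 1.0 fully disjoint.
        diff = sum(abs(a - b) for a, b in zip(h1, h2)) / (2 * len(frames[i]))
        if diff > threshold:
            cuts.append(i)
    return cuts

# Four tiny frames: dark, dark, bright, bright -> one cut in the middle.
video = [[10] * 16, [12] * 16, [240] * 16, [238] * 16]
print(shot_boundaries(video))
```

Combining this visual signal with a pause in the ASR transcript, as the speaker suggests, makes the boundary decision more reliable than either feature alone.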

PAU &uh again / I'll be pointing to &uh / very / useful paper again by Peter &En [/] Enser / &uh and a lady called Christine Sandom 49:28

PAU &uh this is back in 2002 // and again they ask the same question // great to have this kind of technology available to us // but / what the people want / when they're accessing video collections ? 49:39

PAU &uh and it was just interesting that they broke down / they did a similar kind of analysis // and here they were finding that basically / people typically were looking for a safe [/] they're looking for / &uh certain people / so the who dimension 49:50

PAU they're mainly looking for example for / examples of people 49:55

PAU so find me / for example / fishermen / find me academics // or here // this is a &speci [/] specific // so find me pictures of / David Beckham or video shots sorry / video clips of David Beckham / and so on 50:07

PAU so it is again interesting just to have a look through the numbers / just to give you an idea on what does the user want / when they're accessing a &uh video retrieval system 50:15

PAU one of the hardest things when you're accessing videos it's designing good user interfaces / because you're dealing with the complex media / &uh complex media type 50:24

PAU &uh video libraries are / &um typically / &um should support various user interactions 50:30

PAU &um this is just standard video libraries // that is the [/] the &um [/] the people working at the library / those accessing the / collections xxx to browse // and select the videos [/] a specific video program / &uh from the collection / so that might be an episode of Neighbours / an episode of &uh / the Simpsons or whatever 50:47

PAU they must at least be able to carry out that function 50:49

PAU &uh they should be able to query the content of the collections // so for example have a search box and be able to type in / Simpsons and get all the video clips or all the videos of the Simpsons 50:58

PAU &uh they must be able to browse / ideally the content / of a video program 51:04

PAU now either you provide a / some kind of browsing functionality or they have to sit and watch the video from beginning to end 51:10

PAU ok? // &uh they must be able to watch the video program either all of them / or one of them 51:16

PAU and ideally they must be able to requery / &uh within the video library // ideally within a program as well // so you want to be able to kind of search / parts of / &uh a video for example 51:26

PAU &um one of the things you can do / &uh is generate surrogates 51:32

PAU &um after you do your segmentation so let’s say we have a big video clip // we divide it into parts // ok? // which make it a little bit easier to scan and to index // that is we could allow the user to jump in / at certain parts / or certain segments 51:46

PAU what you might want to do is actually visually represent each of those segments 51:50

PAU so why don't we select the representative image // or what we called keyframe // to represent each of those segments or each of those parts ? 51:57

PAU that's something that we could do 51:58

PAU &uh we could also represent certain words for example from the segment 52:03

PAU we could generate something like a tag cloud / which should be a kind of linguistic or &uh verbal / &um expression if you like / of that video clip 52:11

PAU so various things that we could do // we [/] we could generate various types of surrogate / to represent each of those segments // and that would allow or would help the user / be able to kind of jump in / to the video &uh / and get a feel for the video / and various segments / within a video 52:25
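[Editor's note: the tag-cloud surrogate mentioned above can be sketched as a frequency count over a segment's transcript words; the stopword list and example sentence are invented for illustration.]

```python
from collections import Counter

STOPWORDS = {"the", "a", "of", "and", "to", "in", "is"}

def tag_cloud(transcript, top_n=3):
    """Most frequent non-stopword terms: a verbal surrogate for a segment."""
    words = [w for w in transcript.lower().split() if w not in STOPWORDS]
    return [word for word, _ in Counter(words).most_common(top_n)]

segment = "the lighthouse and the sea the lighthouse keeper walks to the sea"
print(tag_cloud(segment))
```

Displayed alongside a keyframe, such terms let a user judge a segment's content without watching it, which is the point of a surrogate.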

PAU &uh there aren’t so many / &um video retrieval systems available that you can just download off-the-shelf 52:32

PAU mainly because it's a quite complex task / to actually build / and then set up / and run a video retrieval system 52:38

PAU &um IBM do have one &um / that they produced the IBM multimedia analysis and retrieval system / &um MARS // and &uh / it's an interesting system / ‘cause it really is state of the art / multimedia management 52:50

PAU &uh it's shown / on a comparative system evaluation at TREC / or TRECVID 52:55

PAU &uh basically the components are shown to achieve world class performance // so it really is / &um worth looking at the various states of the art 53:02

PAU &uh uses various multimodal techniques / so it makes use of the text features / of an image / so the ASR transcripts / the speech / the audio / &uh together with the visual 53:12

PAU it uses multimodal techniques / and various machine learning techniques // &uh to xxx model semantic concepts // so when the video / when it divides it into bits / &uh and the frames / and the individual keyframes / they even assign concepts to it 53:23

PAU so this part of the video is happy bit / it is a xxx xxx or so on 53:27

PAU &um / and they have &uh / a library of around a hundred concepts / so they say for example that &uh / they're able to / &uh classify and assign for example that this part is about / and it involves or has a sky / it has people / beaches and so on 53:41

PAU &uh humans are required to create the training data / so &um // you need quite a lot of training data // but they have techniques for minimising / &um the amount of human effort involved / &um quite clever techniques 53:52

PAU then it [/] then the system assigns confidence scores for classification // so you can roughly see how accurately / &uh &er or how confident it is assigning these labels 54:02

PAU &um and the system is also able to exploit relationships between &um the various concepts 54:08

PAU &um so beach for example is more likely to be related to the concept / sand // &um than perhaps for example sky / and water 54:17

PAU and then the concepts can be used as part of the search and browse &functional [/] &uh search and browse functionality // that is you can just look for video clips which have sky in it / video clips which have beaches in it / for example 54:29
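[Editor's illustration] The concept-based search described above can be sketched minimally as follows. The data, function name, and threshold are invented for illustration; this is not the IBM system's actual API, just the idea of filtering keyframes by machine-assigned concept labels with confidence scores.

```python
# Hypothetical sketch: each keyframe carries concept labels with confidence
# scores assigned by classifiers; a query like "sky AND beach" keeps only
# keyframes whose confidence clears a threshold for every required concept.

def search_by_concepts(keyframes, required_concepts, threshold=0.5):
    """Return keyframe ids matching all required concepts,
    ranked by the sum of the matching confidences."""
    hits = []
    for frame in keyframes:
        scores = frame["concepts"]
        if all(scores.get(c, 0.0) >= threshold for c in required_concepts):
            hits.append((sum(scores[c] for c in required_concepts), frame["id"]))
    hits.sort(reverse=True)
    return [frame_id for _, frame_id in hits]

keyframes = [
    {"id": "shot_01", "concepts": {"sky": 0.9, "beach": 0.8, "people": 0.4}},
    {"id": "shot_02", "concepts": {"sky": 0.7, "water": 0.6}},
    {"id": "shot_03", "concepts": {"beach": 0.9, "sky": 0.6, "sand": 0.8}},
]

print(search_by_concepts(keyframes, ["sky", "beach"]))  # ['shot_01', 'shot_03']
```

The related-concept idea the speaker mentions (beach implies sand) would sit on top of this, expanding the query before filtering.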

PAU so the interface is sort of interesting / do have a look at xxx [/] some of the papers on this if you're interested / &um because they have quite &uh / a nice interface 54:36

PAU and this is the sort of thing that you end up with / they index huge amounts of &uh [/] lots and lots of &uh / video clips // and here is just an example // the type of thing that we are indexing 54:48

PAU so it's quite nice because actually what they're playing around with typically / are quite real world things // perhaps video clips on Youtube / that kind of thing 54:55

PAU and they tend to be a lot harder to deal with / they're not just kind of academic / &uh examples / they're real life examples 55:01

PAU &um challenges are very similar / &uh as for those for image retrieval it's not &uh an easy task 55:07

PAU &uh robust shot boundary detection 55:10

PAU &uh shot boundary detection is definitely getting more advanced / it's getting better / more accurate // &uh but it's still difficult / and ideally what you want to do to get good shot boundary detection / &er is basically make use of not &on [/] only the visual content / so not only the kind of visual features // so the transitions between images in terms of colour / shape / texture 55:29

PAU you also wanna make use of for example the audio / the ASR if you have it / so if there’s a pause and there's a big change in [/] &uh big change in the colour content and so on // that might indicate more of a / good shot / or more of a boundary / where you want to break the &uh video / than just using any of the &uh / features on their own 55:46
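[Editor's illustration] The multimodal shot-boundary idea just described, flagging a boundary only where a large colour change coincides with an audio pause, can be sketched with toy data. The thresholds and the two-bin "histograms" are invented stand-ins, not values from any real system.

```python
# Sketch: declare a shot boundary where a big colour-histogram change
# (visual evidence) coincides with low audio energy (a pause).

def histogram_distance(h1, h2):
    """L1 distance between two normalised colour histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

def detect_boundaries(histograms, audio_energy,
                      colour_threshold=0.5, energy_threshold=0.1):
    """Return frame indices where visual change and an audio pause agree."""
    boundaries = []
    for i in range(1, len(histograms)):
        visual_change = histogram_distance(histograms[i - 1], histograms[i])
        if visual_change > colour_threshold and audio_energy[i] < energy_threshold:
            boundaries.append(i)
    return boundaries

histograms = [[1.0, 0.0], [0.9, 0.1], [0.1, 0.9], [0.2, 0.8]]
audio_energy = [0.8, 0.7, 0.05, 0.6]   # quiet moment at frame 2
print(detect_boundaries(histograms, audio_energy))  # [2]
```

Using either cue alone would over- or under-segment; requiring agreement is the point the speaker makes about combining features.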

PAU &uh story segmentation is definitely a challenge // that is trying to &um / divide / &uh video / at a kind of high level / a sort of more semantic type of / things if you like / &uh running through a &narra [/] narrative // that's just very difficult // typically &um / because the context is normally very application dependent 56:04

PAU so a lot of the work so far has been applied on news videos / on news broadcast / so TRECVID for example // &uh and the news broadcast typically follow a standard structure 56:13

PAU ok? // and you can exploit that structure / as your kind of [/] trying to do things like shot boundary detection or story segmentation // other types of videos such as documentaries don't do that // don't have that 56:23

PAU &uh one of the other challenges / is this whole notion of kind of labelling / parts of the images / so trying to apply kind of quite high level semantic concepts // which is difficult 56:33

PAU for example especially when you want to apply more than say a hundred / you know concepts maybe one thousand // if your library has ten thousand // why have some kind of ontology that xxx ? 56:42

PAU how would you do that ? 56:43

PAU providing interactive search / &uh browse interfaces is just difficult // it's quite a hard / &uh media 56:49

PAU &uh providing personalised search / is also quite tough in this domain as well 56:53

PAU &um if you want to have a look at some kind of current state of the art or research going on / in &um image and video retrieval // then have a look at the CIVR / series of conferences // &um they run every year 57:08

PAU &um this unfortunately is only from two thousand and seven // but this is just a tag cloud / of &um / just the &um / call for papers // and you can see some of the things that they're looking at / mainly image and video retrieval // but they're addressing now things such as the web // so perhaps dealing with &um Youtube videos / images from Flickr / &uh indexing / thinking about applications / in particular // where can we apply this / technology // thinking about applications also in cultural heritage // looking at ontologies / and &on [/] ontological structure summarisation // so how can we effectively summarise for example &uh a video / into something much shorter // and so on 57:44

PAU so it's a very good source for [/] for seeing where the state of the art is at the moment 57:47

PAU ok now just &uh / in the kind of final part of the lecture / or the talk / I just wanna talk a little bit about evaluation 57:57

PAU &um so that was [/] that was a very kind of quick / and rough introduction to image and video retrieval // that's a very very [/] very very brief introduction / there's an awful lot more to it than that // &uh than we can cover in just kind of one hour / one hour and a half 58:12

PAU but let me just brief you and talk a little bit about evaluation / because I feel that this is one of the / kind of the / important challenges if you like / when it comes to developing good / &uh image and video retrieval systems 58:22

PAU &um why is evaluation important ? 58:25

PAU well evaluating the performance of a system / &uh is an important part of the development process 58:30

PAU so if you're developing a system you normally want to be evaluating parts of it or the whole thing // ok? // to see how you're improving it 58:36

PAU you try various tweaks of the parameters / changing the parameters // so continually trying to improve your system 58:41

PAU &um you want to establish / to what extent the system / &uh being developed / meets the needs of the end-users 58:48

PAU that also assumes that you have quite a good understanding of what the [/] the end-user actually wants // you might have an understanding of the typical tasks that they perform // the typical tasks and queries that they perform // which your system should serve / and which your system should meet 59:01

PAU so ideally when you evaluate you want to evaluate against real / life / kind of scenarios / or realistic scenarios // ok? 59:08

PAU &um you also evaluate to show the effects of changing the underlying system / &um or its functionality 59:15

PAU you wanna be able to say if I change the algorithm / or this algorithm / or if I change it like this // what would be the effect on the effectiveness or the performance of the system overall ? 59:23

PAU &um you also perhaps wanna be able to &um compare different systems 59:27

PAU so is the &um / IBM MARS system better than / the system by somebody else ? 59:32

PAU so we need to be able to do some kind of comparative performance 59:35

PAU so evaluation is very very important 59:37

PAU &um typically evaluation of IR systems / does have quite a long history 59:43

PAU &um we'd normally tend to focus on either / the system / or the algorithms or the user 59:48

PAU &uh there're some &um / perhaps &um / large scale evaluation campaigns which tend to focus on both / but we'll come back to that in a moment 59:55

PAU &um Tefko {%alt: Tesco} Saracevic / &uh in ninety five / &uh distinguishes six levels of evaluation for IR systems / or information systems in general / but you can include IR systems within this 1:00:07

PAU and &uh / he identifies / the engineering level / the input level / the processing level / output level / use and user / and the social level // and basically says that we should be [/] be evaluating our systems / our information systems at all these different levels 1:00:22

PAU now I would say that so far a lot of the focus in IR evaluation has really been more on the kind of algorithms / that’s been more on evaluating the technologies and the techniques / perhaps less so / on evaluating the output / on evaluating the use / the user // particularly what would be the effect on the social [/] you know the kind of social implication 1:00:40

PAU that is to carry out that sort of evaluation normally &come [/] normally involves some kind of longer term evaluation // of actually employing the system / in say / a realistic setting in an organisation // studying the use of that over a long time 1:00:54

PAU so of course that's something that is not appropriate and possible for every / kind of evaluation campaign if you like 1:00:59

PAU &um but certainly there has tended to be more of a focus on just evaluating / &um the technologies / or the algorithms rather than &uh / use 1:01:07

PAU &um so now I just wanna mention &uh very briefly &um / one of the evaluation campaigns I've been involved with for a few years now 1:01:17

PAU &um the evaluation campaign is called CLEF // &um stands for the Cross-Language Evaluation Forum 1:01:24

PAU and &um CLEF has &uh / been going for &uh [/] well it's gonna be its tenth year / &uh next year // and &uh its focus is [/] is mainly on cross language information access 1:01:33

PAU so evaluating &um various systems that provide &uh cross-language or multilingual / information access 1:01:40

PAU and &uh CLEF runs a whole / range of different types of tasks // so &er you're able to evaluate all different kinds of information systems 1:01:48

PAU &uh for example you can just evaluate multilingual ad-hoc / retrieval // that is just given a query / retrieve a bunch of documents from a multilingual collection // and they have &um [/] they have tracks / which test that 1:02:01

PAU there is [/] there is domain specific / &um multilingual retrieval // there's a kind of a task for that 1:02:07

PAU there's interactive / &um information retrieval // &uh which Julio is gonna talk a bit about tomorrow I think 1:02:13

PAU then we have question answering / &um which again &um the guys from UNED / are heavily involved with 1:02:21

PAU we have cross-language retrieval &uh in web documents / or from web documents 1:02:26

PAU we have cross-language retrieval &uh &geo [/] geographical / &uh cross-language retrieval // so that's where the &er focus is a little bit more on / exploiting the spatial / &uh properties of documents 1:02:37

PAU &uh cross-language video retrieval / that was new for this year // &uh that's a really interesting task // &uh very very interesting 1:02:44

PAU and then also multilingual information filtering as well 1:02:47

PAU &um so CLEF / has a huge range of different tracks // and they're trying to evaluate all kinds of information systems across all different kinds of contexts and domains 1:02:57

PAU so that seems to be that we’re heading in the right direction here // this seems to be pretty good 1:03:01

PAU so let me just talk about ImageCLEF 1:03:05

PAU &uh so &I [/] ImageCLEF was set up in &um two thousand and three it was the first time we &uh ran it // it's still running now // and &uh hopefully it will run next year as well 1:03:14

PAU &uh it's a part of CLEF // and &uh we have a number of different tasks 1:03:17

PAU &um this year the tasks that we had were &uh retrieval / of general photographs from a general photographic collection 1:03:23

PAU &um you might think that's kind of easy but they're general photographs // so that tends to be a little bit harder / they tend to be kind of touristic &pho [/] photos // &um so the type of photos that you might find on / Flickr for example 1:03:34

PAU &uh so they are just / a little bit harder to work with 1:03:37

PAU &um those photos so that photo collection &uh had bilingual / or had multilingual captions as well // so you could exploit either the visual properties of the image or the text / &uh maybe interesting 1:03:47

PAU &uh this year we xxx [/] specifically focused on what we call the diversity task 1:03:51

PAU so the aim is basically / &uh to try and &uh / retrieve &uh diverse groups or groups of images 1:03:57

PAU &uh I'll show you an example of that in a moment 1:03:59

PAU and we also have a classification task as well 1:04:02

PAU &uh so could you label images / &uh from this general photographic collection with a very simple / set of categories or concepts ? 1:04:09

PAU &uh we also had &uh / a Wikipedia task this year 1:04:13

PAU so this was &uh a collection of images &um taken from Wikipedia 1:04:17

PAU &uh not only did you have images but of course Wikipedia has semi-structured data as well 1:04:22

PAU so &uh the &uh participants can make use of that 1:04:25

PAU and that's kind of interesting because again it's quite a realistic task / it's quite a realistic setting 1:04:29

PAU and this task evolved from INEX // and basically &uh [/] &uh INEX has been a large scale evaluation campaign / for &uh evaluating &s [/] &uh semi-structured or structured information retrieval systems 1:04:40

PAU INEX &uh has closed now / and stopped running // &uh but the people around the &uh Wikipedia task have moved over to ImageCLEF // and so now that's kind of part of / &uh ImageCLEF // so that's where it comes from 1:04:51

PAU &uh for a number of years now / we've been running a specific / &uh medical image retrieval track 1:04:56

PAU &uh that's been very very interesting / we've had &um / a good set of realistic collections / of medical images // &uh which include / not only the images but also the case notes as well / the case histories 1:05:07

PAU that's a very very specific challenge // and so here for example we tend to &uh recruit or [/] or we get a lot of participants from &um / medical institutions and hospitals for example // or medical research institutions // and that is because they often [/] you are dealing with quite specific clinical terminology // and they wanna make use of / &uh some of that terminology in their / retrieval 1:05:27

PAU &uh there's also &uh [///] on the medical images we also had an &um / automatic image &uh / &um annotation task as well 1:05:34

PAU &um and that also in conjunction with I-CLEF / &uh we've also been running an interactive / &um cross-language image retrieval task as well 1:05:43

PAU so we've been quite busy hhh {%act: laugh} 1:05:46

PAU &uh so ImageCLEF itself // &um what we've tried to do is mainly promote &uh / system / evaluation // but we are aware / that it is very important to evaluate the user as well // so we did try or we have tried to perform / &uh user-centered evaluation as well / on this kind of large scale setting // and just to say with this large scale setting what you're trying to do / is basically attract &uh participants / from groups around the world / where you're trying to then compare various systems 1:06:13

PAU so it's not just like a single group // ok? // &uh working on &uh their own system // you're trying to compare various systems 1:06:19

PAU &uh what we typically provide / are a number of resources to help you evaluate and test your systems 1:06:25

PAU so over the years ImageCLEF has provided a number of document collections // that includes both the images and the text 1:06:32

PAU those collections / we have &uh / as far as possible tried to make sure / that you can get access to those // they are publicly accessible 1:06:39

PAU ok? // that's been one of our goals / to create &um image collections and sets of collections which anybody can get hold of for research purposes 1:06:46

PAU and that is very hard // ok? // to do / kind of realistic / image retrieval evaluation // ideally you want a set of images / for example from a large news agency / or something like that // but copyright / problems get in the way 1:06:59

PAU &um but we have secured a number of &um / collections / for use in ImageCLEF 1:07:03

PAU so not only do we provide the document collection / we provide some example search tasks // &uh typical kind of search tasks // &uh typically [/] &uh normally ad-hoc retrieval tasks or search tasks 1:07:14

PAU we then also provide the relevance judgements / for each of those tasks 1:07:18

PAU so we tell you / for the collection // which of the documents are relevant to a user need or not 1:07:23

PAU and that is exactly what you need to then be able to evaluate / &um the system effectiveness 1:07:28
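[Editor's illustration] Given the relevance judgements just described, a ranked system output can be scored automatically. Average precision is one standard measure of this kind; the document ids and judgements below are invented, and this is a sketch of the general technique, not necessarily the exact measure ImageCLEF used.

```python
# Sketch: score one query's ranked list against its relevance judgements
# ("qrels") using average precision.

def average_precision(ranked_ids, relevant_ids):
    """Mean of the precision values at each rank where a relevant
    document appears, divided by the number of relevant documents."""
    hits = 0
    precisions = []
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(relevant_ids) if relevant_ids else 0.0

qrels = {"img_3", "img_7"}                  # judged relevant for this query
run = ["img_3", "img_1", "img_7", "img_9"]  # the system's ranked output
print(average_precision(run, qrels))        # (1/1 + 2/3) / 2 ≈ 0.833
```

Averaging this over all the provided search tasks gives a single comparable effectiveness number per system, which is what makes the comparative campaign workable.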

PAU and we also provide other resources as well // so we supply participants with example content-based retrieval systems / if they want to use them 1:07:36

PAU we provide for example the medical task access to / &uh clinical terminology lists / &uh gazetteers and so on 1:07:42

PAU &uh an annual workshop is held every year in conjunction with CLEF / and the ECDL 1:07:48

PAU and &um [/] xxx [/] xxx [/] so at the workshop then people are able to / &uh basically tell us // ok? // how they have achieved / &um &er their performance 1:07:57

PAU so we normally select the best performing systems // and they present / &uh their algorithms / or how they've achieved their / &uh results 1:08:03

PAU just to mention to say that &s [/] &um [///] this is mainly focused on image retrieval 1:08:10

PAU if you're trying to evaluate your video retrieval systems // TRECVID is the obvious / &um place to go to // &um but also now / &um we have CLEF video as well // ok? // so that's another place that you can go / to evaluate video retrieval systems 1:08:22

PAU and they operate in / pretty much the same way 1:08:25

PAU &um so I'm not gonna run through this but just to say from two thousand and three we started / &uh quite small // so we just had one / image retrieval task / which is based upon that / historic photographic collection // if you remember that lighthouse 1:08:39

PAU so we had twenty thousand images from Saint Andrews Library 1:08:42

PAU &uh a set of about fifty / &um search tasks / or queries derived / from a query log / from Saint Andrews University // &uh and then basically the goal was to find // you know // the system had to find the relevant images // and they were a kind of diverse query tasks to test in xxx very different / &uh aspects // and we had four participants / including ourselves // so we didn't seem to attract too much 1:09:04

PAU so at that point we were thinking about / killing the track 1:09:06

PAU &uh however we decided to carry on 1:09:08

PAU in two thousand and four we added a medical retrieval task 1:09:11

PAU &uh things got slightly better because we had seventeen participants / &uh which is great 1:09:16

PAU in two thousand and five we then had four tasks // we added two image classification tasks / because people from the community said that these would be interesting for them // so we should add that / and we did 1:09:26

PAU &uh then in two thousand and six / we had thirty participants for four tasks 1:09:30

PAU we also had an object classification task using the &uh data / provided to us from LTU / which is excellent that's very real world data 1:09:38

PAU two thousand and seven thirty five participants 1:09:40

PAU classification task was made harder // because it was turned into a hierarchical classification task // &uh which is a harder task 1:09:47

PAU and then in two thousand and eight this year we managed to &uh get forty five &um participants &um to take part in our track // &um which is great I mean it's really good to have sixty three people registered / forty five people take part 1:09:58

PAU and I think part of it / is because we introduced this new / task as well // we had the Wikipedia task // and we had a new collection for example for the ad-hoc tasks that people &uh seemed to quite like that 1:10:07

PAU &uh so sixty three groups registered for five tasks 1:10:11

PAU they're kind of xxx [/] the photo retrieval tasks from an [/] &uh general collections that's kind of just to search from a general photographic collection / still seemed to prove very popular 1:10:19

PAU &uh the medical &uh retrieval tasks are quite popular as well // the &me [/] the Wikipedia task was also / &um popular as well 1:10:26

PAU &um / I will say some of the highlights from this year 1:10:30

PAU &uh we had really good participation 1:10:32

PAU it's always quite difficult to / attract people to take part in / a competition / or evaluation that you run // because there's a lot of work // and there's no kind of / immediate benefits // if you like 1:10:42

PAU so it's very encouraging that we had in the end forty five people actually submit [/] &uh submissions to &um / to the evaluation campaign 1:10:49

PAU &um we had this interesting and I'll show an example in a moment / but this interesting task of being able to &uh / promote diversity / in your retrieval system / in the image retrieval system // that seemed to work very well 1:11:01

PAU the Wikipedia task was great / that seemed to attract &uh a lot of interest 1:11:05

PAU the &um [/] I'll tell you about this in a moment / but the medical retrieval task was particularly interesting / because the challenge there was actually what you had was a bunch of / &uh medical literature // ok? // &uh journals for example / medical journals // and so there you had / an awful lot of text / in the &uh [/] around the image which was not associated with the image itself 1:11:23

PAU so actually what you needed to do is select the / bit of relevant text that you would associate / with the image // so filter out the / background or the noise if you like // &uh which proved to be quite challenging 1:11:34

PAU &uh Quaero / which is a big &uh / EU-funded &uh project sponsored / &uh a workshop that we ran before &um [/] before the ImageCLEF event / where we really focused on multimedia retrieval evaluation issues 1:11:48

PAU and we've run this now for a couple of years and we've attracted people / Alan Smeaton / &uh Donna Harman and people like that to come along and talk a bit about evaluation 1:11:55

PAU and the goal is great / just to get people together and discuss / some of the issues which / we feel are facing us today / &uh to evaluate multimedia retrieval systems 1:12:03

PAU &um so the various tasks // &um I'm not gonna run through this / in detail but just to say we had a photographic retrieval task / aimed at promoting diversity 1:12:12

PAU we had an automatic concept detection task / which had a very simple hierarchy of objects 1:12:17

PAU a Wikipedia retrieval task / and then two medical retrieval tasks 1:12:21

PAU &um so the photo retrieval task looked to be like this hhh {%act: pointing to the screen} 1:12:26

PAU the aim was to promote diversity in your retrieval system 1:12:29

PAU &uh what does that mean ? 1:12:30

PAU &uh well imagine that we do a search like this / and &uh we want images of typical Australian &amages [/] &uh Australian animals // we do a search // ok? // and we get back this / set of results // ok? // so images of so-called typical Australian animals // ok? // and we get a lot of / kangaroos and wallabies 1:12:48

PAU &uh actually the precision at ten in this case is one // so / it's a perfect result // isn't it ? 1:12:54

PAU well let's see 1:12:55

PAU ok? // everything is relevant so / all seems to be good 1:13:00

PAU now what happens then if we had a retrieval system // &uh which did this ? 1:13:04

PAU ah! / this is interesting // because now what we have / are different / Australian &an [/] &uh &um animals 1:13:13

PAU so actually what we've done here / we have a system that now promotes diversity 1:13:17

PAU that is what we have done is / cluster together the results / into groups // select representatives / from those clusters // and put those in the top ten // ok? // and that's what we call a diverse result 1:13:30
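[Editor's illustration] The diversification step just described, cluster the results, pick one representative per cluster, and promote those, can be sketched as below. The animal labels stand in for whatever visual or textual clustering a real system would compute; ids and data are invented.

```python
# Sketch: reorder a best-first ranked list so that each cluster is
# represented once before any cluster repeats ("promoting diversity").

def diversify(ranked_results, top_k=5):
    """ranked_results: list of (doc_id, cluster_label), best first.
    Returns the top_k ids with one representative per cluster promoted."""
    seen_clusters = []
    picked, leftovers = [], []
    for doc_id, cluster in ranked_results:
        if cluster not in seen_clusters:
            seen_clusters.append(cluster)
            picked.append(doc_id)      # first (best) item of a new cluster
        else:
            leftovers.append(doc_id)   # duplicates queue up behind
    return (picked + leftovers)[:top_k]

results = [  # (image id, cluster), best-first
    ("kangaroo1", "kangaroo"), ("kangaroo2", "kangaroo"),
    ("wallaby1", "wallaby"), ("koala1", "koala"),
    ("kangaroo3", "kangaroo"), ("emu1", "emu"),
]
print(diversify(results))
# ['kangaroo1', 'wallaby1', 'koala1', 'emu1', 'kangaroo2']
```

Precision is unchanged (everything here is relevant), yet the top ranks now cover four animals instead of one, which is exactly the "more satisfying" result the speaker contrasts.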

PAU that's promoting diversity 1:13:32

PAU so now what we find xxx [/] xxx xxx [/] I haven't got it 1:13:36

PAU but here the precision is also one // but which result / is more satisfying ? 1:13:41

PAU xxx you probably prefer this result and we can show some empirical evidence where we've tried this with users / and users would prefer this result / as have some other people as well 1:13:50

PAU so it seems to be that promoting diversity is a good thing to do // and so what we want to do [/] do in this task / is to actually encourage people to do that / and then provide evaluation resources / so the people could assess their systems 1:14:02

PAU and that involves for us thinking about the evaluation measures // 'cause precision and recall don’t work any more / so we had to come up with something else 1:14:09

PAU &um so it's a very interesting task / &uh around 1:14:15

PAU hhh {%act: click} &uh the visual concept detection task 1:14:17

PAU basically people had to take the images and assign these concepts // and these concepts were on a kind of hierarchical &fa [/] &uh fashion 1:14:24

PAU so we had the classic indoor and outdoor concepts // you must &uh label an image and tell us whether it's indoor or outdoor 1:14:30

PAU but you &m [/] must also tell us whether there are people in the images / whether there are animals in the images // water / sky / whether it's day or night / whether there's a road or pathway in the image 1:14:40

PAU and again we provided the resources so the people could evaluate the systems // and so with the &uh outcome of this 1:14:46

PAU you can look for the papers and you can actually see how well can this task be done 1:14:51

PAU and why is this kind of useful ? // well a lot of the images we used here // were the types of images you might have in your own personal collections 1:14:57

PAU so actually you could apply this technology to your own stuff // and add these labels automatically 1:15:03

PAU and that actually might help you an awful lot / when it comes to retrieving your own [///] and browsing through your own personal photos 1:15:09

PAU &uh the Wikipedia task / very simple / just looked like this // so people were given this kind of information 1:15:17

PAU &er you had a number of queries that you had to perform // &uh and this kind of semi-structured retrieval 1:15:22

PAU so you had to know and understand / something about Wikipedia first / to be able to exploit / &uh the text 1:15:27

PAU again it's kind of challenging // because sometimes / there's very little text // sometimes the text doesn't relate to the image // and you're kind of [/] you know // you have to work out // &uh the best text to use 1:15:36

PAU &uh and then we had the medical annotation task / &uh in two thousand eight 1:15:42

PAU ok? // and this is very interesting // had to use a hierarchy of classes 1:15:44

PAU &uh in the first task that we had / the aim was to use a coding scheme / which was for radiological &uh images 1:15:50

PAU so that is that you might have an image and you have to say / well [/] basically there's about a twelve digit code // and this code represents different aspects of the image // that is the image involves an x-ray // xxx [/] xxx somebody standing up like this hhh {%act: shaking right hand vertically} / and so on // so you [/] you classify the image in &uh quite a complex way 1:16:06

PAU again / some kind of evaluation run // and that was very interesting in this task / local features again // we've seen this before but local features were outperforming the global ones 1:16:15

PAU that is dividing an image into bits and &um basically computing / features / on the segments of the image or the bits of the image rather than taking the features over the whole image // seemed to be working quite well 1:16:26
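[Editor's illustration] The local-versus-global distinction the speaker makes can be shown on a toy image. The even grid split and mean-intensity feature are illustrative choices only; the actual systems in the task used richer local descriptors.

```python
# Sketch: "local" features = one feature per tile of the image,
# versus a single "global" feature over the whole image.

def local_mean_features(image, block_size=2):
    """Mean intensity of each block_size x block_size tile, row by row.
    Assumes a square image whose side is a multiple of block_size."""
    n = len(image)
    features = []
    for r in range(0, n, block_size):
        for c in range(0, n, block_size):
            block = [image[r + i][c + j]
                     for i in range(block_size) for j in range(block_size)]
            features.append(sum(block) / len(block))
    return features

image = [
    [0, 0, 8, 8],
    [0, 0, 8, 8],
    [4, 4, 2, 2],
    [4, 4, 2, 2],
]
print(local_mean_features(image))  # [0.0, 8.0, 4.0, 2.0]
# The global equivalent collapses everything to one mean of 3.5,
# losing where the bright region sits - which is why local features
# can outperform global ones on tasks like this.
```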

PAU and it did seem to be very much that machine learning techniques are the keys to success in that task // so selecting the [/] the right machine learning technique / &uh seem to be very very key 1:16:35

PAU and then the medical retrieval task // just to say // as I said // it was just very very interesting // that is it was a task where you're not retrieving now images from just a kind of image collection with short captions // you are basically given a set of scientific articles / or medical articles // and your goal is basically to find a set of images / &um from those articles // but it's just tough // because not all of the text in the article relates to the image so xxx 1:17:02

PAU again and it's also very realistic // so this is &uh / quite a [/] a good example of &um / trying to do a task / &uh which basically / &uh deals with something in the real world 1:17:11

PAU that's the case in a kind of education or research setting in a hospital for example 1:17:15

PAU people need to train // &uh they need to look up literature &uh and so on 1:17:18

PAU and so they want to search for this kind of information 1:17:21

PAU it's kind of very real world 1:17:22

PAU &uh for two thousand and nine / what are we gonna be doing ? 1:17:26

PAU well / we hope that basically we'll be &uh / running again // so &uh at the moment we're trying to &uh organise between ourselves / it's about / eight different people involved in organising this track / is [/] is quite big 1:17:36

PAU &uh the medical retrieval task we want to continue // &uh but this time the goal / is going to be to &uh actually retrieve cases instead of images 1:17:44

PAU so you might have ten different images related to somebody's case 1:17:47

PAU somebody goes into a hospital / and has a condition / a case [/] set of case notes / so a case file is generated for that person 1:17:54

PAU that consists of lots of text about the condition / perhaps about the outcome / about the individual 1:17:59

PAU this is all being in an automatized xxx [/] 1:18:02

PAU &um and then you might have a number of different images taken from [///] you might have an x-ray / you might have &uh something else // which are all collected together in this thing called the case note 1:18:10

PAU typically what we've done is just said // well let's just retrieve an image // ok? // that's a little bit simpler 1:18:16

PAU but it could be / a more realistic thing to do / and / much harder thing to do // is actually retrieve the whole case / set of case notes / which would sort of be very useful / particularly in the / clinical setting 1:18:25

PAU &uh the Wikipedia task / we want &que [/] more queries in different languages 1:18:29

PAU &uh the object / &uh recognition task / and somebody's approached us / where they have a robot vision database / that they'd like to be explored 1:18:36

PAU &uh photographic retrieval task // we're &um at the moment talking to a big news agency / to see whether they let us &um have a sample of about a million / of their images // so that we can &te [/] then &te [/] test diversity on a much larger / collection // which should be very interesting 1:18:50

PAU &uh and I just wanted to end up by just saying that &um // lab-style evaluation is good // which effectively is what I'm talking about here // but // &um evaluation resources xxx provided by various communities such as TREC and by &CLE [/] and by CLEF / have really shown a lot of positive effects on the IR community 1:19:10

PAU they have been very very useful // they have promoted / &um research / in information retrieval // they've been very very beneficial 1:19:18

PAU and the results’ve even gone on to inform commercial IR systems 1:19:22

PAU so // you know // these evaluation campaigns are very very important 1:19:26

PAU &uh however / although the benchmarks are important / there are still many questions / which we are faced with / which perhaps this type of benchmark or this type of evaluation / doesn't necessarily answer 1:19:36

PAU &uh that is / how do we measure the accuracy ? 1:19:40

PAU what measure do we need to use / to measure [/] measure system effectiveness ? 1:19:44

PAU do we just use precision and recall ? 1:19:46

PAU do we use precision at ten ? 1:19:47

PAU do we use a kind of &um / binary preference ? 1:19:50

PAU do we use / any of the other / million different types of evaluation techniques and measures ? 1:19:56

PAU which one do we use ? 1:19:57

PAU is precision and call [/] &uh recall good enough ? 1:19:59
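[Editor's illustration] Two of the measures being contrasted here can be made concrete: precision at a cutoff, and a cluster-recall-style score of the kind diversity tasks need (the fraction of distinct relevant subtopics covered in the top ranks). The data and the exact scoring details are invented for illustration.

```python
# Sketch: precision@k versus a cluster-recall-style diversity measure.

def precision_at_k(ranked_ids, relevant_ids, k=10):
    """Fraction of the top k results that are judged relevant."""
    top = ranked_ids[:k]
    return sum(1 for d in top if d in relevant_ids) / k

def cluster_recall_at_k(ranked_clusters, all_clusters, k=10):
    """Share of the known subtopic clusters represented in the top k.
    None marks a non-relevant result with no cluster."""
    covered = {c for c in ranked_clusters[:k] if c is not None}
    return len(covered & set(all_clusters)) / len(all_clusters)

run = ["d1", "d2", "d3", "d4", "d5"]
relevant = {"d1", "d3", "d5", "d9"}
print(precision_at_k(run, relevant, k=5))          # 0.6

clusters_of_run = ["kangaroo", "kangaroo", "wallaby", None, "koala"]
all_subtopics = ["kangaroo", "wallaby", "koala", "emu"]
print(cluster_recall_at_k(clusters_of_run, all_subtopics, k=5))  # 0.75
```

The two numbers can disagree: a run of ten kangaroos scores a perfect precision but a poor cluster recall, which is exactly why the diversity task needed a measure beyond precision and recall.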

PAU another important thing is actually which of those measures actually correlates well / with human effectiveness / or user / &um performance // user effectiveness or user's success ? 1:20:09

PAU so if we do an &evalu [/] &uh do an experiment / where we get people / to say whether they're satisfied with the results or not // ok? 1:20:16

PAU do those results correlate with system effectiveness / whether the system has performed well or not ? 1:20:21

PAU 'cause ideally when we try to evaluate the systems in this kind of setting // we do wanna be making use of measures which correlate with human / &uh success / and satisfaction 1:20:30

PAU &uh the other key thing is well what is the role of the user / in the evaluation process of multimedia retrieval techniques ? 1:20:37

PAU it is very difficult to involve the user / to create a large scale / user evaluation campaign 1:20:42

PAU &uh Julio will / probably explain this tomorrow // but attempts have been done / in CLEF / as part of I-CLEF // but it's difficult [/] it's difficult to try to get people to participate // difficult trying to think of the right task // difficult trying to think of the right way of doing the evaluation and so on // it's just / harder / than kind of setting up a system orientated evaluation 1:21:00

PAU the other big thing is how much does multimedia retrieval depend upon the context ? 1:21:04

PAU 'cause you don't often make use of context // ok? 1:21:07

PAU so it doesn't matter where you are // it doesn't matter what time of day it is / and so on 1:21:11

PAU &uh I guess the other thing to think about then is / this notion of user judgements versus / system effectiveness or technical &accurateness [/] accuracy 1:21:21

PAU &uh there is some research that shows that a good user / with a bad system / &uh is usually better than a bad user with a good system 1:21:29

PAU &um so research by Hersh / Allan / Turpin and Hersh and so on 1:21:35

PAU &um so it seems to be that what you want to do / is effectively give users &um [/] &um [///] well / actually what the research shows / it doesn't really matter what type of system you give a user // give him a bad one or &uh a good one // they will adapt // and they will basically turn out performing the same 1:21:50

PAU &uh however &um // as part of the paper published at SIGIR this year which contradicted / a lot of the previous research 1:21:58

PAU and in this paper what we did was actually make use of a much larger set of users / and much larger set of topics 1:22:04

PAU so in one of these previous experiments they only used five users 1:22:08

PAU we used &uh nearly sixty users 1:22:11

PAU and basically we found &um / that although even when system effectiveness ok? [///] so you have a very small difference between system effectiveness // even then / there were significant differences between / user / performance and user satisfaction 1:22:24

PAU that is we would say that actually system effectiveness // a &s [/] a user / will be affected / by the performance of the system // that is what we would conclude 1:22:33

PAU so there's a bit of conflict here / so more research needs to be done 1:22:37

PAU &um multimedia [///] the other big problem we have / is that multimedia retrieval &um / researchers are not usually / &uh user interface people or human &inte [/] &uh human computer interaction people // and for good reason because &of [/] often we're in different camps // ok? 1:22:50

PAU it's very [/] very rare to get people who are a kind of // you know // across-the-board if you like 1:22:54

PAU &um so that is &um / often researchers have little experience / with real user tests / and setting up user evaluations 1:23:01

PAU &um probably very little interest in investing time and effort in &put [/] carrying out user evaluations because they don't necessarily see the benefit 1:23:09

PAU &um also a lot of the communities / for example the computer vision communities / want hard ground truth // ok? // they're not worried about this soft / fluffy kind of user stuff 1:23:20

PAU &um but I would argue and say that is important // and we do need to be thinking about that 1:23:25

PAU but the question / how do we / put that inside an evaluation / campaign ? 1:23:29

PAU I'm gonna / leave that 'cause we have / almost run out of time 1:23:34

PAU just one thing to say about context in retrieval // certainly in the medical task that we run // ok? // retrieval typically defines the [/] depends on the context 1:23:45

PAU ok? // that is // &uh many domains require viewing images in a specific context 1:23:50

PAU for example in the medical domain // no medical doctor would be analysing the images without some kind of clinical context 1:23:57

PAU they would need [/] they would know something about the age of [/] of [/] &uh &um / the participant in the photo / the sex / &uh lab results and so on 1:24:05

PAU that is we need that contextual information which is why in the medical tasks / we want to move to using case notes // because that provides the kind of clinical context / to these individual photos which we believe provides a much more / realistic type of evaluation 1:24:19

PAU so we need to keep testing on different domains // &um that's pretty important 1:24:26

PAU &um say in conclusion then // &um I think it's still unclear whether a direct relationship exists between IR effectiveness and user satisfaction with the search results 1:24:36

PAU we seem to have some slightly contradictory / &um results here but we need to explore that further 1:24:41

PAU this is important // &uh because previous experiments confirm strong relationship between the performance of the &uh [/] &uh the user / and the system // &um but / it's contradictory 1:24:52

PAU so it just seems to be that some research says that it doesn't matter how well a system performs / the user can adapt / and perform equally well with a good or bad system 1:25:00

PAU other research says well actually that's not true 1:25:02

PAU so now we need some more evaluation to be confirming this 1:25:06

PAU &uh measures of system effectiveness 1:25:09

PAU that's a big problem 1:25:10

PAU which measure do we choose ? 1:25:12

PAU which measure do we evaluate our systems with ? 1:25:14

PAU ok? // we need to be / having more experience where we can correlate / system effectiveness with user performance 1:25:19

PAU that's an important aspect / xxx an important area 1:25:22
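Correlating measures of system effectiveness with user performance, as urged here, is commonly done with a rank correlation such as Kendall's tau. A minimal self-contained sketch follows; the paired per-system scores are invented purely for illustration:

```python
# Minimal Kendall tau-a rank correlation, one way to check whether a
# system-effectiveness measure agrees with user satisfaction scores.
# The example scores below are hypothetical, not from the talk.

def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / total pairs."""
    n = len(x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:       # pair ranked the same way by both measures
                concordant += 1
            elif s < 0:     # pair ranked in opposite ways
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

# Hypothetical per-system scores: mean average precision vs. mean user satisfaction
system_map = [0.21, 0.35, 0.18, 0.42]
user_satisfaction = [3.1, 4.0, 2.8, 4.5]

print(kendall_tau(system_map, user_satisfaction))  # 1.0: the two rankings agree perfectly
```

A tau near 1 would suggest the effectiveness measure tracks user satisfaction; values near 0, as some of the studies the speaker cites report, would suggest it does not.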

PAU &uh other important areas / I think another measure is [/] perhaps we need to think of / we don't necessarily think of xxx these large scale evaluations / include not only user satisfaction 1:25:31

PAU what about system speed ? 1:25:32

PAU ok? // that's quite a hard thing to / perhaps incorporate / 'cause everybody is working on / different systems // but how we might include that ? 1:25:39

PAU obviously system speed / if your algorithm takes / a year to index and run / over that set of queries / and mine takes five seconds // ok? // which one is better ? 1:25:47

PAU xxx yours performs better / &uh precision at ten // but it takes a year 1:25:51

PAU &um what about user confidence ? 1:25:54

PAU what about task interestingness // what about task difficulty ? 1:25:56

PAU these are all things that perhaps we could take / into account / &uh as part of our evaluation measures 1:26:01

PAU &um so / a way forward // &um let's see // let's do a little bit more research in which measures of system / effectiveness correlate best with human satisfaction ... 1:26:12

PAU &uh let's continue then with large scale evaluations / &um across domains and tasks 1:26:18

PAU &uh I think we need a combination of measures perhaps to successfully evaluate IR systems // because each of the measures does tell us something different // and does give us some interestingness 1:26:27

PAU &uh we probably wanna be thinking how could we include some kind of performance measure / or some measure that correlates with &um &use [/] user &um [/] &um interaction / satisfaction 1:26:38

PAU &um some kind of measure that indicates the effort that's been involved / in terms of building the system that's then being used / to &uh / &uh run and &par [/] and participate in a competition 1:26:48

PAU how many man-hours did you have // I mean xxx members of the staff and so on ? 1:26:52

PAU &um we need to continue constructing realistic benchmarks // that's what we've been trying to do in ImageCLEF // &um but you know / wouldn't it be lovely to be extending the domains ? 1:27:01

PAU &um moving more to video retrieval / and moving more to audio retrieval and so on ? 1:27:05

PAU and we must // I end on this // we must conduct more user experiments 1:27:09

PAU what do people really want from image retrieval and video retrieval systems ? 1:27:12

PAU and I'll hope / we need more of I-CLEF 1:27:15

PAU so // keep going with I-CLEF 1:27:17

PAU and that's it 1:27:20

PAU sorry for running over 1:27:23