A common trend in image tagging research is to focus on visually relevant
tags, which tends to ignore the personal and social aspects of tags,
especially on photoblogging websites such as Flickr. Previous work has
correctly identified that many of the tags that users provide on images are not
visually relevant (i.e., representative of the salient content in the image),
but it goes on to treat such tags as noise, ignoring that the users chose to
provide those tags over others that could have been more visually relevant.
Another common assumption about user-generated tags for images is that the
order of these tags provides no useful information for the prediction of tags
on future images. This assumption also tends to define usefulness in terms of
what is visually relevant to the image. For general tagging or labeling
applications that focus on providing visual information about image content,
these assumptions are reasonable, but for personalized image tagging
applications they are at best too rigid, ignoring user choice and
preferences.
We challenge the aforementioned assumptions and provide a machine learning
approach to the problem of personalized image tagging with the following
contributions: (1) we reformulate the personalized image tagging problem as a
search/retrieval ranking problem; (2) we leverage the order of the tags a user
has provided in the past, which does not always reflect visual relevance, as a
cue to their tag preferences, similar to click data; (3) we propose a technique
to augment sparse user tag data (semi-supervision); and (4) we demonstrate the
efficacy of our method on a subset of Flickr images, showing improvement over
previous state-of-the-art methods.
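
As a rough illustration of how tag order could serve as a click-like
preference signal, the sketch below turns one user's ordered tag list into
pairwise preferences of the kind a ranking model might train on. It is a
minimal sketch under stated assumptions, not the method of this paper: the
function name `preference_pairs`, the `candidate_tags` argument, and the rule
that a chosen tag is preferred over an unchosen candidate are all hypothetical.

```python
from itertools import combinations


def preference_pairs(ordered_tags, candidate_tags=()):
    """Return (preferred, less_preferred) pairs implied by one tag list.

    Assumption (illustrative only): a tag the user entered earlier is
    preferred over one entered later, and any tag the user entered is
    preferred over a candidate tag the user did not enter.
    """
    # Pairs among the user's own tags, in the order they were provided.
    pairs = list(combinations(ordered_tags, 2))

    # Pairs between each provided tag and each unchosen candidate tag.
    chosen = set(ordered_tags)
    for tag in ordered_tags:
        for other in candidate_tags:
            if other not in chosen:
                pairs.append((tag, other))
    return pairs


# Hypothetical example: tags a user gave one photo, plus visually
# plausible candidates from some other source.
pairs = preference_pairs(
    ["vacation", "family", "beach"],
    candidate_tags=["beach", "sand"],
)
for preferred, other in pairs:
    print(f"{preferred} > {other}")
```

Pairs of this form are the usual input to pairwise learning-to-rank
objectives; which objective and features to use is left open here.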