
Large Scale Visual Analytics
Workshop in Conjunction with the IEEE International Conference on Data Mining (ICDM-11), December 10, 2011, Vancouver, Canada
Modern intelligent mobile devices have made visual data acquisition increasingly affordable and popular. Moreover, developments in cloud computing and social networks encourage users to share their information. As a result, we have witnessed a recent surge of visual information on the Internet. In the past, researchers have proposed many approaches to analyse data collections and extract relevant information from them. These traditional data mining tools have proven useful for many applications, and in principle the general-purpose algorithms are applicable to visual data analysis. However, from the perspective of designing tools for data analysis, the newly emerged visual data have several distinctive characteristics that must be accounted for.
(i) For a particular task, the sheer volume of the visual data set of interest can be enormous and thus pose a challenge to the algorithm. For example, to answer a query for particular images, it may be necessary to consider all, or an appropriate portion, of the images available on the Internet. We therefore need algorithms that scale to very large problems. Because such tasks often come with constraints, e.g. on response time or available physical memory, we often want algorithms that can trade some performance for limited resources, e.g. algorithms that work in online settings or that produce suboptimal but useful results when terminated early.
(ii) With improved visual data acquisition, the processing of each data item is becoming more demanding. A modern mobile phone camera delivers images of a size comparable to those produced by a professional digital camera a decade ago. This calls for descriptions of visual information that are both efficient and representative. We are therefore interested in novel feature extraction methods, as well as model-based representation methods, e.g. subspace learning algorithms.
(iii) Individual data items often carry side information that can be exploited. A good example is the geographic information in images. GPS receivers are among the sensors that have recently become affordable to consumers and are now built into many mobile devices, so many images are taken with highly accurate embedded location information. Another example is the pictures on social websites and in online personal albums, which often bear tags and comments provided by public users. In contrast to GPS tags, user-provided side information is not always accurate; it can sometimes be vague or even misleading. We are therefore interested in techniques and innovations that can take advantage of helpful side information and are resilient to erroneous inputs.
(iv) Side information can also take the form of relations between data items. On the Internet, content interacts with users dynamically: people "click" their way to the material that interests them. Online data are therefore well suited to algorithms that can exploit relational side information. Moreover, we want techniques that account for the fact that the side information evolves over time.
The aim of this workshop is to identify interesting challenges in large-scale problems of visual computing. We will gather ideas and discuss theoretical and practical issues on broadly related topics. We hope that the discussion at the workshop will encourage advances in this emerging area.