The International Conference on Computer Vision (ICCV) was held in Venice from 22nd to 29th October 2017. The 30th edition of the event can be summed up by some interesting statistics:
- More than 3100 attendees (an increase of 113% compared to ICCV2015)
- An increase of almost 30% in the number of paper submissions
- 65 sponsors this year, and an increase of more than 350% in the number of exhibitors
- Most authors were from the USA and China, and the most common topic was recognition (detection, categorization and indexing)
- The word most used in paper titles was “learning”, followed by “network” and “image”
- The organizations that submitted the most papers were Tsinghua University, Carnegie Mellon University and Google
Here I list some of the most popular topics presented at ICCV.
- Generative Adversarial Networks (GANs) have become very popular, with applications such as image synthesis from a text description, video synthesis from a single image, image style transformation, image super-resolution, and image in-painting (filling in missing parts of images). At the beginning of ICCV, a full-day tutorial focused on GANs, with a motivating introduction to the area given by Ian Goodfellow, who showed potential applications and suggested best practices. Compared to how this topic was covered earlier this year at CVPR 2017, interest in GANs has increased greatly, and I am curious to see how the topic will develop in the future.
- Many papers focused on semantic and instance-level segmentation, and the best paper award went to one of them: “Mask R-CNN” by He et al. The trend I noticed is that semantic and instance-level segmentation is slowly becoming a more popular topic than the classic problem of object detection.
- A hot topic at ICCV 2017 was activity recognition in videos with Visual Question Answering (VQA), as I already mentioned in my blog post about CVPR. VQA naturally links to understanding the content of videos, a task that is much more time-consuming than understanding the content of a single picture. I expect that once VQA for videos starts to emerge, it may lead to huge improvements in video analysis.
- The work that most impressed me was “Turning Corners into Cameras” by Bouman et al. In a demonstration performed next to the poster, the authors showed how to detect (and in some cases track) the movements of a target hidden behind the corner of an object, using a standard RGB camera. The source code is available for testing, and I think this is astonishing: it sounds like magic to have a camera that can see what is around a corner!