A team of researchers at New York University has developed a multi-scale, sliding-window approach that can be used for image classification, localization, and detection. Evaluated against other entries on the ILSVRC datasets, the approach ranked 4th in classification, 1st in localization, and 1st in detection.
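To make the idea of a multi-scale sliding window concrete, here is a minimal Python sketch. It is not the authors' implementation: the window size, stride, scales, the nearest-neighbour resize, and the `score_window` function (a stand-in for a ConvNet classifier) are all assumptions chosen for illustration. It simply crops every window explicitly at each scale and keeps the best-scoring one.

```python
import numpy as np

def resize_nearest(image, scale):
    """Nearest-neighbour resize of an (H, W) or (H, W, C) array by a scale factor."""
    h, w = image.shape[:2]
    new_h = max(1, int(round(h * scale)))
    new_w = max(1, int(round(w * scale)))
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    return image[rows][:, cols]

def sliding_windows(image, window=4, stride=2, scales=(1.0, 0.5)):
    """Yield (scale, x, y, crop); x and y are mapped back to original-image pixels."""
    for scale in scales:
        scaled = resize_nearest(image, scale)
        h, w = scaled.shape[:2]
        for top in range(0, h - window + 1, stride):
            for left in range(0, w - window + 1, stride):
                crop = scaled[top:top + window, left:left + window]
                yield scale, int(left / scale), int(top / scale), crop

def score_window(crop):
    """Hypothetical stand-in for a ConvNet classifier: mean pixel intensity."""
    return float(crop.mean())

def best_window(image, **kwargs):
    """Return (score, scale, x, y) for the highest-scoring window over all scales."""
    candidates = (
        (score_window(crop), scale, x, y)
        for scale, x, y, crop in sliding_windows(image, **kwargs)
    )
    return max(candidates, key=lambda c: c[0])
```

For example, calling `best_window` on an 8x8 array that is zero everywhere except a bright 4x4 block in the lower-right corner should select the full-resolution window whose top-left corner sits at that block.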
Another important contribution of the paper is its explanation of how ConvNets can be used effectively for detection and localization. According to the authors, they are the first to explain clearly how this can be done for ImageNet. The proposed scheme involves substantial modifications to networks designed for classification. The approach also shows that different tasks can be learned simultaneously using a single shared network.
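The idea of one shared network serving several tasks can be sketched as a common feature extractor feeding separate task-specific heads. The toy example below is only an illustration under stated assumptions: the class name `SharedNet`, the single linear layer standing in for the convolutional layers, the layer sizes, and the random weights are all hypothetical and are not taken from the paper.

```python
import numpy as np

class SharedNet:
    """Toy shared network: one feature layer, two task heads (illustrative only)."""

    def __init__(self, in_dim=16, feat_dim=8, num_classes=3, seed=0):
        rng = np.random.default_rng(seed)
        # Shared weights used by every task (stand-in for the ConvNet layers).
        self.w_shared = rng.normal(scale=0.1, size=(in_dim, feat_dim))
        # Task-specific heads: class scores and 4 bounding-box coordinates.
        self.w_cls = rng.normal(scale=0.1, size=(feat_dim, num_classes))
        self.w_box = rng.normal(scale=0.1, size=(feat_dim, 4))

    def features(self, x):
        """Shared representation: linear layer followed by ReLU."""
        return np.maximum(x @ self.w_shared, 0.0)

    def forward(self, x):
        """Return (class probabilities, box predictions) from the same features."""
        f = self.features(x)
        logits = f @ self.w_cls
        logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
        exp = np.exp(logits)
        probs = exp / exp.sum(axis=1, keepdims=True)
        boxes = f @ self.w_box
        return probs, boxes

net = SharedNet()
batch = np.ones((2, 16))          # two dummy flattened inputs
probs, boxes = net.forward(batch)
```

Because both heads read the same `features` output, any training signal from either task would update the shared weights, which is what allows the tasks to be learned together.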