17 September 2017 | General
Posted by aisolab


Earlier this year, Google hosted a competition on Kaggle for YouTube video classification. Google provided 7 million videos totaling 450,000 hours, to be classified into 4,716 categories. The third-placed team, a group of researchers from Tsinghua University and Baidu, has recently published its approach.

Using a 7-layer deep LSTM architecture, the team achieved a score of 82.75% on Global Average Precision (GAP), the metric used in the competition.
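To make the metric more concrete, here is a minimal sketch of how a Global Average Precision score can be computed. It is not the official evaluation code: the function name `global_average_precision`, its parameters, and the default of keeping the top 20 predictions per video are assumptions for illustration, as is the choice to normalize by the total number of ground-truth labels. The idea is to pool the top predictions of every video, rank them by confidence, and average the precision at each correct hit.

```python
from typing import List, Sequence, Tuple


def global_average_precision(
    scores: Sequence[Sequence[float]],
    labels: Sequence[Sequence[int]],
    top_k: int = 20,  # assumed cut-off per video, not taken from the post
) -> float:
    """Illustrative GAP: pool top_k predictions per video, then rank globally."""
    pooled: List[Tuple[float, int]] = []
    total_positives = 0

    for video_scores, video_labels in zip(scores, labels):
        # Count every ground-truth label, even those outside the top_k.
        total_positives += sum(video_labels)
        # Keep only the top_k most confident classes for this video.
        ranked = sorted(
            range(len(video_scores)),
            key=lambda c: video_scores[c],
            reverse=True,
        )
        for c in ranked[:top_k]:
            pooled.append((video_scores[c], video_labels[c]))

    if total_positives == 0:
        return 0.0

    # Rank all pooled predictions by confidence, highest first.
    pooled.sort(key=lambda pair: pair[0], reverse=True)

    hits = 0
    gap = 0.0
    for rank, (_, is_positive) in enumerate(pooled, start=1):
        if is_positive:
            hits += 1
            # Precision at this rank, weighted by the recall gained (1 / total).
            gap += hits / rank / total_positives
    return gap


# Toy usage: two videos, three classes, keeping the top 2 predictions each.
toy_scores = [[0.9, 0.1, 0.8], [0.2, 0.7, 0.3]]
toy_labels = [[1, 0, 0], [0, 1, 1]]
toy_gap = global_average_precision(toy_scores, toy_labels, top_k=2)
```

Because the ranking is done across all videos at once, a confident wrong prediction on one video lowers the precision credited to correct predictions on every other video ranked below it.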

The architecture of the temporal residual CNN that the team also used is as follows:

Source: Tsinghua University, 2017