Behzad, Mahaseni (2021) Spotting events in football videos with a combination of two-stream convolutional neural network and dilated recurrent neural network / Behzad Mahaseni. Masters thesis, Universiti Malaya.
Abstract
In this research, we address the problem of event detection and localization in football (soccer) videos. While event detection in videos is a research problem in its own right, event detection in sports, and especially in football, also has an important commercial impact. Football is played by more than 250 million players in 200+ nations, and it has the highest television audience of any sport. This makes football the most popular sport in the world. Given the advances in streaming technologies on mobile platforms, it is important to develop efficient and fast algorithms for processing the thousands of videos captured and stored in the cloud. Unlike images, videos provide additional temporal information. While this additional information is helpful, it also makes reasoning more challenging. On the one hand, the local correlation between adjacent frames makes it possible to identify short-range correlations between player movements. On the other hand, one can identify mid-range and long-range correlations between events that are seconds apart. An important challenge in analyzing long videos is how to account for the full range of correlations, from short to long, between video frames. Localizing events in a football video (that is, temporal segmentation) is a challenging problem. While the general problem of temporal segmentation in videos has been extensively addressed in the literature, to the best of our knowledge this work is among the first to address the event localization problem in "long" football videos using end-to-end deep learning techniques. Football videos are long, and the correlation between their frames ranges from short to long. To model this range of correlations, we propose a combination of two-stream CNNs and dilated RNNs with LSTM cells, which captures both short-range and long-range correlations.
Our experimental results show a 5.4% to 11.4% accuracy improvement over the state of the art and the baselines on the problem of spotting in long videos, evaluated on SoccerNet, the largest football dataset available to the research community.
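The dilated recurrence mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the function names, the weights, and the one-unit tanh cell (standing in for an LSTM cell) are assumptions made for illustration, and the input list stands in for per-frame features that a two-stream CNN would produce. The idea shown is only the dilated skip connection, where the state at step t is computed from the state at step t - d rather than t - 1, so that stacked layers with growing dilations cover short-range and long-range dependencies.

```python
import math


def dilated_rnn_layer(inputs, dilation, w_in=0.5, w_rec=0.5):
    """One-unit recurrent layer whose state at step t reads the state at t - dilation.

    Hypothetical simplification: a scalar tanh cell replaces the LSTM cell,
    and w_in / w_rec are fixed illustrative weights, not learned parameters.
    """
    states = []
    for t, x in enumerate(inputs):
        # Dilated skip connection: look `dilation` steps back instead of one.
        # Before the sequence has `dilation` steps of history, use a zero state.
        prev = states[t - dilation] if t >= dilation else 0.0
        states.append(math.tanh(w_in * x + w_rec * prev))
    return states


def stacked_dilated_rnn(inputs, dilations=(1, 2, 4)):
    """Stack layers with increasing dilation; each layer consumes the previous output."""
    out = list(inputs)
    for d in dilations:
        out = dilated_rnn_layer(out, d)
    return out


# A single impulse at step 0, passed through one layer with dilation 4:
# only steps 0 and 4 can carry a non-zero state within these 8 steps.
impulse = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
single = dilated_rnn_layer(impulse, dilation=4)
stacked = stacked_dilated_rnn(impulse)
```

With a dilation of 4, the impulse reaches step 4 in a single recurrent hop, whereas an ordinary recurrence (dilation 1) would need four hops. This shortcut is what makes large dilations attractive for long videos.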
Item Type: | Thesis (Masters) |
---|---|
Additional Information: | Dissertation (M.A.) – Faculty of Computer Science & Information Technology, Universiti Malaya, 2021. |
Uncontrolled Keywords: | Deep learning; Recurrent neural networks; Two-stream CNN; Sport video analysis; Activity detection and spotting |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science; T Technology > T Technology (General) |
Divisions: | Faculty of Computer Science & Information Technology > Dept of Artificial Intelligence |
Depositing User: | Mr Mohd Safri Tahir |
Date Deposited: | 03 Jul 2023 07:47 |
Last Modified: | 03 Jul 2023 07:47 |
URI: | http://studentsrepo.um.edu.my/id/eprint/14569 |