Aarat's Projects

The dataset consists of 11,035 unique frames extracted from 55 men's and women's cricket match highlight videos on YouTube, which together contained more than 1,300,000 frames. To extract the unique frames, a simple denoising autoencoder with a 128*128*3 - 64 - 128*128*3 architecture was trained for semantic hashing. As depicted in the figure below, the extracted frames were then categorized into three distinct classes:

Bowling (3,676 frames): All frames associated with the delivery of the ball, with the batter facing the camera.

Field (3,645 frames): All frames that focus on the field and fielding activities without being a close-up of a player.

Miscellaneous (3,714 frames): All frames with other content, such as audience shots, replays, and player zoom-ins.
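The semantic-hashing step used to pick out unique frames can be sketched roughly as follows. The text does not say how the 64-dimensional bottleneck code was turned into a hash or how duplicates were matched, so this sketch assumes two things that are not from the original: each code value is thresholded at 0.5 to get a binary hash, and two frames count as duplicates only when their hashes match exactly. The function names and the toy 4-dimensional codes are hypothetical and stand in for real autoencoder outputs.

```python
from typing import Iterable, List, Sequence, Tuple


def binarize(code: Sequence[float], threshold: float = 0.5) -> Tuple[int, ...]:
    """Turn a real-valued bottleneck vector into a binary hash.

    Assumption: values lie in [0, 1] (e.g. sigmoid outputs) and are
    thresholded at 0.5. The original text does not specify this.
    """
    return tuple(1 if value >= threshold else 0 for value in code)


def unique_frame_indices(
    codes: Iterable[Sequence[float]], threshold: float = 0.5
) -> List[int]:
    """Return the index of the first frame seen for each distinct hash."""
    seen = set()
    kept = []
    for index, code in enumerate(codes):
        key = binarize(code, threshold)
        if key not in seen:
            seen.add(key)
            kept.append(index)
    return kept


# Toy 4-dimensional codes standing in for the 64-dimensional bottleneck.
codes = [
    [0.91, 0.08, 0.77, 0.12],
    [0.88, 0.11, 0.81, 0.05],  # near-duplicate of frame 0
    [0.10, 0.95, 0.20, 0.90],
    [0.93, 0.02, 0.70, 0.30],  # hashes to the same bits as frame 0
]
print(unique_frame_indices(codes))  # [0, 2]
```

Exact hash matching keeps the pass linear in the number of frames, which matters at the scale of 1,300,000 frames; a Hamming-distance tolerance would be a natural variation but is equally an assumption here.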

A deep scene detector model with the architecture depicted in Fig. 8 was trained on this dataset. Table 2 shows the model's performance metrics, measured on 2,210 unseen images from separate cricket match highlight videos on YouTube.
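For readers who want to reproduce this kind of evaluation, here is a minimal sketch of computing accuracy and per-class precision, recall, and F1 for the three classes. Which metrics Table 2 actually reports is not stated in this text, so the choice of metrics is an assumption, the function names are hypothetical, and the labels below are made-up toy data, not the model's results.

```python
from collections import Counter

CLASSES = ("Bowling", "Field", "Miscellaneous")


def per_class_metrics(y_true, y_pred, classes=CLASSES):
    """Compute precision, recall, and F1 for each class from label lists."""
    pairs = Counter(zip(y_true, y_pred))  # (true, predicted) -> count
    metrics = {}
    for cls in classes:
        tp = pairs[(cls, cls)]
        fp = sum(n for (t, p), n in pairs.items() if p == cls and t != cls)
        fn = sum(n for (t, p), n in pairs.items() if t == cls and p != cls)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if precision + recall
            else 0.0
        )
        metrics[cls] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy labels for illustration only; these are NOT the reported results.
y_true = ["Bowling", "Bowling", "Field", "Field", "Miscellaneous", "Miscellaneous"]
y_pred = ["Bowling", "Field", "Field", "Field", "Miscellaneous", "Bowling"]

metrics = per_class_metrics(y_true, y_pred)
for cls in CLASSES:
    print(cls, metrics[cls])
print("accuracy:", accuracy(y_true, y_pred))
```

Reporting per-class figures alongside overall accuracy shows whether one class, for example Miscellaneous with its mixed content, is harder for the model than the others.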

CRICS: A Cricket Scene Dataset & Classifier
