We are interested in detecting team-sport players in order to control the autonomous production of images rendering a sport game action. In other words, information about the players' positions is used to select the viewpoint from which to render the action, typically by cropping within a fixed view. Hence, we are not interested in an accurate segmentation of each individual; rather, we want to determine whether a given foreground activity results from one or several players, or is caused by something else, such as dynamic advertisement panels or spot lighting.
As another consequence of our application context, our system has to deal with severe deformations of the objects of interest (players run, jump, fall down, collide with each other, etc.). Hence, to be effective, it cannot rely solely on the characterization of the standard appearance of a standing human, as is done for pedestrian detection; it also has to exploit as much as possible of the a priori information available about the appearance of the object (e.g. the players' jerseys have a known colour) and of the scene (sport hall, known background advertisements). Since this a priori information changes from one game to another, the classifier has to be trained online, so as to adapt to the game at hand.
Our goal is to improve a foreground silhouette detector by using an appearance-based classifier to distinguish true positives from false positives among the foreground detections. The main idea is to train the classifier on the probably correct decisions taken by the foreground detector; no manual annotation is thus required to generate the training set, which makes it possible to retrain and adapt the classifier to the case at hand. Because it exploits colour and gradient visual features, the appearance-based classifier provides information complementary to that of the foreground detector, making the overall detection more reliable.
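The annotation-free training-set construction can be sketched as follows. This is a hypothetical illustration, not the authors' actual criterion for "probably correct" decisions: here we simply keep detections whose foreground score is clearly high (as positives) or clearly low (as negatives), and discard the ambiguous ones. The function and parameter names (`bootstrap_training_set`, `pos_thresh`, `neg_thresh`) are ours.

```python
def bootstrap_training_set(detections, neg_thresh=0.2, pos_thresh=0.8):
    """Build (features, label) pairs without manual annotation.

    detections: iterable of (appearance_features, foreground_score) pairs,
    with scores assumed to lie in [0, 1]. Thresholds are illustrative.
    """
    samples = []
    for features, score in detections:
        if score >= pos_thresh:
            samples.append((features, 1))   # probably a real player
        elif score <= neg_thresh:
            samples.append((features, 0))   # probably clutter (panels, lighting)
        # scores in between are too uncertain to use as training labels
    return samples
```

Only the confident decisions are turned into labels, so occasional detector errors on ambiguous foreground blobs do not pollute the training set.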
The proposed classifier follows the Random Ferns classifier. It relies on an ensemble of random sets of binary tests to characterize the texture describing the visual appearance of the target.
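As a rough illustration of the Random Ferns idea, the NumPy sketch below (our own minimal code, not the authors' implementation) builds an ensemble of ferns, where each fern applies a small set of random binary tests; here the tests compare random pairs of feature values, the test outcomes index a leaf histogram, and the per-fern class posteriors are combined in a semi-naive Bayes fashion.

```python
import numpy as np

rng = np.random.default_rng(0)

class RandomFerns:
    """Ensemble of ferns: each fern holds a few random binary tests whose
    outcomes form a binary code indexing a per-class leaf histogram."""

    def __init__(self, n_ferns=10, n_tests=8, n_features=64):
        self.n_ferns, self.n_tests = n_ferns, n_tests
        # each binary test compares two randomly chosen feature dimensions
        self.pairs = rng.integers(0, n_features, size=(n_ferns, n_tests, 2))
        # leaf counts for the two classes (player / clutter), Laplace-smoothed
        self.counts = np.ones((n_ferns, 2 ** n_tests, 2))

    def _leaf(self, x):
        # evaluate all binary tests, then pack the outcomes into one
        # integer code per fern
        bits = x[self.pairs[..., 0]] > x[self.pairs[..., 1]]
        return bits.dot(1 << np.arange(self.n_tests))

    def update(self, x, y):
        # online training: increment the reached leaf for class y in each fern
        self.counts[np.arange(self.n_ferns), self._leaf(x), y] += 1

    def predict_proba(self, x):
        # per-fern class posteriors at the reached leaves, combined by
        # multiplying across ferns (summing log-probabilities)
        leaves = self.counts[np.arange(self.n_ferns), self._leaf(x)]
        logp = np.log(leaves / leaves.sum(axis=1, keepdims=True)).sum(axis=0)
        p = np.exp(logp - logp.max())
        return p / p.sum()
```

Because each fern only stores count histograms, the classifier can be updated incrementally during the game, which fits the online-training requirement above.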
Video samples presenting the results:
The results are computed on the SPIROUDOME and APIDIS datasets.
On the videos:
- boxes = foreground detections
- blue boxes are rejected by the classifier
- red and green boxes are kept by the classifier: green boxes are close to the ground-truth player positions, red boxes are far from it
SPIROUDOME
APIDIS
SPIROUDOME
Related references: (Parisot and De Vleeschouwer, 2017); (Parisot et al., 2013)