TUB-IRML at the MediaEval 2014 Violent Scenes Detection Task

  1. TUB-IRML at MediaEval 2014 Violent Scenes Detection Task: Violence Modeling through Feature Space Partitioning
     Esra Acar, Sahin Albayrak
     Competence Center Information Retrieval & Machine Learning
  2. Outline
     ► The Violence Detection Method: Video Representation, Violence Detection Model
     ► Results & Discussion
     ► Conclusions & Future Work
  3. The Violence Detection Method
     ► The two main components of our method are: (1) the representation of video segments, and (2) the learning of a violence model.
  4. Video Representation (1)
     (Figure) The generation process of the sparse coding based audio and visual representations for video segments.
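Slide 4 is a figure, so no implementation detail survives in the transcript. As a minimal sketch of how a sparse-coding-based mid-level representation for one video segment could be computed, the snippet below (an assumption built on scikit-learn's SparseCoder, not the authors' exact pipeline) encodes per-frame low-level features, such as MFCC frames, against a previously learned dictionary and max-pools the absolute sparse codes into a single segment-level vector.

```python
# Minimal sketch (assumed, not the authors' code): mid-level representation of one
# video segment via sparse coding against a pre-learned dictionary D.
import numpy as np
from sklearn.decomposition import SparseCoder

def segment_representation(frame_features: np.ndarray, D: np.ndarray) -> np.ndarray:
    """Encode per-frame features (n_frames x n_dims) with dictionary D (n_atoms x n_dims)
    and max-pool the absolute sparse codes into one n_atoms-dimensional vector."""
    coder = SparseCoder(dictionary=D,
                        transform_algorithm="lasso_lars",  # L1-regularized sparse coding
                        transform_alpha=1.0)               # sparsity weight (assumed value)
    codes = coder.transform(frame_features)                # shape: (n_frames, n_atoms)
    return np.abs(codes).max(axis=0)                       # max pooling over the segment

# Usage with stand-in data (e.g., 13-dimensional MFCC frames, 128 dictionary atoms):
rng = np.random.default_rng(0)
D = rng.standard_normal((128, 13))
D /= np.linalg.norm(D, axis=1, keepdims=True)
midlevel = segment_representation(rng.standard_normal((50, 13)), D)
```

Max pooling is only one plausible way to aggregate the codes over a segment; the authors may pool differently.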
  5. Video Representation (2)
     (Figure) The generation of the audio and visual dictionaries with sparse coding.
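This slide is also a figure. A hedged sketch of the dictionary-learning step, assuming scikit-learn's MiniBatchDictionaryLearning with illustrative hyperparameters, could look as follows; the learned components can then serve as the dictionary D used in the encoding sketch above.

```python
# Minimal sketch (assumed hyperparameters): learning an audio or visual dictionary
# with sparse coding from low-level features sampled over the training videos.
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

rng = np.random.default_rng(0)
train_features = rng.standard_normal((10000, 13))  # stand-in for sampled MFCC frames

dict_learner = MiniBatchDictionaryLearning(
    n_components=128,  # number of dictionary atoms (assumed)
    alpha=1.0,         # sparsity penalty (assumed)
    batch_size=256,
    random_state=0,
)
dict_learner.fit(train_features)
D = dict_learner.components_   # (n_atoms, n_dims); usable as the SparseCoder dictionary
```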
  6. Video Representation (3)
     ► In addition to the mid-level audio and visual representations, we use low-level features:
       - Motion-related descriptors: Violent Flows (ViF), a descriptor proposed for the real-time detection of violent crowd behavior, and
       - Static content representations: affect-related color descriptors such as statistics on saturation, brightness and hue in the HSL color space, and colorfulness.
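For the static content part, a rough illustration of affect-related color statistics for a single frame is sketched below, assuming OpenCV's HLS conversion (the same HSL space with a different channel order) and the Hasler–Süsstrunk colorfulness measure. The ViF motion descriptor is not reproduced here, and the exact statistics used by the authors may differ.

```python
# Hypothetical sketch: simple affect-related color statistics for one BGR frame.
# Illustrates the kind of descriptor the slide mentions, not the authors' exact features.
import cv2
import numpy as np

def color_descriptor(frame_bgr: np.ndarray) -> np.ndarray:
    hls = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HLS).astype(np.float32)
    hue, lightness, saturation = hls[..., 0], hls[..., 1], hls[..., 2]

    # Colorfulness following Hasler & Suesstrunk (2003).
    b = frame_bgr[..., 0].astype(np.float32)
    g = frame_bgr[..., 1].astype(np.float32)
    r = frame_bgr[..., 2].astype(np.float32)
    rg = r - g
    yb = 0.5 * (r + g) - b
    colorfulness = (np.sqrt(rg.std() ** 2 + yb.std() ** 2)
                    + 0.3 * np.sqrt(rg.mean() ** 2 + yb.mean() ** 2))

    return np.array([
        saturation.mean(), saturation.std(),
        lightness.mean(), lightness.std(),   # brightness statistics
        hue.mean(), hue.std(),
        colorfulness,
    ])
```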
  7. Violence Detection Model
     ► Violence is a concept that can be expressed audio-visually in diverse manners.
     ► We therefore learn multiple models for the violence concept instead of a single model:
       - partition the feature space by clustering the video segments of the training dataset, and
       - learn a separate model for each violence sub-concept.
     ► We perform classifier selection to address the classifier combination issue.
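A minimal sketch of feature space partitioning with classifier selection, assuming k-means clustering and per-cluster SVMs (the cluster count, classifier type and hyperparameters are illustrative, not the paper's configuration):

```python
# Hypothetical sketch of feature space partitioning with classifier selection.
# Assumes each cluster contains both violent and non-violent training segments.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

class PartitionedViolenceModel:
    def __init__(self, n_clusters: int = 4):
        self.kmeans = KMeans(n_clusters=n_clusters, random_state=0, n_init=10)
        self.models = {}

    def fit(self, X: np.ndarray, y: np.ndarray):
        # Partition the training segments into violence "sub-concepts".
        cluster_ids = self.kmeans.fit_predict(X)
        for c in np.unique(cluster_ids):
            mask = cluster_ids == c
            clf = SVC(kernel="rbf", probability=True, class_weight="balanced")
            clf.fit(X[mask], y[mask])   # one model per sub-concept
            self.models[c] = clf
        return self

    def predict_scores(self, X: np.ndarray) -> np.ndarray:
        # Classifier selection: each test segment is scored only by the model
        # of its nearest cluster.
        cluster_ids = self.kmeans.predict(X)
        scores = np.empty(len(X))
        for i, (x, c) in enumerate(zip(X, cluster_ids)):
            scores[i] = self.models[c].predict_proba(x.reshape(1, -1))[0, 1]
        return scores
```

Scoring each segment only with the model of its nearest cluster is one straightforward way to realize classifier selection; other selection or fusion rules are possible.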
  8. Results & Discussion
     MAP2014 and MAP@100 of our method with different representations:

     Method                  | MAP2014 (Movies) | MAP@100 (Movies) | MAP2014 (Web videos) | MAP@100 (Web videos)
     Run1                    | 0.169            | 0.368            | 0.517                | 0.582
     Run2                    | 0.139            | 0.284            | 0.371                | 0.478
     Run3                    | 0.080            | 0.208            | 0.477                | 0.495
     Run4                    | 0.172            | 0.409            | 0.489                | 0.586
     Run5                    | 0.170            | 0.406            | 0.479                | 0.567
     SVM-based unique model  | 0.093            | 0.302            | -                    | -

     Run1: MFCC-based mid-level audio representations
     Run2: HoG- and HoF-based mid-level features and ViF
     Run3: Affect-related color features
     Run4: Audio and visual features (except color)
     Run5: All audio-visual representations linearly fused at the decision level
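For orientation only: MAP@100 is commonly computed as the mean, over the evaluated videos, of the average precision over the 100 top-ranked segments. The exact MAP2014 and MAP@100 definitions are fixed by the MediaEval 2014 task overview, so the helper below is an assumed, generic AP@k formulation rather than the official evaluation code.

```python
# Hypothetical illustration of average precision over the top-k ranked segments.
# The official MAP2014 / MAP@100 definitions come from the MediaEval 2014 task overview.
def average_precision_at_k(relevance, k=100):
    """relevance: 0/1 ground-truth labels ordered by decreasing system score."""
    top = relevance[:k]
    hits, precision_sum = 0, 0.0
    for rank, rel in enumerate(top, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0

# MAP@100 would then be the mean of this value over the evaluated videos.
```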
  9. Conclusions & Future Work
     ► The mid-level audio representation based on MFCC and sparse coding provides promising performance in terms of the MAP2014 and MAP@100 metrics, and also outperforms our visual representations.
     ► As future work, we plan to extend and improve our visual representation set and to further investigate the feature space partitioning concept.
  10. Contact
      Esra Acar, M.Sc., Researcher
      esra.acar@tu-berlin.de
      Competence Center Information Retrieval & Machine Learning
      DAI-Labor, Technische Universität Berlin
      Fakultät IV – Elektrotechnik & Informatik
      Sekretariat TEL 14
      Ernst-Reuter-Platz 7, 10587 Berlin, Deutschland
      www.dai-labor.de
      Fon: +49 (0) 30 / 314 – 74 013
      Fax: +49 (0) 30 / 314 – 74 003
      Thanks!