|-
| October 3
| ''(MSR talk, no meeting)''
|
|-
| October 10
| Abhishek Sharma
| A Sentence is Worth a Thousand Pixels
|-
| October 17
===A Sentence is Worth a Thousand Pixels===
Speaker: [http://www.umiacs.umd.edu/~bhokaal/ Abhishek Sharma] -- Date: October 10, 2013
    
We are interested in holistic scene understanding where images are accompanied by text in the form of complex sentential descriptions. We propose a holistic conditional random field model for semantic parsing which reasons jointly about which objects are present in the scene, their spatial extent, and the semantic segmentation, and which takes both text and image information as input. We automatically parse the sentences, extract objects and their relationships, and incorporate them into the model, both via potentials and by re-ranking candidate detections. We demonstrate the effectiveness of our approach on the challenging UIUC sentences dataset, showing segmentation improvements of 12.5% over the visual-only model and detection improvements of 5% AP over deformable part-based models.