Line 35: |
Line 35: |
| {| class="wikitable" cellpadding="10" border="1" cellspacing="0" | | {| class="wikitable" cellpadding="10" border="1" cellspacing="0" |
| |- | | |- |
− | |Date
| + | ! Date |
− | |Speaker
| + | ! Speaker |
− | |Title
| + | ! Title |
| |- | | |- |
| | June 9 | | | June 9 |
Line 92: |
Line 92: |
| | | |
| ===Multi-Agent Event Recognition in Structured Scenarios=== | | ===Multi-Agent Event Recognition in Structured Scenarios=== |
− | Speaker: Vlad Morariu -- Date: June 9, 2011 | + | Speaker: [http://www.umiacs.umd.edu/~morariu/ Vlad Morariu] -- Date: June 9, 2011 |
| | | |
| I will present a framework for the automatic recognition of complex multi-agent events in settings where structure is imposed by rules that agents must follow while performing activities. Given semantic spatio-temporal descriptions of what generally happens (i.e., rules, event descriptions, physical constraints), and based on video analysis, the framework determines the events that occurred. Knowledge about spatio-temporal structure is encoded in first-order logic using an approach based on Allen's Interval Logic, and robustness to low-level observation uncertainty is provided by Markov Logic Networks (MLN). The main contribution is that the framework integrates interval-based temporal reasoning with probabilistic logical inference, relying on an efficient bottom-up grounding scheme to avoid combinatorial explosion. Applied to one-on-one basketball, the framework detects and tracks players, their hands and feet, and the ball, generates event observations from the resulting trajectories, and performs probabilistic logical inference to determine the most consistent sequence of events. | | I will present a framework for the automatic recognition of complex multi-agent events in settings where structure is imposed by rules that agents must follow while performing activities. Given semantic spatio-temporal descriptions of what generally happens (i.e., rules, event descriptions, physical constraints), and based on video analysis, the framework determines the events that occurred. Knowledge about spatio-temporal structure is encoded in first-order logic using an approach based on Allen's Interval Logic, and robustness to low-level observation uncertainty is provided by Markov Logic Networks (MLN). The main contribution is that the framework integrates interval-based temporal reasoning with probabilistic logical inference, relying on an efficient bottom-up grounding scheme to avoid combinatorial explosion. Applied to one-on-one basketball, the framework detects and tracks players, their hands and feet, and the ball, generates event observations from the resulting trajectories, and performs probabilistic logical inference to determine the most consistent sequence of events. |
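The abstract above names Allen's Interval Logic as the basis for its temporal reasoning. As a minimal sketch of that one ingredient only, the function below classifies the qualitative relation between two time intervals into one of Allen's thirteen relations. It does not reproduce the speaker's framework, the Markov Logic Network inference, or the grounding scheme; the function name `allen_relation` and the example event intervals are hypothetical and chosen purely for illustration.

```python
def allen_relation(a, b):
    """Return the Allen interval relation of interval a with respect to b.

    Each interval is a (start, end) pair with start < end.
    The relation names follow the usual thirteen Allen relations.
    """
    (a0, a1), (b0, b1) = a, b
    if not (a0 < a1 and b0 < b1):
        raise ValueError("intervals must satisfy start < end")

    # Disjoint or touching intervals.
    if a1 < b0:
        return "before"
    if a1 == b0:
        return "meets"
    if b1 < a0:
        return "after"
    if b1 == a0:
        return "met_by"

    # From here on the intervals share more than a single point.
    if a0 == b0 and a1 == b1:
        return "equals"
    if a0 == b0:
        return "starts" if a1 < b1 else "started_by"
    if a1 == b1:
        return "finishes" if a0 > b0 else "finished_by"
    if b0 < a0 and a1 < b1:
        return "during"
    if a0 < b0 and b1 < a1:
        return "contains"

    # Remaining case: partial overlap with four distinct endpoints.
    return "overlaps" if a0 < b0 else "overlapped_by"


# Hypothetical event intervals (in frames), for illustration only.
dribble = (0, 50)
shot = (50, 70)
possession = (0, 70)

print(allen_relation(dribble, shot))        # meets
print(allen_relation(dribble, possession))  # starts
print(allen_relation(shot, possession))     # finishes
```

A rule such as "a shot must be preceded by a dribble by the same player" could then be phrased as a constraint on these relation names, although how the actual system encodes its rules is not stated in the abstract.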
Line 98: |
Line 98: |
| | | |
| ===A Vision System to Extract "Simple" Objects in a Purely Bottom-Up Fashion=== | | ===A Vision System to Extract "Simple" Objects in a Purely Bottom-Up Fashion=== |
− | Speaker: Ajay Mishra -- Date: June 16, 2011 | + | Speaker: [http://www.umiacs.umd.edu/~mishraka/ Ajay Mishra] -- Date: June 16, 2011 |
| | | |
| Human perception, being active, is inextricably linked to visual fixation. Despite the obvious importance of fixation, it has not become an integral part of computer vision/robotics algorithms so far. To incorporate fixation and attention in a computer vision framework, we have proposed a new segmentation framework that takes a fixation point (i.e., a single point) inside a "simple" object as its input and outputs the region corresponding to that object. We have also designed a new attentional mechanism that utilizes the concept of neural border-ownership to automatically select the fixation points inside different "simple" objects in the scene. All of this together creates a fully automatic system that outputs only the regions corresponding to the "simple" objects without knowing the actual number or the size of the objects in the scene. | | Human perception, being active, is inextricably linked to visual fixation. Despite the obvious importance of fixation, it has not become an integral part of computer vision/robotics algorithms so far. To incorporate fixation and attention in a computer vision framework, we have proposed a new segmentation framework that takes a fixation point (i.e., a single point) inside a "simple" object as its input and outputs the region corresponding to that object. We have also designed a new attentional mechanism that utilizes the concept of neural border-ownership to automatically select the fixation points inside different "simple" objects in the scene. All of this together creates a fully automatic system that outputs only the regions corresponding to the "simple" objects without knowing the actual number or the size of the objects in the scene. |
Line 108: |
Line 108: |
| | | |
| ===Fast Imaging with Slow Camera=== | | ===Fast Imaging with Slow Camera=== |
− | Speaker: Dikpal Reddy -- Date: June 30, 2011 | + | Speaker: [http://www.umiacs.umd.edu/~dikpal/ Dikpal Reddy] -- Date: June 30, 2011 |
| | | |
| Abstract: Over the years, the spatial resolution of cameras has steadily increased, but the temporal resolution has remained the same. In this talk, I will present my work on converting a regular slow camera into a faster one. We capture and accurately reconstruct fast events using our slower prototype camera by exploiting the temporal redundancy in videos. First, I will show how, by fluttering the shutter during the exposure of a slow 25fps camera, we can capture and reconstruct a fast periodic video at 2000fps. Next, I will present a generalization in which per-pixel modulation during exposure, in combination with brightness constancy constraints, allows us to capture a broad class of motions at 200fps using a 25fps camera. In both techniques, we borrow ideas from compressive sensing theory for acquisition and recovery. | | Abstract: Over the years, the spatial resolution of cameras has steadily increased, but the temporal resolution has remained the same. In this talk, I will present my work on converting a regular slow camera into a faster one. We capture and accurately reconstruct fast events using our slower prototype camera by exploiting the temporal redundancy in videos. First, I will show how, by fluttering the shutter during the exposure of a slow 25fps camera, we can capture and reconstruct a fast periodic video at 2000fps. Next, I will present a generalization in which per-pixel modulation during exposure, in combination with brightness constancy constraints, allows us to capture a broad class of motions at 200fps using a 25fps camera. In both techniques, we borrow ideas from compressive sensing theory for acquisition and recovery. |
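The frame rates quoted in the abstract above imply how many fast sub-frames are folded into each captured frame: 2000/25 = 80 for the fluttered-shutter case and 200/25 = 8 for the per-pixel case. The sketch below illustrates only the acquisition side under a simplified, assumed model: the exposure spans the full frame period, the shutter is either fully open or fully closed in each sub-frame, and the captured frame is the sum of the sub-frames seen while the shutter is open. The function names, the random binary code, and the toy pixel values are hypothetical, and the compressive-sensing recovery step described in the talk is not shown.

```python
import random


def subframes_per_exposure(sensor_fps, target_fps):
    """Number of fast sub-frames that fall inside one slow-camera frame."""
    if target_fps % sensor_fps != 0:
        raise ValueError("target_fps must be a multiple of sensor_fps")
    return target_fps // sensor_fps


def make_flutter_code(length, seed=0):
    """Hypothetical binary shutter code: 1 = open, 0 = closed."""
    rng = random.Random(seed)
    return [rng.randint(0, 1) for _ in range(length)]


def coded_exposure(subframes, code):
    """Simulate one captured frame from a stack of fast sub-frames.

    subframes: list of T frames, each a list of pixel intensities.
    code:      list of T values in {0, 1}, one shutter state per sub-frame.
    Returns the per-pixel sum of the sub-frames where the shutter is open.
    """
    if len(subframes) != len(code):
        raise ValueError("need exactly one code entry per sub-frame")
    captured = [0.0] * len(subframes[0])
    for frame, is_open in zip(subframes, code):
        if is_open:
            for i, value in enumerate(frame):
                captured[i] += value
    return captured


# Sub-frame counts implied by the frame rates in the abstract.
print(subframes_per_exposure(25, 2000))  # 80
print(subframes_per_exposure(25, 200))   # 8

# Toy example: three sub-frames of two pixels, shutter open on the first and last.
print(coded_exposure([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], [1, 0, 1]))  # [6.0, 8.0]
```

Because many sub-frames collapse into one measurement, recovering them is an underdetermined problem, which is why the abstract points to temporal redundancy and compressive sensing for the reconstruction.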