Lu Dong (董璐)
Ph.D. Student
Department of Computer Science and Engineering
Most current models are trained on limited closed-set data, which often leads to subpar performance on real-world video. A major challenge in this setting is self-occlusion, where parts of a subject's body obstruct or cover other parts, making predictions less smooth and less precise. To tackle this issue, I developed a transformer-based pipeline that incorporates both self-communication and cross-communication mechanisms. Applying this technique across consecutive frames yields a marked improvement in smoothness, verified through extensive experiments. The proposed pipeline outperforms state-of-the-art methods on the Human3.6M dataset and achieves the best performance on our in-house self-occlusion dataset.
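As a rough illustration only, the two communication mechanisms can be read as self-attention within a frame and cross-attention between consecutive frames. That reading is an assumption, not a description of the actual pipeline. The sketch below uses plain scaled dot-product attention with no learned projections, and the names `attend`, `self_communication`, and `cross_communication` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values):
    """Scaled dot-product attention over lists of feature vectors.

    Assumption: no learned query/key/value projections, which a real
    transformer layer would include.
    """
    d = len(keys[0])
    out = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out


def self_communication(frame):
    """Hypothetical: tokens in one frame attend to each other."""
    return attend(frame, frame, frame)


def cross_communication(frame, neighbor):
    """Hypothetical: tokens in one frame attend to a neighboring frame."""
    return attend(frame, neighbor, neighbor)


# Toy example: three tokens with 2-D features in two consecutive frames.
current = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]
previous = [[0.2, 0.8], [0.9, 0.1], [0.5, 0.5]]
within_frame = self_communication(current)
across_frames = cross_communication(current, previous)
```

Under this reading, a token that is occluded in the current frame can still draw on features from the neighboring frame, since each output is a weighted average of the neighbor's tokens.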
Last updated: February 2023