Title:
Heatmap Pooling Network for Action Recognition From RGB Videos.
Authors:
Source:
IEEE Transactions on Pattern Analysis and Machine Intelligence [IEEE Trans Pattern Anal Mach Intell] 2026 Mar; Vol. 48 (3), pp. 3726-3743.
Publication Type:
Journal Article
Language:
English
Journal Info:
Publisher: IEEE Computer Society; Country of Publication: United States; NLM ID: 9885960; Publication Model: Print; Cited Medium: Internet; ISSN: 1939-3539 (Electronic); Linking ISSN: 0098-5589; NLM ISO Abbreviation: IEEE Trans Pattern Anal Mach Intell; Subsets: MEDLINE
Imprint Name(s):
Original Publication: [New York]: IEEE Computer Society.
Entry Date(s):
Date Created: 20251205 Date Completed: 20260204 Latest Revision: 20260206
Update Code:
20260207
DOI:
10.1109/TPAMI.2025.3640697
PMID:
41348797
Database:
MEDLINE

Abstract:

*Human action recognition (HAR) in videos has garnered widespread attention due to the rich information in RGB videos. Nevertheless, existing methods for extracting deep features from RGB videos face challenges such as information redundancy, susceptibility to noise, and high storage costs. To address these issues and fully harness the useful information in videos, we propose a novel heatmap pooling network (HP-Net) for action recognition from videos, which extracts information-rich, robust, and concise pooled features of the human body through a feedback pooling module. The extracted pooled features demonstrate clear performance advantages over pose data and heatmap features previously extracted from videos. In addition, we design a spatial-motion co-learning module and a text refinement modulation module to integrate the extracted pooled features with other multimodal data, enabling more robust action recognition. Extensive experiments on several benchmarks, namely NTU RGB+D 60, NTU RGB+D 120, Toyota-Smarthome, and uncrewed aerial vehicle (UAV)-Human, consistently verify the effectiveness of our HP-Net, which outperforms existing human action recognition methods.*
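
For intuition about how pooled features can be derived from body heatmaps, the sketch below shows generic heatmap-weighted spatial pooling: per-joint heatmaps are normalized into attention maps and used to average backbone feature maps. This is an illustrative assumption, not the paper's feedback pooling module; the function name `heatmap_pool`, the tensor shapes, and the softmax normalization are all hypothetical.

```python
# Hypothetical sketch: heatmap-weighted spatial pooling of frame features.
# Not the authors' HP-Net; names, shapes, and normalization are assumptions.
import torch

def heatmap_pool(features: torch.Tensor, heatmaps: torch.Tensor) -> torch.Tensor:
    """Pool per-frame feature maps with body-part heatmaps as spatial weights.

    features: (T, C, H, W)  backbone feature maps per frame
    heatmaps: (T, K, H, W)  per-joint (or per-part) heatmaps per frame
    returns:  (T, K, C)     one pooled feature vector per joint per frame
    """
    # Normalize each heatmap over the spatial grid so it sums to 1,
    # turning it into an attention map for weighted average pooling.
    w = heatmaps.flatten(2).softmax(dim=-1).view_as(heatmaps)
    # Weighted sum over spatial positions: (T, K, H, W) x (T, C, H, W) -> (T, K, C)
    return torch.einsum('tkhw,tchw->tkc', w, features)

# Example: 16 frames, 256-channel features, 17 joints on a 56x56 grid.
feats = torch.randn(16, 256, 56, 56)
maps = torch.randn(16, 17, 56, 56)
pooled = heatmap_pool(feats, maps)
print(pooled.shape)  # torch.Size([16, 17, 256])
```

Compared with storing raw heatmap volumes or discrete pose coordinates, pooling of this kind yields a compact (T, K, C) representation, which is consistent with the abstract's claim that pooled features are concise and less redundant than raw heatmaps.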