Item type |
Symposium(1) |
Publication date |
2022-07-06 |
Title |
A study on estimating the accurate head IMU motion from Video |
Title language |
en |
Language |
eng |
Resource type |
conference paper |
Resource type identifier |
http://purl.org/coar/resource_type/c_5794 |
Author affiliation |
Kyushu University |
Kyushu University |
Future University Hakodate |
Kyushu University |
Kyushu University |
Author name |
MinYen, Lu
ChenHao, Chen
石田, 繁巳
中村, 優吾
荒川, 豊
|
Author name (English) |
Minyen, Lu
Chenhao, Chen
Shigemi, Ishida
Yugo, Nakamura
Yutaka, Arakawa
|
Abstract |
Description type |
Other |
Description |
Inertial measurement unit (IMU) data have been used in human activity recognition (HAR). In recent studies, deep learning on IMU data has attracted researchers' attention for its capability of automatic feature extraction and accurate prediction. However, the difficulty of collecting and labeling data discourages researchers from entering the field. IMUTube addresses this problem by building a pipeline that estimates virtual IMU data for body motion from YouTube videos. For head motion, several methods, such as OpenFace 2.0, can predict facial landmarks and calculate the head-facing angle from video. However, to our knowledge, no study has focused on estimating IMU data from human head motion. In our previous work, DisCaaS, we created the M3B dataset, which contains IMU and 360-degree video data from meetings. We use head motion data extraction models to predict participants' nodding and speaking gestures. To further improve the performance of nodding recognition, this paper examines the quality of the gyro data estimated by these existing head motion models. We investigate the difference between the motion data estimated from video and those measured by a 9-axis sensor in both the time domain and the frequency domain. Finally, we discuss future directions based on the results. |
Bibliographic information |
Proceedings of the Multimedia, Distributed, Cooperative, and Mobile Symposium 2022
vol. 2022,
pp. 918-923,
Issue date 2022-07-06
|
Publisher |
Information Processing Society of Japan (情報処理学会) |
Publisher name language |
ja |