Applications of Eye Tracking for Region of Interest HEVC Encoding
Sainio, Joose Valtteri
Permanent address of the item is
Katseenseurannan sovellukset mielenkiintoisen alueen HEVC-pakkaukselle
The increase in video streaming services and video resolutions has exploded the volume of Internet video traffic. New video coding standards, such as High Efficiency Video Coding (HEVC) have been developed to mitigate this inevitable video data explosion with better compression. The aim of video coding is to reduce the video size while maintaining the best possible perceived quality. Region of Interest (ROI) encoding particularly addresses this objective by focusing on the areas that humans would pay the most attention at and encode them with higher quality than the non-ROI areas. Methods for finding the ROI, and video encoding in general, take advantage of the Human Visual System (HVS). Computational HVS models can be used for the ROI detection but all current state-of-the-art models are designed for still images. Eye tracking data can be used for creating and verifying these models, including models suitable for video, which in turn calls for a reliable way to collect eye tracking data. Eye tracking glasses allow the widest range of possible scenarios out of all eye tracking equipment. Therefore, the glasses are used in this work to collect eye tracking data from 41 different videos. The main contribution of this work is to present a real-time system using eye tracking data to enhance the perceived quality of the video. The proposed system makes use of video recorded from the scene camera of the eye tracking glasses and Kvazaar open-source HEVC encoder for video compression. The system was shown to provide better subjective quality over the native rate control algorithm of Kvazaar. The obtained results were evaluated with Eye tracking Weighted PSNR (EWPSNR) that represents the HVS better than traditional PSNR. The system is shown to achieve up to 33% bit rate reduction for the same EWPSNR and on average 5-10% reduction depending on the parameter set. Additionally, the encoding time is improved by 8-20%.