Publication Details
Hongyue Huang



In recent years, video services have found applications in many domains, such as the entertainment industry, telemedicine, and online education. The consumption of video content is constantly increasing, driven by consumers' need to store and transmit more information. Advances in optical sensor technology now make it possible to produce high-end cameras with increased image resolution for capturing and creating new content. The rapidly growing user demand and ever higher resolutions raise new challenges for the development of efficient coding solutions for storing or streaming digital content, and call for new image and video coding methods with improved performance. The current High Efficiency Video Coding (H.265/HEVC) standard was ratified by the Joint Collaborative Team on Video Coding (JCT-VC) in 2013 and achieves up to 50% bitrate savings compared with the prior Advanced Video Coding (H.264/AVC) standard at the same video quality. Over the years, new video coding solutions with improved performance have been proposed by modifying the HEVC framework and introducing more efficient coding modules. Meanwhile, the rapid development and success of machine learning (ML) techniques have attracted a lot of attention. Machines are now able to process two-dimensional (2D) images with the help of deep-learning-based (DL-based) tools built using Convolutional Neural Networks (CNNs). In the compression domain, researchers have been exploring the potential of DL-based solutions by extending conventional image and video coding frameworks with ML techniques so that the coding system yields better compression performance. In this doctoral thesis, we propose novel coding solutions that integrate more efficient DL-based tools into the HEVC framework. The proposed DL-based methods provide improved performance, achieving lower bitrates and higher video quality.
The first research direction of this thesis is to reduce the compression bitrate by introducing novel DL-based intra-prediction tools. The proposed CNN-based tools use a much wider causal neighbourhood than the traditional HEVC intra-prediction method to compute the prediction of the current block. For each angular intra-prediction mode, a CNN-based prediction replaces the conventional HEVC prediction. The experiments show that the proposed lossless video coding systems provide up to 5.1% bitrate savings compared with the HEVC standard. We also devised a low-complexity architecture that has fewer parameters and a shorter runtime than the original architecture while delivering nearly the same coding performance.
The second research direction of this thesis is to enhance the quality of HEVC-decoded video sequences and light field images (LFIs) by proposing novel DL-based filtering tools. We introduce a novel CNN architecture that takes frame-size input patches and refines the details of reconstructed frames with the help of the global context. A novel mode map is devised to provide the neural network with additional information about the frame. The experimental results show that the proposed CNN-based filter achieves 11.1% BD-rate savings when filtering HEVC intra-coded frames and outperforms state-of-the-art DL-based methods. Moreover, the experiments show that the enhanced intra-coded frames also help inter-coded frames gain bitrate savings and quality improvements. Furthermore, we extended our work to LFI coding and i