Yongmin Li    PhD, MEng, BEng
Click on the links below or scroll down for details.
  1. Medical Imaging
  2. AI for Business Intelligence
  3. Road Sign Recognition
  4. Foveated Ray Tracing
  5. Set-Membership Filtering
  6. Reconstructing Background of DNA Microarray Imagery
  7. Dynamic Face Models: Construction and Applications
  8. Semantic Video Analysis
  9. Incremental and Robust Subspace Learning
  10. Robust Panoramic Scene Construction from MPEG Video

  1. Medical Imaging

    Medical imaging seeks to reveal the structures and activities inside the human body that are normally invisible behind other body parts such as skin and bones. With a wide range of imaging technologies such as radiography, magnetic resonance imaging, ultrasound, echocardiography and tomography, it has long been of great importance in clinical practice for the measurement, identification, location and structural analysis of targeted organs, tissues or areas of abnormality. Over the past decade, we have investigated various categories of technologies, from statistical methods such as Expectation-Maximization and probabilistic modelling, Markov Random Field, Graph Cut and Level Set, to deep learning methods such as Convolutional Neural Networks (CNN), UNet, Res-UNet and nnUNet. We have won a number of academic awards, including Best Paper Awards at Bioimaging and HIS, 1st place in the Retinal OCT Fluid Challenge (RETOUCH) online competition at MICCAI 2023, and 2nd place in the Fetal Tissue Annotation and Segmentation Challenge (FeTA) at MICCAI 2022.
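
    As an illustration of the residual components behind Res-UNet-style architectures, the sketch below shows a minimal residual convolution block in PyTorch. This is a generic teaching example, not code from our published models; the class name, channel sizes and input shape are assumptions for illustration only.

      import torch
      import torch.nn as nn

      class ResidualBlock(nn.Module):
          """Two 3x3 convolutions with a skip connection, the kind of block
          typically inserted into a UNet encoder/decoder stage."""
          def __init__(self, in_ch, out_ch):
              super().__init__()
              self.body = nn.Sequential(
                  nn.Conv2d(in_ch, out_ch, 3, padding=1),
                  nn.BatchNorm2d(out_ch),
                  nn.ReLU(inplace=True),
                  nn.Conv2d(out_ch, out_ch, 3, padding=1),
                  nn.BatchNorm2d(out_ch))
              # a 1x1 convolution matches channel counts on the skip path
              self.skip = nn.Identity() if in_ch == out_ch else nn.Conv2d(in_ch, out_ch, 1)
              self.act = nn.ReLU(inplace=True)

          def forward(self, x):
              return self.act(self.body(x) + self.skip(x))

      # e.g. one encoder stage applied to a single-channel OCT slice
      y = ResidualBlock(1, 32)(torch.randn(1, 1, 128, 128))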

    References:
    • Ndipenoch, N., Miron, A., & Li, Y. (2024). Performance Evaluation of Retinal OCT Fluid Segmentation, Detection, and Generalization Over Variations of Data Sources. IEEE Access, 12, 31719-31735. 1st place, the Retinal OCT Fluid Challenge (RETOUCH) online competition, MICCAI 2023 (link).
    • McConnell, N., Ndipenoch, N., Cao, Y., Miron, A., & Li, Y. (2023). Exploring Advanced Architectural Variations of nnUNet. Neurocomputing. 2nd place, the Fetal Tissue Annotation and Segmentation Challenge (FeTA), MICCAI 2022 (link).
    • McConnell, N., Miron, A., Wang, Z., & Li, Y. Integrating Residual, Dense, and Inception Blocks into the nnUNet. In IEEE 35th International Symposium on Computer Based Medical Systems. 2022.
    • Ndipenoch, N., Miron, A., & Li, Y. (2022). Simultaneous Segmentation of Layers and Fluids in Retinal OCT Images. In IEEE Conference on Image and Signal Processing, BioMedical Engineering and Informatics. 2022.
    • Wang, C., Li, Y. Blood Vessel Segmentation from Retinal Images. In The 20th IEEE International Conference on BioInformatics And BioEngineering. 2020.
    • Dodo, B., Li, Y., Kaba, D., & Liu, X. (2019). Retinal Layer Segmentation in Optical Coherence Tomography Images. IEEE Access, Vol 7, pp. 152388-152398.
    • Dodo, B., Li, Y., Tucker, A., & Liu, X. Retinal OCT Segmentation Using Fuzzy Region Competition and Level Set Methods. In IEEE International Symposium on Computer-Based Medical Systems. Andalucia, Spain, 2019.
    • Dodo, B., Li, Y., Liu, X., & Dodo M. Level Set Segmentation of Retinal OCT Images. In International Conference on Bioimaging. Czech Republic, 2019.
    • Dodo, B., Li, Y., Eltayef, K., & Liu, X. Graph-Cut Segmentation of Retinal Layers from OCT Images. In International Conference on Bioimaging. Portugal, 2018. Best Student Paper Award.
    • Wang, C., Wang, Y., & Li, Y. Automatic Choroidal Layer Segmentation Using Markov Random Field And Level Set Method. IEEE Journal of Biomedical and Health Informatics, volume 21, issue 6, pages 1694-1702, 2017.
    • Djibril Kaba, Yaxing Wang, Chuang Wang, Xiaohui Liu, Haogang Zhu, Ana G. Salazar-Gonzalez and Yongmin Li. Retina Layer Segmentation Using Kernel Graph Cuts and Continuous Max-Flow. Optics Express, volume 23, issue 6, pages 7366-7384, 2015.
    • Ana Salazar-Gonzalez, Djibril Kaba, Yongmin Li and Xiaohui Liu. Segmentation of blood vessels and optic disc in retinal images. IEEE Journal of Biomedical and Health Informatics, volume 18, number 6, page 1874-1886, 2014.
    • A. Salazar-Gonzalez, D. Kaba and Y. Li. MRF Reconstruction of Retinal Images for the Optic Disc Segmentation. In Proc. International Conference on Health Information Science (HIS 2012), Beijing, China, 2012. Best Paper Award.
    • A. Salazar-Gonzalez, Y. Li and X. Liu. Optic Disc Segmentation by Incorporating Blood Vessel Compensation. In Proc. International Workshop on Computational Intelligence in Medical Imaging, Paris, France, 2011.
    • A. Salazar-Gonzalez, Y. Li and X. Liu. Retinal blood vessel segmentation via graph cut. In Proc. International Conference on Control, Automation, Robotics and Vision, Singapore, December 2010.

  2. AI for Business Intelligence

    Business operations involve a large amount of data from internal communications, presentations, reports, accounting and marketing, as well as external information from governments, industry sectors, customers, suppliers, competitors etc. These data are often available in unstructured or semi-structured formats, which makes data analytics a challenging task. In this project, we aim to develop novel methods using modern artificial intelligence (AI) and natural language processing (NLP) technologies to support decision making for both business operations and strategic planning.

    For example, Virtual Try-On, a process where realistic image and video content is automatically generated from a user’s online profile (preferences, browsing history, shared pictures etc.) by embedding compatible items (e.g. clothes, hats, sunglasses) into the original pictures of the user, can be useful for personalised marketing (funded by the EPSRC).

    In another example, we have delivered the UK's first commercial Capital Allowance Assessment system, where accounting and contract data are automatically collated and categorised to generate tax claim reports that comply with complex regulations (funded by Innovate UK).

    References:
    • T. Islam, A. Miron, X. Liu, & Y. Li. FashionFlow: Leveraging Diffusion Models for Dynamic Fashion Video Synthesis from Static Imagery. arXiv preprint arXiv:2310.00106, 2023.
    • T. Islam, A. Miron, X. Liu, & Y. Li. Image-Based Virtual Try-On: Fidelity and Simplification, 2023.
    • T. Islam, A. Miron, X. Liu, & Y. Li. SVTON: Simplified Virtual Try-On. In International Conference on Machine learning and Applications. 2022.
    • Islam, T., Miron, A., Liu, X., & Li, Y. (2024). Deep Learning in Virtual Try-On: A Comprehensive Survey. IEEE Access.

  3. Road Sign Recognition

    Road sign recognition is of great interest in Intelligent Transportation and Autonomous Vehicles. However, this problem is non-trivial owing to limitations such as the low resolution of input video, poor lighting conditions, cluttered backgrounds, and changing scales/views while vehicles are moving. A comprehensive approach to online detection and recognition of traffic signs is presented in this work. The process comprises the typical three stages of detection, tracking and recognition, as commonly used in many object recognition problems. At the detection stage, a quad-tree operation is performed first on the densities of sign-specific colour gradients in order to locate the regions of interest. A regular polygon detector or a boosted classifier cascade is then used to detect the possible signs. We have also developed a Confidence-Weighted Mean Shift algorithm to refine the often redundant detection results. For the recognition, we have developed and evaluated several different approaches, including the model-based method of class-specific discriminative features and the data-driven methods of SimBoost and similarity-learning kernel regression trees. We have also demonstrated that these methods can be used for other general object recognition problems.
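
    To illustrate the idea of confidence-weighted mode seeking for merging redundant detections, here is a minimal NumPy sketch: a plain mean shift over detection centres with confidence-scaled kernel weights. The bandwidth, grouping threshold and data are assumed values for illustration, not the exact algorithm published in this work.

      import numpy as np

      def confidence_weighted_mean_shift(centres, conf, bandwidth=20.0, n_iter=30):
          """Shift each detection centre towards a local mode of the detection
          density, weighting neighbours by their confidence, then group the
          centres that converge to the same mode."""
          pts = centres.astype(float).copy()
          for _ in range(n_iter):
              d2 = ((pts[:, None, :] - centres[None, :, :]) ** 2).sum(-1)
              w = conf[None, :] * np.exp(-d2 / (2.0 * bandwidth ** 2))
              pts = (w @ centres) / w.sum(axis=1, keepdims=True)
          modes, labels = [], np.zeros(len(pts), dtype=int)
          for i, p in enumerate(pts):
              for j, m in enumerate(modes):
                  if np.linalg.norm(p - m) < bandwidth / 2.0:
                      labels[i] = j
                      break
              else:
                  modes.append(p)
                  labels[i] = len(modes) - 1
          return np.array(modes), labels

      # three overlapping detections of one sign plus one spurious detection
      boxes = np.array([[100, 60], [104, 63], [98, 58], [300, 200]], float)
      scores = np.array([0.9, 0.8, 0.7, 0.3])
      merged, groups = confidence_weighted_mean_shift(boxes, scores)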

    References:
  4. Foveated Real-Time Ray Tracing

    Ray tracing is capable of generating highly realistic images but struggles to achieve real-time performance. We have presented an approach that significantly improves the real-time performance of ray tracing. This is done by combining foveated rendering based on eye tracking with reprojection rendering using previous frames, in order to drastically reduce the number of new image samples per frame. To reproject samples, a coarse geometry is reconstructed from a G-Buffer. Possible errors introduced by this reprojection, as well as parts that are critical to perception, are scheduled for resampling. Additionally, a coarse colour buffer is used to provide an initial image, refined smoothly by more samples where needed.
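
    The sample-scheduling idea can be sketched as a per-pixel decision of whether to cast new rays this frame: dense sampling near the gaze point, a low probabilistic rate in the periphery, and forced resampling where reprojection failed. The NumPy fragment below is a simplified illustration with assumed fall-off and field-of-view parameters, not the exact scheduler used in our renderer.

      import numpy as np

      def new_sample_mask(height, width, gaze, disoccluded,
                          fov_deg=100.0, fovea_deg=5.0, base_rate=0.02, seed=0):
          """Boolean mask of pixels that receive new primary rays this frame."""
          rng = np.random.default_rng(seed)
          ys, xs = np.mgrid[0:height, 0:width]
          px_per_deg = width / fov_deg                       # crude angular scale
          ecc = np.hypot(xs - gaze[0], ys - gaze[1]) / px_per_deg
          # sampling probability falls off with eccentricity from the gaze point
          prob = np.clip(np.exp(-(ecc / fovea_deg) ** 2) + base_rate, 0.0, 1.0)
          mask = rng.random((height, width)) < prob
          return mask | disoccluded                          # always fix reprojection errors

      mask = new_sample_mask(720, 1280, gaze=(640, 360),
                             disoccluded=np.zeros((720, 1280), dtype=bool))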

    Evaluations and user tests show that our method achieves real-time frame rates, while visual differences compared to fully rendered images are hardly perceivable. This method is well-suited for wide-FOV Head-Mounted Displays with eye tracking and can be used for many Virtual Reality applications.

    References:
  5. Set-Membership Filtering, EPSRC EP/C007654/1

    In many real-world scenarios, tracking non-rigid objects (e.g. human faces) from video or image sequences, or in other words, estimating the ideal hidden state of a dynamic system from noisy visual observations, is a non-trivial task. Owing to the severe non-linearity arising from the intrinsic characteristics of the objects themselves, their dynamics, and external complications from the measurement environment, the traditional techniques (e.g. the Kalman Filter, the Extended Kalman Filter and the Unscented Kalman Filter) are inappropriate for this problem. Other methods such as Particle Filtering suffer from intensive computation and model degeneration. To overcome these limitations, we have developed methods of Set-Membership Filtering, where the state estimate is guaranteed to lie within a set specified by an ellipsoidal bound on the state vector. Recursive algorithms are also developed for fast computation.
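
    As a flavour of how an ellipsoidal state bound is propagated, the sketch below implements a standard trace-minimising time update for a linear system with bounded process noise. It is a textbook-style outer bound written for illustration, not necessarily the recursion derived in our papers, and the system matrices are made-up examples.

      import numpy as np

      def sm_predict(x_hat, P, A, Q):
          """Time update of an ellipsoidal set-membership filter.
          State set: { x : (x - x_hat)^T P^{-1} (x - x_hat) <= 1 },
          process noise bounded by an ellipsoid with shape matrix Q.
          The Minkowski sum of the two ellipsoids is outer-bounded by a single
          ellipsoid; beta below minimises the trace of its shape matrix."""
          APAt = A @ P @ A.T
          beta = np.sqrt(np.trace(Q)) / (np.sqrt(np.trace(APAt)) + np.sqrt(np.trace(Q)))
          return A @ x_hat, APAt / (1.0 - beta) + Q / beta

      # toy constant-velocity system; the measurement update (intersecting the
      # predicted ellipsoid with the observation-consistent set) is omitted here
      A = np.array([[1.0, 1.0], [0.0, 1.0]])
      Q = np.diag([0.01, 0.04])
      x, P = np.zeros(2), np.eye(2)
      x, P = sm_predict(x, P, A, Q)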

    References:
  6. Reconstructing Background of DNA Microarray Imagery, EPSRC EP/C524586/1

    DNA microarray technology has enabled biologists to study all the genes within an entire organism, obtaining a global view of gene interaction and regulation. However, the technology is still early in its development, and errors may be introduced at each of the main stages of the microarray process: spotting, hybridisation and scanning. Consequently, the microarray image data collected often contain errors and noise, which are then propagated through all later stages of processing and analysis. To realise the potential of the technology it is therefore crucial to obtain high-quality image data that indeed reflect the underlying biology in the samples. If this is not achieved, many genes with subtle, low-level expression, which are often of biological significance, will not be analysed. Although there has recently been much research on how to detect and eliminate these variations and errors, progress has been slow. We have initiated research to develop a novel way of processing microarray image data by reconstructing the background noise of the microarray chip, and this has shown much early promise in extracting high-quality cDNA image data. Instead of using the standard approach of correcting anomalies in the signal, we focus on estimating the noise as accurately as possible, to the extent that we almost ignore the signal until the last stage of processing. The project brings together expertise from the disparate fields of image processing, data mining and molecular biology in an interdisciplinary attempt to advance the state of the art in this important area. It is particularly timely since there is an urgent need for image analysis software that can save both time and labour as well as provide high-quality image data.
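
    As a simple illustration of background-first processing (not the method developed in this project), the sketch below estimates a smooth background for a microarray image with a large-window median filter, which suppresses the small bright spots, and then subtracts it from the raw intensities. The window size and synthetic data are assumptions for illustration.

      import numpy as np
      from scipy.ndimage import median_filter

      def estimate_background(image, window=31):
          """Large-window median filter as a crude background estimate;
          the spots are much smaller than the window, so they are removed."""
          background = median_filter(image.astype(float), size=window)
          corrected = np.clip(image - background, 0.0, None)
          return background, corrected

      img = np.random.gamma(2.0, 50.0, size=(256, 256))   # synthetic slide intensities
      bg, spots = estimate_background(img)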

    References:
  7. Dynamic Face Models: Construction and Applications

    A comprehensive framework for face detection, head pose estimation, tracking, and recognition is presented in this work. Statistical learning methods and, in particular, Support Vector Machines (SVMs) are applied to multi-view face detection and 3D head pose estimation. A dynamic multi-view face model is designed to extract the identity and geometrical information of moving faces from video inputs. Kernel Discriminant Analysis (KDA), a non-linear method to maximise the between-class variance and minimise the within-class variance, is proposed to represent face patterns. The facial identity structures across views and over time, referred to as Identity Surfaces, are constructed for face recognition.
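
    For readers unfamiliar with KDA, the fragment below sketches a two-class kernel Fisher discriminant in NumPy: it finds a projection direction, expressed through coefficients alpha over the training samples, that maximises between-class and minimises within-class variance in a kernel-induced feature space. The kernel choice, regularisation and data are illustrative assumptions, not the configuration used in our face models.

      import numpy as np

      def rbf_kernel(X, Y, gamma=0.5):
          d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
          return np.exp(-gamma * d2)

      def kfd_fit(X, y, gamma=0.5, reg=1e-3):
          """Two-class kernel Fisher discriminant: returns coefficients alpha
          so that the projection of a point x is sum_j alpha_j k(x_j, x)."""
          K = rbf_kernel(X, X, gamma)
          n = len(y)
          idx0, idx1 = np.where(y == 0)[0], np.where(y == 1)[0]
          M0, M1 = K[:, idx0].mean(axis=1), K[:, idx1].mean(axis=1)
          N = np.zeros((n, n))
          for idx in (idx0, idx1):                  # within-class scatter in kernel space
              Ki, l = K[:, idx], len(idx)
              H = np.eye(l) - np.ones((l, l)) / l   # centring matrix
              N += Ki @ H @ Ki.T
          return np.linalg.solve(N + reg * np.eye(n), M1 - M0)

      def kfd_project(X_train, X_new, alpha, gamma=0.5):
          return rbf_kernel(X_new, X_train, gamma) @ alpha

      X = np.vstack([np.random.randn(30, 2), np.random.randn(30, 2) + 3.0])
      y = np.array([0] * 30 + [1] * 30)
      alpha = kfd_fit(X, y)
      scores = kfd_project(X, X, alpha)             # 1-D discriminative projection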

    References:
  8. Semantic Video Analysis

    With the rapidly growing mass of video data from media services, the Internet and home digital cameras/camcorders, automatic methods for semantic video analysis become essential for the parsing, indexing, retrieval and summarisation of these data. This task can take various forms depending on the granularity of semantics and the application scenario, e.g. from video genre classification and scene segmentation to specific event/person detection and behaviour analysis. The aim of this research is to develop a multi-layered framework for semantic analysis of raw video. (1) At the lowest level, acoustic and visual features, e.g. the mel-frequency cepstral coefficients (MFCC), colour, texture, shape and motion, are integrated together to provide a fundamental description of the video content. (2) At the intermediate layer, the low-level audio/visual features are processed and integrated to form so-called "atomic semantic features", which seek to represent semantic concepts over a minimal temporal period. (3) User-oriented semantic analysis is performed at a higher level using statistical models such as the Bayesian Network and the Hidden Markov Model.
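
    To make step (3) concrete, the fragment below runs the forward algorithm of a tiny discrete Hidden Markov Model over a sequence of "atomic semantic features": hidden states stand for higher-level semantic concepts, observations for the intermediate features. All numbers are made up for illustration; they are not parameters from our system.

      import numpy as np

      A = np.array([[0.9, 0.1],          # state transition probabilities
                    [0.2, 0.8]])
      B = np.array([[0.7, 0.2, 0.1],     # emission probabilities per state
                    [0.1, 0.3, 0.6]])
      pi = np.array([0.5, 0.5])          # initial state distribution

      def sequence_likelihood(obs):
          """Forward algorithm: probability of an observation sequence under the HMM."""
          alpha = pi * B[:, obs[0]]
          for o in obs[1:]:
              alpha = (alpha @ A) * B[:, o]
          return alpha.sum()

      print(sequence_likelihood([0, 0, 1, 2, 2]))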

    References:
  9. Incremental and Robust Subspace Learning

    Principal Component Analysis (PCA) has been of great interest in computer vision and pattern recognition. In particular, incrementally learning a PCA model, which is computationally efficient for large-scale problems as well as adaptable to the changing state of a dynamic system, is an attractive research topic with numerous applications such as adaptive background modelling and active object recognition. In addition, the conventional PCA, in the sense of least mean squared error minimisation, is susceptible to outlying measurements. Unfortunately, these two issues have only been addressed separately in previous studies. In this work, we have presented a novel algorithm for incremental PCA and then extended it to robust PCA. In contrast to most previous studies, where robust PCA is solved by intensive iterative algorithms, we use the current PCA model at each updating step to evaluate the likelihood that an element of a new observation is an outlier, so that the robust analysis is efficiently embedded in the incremental updating framework. This is the key idea of our algorithm. Compared with previous studies on robust PCA, our algorithm is computationally more efficient. We have applied this method to dynamic background modelling and multi-view face modelling, and obtained very encouraging results.
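
    The key idea can be sketched as follows: before each incremental update, reconstruct the new observation with the current subspace, flag elements with unusually large residuals as outliers and replace them by their reconstruction, then perform an ordinary incremental update. The NumPy code below uses a simple CCIPCA-style component update and made-up thresholds for illustration; it is in the spirit of the approach rather than the exact published algorithm.

      import numpy as np

      def robust_incremental_update(mean, comps, x, n, k_sigma=2.5):
          """One observation x updates the mean and the (unnormalised) components."""
          if n > 0:                                   # robust step: suppress outlying elements
              xc = x - mean
              basis = comps / (np.linalg.norm(comps, axis=1, keepdims=True) + 1e-12)
              recon = basis.T @ (basis @ xc)
              resid = xc - recon
              outlier = np.abs(resid) > k_sigma * (resid.std() + 1e-12)
              x = np.where(outlier, mean + recon, x)  # replace outliers by the model's prediction
          mean = mean + (x - mean) / (n + 1)          # incremental mean
          u = x - mean
          eta = 1.0 / (n + 1)                         # learning rate
          for i in range(comps.shape[0]):             # CCIPCA-style update with deflation
              v = comps[i] / (np.linalg.norm(comps[i]) + 1e-12)
              comps[i] = (1 - eta) * comps[i] + eta * (u @ v) * u
              v = comps[i] / (np.linalg.norm(comps[i]) + 1e-12)
              u = u - (u @ v) * v
          return mean, comps, n + 1

      d, k = 20, 3
      mean, comps, n = np.zeros(d), 1e-3 * np.random.randn(k, d), 0
      for x in np.random.randn(500, d):
          mean, comps, n = robust_incremental_update(mean, comps, x, n)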

    References:
  10. Robust Panoramic Scene Construction from MPEG Video

    Coarse macroblock motion vectors can be extracted from MPEG video with minimal decompression. With a reasonable MPEG encoder, most of the motion vectors usually reflect the real motion in a video scene, although they are coded for compression purposes. Based on this observation, we have developed a method of image mosaicking from MPEG video. The main idea is that global motion estimation from MPEG motion vectors can be formulated as a robust parameter estimation problem which treats the "good" motion vectors as inliers and the "bad" ones as outliers. The bi-directional motion information in B-frames provides multiple routes to warp a frame to its previous anchor frame. A Least Median of Squares based algorithm is adopted for robust motion estimation. In the case of a large proportion of outliers, we detect possible algorithm failure and then perform re-estimation along a different route or interpolate the transform from neighbouring frames. We have also developed a simplified method for constructing a static background panorama and a dynamic foreground panorama.
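
    The robust estimation step can be illustrated with a Least Median of Squares fit of a single global translation to the macroblock motion vectors: candidate motions are drawn from minimal samples, scored by the median of squared residuals over all vectors, and the winner is refined over its inliers. This is a simplified sketch with assumed thresholds and synthetic data; the actual system estimates a richer transform and handles B-frame routes as described above.

      import numpy as np

      def lmeds_translation(vectors, n_trials=200, seed=0):
          """Robust global 2-D translation from motion vectors via LMedS."""
          rng = np.random.default_rng(seed)
          best_t, best_med = None, np.inf
          for _ in range(n_trials):
              cand = vectors[rng.integers(len(vectors))]        # minimal sample: one vector
              med = np.median(((vectors - cand) ** 2).sum(axis=1))
              if med < best_med:
                  best_med, best_t = med, cand
          scale = 1.4826 * np.sqrt(best_med) + 1e-12            # robust scale estimate
          inliers = ((vectors - best_t) ** 2).sum(axis=1) <= (2.5 * scale) ** 2
          return vectors[inliers].mean(axis=0), inliers

      # 80% "good" background vectors around (3, -1) plus 20% outliers
      mv = np.vstack([np.random.normal([3.0, -1.0], 0.3, (80, 2)),
                      np.random.uniform(-10, 10, (20, 2))])
      t, inl = lmeds_translation(mv)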

    References:
    • Y. Li, L-Q. Xu, G. Morrison, C. Nightingale and J. Morphett. Robust panorama from MPEG video. In Proc. IEEE International Conference on Multimedia and Expo (ICME2003), Baltimore, USA, July 2003.