Authors:
(1) Vinicius Yu Okubo received his B.S. in electrical engineering from the University of São Paulo in 2022 and is currently pursuing his M.S. in electrical engineering at the University of São Paulo;
(2) Kotaro Shimizu received his B.S. degree in Physics from Waseda University, Japan, in 2019 and his M.S. degree in Physics from the University of Tokyo, Japan, in 2021. He has been pursuing his Ph.D. in Physics as a JSPS research fellow for young scientists at the University of Tokyo since 2021;
(3) B.S. Shivaram received his B.S. degree in Physics, Chemistry and Mathematics from Bangalore University, India, in 1977, his M.S. degree in Physics from the Indian Institute of Technology, Madras, India, in 1979, and his Ph.D. in experimental condensed matter physics from Northwestern University, Evanston, Illinois, in 1984;
(4) Hae Yong Kim received the B.S. and M.S. degrees (with distinctions) in computer science and the Ph.D. degree in electrical engineering from the Universidade de São Paulo (USP), Brazil, in 1988, 1992 and 1997, respectively.
II. RELATED WORKS
A. JUNCTIONS AND TERMINALS DETECTION
Junctions and terminals are shapes whose relevance extends beyond materials. Within computer vision, they have been detected and counted in diverse contexts, such as natural landscapes [20], biology [21] and handwriting images [22].
B. CLASSICAL METHODS FOR JUNCTIONS AND TERMINALS DETECTION
One prevalent approach for detecting junctions and terminals uses skeletonization as a pre-processing step. This process reduces the structures in the image to one-pixel-wide lines. Points in the skeleton can then be identified as terminals, junctions or crossings based on their neighboring pixels. This technique has been applied to vascular images [23] and to the analysis of handwritten Chinese characters [24].
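As a rough illustration of classifying skeleton points from their neighboring pixels, the sketch below uses the crossing-number rule, a common choice that is not necessarily the rule used in [23] or [24]. It assumes the skeleton has already been computed and is given as a binary grid (list of lists of 0/1); the function names and the T-shaped example are hypothetical.

```python
# Offsets of the 8 neighbours, clockwise, starting from the pixel above.
RING = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]


def crossing_number(skel, r, c):
    """Count background-to-line transitions in one walk around pixel (r, c)."""
    h, w = len(skel), len(skel[0])
    ring = []
    for dr, dc in RING:
        rr, cc = r + dr, c + dc
        inside = 0 <= rr < h and 0 <= cc < w
        ring.append(skel[rr][cc] if inside else 0)  # outside counts as background
    return sum(1 for i in range(8) if ring[i] == 0 and ring[(i + 1) % 8] == 1)


def classify_skeleton(skel):
    """Label skeleton pixels by the number of branches leaving them."""
    labels = {"terminal": [], "junction": [], "crossing": []}
    for r, row in enumerate(skel):
        for c, value in enumerate(row):
            if not value:
                continue
            branches = crossing_number(skel, r, c)
            if branches == 1:
                labels["terminal"].append((r, c))
            elif branches == 3:
                labels["junction"].append((r, c))
            elif branches >= 4:
                labels["crossing"].append((r, c))
    return labels


# Hypothetical T-shaped skeleton: three terminals and one junction.
t_shape = [
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
print(classify_skeleton(t_shape))
```

On real skeletons, a single junction often yields a small cluster of flagged pixels, so a merging step is usually needed afterwards; that step is omitted here.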
Pre-processing techniques using contour information have also been explored for junction detection. Lee and Wu [25] investigated stroke extraction in Chinese characters. Their method segments regions according to their contour, identifying junctions by counting neighboring regions. Maire et al. [26] proposed a junction detector in natural images by locating intersecting contours. This approach applies an expectation–maximization style algorithm to iteratively select relevant contours and suggest the junction’s position.
However, both skeletonization and contour finding are sensitive to noise, and any error in the pre-processing step propagates to the detection.
Junctions and terminals can also be identified by analyzing the arrangement of linear structures within the image. Su et al. [27] describe a technique for identifying these linear structures using the Hessian matrix. This approach has been validated on biological images such as blood vessels, neurites and tree branches. Xia et al. [20] present a junction detection method for natural images based on the amplitudes and phases of the normalized image gradients.
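To make the role of the Hessian matrix concrete, the following sketch tests whether a pixel lies on a bright line by inspecting the two eigenvalues of the local Hessian. It is a generic illustration, not the method of Su et al. [27]: the finite-difference stencil, the function names and the `strength` and `ratio` thresholds are all assumptions, and a practical detector would first smooth the image with a Gaussian at a chosen scale.

```python
import math


def hessian_eigenvalues(img, r, c):
    """Eigenvalues (smaller, larger) of the 2x2 Hessian at an interior pixel."""
    # Second derivatives from central finite differences.
    ixx = img[r][c + 1] - 2 * img[r][c] + img[r][c - 1]
    iyy = img[r + 1][c] - 2 * img[r][c] + img[r - 1][c]
    ixy = (img[r + 1][c + 1] - img[r + 1][c - 1]
           - img[r - 1][c + 1] + img[r - 1][c - 1]) / 4.0
    # Closed-form eigenvalues of the symmetric matrix [[ixx, ixy], [ixy, iyy]].
    mean = (ixx + iyy) / 2.0
    root = math.sqrt(((ixx - iyy) / 2.0) ** 2 + ixy ** 2)
    return mean - root, mean + root


def is_line_like(img, r, c, strength=0.5, ratio=0.25):
    """True when the pixel looks like part of a bright line on a dark background.

    Across a bright line the intensity curves strongly downwards (one large
    negative eigenvalue); along the line it is nearly flat (the other
    eigenvalue is close to zero). The thresholds are illustrative only.
    """
    low, high = hessian_eigenvalues(img, r, c)
    return low < -strength and abs(high) <= ratio * abs(low)


# Hypothetical 5x5 image containing a vertical bright line in column 2.
line_img = [[0, 0, 1, 0, 0] for _ in range(5)]
print(is_line_like(line_img, 2, 2), is_line_like(line_img, 2, 1))
```

An isolated bright spot has two strongly negative eigenvalues and is rejected by the ratio test, which is what distinguishes line-like from blob-like structures in this kind of analysis.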
Template-based approaches quantify the similarity between the appearance of image regions and a template. Deriche and Blaszka [28] formulated this as an energy minimization problem, in which the energy is the deviation between the image and a predetermined model. This enabled the detection of key image features, such as edges, corners and terminals.
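The idea of treating template matching as energy minimization can be sketched as follows. This is a deliberately simplified stand-in: it slides a fixed template over the image and uses the sum of squared differences as the energy, whereas Deriche and Blaszka [28] work with a predetermined model whose exact form is not given here. The function names and the toy data are hypothetical.

```python
def energy(img, template, r, c):
    """Sum of squared deviations between the template and the patch at (r, c)."""
    return sum(
        (img[r + i][c + j] - template[i][j]) ** 2
        for i in range(len(template))
        for j in range(len(template[0]))
    )


def best_match(img, template):
    """Return ((row, col), energy) of the placement with the lowest energy."""
    th, tw = len(template), len(template[0])
    candidates = [
        (energy(img, template, r, c), r, c)
        for r in range(len(img) - th + 1)
        for c in range(len(img[0]) - tw + 1)
    ]
    e, r, c = min(candidates)  # ties resolve to the first position in row-major order
    return (r, c), e


# Hypothetical corner-shaped template hidden in a small image.
corner = [[1, 0],
          [1, 1]]
image = [
    [0, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]
print(best_match(image, corner))
```

An energy of zero means the patch reproduces the template exactly; in practice a threshold on the minimum energy decides whether a feature is reported at all.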
C. DEEP LEARNING METHODS FOR JUNCTIONS AND TERMINALS DETECTION
Owing to the success of R-CNN based detection techniques, Pratt et al. [21] developed a pipeline for identifying junctions and crossings in retinal vascular structures. Their method involves two main steps: first, detection regions centred along the blood vessels are proposed; these regions are then classified as junctions, crossings or background. To generate the detection regions, their approach requires a binary segmented version of the exam. These images undergo a skeletonization process, with the resulting points serving as references for the centers of the blood vessels.
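The two-step structure described above (propose regions along the vessel centers, then classify each region) can be outlined as below. This is only a schematic under stated assumptions, not the pipeline of Pratt et al. [21]: the box size, the stride, and the `classify` callable, which stands in for a trained network, are all hypothetical.

```python
def propose_boxes(skeleton_points, height, width, size=21, stride=5):
    """Square boxes centred on every `stride`-th skeleton point, clipped to the image.

    Boxes are (top, left, bottom, right) with exclusive bottom/right bounds.
    """
    half = size // 2
    boxes = []
    for r, c in skeleton_points[::stride]:
        top, left = max(0, r - half), max(0, c - half)
        bottom, right = min(height, r + half + 1), min(width, c + half + 1)
        boxes.append((top, left, bottom, right))
    return boxes


def detect(skeleton_points, height, width, classify, size=21, stride=5):
    """Classify each proposed box and keep those not labelled 'background'.

    `classify` is a placeholder for a trained classifier: it receives a box
    and returns 'junction', 'crossing' or 'background'.
    """
    detections = []
    for box in propose_boxes(skeleton_points, height, width, size, stride):
        label = classify(box)
        if label != "background":
            detections.append((box, label))
    return detections


# Hypothetical usage with a dummy classifier that flags boxes starting at column 10.
points = [(10, 10), (10, 11), (10, 12)]
dummy = lambda box: "junction" if box[1] == 10 else "background"
print(detect(points, 20, 20, dummy, size=5, stride=2))
```

Because the proposals come from skeleton points, this design inherits the dependence on a binary segmentation noted in the text.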
Zhao et al. [29], addressing the same problem, proposed using a Mask R-CNN base model [30] for region proposal. This strategy enables inference without the need for a binary segmented version of the exam. During training, however, the segmented images are still used in the Mask R-CNN model to enhance its learning capabilities. This approach surpassed the performance of previous techniques in the detection of junctions and crossings in retinal vascular images.
This paper is available on arXiv under a CC BY 4.0 DEED license.