Region-based Fully Convolutional Network

Introduction to R-FCN R-FCN or Region-based Fully Convolutional Networks is a type of region-based object detector. Unlike previous object detectors where a costly per-region subnetwork is applied hundreds of times, R-FCN is a fully convolutional network, with almost all computation shared on the entire image. How R-FCN Works R-FCN achieves this by utilising position-sensitive score maps. These score maps are used to address a dilemma between translation-invariance in image classification an

RepPoints

RepPoints is a recent development in the field of object detection for computer vision. This representation uses a set of points to indicate the spatial extent of an object and semantically significant local areas, and it is learned via weak localization supervision from rectangular ground-truth boxes and implicit recognition feedback. This new representation allows for a more effective and efficient detection of objects compared to traditional bounding boxes. What are RepPoints? RepPoints ar

RetinaMask

RetinaMask is an object detection method that extends RetinaNet. It does so by adding instance mask prediction, making the loss function adaptive, and including more hard examples during training. The Concept of Object Detection Object detection is a key objective in the field of computer vision, which is the study of how computers can be made to interpret and understand images and videos. Object detect

RetinaNet-RS

RetinaNet-RS is an advanced object detection model that works by scaling up the input resolution from 512 to 768 and changing the ResNet backbone depth from 50 to 152. This model is an improvement upon the original RetinaNet. What is RetinaNet? RetinaNet is an object detection model that uses a one-stage approach to detect objects. In contrast to traditional two-stage models, RetinaNet uses a single neural network to generate object proposals and classify objects at the same time. This approa

RetinaNet

RetinaNet is a powerful object detection model that uses a focal loss function to address class imbalance during training. This one-stage detector is made up of a backbone network and two subnetworks that work together to detect objects in an image. What is RetinaNet? RetinaNet is an advanced object detection model that uses a single, unified network composed of a backbone network and two task-specific subnetworks. The backbone network is responsible for computing a convolutional feature map
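
The focal loss mentioned in the RetinaNet entry can be illustrated with a minimal sketch. It assumes the commonly cited binary form FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), with alpha = 0.25 and gamma = 2 as illustrative defaults; the function names are hypothetical and this is not RetinaNet's actual training code.

```python
import math


def cross_entropy(p, y):
    """Plain binary cross-entropy for one prediction (for comparison)."""
    p_t = p if y == 1 else 1.0 - p
    return -math.log(p_t)


def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss for a single prediction.

    p     : predicted probability of the positive class, 0 < p < 1
    y     : ground-truth label, 1 (object) or 0 (background)
    alpha : weight of the positive class (assumed default)
    gamma : focusing parameter; gamma = 0 gives alpha-weighted cross-entropy
    """
    # p_t is the probability the model assigned to the true class.
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t) ** gamma shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    for p in (0.1, 0.5, 0.9):
        print(p, cross_entropy(p, 1), focal_loss(p, 1))
```

Because the modulating factor approaches zero as p_t approaches 1, the many easy background examples contribute little to the total loss, which is how the loss counteracts class imbalance.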

RFB Net

Have you ever heard of RFB Net? It may sound like something out of a science fiction movie, but it's actually a type of object detector that uses a receptive field block module. This technology is used to identify objects in images or videos, and it's becoming increasingly popular in the world of computer vision. What is RFB Net? Simply put, RFB Net is an object detector that uses a specific type of module called a receptive field block to identify objects in an image or video. This technolog

RPDet

RPDet, also known as RepPoints Detector, is an advanced object detection model used in artificial intelligence. It follows an anchor-free and two-stage approach, relying on deformable convolutions for its operation. This model uses RepPoints as the basic representation of objects in the system. How RPDet Works The RPDet system starts by obtaining RepPoints from the center points of the object. It then goes through a process of regression to calculate offsets, which are then used to obtain the

RTMDet: An Empirical Study of Designing Real-Time Object Detectors

RTMDet Overview: An Introduction to Object Detection Model RTMDet is a family of real-time object detectors for images and video streams. It is a one-stage design in the YOLO lineage rather than a two-stage model derived from Faster R-CNN, so it has no region proposal network (RPN). Instead, it predicts classes and boxes directly from shared feature maps, using building blocks based on large-kernel depth-wise convolutions and soft labels in its dynamic label assignment.

ScanSSD

ScanSSD is a technology designed to locate mathematical formulas that are embedded within textlines of a document page image. It is a Single Shot Detector (SSD) that uses only visual features for detection, meaning that no formatting or typesetting information such as layout, font, or character labels is employed in the process. How does ScanSSD work? The ScanSSD system makes use of a sliding window method that locates formulas at multiple scales within a 600 dpi image. Once the candidate de
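
The sliding window step in the ScanSSD entry can be sketched as a generic tiling routine. The window size and stride below are hypothetical values chosen for illustration, not ScanSSD's actual settings, and the sketch covers only the tiling of the page, not the multi-scale detection itself.

```python
def sliding_windows(width, height, window, stride):
    """Return (left, top, right, bottom) tiles that cover a page image.

    width, height : page size in pixels
    window        : side length of each square tile (hypothetical value)
    stride        : step between neighbouring tiles (hypothetical value)
    """

    def starts(length):
        # Regular grid of start offsets along one axis.
        offsets = list(range(0, max(length - window, 0) + 1, stride))
        # Add one final tile so the far edge of the page is covered.
        if offsets[-1] + window < length:
            offsets.append(length - window)
        return offsets

    tiles = []
    for top in starts(height):
        for left in starts(width):
            # Clip tiles that would extend past the page boundary.
            right = min(left + window, width)
            bottom = min(top + window, height)
            tiles.append((left, top, right, bottom))
    return tiles


if __name__ == "__main__":
    for tile in sliding_windows(1000, 1000, 500, 250):
        print(tile)
```

A detector would then be run on each tile, and the per-tile detections mapped back to page coordinates by adding each tile's left and top offsets.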

Side-Aware Boundary Localization

Understanding Side-Aware Boundary Localization (SABL) As technology advances, computer vision has become an important area of research to enable machines to interpret the world visually. One critical component of computer vision is object detection, where algorithms are used to identify objects in digital images or videos. Object detection has a lot of real-world applications, such as surveillance, autonomous driving, augmented reality, and robotics. One common task in object detection is to d

Sparse R-CNN

Sparse R-CNN: A New Object Detection Method Object detection is a critical task in the field of computer vision, where the goal is to detect and locate objects in an image. Many object detection methods rely on generating a large number of object proposals or candidate regions, and then classifying each of these regions to determine if they contain an object. This method is known to be computationally expensive and can result in slow detection times. Sparse R-CNN is a new object detection metho

SSD

SSD stands for Single Shot MultiBox Detector, a single-stage method used in computer vision to detect objects in images. It discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location, allowing it to handle objects of various sizes. How Does SSD Work? At prediction time, the network generates scores for the presence of each object category in each default box and produces adjustments to the box to better match the ob
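
The default boxes described in the SSD entry can be illustrated with a short sketch. It assumes a square feature map, one scale per map, and boxes expressed as normalized (cx, cy, w, h) tuples; the function name and parameter values are hypothetical, and the real model uses several feature maps with different scales.

```python
import math


def default_boxes(fmap_size, scale, aspect_ratios):
    """Generate default boxes for one square feature map.

    fmap_size     : number of cells along each side of the feature map
    scale         : box size relative to the image, in (0, 1]
    aspect_ratios : width/height ratios to place at every cell
    Returns a list of (cx, cy, w, h) in normalized image coordinates.
    """
    boxes = []
    for row in range(fmap_size):
        for col in range(fmap_size):
            # Centre of the current feature map cell.
            cx = (col + 0.5) / fmap_size
            cy = (row + 0.5) / fmap_size
            for ratio in aspect_ratios:
                # Same area for every ratio; only the shape changes.
                w = scale * math.sqrt(ratio)
                h = scale / math.sqrt(ratio)
                boxes.append((cx, cy, w, h))
    return boxes


if __name__ == "__main__":
    boxes = default_boxes(fmap_size=3, scale=0.2, aspect_ratios=[1.0, 2.0, 0.5])
    print(len(boxes), boxes[:3])
```

For each of these boxes the network predicts per-category scores and four offsets, so a coarse feature map with a large scale handles large objects while a fine map with a small scale handles small ones.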

Stand-Alone Self Attention

Overview of Stand-Alone Self Attention (SASA) If you're familiar with the convolutional neural network model known as ResNet and its spatial convolution method, you might be interested in Stand-Alone Self Attention (SASA). SASA is a technique that replaces convolution with self-attention, producing a fully self-attentional model. In this article, we'll explore what SASA is, how it works, and its implications. What is SASA? Stand-Alone Self Attention (SASA) is a deep learning technique that u

ThunderNet

Overview of ThunderNet: Two-Stage Object Detection Model ThunderNet is a state-of-the-art two-stage object detection model for detecting objects in images. The model is designed to address the computationally expensive structures of current two-stage detectors. Its backbone utilizes SNet, a ShuffleNetV2 inspired network that is designed for object detection. ThunderNet's detection head design is modeled after Light-Head R-CNN, with further compression of the Region Proposal Network (RPN) and R-

TridentNet

TridentNet is an object detection architecture designed to generate scale-specific feature maps with uniform representational power. The Basics of TridentNet Architecture The foundation of TridentNet is a parallel multi-branch architecture, with each branch of the network

U2-Net

Saliency detection is a common task in computer vision, used to identify the most important parts or objects within an image. U2-Net is a new architecture designed specifically for salient object detection (SOD). The Nested U-Structure Architecture U2-Net follows a two-level nested U-structure architecture, which allows the network to go deeper and attain higher resolution without increasing memory and computation cost. The U-structure is a popular architecture for image segmentation, consist

VarifocalNet

What is VFNet? VFNet, short for VarifocalNet, is a new approach to accurately ranking a large number of candidate detections in object detection. It is made up of two new components, a loss function called Varifocal Loss and a star-shaped bounding box feature representation. Together, these components create a dense object detector on the FCOS architecture. How Does VFNet Work? The Varifocal Loss function is a new method for training a dense object detector to predict the Intersection over A

YOLOP

What is YOLOP? YOLOP is a new technology in the field of self-driving cars whose name stands for "You Only Look Once for Panoptic Driving Perception". It is a driving perception network that performs multiple tasks simultaneously: traffic object detection, drivable area segmentation, and lane detection. YOLOP uses a lightweight CNN to extract image features, which are then fed to three decoders that each complete one of these tasks. YOLOP is considered a lightweight version of Tesla's HydraNet self-driving vehicle
