This post articulates, at a high level, how Visual Learning works in the RM2 Platform. The visual learning feature in RM2 is part of a unified architecture in which visual object detection and learning are combined to achieve real-time detection and behavior prediction in a given environment.
To accurately detect objects and learn from correct data associations, it is critical to extract unit data parameters, which form the foundation for establishing relationships between unique parameters. This approach yields high accuracy in identifying objects and in learning object behavior.
In this post, we walk through an example of how unit parameters are extracted from a given image. We selected a glass vase with flowers to demonstrate how the image is converted to tags for information-processing routines.
Extraction of Shape, Light, and Color
The imported image is processed via an extraction routine that separates it into three layers. The first layer thins the edges of the image down to single-pixel width, the second layer captures the light information, and the third layer carries the color.
These layers are created by sub-routines that extract individual objects using an edge-learning algorithm, which also separates the light information and the colors associated with the boundaries of the object.
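The three-layer split can be illustrated with a minimal sketch. The function name `split_layers`, the luminance weights, and the edge threshold below are all assumptions for illustration, not the RM2 implementation:

```python
def split_layers(image):
    """Split an RGB image (rows of (r, g, b) tuples, values 0-255) into
    the three layers described above: edges, light, and color.
    Illustrative sketch only; names and thresholds are assumed."""
    h, w = len(image), len(image[0])
    # Light layer: per-pixel luminance (Rec. 601 weights as a stand-in).
    light = [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
             for row in image]
    # Color layer: chromaticity, i.e. color with brightness factored out.
    color = [[tuple(c / max(1, r + g + b) for c in (r, g, b))
              for (r, g, b) in row] for row in image]
    # Edge layer: mark pixels where luminance changes sharply
    # against the right or lower neighbor (threshold of 40 is arbitrary).
    edges = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy, dx in ((0, 1), (1, 0)):
                ny, nx = y + dy, x + dx
                if ny < h and nx < w and abs(light[y][x] - light[ny][nx]) > 40:
                    edges[y][x] = True
    return edges, light, color
```

A sharp white-to-black boundary would then activate the edge layer along the boundary column while the light and color layers record brightness and hue separately.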
The depth of the object is calculated in the second layer, where the light-separation routine runs: parameters from light-source detection and position are combined with light-gradient values.
The light tags are also employed to classify transparent and opaque objects. In the case of mirrors, detected symmetry or fractal patterns may signify an anomaly, or the machine can be supervised by calibrating the central object complex (self) to learn such scenarios.
Further, the shape-extracting sub-routines are supported by super-subroutines responsible for extracting primary shapes from the object. The relationships between these unique shapes are tagged to the shape definitions of the object.
These unique extracts are matched on a grid to derive micro tags associated with pattern differentiation among shapes. By mapping against the grid, the pattern analyzer decodes the shape pattern into codes using highlighted dots, which are activated by overlaying the shape.
The tag creator reads the activated dots and produces an alphanumeric strip from their positions. A sample micro-tag sequence for a shape may appear like the strip below.
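A tag creator of this kind could be sketched as follows. The function name `micro_tag_strip`, the sort order, and the use of "-" as a separator are assumptions for display; the letter-for-Y, number-for-X convention comes from the post:

```python
from string import ascii_uppercase

def micro_tag_strip(active_dots):
    """Turn activated grid dots into an alphanumeric micro-tag strip.
    Each dot is an (x, y) grid index; the Y position becomes a letter
    prefix and the X position a number suffix (e.g. D7).
    Sketch only: sort order and '-' separator are assumed."""
    tags = []
    for x, y in sorted(active_dots, key=lambda d: (d[1], d[0])):
        tags.append(f"{ascii_uppercase[y]}{x}")
    return "-".join(tags)
```

For instance, a single dot at grid position x=7, y=3 would read out as "D7".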
The X positions are denoted by numbers and the Y positions by letters. In the sample above, D7 represents a dot whose prefix letter gives the Y position and whose suffix number gives the X position. These micro-tag sequences, which follow a particular pattern, are converted to macro labels for rapid detection of shape patterns.
For example, if all prefixes or all suffixes are sequential, the strip is tagged [SL], denoting a straight line; if the start prefix and suffix are the same as the end prefix and suffix, it denotes a circle [CRL]. The span of the letter range (lowest to highest) gives the height of a curve in the line, and the span of the number range (lowest to highest) conveys the width of a line (if bent).
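These classification rules can be expressed directly in code. This is a minimal sketch of the stated rules; the function name `macro_tag`, the fallback tag, and the return format are assumptions:

```python
def macro_tag(strip):
    """Classify a micro-tag strip per the rules above: sequential
    prefixes or suffixes -> SL (straight line); identical start and end
    tags -> CRL (circle). Also returns the letter-range span (curve
    height) and number-range span (line width). Illustrative sketch."""
    dots = [(tag[0], int(tag[1:])) for tag in strip.split("-")]
    letters = [ord(letter) for letter, _ in dots]
    numbers = [number for _, number in dots]
    height = max(letters) - min(letters)  # letter range -> curve height
    width = max(numbers) - min(numbers)   # number range -> line width
    if dots[0] == dots[-1] and len(dots) > 2:
        return "CRL", height, width
    def sequential(values):
        ascending = all(b - a == 1 for a, b in zip(values, values[1:]))
        descending = all(a - b == 1 for a, b in zip(values, values[1:]))
        return ascending or descending
    if sequential(letters) or sequential(numbers):
        return "SL", height, width
    return "UNK", height, width  # hypothetical fallback for unmatched strips
```

So "A1-A2-A3" classifies as a straight line of width 2, while a strip that starts and ends on the same dot, such as "B2-C4-B2", classifies as a circle.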
Using these patterns, the routine creates a macro-tag assembly for the shape, which is incorporated along with the light and color tags to form an object tag assembly; that, in turn, is incorporated into the frame tag assembly. For machine-vision applications that require image processing, the frame tag assembly can be used for information processing and learning.
However, for real-time processing or learning from real-world images or videos, the tag assemblies of individual frames are sequenced by time-stamp to generate event tags (memories). By detecting behavior patterns among objects, the machine can learn or predict actions in real-life situations.
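The sequencing step amounts to a simple ordering of per-frame assemblies. The class and function names below (`FrameTagAssembly`, `build_event_tag`) are hypothetical; the sketch only shows the time-stamp ordering the post describes:

```python
from dataclasses import dataclass

@dataclass
class FrameTagAssembly:
    """One frame's worth of object tag assemblies (names assumed)."""
    timestamp: float
    object_tags: list

def build_event_tag(frames):
    """Sequence frame tag assemblies by time-stamp to form an event tag
    (a 'memory' in the post's terms). Illustrative sketch only."""
    ordered = sorted(frames, key=lambda frame: frame.timestamp)
    return [frame.object_tags for frame in ordered]
```

Behavior-pattern detection would then operate over this ordered sequence rather than over any single frame.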
For observational or visual-based learning, object-positioning parameters play a key role; these are handled by a spatial grid. The integrated grid lets the process position objects using spatial parameters, enabling the machine to match patterns for comparisons involving size, distance, orientation, and depth.
The preset grid allows for 360° spherical mapping of the surrounding environment, enabling the machine to map coordinates for a given focus area, which may be scaled using the pan rule. The grid is instrumental in giving the machine the capacity to detect the dimensions of an object, as well as its distance, orientation, depth of the external surface, and gaps between objects in a single scene. The Y axis allows the machine to learn about uneven terrains, potholes, cliffs, low-level obstructions, and the like.
This method gives machines the ability to measure their environment accurately and act responsibly by taking into account the possible behaviors of every object in a given environment, thereby reducing dangers and risks. Autonomous cars, industrial machines, customer-service robots, and multi-utility robots can depend on such accuracy to deliver risk-free, high-performance services.