This post provides an overview of how the RM2 Network employs unsupervised learning to process natural language using reference visual inputs along with object labels, much as humans do. We believe that effective machine-human communication requires integrating visual cues with language, which gives the machine the ability to learn, reason, explain abstractions, understand the sentiment of a conversation, and maintain context at all times.
To explain how the language processing works, we first present an overview of the entire network and of how unsupervised learning is conducted for autonomous learning.
The RM2 Network is a hybrid model for unsupervised learning that combines aspects of Kohonen’s Self-Organizing Map (SOM) and recurrent networks such as the Hopfield network. Click here to read more on all the existing models that influence the hybrid.
At a high level, the working of the RM2 Network can be explained as follows:
(1) The inputs are stored as specified by the data-aware hierarchical semantic model, which organizes the data into a tag assembly (patterns) with encoded weights (0,1). The objective is to form the object layer from the input layer. These objects in turn form a network with other objects, connected by a particular time stamp, and that network is further clustered based on object reference parameters to create the top layer, which can be termed the memory layer.
The inputs, which can be visual, sound, language, touch, or any other sensory data, are all handled in a similar fashion: each builds a relationship to the object layer, making it easy to associate the input with a given visual or an associated memory. You can say that the memory layer has complete information (in a hierarchical fashion) about a particular scenario, with time parameters, objects, shapes, colors, labels (names), behaviors, derivative fields, and outcome associations. All of the data is converted to a tag assembly.
(2) During processing, the new tag assembly received is compared to the existing tag assembly of an object neuron, and the match reveals the differences and similarities between the two tag assemblies. The similarities strengthen the relationship, and the differences are matched at the input layer. If no match is available, the system creates a new node and auto-labels it with a new string sequence. On exposure to language tags, the objects can create a network of labels to update natural language words and their associations using visual cues. This allows the machine to understand conflicting patterns (reasoning) and arrive at possible pattern types that can undo the conflict (planning solutions).
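The comparison in step (2) can be sketched as follows. This is only an illustration under stated assumptions, not the RM2 implementation: a tag assembly is modeled as a plain set of tags, and the names ObjectLayer, compare_assemblies, min_overlap, and the "node_0001" label format are all hypothetical.

```python
import itertools


def compare_assemblies(new, existing):
    """Return (similarities, differences) between two tag assemblies (sets)."""
    return new & existing, new ^ existing


class ObjectLayer:
    """Hypothetical object layer: each node holds a tag assembly and a strength."""

    def __init__(self, min_overlap=0.5):
        self.nodes = {}                  # label -> {"tags": set, "strength": int}
        self.min_overlap = min_overlap   # assumed matching threshold
        self._counter = itertools.count(1)

    def observe(self, tags):
        tags = set(tags)
        best_label, best_score = None, 0.0
        for label, node in self.nodes.items():
            shared, _ = compare_assemblies(tags, node["tags"])
            union = tags | node["tags"]
            score = len(shared) / len(union) if union else 1.0
            if score > best_score:
                best_label, best_score = label, score
        if best_label is not None and best_score >= self.min_overlap:
            # Similarities strengthen the existing relationship.
            self.nodes[best_label]["strength"] += 1
            return best_label
        # No match: create a new node and auto-label it with a new string.
        label = f"node_{next(self._counter):04d}"
        self.nodes[label] = {"tags": tags, "strength": 1}
        return label


layer = ObjectLayer()
first = layer.observe({"red", "round", "small"})
second = layer.observe({"red", "round", "large"})   # overlaps the first node
third = layer.observe({"blue", "square"})           # no overlap, new node
```

Here the second observation reinforces the first node, while the third one creates a fresh auto-labeled node that could later be relabeled with a natural language keyword.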
(3) The weights, which are the core of state activation, work at the individual input layer and cascade all the way to the top level to arrive at a cumulative weight that is compared against the threshold. On every input, weights are added to the previous iteration’s output, and when the total reaches a particular threshold, the state changes to 0 or 1.
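A minimal sketch of the accumulation described in step (3), assuming a simple additive rule and a fixed threshold. The class name ThresholdUnit, the helper cumulative_weight, and the threshold value of 1.0 are hypothetical and chosen only for illustration.

```python
class ThresholdUnit:
    """Accumulates input weights; the state becomes 1 once the total reaches the threshold."""

    def __init__(self, threshold=1.0):
        self.threshold = threshold   # assumed value, not taken from the platform
        self.total = 0.0
        self.state = 0

    def feed(self, weight):
        # Each new weight is added to the previous iteration's output.
        self.total += weight
        self.state = 1 if self.total >= self.threshold else 0
        return self.state


def cumulative_weight(units):
    """Cascade upward: the parent level sees the sum of its children's totals."""
    return sum(unit.total for unit in units)


unit = ThresholdUnit(threshold=1.0)
states = [unit.feed(w) for w in (0.5, 0.25, 0.25)]

other = ThresholdUnit(threshold=1.0)
other.feed(0.5)
top_level = cumulative_weight([unit, other])
```

The unit stays at state 0 for the first two inputs and switches to 1 on the third, when the running total reaches the threshold.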
NLP Layer of Labels
Natural language processing is a subset of the overall unsupervised model. An NLP network is visible from the labels assigned to each node, along with classifiers. As the machine is exposed to language, it updates the label of a node with natural language keywords, based on matching parameters.
The language is primarily based on the relationships between objects. The data organization structure of a natural language may be explained as follows: every noun is a node, and a relationship is built between two nouns based on occurrence. Each relationship has an associated verb that defines the action. The preposition defines the position and direction of the node and its relationship. The adjective is used to assign weights to a particular node, and the adverb is used to assign weights to a particular relationship. Interjections contribute to the deduction of sentiment in a given sentence, and the conjunction is simply a construct rule in a conversation that coordinates two or more scenarios.
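The mapping from parts of speech to nodes and relationships can be sketched roughly as below. The sketch assumes the sentence has already been POS-tagged (the tag names NOUN, VERB, ADJ, ADV, and ADP are an assumption), and the Relation class, build_graph function, and the unit weight increments are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Relation:
    source: str                        # first noun (node)
    target: str                        # second noun (node)
    verb: str                          # action attached to the relationship
    weight: float = 1.0                # raised by adverbs
    preposition: Optional[str] = None  # position or direction


def build_graph(tagged):
    """tagged: list of (word, pos) pairs. Returns (node_weights, relations)."""
    node_weights, relations = {}, []
    pending_adj = pending_adv = 0.0
    last_noun = verb = prep = None
    for word, pos in tagged:
        if pos == "ADJ":
            pending_adj += 1.0         # adjective weights the next node
        elif pos == "ADV":
            pending_adv += 1.0         # adverb weights the next relationship
        elif pos == "VERB":
            verb = word
        elif pos == "ADP":
            prep = word
        elif pos == "NOUN":
            node_weights[word] = node_weights.get(word, 1.0) + pending_adj
            pending_adj = 0.0
            if last_noun is not None and verb is not None:
                relations.append(
                    Relation(last_noun, word, verb, 1.0 + pending_adv, prep)
                )
                verb, prep, pending_adv = None, None, 0.0
            last_noun = word
    return node_weights, relations


# "The small dog quickly ran to the park" with determiners dropped.
sentence = [("small", "ADJ"), ("dog", "NOUN"), ("quickly", "ADV"),
            ("ran", "VERB"), ("to", "ADP"), ("park", "NOUN")]
nodes, relations = build_graph(sentence)
```

In this toy run, "small" raises the weight of the node "dog", and "quickly" raises the weight of the "ran" relationship between "dog" and "park".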
Over this data relationship, the algorithm will organize data based on the extracted POS (parts of speech) in a given sentence (questions, statements) in order to create assemblies in real-time. The routine of the algorithm can be broken into two simple steps that can be made to work either by supervision or through autonomous learning:
1. Decomposition
2. Matching Assemblies
The diagram below explains the flow of Decomposition and Reassembly:
Decomposition: The decomposing routine extracts words and computes their weights based on rules, arriving at a present state that contains the context tags and sentiment tags to be used in matching assemblies. When the user provides input (by speaking or texting), the input sentence is tagged for any special expressions it contains in order to distribute weights. In parallel, the input sentence is passed through the word extractor for POS (parts of speech) tagging with the appropriate word relationships. The tagging process between the nouns and verbs of a sentence reveals the relationships defined at the nodes, which create contexts. The adverb or adjective defines the degree of the relationship, which contributes to the cumulative score of the context.
Matching Assembly: Based on the tags received, the matching assembly looks for the activated nodes to assemble an answer. Based on past learning and associations, the machine can reassemble an output using individual keywords. Past keyword relationships and weights help in learning and in generating quick responses. For supervised activity, the administrator can configure weights or manually supervise the answer database.
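The two steps can be illustrated with a small sketch. It assumes pre-tagged input, a hypothetical sentiment lexicon, and an answer store keyed by tag sets; decompose, match_assembly, and the overlap-count scoring rule are illustrative stand-ins rather than the platform's actual routines.

```python
def decompose(tagged, sentiment_lexicon):
    """Return (context_tags, sentiment_score) for a list of (word, pos) pairs."""
    # Nouns and verbs define the context of the sentence.
    context = {word for word, pos in tagged if pos in ("NOUN", "VERB")}
    # Special expressions (here: any word in the lexicon) distribute sentiment weight.
    sentiment = sum(sentiment_lexicon.get(word, 0.0) for word, _ in tagged)
    return context, sentiment


def match_assembly(context, answers):
    """answers maps a frozenset of tags to an answer; pick the best overlap."""
    best = max(answers, key=lambda tags: len(tags & context), default=None)
    if best is None or not (best & context):
        return None   # no activated node matches the context
    return answers[best]


lexicon = {"love": 1.0, "wow": 0.5}   # hypothetical weights
tagged = [("wow", "INTJ"), ("love", "VERB"), ("pizza", "NOUN")]
context, sentiment = decompose(tagged, lexicon)

answers = {
    frozenset({"pizza"}): "Pizza is popular.",
    frozenset({"weather"}): "It is sunny.",
}
reply = match_assembly(context, answers)
```

A supervised setup would correspond to an administrator editing the lexicon weights or the answer store by hand.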
To summarize, the algorithm facilitates real-time detection of sentiment and relevance during a chat conversation. The proposed system aims to incorporate a real-time scoring method that deduces weights and states in order to understand sentiment and context references, and then passes the tags to the reassembly algorithm to formulate answers in real time.
How Does Reasoning Work?
The tag assemblies that hold the relationship weights are the primary enablers for the machines to learn, question, and reason. Learning and decision-making, which incorporates questioning and reasoning, are dependent on the way synthesized data (knowledge derivatives) are organized. Organizing data in a natural fashion allows one to detect spatial and temporal relationships between data parameters. These spatial and temporal data patterns have macro weights that pertain to the pattern set. Conflicts between these weights push data relationships to an ambiguous state, allowing the machine to generate a question or reason, in order to achieve a confirmative state, which is defined by the primary rule of the long-term objective.
It is important to understand how natural intelligence takes an integrated approach to managing visual nodes and their respective labels. These labels are used to assemble language so that the system can orally articulate explanations for the reasoning that it makes, or explain an abstraction (quantificational schemata).
For example, if you were wondering how the machine would understand the difference between past, present, and future tenses, you would see that these insights may be detected within the temporal layout of the associated data parameters. Using the weight scale, it can easily create a time dimension to properly place the object and associated actions and translate the results to language.
Below is another example of how a machine might gather the impetus to ask a question, or be in a state of doubt. Consider two statements: (a) every person is a man or a woman, and (b) Addison is a man and a woman. How would a computer know which one is true? Such data conflicts lead to questions.
But how did the machine detect this conflict? In the above scenario, the system generates four tags (“person”, “Addison”, “man”, and “woman”) and creates relationships based on the available terms. So we have two relationships each for the words “man” and “woman”: one with “person” and one with “Addison”. However, the weights differ: the relationship has a positive weight for “and” and a negative weight for “or”.
Due to conflicting weights in a similar relationship, the confirmation logic fails to trigger and the relationship is in an unconfirmed state. As the global rule of the algorithm is to achieve a confirmed state, it revisits the node and generates a question for confirmation.
If it is then stated that “Addison is not a woman”, the relationship between Addison and woman (which had a positive weight) is negated by “not” and reaches a zero state, where the relationship is lost. Now that there is no conflict, the confirmation state is achieved based on the remaining relationship.
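The Addison example can be traced with a short sketch. It assumes weights of +1 for “and” and -1 for “or”, and treats a conflict as two subjects holding the same group of objects with opposite signs. The RelationStore class and its method names are hypothetical.

```python
CONJ_WEIGHT = {"and": 1, "or": -1}   # assumed unit weights


class RelationStore:
    def __init__(self):
        self.links = {}   # subject -> {object: weight}

    def assert_statement(self, subject, objects, conj):
        weight = CONJ_WEIGHT[conj]
        for obj in objects:
            self.links.setdefault(subject, {})[obj] = weight

    def negate(self, subject, obj):
        # "not" cancels the existing weight; a zero-weight relationship is lost.
        self.links.get(subject, {}).pop(obj, None)

    def state(self, objects):
        signs = set()
        for held in self.links.values():
            if all(obj in held for obj in objects):
                signs.add(sum(held[obj] for obj in objects) > 0)
        # Opposite signs on the same relationship block confirmation.
        return "unconfirmed" if len(signs) > 1 else "confirmed"


store = RelationStore()
store.assert_statement("person", ["man", "woman"], "or")
store.assert_statement("Addison", ["man", "woman"], "and")
before = store.state(["man", "woman"])   # conflicting weights

store.negate("Addison", "woman")          # "Addison is not a woman"
after = store.state(["man", "woman"])
```

While the state is "unconfirmed", the system would revisit the node and generate a question; once the negation removes the Addison and woman relationship, only one sign remains and the state is confirmed.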
To avoid confirming a relationship on its first occurrence, the platform’s default confirmation weight is set to 25: an association must be matched exactly, in its entirety, 25 times before it reaches a confirmation state. This ensures that the machine arrives at correct decisions only after repeated observations or occurrences.
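A minimal sketch of this rule, assuming an association can be represented as a set of tags and that only exact matches of the whole set count. The ConfirmationTracker name and its interface are hypothetical; the default of 25 comes from the text above.

```python
from collections import Counter


class ConfirmationTracker:
    """Counts exact matches of a whole association until it is confirmed."""

    def __init__(self, required=25):
        self.required = required
        self.counts = Counter()

    def observe(self, association):
        key = frozenset(association)   # the entire association must match
        self.counts[key] += 1
        return self.counts[key] >= self.required


tracker = ConfirmationTracker()
association = {"Addison", "man"}
results = [tracker.observe(association) for _ in range(25)]

# A partial match is a different association and does not count.
partial = tracker.observe({"Addison"})
```

Only the 25th exact observation returns True, so a single occurrence can never confirm a relationship.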
How Does Abstracting Work?
During one of our workshops, a question was raised as to how a machine might comprehend words such as “filibuster” or “awakening” when it encounters them during a conversation or while reading. This workshop helped us work out how the machine may comprehend a pattern of this sort.
For example, when the word filibuster was encountered, the machine scanned for an available match and, when none was located, consulted a dictionary file, just as humans would do. The dictionary file explained filibuster as “an action such as prolonged speaking which obstructs progress in a legislative assembly in a way that does not technically contravene the required procedures.” Using the POS segregation routine, the machine was able to detect objects and an action reference, and it created a spatial and temporal map between the detected references.
The pattern detected within the machine may be explained as follows. Using nouns, it creates an object cluster (from legislative assembly) with spatial parameters. Using a general word such as ‘assembly’, it creates an object network, and using verbs, it creates the relationship between these nodes, which in this case may be referred to as speaking. The associated word ‘prolonged’ sets a degree for that association. The word ‘progress’ allows the machine to create a temporal weight for the above set, whereas the word ‘obstruct’ sets a degree for the temporal weight. Each time an adjective or adverb is presented, it creates a degree that takes one of three states (low, normal, or high). As temporal weights have priority over spatial weights, the resulting cumulative weight is negative, inclining the decision toward the “not-to-do” list.
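The arithmetic behind this outcome might look like the sketch below. Every number in it is an assumption made for illustration: the three degree values, the priority factor given to temporal weights, and the signs assigned to ‘speaking’ and ‘obstructed progress’ are not taken from the platform.

```python
DEGREE = {"low": 0.5, "normal": 1.0, "high": 1.5}   # assumed three-state scale


def cumulative_weight(spatial, temporal, temporal_priority=2.0):
    """Temporal weights take priority, modeled here as a larger multiplier."""
    return spatial + temporal_priority * temporal


def decide(weight):
    return "to-do" if weight > 0 else "not-to-do"


# Spatial side: 'speaking' in a 'legislative assembly', degree set by 'prolonged'.
spatial = 1.0 * DEGREE["high"]

# Temporal side: 'progress' is obstructed, so the weight is negative.
temporal = -1.0 * DEGREE["normal"]

total = cumulative_weight(spatial, temporal)
decision = decide(total)
```

With these assumed values the positive spatial weight is outweighed by the prioritized negative temporal weight, so the abstraction for filibuster lands on the “not-to-do” side.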
Relationships and weights play a critical role in enabling the machine to comprehend in a similar way as humans, to exhibit intelligence through perception, learning, reasoning, and abstracting, and to do so more naturally than hard-wired machines can.