How Loop54's Language Model works

Loop54’s language model goes well beyond basic text matching. In addition to handling classic NLP problems, Loop54 uses query transformation and expansion to “fuzzily” match the query to words found in the product metadata.

As shown in Figure 2 below, the Loop54 language model is augmented with a second, proprietary model that Loop54 calls “Generalized Organization in Layered Expanding Maps” (GOLEM). This in-house model consists of thousands of layered neural networks specialized in clustering quantifiable data objects, and it is designed to determine the “context” behind every search query. For a given query, it determines which area (or context) of the catalogue holds the most relevant results (i.e. products), and it uses visitor behaviour to continuously improve relevancy.


Figure 2: Loop54’s search engine includes a language model and a system of neural networks

Simply put, Loop54’s language model is a combination of a data structure (e.g. a trie or a ternary search tree) that efficiently finds which products contain certain words, and a collection of tools that modify strings and traverse the data structure in different ways (e.g. fuzzy matching, stemming).


Illustration of Loop54 trie tree data structure
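To make the data-structure idea concrete, here is a minimal sketch of a trie that maps indexed words to the products containing them. It is an illustration only, not Loop54's implementation: the class names, the `build_index` helper and the sample catalogue are all hypothetical.

```python
class TrieNode:
    def __init__(self):
        self.children = {}        # next character -> child node
        self.product_ids = set()  # products containing the word that ends here


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word, product_id):
        node = self.root
        for ch in word.lower():
            node = node.children.setdefault(ch, TrieNode())
        node.product_ids.add(product_id)

    def _find(self, prefix):
        """Walk down the trie; return the node for prefix, or None."""
        node = self.root
        for ch in prefix.lower():
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def exact(self, word):
        node = self._find(word)
        return set(node.product_ids) if node else set()

    def with_prefix(self, prefix):
        """Return {word: product_ids} for every indexed word starting with prefix."""
        start = self._find(prefix)
        found = {}
        if start is None:
            return found
        stack = [(start, prefix.lower())]
        while stack:
            node, word = stack.pop()
            if node.product_ids:
                found[word] = set(node.product_ids)
            for ch, child in node.children.items():
                stack.append((child, word + ch))
        return found


def build_index(products):
    """Index every whitespace-separated word of each product title."""
    trie = Trie()
    for product_id, title in products.items():
        for word in title.split():
            trie.insert(word, product_id)
    return trie


# Hypothetical catalogue: product id -> title
CATALOGUE = {1: "black chair", 2: "office chairs", 3: "chairman desk"}
index = build_index(CATALOGUE)
```

With this index, `index.exact("chair")` returns the products whose title contains exactly that word, while `index.with_prefix("chair")` also reaches "chairs" and "chairman" by continuing the traversal below the "chair" node.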

Whereas traditional search engines would solve these problems with word clustering and synonym dictionaries, Loop54 has spent 3+ years in R&D building two models that, together, can decipher the nuances and ambiguity of language without depending on the actual words used to rank results relevantly.

Finding the words and defining the match types

Loop54 traverses the language model data structure to find different types of matches to the query.

In the simplest case, the engine receives a query like "chair" and traverses the trie to check whether the model contains the word "chair". If it finds a match, it returns the word tagged with a match type. For example, if it finds the exact word “chair” in the model, it returns Exact("chair"). It may also find "chairs", of which “chair” is a partial match, and so return Partial("chairs").

The different match types are:

Exact - the word exactly matches the query (chair)

Fuzzy - the word approximately matches the query (e.g. the query cheir matches chair)

Partial - the query is a partial of the word, i.e. the word begins with the query (chairman)

Reverse Partial - the query is a reverse partial of the word, i.e. the word ends with the query (armchair)

Compound - the query is a compound of several words (spacechair → space + chair)

SubWords - the query contains components of other words. Unlike Compound matches, where all words must exist, with SubWords only the first word must exist (chairxxxx → chair)

A query word usually yields several of these match types at once, so the result may look something like:

[Exact(chair), Fuzzy(chairs), ReversePartial(armchair)...]
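The match types can be sketched as a small classifier over a vocabulary of indexed words. This is a hedged illustration rather than Loop54's algorithm: it scans a plain set instead of traversing a trie, it assumes "fuzzy" means a Levenshtein distance of at most one edit, and the function names and the `max_edits` parameter are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance via the standard dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def split_compound(query, vocabulary):
    """Return the parts if query concatenates 2+ vocabulary words, else None."""
    def helper(rest):
        if not rest:
            return []
        for cut in range(1, len(rest) + 1):
            head = rest[:cut]
            if head in vocabulary:
                tail = helper(rest[cut:])
                if tail is not None:
                    return [head] + tail
        return None

    parts = helper(query)
    return parts if parts and len(parts) >= 2 else None


def find_matches(query, vocabulary, max_edits=1):
    """Return (match_type, word) pairs; one word may earn several types."""
    matches = []
    for word in sorted(vocabulary):
        if word == query:
            matches.append(("Exact", word))
            continue
        if word.startswith(query):
            matches.append(("Partial", word))
        if word.endswith(query):
            matches.append(("ReversePartial", word))
        if edit_distance(query, word) <= max_edits:  # assumed fuzzy threshold
            matches.append(("Fuzzy", word))
    parts = split_compound(query, vocabulary)
    if parts:
        matches.append(("Compound", parts))
    else:
        # SubWords: only the leading component has to exist in the vocabulary.
        heads = [w for w in vocabulary if query.startswith(w) and w != query]
        if heads:
            matches.append(("SubWords", max(heads, key=len)))
    return matches


# Hypothetical vocabulary of indexed words
VOCABULARY = {"chair", "chairs", "chairman", "armchair", "space"}
chair_matches = find_matches("chair", VOCABULARY)
```

Under these assumptions, the query "chair" matches "chairs" both as Partial and as Fuzzy, which is why a single word can appear under more than one match type.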

If the query contains several words, each word would be represented as a list of possible matches. This is called a Query Collection.

For example, the Query Collection for “black chair” could be represented as two match type containers, one for each word:

[
[Exact(black), Partial(blackish), Fuzzy(block)...],
[Exact(chair), Fuzzy(chairs), ReversePartial(armchair)...]
]
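As a rough sketch of the structure above, a Query Collection can be built by splitting the query on whitespace and collecting a list of matches per word. The `Match` class, the function names and the vocabulary are hypothetical, and for brevity this version covers only the Exact, Partial and Reverse Partial types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    kind: str  # "Exact", "Partial" or "ReversePartial"
    word: str

    def __str__(self):
        return f"{self.kind}({self.word})"


def matches_for(term, vocabulary):
    """Collect the matches for a single query word."""
    found = []
    for word in sorted(vocabulary):
        if word == term:
            found.append(Match("Exact", word))
        elif word.startswith(term):
            found.append(Match("Partial", word))
        elif word.endswith(term):
            found.append(Match("ReversePartial", word))
    # Stable sort: exact hits first, the rest keep alphabetical order.
    found.sort(key=lambda m: m.kind != "Exact")
    return found


def query_collection(query, vocabulary):
    """One list of matches per query word, in query order."""
    return [matches_for(term, vocabulary) for term in query.lower().split()]


# Hypothetical vocabulary of indexed words
VOCABULARY = {"black", "blackish", "chair", "armchair", "table"}
collection = query_collection("black chair", VOCABULARY)
for container in collection:
    print("[" + ", ".join(str(m) for m in container) + "]")
```

Keeping one container per query word preserves which matches belong to which part of the query, so later ranking stages can combine them word by word.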


Using Machine Learning to overcome the nuance and complexity of language

Imagine walking into a physical store and asking for a “bag”, only for the store clerk to hand over every product called “bag” for you to look at, but none of the products labelled “sack”, “backpack” or “duffel”.

This would never happen in a physical store because the salesperson would point you to the section of the store where all relevant bag types are found. Moreover, the salesperson may be able to tailor their recommendation based on information from your loyalty membership profile or by asking a few simple questions before pointing you in the right direction.

With Loop54’s advanced Machine Learning model, the same customer experience is possible online. Loop54’s proprietary models map the relationships between products so that the shopper is no longer inhibited by the words they use. The search engine can quickly and accurately direct shoppers to the right area, or “context”, of the catalogue, regardless of the words they used in their query. Whether it be “bag”, “sack” or “duffel”, Loop54 will locate and present the shopper with the most relevant results.


Illustration of Loop54’s Machine Learning model