Yext Answers Algorithm Update: Milky Way

August 2020 - Answers Algorithm Update

Yext Answers is constantly improving its search algorithm to provide more relevant results over time. Milky Way is the first official upgrade to the Answers Algorithm and includes a series of changes that improve search precision and recall.

Using BERT to Improve NER for Locations

BERT is a breakthrough approach from Google that was added into the Google Search Algorithm in October 2019. Answers has now incorporated this same technology to improve the ability to detect location names.

Often, location names are identical to the names of people or products. For example, the following two queries both include the word, or token, "Orlando."

However, in one, the user is clearly referring to the actor Orlando Bloom and in the other, the city of Orlando, Florida. Classifying Orlando in the first query as a name and in the second one as a place is called Named Entity Recognition (NER).

It is easy for humans to tell the two Orlandos apart because we look at the word not in isolation but in context, and BERT allows our algorithm to do the same. In the second query, the phrase "locations near" gives both us humans and the algorithm the context to understand that Orlando is a place, not a movie star!

So, how does BERT work? BERT (Bidirectional Encoder Representations from Transformers) is designed to learn the contextual relationship between words in a text. In other words, it looks at all of the words together and learns the relationship between them. Using the approach outlined in BERT, we are able to drastically improve our ability to perform NER.
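To make the idea of "context decides the label" concrete, here is a deliberately tiny sketch. It is not BERT and not the Answers implementation: the hand-written cue lists, the function names, and the two sample queries are all hypothetical stand-ins for what a trained model learns from labeled data. It only shows how looking at the words on both sides of a token can change its label.

```python
# Toy illustration only: hand-written cues stand in for what BERT learns.
LOCATION_CUES_BEFORE = {"near", "in", "around"}  # hypothetical cue words
PERSON_CUES_AFTER = {"bloom"}                    # hypothetical surname list


def classify_token(tokens, index):
    """Label tokens[index] as LOCATION, PERSON, or UNKNOWN from its neighbors."""
    before = tokens[index - 1] if index > 0 else None
    after = tokens[index + 1] if index + 1 < len(tokens) else None
    if before in LOCATION_CUES_BEFORE:  # look at the left context
        return "LOCATION"
    if after in PERSON_CUES_AFTER:      # look at the right context
        return "PERSON"
    return "UNKNOWN"


def label_query(query, target):
    """Lowercase and split the query, then classify the target token."""
    tokens = query.lower().split()
    return classify_token(tokens, tokens.index(target))


# Hypothetical queries in the spirit of the Orlando example.
print(label_query("Orlando Bloom movies", "orlando"))    # PERSON
print(label_query("locations near Orlando", "orlando"))  # LOCATION
```

A real model replaces the fixed cue lists with representations learned from many labeled queries, which is why it generalizes to phrasings no one wrote a rule for.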

To accomplish these improvements, we manually labeled 72,916 search queries to teach BERT how to identify when a token indicates a location or another type of entity. Implementing BERT improved results for 72% of location-based queries made in Yext Answers. Since location-based queries make up about 18% of all queries submitted to Yext Answers, BERT will improve the results for nearly 13% of all queries. This model will continue to improve over time as we label more queries.
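The "nearly 13%" figure follows from multiplying the two percentages quoted above. As a quick check (the variable names are ours, chosen for illustration):

```python
# Share of location-based queries whose results improved (from the text).
improved_share_of_location_queries = 0.72
# Share of all queries that are location-based (from the text, approximate).
location_share_of_all_queries = 0.18

overall_share = improved_share_of_location_queries * location_share_of_all_queries
print(f"{overall_share:.2%}")  # 12.96%
```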

This feature is currently available in English, and other languages will be added soon.

Using the graph to improve location detection

When parsing out potential locations in a query, Answers considers locations all over the world. Answers has historically used location-biasing to find the most relevant locations. With this update, Answers now uses the Knowledge Graph itself to better identify locations.

To demonstrate how this feature works, we will use "Locations in Green Bay" as an example:

Answers needs to figure out which Green Bay the user is referring to. In the United States, there are two places with the name "Green Bay": one is a major city in Wisconsin that is home to the NFL's Green Bay Packers, and the other is a small community in Virginia of approximately 1,500 people.

Before this change, Answers would use the popularity of a place and the distance to the user to determine the most likely match, which is known as "location-biasing." If you searched from Virginia, Yext Answers would likely return results near Green Bay, Virginia, and if you searched from elsewhere, you would most likely see results in or near Green Bay, Wisconsin.

However, if the Knowledge Graph only includes locations in Wisconsin, Yext Answers will now interpret "Green Bay" as the city in Wisconsin, even if the user is searching from Virginia. The community Green Bay, Virginia would never be applied as a filter because the Knowledge Graph does not include any locations nearby.
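The behavior described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual Answers logic: the `Place` type, the `resolve_place` function, the 80 km radius, and the approximate coordinates are all hypothetical, and the fallback uses distance to the user only, whereas the location-biasing described above also weighs popularity.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class Place:
    name: str
    region: str
    lat: float
    lon: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    h = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def resolve_place(candidates, entity_coords, user_coords, radius_km=80.0):
    """Prefer candidates that have a Knowledge Graph entity within radius_km.

    Among the preferred candidates, pick the one closest to the user.
    Assumption: if no candidate has a nearby entity, consider all of them.
    """
    supported = [
        place for place in candidates
        if any(haversine_km(place.lat, place.lon, lat, lon) <= radius_km
               for lat, lon in entity_coords)
    ]
    pool = supported or candidates
    return min(pool, key=lambda p: haversine_km(p.lat, p.lon, *user_coords))


# Approximate coordinates, for illustration only.
green_bays = [
    Place("Green Bay", "Wisconsin", 44.51, -88.01),
    Place("Green Bay", "Virginia", 37.13, -78.30),
]
wisconsin_entities = [(44.52, -88.02), (44.26, -88.42)]  # hypothetical store locations
user_in_virginia = (37.54, -77.44)

print(resolve_place(green_bays, wisconsin_entities, user_in_virginia).region)  # Wisconsin
```

With entities near both places, both candidates survive the filter and the choice falls back to proximity to the user, which mirrors the older location-biasing behavior.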

This feature is available in all languages that Yext Answers supports: English, Spanish, French, Italian, and German.

Updates to the Healthcare Taxonomy

The addition of 3,000 new medical terms improves results for medical searches, reflecting how both patients and providers search.

The Yext for Healthcare Taxonomy has been updated to include over 3,000 new synonyms, conditions, treatments, and procedures. The taxonomy is patient-search-first: it reflects how patients search for healthcare in everyday language, while also covering how providers search in clinical terms. The example below shows the difference between how patients and providers search.

Example patient query: "brain tumor"
Example provider query: "Glioblastoma"

Improved stemming and refined typo tolerance

In this algorithm update, typo tolerance has been reduced and more advanced stemming has been added to textSearch fields.

The balance between stemming, typo tolerance, and spellcheck is a delicate one. All three help match words from the query with words in the search index that may vary slightly. With the Milky Way update, Yext Answers provides a better tradeoff between recall and precision by improving stemming capabilities and reducing typo tolerance.

Here is an example of the improved stemming in action on yext.com:
Previously, the FAQ "Who does Yext integrate with?" would not have surfaced in the search results below, which are for the query "integrations." Now, Yext Answers stems both "integrations" and "integrate" to the same root, and the FAQ is returned as the second result.
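The tradeoff between stemming and typo tolerance can be illustrated with a small sketch. Everything here is an assumption made for illustration: the suffix list is a toy stemmer (not the one Answers uses), the `max_typos` values are hypothetical, and typo tolerance is modeled as plain Levenshtein edit distance.

```python
SUFFIXES = ("ations", "ation", "ates", "ate", "ing", "s")  # toy list, longest first


def stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def edit_distance(a, b):
    """Levenshtein distance computed row by row."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                        # deletion
                current[j - 1] + 1,                     # insertion
                previous[j - 1] + (char_a != char_b),   # substitution
            ))
        previous = current
    return previous[-1]


def tokens_match(query_token, index_token, max_typos):
    """Match when the stems agree or the tokens are within max_typos edits."""
    if stem(query_token) == stem(index_token):
        return True
    return edit_distance(query_token.lower(), index_token.lower()) <= max_typos


# Stemming recovers a match that typo tolerance alone would miss (4 edits apart).
print(tokens_match("integrations", "integrate", max_typos=1))  # True
# Looser typo tolerance admits an unrelated word; tighter tolerance rejects it.
print(tokens_match("pants", "plans", max_typos=2))  # True
print(tokens_match("pants", "plans", max_typos=1))  # False
```

In this sketch, stemming adds recall for words that share a root, while lowering the allowed edit count removes matches between unrelated words that merely look alike.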

This feature is available in all languages that Yext Answers supports: English, Spanish, French, Italian, and German.
