Nomisoft
Menu

MySQL Full-text Searching

18th May 2015

I recently improved the search functionality on an e-commerce site for a client to increase the relevancy of the results return. Previously the site was using a fairly basic and limited 'LIKE' select query to search for the whole entered search phrase in either the products id, title or description.

The old code


SELECT product_id FROM products WHERE product_id LIKE "search term" OR title LIKE "search term" OR description LIKE "search term"

The new approach made use of MySQL's full-text search functionality and implementing a kind of scoring system in which to order the results by how often the search terms appear in the products id, title and description.

Firstly I added Full-text indexes onto the table


ALTER TABLE products ADD FULLTEXT(product_id), ADD FULLTEXT(title), ADD FULLTEXT(description), ADD FULLTEXT(product_id, title, description);

By default MATCH... AGAINST will perform in natural language mode which was fine for my needs. More complex search algorithms can be performed by switching to use in boolean mode which opens up a vast array of operators. For example MATCH(title) AGAINST('+foo -bar' IN BOOLEAN MODE) would return results that include 'foo' in the title but don't include 'bar'.

The new code


SELECT product_id, MATCH (product_id) AGAINST ("search term") as id_relevancy, MATCH (title) AGAINST ("search term") as title_relevancy, MATCH (description) AGAINST ("search term") as description__relevancy FROM products WHERE MATCH (product_id, title, description) AGAINST ("search term") ORDER BY (id__relevancy * 10)+(title__relevancy * 5)+(description__relevancy) DESC

This approach also has the added bonus of looking for each word rather than the phrase as a whole. If a search is made with 3 words a product which has all 3 in the title will rank higher than one which only contains 2 of the words in the title. I also made sure in my code that I passed singular and plural versions of words to the search term.

If the product's id matches the search term its highly likely that's what the user was searching for therefore we apply a 'weight' of 10. The search term being in the title is more important than it being in the description so we apply a 'weight' of 5 to the title. The scores are then tallied up and the products returned sorted by this.

The results the user receives are now a lot more accurate, helping them to find exactly what they're looking for which should hopefully increase conversion rates on the site. At the same time of implementing the above I also implemented some additional tracking to record data of searches made in order to produce graphs for the client to see the most search for terms/products and how they change over time.