Understanding full-text search for Rails developers

If you’re a Rails developer, you have probably come across the `pg_search` gem. It lets you do full-text search in a Rails app very easily. All you need to do is add the gem to your Gemfile, run bundle install, and then include the module in the Rails model that you want to search against.

After all that, you tested the search functionality and everything worked nicely, so you decided to improve performance by searching against a dedicated `tsvector` column. For example, you have the following code:

pg_search_scope :search_by_name, against: :name, using: {
  tsearch: {
    tsvector_column: "tsvector_name_tsearch",
    dictionary: "english",
    prefix: true
  }
}

Now you’re happy with the performance boost, but you also realize that the search no longer returns all the results it did before. For example, you have an article named `Business landing page`; searching for `landi` or `landin` returns nothing at all, while `landing` returns the correct result. So why is that?

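If you look closely, you can see that the code above uses the `english` dictionary to build the index. With this dictionary PostgreSQL does not store the words as written: it drops stop words and reduces each remaining word to its stem (lexeme). So for the content `Business landing page`, here is the list of lexemes that PostgreSQL indexes:

"Business landing page" =>
  ["busi",
   "land",
   "page"
  ]

The search term goes through the same dictionary before it is compared with the index. `landing` is reduced to `land`, which matches. `landi` and `landin` are not English word forms the stemmer knows how to reduce, so they stay as they are, and with `prefix: true` they can only match lexemes that start with `landi` or `landin`. The stored lexeme is `land`, so nothing is returned.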
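To make the matching rule concrete, here is a small plain-Ruby sketch of what a prefix search does against stemmed lexemes. It does not talk to PostgreSQL or use `pg_search`. The lexeme list and the `STEMMED_QUERY` table are hand-written assumptions standing in for the `english` stemmer, and `prefix_match?` is a hypothetical helper name.

```ruby
# Lexemes assumed to be stored for "Business landing page" under the
# english dictionary (hand-written here, not read from a database).
STORED_LEXEMES = %w[busi land page].freeze

# Hypothetical stand-in for the english stemmer: only the terms used in
# this post are listed; any other term is just lower-cased.
STEMMED_QUERY = {
  "landing" => "land",
  "landin"  => "landin",
  "landi"   => "landi"
}.freeze

# Mimics a prefix search: normalize the term first, then check whether
# any stored lexeme starts with the normalized term.
def prefix_match?(term)
  lexeme = STEMMED_QUERY.fetch(term.downcase, term.downcase)
  STORED_LEXEMES.any? { |stored| stored.start_with?(lexeme) }
end

%w[landi landin landing].each do |term|
  puts "#{term}: #{prefix_match?(term)}"
end
```

Only `landing` prints `true`, because it is the only term that is reduced to something the stored lexeme `land` starts with.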

More about dictionaries

  • Simple dictionary: The simple dictionary template operates by converting the input token to lower case and checking it against a file of stop words. If it is found in the file then an empty array is returned, causing the token to be discarded. If not, the lower-cased form of the word is returned as the normalized lexeme.
  • Synonym dictionary: This dictionary template is used to create dictionaries that replace a word with a synonym.
  • Thesaurus dictionary: A thesaurus dictionary is a collection of words that includes information about the relationships of words and phrases.
  • Ispell dictionary: The Ispell dictionary template supports morphological dictionaries, which can normalize many different linguistic forms of a word into the same lexeme. For example, an English Ispell dictionary can match all declensions and conjugations of the search term bank, e.g., banking, banked, banks, banks', and bank's.
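The behaviour of the simple dictionary described in the first bullet can be sketched in a few lines of plain Ruby. This is only an illustration of the rule, not PostgreSQL's implementation: the stop-word list is made up for the example (a real dictionary reads it from a file), and `simple_dictionary` and `normalize` are hypothetical names.

```ruby
# Illustrative stop-word list; assumed for this example only.
STOP_WORDS = %w[a an and the of].freeze

# Lower-case the token; return an empty array for a stop word (the token
# is discarded), otherwise return the lower-cased word as the lexeme.
def simple_dictionary(token, stop_words = STOP_WORDS)
  word = token.downcase
  stop_words.include?(word) ? [] : [word]
end

# Run every whitespace-separated token of a text through the dictionary.
def normalize(text)
  text.split.flat_map { |token| simple_dictionary(token) }
end

p normalize("The Business landing page")
```

Because the words are lower-cased but not stemmed, `landing` is kept whole, so a prefix such as `landi` would have something to match against.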

 
