For our capstone project, my team is building a REST API for the backend of a full-stack web application. One of the project requirements is to implement a “search by keyword” feature. This basically entails the user entering a search term on the website, clicking a button, and the site would display results based on the search term entered by the user. On the backend, we are creating an API endpoint that accepts the search term from the URL query string as an input parameter, queries the database for records that include a match based on the search term, and returns all such records, if any.
As a specific example use case, suppose a user wants to search on the word “camping.” Ideally, our application should also include results with the words “camp” and “camps” as well as “camping.” One approach would be to use the SQL LIKE
operator:
SELECT * FROM my_table WHERE column_with_text LIKE 'camp%';
You can see one issue here. We would need to extract “camp” from “camping”, as well from “camps”, “camper” and any other possible forms a word can take. Not the easiest functionality to try to implement. Fortunately, the relational database we are using, PostgreSQL, supports full text search.
PostgreSQL provides two functions for implementing the type of search described above. The first is to_tsvector
, which is passed the name of a database column with text as input and returns a list of basic lexical units of a language, called lexemes. Let’s say we have the following text in our database:
We went camping in the mountains for fun and relaxation.
SELECT to_tsvector(column_with_text) FROM my_table where id = 1;
'camp':3 'fun':8 'mountain':6 'relax':10 'went':2
“camping” is reduced to its lexeme “camp”, and the same occurs for other words.
The second function, plainto_tsquery
, is passed a string and converts it to a lemexe to be searched for.
SELECT plainto_tsquery('camping');
'camp'
So our search using the LIKE
operator
SELECT * FROM my_table WHERE column_with_text LIKE 'camp%';
can instead be accomplished using full text search:
SELECT * FROM my_table
WHERE to_tsvector(column_with_text) @@ plainto_tsquery('camping')
without any need to manipulate the search term or map it to a base term.
We can easily apply the search to multiple columns:
SELECT * FROM my_table
WHERE to_tsvector(column_with_text) @@ plainto_tsquery('camping')
OR to_tsvector(another_column) @@ plainto_tsquery('camping')
OR to_tsvector(thrid_column) @@ plainto_tsquery('camping')
As you can see, PostgreSQL provides support for full text search, and we can use this to meet our project requirement to implement a “search by keyword” feature.