engineering

engineering

Bistri Gource System Branch

Bistri Team October 6, 2011 engineering

Bistri Gource System Branch from bistri on Vimeo.

The Bistri engineering team is proud to show their work in an unconventional way… using lovely flowers, animated by Puppet 🙂

The Bistri project is displayed by Gource as an animated tree with the root directory of the project at its centre. Directories appear as branches with files as leaves. Developers can be seen working on the tree at the times they contributed to the project.

Implement Full Text Search with Postgresql

Bistri Team September 24, 2011 engineering

For our first post, we will talk about the last feature we released on Bistri.com, the contact search.

Previously the contact search was made of database requests with multiple LIKE and OR in the WHERE clause which as you may know lead to high response time and cpu usage. Moreover any typo in the search string change completely the result, e.g. a search on ‘Tom‘ will never bring ‘Thomas‘ as a result because it’s a strict string comparison.

What we wanted for Bistri.com is a quick and typo friendly contact search. With the help of our Postgresql guru Stéphane, we quickly move to a specific Postgres contrib called Trigram (aka pg_trgm).

Install the contrib

Trigram like other contribs is located in

//share/postgresql/contrib

For osx, the following command will install the contrib in the database in parameters :

psql -U postgres  -f /Library/PostgreSQL/9.0/share/postgresql/contrib/pg_trgm.sql

Prepare database

You need to create a special index on the requested column. Let’s say you want to request the name column in the table user, this is the syntax :

create index idx_user_name  on user using GIN(name gin_trgm_ops);

Requesting your column

Now you may want to test, for this you can use the similarity function, which display how the result approachs your request :

select name, similarity(name, 'tom') from user where name % 'tom';

concatenation | similarity
--------------+------------
Toma Sha      |     0.4
Thomas        |     0.3
(2 rows)

Please note that the % operator only will back results whose the similarity amount is higher than 0.3. This a default parameter, you can tune it with the set_limit function.

select set_limit( 0.1 )