Chapter 6. Searching the Product Catalog

Search is an extremely powerful and important addition to any e-commerce website. A lousy search engine means the customer will not find what they want, will not make a purchase, and will probably go someplace else. Far too many websites fail to provide their users with an adequate search engine. It's not clear, exactly, why this has been so, but it could be partly due to the difficulty of implementing search technology.

Not only is Django easy to integrate with open source search engine software, but there are dozens of community projects that add sophisticated search functionality to any Django app with a minimum of effort. This chapter will explore Django and search, including:

  • The naïve, but simple, search strategy
  • Simple MySQL-based index searches
  • An overview of open source search engine tools
  • Configuring the Sphinx, Whoosh, and Xapian engines
  • Using the Haystack Django search module

Stupidly simple search

By far the easiest way to add search in Django is to use the ORM filter method with a contains lookup. We'll call this the naïve approach to search, but it can be useful for quick, relatively simple search needs or when we specifically want to search a single field for a literal query string.

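The contains lookup is translated into a SQL LIKE expression, which matches the search term as a substring of the table column. Django also supports the icontains lookup, which performs the same match case-insensitively; on PostgreSQL, for example, it uses the ILIKE operator. Whether contains is actually case-sensitive depends on the database backend: on SQLite, LIKE ignores case for ASCII characters, so contains behaves like icontains there.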

A naïve filter-based search can be performed like so:

>>> results = Product.objects.filter(name__icontains='cranberry sauce')

Note that the icontains lookup requires the whole search term to appear, as one contiguous string, somewhere in the field contents. Thus in the previous example any Product object with a name field containing just 'cranberry' or just 'sauce' will not be included in the results. Only names that contain the complete phrase 'cranberry sauce' are matched.
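The substring behavior can be seen without Django by running the equivalent LIKE query directly. The following is a minimal sketch using Python's built-in sqlite3 module; the product table, the sample names, and the phrase_search helper are hypothetical and exist only for illustration. It relies on SQLite's default LIKE behavior, which ignores case for ASCII characters.

```python
import sqlite3

# Hypothetical product names, used only for illustration.
NAMES = [
    "Cranberry Sauce",
    "Whole Cranberry Sauce (Canned)",
    "Cranberry Juice",
    "Apple Sauce",
]


def phrase_search(phrase):
    """Return names that contain `phrase` as one contiguous substring."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT)")
        conn.executemany(
            "INSERT INTO product (name) VALUES (?)", [(n,) for n in NAMES]
        )
        # Wrapping the term in % wildcards turns LIKE into a substring match.
        # For brevity this sketch does not escape % or _ inside the phrase.
        rows = conn.execute(
            "SELECT name FROM product WHERE name LIKE ? ORDER BY id",
            ("%" + phrase + "%",),
        )
        return [name for (name,) in rows]
    finally:
        conn.close()


print(phrase_search("cranberry sauce"))
# ['Cranberry Sauce', 'Whole Cranberry Sauce (Canned)']
```

'Cranberry Juice' and 'Apple Sauce' each contain one of the two words, but neither contains the full phrase, so neither is returned.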

We can effect a more keyword-like approach by splitting our search term on whitespace and passing the resulting list to Django's in lookup:

>>> terms = 'cranberry sauce'.split()
>>> results = Product.objects.filter(name__in=terms)

Here the results QuerySet will include any Product object whose name exactly matches one of the terms 'cranberry' or 'sauce'. Now we've created the opposite problem, namely that a Product whose name field is 'cranberry sauce' (the exact term used for searching) will not be found in the results QuerySet. There are many similar hacks one could attempt, but they would only marginally improve our results.
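To make the trade-off concrete, the sketch below mimics both strategies in plain Python over an in-memory list, so it runs without a database. The helper names and sample data are hypothetical. The second helper imitates one common hack: matching a name if it contains any single term, which in Django is typically written by OR-ing together Q objects that each use an icontains lookup.

```python
# Hypothetical product names, used only for illustration.
NAMES = [
    "cranberry sauce",
    "cranberry",
    "sauce",
    "cranberry juice",
    "apple sauce",
]


def exact_in(query, names):
    """Mimic an `in` lookup: the whole name must equal one of the terms."""
    terms = set(query.split())
    return [name for name in names if name in terms]


def any_term_contains(query, names):
    """Mimic OR-ed case-insensitive substring matches, one per term."""
    terms = query.lower().split()
    return [name for name in names if any(t in name.lower() for t in terms)]


print(exact_in("cranberry sauce", NAMES))
# ['cranberry', 'sauce']
print(any_term_contains("cranberry sauce", NAMES))
# ['cranberry sauce', 'cranberry', 'sauce', 'cranberry juice', 'apple sauce']
```

The first helper misses the product named 'cranberry sauce' entirely, while the second returns every product that mentions either word, with the best match given no special treatment.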

Even if these methods returned better results, they still lack any kind of ordering. The order_by ORM method can be used to order on fields, but this is completely disconnected from our search. There is no concept of relevance when using the ORM filter method. This is as it should be because most of the time this functionality is not desired. The ORM methods are designed to be quick methods of retrieving database objects.
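As a rough illustration of what relevance means, consider the toy scoring function below. It is only a sketch under simple assumptions: the score is the number of query terms that appear in a product name, and the function name and sample data are hypothetical. Real search engines use far more sophisticated ranking.

```python
def rank_by_term_hits(query, names):
    """Order names by how many of the query terms each one contains."""
    terms = query.lower().split()
    scored = []
    for name in names:
        lowered = name.lower()
        hits = sum(1 for term in terms if term in lowered)
        if hits:  # drop names that match no term at all
            scored.append((hits, name))
    # Highest hit count first; list.sort is stable, so ties keep input order.
    scored.sort(key=lambda pair: -pair[0])
    return [name for _, name in scored]


names = ["Apple Sauce", "Cranberry Juice", "Cranberry Sauce", "Plain Yogurt"]
print(rank_by_term_hits("cranberry sauce", names))
# ['Cranberry Sauce', 'Apple Sauce', 'Cranberry Juice']
```

Nothing in filter or order_by computes a score like this; ordering by a model field such as name or price says nothing about how well a row matches the query.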

As you can see, Django's built-in ORM is not designed for the search problem. Its filter methods are of very limited use for the kinds of search functionality to which users have grown accustomed in web applications, especially in the e-commerce space. Amazon.com's extremely sophisticated search functionality is the standard many users now expect.

To achieve better search results, we must move on to more sophisticated tools. Django has very limited support for searching built into the framework, but in the next section we will examine the primary method that is included. It's not fully automated, however, and will require us to directly manipulate our database tables to get up and running.
