Friday 27 May 2016

The Best Django Haystack and Elasticsearch Tutorial



Search is one of the most basic requirements of any modern web app. It helps your users find relevant content, articles, products and so on. I have been developing web applications for a decade now, and frankly I have not seen a single web app that does not implement search. Without search, an app’s user experience suffers: most users will not be able to reach the content they are looking for, and in that case they are very unlikely to ever return to your app. With that said, today’s article shows how you can set up search in your django web app using haystack and elasticsearch. Let’s look at what role each of these plays.

The Django Framework

Django is one of the most popular web application frameworks written in python, and it is built for developing web applications quickly and within deadlines. If you have reached this article, I assume you already have a fair knowledge of django and how it works. So we are not going to go into details here and will move straight on to the two applications that we need to implement search in our django app.

Elasticsearch

Elasticsearch is a lucene based search server that offers full text search and supports schema-less JSON documents. In layman’s terms, this is the search engine that indexes (stores) the searchable content for your web application. Elasticsearch is written in Java and can be used as a stand-alone service. It offers an API that can be used to communicate with the search server over HTTP.
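To make the HTTP API a little more concrete, here is a minimal Python 3 sketch that builds a full text search request for one index. The helper names and the index name products are assumptions made for illustration; the _search endpoint and the query_string query are part of the elasticsearch query DSL. Sending the request only works once a server is running, so the sketch keeps building and sending separate.

```python
import json
from urllib.request import Request, urlopen


def build_search_request(base_url, index, text):
    """Build (but do not send) a full text search request for one index."""
    url = "{}/{}/_search".format(base_url.rstrip("/"), index)
    # query_string runs the text against the indexed fields.
    body = {"query": {"query_string": {"query": text}}}
    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=5):
    """Send the request and decode the JSON reply (needs a running server)."""
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling send(build_search_request("http://127.0.0.1:9200/", "products", "red shoes")) would return the raw JSON reply. This is the kind of low level exchange that haystack takes care of for you.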

Haystack

Haystack is a django app that acts as a wrapper over multiple search backends (servers), including elasticsearch, solr etc. Different search servers may work differently, communicate differently and serve their search results differently. Haystack implements generic methods and classes that can be used with multiple search servers. So, for instance, say you have been using elasticsearch for your web application but due to some requirement you need to switch to another search server (for example Solr). Just imagine all the work you’d have to put in to change the entire search flow just because you chose to try another search engine. With haystack it mostly takes a small settings change, followed by a rebuild of the index, and your search keeps working as it currently does.
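The wrapper idea can be sketched in a few lines of plain Python. This is only a toy illustration of the pattern, not haystack’s actual classes: both backends below are hypothetical and keep their data in memory, but they expose the same index and search methods, so the calling code does not change when the backend does.

```python
class SubstringBackend:
    """Toy backend: stores raw text and matches by substring."""

    def __init__(self):
        self.documents = {}

    def index(self, doc_id, text):
        self.documents[doc_id] = text

    def search(self, term):
        term = term.lower()
        return sorted(
            doc_id for doc_id, text in self.documents.items()
            if term in text.lower()
        )


class InvertedIndexBackend:
    """Toy backend with a different internal layout: word -> set of ids."""

    def __init__(self):
        self.inverted = {}

    def index(self, doc_id, text):
        for word in text.lower().split():
            self.inverted.setdefault(word, set()).add(doc_id)

    def search(self, term):
        return sorted(self.inverted.get(term.lower(), set()))


# Switching backends is a matter of picking a different key.
BACKENDS = {"substring": SubstringBackend, "inverted": InvertedIndexBackend}


def find_products(backend, term):
    """Application code only relies on the shared search() method."""
    return backend.search(term)
```

Whichever key is picked from BACKENDS, find_products is called in exactly the same way. That is the property haystack provides for real search servers.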

Now we have a basic idea of what django, elasticsearch and haystack are and what role each of them plays. So let’s put them all together, and I promise you it won’t take more than 10 minutes to have full text search up and running for your django web application.

Django Haystack and Elasticsearch

I am assuming that you already have a django app to work on, so we are going to dive directly into setting up haystack and elasticsearch. Let’s first install haystack using pip.

pip install django-haystack

The above will install the latest stable release of the django-haystack package. In case you are looking to install the development version of haystack, you can use the following command instead.


pip install -e git+https://github.com/toastdriven/django-haystack.git@master#egg=django-haystack


Once you have installed django-haystack, add it to INSTALLED_APPS in your django project’s settings file.


INSTALLED_APPS = [
    ...
    'haystack',
]


Now let’s download, unzip and run elasticsearch before we configure haystack to use it. You can download the latest version of elasticsearch from the official elasticsearch website. Once you have downloaded the package, unzip it at a location of your choice on your computer. For this example I am going to extract it into the ~/es folder.


mkdir ~/es
mv ~/Downloads/elasticsearch-5.0.0-alpha2.tar.gz ~/es
cd ~/es
tar -xzvf elasticsearch-5.0.0-alpha2.tar.gz
cd elasticsearch-5.0.0-alpha2

Once the files from the package have been extracted you can simply start your elasticsearch server using the following command.


./bin/elasticsearch -d



Once elasticsearch has started, you can confirm it by going to http://127.0.0.1:9200/ in your web browser. If elasticsearch started successfully, you’ll see some information like the following.

[Screenshot: elasticsearch response in the browser]
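The same check can be scripted instead of done in the browser. Below is a minimal Python 3 sketch; the function names are made up for this example, and it assumes the JSON returned by the root URL contains a version object with a number field, which is what elasticsearch normally reports there.

```python
import json
from urllib.request import urlopen


def parse_version(payload):
    """Pull the version number out of the JSON text served at the root URL."""
    info = json.loads(payload)
    return info["version"]["number"]


def elasticsearch_version(url="http://127.0.0.1:9200/", timeout=5):
    """Ask a running server for its version.

    Raises urllib.error.URLError if nothing is listening at the address.
    """
    with urlopen(url, timeout=timeout) as response:
        return parse_version(response.read().decode("utf-8"))
```

If elasticsearch_version() raises an error, the server is most likely not running yet or is listening on a different address.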

Great! Now let’s tell haystack that we want to use the elasticsearch server to index our searchable content. In order to do so we’ll need to define a backend setting for haystack in our project’s settings.py file.


HAYSTACK_CONNECTIONS = {
    'default': {
        'ENGINE': 'haystack.backends.elasticsearch_backend.ElasticsearchSearchEngine',
        'URL': 'http://127.0.0.1:9200/',
        'INDEX_NAME': 'products',
    },
}

Add the above to your settings.py file. The ENGINE parameter tells haystack which backend engine to use; in our case we set it to elasticsearch. INDEX_NAME defines the default index that your data is stored in and queried from. The URL option tells haystack the address where elasticsearch is available.



Well, now we have the basic setup up and running. Let’s add some data to our search index. You can do this according to the needs of your project. For this example I am going to index the data from a model named Product in my app called catalog.

This is what my Product model looks like.


from django.db import models

class Product(models.Model):
    name = models.CharField(max_length=244)
    description = models.TextField()

    def __unicode__(self):
        return self.name

In order to store my product data in elasticsearch, I need to create a search_indexes.py file in the application where my models reside. So I will create the file first.


touch apps/catalog/search_indexes.py

Now I will create a ProductIndex in this file to tell haystack what data from my Product model I want to store in the search engine.


from haystack import indexes

from apps.catalog.models import Product


class ProductIndex(indexes.SearchIndex, indexes.Indexable):
    text = indexes.CharField(document=True, use_template=True)
    name = indexes.CharField(model_attr='name')
    description = indexes.CharField(model_attr='description')

    def get_model(self):
        return Product

    def index_queryset(self, using=None):
        return self.get_model().objects.all()
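Once the index has been built with rebuild_index, an index like this can also be queried directly from Python through haystack’s SearchQuerySet. The sketch below is a hedged example: the function name search_products and the limit parameter are assumptions, and it only works inside a configured django project where the apps.catalog app shown above exists.

```python
def search_products(text, limit=10):
    """Return up to `limit` Product instances whose document matches `text`."""
    # Imported lazily so this module can be loaded before django is set up.
    from haystack.inputs import AutoQuery
    from haystack.query import SearchQuerySet

    from apps.catalog.models import Product

    # `content` targets the field declared with document=True ("text" here).
    results = SearchQuerySet().models(Product).filter(content=AutoQuery(text))
    # Each result wraps an indexed document; .object loads the model instance.
    return [result.object for result in results[:limit]]
```

From a django shell, search_products("shoes") would then return matching Product objects, assuming some have been indexed.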


In our product index we told haystack that we’d like to use a template for our text field, so we’ll create that template now. In your main templates directory, create a file called search/indexes/catalog/product_text.txt. Make sure you change the path to use your own app name and model name. So if your app is called blog and your model and index are called Post and PostIndex respectively, then you should create a file called search/indexes/blog/post_text.txt.

Now we’ll add the searchable information to the template file we just created.
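The naming rule for these template files can be written down as a small helper. This is just an illustrative sketch (the function name is made up); it follows the pattern used in the two examples above: app name, then the lowercased model name, then the name of the index field that uses the template.

```python
def index_template_path(app_label, model_name, field_name="text"):
    """Build the template path haystack looks up for a templated index field."""
    return "search/indexes/{}/{}_{}.txt".format(
        app_label, model_name.lower(), field_name
    )
```

For the catalog app this gives search/indexes/catalog/product_text.txt, and for a Post model in a blog app it gives search/indexes/blog/post_text.txt.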


{{ object.name }}
{{ object.description }}


As I had only two fields and wanted to use both in the search template, I added them both. You can use any text fields from your model that you’d like to search on.

The next thing we’ll do is add the search urls to our URLconf (urls.py) file.
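To see why both fields become searchable, it helps to picture what the template produces for one product. The following is a rough stand-in in plain Python, not haystack’s real rendering (which goes through django’s template engine) and not how elasticsearch matches text (it analyzes words rather than comparing substrings). The Product tuple and both helpers are hypothetical.

```python
from collections import namedtuple

# Stand-in for the django model, with just the two indexed fields.
Product = namedtuple("Product", ["name", "description"])


def render_document(product):
    """Mimic the two-line template: one field per line."""
    return "{}\n{}\n".format(product.name, product.description)


def matches(document, term):
    """Very rough stand-in for a full text match."""
    return term.lower() in document.lower()
```

A term that appears only in the name and a term that appears only in the description both match, because the search runs against the single combined document.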


# requires: from django.conf.urls import include, url
url(r'^search/', include('haystack.urls')),

Now add a template at templates/search/search.html that implements a search form and shows the search results.

{% extends 'base.html' %}

{% block content %}
    <h2>Search</h2>

    <form method="get" action=".">
        <table>
            {{ form.as_table }}
            <tr>
                <td>&nbsp;</td>
                <td>
                    <input type="submit" value="Search">
                </td>
            </tr>
        </table>

        {% if query %}
            <h3>Results</h3>

            {% for result in page.object_list %}
                <p>
                    <a href="{{ result.object.get_absolute_url }}">{{ result.object.name }}</a>
                </p>
            {% empty %}
                <p>No results found.</p>
            {% endfor %}

            {% if page.has_previous or page.has_next %}
                <div>
                    {% if page.has_previous %}<a href="?q={{ query }}&amp;page={{ page.previous_page_number }}">{% endif %}&laquo; Previous{% if page.has_previous %}</a>{% endif %}
                    |
                    {% if page.has_next %}<a href="?q={{ query }}&amp;page={{ page.next_page_number }}">{% endif %}Next &raquo;{% if page.has_next %}</a>{% endif %}
                </div>
            {% endif %}
        {% else %}
            {# Show some example queries to run, maybe query syntax, something else? #}
        {% endif %}
    </form>
{% endblock %}
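The pagination links in this template pass the search text in a q parameter and the page number in a page parameter. If you want to build such links in Python, for example in a test or a redirect, a small helper like the following sketch keeps the query properly escaped. The function name and the /search/ base path are assumptions matching the URL pattern used in this article.

```python
from urllib.parse import urlencode


def search_url(query, page=None, base="/search/"):
    """Build a link to the search view for a query and an optional page."""
    params = {"q": query}
    if page is not None:
        params["page"] = page
    # urlencode escapes spaces, ampersands and other special characters.
    return "{}?{}".format(base, urlencode(params))
```

For example, search_url("red shoes", page=2) yields /search/?q=red+shoes&page=2.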

That’s it. Now we’ll build the index by running the following command.

python manage.py rebuild_index
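If you run this step from a deployment script, keep in mind that rebuild_index asks for confirmation before it wipes the index. A hedged sketch of a wrapper is shown below; the function names are made up, and it assumes manage.py is in the current directory. Haystack also ships update_index, which refreshes the index without clearing it first, and clear_index.

```python
import subprocess
import sys

ACTIONS = ("rebuild_index", "update_index", "clear_index")


def index_command(action="rebuild_index", noinput=True, manage_py="manage.py"):
    """Return the argument list for one of haystack's index commands."""
    if action not in ACTIONS:
        raise ValueError("unknown index action: {}".format(action))
    command = [sys.executable, manage_py, action]
    # rebuild_index and clear_index prompt unless --noinput is passed.
    if noinput and action != "update_index":
        command.append("--noinput")
    return command


def run_index_command(**kwargs):
    """Run the command and raise CalledProcessError if it fails."""
    return subprocess.run(index_command(**kwargs), check=True)
```

Calling run_index_command() then does the same as the command above, without the confirmation prompt.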


Now you have full text search enabled on your django project. In case elasticsearch does not suit your requirements, or you want to try some other search server, all you need to do is install and run the new search server, change the engine settings in your settings.py file and finally run python manage.py rebuild_index. For example, if we switch to Solr:

HAYSTACK_CONNECTIONS = {
    'default': {
        'ENGINE': 'haystack.backends.solr_backend.SolrEngine',
        'URL': 'http://127.0.0.1:8983/solr',  # Solr's default address; 9200 is the elasticsearch port
        'INDEX_NAME': 'products',
    },
}
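One way to make this switch even easier is to pick the backend from an environment variable in settings.py. The sketch below is only one possible arrangement: the variable name SEARCH_BACKEND and the helper are hypothetical, the engine paths are the ones haystack ships, and the Solr address assumes a default local install on port 8983.

```python
import os

# One entry per backend we might want to run against.
ENGINES = {
    "elasticsearch": {
        "ENGINE": "haystack.backends.elasticsearch_backend.ElasticsearchSearchEngine",
        "URL": "http://127.0.0.1:9200/",
        "INDEX_NAME": "products",
    },
    "solr": {
        "ENGINE": "haystack.backends.solr_backend.SolrEngine",
        "URL": "http://127.0.0.1:8983/solr",
    },
}


def haystack_connections(environ=os.environ):
    """Build the HAYSTACK_CONNECTIONS dict for the selected backend."""
    name = environ.get("SEARCH_BACKEND", "elasticsearch")
    return {"default": dict(ENGINES[name])}


# In settings.py:
# HAYSTACK_CONNECTIONS = haystack_connections()
```

After changing the variable you would still run python manage.py rebuild_index so that the new server gets its own copy of the data.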
