Jul 7, 2014. Sphinx in Docker. The basics.

With an ear to the interwebs, you’ll hear a few things about Docker. Docker is an open platform for developers and sysadmins to build, ship, and run distributed applications. In this blog post, I’m going to outline a very basic example of how to use Sphinx from within a Docker container.

What is Docker?

This question is answered quite nicely here. The main points (taken directly from the site):

  • With Docker, developers can build any app in any language using any toolchain. “Dockerized” apps are completely portable and can run anywhere.
  • Sysadmins use Docker to provide standardized environments for their development, QA, and production teams, reducing “works on my machine” finger-pointing. By “Dockerizing” the app platform and its dependencies, sysadmins abstract away differences in OS distributions and underlying infrastructure.
  • How is Docker different from virtual machines? The Docker Engine container comprises just the application and its dependencies. It runs as an isolated process in userspace on the host operating system, sharing the kernel with other containers. Thus, it enjoys the resource isolation and allocation benefits of VMs but is much more portable and efficient.

They present a nice list of use cases here. Take a look at them for inspiration.

Docker and Sphinx

I’ll leave it up to you to decide what this means for your team, but for the curious, the rest of this post will outline what it might look like to use Sphinx in a Docker container.

To follow along, download Docker and check out this GitHub repo (which contains a Dockerfile, a short and sweet Sphinx configuration example, and some .sh files). You may notice that I’ve already outlined the following steps there.

Or!

Go pull this image from the Docker hub (and skip the build step):

sudo docker.io pull stefobark/sphinxdocker

Clone and build

In this example, I’m running Ubuntu 14.04, so “docker” becomes “docker.io”. After cloning the GitHub repo, your next step will be to go to the appropriate directory and build the image.

sudo docker.io build -t sphinx .

The Dockerfile consists of a list of commands that add the Sphinx PPA and install Sphinx-beta, create some directories, ADD our .sh files, and expose port 9306 (which we’ll map to a host port when we run the container). After running the command above, Docker will run through the steps outlined in the Dockerfile and should eventually tell you that the image was successfully built.

Run

Now, with our image successfully built, we can start up our Sphinx container. Like this:

sudo docker.io run -p 9311:9306 -v /path/to/local/sphinx/conf:/etc/sphinxsearch/ -v /local/data/directory:/var/lib/sphinx -d sphinx ./indexandsearch.sh

What’s happening?

  • -p 9311:9306 maps port 9311 on the host machine to port 9306 within the container. In our Sphinx configuration file, we have Sphinx listening for the MySQL protocol on 9306, so we should be able to open up the command line interface to Sphinx on the host machine’s port 9311.
  • -v adds volumes to the container, so we can keep the Sphinx configuration file and the various index data files on the host machine. This is quite handy, and it’s pretty important for the index files, since anything written only inside the container is lost when the container is removed.
  • -d daemonizes the container
  • ./indexandsearch.sh finally runs indexer and starts searchd.
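
To make the moving parts of that command easier to see, here is a minimal sketch of a helper that assembles it from three values: the host port, the host configuration directory, and the host data directory. The function name sphinx_run_cmd is hypothetical and not part of the repo. It only prints the command, so you can review it before running it yourself.

```shell
# Hypothetical helper (not from the repo): print the docker run command
# for a given host port, config directory, and data directory.
sphinx_run_cmd() {
  host_port="$1"   # port on the host, e.g. 9311
  conf_dir="$2"    # host directory holding sphinxy.conf
  data_dir="$3"    # host directory for the index files
  printf '%s\n' "sudo docker.io run -p ${host_port}:9306 -v ${conf_dir}:/etc/sphinxsearch/ -v ${data_dir}:/var/lib/sphinx -d sphinx ./indexandsearch.sh"
}
```

Calling sphinx_run_cmd 9311 /path/to/local/sphinx/conf /local/data/directory prints the same command used in this post. Only the host side of each mapping changes; the container side (9306, /etc/sphinxsearch/, /var/lib/sphinx) stays fixed.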

Is it running?

Now is a good time to check:

sudo docker.io ps
CONTAINER ID        IMAGE                     COMMAND                CREATED             STATUS              PORTS                    NAMES
250a08d4adc1        sphinx:latest             ./indexandsearch.sh    8 seconds ago       Up 3 seconds        0.0.0.0:9311->9306/tcp   backstabbing_fermi   
8fd479563a87        stefobark/mysql:latest    /run.sh                2 days ago          Up 36 hours         0.0.0.0:3311->3306/tcp   elegant_albattani

You can see the container is running: the status column shows us that it’s been up for 3 seconds. You can also see that we have a running MySQL container.

If you don’t see a running Sphinx container, take a look at the logs to see what happened. Grab the container id and check docker.io logs (container id). It should be pretty easy to see what went wrong and adjust accordingly.
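
If you would rather not copy the container id by hand, something like the following sketch could pick it out of the ps listing. The function name container_id_for_image is made up for this example, and it assumes the column layout shown in the listing above (container id first, image second).

```shell
# Hypothetical helper: read `docker ps`-style output on stdin and print
# the id of the first container whose IMAGE column equals the argument.
container_id_for_image() {
  # NR > 1 skips the header row; $1 is CONTAINER ID, $2 is IMAGE.
  awk -v img="$1" 'NR > 1 && $2 == img { print $1; exit }'
}
```

One possible use: sudo docker.io logs "$(sudo docker.io ps -a | container_id_for_image sphinx:latest)". The -a flag includes stopped containers, which is what you want when the container exited right after starting.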

sphinxy.conf

When we started this Sphinx container, we shared the directory on our host machine containing our very basic configuration file. In that file, “sphinxy.conf”, I’ve pointed Sphinx to a MySQL datasource. To set up this source, I got the container ID of the running MySQL container and ran docker.io inspect (container id).

Here is the result of inspecting that container:

 "NetworkSettings": {
        "IPAddress": "172.17.0.2",
        "IPPrefixLen": 16,
        "Gateway": "172.17.42.1",
        "Bridge": "docker0",
        "PortMapping": null,
        "Ports": {
            "3306/tcp": [
                {
                    "HostIp": "0.0.0.0",
                    "HostPort": "3309"
                }
            ]
        }

I took the gateway address and the host port from this output and used them to define the datasource. (This could, obviously, be different for you; you can point Sphinx to whatever datasource you like.)

source src1
{
    type              = mysql
    sql_host          = 172.17.42.1
    sql_user          = admin
    sql_pass          =
    sql_db            = testing
    sql_port          = 3309 
 
    sql_query         = select * from test
    sql_field_string  = content
    sql_attr_uint     = g_id
}
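
The two values that ended up in sql_host and sql_port are the Gateway and HostPort fields of the inspect output. As a rough sketch, they could be pulled out with a small text filter like the one below. The function name json_string_field is hypothetical, and the approach is deliberately naive: it assumes each "key": "value" pair sits on its own line, as in the output shown above, and it is not a real JSON parser.

```shell
# Hypothetical helper: print the first string value for a given key
# from pretty-printed JSON on stdin (one "key": "value" pair per line).
json_string_field() {
  sed -n "s/.*\"$1\": *\"\([^\"]*\)\".*/\1/p" | head -n 1
}
```

For example, sudo docker.io inspect (container id) | json_string_field Gateway should print the address to use for sql_host, and the same call with HostPort should print the port. Running sudo docker.io port (container id) 3306 is another way to see which host port the container's 3306 is mapped to.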

This configuration file is really bare (I’m using a lot of default settings). If you take a look at the rest of the file, note the paths it sets and that (as mentioned above) we’re telling searchd to listen for the MySQL protocol on 9306.

Go check our documentation to learn more about these settings.

indexandsearch.sh and searchd.sh

indexandsearch.sh is simple:

#!/bin/bash
 
/usr/bin/indexer -c /etc/sphinxsearch/sphinxy.conf test
./searchd.sh

It runs indexer, using sphinxy.conf, which has been mounted to the container from the host machine directory we just shared with the -v option. It finishes by running searchd.sh (which… starts searchd).
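
One thing the script above does not do is stop when indexing fails. A possible variant, sketched here as an assumption rather than as what the repo contains, only starts searchd if indexer exits successfully. The function name index_then_search and the INDEXER, SEARCHD, and CONF variables are all hypothetical; the defaults mirror the paths used in indexandsearch.sh.

```shell
# Hypothetical variant of indexandsearch.sh (not from the repo):
# start searchd only if indexer succeeded.
INDEXER="${INDEXER:-/usr/bin/indexer}"
SEARCHD="${SEARCHD:-./searchd.sh}"
CONF="${CONF:-/etc/sphinxsearch/sphinxy.conf}"

index_then_search() {
  # Build the "test" index; bail out with an error if that fails.
  if ! "$INDEXER" -c "$CONF" test; then
    echo "indexer failed; not starting searchd" >&2
    return 1
  fi
  # Hand over to the script that starts searchd.
  "$SEARCHD"
}
```

In a script, the last line would simply call index_then_search. If indexing fails, the container then exits with a clear message in docker.io logs instead of starting searchd without a usable index.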

Command line search

Now, to be certain that Sphinx has successfully indexed our test data and is ready to serve searches, let’s try this (from the host machine):

mysql -h127.0.0.1 -P9311

or

mysql -h0.0.0.0 -P9311

or, even quicker

mysql -h0 -P9311

This should result in:

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 2.2.3-id64-beta (r4690)
 
Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved.
 
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
 
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
 
mysql>

Don’t be fooled. We’re talking to Sphinx here. You can see that the server version is 2.2.3-id64-beta (r4690). Now, go ahead and try searching the ‘test’ index.

select * from test;
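
The same check can also be run without opening an interactive session, which is handy in scripts. The sketch below wraps the mysql client's -e option (run one statement and exit); the function name sphinx_query is hypothetical, and it assumes the mysql client is installed on the host.

```shell
# Hypothetical helper: send one SphinxQL statement to searchd through
# the mysql client and print the result.
sphinx_query() {
  port="$1"   # host port mapped to searchd, e.g. 9311
  sql="$2"    # statement to run
  mysql -h 127.0.0.1 -P "$port" -e "$sql"
}
```

For example: sphinx_query 9311 'select * from test limit 5'.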

At this point, if everything went well, you should see whatever data you chose to index. That’s it. Docker seems pretty convenient. Give it a try.

Video

To finish, here’s a video that goes through each of these steps, demonstrating what it looks like to run Sphinx in a container:



Have you tried Docker with Sphinx? I’d be happy to hear what you thought about it.


2 Responses to “Sphinx in Docker. The basics.”

  1. Dave says:

    Any chance to get it on a centos container?

  2. steve says:

    Hi Dave, funny you should ask. Just got a comment from Leonardo Di Donato, on the video, about a Sphinx image he built with centos. Here it is:

    Some months ago I’ve created a docker image for Sphinx Search (beta and stable) and open sourced them on github (i.e., https://github.com/leodido/dockerfiles/tree/master/sphinxsearch:latest).
    This image is built on a CentOS image and compiles Sphinx Search from source (with support for libstemmer, mysql, odbc, xml with iconv, etc. etc.). It also provides directories to mount volumes ..
    Since this docker image has been published on the official docker repository (i.e. https://registry.hub.docker.com/u/leodido/sphinxsearch) you can use it simply typing “docker.io pull leodido/sphinxsearch” into your console ..
