Debugging SOLR results | weKnow Inc.

Debugging SOLR results

I recently worked on a project that used SOLR, and the search results were not showing as expected. The client had very specific requirements for the order of the search results, which were made up of complex data types with many fields that affected the result score.

Fortunately, we had a sample result set, so we knew which results a given search term was expected to return. This was critical, as it provided a consistent baseline from which to adjust SOLR boost values. That way, we were able to test how close the results were to our objective.

Even with the sample results, things started getting a little more difficult when we tried to understand how the scores for each search result were pulled together. Does an exact match on a taxonomy term cause a boosted aggregated field to show higher in the results? Does the length of the title field affect the results? These are just a few of the questions we had.

SOLR includes a handy web interface that allows you to query and even debug in detail how each result score is composed, but the output is so detailed that it can be difficult to read:

2.4805353 = weight(ts_title:jame in 134) [ClassicSimilarity], result of:
  2.4805353 = fieldWeight in 134, product of:
    1.0 = tf(freq=1.0), with freq of:
      1.0 = termFreq=1.0
    4.9610705 = idf(docFreq=36, maxDocs=1943)
    0.5 = fieldNorm(doc=134)
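
To make that output a little less opaque, here is a minimal Python sketch that recomputes the score above by hand. It assumes the standard Lucene ClassicSimilarity formulas (tf is the square root of the term frequency, and idf is 1 + ln(maxDocs / (docFreq + 1))). The function names are our own and are not part of SOLR.

```python
import math


def classic_tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency: rarer terms score higher."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight is the product of tf, idf and fieldNorm, as in the explain output."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


if __name__ == "__main__":
    # Values copied from the explain output for ts_title:jame in document 134.
    score = field_weight(freq=1.0, doc_freq=36, max_docs=1943, field_norm=0.5)
    print(f"{score:.4f}")
```

The result agrees with the 2.4805353 that SOLR reports, to within floating-point rounding. It also makes it easier to see that the 0.5 fieldNorm halves what this title match would otherwise contribute.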

Simplifying your life with Splainer for Solr 

A tool that greatly helped us review SOLR search results, and even suggestions, was http://splainer.io/

Installing it is straightforward. These are the steps you need to follow:

  1. Clone the repo to your local computer.
  2. Install the dependencies as mentioned in the README.
  3. Query your SOLR server.

Making Solr available to Splainer

Splainer runs on your host operating system. Many developers today, including myself, use Docker to run a local development environment. This can complicate using Splainer, since it runs on your host OS rather than inside a container.

We can’t use an IP address or a hostname as Docker containers don’t appear as a separate server. Instead, we need to expose the Solr port on our Solr container:

  1. Open the docker-compose.yml file in your editor of choice.
  2. Alternatively, you can create a new docker-compose.override.yml file, as described below.
  3. Locate the solr service in the file.
  4. Add a ports: item, and a port mapping in the format of: portOnHost:portInContainer
  5. Restart the container set: docker-compose kill && docker-compose up -d

When you’re finished, you should have something like this:

solr:
  image: solr:5.5
  ...
  ports:
    # Expose SOLR port to localhost:8001
    - 8001:8983

If you only want to expose the port temporarily, you can instead create a docker-compose.override.yml file to expose the port. Then, add the override file to your .gitignore.
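
Before moving on, it can help to confirm that the port is really reachable from the host. The following is a small sketch using only the Python standard library. The helper name port_is_open is our own, and 8001 is simply the host port from the mapping above, so adjust it to whatever you chose.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host not resolvable.
        return False


if __name__ == "__main__":
    # 8001 is the host side of the 8001:8983 mapping; change it if yours differs.
    print(port_is_open("localhost", 8001))
```

If this prints False after restarting the containers, the port mapping is the first thing to re-check.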

Getting the query 

Now that we have a way to access Solr from our host OS, we need the query that Drupal is making to the server. Queries to Solr are sent as an HTTP GET request. There are a few ways to get that query:

  • Snoop the port using network tools.
  • Dump the Solr container logs using docker-compose logs -ft.
  • Temporarily modify the Search API module to output the query URL.

I decided to hack the Search API module. This isn’t a best-practice approach, but done temporarily, it gets us the query URL exactly as we ran it.

  1. Locate SolrConnectorPluginBase.php in the Search API module.
  2. Locate the search() method of the SolrConnector class.
  3. Add the following right before the return $this->executeRequest():
     // Dump the Solr query URL to screen.
     \Symfony\Component\VarDumper\VarDumper::dump(
       $endpoint->getBaseUri() . $request->getUri() . $request->getRawData()
     );

Using this approach, you will be able to see the query Drupal is sending to the Solr server when making the search:

[Image: Debugging SOLR results]

Splain’n the Solr query 

Once we have the Solr query URL, we can put Splainer.io to work. The URL displayed by our modified Search API module uses our Solr container’s service name. That is, the name in the docker-compose.yml file. 

Splainer.io can’t use this name as it’s running outside of Docker. Since we exposed the Solr port earlier, we can use localhost for the server name, specifying the port:

[Image: Using splainer.io]
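
If you find yourself doing this swap often, it can be scripted. Here is a rough Python sketch that replaces only the host and port of a URL and leaves the path and query string untouched. The service name solr, the core name drupal and the sample query are placeholders for illustration, not values from our project, and the string dumped by the modified module may need some tidying before it parses as a clean URL.

```python
from urllib.parse import urlsplit, urlunsplit


def point_at_host(url: str, host: str = "localhost", port: int = 8001) -> str:
    """Swap the host and port of a URL, keeping scheme, path and query as they are.

    Note: any user:password part of the original URL is dropped.
    """
    parts = urlsplit(url)
    return urlunsplit(parts._replace(netloc=f"{host}:{port}"))


if __name__ == "__main__":
    # Hypothetical URL as seen from inside the Docker network.
    container_url = "http://solr:8983/solr/drupal/select?q=jame&fl=id,score"
    print(point_at_host(container_url))
```

The output is the same query addressed to localhost:8001, which is what we paste into Splainer.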

After we run the query, we can see how each result's score is composed. Splainer makes this easy, as it includes graphs that visualize how much each matching field adds to the search score:

[Image: Using splainer.io]
[Image: Using splainer.io]
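
If you want to compare Splainer's graphs with the raw explain output, you can ask Solr for it directly by adding debugQuery=true to the same query. The sketch below only builds that URL. The helper name with_debug and the sample URL are our own assumptions, and the query string is re-encoded, so it may look slightly different from the original while meaning the same thing.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_debug(url: str) -> str:
    """Return the URL with debugQuery=true and wt=json set, replacing any existing values."""
    parts = urlsplit(url)
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in ("debugQuery", "wt")
    ]
    params += [("debugQuery", "true"), ("wt", "json")]
    return urlunsplit(parts._replace(query=urlencode(params)))


if __name__ == "__main__":
    # Hypothetical query against the port we exposed earlier.
    print(with_debug("http://localhost:8001/solr/drupal/select?q=jame&wt=xml"))
```

Opening the resulting URL in a browser should return a JSON response in which the per-document explanations sit under the debug section.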

Takeaway

http://splainer.io is a great tool to help you debug and refine your Solr configuration for your Drupal site. Do you know of any other tools that help with debugging Solr results? Feel free to add them in the comments.

Happy debugging!