There are a great number of request parameters to configure Solr searches, especially when considering all of the components such as faceting and highlighting. Only the core search parameters that aren't specific to any query parser are listed here. Furthermore, in-depth explanations for some lie further in the chapter.
The parameters affecting the query are as follows:
q
: This refers to the user query, or just query for short. It typically originates directly from user input. The query syntax is determined by the defType parameter.
defType
: This is a reference to the query parser for the user query in q. The default is lucene, with the syntax to be described shortly. You'll most likely use dismax or edismax, discussed later in the chapter.

Prefer DisMax or eDisMax for user queries
For processing queries from users, you should use dismax or edismax, which are described later in the chapter. They support several features that enhance relevancy, and a more limited syntax that prevents a user from getting unexpected results or an error if they inadvertently use the native lucene syntax.

fq
: This is a filter query that limits the scope of the user query, similar to a WHERE clause in SQL. Unlike the q parameter, it has no effect on scoring. This parameter can be repeated as desired. Filtering is described later in the chapter.
qt
: This is a reference to the request handler described earlier. It no longer works with Solr's default configuration.

A query could match any number of the documents in the index, perhaps even all of them, such as in our first example of *:*. Solr doesn't generally return all the documents. Instead, you use the start and rows parameters to tell Solr to return a contiguous series of them. The start and rows parameters are explained as follows:
start (default: 0)
: This is the zero-based index of the first document to be returned from the result set. In other words, this is the number of documents to skip from the beginning of the search results. If this number exceeds the result count, Solr simply returns no documents; this is not considered an error.
rows (default: 10)
: This is the number of documents to be returned in the response, starting at index start. Fewer rows will be returned if there aren't enough matching documents. This number is typically the number of results displayed at a time on your search user interface.

The output-related parameters are explained as follows:
fl
: This parameter accepts a comma- and/or space-delimited list of values that determine which fields will be present in the response documents. This parameter can be specified multiple times. We'll cover the fl parameter details next.
sort
: This refers to a comma-separated field listing to sort on, with a directionality specifier of asc or desc after each field; for example: r_name asc, score desc. The default is score desc. You can also sort by functions, which is a more advanced subject for the next chapter. There is more to sorting than meets the eye; read more about it later in this chapter.
wt
: This is the response format, also known as writer type or query response writer, defined in solrconfig.xml. Since the subject of picking a response format has to do with how you will integrate with Solr, further recommendations and details are left to Chapter 9, Integrating Solr. For now, here is the list of options by name: xml (the default and aliased to standard), json, python, php, phps, ruby, javabin, csv, xslt, velocity.
version
: This refers to the requested version of Solr's response structure, if different than the default. Solr's response format hasn't changed in years. However, if Solr's response structure changes, then it will do so under a new version. Specifying this parameter in requests from client code is a best practice, because it reduces the chances of your client code breaking if Solr is updated.

As noted in the preceding section, the fl parameter is used to specify which fields are included in each of the response documents. The fl parameter accepts a wide range of value types, all of which can be freely mixed together in any order:
Field names listed in the fl parameter cause those fields to be present in the response documents; for example, fl=a_name.
The result of a function call can be returned as a field; for example, fl=sum(1,2,sum(3,4)).
Fields can be aliased (renamed) with the fl=new_name:original_name syntax. The result of a function call can also be aliased with fl=ten:product(2,5).
The relevancy score can be returned by adding score to the fl parameter.
Use * to refer to all fields and/or partially matching field names. For example, if you want only fields that start with a_, you would use fl=a_*.
Document transformers are specified in square brackets; for example, fl=[explain style=text]. Custom transformers can be created using Java, and there are several built-in transformers available:
docid
: This adds the Lucene internal document ID to each document.
shard
: This adds the name of the SolrCloud shard that produced the result to each document (Chapter 10, Scaling Solr, documents SolrCloud).
explain
: This embeds explain information for each document. This transformer accepts an optional style argument set to one of these values: nl, text, or html. Solr's explain is covered in the following section on debugging.
value
: This adds static values to each document. This transformer has one required parameter, v, which sets the value of the field. An optional type parameter t can be set to one of these values: int, double, float, and date. An example is [value v=1 t=double].

Each of the preceding types can be combined and aliased as needed. Here's an example URL that makes use of many of the valid fl values:
http://localhost:8983/solr/mbartists/select?q=*:*&fl=type,a_*&fl=theScore:score,three:sum(1,2),luceneID:[docid]
And here is a sample document from that query response:
<doc>
  <str name="type">Artist</str>
  <str name="a_name">F.D. Project</str>
  <date name="a_release_date_latest">2004-11-30T00:00:00Z</date>
  <float name="theScore">1.0</float>
  <float name="three">3.0</float>
  <int name="luceneID">0</int>
</doc>
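The query, paging, and output parameters described so far are usually sent together in one request. The following Python sketch shows one way to assemble such a URL. It is an illustration under stated assumptions rather than code from Solr itself: the function name build_select_url and its arguments are hypothetical, the host and core name are copied from the example URL above, the page number is treated as zero-based, and the edismax parser is assumed to have a default field or qf configured in the request handler.

```python
from urllib.parse import urlencode

# Assumption: host, port, and core name follow the example URL shown earlier.
SOLR_SELECT = "http://localhost:8983/solr/mbartists/select"


def build_select_url(query, filters=(), page=0, page_size=10,
                     fields=("*", "score"), sort="score desc",
                     def_type="edismax", wt="json"):
    """Assemble a select URL (hypothetical helper, not part of Solr).

    `page` is zero-based; it is converted to Solr's `start` offset.
    """
    params = [
        ("q", query),
        ("defType", def_type),
        ("start", page * page_size),  # number of documents to skip
        ("rows", page_size),          # number of documents to return
        ("fl", ",".join(fields)),
        ("sort", sort),
        ("wt", wt),
    ]
    # fq may be repeated, so each filter becomes its own parameter pair.
    params.extend(("fq", fq) for fq in filters)
    return SOLR_SELECT + "?" + urlencode(params)


url = build_select_url(
    "project",                      # illustrative user input
    filters=["type:Artist"],
    page=2,
    fields=["a_name", "theScore:score", "luceneID:[docid]"],
)
print(url)
```

Passing the parameters as a list of pairs, rather than a dictionary, keeps repeated parameters such as fq intact, and urlencode takes care of escaping characters such as the colon, the brackets, and the space in score desc.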
These diagnostic parameters are helpful during development with Solr. Be sure not to use them, particularly debugQuery, in a production setting because of performance concerns.
indent
: This is a Boolean option that will indent the output to make it easier to read. It works for most of the response formats.
debugQuery
: If true, the search results are followed by a <lst name="debug"> element with diagnostic information. It contains voluminous information about the parsed query string, how the scores were computed, and how long each of the Solr components, such as faceting, took to perform its part of the processing. You may need to use the View Source function of your browser to preserve the formatting used in the score computation section. Debugging queries and enhancing relevancy is documented further in the next chapter.
explainOther
: If you want to determine why a particular document wasn't matched by the query or why it wasn't scored highly enough, then you can set a query for this parameter, such as id:"Release:12345", and the output of debugQuery will be sure to include the first document matching this query.
debug
: This is a parameter to specify individual debugging features. Add a debug parameter pair for each of these values as desired:
  query
  : Returns information about how the query was parsed
  results
  : Returns scoring information for each matching document
  timing
  : Returns component timing information
  true
  : Equivalent to debugQuery=true
echoHandler
: If true, then this emits the Java class name identifying the Solr request handler.
echoParams
: This controls whether or not query parameters are returned in the response header, as seen verbatim earlier. This is used to debug URL encoding issues, or to verify the complete set of parameters in effect: those present in the request (the URL plus HTTP post data) and those defined in the request handler. Specifying none disables this, which is appropriate for production use. The default value is explicit, which causes Solr to include only the parameters present in the request. Finally, you can use all to include those parameters configured in the request handler in addition to those in the URL.
debug.explain.structured
: When true, the result of the score explanation is returned as structured data.

Finally, there is another parameter that is not easily categorized, called timeAllowed. This parameter accepts a value in milliseconds, which is used as the maximum time for a query to complete. If the query does not complete within this time limit, intermediate results are returned. Long-running queries should be very rare, but this allows you to cap them so that they don't overburden your production server.
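As a rough illustration of how the diagnostic parameters and timeAllowed might be combined during development, here is a small Python sketch. The helper names diagnostic_params and is_partial are hypothetical. Two points are assumptions to verify against your Solr version: that explainOther is honored when only debug=results is set, and that a time-limited response is flagged with partialResults in the JSON response header.

```python
import json
from urllib.parse import urlencode


def diagnostic_params(query, explain_other=None, time_allowed_ms=None):
    """Build parameters for a development-time search (hypothetical helper)."""
    params = [
        ("q", query),
        ("wt", "json"),
        ("indent", "true"),
        ("echoParams", "all"),
        # One debug pair per feature, instead of debugQuery=true.
        ("debug", "query"),
        ("debug", "timing"),
    ]
    if explain_other is not None:
        # Assumption: explainOther needs result-level debugging enabled.
        params.append(("debug", "results"))
        params.append(("explainOther", explain_other))
    if time_allowed_ms is not None:
        params.append(("timeAllowed", time_allowed_ms))
    return params


def is_partial(response_text):
    """Report whether a JSON response is flagged as partial.

    Assumption: the flag is named partialResults in responseHeader.
    """
    header = json.loads(response_text).get("responseHeader", {})
    return bool(header.get("partialResults", False))


params = diagnostic_params("*:*", explain_other='id:"Release:12345"',
                           time_allowed_ms=1000)
print(urlencode(params))

# Illustrative response body, written by hand for this example.
sample = '{"responseHeader": {"status": 0, "partialResults": true}}'
print(is_partial(sample))
```

Keeping these parameters in a separate helper makes it easier to leave them out of production requests entirely, in line with the performance warning above.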