Configuring Apache Solr 4.x for Drupal, with password authentication

Most high-traffic or complex Drupal sites use Apache Solr as the search engine. It is much faster and more scalable than Drupal's core search module.

In this article, we describe one of many ways to get a working Apache Solr installation for use with Drupal 7.x, on Ubuntu Server 12.04 LTS. The technique described should work with Ubuntu 14.04 LTS as well.

In a later article, now published, we describe how to install other versions of Solr, using the Ubuntu/Debian way.

Objectives

For this article, we focus on having an installation of Apache Solr with the following objectives:

  • Use the latest stable version of Apache Solr
  • Fewest software dependencies, i.e. no Tomcat server, no full JDK, and no separate Jetty
  • No more complexity than necessary
  • Least amount of software to install and maintain
  • A secure installation

This installation can be done on the same host that runs Drupal, if it has enough memory and CPU, or it can be on the database server. However, it is best if Solr runs on a separate server dedicated to search, with enough memory and CPU.

Installing Java

We start by installing the Java Runtime Environment, and choose the headless server variant, i.e. without any GUI components.

sudo aptitude update
sudo aptitude install default-jre-headless

Downloading Apache Solr

Second, we need to download the latest stable version of Apache Solr from a mirror near you. At the time of writing this article, it is 4.7.2. You can find the closest mirror to you at Apache's mirror list.

cd /tmp
wget http://apache.mirror.rafal.ca/lucene/solr/4.7.2/solr-4.7.2.tgz

Extracting Apache Solr

Next we extract the archive, while still in the /tmp directory.

tar -xzf solr-4.7.2.tgz

Moving to the installation directory

We choose to install Solr in /opt, because that directory is meant for software that is not installed from Ubuntu's repositories through the apt dependency management system, and is therefore not tracked for security updates by Ubuntu.

sudo mv /tmp/solr-4.7.2 /opt/solr

Creating a "core"

Apache Solr can serve multiple sites, each served by a "core". We will start with one core, called simply "drupal".

cd /opt/solr/example/solr
sudo mv collection1 drupal

Now edit the file ./drupal/core.properties and change the name= to drupal, like so:

name=drupal
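
The same edit can be scripted. The following is a minimal sketch, assuming core.properties holds at most one line starting with name= and that the new name contains no "/" or "&" characters. The helper name set_core_name is ours and is not part of Solr.

```shell
# Hypothetical helper: set the name= line of a core.properties file.
# Usage: set_core_name FILE NAME
set_core_name() {
  file="$1"
  name="$2"
  # Build the new content in a temporary file first, so that a failed
  # sed does not leave the original truncated.
  tmp="$(mktemp)" || return 1
  if grep -q '^name=' "$file"; then
    sed "s/^name=.*/name=$name/" "$file" > "$tmp"
  else
    # No name= line yet: keep the existing content and append one.
    cat "$file" > "$tmp"
    printf 'name=%s\n' "$name" >> "$tmp"
  fi
  # Write back with cat so the file keeps its owner and permissions.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}
```

With the paths used in this article, it would be called as: set_core_name /opt/solr/example/solr/drupal/core.properties drupal (from a shell that has write access to the file).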

Copying the Drupal schema and Solr configuration

We now have to copy the Drupal Solr configuration into Solr. Assuming your site is installed in /var/www, these commands achieve the task:

cd /opt/solr/example/solr/drupal/conf
sudo cp /var/www/sites/all/modules/contrib/apachesolr/solr-conf/solr-4.x/* .

Then edit the file /opt/solr/example/solr/drupal/conf/solrconfig.xml, and comment out or delete the following section:

<useCompoundFile>false</useCompoundFile>
<ramBufferSizeMB>32</ramBufferSizeMB>
<mergeFactor>10</mergeFactor>
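
If you prefer to script the deletion, here is a rough sketch. It assumes each of the three elements sits alone on its own line with exactly the values shown above; any line that matches is removed entirely. The helper name strip_index_settings is ours, and a backup copy is kept next to the file so that the change can be reviewed or undone.

```shell
# Hypothetical helper: delete the three index settings from solrconfig.xml.
# Usage: strip_index_settings FILE   (the original is kept as FILE.orig)
strip_index_settings() {
  file="$1"
  # Keep a backup with the same permissions and timestamps.
  cp -p "$file" "$file.orig" || return 1
  # Drop every line that contains one of the three elements.
  sed -e '/<useCompoundFile>false<\/useCompoundFile>/d' \
      -e '/<ramBufferSizeMB>32<\/ramBufferSizeMB>/d' \
      -e '/<mergeFactor>10<\/mergeFactor>/d' \
      "$file.orig" > "$file"
}
```

Comparing the result against the backup with diff shows exactly which lines were dropped.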

Setting Apache Solr Authentication, using Jetty

By default, a Solr installation listens on the public Ethernet interface of a server, and has no protection whatsoever. Attackers can access Solr, and change its settings remotely. To prevent this, we set password authentication using the embedded Jetty that comes with Solr. This syntax is for Apache Solr 4.x. Earlier versions use a different syntax.

The following settings work well for a single core install, i.e. search for a single Drupal installation. If you want multi-core Solr, i.e. for many sites, then you want to fine tune this to add different roles to different cores.

Then edit the file: /opt/solr/example/etc/jetty.xml, and add this section:

<!-- ======= Securing Solr ===== -->
<Call name="addBean">
  <Arg>
    <New class="org.eclipse.jetty.security.HashLoginService">
      <Set name="name">Solr</Set>
      <Set name="config"><SystemProperty name="jetty.home" default="."/>/etc/realm.properties</Set>
      <Set name="refreshInterval">0</Set>
    </New>
  </Arg>
</Call>

Then edit the file: /opt/solr/example/etc/webdefault.xml, and add this section:

<security-constraint>
  <web-resource-collection>
    <web-resource-name>Solr</web-resource-name>
    <url-pattern>/*</url-pattern>
  </web-resource-collection>
  <auth-constraint>
    <role-name>search-role</role-name>
  </auth-constraint>
</security-constraint>

<login-config>
  <auth-method>BASIC</auth-method>
  <realm-name>Solr</realm-name>
</login-config>

Finally, create a new file named /opt/solr/example/etc/realm.properties, and add the following section to it:

user_name: password, search-role

Note that "search-role" must match what you put in webdefault.xml above.

Replace "user_name" with the user name that will be used to log in to Solr, and replace "password" with a strong, hard-to-guess password.

Then restrict the permissions on the file containing the password, so that only its owner and group can read it.

chmod 640 /opt/solr/example/etc/realm.properties
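
The file can also be generated by a script, with a random password. The sketch below is one possible way to do it, assuming a Linux system that provides /dev/urandom. The helper name write_realm and the user name solruser are ours; only the "user: password, role" line format and the search-role name come from the steps above.

```shell
# Hypothetical helper: write a one-user realm.properties file with a
# random password, and print the password so that it can be noted down.
# Usage: write_realm FILE USER
write_realm() {
  file="$1"
  user="$2"
  # 24 random alphanumeric characters from the kernel's random source.
  pass="$(LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c 24)"
  # umask 027 means a newly created file gets mode 640.
  ( umask 027 && printf '%s: %s, search-role\n' "$user" "$pass" > "$file" )
  # Also cover the case where the file already existed with wider access.
  chmod 640 "$file"
  printf '%s\n' "$pass"
}
```

For example: write_realm /opt/solr/example/etc/realm.properties solruser. Restricting the password to letters and digits is a deliberate choice here, so that it can be placed in a URL without any escaping.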

Changing File Ownership

We then create a user for solr.

sudo useradd -d /opt/solr -M -s /dev/null -U solr

And finally, change ownership of the directory to the solr user:

sudo chown -R solr:solr /opt/solr

Automatically starting Solr

Now you need Solr to start automatically when the server is rebooted. To do this, download the file attached to this article (listed under Additional Resources below), and copy it to /etc/init.d

sudo cp solr-init.d.sh.txt /etc/init.d/solr
sudo chmod 755 /etc/init.d/solr

And now tell Linux to start it automatically.

sudo update-rc.d solr start 95 2 3 4 5 .

For now, start Solr manually.

sudo /etc/init.d/solr start

Now Solr is up and running.

Verify that it is running by accessing the following URL:

http://x.x.x.x:8983/solr/

Replace x.x.x.x with the IP address of the server that is running Solr.
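
The check can also be made from the command line. The following is a small sketch, assuming curl is installed; the helper names explain_status and check_solr are ours, and the hints attached to each status code are the likely causes, not a complete diagnosis.

```shell
# Hypothetical helper: turn an HTTP status code into a short hint.
explain_status() {
  case "$1" in
    200) echo "OK: Solr answered and accepted the credentials" ;;
    401) echo "Unauthorized: authentication is active, credentials missing or wrong" ;;
    404) echo "Not found: check the path and the core name in the URL" ;;
    000) echo "No response: Solr is not running or the port is unreachable" ;;
    *)   echo "Unexpected status: $1" ;;
  esac
}

# Usage: check_solr URL USER PASSWORD
check_solr() {
  # -w '%{http_code}' prints only the status code; curl reports 000
  # when it could not get any HTTP response at all.
  code="$(curl -s -o /dev/null -w '%{http_code}' -u "$2:$3" "$1")"
  explain_status "$code"
}
```

For example: check_solr http://x.x.x.x:8983/solr/ user_name password. Running it once with a deliberately wrong password is a quick way to confirm that authentication is really being enforced.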

You can also follow the log with:

tail -f /opt/solr/example/logs/solr.log

Configuring Drupal's Apache Solr module

After you have successfully installed, configured and started Solr, you should configure your Drupal site to interact with the Solr server. First, go to this URL: admin/config/search/apachesolr/settings/solr/edit, and enter the information for your Solr server. You should use the URL as follows:

http://user:password@x.x.x.x:8983/solr/drupal

Now you can proceed to reindex your site, by sending all the content to Solr.

Removing Solr

If you ever want to cleanly remove the Apache Solr installation described above from the server, run the following commands in sequence:

sudo /etc/init.d/solr stop

sudo update-rc.d solr disable

sudo update-rc.d solr remove

sudo rm /etc/init.d/solr

sudo userdel solr

sudo rm -rf /opt/solr

sudo aptitude purge default-jre-headless

Additional Resources

Attachment: solr-init.d.sh_.txt (1.23 KB)

Comments

Service reload not working

Hi!

Thanks for the great tutorial!

I noticed that the /etc/init.d/solr reload method does not work. This is because the script exits in the do_stop() function. You can fix this by moving the exit statements to the case structure like so:

start)
  do_start
  exit $RC
  ;;

stop)
  do_stop
  exit $RC
  ;;

restart)
  do_stop
  sleep 3
  do_start
  exit $RC
  ;;

Remember to remove the exit statements from functions do_stop() and do_start().

I'm writing an Ansible script to automate the Solr configuration. Please check it out if you are interested.

https://github.com/jiv-e/multicore-solr

Thanks

Thanks for the comment.

It is a restart, not a reload, since restart stops and starts the daemon, while reload just reloads the modified configuration.

I implemented your changes as you said, except that the do_start() and do_stop() functions needed to replace the exit by a return.
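
A minimal sketch of the resulting structure follows. It is not the attached script: the echo commands are placeholders for the real start and stop commands, and solr_service and RESTART_PAUSE are hypothetical names standing in for the script's top-level case statement and its fixed three-second pause.

```shell
RC=0

# Placeholder: the real function would start the Solr daemon here.
do_start() {
  echo "starting Solr"
  RC=$?
  return "$RC"   # return instead of exit, so the caller keeps control
}

# Placeholder: the real function would stop the Solr daemon here.
do_stop() {
  echo "stopping Solr"
  RC=$?
  return "$RC"
}

# Hypothetical wrapper standing in for the script's case structure.
solr_service() {
  case "$1" in
    start)
      do_start
      ;;
    stop)
      do_stop
      ;;
    restart)
      do_stop
      sleep "${RESTART_PAUSE:-3}"
      do_start
      ;;
    *)
      echo "Usage: solr {start|stop|restart}" >&2
      RC=1
      ;;
  esac
  return "$RC"
}
```

Because do_stop returns to its caller, the restart branch goes on to run do_start. In an init script, the last line would then call the wrapper with "$1" and exit with its return code.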

Not able to login.

Thank you so much for your tutorial. I spent weeks trying to get Solr and Jetty working for Drupal, and your tutorial is the one that got me closest. One issue remains, and I would appreciate it if someone could help me get it taken care of.

I am having a login issue. It keeps telling me the password is incorrect, even though I am using the credentials I put in realm.properties.

Here is what I have in jetty.xml:

<Call name="addBean">
      <Arg>
        <New id="DeploymentManager" class="org.eclipse.jetty.deploy.DeploymentManager">
          <Set name="contexts">
            <Ref id="Contexts" />
          </Set>
          <Call name="setContextAttribute">
            <Arg>org.eclipse.jetty.server.webapp.ContainerIncludeJarPattern</Arg>
            <Arg>.*/servlet-api-[^/]*\.jar$</Arg>
          </Call>
         
         
          <!-- Add a customize step to the deployment lifecycle -->
          <!-- uncomment and replace DebugBinding with your extended AppLifeCycle.Binding class
          <Call name="insertLifeCycleNode">
            <Arg>deployed</Arg>
            <Arg>starting</Arg>
            <Arg>customise</Arg>
          </Call>
          <Call name="addLifeCycleBinding">
            <Arg>
              <New class="org.eclipse.jetty.deploy.bindings.DebugBinding">
                <Arg>customise</Arg>
              </New>
            </Arg>
          </Call>
          -->
         
        </New>
      </Arg>
    </Call>
   
    <!-- ======= Securing Solr ===== -->
<Call name="addBean">
  <Arg>
    <New class="org.eclipse.jetty.security.HashLoginService">
      <Set name="name">Solr</Set>
      <Set name="config"><SystemProperty name="jetty.home" default="."/>/etc/realm.properties</Set>
      <Set name="refreshInterval">0</Set>
    </New>
  </Arg>
</Call>
<!-- ======= End Securing Solr ===== -->

Here is what I have in webdefault.xml:

<!--
  <security-constraint>
    <web-resource-collection>
      <web-resource-name>Disable TRACE</web-resource-name>
      <url-pattern>/</url-pattern>
      <http-method>TRACE</http-method>
    </web-resource-collection>
    <auth-constraint/>
  </security-constraint>
  -->
 
  <!-- ======= Securing Solr ===== -->
  <security-constraint>
  <web-resource-collection>
    <web-resource-name>Solr</web-resource-name>
    <url-pattern>/*</url-pattern>
  </web-resource-collection>
  <auth-constraint>
    <role-name>search-role</role-name>
  </auth-constraint>
</security-constraint>

<login-config>
  <auth-method>BASIC</auth-method>
  <realm-name>Solr</realm-name>
</login-config>
<!-- ======= End Securing Solr ===== -->

Figured out the login issue

I was using search-role as the username. It turned out I should be using the credentials in realm.properties, like so:

this_is_the_username: this_is_the_password, search-role

Thank you for this great tutorial. This is the only tutorial that actually helped, out of the hundreds out there on the internet.

Clarified

Thanks for reporting back. I tried to clarify this part a bit better.

I was looking for a blog

I was looking for a blog that explains how to configure Apache Solr 4.x for Drupal. Thanks for this information, it truly helped me.

/opt/solr/example/etc/realm.properties

Hi, for some reason, I do not have the file: /opt/solr/example/etc/realm.properties. I do have a ../solr/example/solr/collection1/core.properties containing only name, but no realm.properties anywhere. Should I just create this file and insert the authentication code, or does this mean something is wrong?

I've followed these

I've followed these instructions to a T a couple of times and am getting the following error:

HTTP ERROR 404

Problem accessing /solr/drupal. Reason:

    Not Found

Powered by Jetty://

Any ideas how to troubleshoot this?

Host name?

Which host name or IP address did you use for the /solr path? Make sure it is the host name or IP address where Solr is installed.

I tried these instructions several times on a pristine Ubuntu virtual machine and they do work.

Check the logs to see what the problem(s) are.

Hey, Thanks for replying.. I

Hey,

Thanks for replying.. I have solr installed in a fresh debian wheezy lxc guest on an ubuntu 14.04 host. I am using iptables to forward port 8983 to the lxc container.

I can get to the solr admin page no problem (http://mydomain.com:8983/solr/admin) . Is there something I need to do to get the hostname recognised?

Add it to the hosts file?

The sure way is to add the IP address and host name to the hosts file of the machine that will access it. Or you can use the IP address directly.

"Now you need Solr to start

"Now you need Solr to start automatically when the server is rebooted. To do this, download the attached file, and copy it to /etc/init.d."

Where is the attached file?

I added the authentication but search is not working

Hi
I followed the steps above and everything works: the admin page asks me for a username and password, and I can do a full indexing.
But when I search from my PHP application, I get an error:

Fatal error: Uncaught exception 'Apache_Solr_HttpTransportException' with message ''401' Status: Unauthorized' in C:\wamp\www\2682014\solr\example\webapps\SolrPhpClient\Apache\Solr\Service.php on line 338

But if I remove the Solr dashboard authentication, everything works fine.
Please advise.

Different core

Is it possible to configure the Apache Solr module to use a different core? My Drupal installation keeps defaulting to another core in my multi-core installation of Apache Solr.

Thank you so much for this

Thank you so much for this great tutorial! May god bless you and your family for eternity! ;)