Most high-traffic or complex Drupal sites use Apache Solr as the search engine. It is much faster and more scalable than Drupal's search module.
In this article, we describe one of many ways to set up a working Apache Solr installation for use with Drupal 7.x on Ubuntu Server 12.04 LTS. The technique described should work with Ubuntu 14.04 LTS as well.
In a later article, now published, we describe how to install other versions of Solr, using the Ubuntu/Debian way.
For this article, we focus on having an installation of Apache Solr with the following objectives:
- Use the latest stable version of Apache Solr
- Fewest software dependencies, i.e. no Tomcat server, no full JDK, and no separate Jetty
- As little complexity as possible
- As little software as possible to install and maintain
- A secure installation
This installation can be done on the same host that runs Drupal, if it has enough memory and CPU, or on the database server. However, it is best to run Solr on a separate server dedicated to search, with enough memory and CPU.
We start by installing the Java Runtime Environment, and choose the headless server variant, i.e. without any GUI components.
sudo aptitude update
sudo aptitude install default-jre-headless
Downloading Apache Solr
Second, download the latest stable version of Apache Solr from a mirror near you into the /tmp directory. At the time of writing, it is 4.7.2. You can find the mirror closest to you in Apache's mirror list.
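The download step can be sketched as follows. The article does not give a URL, so the Apache archive path used here is an assumption, and the function names are hypothetical. Substitute a mirror from Apache's mirror list if you prefer.

```shell
# Build the download URL for a given Solr version.
# Assumption: old releases are kept on the Apache archive under this layout.
solr_download_url() {
    version="$1"
    printf 'https://archive.apache.org/dist/lucene/solr/%s/solr-%s.tgz\n' \
        "$version" "$version"
}

# Fetch the tarball into /tmp. Nothing is downloaded until this is called.
download_solr() {
    version="$1"
    cd /tmp && wget "$(solr_download_url "$version")"
}
```

Calling download_solr 4.7.2 would leave solr-4.7.2.tgz in /tmp, which is where the extraction step expects it.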
Extracting Apache Solr
Next we extract the archive, while still in the /tmp directory.
tar -xzf solr-4.7.2.tgz
Moving to the installation directory
We choose to install Solr in /opt, because that directory is meant for software that is not installed from Ubuntu's repositories through the apt dependency management system, and is therefore not tracked for security updates by Ubuntu.
sudo mv /tmp/solr-4.7.2 /opt/solr
Creating a "core"
Apache Solr can serve multiple sites, each served by a "core". We will start with one core, called simply "drupal".
cd /opt/solr/example/solr
sudo mv collection1 drupal
Now edit the file ./drupal/core.properties and change the name= line to drupal, like so:
name=drupal
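If you prefer not to open an editor, the same change can be made with sed. This is a minimal sketch, assuming GNU sed (as shipped with Ubuntu) and a core.properties file that already contains a name= line. The function name is hypothetical.

```shell
# Rewrite the name= line of a core.properties file in place.
# Assumption: the core name contains no "/" or "&", which sed would misread.
set_core_name() {
    file="$1"
    name="$2"
    sed -i "s/^name=.*/name=${name}/" "$file"
}
```

For example: set_core_name /opt/solr/example/solr/drupal/core.properties drupal (prefix with sudo if your user does not own the file).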
Copying the Drupal schema and Solr configuration
We now have to copy the Drupal Solr configuration into the core's conf directory. Assuming your site is installed in /var/www, this command does that:
sudo cp /var/www/sites/all/modules/contrib/apachesolr/solr-conf/solr-4.x/* /opt/solr/example/solr/drupal/conf/
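The module path varies between sites (for example, some sites do not use a contrib subdirectory), so a copy that checks the source directory first can save some confusion. This is a sketch with a hypothetical function name, not part of the apachesolr module.

```shell
# Copy every file from the module's solr-conf directory into the core's conf
# directory, refusing to run if the source directory does not exist.
copy_solr_conf() {
    src="$1"
    dest="$2"
    if [ ! -d "$src" ]; then
        echo "Source directory not found: $src" >&2
        return 1
    fi
    cp "$src"/* "$dest"/
}
```

For example: copy_solr_conf /var/www/sites/all/modules/contrib/apachesolr/solr-conf/solr-4.x /opt/solr/example/solr/drupal/conf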
Then edit the file: /opt/solr/example/solr/drupal/conf/solrconfig.xml, and comment out or delete the following section:
Setting Apache Solr Authentication, using Jetty
By default, a Solr installation listens on the public Ethernet interface of a server, and has no protection whatsoever. Attackers can access Solr, and change its settings remotely. To prevent this, we set password authentication using the embedded Jetty that comes with Solr. This syntax is for Apache Solr 4.x. Earlier versions use a different syntax.
The following settings work well for a single-core install, i.e. search for a single Drupal installation. If you want multi-core Solr, i.e. for many sites, you will want to fine-tune this by assigning different roles to different cores.
Then edit the file: /opt/solr/example/etc/jetty.xml, and add this section:
<!-- ======= Securing Solr ===== -->
<Set name="config"><SystemProperty name="jetty.home" default="."/>/etc/realm.properties</Set>
Then edit the file: /opt/solr/example/etc/webdefault.xml, and add this section:
Finally, create a new file named /opt/solr/example/etc/realm.properties, and add the following section to it:
user_name: password, search-role
Note that "search-role" must match what you put in webdefault.xml above.
Instead of "user_name", use the user name that will be used for logging in to Solr. Also, replace "password" with a strong, hard-to-guess password.
Finally, make sure that the file containing passwords is not readable by anyone other than its owner and group.
chmod 640 /opt/solr/example/etc/realm.properties
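Creating the file and restricting its permissions can be combined into one step, so that the password is never world-readable, even briefly. This is a minimal sketch. The function name is hypothetical, and the line format simply mirrors the "user_name: password, search-role" example above.

```shell
# Write a one-line realm file with mode 640.
# The umask is set inside a subshell so it does not affect the calling shell.
write_realm_file() {
    file="$1"
    user="$2"
    pass="$3"
    role="$4"
    (
        umask 027
        printf '%s: %s, %s\n' "$user" "$pass" "$role" > "$file"
    )
    # Also enforce the mode explicitly, in case the file already existed.
    chmod 640 "$file"
}
```

For example: write_realm_file /opt/solr/example/etc/realm.properties user_name password search-role, using your real user name and password.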
Changing File Ownership
We then create a dedicated user for Solr, with no usable login shell.
sudo useradd -d /opt/solr -M -s /dev/null -U solr
Finally, change ownership of the directory to the solr user:
sudo chown -R solr:solr /opt/solr
Automatically starting Solr
Now you need Solr to start automatically when the server is rebooted. To do this, download the attached file, and copy it to /etc/init.d
sudo cp solr-init.d.sh.txt /etc/init.d/solr
sudo chmod 755 /etc/init.d/solr
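If the attached file is not available to you, the following sketch shows roughly what such a script has to do. It is NOT the file attached to the original article. It assumes that Solr 4.x is started with "java -jar start.jar" from /opt/solr/example, that java is at /usr/bin/java, and that the pid file path is free to use. The function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical minimal control script for the embedded Jetty that ships with Solr.
SOLR_DIR="/opt/solr/example"     # directory containing start.jar
SOLR_USER="solr"                 # user created in the ownership step
PIDFILE="/var/run/solr.pid"      # assumed location

solr_ctl() {
    case "$1" in
        start)
            # Run Jetty in the background as the solr user and record its pid.
            start-stop-daemon --start --background --make-pidfile \
                --pidfile "$PIDFILE" --chuid "$SOLR_USER" \
                --chdir "$SOLR_DIR" --exec /usr/bin/java -- -jar start.jar
            ;;
        stop)
            # Send TERM and wait up to 10 seconds for the process to exit.
            start-stop-daemon --stop --pidfile "$PIDFILE" --retry 10
            ;;
        *)
            echo "Usage: solr {start|stop}"
            return 1
            ;;
    esac
}
```

To use this as /etc/init.d/solr, end the file with a line that calls solr_ctl "$1". A production script would normally also carry LSB header tags and a status action, which this sketch omits.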
And now tell Linux to start it automatically.
sudo update-rc.d solr start 95 2 3 4 5 .
For now, start Solr manually.
sudo /etc/init.d/solr start
Now Solr is up and running.
Verify that it is running by accessing the following URL:
Replace x.x.x.x by the IP address of the server that is running Solr.
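The same check can be done from the command line with curl. This sketch assumes Solr's default port 8983, the core name "drupal" used above, and a ping handler at /admin/ping in the copied solrconfig.xml. The function names are hypothetical.

```shell
# Build the ping URL for a given host and core.
# Assumptions: default port 8983, ping handler registered at /admin/ping.
solr_ping_url() {
    host="$1"
    core="$2"
    printf 'http://%s:8983/solr/%s/admin/ping\n' "$host" "$core"
}

# Print only the HTTP status code returned for the given credentials.
check_solr() {
    host="$1"
    core="$2"
    user="$3"
    pass="$4"
    curl -s -o /dev/null -w '%{http_code}\n' -u "${user}:${pass}" \
        "$(solr_ping_url "$host" "$core")"
}
```

For example: check_solr x.x.x.x drupal user_name password. If authentication is set up as described, a status of 200 with valid credentials and 401 with wrong ones would suggest that both Solr and the password protection are working.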
You can also view the logs at:
tail -f /opt/solr/example/logs/solr.log
Configuring Drupal's Apache Solr module
After you have successfully installed, configured and started Solr, you should configure your Drupal site to interact with the Solr server. First, go to this URL: admin/config/search/apachesolr/settings/solr/edit, and enter the information for your Solr server. You should use the URL as follows:
Now you can proceed to reindex your site, by sending all the content to Solr.
If you ever want to cleanly remove the Apache Solr installation described above from the server, use the following sequence of commands:
sudo /etc/init.d/solr stop
sudo update-rc.d solr disable
sudo update-rc.d -f solr remove
sudo rm /etc/init.d/solr
sudo userdel solr
sudo rm -rf /opt/solr
sudo aptitude purge default-jre-headless