Solr: a High-level Overview for Sysadmins

Operating System

Solr runs fine on both Windows and Linux. I've only deployed Solr on Linux, though, so I can't vouch for its stability on Windows in a production environment.

Java

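Solr requires Java to run. The Sun (now Oracle) JDK, version 6 or later, is fine. Newer Solr releases may require a newer Java version, so check the requirements of the release you are deploying.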

Daemon

Solr is a server, so you'll probably want to run it as a daemon that starts automatically on boot.

See the Resources section for some init.d and monit scripts.

Firewall configuration

The default Solr port is 8983. Make sure this port is reachable from the machines that need to talk to Solr, typically your application servers.
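
A quick way to confirm the firewall rule works is to attempt a TCP connection from the machine that will be querying Solr. The following is a minimal sketch, not an official Solr tool: the function name port_is_open is my own, and the host in the usage comment is a placeholder you would replace with your Solr server.

```python
import socket


def port_is_open(host, port=8983, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    8983 is Solr's default port; pass a different value if you changed it.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Example usage (hypothetical host name):
#   port_is_open("solr.example.internal")
```

A successful connection only shows that something is listening on the port and the firewall lets you through. It does not prove that Solr itself is healthy.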

Hardware resources

Generally speaking, Solr is rarely CPU-bound. It is mostly I/O-bound, due to disk seeks.

CPU usage is usually higher if the application is write-intensive.

The more RAM you have, the larger you can make the query and document caches, and the less of an issue I/O contention becomes.

It is not unusual to see Solr handling 100+ queries per second at around 3% CPU load on a modestly specced server.

If you're running Solr on the same physical server as a relational database under heavy load, it is best to place Solr's index on a separate physical drive.

In a low-to-medium load application, it's usually fine to run Solr on the same physical server as the web server.

Sample hardware configuration

  • Dual-core CPU, preferably above 2GHz
  • 2-16GB RAM
  • Fast HDD (10k RPM and above)

If you're running Solr on Amazon EC2, I've found the small instance to be sufficient for a number of small to medium-sized Solr installations.

Backups

The two things you want to back up are:

  1. The data directory which contains the index
  2. Configuration files such as schema.xml and solrconfig.xml

Restoring a backup is as simple as shutting down Solr, copying the respective files over, then restarting Solr.
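
The backup and restore steps above can be sketched as a small script. This is an illustration under stated assumptions rather than a definitive procedure: the function names backup_solr and restore_solr are my own, the data and configuration directories are assumed to sit side by side as data/ and conf/ under one Solr home directory, and the index is assumed not to be written to while the archive is being created.

```python
import tarfile
import time
from pathlib import Path


def backup_solr(data_dir, conf_dir, dest_dir):
    """Archive the index data directory and the config directory into one .tar.gz.

    Assumes nothing is writing to the index while the archive is built.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = dest / f"solr-backup-{stamp}.tar.gz"
    with tarfile.open(str(archive), "w:gz") as tar:
        # Stored under fixed names so a restore does not depend on the old paths.
        tar.add(str(data_dir), arcname="data")
        tar.add(str(conf_dir), arcname="conf")
    return archive


def restore_solr(archive, solr_home):
    """Unpack data/ and conf/ under solr_home. Shut Solr down before calling this."""
    with tarfile.open(str(archive), "r:gz") as tar:
        tar.extractall(str(solr_home))
```

For a restore, shut Solr down first, call restore_solr with the archive and your Solr home directory, then start Solr again. If your installation keeps its configuration somewhere other than conf/ next to data/, adjust the paths accordingly.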

Index corruption

In the event of power loss or catastrophic hardware failure, Solr is usually pretty good about maintaining index integrity, so starting up again after reboot just works.

If the index does get corrupted, well, that's what backups are for! :-)