
Item12763: Solr plugin installation did not work as documented

Priority: Normal
Current State: Closed
Released In: n/a
Target Release: n/a
Applies To: Extension
Component: SolrPlugin
Branches:
Reported By: LeilaPearson
Waiting For:
Last Change By: MichaelDaum
With any luck, much of this is fixed already in upcoming Plugin versions. I was using the July 2013 released version of the plugin.

Here's a list of the issues that I ran into and what I did about them:

When I tried to start Solr after installing the bin file, the first problem I ran into was that the solrstart script only worked if I passed the path to my Foswiki installation as an argument. Otherwise, it defaulted to a trunk development directory. A number of other scripts in the tools directory seem to have similar issues: they expect environment variables to be set, which I guess they are when called by Foswiki itself, but not when called from the command line.
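
One way a command-line tool could avoid silently falling back to a development directory is sketched below. This is only an illustration, not the plugin's actual code: the function name and the FOSWIKI_ROOT environment variable are hypothetical.

```python
from pathlib import Path


def resolve_foswiki_root(argv, environ):
    """Pick the Foswiki root from the first argument, then from the environment.

    FOSWIKI_ROOT is a hypothetical variable name used for illustration only.
    """
    candidate = argv[1] if len(argv) > 1 else environ.get("FOSWIKI_ROOT")
    if not candidate:
        # Fail loudly instead of guessing a default installation path.
        raise SystemExit("pass the Foswiki root as an argument or set FOSWIKI_ROOT")
    root = Path(candidate)
    if not root.is_dir():
        raise SystemExit(f"not a directory: {root}")
    return root
```

A script would call it as resolve_foswiki_root(sys.argv, os.environ), so that a missing path produces an error message rather than a start attempt against the wrong directory.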

After getting past this problem, Solr still didn't start properly. Manually starting it using java -jar start.jar, I was able to see the solr logs and to see that it was looking for webapps in the "contexts" directory - but there was no contexts directory under foswiki/

Also, it seemed that there should have been a foswiki directory under the multicore directory (and also a _template directory), but neither existed there. The directory contained only conf/, lib/ and solr.xml.

When I tried changing the Jetty configuration to look in the webapps dir instead of contexts, I got a bunch of Java errors.

At this point, since it seemed that the Solr and Jetty configuration files didn't match the actual file structure, I decided to download my own copy of Solr to see if I could get it working.

  • I downloaded my own copy of solr 4.6 and also solr 4.3 (the version that the plugin seemed to be based on). Solr 4.6 was easy to get working. I customized it as follows:
    • renamed the example directory to solr-jetty (example just didn't make sense for a production deployment)
    • removed some example sub-directories that didn't apply: example-DIH, exampledocs, example-schemaless, multicore
    • created a core for foswiki based on the "collection1" example: cp -r collection1/ foswiki
      • edited core.properties to remove the name "collection1" so it would use the core directory name - foswiki - as the core name.
    • ran java -jar start.jar in solr-jetty to check that both cores were up and running, which they were
    • made changes to 3 files: solrconfig.xml, schema.xml, and stopwords.txt
      • diffed the solrconfig.xml files from the default 4.3 installation and the one that came with the plugin, then manually applied all the changes that looked like they made sense to the solrconfig.xml file in my new foswiki core.
        • also fixed some other things that were resulting in warnings (use of deprecated code)
      • replaced the schema.xml file with the one from the plugin multicores/conf directory
        • replaced all instances of stopwords- with stopwords_
        • replaced stopwords_se with stopwords_sv for Swedish (just seemed to be incorrect for some reason)
        • fixed a few other things that were resulting in warnings (use of deprecated code)
      • copied foswiki/conf/lang/stopwords_en.txt to foswiki/conf/stopwords.txt
    • copied over the mapping-japanese.txt file from the plugin multicores/conf directory
    • disabled collection1 by renaming the core.properties file in the collection1 directory to core.properties.orig (alternatively the whole directory could be deleted)
    • restarted the server, and checked the logs for errors and warnings.
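
The file-level steps in the list above can be sketched roughly as follows. This is a minimal illustration, not a tested installer: the function names are made up, and solr_home stands for whichever directory contains collection1 in your copy of Solr.

```python
import shutil
from pathlib import Path


def create_core_from_example(solr_home, new_core="foswiki", example="collection1"):
    """Copy the example core and let Solr name the copy after its directory."""
    src = Path(solr_home) / example
    dst = Path(solr_home) / new_core
    shutil.copytree(src, dst)
    props = dst / "core.properties"
    # Drop the explicit name= line so the core name falls back to the directory name.
    kept = [line for line in props.read_text().splitlines()
            if not line.strip().startswith("name=")]
    props.write_text("\n".join(kept) + "\n")
    return dst


def disable_core(solr_home, core="collection1"):
    """Hide a core from discovery by renaming its core.properties file."""
    props = Path(solr_home) / core / "core.properties"
    props.rename(props.with_name("core.properties.orig"))


def fix_stopword_references(schema_text):
    """Apply the two schema.xml renames described in the list above."""
    text = schema_text.replace("stopwords-", "stopwords_")
    return text.replace("stopwords_se", "stopwords_sv")
```

The manual edits to solrconfig.xml are not covered here, since they were applied by hand after reading the diffs.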

When trying to follow the instructions for indexing the data, I ran into the problem that the instructions assume you are using virtual hosts, which I'm not - so I had to use the non-virtual host version of the scripts.

When running queries, I saw warnings in the Solr logs that Solr would use Highlighter instead of FastVectorHighlighter so I fixed this - and also a typo in the query where it says "Contignuous".

I also found that Solr was detecting some English pages as German, and figured out that I should have set CONTENT_LANGUAGE to "en" in my site preferences so that it didn't need to use language detection (since my site is all English).

Finally, I generally set up my Foswiki server with at least 2 instances of Foswiki: a production version and a test version. These aren't Foswiki virtual hosts but Apache named virtual hosts. From a Solr perspective, I still need two cores though, so I can have two separate indexes and two separate Solr configurations to play with. It didn't make sense to leave Solr and Jetty installed in foswiki/solr, so I decided to install them under /opt/solr instead, where it would make more sense to share them.

-- LeilaPearson - 01 Mar 2014

I've attached to this ticket all of my modified files, along with the files they are based on so they can be diffed to see what specific changes were made.
  • .new files are my new versions of the files
  • .plugin files are the original plugin files - from the July 2013 release (which is the current release as of this writing)
  • .solr43 are the corresponding files from the default solr 4.3.0 release - without the foswiki customizations.
  • .solr461 are the corresponding files from the default solr 4.6.1 release - without the foswiki customizations.

My .new files are basically the diffs between the .plugin and .solr43 files applied on top of the .solr461 files, with a few additional fixes and changes related to the upgrade to solr 4.6.1.
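
To see which customizations a .plugin file carries relative to its .solr43 counterpart, a unified diff is enough. The sketch below uses Python's standard difflib; the function name is made up, and the paths you pass in are whatever the unpacked attachment contains.

```python
import difflib
from pathlib import Path


def plugin_customizations(stock_path, plugin_path):
    """Return a unified diff from the stock Solr file to the plugin's version."""
    stock = Path(stock_path).read_text().splitlines(keepends=True)
    plugin = Path(plugin_path).read_text().splitlines(keepends=True)
    return "".join(difflib.unified_diff(
        stock, plugin, fromfile=str(stock_path), tofile=str(plugin_path)))
```

The same call with a .solr461 file and a .new file shows what ended up on top of the 4.6.1 defaults.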

-- LeilaPearson - 01 Mar 2014

I also attached a modified version of SolrPlugin.txt and a tar file to go with it. If you follow these instructions, you should be able to get things up and running the way I did. However, this may not be the best way to do things, especially if you already have Jetty or Tomcat running on your server. The Solr distribution is set up as an example instead of a production package, which means my method results in lots of unnecessary files, plus a bit of an odd file organization.

-- LeilaPearson - 01 Mar 2014

Thanks a lot for documenting what you did to install SolrPlugin. This is highly appreciated and will be of help when upgrading from solr-4.3 to newer versions. I'll be cherry-picking some of your changes and integrating them into a next release while leaving aside some others. Here is what I found inspecting your changes:

Mapping-japanese.txt didn't change.

The plugin's stopwords.txt has a more extensive list of English stopwords ... I'll keep that.

I won't rename the stopwords files nor move them into a separate lang/ directory due to backwards compatibility.

The rest of the changes in schema.xml (replacing SortableXXX with TrieXXX) are good ones.

I'd always prefer to run Solr using the server's own Jetty (or Tomcat) servlet engine controlled by its init process ... and not the example Jetty bundled in the Solr tarball. I can't copy over your SolrPlugin.txt changes as they rely heavily on morphing the upstream example directory into a manually installed Solr service ... which I'd rather not promote. The core of the problem is that the upstream Solr distribution is packaged in a rather odd way, i.e. the extra libraries are scattered all over the place in different directories. I tried to mitigate this pain by rebundling these binaries as the SolrPlugin-bin package in a way that makes more sense and needs only a simple extraction, instead of wading through the upstream distribution and separating its Jetty jars from the real net binaries that make up Solr and its other goodies.

At the end of the day only four things need to be taken out of the upstream tarball:

  1. solr.war
  2. non-jetty lib jars
  3. diff of changes in stopword files
  4. diff of changes in solrconfig.xml and schema.xml
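
As a rough illustration of items 1 and 2, the entries of interest could be listed from a tarball like this. The name-based filter is an assumption made for this sketch, not the way the SolrPlugin-bin package is actually built, and a real tarball may need a more careful selection.

```python
import tarfile


def wanted_members(tar_path):
    """List entries named solr.war, plus jars whose path does not mention jetty."""
    picked = []
    with tarfile.open(tar_path) as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            name = member.name
            base = name.rsplit("/", 1)[-1]
            if base == "solr.war":
                picked.append(name)
            elif base.endswith(".jar") and "jetty" not in name.lower():
                # Assumption: Jetty's own jars can be recognized by their path.
                picked.append(name)
    return picked
```

Items 3 and 4 are plain diffs of the configuration files against their upstream versions.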

Finally, there are some extensive changes to solrconfig.xml that heavily alter the request handlers in there ... I am completely unsure what's going on there atm ... needs digging deeper.

-- MichaelDaum - 09 Jan 2015

This is going to change drastically with Solr 5. Some of the proposed changes have been added as part of Item13280.

-- MichaelDaum - 25 Feb 2015
 

Codebase: 1.1.9
Attachments:
  • SolrPlugin.txt (42 K, 01 Mar 2014 - 16:37, LeilaPearson): My modified version of SolrPlugin.txt, which describes how to set up Solr the way I did.
  • solr-config-mods.tar.gz (121 K, 01 Mar 2014 - 16:35, LeilaPearson): A package with the files I modified, as well as the files they are based on. The .new files are my mods, .plugin are the July 2013 plugin versions of these files, and .solr43 and .solr461 are the versions from the 4.3.0 and 4.6.1 releases respectively.
  • solr-config.tgz (31 K, 01 Mar 2014 - 16:37, LeilaPearson): The tar file referenced in my version of SolrPlugin.txt.
Topic revision: r4 - 25 Feb 2015, MichaelDaum
The copyright of the content on this website is held by the contributing authors, except where stated elsewhere.