Welcome to the Apache Solr Wiki
Solr Documentation
Official documentation for the latest release of Solr can be found on the Solr website. Of particular note is the Solr Reference Guide, which is published by the project after each minor release.
The rest of this wiki is community edited and captures version-agnostic information, user-submitted tips and tricks, historical information on Solr, and some areas of Solr not yet covered in the Reference Guide.
General Information
Solr mailing lists: Sign up or Search/Browse the Mailing List Archives
Please read these tips on using the mailing lists effectively before posting.
The Solr FAQ and the Solr Relevancy FAQ
SolrResources - Books, Blogs, Reviews, Articles, Product Sheet, Presentations, etc.
Support - People and companies for hire
Solr Change Log with all the juicy info about recently committed features.
Solr Development
HackingSolr -- Info for people interested in hacking and customizing the Solr application
TestingSolr -- Information for running Solr unit tests
NightlyBuilds -- Jenkins (formerly Hudson) hosts nightly Solr builds
Latest stable code for 3.x branch (svn URL).
TaskList of ideas for future development
HowToContribute improvements
HowToCompileSolr -- steps tested under Windows 7 / Windows 8
- Experiments
Guice + Restlet -- Docs on experimental refactoring of Solr to use Guice and Restlet
Using Solr
Installation and Configuration
- Includes information about useful settings in specific environments
Search and Indexing
- Indexing Documents
Adding Documents in XML format - Covers XML syntax for adds, deletes, commits and optimizes
Adding Documents in JSON format - Covers JSON syntax for adds, deletes, commits and optimizes
DataImportHandler - Solr contrib that supports full and delta indexing directly from SQL databases, and local or REST accessible XML files.
AnalysisRequestHandler - Analyzing documents without indexing
Solr Content Extraction Library (Solr Cell) - Covers how to index MS Word, PDF, etc. using Solr Cell (a.k.a. ExtractingRequestHandler). Also see the older version at UpdateRichDocuments
Update Processors - Update Processors define how an update request is processed.
Deduplication - Prevent or tag duplicate documents
- Searching Solr
Request Handlers - Control the logic used to process requests. Several different Request Handlers are included with Solr, or you can write your own custom implementation.
Response Writers - Control the formatting of the responses generated by Request Handlers. Several different Response Writers are included with Solr, or you can write your own custom implementation.
- Input Parameters
QueryParametersIndex - Index of the query parameters covered in the following wiki pages
Search Components - Search Components provide core functionality to a Request Handler.
Query Syntax - Syntax for default query parsing, and how to specify a Query Parser.
Function Queries - Using the values in fields in functions and as factors in scoring
Faceted search - Category counts for search results
(Geo)Spatial Search - Find results near a point
Field Collapsing / Result Grouping - Group documents that share a common field value
Join - Perform database-style joins on documents
SolrCloud
When you are ready to scale beyond a single node, see SolrCloud. To run SolrCloud on JBoss, see SolrCloud using Jboss.
Advanced Tools
Carrot2-based Document Clustering - Summarize/compare all documents returned by a query
Language Detection - Deduce the language of a document
UIMA Natural Language Processing - Sophisticated NLP suite, originally from IBM Research
OpenNLP Natural Language Processing - Simple NLP suite
Business Rules - Alter stored documents and query results with flexible dynamic rules engine
Tips, Tricks and Use Cases
Auto-complete - Use Faceting with facet.prefix, the Suggester, or the TermsComponent
UniqueKey - Covers tips about unique keys in the schema
Japanese Language Support - How to search Japanese text, best practices and various considerations
Using PreAnalyzedField type and PreAnalyzedUpdateProcessorFactory for integration with external document processing pipelines
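As a rough illustration of the facet.prefix approach to auto-complete mentioned above, the following Python sketch builds a query URL that asks Solr for facet values starting with a typed prefix. The base URL (`http://localhost:8983/solr/select`), the field name `title`, and the helper name `autocomplete_url` are assumptions made for this example, not part of any Solr client library; adjust them to your own installation and schema.

```python
from urllib.parse import urlencode


def autocomplete_url(base_url, field, prefix, limit=10):
    """Build a select URL that returns facet values of `field` starting with `prefix`.

    Hypothetical helper for illustration only; it just assembles query parameters.
    """
    params = [
        ("q", "*:*"),              # match all documents; only the facet counts matter
        ("rows", 0),               # do not return the documents themselves
        ("facet", "true"),         # turn faceting on
        ("facet.field", field),    # field whose indexed terms act as suggestions
        ("facet.prefix", prefix),  # keep only terms that start with the typed text
        ("facet.limit", limit),    # cap the number of suggestions
        ("wt", "json"),            # ask for a JSON response
    ]
    return base_url + "?" + urlencode(params)


# Assumed host, port, and field name; replace with your own.
url = autocomplete_url("http://localhost:8983/solr/select", "title", "so")
print(url)
```

Because facet.prefix is compared against the indexed terms, the typed prefix usually needs the same normalization (for example lowercasing) that the field's analyzer applies.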
Solr Clients
IntegratingSolr - includes information about accessing Solr from a variety of programming languages and existing third party applications.
Operations and Production
- Index Replication
Built-in, SolrRequestHandler-based SolrReplication
Unix-script-based CollectionDistribution
User-contributed content
Translations - Unofficial translations of the official documentation, in hope of easing the review process.
:TODO: How to implement basic indexing in Tomcat
How to edit this Wiki
This Wiki is a collaborative site where anyone can contribute and share. However, due to excessive link spam, only confirmed humans are allowed to edit pages. Confirming that you are human is easy:
- Create an account by clicking the "Login" link at the top of any page, and picking a username and password.
Contact one of the wiki admins and ask to be added to the ContributorsGroup. Every request needs to include the wiki username you chose, and can be made via:
The #solr IRC channel -- Just ask if any Solr wiki admins are around to help you
Sending an email to solr-user@lucene.apache.org requesting to be added (Don't forget to explicitly mention your wiki username!) and waiting for a reply email confirming you've been added.
Once you have been authorized, you can edit any page by pressing Edit at the top or the bottom of the page
There are some conventions used on the Solr wiki:
:TODO: (/!\ :TODO: /!\ ) is used to denote sections that definitely need to be cleaned up.
Solr4.0 (<!> [[Solr4.0]]) is used to draw attention to the version of Solr in which a feature was (or will be) added.
Some general info on using this Wiki Software:
Create a link to another page with joined capitalized words (like WikiSandBox) or with ["quoted words in brackets"]
See HelpForBeginners to get you going, HelpContents for all help pages.
HelpOnMoinWikiSyntax: quick access to wiki syntax