Solr was initially developed at CNET Networks and was donated to the Apache Software Foundation in 2006. It is currently used for search applications on several high-traffic public websites. Community feedback has been positive, with users reporting good performance on indices of several million documents.
Solr's feature set is broken down into several subsystems:
Schema
- Defines the field types and fields of documents
- Dynamic fields enable on-the-fly addition of new fields
- Explicit types eliminate the need for guessing types of fields
- External file-based configuration of stopword lists, synonym lists, and protected word lists
- Many additional text analysis components including word splitting, regex and sounds-like filters
- HTTP interface with configurable response formats (XML/XSLT, JSON, Python, Ruby)
- Sort by any number of fields
- Highlighted context snippets
- Constant scoring range and prefix queries - no idf, coord, or lengthNorm factors, and no restriction on the number of terms the query matches.
- Function Query - influence the score by a function of a field's numeric value or ordinal
- Date Math - specify dates relative to "NOW" in queries and updates
- Pluggable query handlers and extensible XML data format
- Document uniqueness enforcement based on unique key field
- Batches updates and deletes for high performance
- User configurable commands triggered on index changes
- Correct handling of numeric types for both sorting and range queries
- Pluggable Cache implementations
- Autowarming of cache in background (the most recently accessed items in the caches of the current searcher are re-populated in the new searcher, enabling high cache hit rates across index/searcher changes)
- Fast/small filter implementation
- User level caching with autowarming support
- Efficient distribution of index parts that have changed via rsync transport
- Pull strategy allows for easy addition of searchers
- Configurable distribution interval allows tradeoff between timeliness and cache utilization
- Comprehensive statistics on cache utilization, updates, and queries
- Text analysis debugger, showing result of every stage in an analyzer
- Web query interface with debugging output
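
To make the query features in the list above more concrete, here is a minimal sketch that assembles a search URL for the HTTP interface. It combines a selectable response format, multi-field sorting, highlighting, and a date math filter. The host, port, handler path, and field names (timestamp, price, description) are assumptions for illustration only, and the parameter names follow common Solr usage, so check them against the documentation for your Solr version.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: host, port, and handler path are assumptions.
BASE_URL = "http://localhost:8983/solr/select"


def build_select_url(base_url, text, response_format="json"):
    """Build a query URL; field names here are illustrative, not from a real schema."""
    params = [
        ("q", text),
        # Date math: restrict to documents from the last seven days, relative to NOW.
        ("fq", "timestamp:[NOW-7DAYS TO NOW]"),
        # Sort by more than one field.
        ("sort", "price asc, score desc"),
        # Ask for highlighted context snippets from one field.
        ("hl", "true"),
        ("hl.fl", "description"),
        # Select the response format (for example xml, json, python, ruby).
        ("wt", response_format),
    ]
    return base_url + "?" + urlencode(params)


url = build_select_url(BASE_URL, "video card")
print(url)
```

The sketch only builds the URL and does not send a request, so it runs without a Solr server. Passing response_format="python" or "ruby" would request one of the other formats named in the list.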
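
The autowarming behavior described in the list above can be illustrated with a small, self-contained sketch. This is not Solr's implementation. The class LRUCache, the function autowarm, and the parameter autowarm_count are hypothetical names chosen for this example. The idea shown is that the most recently used keys of the old searcher's cache are recomputed against the new index and placed in the new searcher's cache before it goes live.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny least-recently-used cache (illustrative only)."""

    def __init__(self, max_size):
        self.max_size = max_size
        self._items = OrderedDict()  # ordered from oldest to most recently used

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        while len(self._items) > self.max_size:
            self._items.popitem(last=False)  # evict the least recently used entry

    def most_recent_keys(self, count):
        if count <= 0:
            return []
        return list(self._items.keys())[-count:]


def autowarm(old_cache, new_cache, regenerate, autowarm_count):
    """Recompute the hottest old keys against the new index and pre-fill the new cache."""
    for key in old_cache.most_recent_keys(autowarm_count):
        new_cache.put(key, regenerate(key))


# Stand-ins for query results computed against the old and new index versions.
old_index = {"a": 1, "b": 2, "c": 3}
new_index = {"a": 10, "b": 20, "c": 30}

old_cache = LRUCache(3)
for k in ("a", "b", "c"):
    old_cache.put(k, old_index[k])
old_cache.get("a")  # "a" becomes the most recently used key

new_cache = LRUCache(3)
autowarm(old_cache, new_cache, lambda k: new_index[k], autowarm_count=2)
```

After warming, the new cache holds fresh values for the two most recently used keys ("c" and "a"), while the colder key "b" is left to be computed on demand.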
This is the first release since Solr graduated from the Incubator, bringing many new features, including CSV/delimited-text data loading, time-based autocommit, faster faceting, negative filters, a spell-check handler, sounds-like word filters, regex text filters, and more flexible plugins. A two-part series of articles was also recently published on developerWorks that walks through the process of installing, configuring, using, and tuning Solr in more detail.