Lucene PMC give first glimpse into Apache Lucene and Solr 4.0
The Lucene PMC have released details of the upgrades to both Apache Lucene, the Java text search engine, and Apache Solr, its sister open source NoSQL search platform, with a 4.0 alpha release.
For Lucene 4.0, the focus has been placed on performance, with new pluggable codecs for handling the term indexes, added support for stored fields and document values, and a new DirectSpellChecker. Fuzzy queries are claimed to run "100 to 200 times faster" and the in-memory representation is said to be more efficient.
Many of the dual alpha release's additional features are devoted to Solr, and much of the priority in this release has been placed on Solr Cloud, which adds distributed capabilities claimed to bring easy scalability to the standalone full-text search server. Distributed indexing has been redesigned from the ground up to make near real-time searching a reality. The introduction of further NoSQL features makes it a viable solution as a data store, with versioning and optimistic locking. The release notes claim that Solr Cloud also creates 'high availability with no single points of failure'. Solr is also getting a boost from integration with Apache ZooKeeper, the highly reliable distributed coordination service at the heart of the Hadoop ecosystem.
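The idea behind versioning with optimistic concurrency control can be sketched in a few lines of Python. This is a toy in-memory illustration rather than Solr's implementation: the `VersionedStore` class, its method names and the integer version counter are all hypothetical. The point it shows is that a write succeeds only if the document has not changed since the writer last read it.

```python
class VersionConflict(Exception):
    """Raised when the caller's expected version is stale."""


class VersionedStore:
    """Toy in-memory document store with per-document version numbers.

    Hypothetical illustration only; not how Solr stores documents.
    """

    def __init__(self):
        self._docs = {}  # doc_id -> (version, fields)

    def get(self, doc_id):
        """Return (version, fields), or (0, None) if the document is absent."""
        return self._docs.get(doc_id, (0, None))

    def update(self, doc_id, fields, expected_version):
        """Write only if the stored version still equals expected_version."""
        current_version, _ = self.get(doc_id)
        if current_version != expected_version:
            raise VersionConflict(
                f"{doc_id}: expected v{expected_version}, found v{current_version}"
            )
        new_version = current_version + 1
        self._docs[doc_id] = (new_version, dict(fields))
        return new_version


store = VersionedStore()
v1 = store.update("doc1", {"title": "Lucene"}, expected_version=0)

# Two clients read the same version of the document.
seen_by_a, _ = store.get("doc1")
seen_by_b, _ = store.get("doc1")

# Client A writes first and succeeds.
store.update("doc1", {"title": "Lucene 4.0"}, expected_version=seen_by_a)

# Client B's write is rejected because the version it read is now stale.
try:
    store.update("doc1", {"title": "Solr"}, expected_version=seen_by_b)
    conflict = False
except VersionConflict:
    conflict = True
```

No locks are held between the read and the write. A client that loses the race simply re-reads the document and retries, which is why the approach suits a distributed store.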
Other new features scheduled for Solr 4 include pivot faceting (where the top constraints for one field can be found for each top constraint of another field), pseudo-fields (to add metadata along with returned documents) and pseudo-join (for selecting a set of documents based on their relationship to another set of documents). Aesthetically, there's a shiny new web-admin interface, including support for Solr Cloud.
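To make the pivot faceting description more concrete, here is a minimal Python sketch that computes the same kind of nested counts over a small in-memory list. It is not Solr code: the `pivot_facets` function, the `cat` and `lang` field names and the sample documents are invented for illustration.

```python
from collections import Counter, defaultdict


def pivot_facets(docs, outer_field, inner_field, limit=2):
    """For each top value of outer_field, list the top values of inner_field."""
    outer_counts = Counter(d[outer_field] for d in docs)

    # Count inner_field values separately for every outer_field value.
    inner_counts = defaultdict(Counter)
    for d in docs:
        inner_counts[d[outer_field]][d[inner_field]] += 1

    result = []
    for outer_value, count in outer_counts.most_common(limit):
        result.append({
            "value": outer_value,
            "count": count,
            "pivot": inner_counts[outer_value].most_common(limit),
        })
    return result


# Hypothetical sample documents.
docs = [
    {"cat": "books", "lang": "en"},
    {"cat": "books", "lang": "en"},
    {"cat": "books", "lang": "en"},
    {"cat": "books", "lang": "de"},
    {"cat": "music", "lang": "en"},
    {"cat": "music", "lang": "fr"},
    {"cat": "music", "lang": "fr"},
    {"cat": "film", "lang": "en"},
]

facets = pivot_facets(docs, "cat", "lang")
for facet in facets:
    print(facet["value"], facet["count"], facet["pivot"])
```

With this sample data the two most common categories are `books` and `music`, and each one carries its own ranking of languages, which is the "top constraints for each top constraint" relationship described above.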
Solr uses the Lucene Java search library at its core for full-text indexing and search, and adds REST-like HTTP/XML and JSON APIs that make it easy to use from other languages. Solr's external configuration allows it to be tailored to almost any type of application without Java coding, and it has an extensive and flexible plugin architecture for cases where more advanced customisation is required.
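Because the API is plain HTTP, a client in any language only needs to build a URL and parse the response. The Python sketch below builds a query URL that asks for JSON output and, optionally, pivot facets. The base URL `http://localhost:8983/solr`, the `/select` path and the field names are assumptions for a local test setup and will differ between installations, and `build_select_url` is a hypothetical helper. No request is actually sent.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_select_url(base_url, query, pivot_fields=None):
    """Build a Solr-style select URL; base_url and field names are assumed."""
    params = [("q", query), ("wt", "json")]  # wt=json requests a JSON response
    if pivot_fields:
        params.append(("facet", "true"))
        params.append(("facet.pivot", ",".join(pivot_fields)))
    return f"{base_url}/select?{urlencode(params)}"


url = build_select_url(
    "http://localhost:8983/solr",  # assumed local instance
    "title:lucene",
    pivot_fields=["cat", "lang"],
)
print(url)

# Decode the query string again to check what a server would receive.
decoded = parse_qs(urlsplit(url).query)
print(decoded)
```

The resulting URL can be passed to any HTTP client, such as `urllib.request.urlopen`, once a Solr instance is actually running at that address.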
This is an alpha release for early adopters, so it isn't intended for production environments, but it's well worth getting to grips with all the new stuff. Check out the release notes and download Solr here.