New release

Apache Lucene and Solr 5.4 get joint update

Uwe Schindler

For all search-engine geeks out there, the Apache Lucene Team has a very special gift for the holidays: Version 5.4.0 of Apache Lucene and Apache Solr has been released! 

As always, the changes to Apache Lucene, the high-performance full-text search library, include many improvements to performance and memory requirements. MatchAllDocsQuery (which is often used in Solr for a facet-based drilldown without a full-text query) has been sped up significantly. Due to changes in the query structure, this query type had become considerably slower in Lucene 5.

Lucene 5.4

The reason for this was Lucene's migration from the classic bitset-based filters, which were a separate query tool, to the universal use of Query, which had always forced scoring until now. Lucene 5.4 has tidied up other areas too, and the Filter class is now marked as fully deprecated. New code should always use Query and mark it as a filter clause (BooleanClause.Occur.FILTER) in boolean queries.

Furthermore, additional memory improvements for column-based fields (DocValues) will allow for more throughput.

There have been improvements in text analysis and tokenization. Lucene can now analyse Serbian text and can also identify the Arabic origin of Asian digits. Moreover, a new tokenizer uses Unicode's whitespace definition and therefore also splits on non-breaking spaces (NBSP).
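To illustrate why the Unicode whitespace definition matters, here is a minimal sketch that uses only the Java standard library. It is not Lucene's tokenizer; the class and method names are hypothetical. It shows that Java's own Character.isWhitespace does not treat NBSP as whitespace, whereas the Unicode White_Space property does. The digit-folding helper assumes that "identifying the Arabic origin" means mapping decimal digits from other scripts to 0-9, which is one possible reading of the sentence above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class UnicodeWhitespaceSketch {
    // \p{IsWhite_Space} is the Unicode White_Space binary property,
    // which includes U+00A0 (NBSP).
    private static final Pattern UNICODE_WS = Pattern.compile("\\p{IsWhite_Space}+");

    // Splits text on runs of Unicode whitespace and drops empty tokens.
    static List<String> splitOnUnicodeWhitespace(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : UNICODE_WS.split(text)) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    // Replaces every Unicode decimal digit with its ASCII counterpart (assumed reading).
    static String foldDigits(String text) {
        StringBuilder out = new StringBuilder(text.length());
        text.codePoints().forEach(cp -> {
            int value = Character.isDigit(cp) ? Character.digit(cp, 10) : -1;
            if (value >= 0) {
                out.append((char) ('0' + value));
            } else {
                out.appendCodePoint(cp);
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // Java's built-in check explicitly excludes the non-breaking space.
        System.out.println(Character.isWhitespace('\u00A0'));
        System.out.println(splitOnUnicodeWhitespace("Lucene\u00A05.4 is\there"));
        // U+0665 and U+0664 are the Arabic-Indic digits five and four.
        System.out.println(foldDigits("\u0665\u0664"));
    }
}
```

A tokenizer that relies on Character.isWhitespace would keep "Lucene" and "5.4" together as one token in the sample above, which is the behaviour the Unicode-based definition avoids.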

And finally, Lucene 5.4 is now running on Java 9 Jigsaw. The use of Lucene in combination with a security manager is much easier due to security fixes.

Solr 5.4

Apache Solr is built on Apache Lucene, which means that Lucene features can be used in Solr right away. Among the most important improvements here are the drilldown enhancements and the reduced memory requirements of DocValues fields. Thanks to the new faceting API in Solr, these fields can achieve a 100% increase in performance.

The standard QueryParser of Solr also received a new syntax. Filters can now be directly declared in query strings due to the combination of filter and query classes. Any filter query can take part in a boolean search without contributing to the score.
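As a rough sketch of what such a query string can look like, the following stdlib-only Java snippet builds a request parameter that uses the filter(...) syntax of the standard query parser. The field names text and inStock, as well as the class and method names, are made up for illustration; only the filter(...) wrapper itself comes from Solr.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class SolrFilterQuerySketch {
    // Combines a scored clause with a clause wrapped in filter(...).
    // Both clauses are required (+), but only the first one is meant to affect ranking.
    static String filteredQuery(String scoredClause, String filterClause) {
        return "+" + scoredClause + " +filter(" + filterClause + ")";
    }

    // URL-encodes the query so that it can be sent as the q parameter.
    static String toRequestParam(String query) {
        try {
            return "q=" + URLEncoder.encode(query, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is guaranteed to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String query = filteredQuery("text:lucene", "inStock:true");
        System.out.println(query);
        System.out.println(toRequestParam(query));
    }
}
```

Before this syntax existed, the non-scoring part would typically have been passed in a separate fq parameter; with filter(...) it can sit anywhere inside the boolean expression.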

Finally, the official SolrJ Java client now supports basic authentication. The new AngularJS-based admin interface is officially ready to use and will replace the old one in Solr 5.5. If you have some spare time during Christmas, feel free to test it and report bugs.
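For readers unfamiliar with the scheme, the sketch below shows what HTTP basic authentication puts on the wire. It deliberately does not use the SolrJ API; it only builds the Authorization header value with the Java standard library, and the class name and credentials are placeholders.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class BasicAuthHeaderSketch {
    // Builds the value of the HTTP Authorization header for the Basic scheme:
    // the word "Basic" followed by base64("user:password").
    static String basicAuthHeaderValue(String user, String password) {
        String credentials = user + ":" + password;
        byte[] bytes = credentials.getBytes(StandardCharsets.UTF_8);
        return "Basic " + Base64.getEncoder().encodeToString(bytes);
    }

    public static void main(String[] args) {
        // Placeholder credentials for illustration only.
        System.out.println("Authorization: " + basicAuthHeaderValue("user", "pass"));
    }
}
```

Because the credentials are only encoded and not encrypted, basic authentication is normally combined with TLS.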

A list of changes in Apache Lucene and Apache Solr can be found on the official website. Have fun searching for Santa Claus.

Uwe Schindler
Uwe Schindler is a member of the Project Management Committee at the Apache Lucene Project. He and his Consulting firm SD DataSolutions GmbH are based in Bremen. He regularly blogs at The Generic Policeman's Blog.
