Lucene™ Core News

Apache Lucene is a high-performance, full-featured search engine library written entirely in Java. It is a technology suitable for nearly any application that requires structured search, full-text search, faceting, nearest-neighbor search across high-dimensionality vectors, spell correction or query suggestions.

You may also read this news as an Atom feed.

20 December 2024 - Apache Lucene™ 10.1.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 10.1.0.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 10.1.0 Release Highlights:

New Features

  • Add IndexInput::isLoaded to determine if the contents of an input are resident in physical memory.
  • FeatureField now supports storing term vectors.

Improvements

  • TieredMergePolicy now allows merging up to maxMergeAtOnce segments for merges below the floor segment size, even if maxMergeAtOnce is greater than segmentsPerTier. This makes it more efficient to configure TieredMergePolicy to merge segments aggressively by configuring a high value of floorSegmentSize (e.g. 64MB), a low value of segmentsPerTier (e.g. 4) and a high value of maxMergeAtOnce (e.g. 32).
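
The aggressive-merging configuration described above might look like the following fragment. It is a sketch only: the class and method names (AggressiveMergeConfig, newConfig) are hypothetical, and the values are just the examples quoted in the highlight, not tuned recommendations. It needs the Lucene core library on the classpath.

```java
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.TieredMergePolicy;

/** Hypothetical helper; the name is an assumption made for this example. */
final class AggressiveMergeConfig {

  /** Builds an IndexWriterConfig with the example values from the highlight. */
  static IndexWriterConfig newConfig() {
    TieredMergePolicy mergePolicy = new TieredMergePolicy();
    mergePolicy.setFloorSegmentMB(64.0);  // high floor segment size (64MB)
    mergePolicy.setSegmentsPerTier(4.0);  // low number of segments per tier
    mergePolicy.setMaxMergeAtOnce(32);    // high number of segments merged at once
    return new IndexWriterConfig().setMergePolicy(mergePolicy);
  }
}
```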

Optimizations

  • Many speedups to top-k query evaluation, in particular: top-level disjunctions, filtered disjunctions, conjunctions, DisjunctionMaxQuery.
  • Speedup to exhaustive evaluation of conjunctive queries by vectorizing the intersection of postings lists.
  • Reduced contention for top-k query evaluation when IndexSearcher is configured with an executor.
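
For readers unfamiliar with the term, evaluating a conjunction comes down to intersecting sorted lists of doc IDs. The scalar sketch below shows only that baseline operation. It is not Lucene's implementation and does not show the vectorized variant; the class and method names are made up for illustration.

```java
import java.util.Arrays;

/** Illustrative only: a scalar intersection of two sorted doc ID lists. */
final class PostingsIntersection {

  /** Returns the doc IDs present in both sorted, duplicate-free arrays. */
  static int[] intersect(int[] a, int[] b) {
    int[] out = new int[Math.min(a.length, b.length)];
    int i = 0, j = 0, n = 0;
    while (i < a.length && j < b.length) {
      if (a[i] < b[j]) {
        i++;              // a is behind, advance it
      } else if (a[i] > b[j]) {
        j++;              // b is behind, advance it
      } else {
        out[n++] = a[i];  // the doc ID appears in both lists
        i++;
        j++;
      }
    }
    return Arrays.copyOf(out, n);
  }

  public static void main(String[] args) {
    int[] hits = intersect(new int[] {1, 4, 7, 9, 12}, new int[] {2, 4, 9, 10, 12, 15});
    System.out.println(Arrays.toString(hits)); // prints [4, 9, 12]
  }
}
```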

... plus a multitude of helpful bug fixes!

Please read CHANGES.txt for a full list of new features and changes:

https://lucene.apache.org/core/10_1_0/changes/Changes.html

13 December 2024 - Apache Lucene™ 9.12.1 available

The Lucene PMC is pleased to announce the release of Apache Lucene 9.12.1.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 9.12.1 Release Highlights:

Improvements

  • Allow easier configuration of the Panama vectorization provider with newer Java versions. Set the org.apache.lucene.vectorization.upperJavaFeatureVersion system property to increase the set of Java versions that Panama vectorization will provide optimized implementations for.

Bug fixes

  • Fixed a backwards compatibility bug that caused sparse KNN indices (where not all documents have a vector) written with 9.0.0 to silently return very poor recall, without throwing an exception, when searched by any 9.x release.
  • Improve Tessellator logic when two holes share the same vertex with the polygon, which was failing for valid polygons.
  • Fix backwards compatibility bug that caused 9.12.0 to incorrectly throw IllegalStateException when trying to open an IndexReader on an index created with quantized (int4, int7, int8) KNN vectors using Lucene99HnswScalarQuantizedVectorsFormat.

Please read CHANGES.txt for a full list of changes:

https://lucene.apache.org/core/9_12_1/changes/Changes.html

14 October 2024 - Apache Lucene™ 10.0.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 10.0.0.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 10.0.0 Release Highlights:

System requirements

  • Lucene 10.0 requires JDK 21 or newer

API changes

  • KNN vector values now have a random-access API.
  • Deprecated APIs have been removed and a number of API changes have been made. Please consult the migration guide for an extensive list and actions to take to migrate to 10.0.

New Features

  • A new IndexInput#prefetch API has been added, allowing query evaluation logic to let the Directory know about regions of data that are about to be read. This helps perform I/O concurrently under the hood. MMapDirectory implements this API using the madvise system call and the MADV_WILLNEED flag on Linux and Mac OS.
  • Lucene now supports sparse indexing on doc values via FieldType#setDocValuesSkipIndexType. The sparse index will record the minimum and maximum values per block of doc IDs. Used in conjunction with index sorting to cluster similar documents together, this allows for very space-efficient and CPU-efficient filtering.
  • Search concurrency is now decoupled from the index geometry, so that an index can be searched using any number of threads, regardless of its number of segments.
  • K-means clustering on vectors.
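
The idea behind the sparse doc values index can be sketched in plain Java: record the minimum and maximum value per block of doc IDs, then skip any block whose range cannot overlap the filter. This is an illustration under simplifying assumptions (fixed block size, one value per document, everything in memory), not Lucene's on-disk format; BlockSkipIndexSketch and its members are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: per-block min/max metadata used to skip blocks in a range filter. */
final class BlockSkipIndexSketch {
  final long[] values;   // one value per doc ID
  final int blockSize;
  final long[] blockMin;
  final long[] blockMax;

  BlockSkipIndexSketch(long[] values, int blockSize) {
    this.values = values;
    this.blockSize = blockSize;
    int blocks = (values.length + blockSize - 1) / blockSize;
    blockMin = new long[blocks];
    blockMax = new long[blocks];
    for (int b = 0; b < blocks; b++) {
      long min = Long.MAX_VALUE;
      long max = Long.MIN_VALUE;
      int end = Math.min(values.length, (b + 1) * blockSize);
      for (int doc = b * blockSize; doc < end; doc++) {
        min = Math.min(min, values[doc]);
        max = Math.max(max, values[doc]);
      }
      blockMin[b] = min;
      blockMax[b] = max;
    }
  }

  /** Returns doc IDs whose value lies in [lo, hi]; blocksVisited[0] counts scanned blocks. */
  List<Integer> range(long lo, long hi, int[] blocksVisited) {
    List<Integer> hits = new ArrayList<>();
    for (int b = 0; b < blockMin.length; b++) {
      if (blockMax[b] < lo || blockMin[b] > hi) {
        continue; // no document in this block can match, skip it entirely
      }
      blocksVisited[0]++;
      int end = Math.min(values.length, (b + 1) * blockSize);
      for (int doc = b * blockSize; doc < end; doc++) {
        if (values[doc] >= lo && values[doc] <= hi) {
          hits.add(doc);
        }
      }
    }
    return hits;
  }
}
```

When the index is sorted on the same field, similar values land in the same blocks, so a narrow range touches few blocks. That is why the highlight pairs this feature with index sorting.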

Improvements

  • Lucene now opens files with the MADV_RANDOM advice by default on Linux and Mac OS. This results in better efficiency for indexes that exceed the size of the page cache, but can make it slower to load indexes in the page cache. It is possible to revert to the MADV_NORMAL read advice by default by passing -Dorg.apache.lucene.store.defaultReadAdvice=NORMAL as a JVM startup flag.
  • Snowball dictionaries have been upgraded, resulting in improved tokenization. This may require reindexing to ensure consistency of search results with pre-10.0 indexes.
  • The expressions module now uses MethodHandles and Dynamic Class-File Constants (JEP 309) in combination with hidden classes (JEP 371) to implement strict and type-safe calls to external functions. This makes it easier to extend expressions with custom functions in a secure way, because runtime linking of custom functions is no longer the responsibility of the expressions scripting engine. In addition, the hidden classes created by the expressions engine no longer suffer from global classloader locks.

... plus a multitude of helpful bug fixes!

Please read CHANGES.txt for a full list of new features and changes:

https://lucene.apache.org/core/10_0_0/changes/Changes.html

28 September 2024 - Apache Lucene™ 9.12.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 9.12.0.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 9.12.0 Release Highlights:

Security Fixes

  • Deserialization of Untrusted Data vulnerability in Apache Lucene Replicator - CVE-2024-45772

New Features

  • Improve intra-merge parallelism for many value types. (Ben Trent)
  • Add support for JDK 23 to the Panama Vectorization Provider. (Chris Hegarty)
  • Match-time aggregation engine with improved flexibility and performance. (Egor Potemkin, Shradha Shankar)

Improvements

  • Add Intervals.regexp and Intervals.range methods to produce IntervalsSource for regexp and range queries. (Mayya Sharipova)
  • Remove support for writing 8-bit scalar vector quantization. 4-bit and 7-bit quantization are still supported. (Michael McCandless)

Optimizations

  • Inline postings skip data to improve performance of queries that need skipping such as conjunctions. (Adrien Grand)
  • Optimizations to the decoding logic of blocks of postings. (Adrien Grand, Uwe Schindler, Greg Miller)
  • Avoid performance degradation with closing shared mapped segment data (Chris Hegarty, Michael Gibney, Uwe Schindler)

... plus a multitude of helpful bug fixes!

Please read CHANGES.txt for a full list of new features and changes:

https://lucene.apache.org/core/9_12_0/changes/Changes.html

24 September 2024 - Apache Lucene™ 8.11.4 available

The Lucene PMC is pleased to announce the release of Apache Lucene 8.11.4.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 8.11.4 Release Highlights:

  • There are no changes from Lucene 8.11.3

Please read CHANGES.txt for a full list of changes:

https://lucene.apache.org/core/8_11_4/changes/Changes.html

27 June 2024 - Apache Lucene™ 9.11.1 available

The Lucene PMC is pleased to announce the release of Apache Lucene 9.11.1.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 9.11.1 Release Highlights:

  • Fix performance regression in NumericComparator.
  • Remove intra-merge parallelism for everything except HNSW graph merges.
  • Fix bug that prevented adding a parent field to an index with no fields.
  • Fix IndexOutOfBoundsException thrown in DefaultPassageFormatter by unordered matches.
  • StringValueFacetCounts stops throwing NPE when faceting over an empty match-set.

Please read CHANGES.txt for a full list of changes:

https://lucene.apache.org/core/9_11_1/changes/Changes.html

6 June 2024 - Apache Lucene™ 9.11.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 9.11.0.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 9.11.0 Release Highlights:

New features

  • Add support for posix_madvise to MMapDirectory: If running on Linux/macOS and Java 21 or later, MMapDirectory uses IOContext to pass suitable MADV flags to the operating system kernel. This may improve paging logic, especially when working with large indexes under memory pressure.
  • Expand support for new scalar bit levels for HNSW vectors. This includes 4-bit vectors and an option to compress them to gain a 50% reduction in memory usage.
  • Recursive graph bisection is now supported on indexes that have blocks

Improvements

  • MergeScheduler can now provide an executor for intra-merge parallelism. The first implementation is the ConcurrentMergeScheduler.
  • Upgrade icu4j to version 74.2.

Optimizations

  • Use RWLock to access LRUQueryCache to reduce contention.
  • Speedup multi-segment HNSW graph search for diversifying child kNN queries.
  • Add a MemorySegment Vector scorer - for scoring without copying on-heap. This can improve search latency by almost 2x for byte vectors.
  • Switch to using optimized, primitive collections where possible to improve performance and heap utilization.

Please read CHANGES.txt for a full list of new features and changes:

https://lucene.apache.org/core/9_11_0/changes/Changes.html

20 February 2024 - Apache Lucene™ 9.10.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 9.10.0.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 9.10.0 Release Highlights:

New features

  • Support for similarity-based vector searches, i.e., finding all nearest neighbors whose similarity is greater than a configured threshold from a query vector. See [Byte|Float]VectorSimilarityQuery.
  • Index sorting is now compatible with block joins. See IndexWriterConfig#setParentField.
  • MMapDirectory now takes advantage of the now finalized JDK foreign memory API internally when running on Java 22 (or later). This was only supported with Java 19 to 21 until now.
  • SIMD vectorization now takes advantage of JDK vector incubator on Java 22. This was only supported with Java 20 or 21 until now.
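
To make the similarity-threshold idea concrete, here is a brute-force sketch in plain Java: instead of returning a fixed number of neighbors, it returns every vector whose similarity to the query exceeds a threshold. It assumes dot product as the similarity function and scans all vectors, whereas the real queries work against the index. SimilarityThresholdSketch and its methods are hypothetical names, not Lucene API.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: exhaustive threshold search over in-memory float vectors. */
final class SimilarityThresholdSketch {

  /** Dot product of two vectors of equal length. */
  static float dot(float[] a, float[] b) {
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      sum += a[i] * b[i];
    }
    return sum;
  }

  /** Returns the positions of all vectors whose similarity to the query is greater than threshold. */
  static List<Integer> search(float[][] vectors, float[] query, float threshold) {
    List<Integer> hits = new ArrayList<>();
    for (int i = 0; i < vectors.length; i++) {
      if (dot(vectors[i], query) > threshold) {
        hits.add(i);
      }
    }
    return hits;
  }
}
```

Unlike a top-k search, the number of results here depends on the data: a strict threshold may return nothing, a loose one may return most of the collection.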

Optimizations

  • Tail postings are now encoded using group-varint. This yielded speedups on queries that match lots of terms that have short postings lists in Lucene's nightly benchmarks.
  • Range queries on points now exit earlier when evaluating a segment that has no matches. This will improve performance when intersected with other queries that have a high up-front cost such as multi-term queries.
  • BooleanQueries that mix SHOULD and FILTER clauses now propagate minimum competitive scores to the SHOULD clauses, yielding significant speedups for top-k queries sorted by descending score.
  • IndexSearcher#count has been optimized on pure disjunctions of two term queries.
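
Group-varint, mentioned in the first optimization above, packs integers in groups of four: one header byte stores the byte length of each of the four values (2 bits each), followed by the values' payload bytes. The sketch below shows the general technique in plain Java. The exact bit layout is an assumption chosen for readability and may differ from the one Lucene uses, and GroupVarIntSketch is a hypothetical name.

```java
import java.io.ByteArrayOutputStream;

/** Illustrative only: group-varint encoding of ints in groups of four. */
final class GroupVarIntSketch {

  /** Number of bytes (1 to 4) needed to store v, treating it as unsigned. */
  static int bytesNeeded(int v) {
    if ((v & 0xFFFFFF00) == 0) return 1;
    if ((v & 0xFFFF0000) == 0) return 2;
    if ((v & 0xFF000000) == 0) return 3;
    return 4;
  }

  /** Encodes values in groups of four; values.length must be a multiple of 4. */
  static byte[] encode(int[] values) {
    if (values.length % 4 != 0) {
      throw new IllegalArgumentException("length must be a multiple of 4");
    }
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (int g = 0; g < values.length; g += 4) {
      int header = 0;
      for (int k = 0; k < 4; k++) {
        header |= (bytesNeeded(values[g + k]) - 1) << (2 * k); // 2 bits per value
      }
      out.write(header);
      for (int k = 0; k < 4; k++) {
        int v = values[g + k];
        for (int n = bytesNeeded(v); n > 0; n--) {
          out.write(v & 0xFF); // payload bytes, least significant first
          v >>>= 8;
        }
      }
    }
    return out.toByteArray();
  }

  /** Decodes count values (a multiple of 4) previously produced by encode. */
  static int[] decode(byte[] in, int count) {
    int[] values = new int[count];
    int pos = 0;
    for (int g = 0; g < count; g += 4) {
      int header = in[pos++] & 0xFF;
      for (int k = 0; k < 4; k++) {
        int n = ((header >>> (2 * k)) & 3) + 1; // byte length of the k-th value
        int v = 0;
        for (int b = 0; b < n; b++) {
          v |= (in[pos++] & 0xFF) << (8 * b);
        }
        values[g + k] = v;
      }
    }
    return values;
  }
}
```

Because all four lengths are known after reading a single header byte, decoding avoids the per-byte continuation checks of classic variable-length integers.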

Please read CHANGES.txt for a full list of new features and changes:

https://lucene.apache.org/core/9_10_0/changes/Changes.html

8 February 2024 - Apache Lucene™ 8.11.3 available

The Lucene PMC is pleased to announce the release of Apache Lucene 8.11.3.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 8.11.3 Release Highlights:

  • A number of bugs in polygon tessellating have been fixed.
  • GC load during indexing has been reduced by estimating FST BytesStore block size.
  • BKD trees will no longer possibly overflow when more than 4 billion points are added.

Please read CHANGES.txt for a full list of changes:

https://lucene.apache.org/core/8_11_3/changes/Changes.html

29 January 2024 - Apache Lucene™ 9.9.2 available

The Lucene PMC is pleased to announce the release of Apache Lucene 9.9.2.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 9.9.2 Release Highlights:

Bug fixes

  • Fix NPE when sampling for quantization in Lucene99HnswScalarQuantizedVectorsFormat (Ben Trent)
  • Rollback the tmp storage of BytesRefHash to -1 after sort (Guo Feng)

Please read CHANGES.txt for a full list of changes:

https://lucene.apache.org/core/9_9_2/changes/Changes.html

16 December 2023 - Apache Lucene™ 9.9.1 available

The Lucene PMC is pleased to announce the release of Apache Lucene 9.9.1.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 9.9.1 Release Highlights:

Bug fixes

  • Fix a JVM SIGSEGV crash when compiling computeCommonPrefixLengthAndBuildHistogram (Chris Hegarty)
  • Push and pop OutputAccumulator as IntersectTermsEnumFrames are pushed and popped (Guo Feng, Mike McCandless)

Please read CHANGES.txt for a full list of changes:

https://lucene.apache.org/core/9_9_1/changes/Changes.html

4 December 2023 - Apache Lucene™ 9.9.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 9.9.0.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 9.9.0 Release Highlights:

New Features

  • Add int8 scalar quantization to the HNSW vector format. This optionally allows for more compact lossy storage for the vectors, requiring approximately 4x less memory for fast HNSW search.
  • HNSW graphs can now be merged with multiple threads, leveraging the same infrastructure that inter-segment concurrency utilizes.
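
The roughly 4x memory saving of int8 quantization follows from storing one byte per dimension instead of a 4-byte float. A minimal linear quantizer in plain Java illustrates the trade-off. It assumes a fixed, known [min, max] range and 256 levels, which is a simplification and not necessarily how Lucene picks its quantization range; ScalarQuantizationSketch is a hypothetical name.

```java
/** Illustrative only: linear quantization of floats in [min, max] to one byte each. */
final class ScalarQuantizationSketch {
  final float min;
  final float max;

  ScalarQuantizationSketch(float min, float max) {
    if (!(max > min)) {
      throw new IllegalArgumentException("max must be greater than min");
    }
    this.min = min;
    this.max = max;
  }

  /** Maps each component to one of 256 levels; values outside the range are clamped. */
  byte[] quantize(float[] v) {
    byte[] out = new byte[v.length];
    float scale = 255f / (max - min);
    for (int i = 0; i < v.length; i++) {
      float clamped = Math.max(min, Math.min(max, v[i]));
      out[i] = (byte) Math.round((clamped - min) * scale); // level 0..255 stored in one byte
    }
    return out;
  }

  /** Approximate inverse of quantize; the error is at most about half a quantization step. */
  float[] dequantize(byte[] q) {
    float[] out = new float[q.length];
    float step = (max - min) / 255f;
    for (int i = 0; i < q.length; i++) {
      out[i] = min + (q[i] & 0xFF) * step; // & 0xFF reads the byte as unsigned
    }
    return out;
  }
}
```

The storage is lossy: each component is recovered only to within about half a step, which is the "more compact lossy storage" the highlight refers to.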

Improvements

  • Speed up Panama vector support, use FMA, and test improvements.
  • FSTCompiler can now approximately limit how much RAM it uses to share suffixes during FST construction using the suffixRAMLimitMB method.

Optimizations

  • Faster top-level conjunctions on term queries when sorting by descending score.
  • Change Postings back to using FOR in Lucene99PostingsFormat. Freqs, positions and offset keep using PFOR.

... plus a multitude of helpful bug fixes!

Please read CHANGES.txt for a full list of new features and changes:

https://lucene.apache.org/core/9_9_0/changes/Changes.html

28 September 2023 - Apache Lucene™ 9.8.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 9.8.0.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 9.8.0 Release Highlights:

Optimizations

  • Faster computation of top-k hits on boolean queries. Lucene's nightly benchmarks report a 20%-30% speedup for disjunctive queries and an 11%-13% speedup for conjunctive queries since Lucene 9.7. Disjunctive queries with many and/or high-frequency terms should see even higher speedups.
  • Faster computation of top-k hits when sorting by field. Lucene's nightly benchmarks report speedups between 7% and 33% since 9.7 depending on the type and cardinality of the field that is used for sorting.
  • Faster indexing of numeric doc values when index sorting is turned on.
  • Expressions now evaluate all arguments in a fully lazy manner, which may provide significant speedups and throughput improvements for heavy expression users.
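
As background for the top-k items above: top-k collection typically keeps the k best hits in a min-heap, and the weakest kept score acts as a bar that later hits must beat. The plain-Java sketch below shows only that bookkeeping, not the query-side skipping that the benchmarked speedups come from. TopKSketch is a hypothetical name and not part of Lucene.

```java
import java.util.PriorityQueue;

/** Illustrative only: top-k selection with a min-heap of the best scores seen so far. */
final class TopKSketch {

  /** Returns the k highest scores in descending order (fewer if there are fewer scores). */
  static float[] topK(float[] scores, int k) {
    if (k <= 0) {
      return new float[0];
    }
    PriorityQueue<Float> heap = new PriorityQueue<>(); // min-heap: head is the weakest kept hit
    for (float s : scores) {
      if (heap.size() < k) {
        heap.add(s);
      } else if (s > heap.peek()) { // only hits above the current minimum are competitive
        heap.poll();
        heap.add(s);
      }
    }
    float[] out = new float[heap.size()];
    for (int i = out.length - 1; i >= 0; i--) {
      out[i] = heap.poll(); // smallest first, so fill from the back
    }
    return out;
  }
}
```

Once the heap is full, any hit that cannot exceed its head can be discarded without being scored in detail, which is what makes a minimum competitive score useful to the query.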

API Changes

  • Move max vector dims limit to Codec (Mayya Sharipova)

New features

  • Introduced LeafCollector#finish, a hook that runs after collection has finished running on a leaf.
  • Add "KnnCollector" to "LeafReader" and "KnnVectorReader" so that custom collection of vector search results can be provided. The first custom collector provides "ToParentBlockJoin[Float|Byte]KnnVectorQuery" joining child vector documents with their parent documents.
  • Add support for recursive graph bisection, also called bipartite graph partitioning, and often abbreviated BP, an algorithm for reordering doc IDs that results in more compact postings and faster queries, especially conjunctions.

Bug fixes

  • Fix HNSW graph search bug that potentially leaked unapproved docs
  • Fix bug in TermsEnum#seekCeil on doc values terms enums that causes IndexOutOfBoundsException.

Please read CHANGES.txt for a full list of new features and changes:

https://lucene.apache.org/core/9_8_0/changes/Changes.html

25 June 2023 - Apache Lucene™ 9.7.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 9.7.0.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 9.7.0 Release Highlights:

New features

  • The new IndexWriter#updateDocuments(Query, Iterable) allows updating multiple documents that match a query at the same time.
  • Function queries can now compute similarity scores between kNN vectors.

Optimizations

  • KNN indexing and querying can now take advantage of vectorization for distance computation between vectors. To enable this, use exactly Java 20 or 21, and pass --add-modules jdk.incubator.vector as a command-line parameter to the Java program.
  • KNN queries now run concurrently if the IndexSearcher has been created with an executor.
  • Queries sorted by field are now able to dynamically prune hits only using the after value. This yields major speedups when paginating deeply.
  • Reduced merge-time overhead of computing the number of soft deletes.

Changes in runtime behavior

  • KNN vectors are now disallowed to have non-finite values such as NaN or ±Infinity.
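
Applications that build vectors from external data may want to validate them before indexing so that a bad value is caught early. A minimal check in plain Java could look like this. VectorValidationSketch and checkFinite are hypothetical names, not Lucene API, and the exception type is an assumption of this sketch.

```java
/** Illustrative only: rejects vectors containing NaN or infinite components. */
final class VectorValidationSketch {

  /** Throws IllegalArgumentException if any component is NaN, +Infinity or -Infinity. */
  static void checkFinite(float[] vector) {
    for (int i = 0; i < vector.length; i++) {
      if (!Float.isFinite(vector[i])) {
        throw new IllegalArgumentException(
            "non-finite value at dimension " + i + ": " + vector[i]);
      }
    }
  }
}
```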

Bug fixes

  • Backward reading is no longer an adversarial case for BufferedIndexInput, used by NIOFSDirectory and SimpleFSDirectory. This addresses a performance bug when performing terms dictionary lookups with either of these directories.
  • GraphTokenStreamFiniteStrings#articulationPointsRecurse may no longer overflow the stack.
  • ... plus a number of helpful bug fixes!

Please read CHANGES.txt for a full list of new features and changes:

https://lucene.apache.org/core/9_7_0/changes/Changes.html

9 May 2023 - Apache Lucene™ 9.6.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 9.6.0.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 9.6.0 Release Highlights:

  • Introduce a new KeywordField for simple and efficient filtering, sorting and faceting.
  • Add support for Java 20 foreign memory API. If exactly Java 19 or 20 is used, MMapDirectory will mmap Lucene indexes in chunks of 16 GiB (instead of 1 GiB) and indexes closed while queries are running can no longer crash the JVM.
  • Improved performance for TermInSetQuery, PrefixQuery, WildcardQuery and TermRangeQuery
  • Lower memory usage for BloomFilteringPostingsFormat
  • Faster merges for HNSW indexes
  • Improvements to concurrent indexing throughput under heavy load
  • Correct equals implementation in SynonymQuery
  • 'explain' is now implemented on TermAutomatonQuery

Please read CHANGES.txt for a full list of new features and changes:

https://lucene.apache.org/core/9_6_0/changes/Changes.html

30 January 2023 - Apache Lucene™ 9.5.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 9.5.0.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 9.5.0 Release Highlights:

New features

  • Added KnnByteVectorField and ByteVectorQuery that are specialized for indexing and querying byte-sized vectors. Deprecated KnnVectorField, KnnVectorQuery and LeafReader#getVectorValues in favour of the newly introduced KnnFloatVectorField, KnnFloatVectorQuery and LeafReader#getFloatVectorValues that are specialized for float vectors.
  • Added IntField, LongField, FloatField and DoubleField: easy to use numeric fields that perform well both for filtering and sorting.
  • Support for Java 19 foreign memory access ("project Panama") was enabled by default, removing the need to provide the "--enable-preview" flag.
  • Added ByteWritesTrackingDirectoryWrapper to expose metrics for bytes merged, flushed, and overall write amplification factor.

Optimizations

  • Improved storage efficiency of connections in the HNSW graph used for vector search
  • Added new stored fields and term vectors interfaces: IndexReader#storedFields and IndexReader#termVectors. These do not rely upon ThreadLocal storage for each index segment, which can greatly reduce RAM requirements when there are many threads and/or segments.
  • Several improvements were made to IndexSortSortedNumericDocValuesRangeQuery including query execution optimization with points for descending sorts and BoundedDocIdSetIterator construction sped up using bkd binary search.

Other

  • Moved DocValuesNumbersQuery from sandbox to NumericDocValuesField#newSlowSetQuery
  • Fix exponential runtime for nested BooleanQuery#rewrite with non scoring clauses

Please read CHANGES.txt for a full list of new features and changes:

https://lucene.apache.org/core/9_5_0/changes/Changes.html

21 November 2022 - Apache Lucene™ 9.4.2 available

The Lucene PMC is pleased to announce the release of Apache Lucene 9.4.2.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 9.4.2 Release Highlights:

  • Fixed integer overflow when opening segments containing more than ~16M KNN vectors.
  • Fixed cost computation of BitSets created via DocIdSetBuilder, such as for multi-term queries. This may improve performance of multi-term queries.
  • CheckIndex now verifies the consistency of KNN vectors more thoroughly.

Please read CHANGES.txt for a full list of changes:

https://lucene.apache.org/core/9_4_2/changes/Changes.html

24 October 2022 - Apache Lucene™ 9.4.1 available

The Lucene PMC is pleased to announce the release of Apache Lucene 9.4.1.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 9.4.1 Release Highlights:

  • When reading large segments, the kNN vectors format could fail with a validation error, preventing further writes or searches on the index. This bug is now fixed. Only version 9.4.0 was affected, so it is recommended to skip 9.4.0 if you are using kNN vectors.

Please read CHANGES.txt for a full list of changes:

https://lucene.apache.org/core/9_4_1/changes/Changes.html

30 September 2022 - Apache Lucene™ 9.4.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 9.4.0.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 9.4.0 Release Highlights:

New features

  • Added ShapeDocValues/Field, a unified abstraction to represent existing types: XY and lat/long.
  • FacetSets can now be filtered using a Query via MatchingFacetSetCounts.
  • SortField now allows control over whether to apply index-sort optimizations.
  • Support for Java 19 foreign memory access ("project Panama") was added. Applications started with command line parameter "java --enable-preview" will automatically use the new foreign memory API of Java 19 to access indexes on disk with MMapDirectory. This is an opt-in feature and requires an explicit Java command-line flag to be passed to your application's Java process (e.g., modify startup parameters of Solr or Elasticsearch/OpenSearch)! When enabled, Lucene logs a notice using java.util.logging. Please test thoroughly and report bugs/slowness to Lucene's mailing list. When the new API is used, MMapDirectory will mmap Lucene indexes in chunks of 16 GiB (instead of 1 GiB) and indexes closed while queries are running can no longer crash the JVM.

Optimizations

  • Added support for dynamic pruning to queries sorted by a string field that is indexed with both terms and SORTED or SORTED_SET doc values. This can lead to dramatic speedups when applicable.
  • TermInSetQuery is optimized for the case when one of its terms matches all docs in a segment, and it now provides cost estimation, making it usable with IndexOrDocValuesQuery for better query planning.
  • KnnVector fields can now be stored with reduced (8-bit) precision, saving storage and yielding a small query latency improvement.

Other

  • KnnVector fields' HNSW graphs are now created incrementally when new documents are added, rather than all-at-once when flushing. This yields more consistent, predictable behavior at the cost of an overall increase in indexing time.
  • randomizedtesting dependency upgraded to 2.8.1
  • addIndexes(CodecReader) now respects MergePolicy and MergeScheduler, enabling it to do its work concurrently.

Please read CHANGES.txt for a full list of new features and changes:

https://lucene.apache.org/core/9_4_0/changes/Changes.html

29 July 2022 - Apache Lucene™ 9.3.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 9.3.0.

Apache Lucene is a high-performance, full-featured search engine library written entirely in Java. It is a technology suitable for nearly any application that requires structured search, full-text search, faceting, nearest-neighbor search across high-dimensionality vectors, spell correction or query suggestions.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 9.3.0 Release Highlights:

  • Merge on full flush is enabled now by default with a timeout of 500ms, giving the merge policy a chance to merge NRT segments together before publishing a new point-in-time view of the IndexReader. This should give queries a small performance boost in the near-realtime case, especially terms-dictionary-intensive queries like fuzzy queries.
  • Add getAllChildren functionality to facets.
  • Added facetsets module for high dimensional (hyper-rectangle) faceting.
  • Top-level two-clause disjunctions sorted by score now use the block-max MAXSCORE algorithm, which introduced a 40%-75% speedup in our benchmarks.
  • BooleanQuery can return quick counts for simple boolean queries.
  • When running KnnVectorQuery with a filter, reuse the cached filter bit set.

Please read CHANGES.txt for a full list of new features and changes:

https://lucene.apache.org/core/9_3_0/changes/Changes.html

17 June 2022 - Apache Lucene™ 8.11.2 available

The Lucene PMC is pleased to announce the release of Apache Lucene 8.11.2.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 8.11.2 Release Highlights:

Bug fixes

  • LUCENE-10564: Make sure SparseFixedBitSet#or updates ramBytesUsed.
  • LUCENE-10477: Highlighter: WeightedSpanTermExtractor.extractWeightedSpanTerms now calls Query#rewrite multiple times if necessary.

Optimizations

  • LUCENE-10481: FacetsCollector will not request scores if it does not use them.

Please read CHANGES.txt for a full list of changes:

https://lucene.apache.org/core/8_11_2/changes/Changes.html

23 May 2022 - Apache Lucene™ 9.2.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 9.2.0.

Apache Lucene is a high-performance, full-featured search engine library written entirely in Java. It is a technology suitable for nearly any application that requires structured search, full-text search, faceting, nearest-neighbor search across high-dimensionality vectors, spell correction or query suggestions.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 9.2.0 Release Highlights:

  • Numerous improvements to indexing and query performance for KNN vectors
  • More efficient implementations for count operations on range queries
  • A new FieldExistsQuery that chooses the best index structures to run over for you
  • A new Persian stemmer

Please read CHANGES.txt for a full list of new features and changes:

https://lucene.apache.org/core/9_2_0/changes/Changes.html

22 March 2022 - Apache Lucene™ 9.1.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 9.1.0.

Apache Lucene is a high-performance, full-featured search engine library written entirely in Java. It is a technology suitable for nearly any application that requires structured search, full-text search, faceting, nearest-neighbor search across high-dimensionality vectors, spell correction or query suggestions.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 9.1.0 Release Highlights:

New features

  • Lucene JARs are now proper Java modules, with module descriptors and dependency information
  • Support for filtering in nearest-neighbor vector search
  • Support for intervals queries in the standard query syntax
  • A new token filter SpanishPluralStemFilter for precise stemming of Spanish plurals

Optimizations

  • Up to 30% improvement in index throughput for high-dimensional vectors
  • Up to 10% faster nearest neighbor searches on high-dimensional vectors
  • Faster execution of "count" searches across different query types
  • Faster counting for taxonomy facets
  • Several other search speed-ups, including improvements to PointRangeQuery, MultiRangeQuery, and CoveringRangeQuery

Other

  • The test framework is now a module, so all classes have been moved to org.apache.lucene.tests.* to avoid package name conflicts
  • Lucene now faithfully implements the HNSW algorithm for nearest neighbor search by supporting multiple graph layers

… plus a number of helpful bug fixes!

Please read CHANGES.txt for a full list of new features and changes:

https://lucene.apache.org/core/9_1_0/changes/Changes.html

16 December 2021 - Apache Lucene™ 8.11.1 available

The Lucene PMC is pleased to announce the release of Apache Lucene 8.11.1.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains one bug fix. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 8.11.1 Release Highlights:

  • Log4j is upgraded to v2.16.0 to mitigate CVE-2021-44228 (for Luke users)

7 December 2021 - Apache Lucene™ 9.0.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 9.0.0.

Apache Lucene is a high-performance, full-featured search engine library written entirely in Java. It is a technology suitable for nearly any application that requires structured search, full-text search, faceting, nearest-neighbor search across high-dimensionality vectors, spell correction or query suggestions.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 9.0.0 Release Highlights:

System requirements

  • Lucene 9.0 requires JDK 11 or newer

New features

  • Support for indexing high-dimensionality numeric vectors to perform nearest-neighbor search, using the Hierarchical Navigable Small World graph algorithm
  • New Analyzers for Serbian, Nepali, and Tamil languages
  • IME-friendly autosuggest for Japanese
  • Snowball 2, adding Hindi, Indonesian, Nepali, Serbian, Tamil, and Yiddish stemmers
  • New normalization/stemming for Swedish and Norwegian

Optimizations

  • Up to 400% faster taxonomy faceting
  • 10-15% faster indexing of multi-dimensional points
  • Several times faster sorting on fields that are indexed with points. This optimization used to be an opt-in in late 8.x releases and is now opt-out as of 9.0.
  • ConcurrentMergeScheduler now assumes fast I/O, likely improving indexing speed in cases where heuristics would incorrectly detect whether the system had modern I/O or not
  • Encoding of postings lists changed from FOR-delta to PFOR-delta to save further disk space

Other

  • File formats have all been changed from big-endian order to little endian order
  • Lucene 9 no longer has split packages. This required renaming some packages outside of the lucene-core JAR, so you will need to adjust some imports accordingly.
  • Using Lucene 9 with the module system should be considered experimental. We expect to make progress on this in future 9.x releases.

Please read CHANGES.txt for a full list of new features and changes:

https://lucene.apache.org/core/9_0_0/changes/Changes.html

16 November 2021 - Apache Lucene™ 8.11.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 8.11.0.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 8.11.0 Release Highlights:

  • Facets now properly ignore deleted documents when accumulating facet counts for all documents.
  • CheckIndex can run concurrently.

Please read CHANGES.txt for a full list of new features and changes:

https://lucene.apache.org/core/8_11_0/changes/Changes.html

18 October 2021 - Apache Lucene™ 8.10.1 available

The Lucene PMC is pleased to announce the release of Apache Lucene 8.10.1.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains bug fixes. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 8.10.1 Release Highlights:

Bug fixes

  • MultiCollector now handles single leaf collector that wants to skip low-scoring hits but the combined score mode doesn't allow it.
  • Fix for sort optimization with search_after that was wrongly skipping documents whose values are equal to the last value of the previous page.
  • Fix for sort optimization with a chunked bulk scorer that was wrongly skipping documents.

Please read CHANGES.txt for a full list of changes:

https://lucene.apache.org/core/8_10_1/changes/Changes.html

27 September 2021 - Apache Lucene™ 8.10.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 8.10.0.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 8.10.0 Release Highlights:

New features

  • Multi-valued fields are now supported in numeric range facet counting
  • Added new analyzer for Telugu
  • Near-real-time readers opened from an IndexCommit can now sort their leaves
  • SimpleText codec now implements skipping for its postings lists

Optimizations

  • Performance improvements for faceting, including a new protected API to control which fields are counted for drill-down during drill sideways, and optimized drill sideways iterating
  • RegexpQuery's detection of adversarial (ReDoS) regular expressions is improved, catching exotic cases that it missed before, and throwing TooComplexToDeterminizeException
  • Speedup for computing the leading prefix and trailing suffix from an Automaton, and for managing powersets during determinize
  • Speedups for stored fields retrieval with the default codec (BEST_SPEED)
  • IndexWriter uses less RAM when buffering documents, especially in the case of many unique fields
  • forceMerge will now merge any number of segments at once, making it much faster in many cases
  • Compression improvements for docvalues storage

... plus a number of exciting bug fixes!

Please read CHANGES.txt for a full list of new features and changes:

https://lucene.apache.org/core/8_10_0/changes/Changes.html

16 June 2021 - Apache Lucene™ 8.9.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 8.9.0.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 8.9.0 Release Highlights:

  • Compression was added to SortedSet DocValues, which significantly reduces their size on disk.
  • BM25FQuery was extended to handle similarities beyond BM25Similarity. It was renamed to CombinedFieldQuery to reflect its more general scope.
  • A new PatternTypingFilter was added to allow setting a type attribute on tokens based on a configured set of regular expressions.
  • An option was added to supply a custom leaf sorter for IndexWriter and DirectoryReader, which makes it possible to speed up sort queries with a provided sort criterion.

Please read CHANGES.txt for a full list of new features and changes:

https://lucene.apache.org/core/8_9_0/changes/Changes.html

12 April 2021 - Apache Lucene™ 8.8.2 available

The Lucene PMC is pleased to announce the release of Apache Lucene 8.8.2.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains three bug fixes. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 8.8.2 Release Highlights:

  • LUCENE-9870: Fix Circle2D intersectsLine t-value (distance) range clamp
  • LUCENE-9744: NPE on a degenerate query in MinimumShouldMatchIntervalsSource$MinimumMatchesIterator.getSubMatches().
  • LUCENE-9762: DoubleValuesSource.fromQuery (also used by FunctionScoreQuery.boostByQuery) could throw an exception when the query implements TwoPhaseIterator and when the score is requested repeatedly

Please read CHANGES.txt for a full list of changes:

https://lucene.apache.org/core/8_8_2/changes/Changes.html

22 February 2021 - Apache Lucene™ 8.8.1 available

The Lucene PMC is pleased to announce the release of Apache Lucene 8.8.1.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 8.8.1 Release Highlights:

No changes from 8.8.0

Please read CHANGES.txt for a full list of changes:

https://lucene.apache.org/core/8_8_1/changes/Changes.html

29 January 2021 - Apache Lucene™ 8.8.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 8.8.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

http://lucene.apache.org/core/mirrors-core-latest-redir.html

Lucene 8.8 Release Highlights:

  • LatLonPoint query that accepts an array of LatLonGeometries, support for spatial relationships
  • XYPoint query that accepts an array of XYGeometries
  • Doc values now allow configuring how to trade compression for retrieval speed

Further details of changes are available in the change log available at: http://lucene.apache.org/core/8_8_0/changes/Changes.html

Please report any feedback to the mailing lists:

http://lucene.apache.org/core/discussion.html

Note: The Apache Software Foundation uses an extensive mirroring network for distributing releases. It is possible that the mirror you are using may not have replicated the release yet. If that is the case, please try another mirror. This also applies to Maven access.

3 November 2020 - Apache Lucene™ 8.7.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 8.7.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

http://lucene.apache.org/core/mirrors-core-latest-redir.html

Lucene 8.7 Release Highlights:

  • Better compression of stored fields. Stored fields now use dictionaries in order to improve the compression ratio when there is a lot of redundancy across documents. This works for both the BEST_SPEED and the BEST_COMPRESSION modes.
  • Faster sorting by field. When a doc-value field is also indexed with points, Lucene now takes advantage of this points index in order to skip documents whose sort value is not competitive.
  • Faster flushing of stored fields when index sorting is enabled. This can significantly speed up indexing when a non-negligible amount of data is stored in the index and index sorting is enabled.

Further details of changes are available in the change log available at: http://lucene.apache.org/core/8_7_0/changes/Changes.html

Please report any feedback to the mailing lists:

http://lucene.apache.org/core/discussion.html

Note: The Apache Software Foundation uses an extensive mirroring network for distributing releases. It is possible that the mirror you are using may not have replicated the release yet. If that is the case, please try another mirror. This also applies to Maven access.

7 October 2020 - Apache Lucene™ 8.6.3 available

The Lucene PMC is pleased to announce the release of Apache Lucene 8.6.3.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains no additional bug fixes over the previous version 8.6.2. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

1 September 2020 - Apache Lucene™ 8.6.2 available

The Lucene PMC is pleased to announce the release of Apache Lucene 8.6.2.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains one bug fix. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 8.6.2 Bug Fixes:

  • LUCENE-9478: IndexWriter leaked about 500 bytes of heap space for each full-flush, getReader or commit. This was a regression in 8.6.0.

Please read CHANGES.txt for a full list of changes:

https://lucene.apache.org/core/8_6_2/changes/Changes.html

13 August 2020 - Apache Lucene™ 8.6.1 available

The Lucene PMC is pleased to announce the release of Apache Lucene 8.6.1.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 8.6.1 Release Highlights:

  • LUCENE-9443: The UnifiedHighlighter was closing the underlying reader when there were multiple term-vector fields.

Please read CHANGES.txt for a full list of changes:

https://lucene.apache.org/core/8_6_1/changes/Changes.html

15 July 2020 - Apache Lucene™ 8.6.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 8.6.0.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 8.6.0 Release Highlights:

  • API change in: SimpleFSDirectory, IndexWriterConfig, MergeScheduler, SortFields, SimpleBindings, QueryVisitor, DocValues, CodecUtil.
  • New: IndexWriter merge-on-commit feature to selectively merge small segments on commit, subject to a configurable timeout, to improve search performance by reducing the number of small segments for searching.
  • New: Grouping by range based on DoubleValueSource and LongValueSource.
  • Optimizations: BKD trees and index, DoubleValuesSource/QueryValueSource, UsageTrackingQueryingCachingPolicy, FST, Geometry queries, Points, UniformSplit.
  • Others: Ukrainian analyzer, checksums verification, resource leaks fixes.

Please read CHANGES.txt for a full list of new features and changes:

https://lucene.apache.org/core/8_6_0/changes/Changes.html

26 May 2020 - Apache Lucene™ 8.5.2 available

The Lucene PMC is pleased to announce the release of Apache Lucene 8.5.2.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains one bug fix. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 8.5.2 Release Highlights:

  • LUCENE-9350: Don't cache automata on FuzzyQuery

Please read CHANGES.txt for a full list of changes:

https://lucene.apache.org/core/8_5_2/changes/Changes.html

28 April 2020 - Apache Lucene™ 7.7.3 available

The Lucene PMC is pleased to announce the release of Apache Lucene 7.7.3.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains 1 bugfix in Lucene. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Please read CHANGES.txt for a full list of changes:

https://lucene.apache.org/core/7_7_3/changes/Changes.html

16 April 2020 - Apache Lucene™ 8.5.1 available

The Lucene PMC is pleased to announce the release of Apache Lucene 8.5.1.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains one bug fix. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 8.5.1 Bug Fixes:

  • LUCENE-9300: Index corruption with doc values updates and addIndexes.

Further details of changes are available in the change log available at:

https://lucene.apache.org/core/8_5_1/changes/Changes.html

24 March 2020 - Apache Lucene™ 8.5.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 8.5.0.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 8.5.0 Release Highlights:

  • XYPointField allows you to index points in flat X,Y space and efficiently find documents that fall within a bounding box, distance or arbitrary polygon
  • New query builders on LatLonShape allow you to efficiently find documents with a specific relation to a point or polygon
  • You can now store up to 16 data dimensions in a Point field
  • KoreanTokenizer supports custom dictionaries
  • Binary doc values are now compressed, and term dictionaries have improved compression
  • Index flushes are up to 20% faster if all docvalues updates are updating a single field to the same value
  • The index of stored fields and term vectors is now stored off-heap
  • Query parsers based on QueryBuilder can boost particular terms or synonyms by setting BoostAttribute values on a token stream
  • Intervals queries correctly handle repeated subterms in ordered and unordered sources

Please read CHANGES.txt for a full list of new features and changes:

https://lucene.apache.org/core/8_5_0/changes/Changes.html

13 January 2020 - Apache Lucene™ 8.4.1 available

The Lucene PMC is pleased to announce the release of Apache Lucene 8.4.1.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 8.4.1 Release Highlights:

(No Changes since 8.4.0)

29 December 2019 - Apache Lucene™ 8.4.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 8.4.0.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 8.4.0 Release Highlights:

  • LatLonShape now supports the "CONTAINS" relation, which makes it possible to find all indexed shapes that contain the query shape.
  • Concurrent search is getting more efficient by allowing collectors to share information across threads in order to more efficiently skip non-competitive hits.
  • Faster FST lookups on dense nodes.
  • Postings are now decoded using SIMD instructions.
  • LRUQueryCache includes new heuristics that prevent caching from hurting latency too much.
  • LatLonShape builds a more efficient tree that is expected to translate into search speed improvements.
  • BaseDirectoryReader no longer sums up document counts across leaves eagerly, allowing for more efficient reader views that hide a subset of documents.
  • The index on top of BKD trees is now stored off-heap with MMapDirectory.
  • Simple Intervals queries support highlighting.
  • Reading DocValues can be interrupted when timeout is exceeded.

Please read CHANGES.txt for a full list of new features and changes:

https://lucene.apache.org/core/8_4_0/changes/Changes.html

3 December 2019 - Apache Lucene™ 8.3.1 available

The Lucene PMC is pleased to announce the release of Apache Lucene 8.3.1.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 8.3.1 Release Highlights:

  • Bugfix: MultiTermIntervalsSource.visit() was not calling back to its visitor

Please read CHANGES.txt for a full list of changes:

https://lucene.apache.org/core/8_3_1/changes/Changes.html

2 November 2019 - Apache Lucene™ 8.3.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 8.3.0.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 8.3.0 Release Highlights:

  • New SpanishMinimalStemFilter
  • New "export all terms and doc freqs" feature to Luke with delimiters
  • Composite Matches from multiple subqueries now allow access to their submatches, and a new NamedMatches API allows marking of subqueries and a simple way to find which subqueries have matched on a given document
  • Range Query For Multiple Connected Ranges
  • LatLonDocValuesPointInPolygonQuery for LatLonDocValuesField
  • New UniformSplitPostingsFormat (name "UniformSplit") primarily benefiting in simplicity and extensibility
  • New STUniformSplitPostingsFormat (name "SharedTermsUniformSplit") that shares a single internal term dictionary across fields
  • DisjunctionMaxQuery more efficiently leverages impacts to skip non-competitive hits
  • BooleanQuery with no scoring clause can now early terminate the query when the total hits is not requested
  • Matches on wildcard queries will defer building their full disjunction until a MatchesIterator is pulled
  • spatial-extras quad and packed quad prefix trees now index points faster
  • Add additional leaf node level optimizations in LatLonShapeBoundingBoxQuery
  • Improve performance of WITHIN and DISJOINT queries for Shape queries by doing just one pass whenever possible
  • Introduce shared count based early termination across multiple slices
  • Blocktree's seekExact now short-circuits false if the term isn't in the min-max range of the segment. Large perf gain for ID/time like data when populated sequentially
  • Show SPI names instead of class names in Luke Analysis tab
  • GraphTokenStreamFiniteStrings preserves all Token attributes through its finite strings TokenStreams
  • Introduced SpanPositionRange into XML Query Parser
  • Use a sort key instead of true distance in NearestNeighbor
  • Tessellator labels the edges of the generated triangles whether they belong to the original polygon
  • Use exact distance between point and bounding rectangle in FloatPointNearestNeighbor
  • The Korean analyzer now splits tokens on boundaries between digits and alphabetic characters
  • MoreLikeThis is biased for uncommon fields
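
The seekExact short-circuit mentioned in the list above can be illustrated with a minimal sketch. This is not Lucene's Blocktree code: the class RangeCheckedTerms, its fields, and the use of a TreeSet with String ordering as a stand-in for a segment's terms dictionary are all assumptions made for illustration. The idea is only that a lookup for a term outside the segment's [min, max] range can return false without touching the dictionary, which helps when IDs or timestamps are written sequentially and each segment covers a narrow range.

```java
import java.util.Arrays;
import java.util.TreeSet;

/** Hypothetical stand-in for one segment's terms dictionary; not a Lucene class. */
final class RangeCheckedTerms {
  private final TreeSet<String> terms;
  private final String minTerm;
  private final String maxTerm;

  /** Counts lookups that actually had to consult the dictionary. */
  int fullLookups = 0;

  RangeCheckedTerms(String... values) {
    if (values.length == 0) {
      throw new IllegalArgumentException("need at least one term");
    }
    terms = new TreeSet<>(Arrays.asList(values));
    minTerm = terms.first();
    maxTerm = terms.last();
  }

  /** Returns true if the term exists in this segment. */
  boolean seekExact(String term) {
    // Short-circuit: a term outside [minTerm, maxTerm] cannot be present.
    if (term.compareTo(minTerm) < 0 || term.compareTo(maxTerm) > 0) {
      return false;
    }
    // Only in-range terms pay for the real lookup.
    fullLookups++;
    return terms.contains(term);
  }
}
```

With a segment holding "id0100" through "id0150", a lookup for "id0099" or "id0200" returns false without incrementing fullLookups, while "id0120" still needs a full lookup to find out that it is absent.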

Please read CHANGES.txt for a full list of new features and changes:

https://lucene.apache.org/core/8_3_0/changes/Changes.html

26 July 2019 - Apache Lucene™ 8.2.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 8.2.0.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 8.2.0 Release Highlights:

API Changes:

  • Intervals queries have been moved from the sandbox to the queries module.

New Features

  • New XYShape Field and Queries for indexing and querying general cartesian geometries.
  • Snowball stemmer/analyzer for the Estonian language.
  • Provide a FeatureSortField to allow sorting search hits by descending value of a feature.
  • Add new KoreanNumberFilter that can convert Hangul characters to numbers and handle decimal points.
  • Add doc-value support to range fields.
  • Add monitor subproject (previously Luwak monitoring library) that allows a stream of documents to be matched against a set of registered queries in an efficient manner.
  • Add a numeric range query in sandbox that takes advantage of index sorting.

Optimizations

  • Use exponential search instead of binary search in IntArrayDocIdSet#advance method.
  • Use incoming thread for execution if IndexSearcher has an executor. Caller threads now execute at least one search on an index, even if an executor is provided, to minimize thread context switching.
  • New storing strategy for BKD tree leaves with low cardinality that can lower storage costs and can be used at search time to speed up queries.
  • Load frequencies lazily only when needed in BlockDocsEnum and BlockImpactsEverythingEnum.
  • Phrase queries now leverage impacts.
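
To make the first optimization in the list above concrete, here is a minimal sketch of exponential (galloping) search over a sorted int array. It is not the code of IntArrayDocIdSet: the class name ExponentialAdvanceSketch and the method signature are assumptions made for illustration. The point is that when the target is close to the current position, doubling the step finds a small window quickly, and binary search is then confined to that window.

```java
/** Illustrative only: shows the search idea, not Lucene's IntArrayDocIdSet. */
final class ExponentialAdvanceSketch {

  /**
   * Returns the index of the first element in docs[from, length) that is >= target,
   * or length if there is none. docs must be sorted ascending and from <= length.
   * Assumes length is small enough that from + bound cannot overflow an int.
   */
  static int advance(int[] docs, int length, int from, int target) {
    // Gallop: double the step until we reach an element >= target or the end.
    int bound = 1;
    while (from + bound < length && docs[from + bound] < target) {
      bound *= 2;
    }
    // Every index below from + bound/2 holds a value < target, and the element
    // at from + bound (if it exists) is >= target, so search only in between.
    int lo = from + bound / 2;
    int hi = Math.min(from + bound, length);
    while (lo < hi) {
      int mid = (lo + hi) >>> 1;
      if (docs[mid] < target) {
        lo = mid + 1;
      } else {
        hi = mid;
      }
    }
    return lo;
  }

  public static void main(String[] args) {
    int[] docs = {1, 3, 5, 7, 9, 11, 13, 15};
    System.out.println(advance(docs, docs.length, 0, 7));  // index of the element 7
    System.out.println(advance(docs, docs.length, 0, 16)); // no such element: docs.length
  }
}
```

The cost grows with the logarithm of the distance to the target rather than with the logarithm of the whole array, which suits iterators that usually advance by small amounts.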

Please read CHANGES.txt for a full list of new features and changes:

https://lucene.apache.org/core/8_2_0/changes/Changes.html

4 June 2019 - Apache Lucene™ 7.7.2 available

The Lucene PMC is pleased to announce the release of Apache Lucene 7.7.2.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains 9 bugfixes in Lucene. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Please read CHANGES.txt for a full list of changes:

https://lucene.apache.org/core/7_7_2/changes/Changes.html

28 May 2019 - Apache Lucene™ 8.1.1 available

The Lucene PMC is pleased to announce the release of Apache Lucene 8.1.1.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains no change over 8.1.0. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

16 May 2019 - Apache Lucene™ 8.1.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 8.1.0.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 8.1.0 Release Highlights:

  • A query introspection API has been introduced.
  • Luke, the well-known GUI for inspecting Lucene indexes, has been added as a Lucene module
  • Merging dimensional points now uses radix partitioning, which has also been optimized
  • Bugfix: LatLonShapePolygonQuery returns incorrect WITHIN results with shared boundaries
  • TieredMergePolicy#findForcedMerges now tries to create the cheapest merges
  • Build point writers in the BKD tree only when they are needed
  • SynonymQuery can now deboost the document frequency of each term when blending synonym scores
  • ConstantScoreQuery can early terminate if minimum score > constant score (total hits are not requested)
  • DateRangePrefixTree can now parse more precise dates

Please read CHANGES.txt for a full list of new features and changes:

https://lucene.apache.org/core/8_1_0/changes/Changes.html

5 April 2019 - Apache Lucene™ 6.6.6 available

The Lucene PMC is pleased to announce the release of Apache Lucene 6.6.6.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains no change over 6.6.4. The release is available for immediate download at:

https://www.apache.org/dyn/closer.lua/lucene/java/6.6.6

14 March 2019 - Apache Lucene™ 8.0.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 8.0.0.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 8.0.0 Release Highlights:

Query execution

Term queries, phrase queries and boolean queries introduced a new optimization that enables efficient skipping over non-competitive documents when the total hit count is not needed. Depending on the exact query and data distribution, queries might run between a few percent slower and many times faster, especially term queries and pure disjunctions.

In order to support this enhancement, some API changes have been made:

  • TopDocs.totalHits is no longer a long but an object that gives a lower bound of the actual hit count.
  • IndexSearcher's search and searchAfter methods now only compute total hit counts accurately up to 1,000 in order to enable this optimization by default.
  • Queries are now required to produce non-negative scores.
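
The "lower bound" semantics of the hit count can be sketched in plain Java. The class below is a hypothetical illustration, not Lucene's implementation: TotalHitsSketch, its Relation enum and the count method are names invented here, and the sketch simply stops counting at the threshold, whereas Lucene keeps collecting the top hits and only skips documents that cannot be competitive.

```java
import java.util.PrimitiveIterator;
import java.util.stream.IntStream;

/** Hypothetical illustration of a hit count that is exact only up to a threshold. */
final class TotalHitsSketch {
  enum Relation { EQUAL_TO, GREATER_THAN_OR_EQUAL_TO }

  final long value;
  final Relation relation;

  TotalHitsSketch(long value, Relation relation) {
    this.value = value;
    this.relation = relation;
  }

  /** Counts matches exactly up to threshold; beyond that, reports a lower bound. */
  static TotalHitsSketch count(PrimitiveIterator.OfInt matches, int threshold) {
    long seen = 0;
    while (matches.hasNext()) {
      if (seen >= threshold) {
        // More matches remain, but we stop counting: value is now only a lower bound.
        return new TotalHitsSketch(seen, Relation.GREATER_THAN_OR_EQUAL_TO);
      }
      matches.nextInt();
      seen++;
    }
    return new TotalHitsSketch(seen, Relation.EQUAL_TO);
  }

  public static void main(String[] args) {
    TotalHitsSketch few = count(IntStream.range(0, 42).iterator(), 1000);
    TotalHitsSketch many = count(IntStream.range(0, 5000).iterator(), 1000);
    System.out.println(few.value + " " + few.relation);
    System.out.println(many.value + " " + many.relation);
  }
}
```

Callers that previously treated the total as an exact long therefore need to check the relation before displaying or comparing the count.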

Codecs

  • Postings now index score impacts alongside skip data. This is how term queries optimize collection of top hits when hit counts are not needed.
  • Doc values introduced jump tables, so that advancing runs in constant time. This is especially helpful on sparse fields.
  • The terms index FST is now loaded off-heap for non-primary-key fields using MMapDirectory, reducing heap usage for such fields.

Custom scoring

The new FeatureField allows efficient integration of static features such as a pagerank into the score. Furthermore, the new LongPoint#newDistanceFeatureQuery and LatLonPoint#newDistanceFeatureQuery methods allow boosting by recency and geo-distance respectively. These new helpers are optimized for the case when total hit counts are not needed. For instance if the pagerank has a significant weight in your scores, then Lucene might be able to skip over documents that have a low pagerank value.

Further details of changes are available in the change log available at:

https://lucene.apache.org/core/8_0_0/changes/Changes.html

1 March 2019 - Apache Lucene™ 7.7.1 available

The Lucene PMC is pleased to announce the release of Apache Lucene 7.7.1.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains no change over 7.7.0. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

11 February 2019 - Apache Lucene™ 7.7.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 7.7.0.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 7.7.0 Release Highlights:

  • Fix LatLonShape WITHIN queries that fail with multiple search polygons that share the dateline.
  • Fix LatLonShape's within and disjoint queries, which could return false positives with indexed multi-shapes.
  • ExitableDirectoryReader may now time out queries that run on points such as range queries or geo queries.
  • StandardTokenizer and UAX29URLEmailTokenizer now support Unicode 9.0, and provide Unicode UTS#51 v11.0 Emoji tokenization with the "<EMOJI>" token type.
  • TopFieldCollector can now early-terminate queries when sorting by SortField.DOC.
  • Speed up merging segments of points with data dimensions by only sorting on the indexed dimensions.
  • The KoreanTokenizer no longer splits unknown words on combining diacritics and detects script boundaries more accurately with Character#UnicodeScript#of.
  • Change LatLonShape encoding to use 4 bytes per dimension.
  • BufferedUpdates now uses an optimized storage for buffering docvalues updates that can save up to 80% of the heap used compared to the previous implementation and uses non-object based datastructures.
  • Moving to the default accepted overhead ratio for packed ints in DocValuesFieldUpdates yields an up to 4x performance improvement when applying doc values updates.
  • Doc-value updates get applied faster by sorting with quicksort, which needs to perform fewer swaps than the in-place mergesort used previously.
  • Decrease I/O pressure when merging high dimensional points.

Please read CHANGES.txt for a full list of new features and changes:

https://lucene.apache.org/core/7_7_0/changes/Changes.html

14 December 2018 - Apache Lucene™ 7.6.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 7.6.0.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 7.6.0 Release Highlights:

  • Index sorting corruption due to numeric overflow has been fixed. Indices affected by this bug can be detected by running the CheckIndex command on a 7.6+ release distribution.
  • Better tessellation processing of Polygons including graceful exceptions for detecting invalid shapes.
  • Points codec now supports selective indexing: the ability to designate dimensions as "data only" dimensions that do not affect construction of the index.
  • New Simple WKT Shape Parser builds Lucene geometries (polygons, lines, rectangles) from WKT format.
  • New LatLonShapeLineQuery queries indexed shapes with arbitrary lines.
  • analyzeGraphPhrase query builder creates one phrase query per finite string in the graph based on the slop parameter.
  • Performance in PerFieldMergeState#FilterFieldInfos has been improved from O(N) to O(1) lookup time.

Please read CHANGES.txt for a full list of new features and changes:

https://lucene.apache.org/core/7_6_0/changes/Changes.html

24 September 2018 - Apache Lucene™ 7.5.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 7.5.0.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Lucene 7.5.0 Release Highlights:

  • Fixed: IndexWriter#deleteDocs(Query... query) applied deletes to the wrong documents if the index was sorted.
  • TieredMergePolicy now respects maxSegmentSizeMB by default when executing findForcedMerges and findForcedDeletesMerges.
  • A new points-based shape indexing and searching implementation that decomposes shapes into a triangular mesh and indexes each triangle as a 6-dimensional point.
  • A new ByteBuffer based Directory implementation that aims to replace the deprecated RAMDirectory.
  • The UnifiedHighlighter can now use the MatchesIterator API to highlight any query more accurately.
  • TopFieldComparator can now stop comparing documents if the index is sorted, even if hits still need to be visited to compute the hit count.
  • TieredMergePolicy can control how aggressively deletes should be reclaimed with the new deletesPctAllowed setting.

Please read CHANGES.txt for a full list of new features and changes:

https://lucene.apache.org/core/7_5_0/changes/Changes.html

3 July 2018 - Apache Lucene™ 6.6.5 available

The Lucene PMC is pleased to announce the release of Apache Lucene 6.6.5.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains no change over 6.6.4. The release is available for immediate download at:

https://www.apache.org/dyn/closer.lua/lucene/java/6.6.5

27 June 2018 - Apache Lucene™ 7.4.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 7.4.0.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/downloads.html

Please read CHANGES.txt for a full list of changes:

https://lucene.apache.org/core/7_4_0/changes/Changes.html

15 May 2018 - Apache Lucene™ 6.6.4 available

The Lucene PMC is pleased to announce the release of Apache Lucene 6.6.4.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains no change over 6.6.3. The release is available for immediate download at:

https://www.apache.org/dyn/closer.lua/lucene/java/6.6.4

15 May 2018 - Apache Lucene™ 7.3.1 available

The Lucene PMC is pleased to announce the release of Apache Lucene 7.3.1.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains one build change. The release is available for immediate download at:

https://lucene.apache.org/core/mirrors-core-redir.html

Please read CHANGES.txt for a full list of changes:

https://lucene.apache.org/core/7_3_1/changes/Changes.html

4 April 2018 - Apache Lucene™ 7.3.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 7.3.0.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/mirrors-core-redir.html

Please read CHANGES.txt for a full list of changes:

https://lucene.apache.org/core/7_3_0/changes/Changes.html

7 March 2018 - Apache Lucene™ 6.6.3 available

The Lucene PMC is pleased to announce the release of Apache Lucene 6.6.3.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains one build change. The release is available for immediate download at:

https://lucene.apache.org/core/mirrors-core-redir.html

Please read CHANGES.txt for a full list of changes:

https://lucene.apache.org/core/6_6_3/changes/Changes.html

15 January 2018 - Apache Lucene™ 7.2.1 available

The Lucene PMC is pleased to announce the release of Apache Lucene 7.2.1.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains one bug fix. The release is available for immediate download at:

https://lucene.apache.org/core/mirrors-core-latest-redir.html

Please read CHANGES.txt for a full list of new features and changes:

https://lucene.apache.org/core/7_2_1/changes/Changes.html

Lucene 7.2.1 Bug Fix:

  • Fix advanceExact on SortedNumericDocValues produced by Lucene54DocValuesProducer.

21 December 2017 - Apache Lucene™ 7.2.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 7.2.0.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://lucene.apache.org/core/mirrors-core-latest-redir.html

Please read CHANGES.txt for a full list of new features and changes:

https://lucene.apache.org/core/7_2_0/changes/Changes.html

Lucene 7.2.0 Release Highlights:

  • Specific query implementations can now opt out of caching.
  • TopFieldCollector can now early terminate collection of matches when the index is sorted and the total hit count is not requested.
  • IndexWriter#flushNextBuffer gives more fine-grained control over the memory usage of IndexWriter.
  • Fixed document accounting in IndexWriter.
  • Query scores can be exposed in a ValuesSource using DoubleValuesSource.fromQuery().

24 October 2017 - Apache Lucene™ 5.5.5 available

The Lucene PMC is pleased to announce the release of Apache Lucene 5.5.5.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains one bug fix. The release is available for immediate download at:

https://lucene.apache.org/core/mirrors-core-latest-redir.html

Please read CHANGES.txt for a full list of new features and changes:

https://lucene.apache.org/core/5_5_5/changes/Changes.html

This release includes a critical security fix. Details:

  • Disallow resolving of external entities in queryparser/xml/CoreParser by default.

18 October 2017 - Apache Lucene™ 6.6.2 available

The Lucene PMC is pleased to announce the release of Apache Lucene 6.6.2.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://www.apache.org/dyn/closer.lua/lucene/java/6.6.2

Please read CHANGES.txt for a full list of new features and changes:

https://lucene.apache.org/core/6_6_2/changes/Changes.html

This release includes a critical security fix. Details:

  • Disallow resolving of external entities in queryparser/xml/CoreParser by default.

17 October 2017 - Apache Lucene™ 7.1.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 7.1.0.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below.

The release is available for immediate download at:

https://www.apache.org/dyn/closer.lua/lucene/java/7.1.0

Please read CHANGES.txt for a full list of new features and changes:

https://lucene.apache.org/core/7_1_0/changes/Changes.html

Lucene 7.1.0 Release Highlights:

  • New Geo3D shapes for non-spherical planet models

  • Serialization and deserialization support for Geo3D

  • A new CoveringQuery, whose required number of matching clauses can be defined per document

  • New BengaliAnalyzer for Bengali language

  • A point based range field called LatLonBoundingBox

  • FloatPointNearestNeighbor, an N-dimensional FloatPoint K-nearest-neighbor search implementation

  • Faster default taxonomy cache

  • Support for computing facet counts for individual numeric values via LongValueFacetCounts

  • Faster geo-distance queries in case of dense single-valued fields when most documents match

  • Better heuristics in IndexOrDocValuesQuery

  • Optimized builds for OrdinalMap (used by SortedSetDocValuesFacetCounts and others)
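
The matching rule behind the CoveringQuery highlight above can be sketched without Lucene. Everything in this example is hypothetical (the CoveringMatchSketch class, the Doc type, and plain term sets standing in for clauses); it only shows the idea that each document carries its own required number of matching clauses. The sketch assumes required counts of at least 1.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative only: per-document "minimum number of clauses that must match". */
final class CoveringMatchSketch {

  /** A toy document: a set of terms plus its own required match count. */
  static final class Doc {
    final Set<String> terms;
    final int requiredMatches;

    Doc(int requiredMatches, String... terms) {
      this.requiredMatches = requiredMatches;
      this.terms = new HashSet<>(Arrays.asList(terms));
    }
  }

  /** A document matches if enough distinct clauses (here: terms) hit it. */
  static boolean matches(Doc doc, List<String> clauses) {
    long matching = clauses.stream().distinct().filter(doc.terms::contains).count();
    return matching >= doc.requiredMatches;
  }

  public static void main(String[] args) {
    List<String> clauses = Arrays.asList("red", "green", "blue");
    Doc lenient = new Doc(2, "red", "blue", "large");
    Doc strict = new Doc(3, "red", "blue", "large");
    System.out.println(matches(lenient, clauses)); // two clauses match, two are required
    System.out.println(matches(strict, clauses));  // two clauses match, three are required
  }
}
```

Storing the threshold with each document is what distinguishes this from a single query-wide "minimum should match" setting.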

6 October 2017 - Apache Lucene™ 7.0.1 available

The Lucene PMC is pleased to announce the release of Apache Lucene 7.0.1

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains 1 bug fix since the 7.0.0 release:

  • ConjunctionScorer.getChildren was failing to return all child scorers

The release is available for immediate download at:

https://www.apache.org/dyn/closer.lua/lucene/java/7.0.1

Please read CHANGES.txt for a full list of new features and changes:

https://lucene.apache.org/core/7_0_1/changes/Changes.html

20 September 2017 - Apache Lucene™ 7.0.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 7.0.0.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://www.apache.org/dyn/closer.lua/lucene/java/7.0.0

Please read CHANGES.txt for a full list of new features and changes:

https://lucene.apache.org/core/7_0_0/changes/Changes.html

Lucene 7.0.0 Release Highlights:

  • Doc values switched from random access to iterators.

  • The 7.0 codec now sparsely encodes sparse doc values and length normalization factors ("norms"), which also speeds up both indexing and searching on sparse values. With these changes, you finally only pay for what you actually use with doc values, in index size, indexing performance, etc.

  • Index time boost for documents is now removed.

  • Substantial performance gains for delete and update heavy Lucene usage; see http://blog.mikemccandless.com/2017/07/lucene-gets-concurrent-deletes-and.html for details

  • Query scoring is now simpler with the removal of the coord factor and query normalization.

  • Classic query parser no longer splits on whitespace. This enables better multi-word synonym support.

  • The version of Lucene that created the index segment is now recorded, along with the version that last modified the index.

  • IndexWriter, used to add, update and delete documents in your index, will no longer accept broken token offsets sometimes produced by mis-behaving token filters.

  • IndexReader exposes methods that are typically used to manage resources whose lifetime needs to mimic the lifetime of segments/indexes, typically caches. They have been made much less trappy.

  • The dimensional points API now takes a field name up front to offer per-field points access, matching how the doc values APIs work.

  • The PostingsHighlighter was removed. Migrating to the UnifiedHighlighter should be straight-forward.

Apache Lucene was tested to be fully compatible with the release of Java 9 and its module system Jigsaw, coming out tomorrow on September 21st!

7 September 2017 - Apache Lucene™ 6.6.1 available

The Lucene PMC is pleased to announce the release of Apache Lucene 6.6.1

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

The release is available for immediate download at:

https://www.apache.org/dyn/closer.lua/lucene/java/6.6.1

See the CHANGES.txt file included with the release for a full list of changes and further details.

This release contains 2 bug fixes since the 6.6.0 release:

  • Documents with multiple points that should match might not match on a memory index

  • A query which has only one synonym with AND as the default operator would wrongly translate as an AND between the query term and the synonym

6 June 2017 - Apache Lucene™ 6.6.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 6.6.0

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

The release is available for immediate download at:

https://www.apache.org/dyn/closer.lua/lucene/java/6.6.0

See the CHANGES.txt file included with the release for a full list of changes and further details.

Highlights of this Lucene release include:

  • A concurrent SortedSet facets implementation

  • spatial-extras HeatmapFacetCounter will now short-circuit its work when Bits.MatchNoBits is passed

  • OfflineSorter now passes the total number of items it will write to getWriter()

  • Move dictionary for Ukrainian analyzer to external dependency

  • SortedSetDocValuesReaderState now implements Accountable so one can see how much RAM it is using

  • OfflineSorter can now run concurrently if you pass it an optional ExecutorService

  • Sorted set facets now use sparse storage when collecting hits, when appropriate

  • PostingsHighlighter has been deprecated in favour of the UnifiedHighlighter

27 April 2017 - Apache Lucene™ 6.5.1 available

The Lucene PMC is pleased to announce the release of Apache Lucene 6.5.1

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

The release is available for immediate download at:

https://www.apache.org/dyn/closer.lua/lucene/java/6.5.1

See the CHANGES.txt file included with the release for a full list of changes and further details.

This release contains 3 bug fixes since the 6.5.0 release:

  • Fixed join queries to not reference IndexReaders, as it could cause leaks if they are cached.

  • Made LRUQueryCache delegate the scoreSupplier method.

  • Fixed index sorting to work with sparse numeric and binary docvalues field

27 March 2017 - Apache Lucene™ 6.5.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 6.5.0

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://www.apache.org/dyn/closer.lua/lucene/java/6.5.0

See the CHANGES.txt file included with the release for a full list of changes and further details.

Highlights of this Lucene release include:

  • It is now possible to filter out duplicates in the NRT suggester

  • SimpleQueryString now supports default fuzziness

  • IndexWriter can return the list of visible field names

  • DisjunctionScorer now supports returning the matching children clauses

  • A new FunctionScoreQuery that modifies the internal query's score using the per-document values

  • A new FunctionMatchQuery that returns any documents with a value that matches a predicate

  • A new WordDelimiterGraphFilter that outputs a correct graph structure for multi-token expansion at query time

  • A new PatternTokenizer that uses Lucene's RegExp implementation

  • RangeFieldQuery now supports CROSSES relation

  • A new IndexOrDocValuesQuery that uses either an index (points or terms) or doc values in order to run a (range, geo box and distance) query, depending on which one is more efficient

  • index-time boosts are deprecated

  • Term filters are no longer cached

  • Compound filters are cached earlier than regular queries

  • BKDReader now calls grow on larger increments

  • LatLonPointInPolygonQuery is faster

  • LatLonPointDistanceQuery now skips distance computations more often

  • To-parent block joins now implement two-phase iteration

  • Point ranges that match most documents are faster

  • PointValues#estimatePointCount is faster with Relation.CELL_INSIDE_QUERY

  • Segments are now also sorted during flush, and merging on a sorted index is substantially faster by using some of the same bulk merge optimizations that non-sorted merging uses

7 March 2017 - Apache Lucene 6.4.2 and Apache Solr 6.4.2 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 6.4.2

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://www.apache.org/dyn/closer.lua/lucene/java/6.4.2

See the CHANGES.txt file included with the release for a full list of changes and further details.

Highlights of this Lucene release include:

  • Fixed: CommonGramsQueryFilter was producing a disconnected token graph, messing up phrase queries during query parsing

15 February 2017 - Apache Lucene™ 5.5.4 available

The Lucene PMC is pleased to announce the release of Apache Lucene 5.5.4

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://www.apache.org/dyn/closer.lua/lucene/java/5.5.4

See the CHANGES.txt file included with the release for a full list of changes and further details.

Highlights of this Lucene release include:

  • Made stored fields reclaim native memory more aggressively

  • Fixed a potential memory leak with LRUQueryCache and (Span)TermQuery

  • MMapDirectory's unmapping code is now compatible with Java 9 (EA build 150 and later)

6 February 2017 - Apache Lucene™ 6.4.1 available

The Lucene PMC is pleased to announce the release of Apache Lucene 6.4.1

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://www.apache.org/dyn/closer.lua/lucene/java/6.4.1

See the CHANGES.txt file included with the release for a full list of changes and further details.

Highlights of this Lucene release include:

  • Javadocs now build successfully with Java 8u121

  • Fixed memory leak in the case that TermQuery or SpanTermQuery objects that wrap a TermContext were cached

  • Fixed native memory leak when the codec is configured with the BEST_COMPRESSION option

  • AnalyzingInfixSuggester now only opens an IndexWriter when changes need to be applied

23 January 2017 - Apache Lucene™ 6.4.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 6.4.0

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://www.apache.org/dyn/closer.lua/lucene/java/6.4.0

See the CHANGES.txt file included with the release for a full list of changes and further details.

Highlights of this Lucene release include:

  • Lucene's best efforts to un-map memory mapped files with "MMapDirectory" now work with the latest Java 9 early access builds

  • A new similarity "BooleanSimilarity" that gives terms a score that is equal to their query boost

  • The axiomatic family of similarities (6 in total) based on https://www.eecis.udel.edu/~hfang/pubs/sigir05-axiom.pdf

  • A new token filter "SynonymGraphFilter" that outputs a correct graph structure for multi-token synonyms at query time

  • Graph token streams, such as those produced by the "SynonymGraphFilter", are now handled accurately by query parsers

  • A new collector "DocValuesStatsCollector" gives the ability to compute statistics on a DocValues field

  • It is now possible to filter "SortedDocValues" and "SortedSetDocValues" terms enum with a compiled automaton

  • The "UnifiedHighlighter" can now highlight fields with queries that don't necessarily refer to that field

  • DrillSideways can now run queries concurrently

  • Index sorting now supports sorting on multi-valued fields using MIN, MAX, etc. selectors

  • Points do not store the implicit split dimension in the 1-dimension case. This saves memory, ranging from 6% for the largest types such as InetAddressPoint to 33% for the smaller types such as HalfFloatPoint.

  • The BKD in-memory index for dimensional points now uses a compressed format, using substantially less RAM in some cases

  • The BKD writing now buffers each leaf block in heap before writing to disk, giving a small speedup in points-heavy use cases

  • "TermAutomatonQuery" now rewrites to more efficient queries when possible
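
The 6% and 33% figures quoted for the 1-dimension case can be reproduced with a little arithmetic. The sketch below is not Lucene code. It assumes that each inner node of the in-memory points index holds one split value of bytesPerDim bytes, and that the dropped split dimension used to take one further byte; that layout is an assumption, not something stated in this announcement. The class and method names are hypothetical.

```java
// Sketch only: the node layout (split value plus a 1-byte split dimension)
// is an assumption made for this illustration.
final class SplitDimensionSavings {

    /** Fraction of per-node bytes saved by no longer storing the 1-byte split dimension. */
    static double savedFraction(int bytesPerDim) {
        if (bytesPerDim <= 0) {
            throw new IllegalArgumentException("bytesPerDim must be positive");
        }
        return 1.0 / (1 + bytesPerDim);
    }

    public static void main(String[] args) {
        // InetAddressPoint encodes each value in 16 bytes, HalfFloatPoint in 2 bytes.
        System.out.printf("16 bytes per value: %.1f%% saved%n", 100 * savedFraction(16));
        System.out.printf(" 2 bytes per value: %.1f%% saved%n", 100 * savedFraction(2));
    }
}
```

Under that assumption the saving is 1/17 (just under 6%) for 16-byte values and 1/3 (about 33%) for 2-byte values, which is consistent with the range given above.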

Please note: this release cannot be built from source with Java 8 update 121; use an earlier version instead. This is caused by a bug introduced into the Javadoc tool shipped with that update, and the workaround came too late for this Lucene release. You can, of course, still use the binary artifacts.

8 November 2016 - Apache Lucene™ 6.3.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 6.3.0

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at:

https://www.apache.org/dyn/closer.lua/lucene/java/6.3.0

See the CHANGES.txt file included with the release for a full list of changes and further details.

Highlights of this Lucene release include:

  • A brand new "UnifiedHighlighter" derivative of the PostingsHighlighter that can consume offsets from postings, term vectors, or analysis. It can highlight phrases as accurately as the standard Highlighter. Light term vectors can be used with offsets in postings for fast wildcard (MultiTermQuery) highlighting.

  • SimpleQueryParser now parses '*' to MatchAllDocsQuery

  • FuzzyQuery now matches all terms within the specified edit distance, even if they are short terms

  • Points do not store the implicit split dimension in the 1-dimension case. This saves memory, ranging from 6% for the largest types such as InetAddressPoint to 33% for the smaller types such as HalfFloatPoint.

  • Many other changes and bug fixes

20 September 2016 - Apache Lucene™ 6.2.1 available

The Lucene PMC is pleased to announce the release of Apache Lucene 6.2.1

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

Highlights of this Lucene release include:

  • LUCENE-7417: The standard Highlighter could throw an IllegalArgumentException when trying to highlight a query containing a degenerate case of a MultiPhraseQuery with one term.

  • LUCENE-7440: Document id skipping (PostingsEnum.advance) could throw an ArrayIndexOutOfBoundsException on large index segments (>1.8B docs) with large skips.

  • LUCENE-7318: Fix backwards compatibility issues around StandardAnalyzer and its components, introduced with Lucene 6.2.0. The moved classes were restored in their original packages: LowerCaseFilter and StopFilter, as well as several utility classes.

The release is available for immediate download at:

https://www.apache.org/dyn/closer.lua/lucene/java/6.2.1

See the CHANGES.txt file included with the release for a full list of changes and further details.

9 September 2016 - Apache Lucene 5.5.3 available

The Lucene PMC is pleased to announce the release of Apache Lucene 5.5.3

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

The release is available for immediate download at:

https://www.apache.org/dyn/closer.lua/lucene/java/5.5.3

See the CHANGES.txt file included with the release for a full list of changes and further details.

25 August 2016 - Apache Lucene 6.2.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 6.2.0

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at: https://lucene.apache.org/core/mirrors-core-latest-redir.html

Please read CHANGES.txt for a full list of new features and changes: https://lucene.apache.org/core/6_2_0/changes/Changes.html

Highlights of this Lucene release include:

  • The CREATE_NEW flag is passed when creating a file to ensure Lucene is really write-once

  • Index numeric ranges (min and max value in a single field) and search by overlapping range

  • IndexWriter methods return a sequence number indicating effective order of operations across threads

  • UkrainianMorfologikAnalyzer is a new dictionary based analyzer for the Ukrainian language

  • The Polygon class can now be created from a GeoJSON string

  • Compound file creation now verifies checksum of its component files

  • Index time sorting is now a core feature, and supports dimensional points

  • StandardAnalyzer is moved to core and is the default analyzer

  • MatchNoDocsQuery now includes the reason it was created

  • QueryParser can now be told to not pre-split on whitespace

  • MMapDirectory tries harder to prevent SIGSEGV if buggy code tries to execute searches after the index was closed, but it's still best effort

  • MMapDirectory no longer allocates weak references to ease garbage collection

  • Conjunction (MUST, FILTER) queries are faster

  • Dimensional points have much faster (~40%) flush time and use less space in the index
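
The write-once behaviour mentioned in the CREATE_NEW highlight can be illustrated with the JDK alone. The following is a minimal sketch, not Lucene's directory code: it opens a file with StandardOpenOption.CREATE_NEW, so a second attempt to create the same file is refused instead of silently overwriting it. The class, method, and file names are hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class WriteOnceDemo {

    /** Creates the file and writes data; returns false if the file already exists. */
    static boolean createOnce(Path file, byte[] data) {
        try (OutputStream out = Files.newOutputStream(
                file, StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE)) {
            out.write(data);
            return true;
        } catch (FileAlreadyExistsException e) {
            return false; // an earlier writer already produced this file
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes the same file twice in a fresh temp directory and reports both outcomes. */
    static boolean[] demo() {
        try {
            Path dir = Files.createTempDirectory("write-once");
            Path file = dir.resolve("_0.example");
            boolean first = createOnce(file, new byte[] {1, 2, 3});
            boolean second = createOnce(file, new byte[] {4, 5, 6});
            Files.delete(file);
            Files.delete(dir);
            return new boolean[] {first, second};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] outcome = demo();
        System.out.println("first write accepted:  " + outcome[0]);
        System.out.println("second write accepted: " + outcome[1]);
    }
}
```

Without CREATE_NEW, the second call would truncate and rewrite the file, which is exactly the kind of accidental overwrite a write-once design wants to surface as an error.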

25 June 2016 - Apache Lucene 5.5.2 available

The Lucene PMC is pleased to announce the release of Apache Lucene 5.5.2

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains 11 bug fixes since the 5.5.1 release.

The release is available for immediate download at:

https://www.apache.org/dyn/closer.lua/lucene/java/5.5.2

See the CHANGES.txt file included with the release for a full list of changes and further details.

17 June 2016 - Apache Lucene 6.1.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 6.1.0.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at: https://lucene.apache.org/core/mirrors-core-latest-redir.html

Please read CHANGES.txt for a full list of new features and changes: https://lucene.apache.org/core/6_1_0/changes/Changes.html

Lucene 6.1.0 Release Highlights:

New features

  • Numerous improvements to LatLonPoint, for indexing a latitude/longitude point and searching by polygon, distance or box, or finding nearest neighbors

  • Geo3D now has simple APIs for creating common shape queries, matching LatLonPoint

Optimizations

  • Faster indexing and searching of points.

  • Faster geo-spatial indexing and searching for LatLonPoint, Geo3D and GeoPoint (see http://home.apache.org/~mikemccand/geobench.html )

  • HardlinkCopyDirectoryWrapper optimizes file copies using hard links

  • In case of contention, the query cache now prefers returning an uncached Scorer rather than waiting on a lock.
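
The idea behind copying files with hard links can be sketched with plain java.nio.file calls. This is an illustration of the general technique, not the HardlinkCopyDirectoryWrapper implementation: it tries Files.createLink first and falls back to a regular byte copy when the file system refuses. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class HardLinkCopy {

    /** Tries a hard link first, then falls back to a byte copy; returns true if linked. */
    static boolean linkOrCopy(Path source, Path target) {
        try {
            Files.createLink(target, source); // new directory entry for the same content
            return true;
        } catch (UnsupportedOperationException | IOException | SecurityException e) {
            // Hard links are not available here (different file store, no permission, ...).
        }
        try {
            Files.copy(source, target);
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Copies a small temp file and returns the content read back from the target. */
    static String demo(String content) {
        try {
            Path dir = Files.createTempDirectory("hardlink-copy");
            Path source = dir.resolve("source.bin");
            Path target = dir.resolve("target.bin");
            Files.write(source, content.getBytes(StandardCharsets.UTF_8));
            linkOrCopy(source, target);
            String readBack = new String(Files.readAllBytes(target), StandardCharsets.UTF_8);
            Files.delete(target);
            Files.delete(source);
            Files.delete(dir);
            return readBack;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("target content: " + demo("segment data"));
    }
}
```

A hard link avoids moving any bytes, which is safe for files that are never modified after they are written; the fallback keeps the copy working where links are unsupported.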

Bug fixes

  • BooleanQuery could sometimes assign too low scores to ranges of documents that matched a single clause.

  • Doc values updates could sometimes be applied in the wrong order.

28 May 2016 - Apache Lucene 6.0.1 available

The Lucene PMC is pleased to announce the release of Apache Lucene 6.0.1

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains 10 bug fixes since the 6.0.0 release, and one new feature:

  • Spatial-extras DateRangePrefixTree's Calendar is now configurable, to e.g. clear the Gregorian Change Date. Also, toString(cal) is now identical to DateTimeFormatter.ISO_INSTANT.

The release is available for immediate download at:

https://www.apache.org/dyn/closer.lua/lucene/java/6.0.1

See the CHANGES.txt file included with the release for a full list of changes and further details.

5 May 2016 - Apache Lucene 5.5.1 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 5.5.1

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains one bug fix since the 5.5.0 release. The release is available for immediate download at:

https://www.apache.org/dyn/closer.lua/lucene/core/5.5.1

See the CHANGES.txt file included with the release for a full list of changes and further details.

8 April 2016 - Apache Lucene 6.0.0 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 6.0.0.

The release can be downloaded from https://lucene.apache.org/core/mirrors-core-latest-redir.html

Release Highlights:

  • Java 8 is the minimum Java version required.

  • Dimensional points, replacing legacy numeric fields, provide fast and space-efficient support for both single- and multi-dimension range and shape filtering. This includes numeric (int, float, long, double), InetAddress, BigInteger and binary range filtering, as well as geo-spatial shape search over indexed 2D LatLonPoints. See this blog post for details. Dependent classes and modules (e.g., MemoryIndex, Spatial Strategies, Join module) have been refactored to use new point types.

  • Lucene classification module now works on Lucene Documents using a KNearestNeighborClassifier or SimpleNaiveBayesClassifier.

  • The spatial module no longer depends on third-party libraries. Previous spatial classes have been moved to a new spatial-extras module.

  • Spatial4j has been updated to a new 0.6 version hosted by locationtech.

  • TermsQuery performance is boosted by a more aggressive default query caching policy.

  • IndexSearcher's default Similarity is now changed to BM25Similarity.

  • Easier method of defining custom CharTokenizer instances.
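
For readers unfamiliar with BM25, the scoring formula behind the new default Similarity can be written down in a few lines. This is a stand-alone sketch of the textbook formula with the commonly used parameters k1 = 1.2 and b = 0.75; it is not the BM25Similarity source, and it ignores details such as how field lengths are encoded in the index. The class and method names are hypothetical.

```java
// Sketch of the BM25 formula; not Lucene's BM25Similarity implementation.
final class Bm25Sketch {
    static final double K1 = 1.2;  // controls how quickly term frequency saturates
    static final double B = 0.75;  // controls the strength of length normalization

    /** Inverse document frequency: rarer terms receive a higher weight. */
    static double idf(long docFreq, long docCount) {
        return Math.log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    /** BM25 score of one term in one document. */
    static double score(double freq, double fieldLength, double avgFieldLength,
                        long docFreq, long docCount) {
        double norm = K1 * (1 - B + B * fieldLength / avgFieldLength);
        return idf(docFreq, docCount) * (freq * (K1 + 1)) / (freq + norm);
    }

    public static void main(String[] args) {
        // A term found in 10 of 1000 documents, in a field of average length.
        for (int freq : new int[] {1, 2, 5, 50}) {
            System.out.printf("freq=%d score=%.4f%n", freq, score(freq, 100, 100, 10, 1000));
        }
    }
}
```

Unlike a TF-IDF score that keeps growing with the term frequency, the BM25 term-frequency component levels off: repeating a term many times can never push the score past idf * (k1 + 1).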

22 February 2016 - Apache Lucene 5.5.0 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 5.5.0

The release can be downloaded from https://lucene.apache.org/core/mirrors-core-latest-redir.html

Release highlights:

  • JoinUtil.createJoinQuery can now join on numeric doc values fields

  • BlendedInfixSuggester now has an exponential reciprocal scoring model, to more strongly favor suggestions with matches closer to the beginning

  • CustomAnalyzer has improved (compile time) type safety

  • DFISimilarity implements the divergence from independence scoring model

  • Fully wrap any other merge policy using MergePolicyWrapper

  • Sandbox geo point queries have graduated into the spatial module, and now use a more efficient binary term encoding for smaller index size, faster indexing, and decreased search-time heap usage

  • BooleanQuery performs some new query optimizations

  • TermsQuery constructors are more GC efficient
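
To see why an exponential reciprocal model favors early matches more strongly, it helps to compare position coefficients side by side. The sketch below assumes a plain reciprocal coefficient of 1 / (position + 1) and an exponential reciprocal coefficient of 1 / (position + 1)^exponent with an exponent of 2; both formulas and the exponent are assumptions made for illustration, not taken from the BlendedInfixSuggester source. The class and method names are hypothetical.

```java
// Illustration only: the coefficient formulas are assumptions, not Lucene's code.
final class PositionBlend {

    /** Reciprocal coefficient for a match starting at the given token position. */
    static double reciprocal(int position) {
        return 1.0 / (position + 1.0);
    }

    /** Exponential reciprocal coefficient; a larger exponent penalizes late matches harder. */
    static double exponentialReciprocal(int position, double exponent) {
        return 1.0 / Math.pow(position + 1.0, exponent);
    }

    public static void main(String[] args) {
        for (int position = 0; position < 4; position++) {
            System.out.printf("position=%d reciprocal=%.3f exponential=%.3f%n",
                    position, reciprocal(position), exponentialReciprocal(position, 2.0));
        }
    }
}
```

A suggestion's weight would be multiplied by the coefficient, so a match at the very first position keeps its full weight under both models, while a match at the second position keeps half of it under the reciprocal model and only a quarter under the exponential one.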

23 January 2016 - Apache Lucene 5.3.2 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 5.3.2

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains one bug fix since the 5.3.1 release. The release is available for immediate download at:

https://www.apache.org/dyn/closer.lua/lucene/core/5.3.2

See the CHANGES.txt file included with the release for a full list of changes and further details.

23 January 2016 - Apache Lucene 5.4.1 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 5.4.1

The release can be downloaded from https://lucene.apache.org/core/mirrors-core-latest-redir.html

This release contains an important fix for a corruption bug that was introduced in version 5.4.0. If you are on 5.4.0 and using BINARY, SORTED_NUMERIC or SORTED_SET doc values, upgrading to 5.4.1 is strongly recommended.

See the CHANGES.txt file included with the release for a full list of changes and further details.

14 December 2015 - Apache Lucene 5.4.0 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 5.4.0

The release can be downloaded from https://lucene.apache.org/core/mirrors-core-latest-redir.html

Highlights of this Lucene release include:

API Changes

  • Query.getBoost and Query.setBoost are deprecated in favour of the new BoostQuery
  • The Filter class is deprecated in favour of FILTER clauses in a BooleanQuery
  • DefaultSimilarity has been renamed to ClassicSimilarity to prepare for the move to BM25 in Lucene 6

New features

  • New Serbian token filter
  • New DecimalDigitFilter, to fold Unicode digits to Latin digits
  • New UnicodeWhitespaceTokenizer, that uses Unicode's whitespace definition and splits on NBSP
  • New GeoPointDistanceRangeQuery to search for geo-points within a ring
  • Query caching is now enabled by default in IndexSearcher; use IndexSearcher.setQueryCache(null) to disable it

Optimizations

  • MatchAllDocsQuery got faster
  • Doc values now use less memory for multi-valued fields and less disk in case of sparse fields
  • Two-phase iterators got a match cost API so that the costly bits can be checked last
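
The benefit of a match cost can be shown with a small stand-alone sketch: when several confirmation checks must all pass for a candidate document, running the cheapest first means an expensive check is skipped whenever a cheap one already rejects the document. The Check class, the cost values, and the method names below are hypothetical and only mimic the idea; they are not Lucene's TwoPhaseIterator API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.IntPredicate;

final class MatchCostSketch {

    /** A confirmation check for a candidate document, with an estimated cost. */
    static final class Check {
        final String name;
        final float matchCost;
        final IntPredicate matches;

        Check(String name, float matchCost, IntPredicate matches) {
            this.name = name;
            this.matchCost = matchCost;
            this.matches = matches;
        }
    }

    /** Runs checks cheapest-first, stops at the first failure, and records what ran. */
    static boolean matchesAll(List<Check> checks, int doc, List<String> trace) {
        List<Check> ordered = new ArrayList<>(checks);
        ordered.sort(Comparator.comparingDouble((Check c) -> c.matchCost));
        for (Check check : ordered) {
            trace.add(check.name);
            if (!check.matches.test(doc)) {
                return false;
            }
        }
        return true;
    }

    /** Returns the names of the checks that actually ran for an odd document id. */
    static List<String> demoTrace() {
        List<Check> checks = Arrays.asList(
                new Check("expensive", 100f, doc -> true),
                new Check("cheap", 1f, doc -> doc % 2 == 0));
        List<String> trace = new ArrayList<>();
        matchesAll(checks, 7, trace);
        return trace;
    }

    public static void main(String[] args) {
        System.out.println("checks executed: " + demoTrace());
    }
}
```

In the demo the cheap check rejects document 7, so the expensive check never runs even though it was listed first.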

Bug fixes

  • PatternTokenizer no longer hangs onto heap sized to the maximum input string it's ever seen.

24 September 2015 - Apache Lucene 5.3.1 and Apache Solr 5.3.1 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 5.3.1

The release can be downloaded from https://lucene.apache.org/core/mirrors-core-latest-redir.html

Highlights of this Lucene release include:

Bug Fixes

  • Remove classloader hack in MorfologikFilter
  • UsageTrackingQueryCachingPolicy no longer caches trivial queries like MatchAllDocsQuery
  • Fixed BoostingQuery to rewrite wrapped queries

24 August 2015 - Apache Lucene™ 5.3.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 5.3.0

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

The release is available for immediate download at: https://www.apache.org/dyn/closer.lua/lucene/java/5.3.0

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. It can also be downloaded from: https://lucene.apache.org/core/mirrors-core-latest-redir.html

Lucene 5.3.0 Release Highlights:

API Changes

  • PhraseQuery and BooleanQuery are now immutable

New features

  • Added a new org.apache.lucene.search.join.CheckJoinIndex class that can be used to validate that an index has an appropriate structure to run join queries
  • Added a new BlendedTermQuery to blend statistics across several terms
  • New common suggest API that mirrors Lucene's Query/IndexSearcher APIs for document-based suggesters.
  • IndexWriter can now be initialized from an already open near-real-time or non-NRT reader
  • Add experimental range tree doc values format and queries, based on a 1D version of the spatial BKD tree, for a faster and smaller alternative to postings-based numeric and binary term filtering. Range trees can also handle values larger than 64 bits.
  • Added GeoPointField, GeoPointInBBoxQuery, GeoPointInPolygonQuery for simple "indexed lat/lon point in bbox/shape" searching
  • Added experimental BKD geospatial tree doc values format and queries, for fast "bbox/polygon contains lat/lon points"
  • Use doc values to post-filter GeoPointField hits that fall in boundary cells, resulting in smaller index, faster searches and less heap used for each query

Optimizations

  • Reduce RAM usage of FieldInfos, and speed up lookup by number, by using an array instead of TreeMap except in very sparse cases
  • Faster intersection of the terms dictionary with very finite automata, which can be generated, e.g., by simple regexp queries
  • Various bugfixes and optimizations since the 5.2.0 release.

See the CHANGES.txt file included with the release for a full list of changes and further details.

15 June 2015 - Apache Lucene™ 5.2.1 available

The Lucene PMC is pleased to announce the release of Apache Lucene 5.2.1

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains various bug fixes and optimizations since the 5.2.0 release.

The release is available for immediate download at: https://www.apache.org/dyn/closer.lua/lucene/java/5.2.1

Lucene 5.2.1 includes 3 bug fixes:

  • Fix class loading deadlock relating to Codec initialization, default codec and SPI discovery.
  • NRT readers now reflect a new commit even if there is no change to the commit user data
  • Queries now get a dummy Similarity when scores are not needed in order to not load unnecessary information like norms

See the CHANGES.txt file included with the release for a full list of changes and further details.

7 June 2015 - Lucene Core 5.2.0 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 5.2.0

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

The release is available for immediate download at: https://www.apache.org/dyn/closer.lua/lucene/java/5.2.0

Lucene 5.2.0 release highlights:

  • Span queries now share document conjunction/intersection code with boolean queries, and use two-phased iterators for faster intersection by avoiding loading positions in certain cases.

  • Added two-phase support to SpanNotQuery, and SpanPositionCheckQuery and its subclasses: SpanPositionRangeQuery, SpanPayloadCheckQuery, SpanNearPayloadCheckQuery, SpanFirstQuery.

  • Added a new query time join to the join module that uses global ordinals, which is faster for subsequent joins between reopens.

  • New CompositeSpatialStrategy combines speed of RPT with accuracy of SDV. Includes optimized Intersect predicate to avoid many geometry checks. Uses TwoPhaseIterator.

  • New LimitTokenOffsetFilter that limits tokens to those before a configured maximum start offset.

  • New spatial PackedQuadPrefixTree, a generally more efficient choice than QuadPrefixTree, especially for high precision shapes. When used, you should typically disable RPT's pruneLeafyBranches option.

  • Expressions now support bindings keys that look like zero arg functions

  • Add SpanWithinQuery and SpanContainingQuery that return spans inside of / containing other spans.

  • New Spatial "Geo3d" API with partial Spatial4j integration. It is a set of shapes implemented using 3D planar geometry for calculating spatial relations on the surface of a sphere. Shapes include Point, BBox, Circle, Path (buffered line string), and Polygon.

  • Various bugfixes and optimizations since the 5.1.0 release.

See the CHANGES.txt file included with the release for a full list of changes and further details.

14 April 2015 - Lucene Core 5.1.0 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 5.1.0

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

The release is available for immediate download at: https://www.apache.org/dyn/closer.lua/lucene/java/5.1.0

Lucene 5.1.0 includes 9 new features, 10 bug fixes, and 24 optimizations / other changes from 18 unique contributors.

See the CHANGES.txt file included with the release for a full list of changes and further details.

5 March 2015 - Lucene Core 4.10.4 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.10.4

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

The release is available for immediate download at: https://www.apache.org/dyn/closer.lua/lucene/java/4.10.4

Lucene 4.10.4 includes 13 bug fixes.

See the CHANGES.txt file included with the release for a full list of changes and further details.

20 February 2015 - Lucene™ 5.0.0 core available

The Lucene PMC is pleased to announce the release of Apache Lucene 5.0.

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at: https://lucene.apache.org/core/mirrors-core-latest-redir.html

Lucene 5.0 Release Highlights:

Stronger index safety

  • All file access now uses Java’s NIO.2 APIs which give Lucene stronger index safety in terms of better error handling and safer commits.

  • Every Lucene segment now stores a unique id per-segment and per-commit to aid in accurate replication of index files.

  • During merging, IndexWriter now always checks the incoming segments for corruption before merging. This can mean, on upgrading to 5.0.0, that merging may uncover long-standing latent corruption in an older 4.x index.

Reduced heap usage

  • Lucene now supports random-writable and advance-able sparse bitsets (RoaringDocIdSet and SparseFixedBitSet), so the heap required is in proportion to how many bits are set, not how many total documents exist in the index.

  • Heap usage during IndexWriter merging is also much lower with the new Lucene50Codec, since doc values and norms for the segments being merged are no longer fully loaded into heap for all fields; now they are loaded for the one field currently being merged, and then dropped.

  • The default norms format now uses sparse encoding when appropriate, so indices that enable norms for many sparse fields will see a large reduction in required heap at search time.

  • 5.0 has a new API to print a tree structure showing a recursive breakdown of which parts are using how much heap.
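
The claim that heap use follows the number of set bits rather than the number of documents can be demonstrated with a toy structure. The sketch below allocates a 64-bit word only for blocks that contain at least one set bit. It is not how RoaringDocIdSet or SparseFixedBitSet are laid out (a boxed HashMap is itself far from compact); it only illustrates the proportionality. The class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Toy sparse bitset: illustrates the idea, not Lucene's data structures.
final class SparseBitsSketch {
    // Maps a block index (bit / 64) to the 64-bit word holding that block's bits.
    private final Map<Integer, Long> words = new HashMap<>();

    void set(int bit) {
        if (bit < 0) {
            throw new IllegalArgumentException("bit must be non-negative");
        }
        words.merge(bit >>> 6, 1L << (bit & 63), (a, b) -> a | b);
    }

    boolean get(int bit) {
        Long word = words.get(bit >>> 6);
        return word != null && (word & (1L << (bit & 63))) != 0;
    }

    /** Number of 64-bit words actually allocated. */
    int allocatedWords() {
        return words.size();
    }

    /** Number of 64-bit words a dense bitset needs for maxDoc documents. */
    static int denseWords(int maxDoc) {
        return (maxDoc + 63) >>> 6;
    }

    public static void main(String[] args) {
        SparseBitsSketch bits = new SparseBitsSketch();
        bits.set(5);
        bits.set(700_000);
        bits.set(999_999_999);
        System.out.println("sparse words: " + bits.allocatedWords());
        System.out.println("dense words:  " + denseWords(1_000_000_000));
    }
}
```

With three matching documents in an index of one billion, the toy structure holds three words, whereas a dense bitset needs one word for every 64 documents regardless of how many match.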

Other features

  • FieldCache is gone (moved to a dedicated UninvertingReader in the misc module). This means when you intend to sort on a field, you should index that field using doc values, which is much faster and less heap consuming than FieldCache.

  • Tokenizers and Analyzers no longer require Reader on init.

  • NormsFormat now gets its own dedicated NormsConsumer/Producer

  • SortedSetSortField, used to sort on a multi-valued field, is promoted from sandbox to Lucene's core.

  • PostingsFormat now uses a "pull" API when writing postings, just like doc values. This is powerful because you can do things in your postings format that require making more than one pass through the postings such as iterating over all postings for each term to decide which compression format it should use.

  • New DateRangeField type enables indexing and searching of date ranges, particularly multi-valued ones.

  • A new ExitableDirectoryReader extends FilterDirectoryReader and enables exiting requests that take too long to enumerate over terms.

  • Suggesters can now be built from multi-valued fields, because DocumentDictionary now enumerates each value of a multi-valued field separately.

  • ConcurrentMergeScheduler detects whether the index is on SSD or not and does a better job defaulting its settings. This only works on Linux for now; other operating systems will continue to use the previous defaults (tuned for spinning disks).

  • Auto-IO-throttling has been added to ConcurrentMergeScheduler, to rate limit IO writes for each merge depending on incoming merge rate.

  • CustomAnalyzer has been added, which allows configuring analyzers as you do in Solr's index schema. This class has a builder API to configure Tokenizers, TokenFilters, and CharFilters based on their SPI names and parameters as documented by the corresponding factories.

  • Memory index now supports payloads.

  • Added a filter cache with a usage tracking policy that caches filters based on frequency of use.

  • The default codec has an option to control BEST_SPEED or BEST_COMPRESSION for stored fields.

  • Stored fields are merged more efficiently, especially when upgrading from previous versions or using SortingMergePolicy

NOTE: Lucene 5 no longer supports the Lucene 3.x index format. Opening such indexes will result in an IndexFormatTooOldException. It is recommended to either reindex all your data, or upgrade the old indexes with the IndexUpgrader tool of the latest Lucene 4 version (4.10.x). Those indexes can then be read (see next section) with Lucene 5.

To read more about the changes, also see: http://blog.mikemccandless.com/2014/11/apache-lucene-500-is-coming.html

Please read CHANGES.txt and MIGRATE.txt for a full list of new features and notes on upgrading.

29 December 2014 - Lucene Core 4.10.3 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.10.3

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

The release is available for immediate download at: https://lucene.apache.org/core/mirrors-core-latest-redir.html

Lucene 4.10.3 includes 12 bug fixes.

See the CHANGES.txt file included with the release for a full list of changes and further details, and Happy Holidays!

31 October 2014 - Lucene Core 4.10.2 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.10.2

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

The release is available for immediate download at: https://lucene.apache.org/core/mirrors-core-latest-redir.html

Lucene 4.10.2 includes 2 bug fixes.

See the CHANGES.txt file included with the release for a full list of changes and further details, and Happy Halloween!

29 September 2014 - Lucene Core 4.10.1 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.10.1

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

The release is available for immediate download at: https://lucene.apache.org/core/mirrors-core-latest-redir.html

Lucene 4.10.1 includes 7 bug fixes.

See the CHANGES.txt file included with the release for a full list of changes and further details.

22 September 2014 - Lucene Core 4.9.1 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.9.1

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

The release is available for immediate download at: https://lucene.apache.org/core/mirrors-core-latest-redir.html

Lucene 4.9.1 includes 7 bug fixes.

See the CHANGES.txt file included with the release for a full list of changes and further details.

3 September 2014 - Lucene Core 4.10.0 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.10.0

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at: https://lucene.apache.org/core/mirrors-core-latest-redir.html

See the CHANGES.txt file included with the release for a full list of details.

Lucene 4.10.0 Release Highlights:

  • New TermAutomatonQuery using an automaton for proximity queries. http://blog.mikemccandless.com/2014/08/a-new-proximity-query-for-lucene-using.html

  • New OrdsBlockTree terms dictionary supporting ord lookup.

  • Simplified matchVersion handling for Analyzers with new setVersion method, as well as Analyzer constructors not requiring Version.

  • Fixed possible corruption when opening a 3.x index with NRT reader.

  • Fixed edge case in StandardTokenizer that caused extremely slow parsing times with long text which partially matched grammar rules.

25 June 2014 - Lucene Core 4.9.0 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.9.0

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at: https://lucene.apache.org/core/mirrors-core-latest-redir.html

See the CHANGES.txt file included with the release for a full list of details.

Lucene 4.9.0 Release Highlights:

  • New Terms.getMin/Max methods to retrieve the lowest and highest terms per field.

  • New IDVersionPostingsFormat, optimized for ID lookups that associate a monotonically increasing version per ID.

  • Atomic update of a set of doc values fields.

  • Numerous optimizations for doc values search-time performance.

  • New (default) Lucene49NormsFormat to better compress certain cases such as very short fields.

  • New SORTED_NUMERIC docvalues type for efficient processing of multi-valued numeric fields.

  • Indexer passes previous token stream for easier reuse.

  • MoreLikeThis accepts multiple values per field.

  • All classes that estimate their RAM usage now implement a new Accountable interface.

  • Lucene files are now written by (File)OutputStream on all platforms, completely disallowing seeking with simplified IO APIs.

  • Improve the confusing error message when MMapDirectory cannot create a new map.

20 May 2014 - Lucene Core 4.8.1 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.8.1

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

The release is available for immediate download at: https://lucene.apache.org/core/mirrors-core-latest-redir.html

Lucene 4.8.1 includes 15 bug fixes.

See the CHANGES.txt file included with the release for a full list of changes and further details.

28 April 2014 - Apache Lucene 4.8.0 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.8.0

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at: https://lucene.apache.org/core/mirrors-core-latest-redir.html

See the CHANGES.txt file included with the release for a full list of details.

Lucene 4.8.0 Release Highlights:

  • Apache Lucene now requires Java 7 or greater (recommended is Oracle Java 7 or OpenJDK 7, minimum update 55; earlier versions have known JVM bugs affecting Lucene).

  • Apache Lucene is fully compatible with Java 8.

  • All index files now store end-to-end checksums, which are now validated during merging and reading. This ensures that corruptions caused by any bit-flipping hardware problems or bugs in the JVM can be detected earlier. For full detection, be sure to enable all checksums during merging (this is disabled by default).

  • Lucene has a new Rescorer/QueryRescorer API to perform second-pass rescoring or reranking of search results using more expensive scoring functions after first-pass hit collection.

  • AnalyzingInfixSuggester now supports near-real-time autosuggest.

  • Simplified impact-sorted postings (using SortingMergePolicy and EarlyTerminatingCollector) to use Lucene's Sort class to express the sort order.

  • Bulk scoring and normal iterator-based scoring were separated, so some queries can do bulk scoring more effectively.

  • Switched to MurmurHash3 to hash terms during indexing.

  • IndexWriter now supports updating of binary doc value fields.

  • HunspellStemFilter now uses 10 to 100x less RAM. It also loads all known OpenOffice dictionaries without error.

  • Lucene now also fsyncs the directory metadata on commits, if the operating system and file system allow it (Linux and Mac OS X are known to work).

  • Lucene now uses Java 7 file system functions under the hood, so index files can be deleted on Windows, even when readers are still open.

  • A serious bug in NativeFSLockFactory was fixed, which could allow multiple IndexWriters to acquire the same lock. The lock file is no longer deleted from the index directory even when the lock is not held.

  • Various bugfixes and optimizations since the 4.7.2 release.
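
The end-to-end checksum highlight above can be illustrated with a minimal sketch. It assumes a CRC32 value appended as an 8-byte footer; the class ChecksumFooterSketch and its methods are hypothetical and are not Lucene APIs, and the real file layout is defined by Lucene's codecs and is not reproduced here.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

/** Hypothetical helper (not a Lucene class): appends and verifies a CRC32 footer. */
final class ChecksumFooterSketch {

    /** Returns the payload followed by an 8-byte big-endian CRC32 of the payload. */
    static byte[] writeWithFooter(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return ByteBuffer.allocate(payload.length + Long.BYTES)
                .put(payload)
                .putLong(crc.getValue())
                .array();
    }

    /** Recomputes the CRC32 over the body and compares it with the stored footer. */
    static boolean verify(byte[] file) {
        if (file.length < Long.BYTES) {
            return false; // too short to even contain a footer
        }
        int bodyLength = file.length - Long.BYTES;
        CRC32 crc = new CRC32();
        crc.update(file, 0, bodyLength);
        long stored = ByteBuffer.wrap(file, bodyLength, Long.BYTES).getLong();
        return stored == crc.getValue();
    }
}
```

A reader that calls verify before trusting the body would notice a single flipped bit anywhere in the body, which is the kind of hardware-induced corruption the highlight describes.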

15 April 2014 - Lucene Core 4.7.2 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.7.2

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

The release is available for immediate download at: https://lucene.apache.org/core/mirrors-core-latest-redir.html

Lucene 4.7.2 includes 2 bug fixes, including a possible index corruption with near-realtime search.

See the CHANGES.txt file included with the release for a full list of changes and further details.

2 April 2014 - Lucene Core 4.7.1 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.7.1

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

The release is available for immediate download at: https://lucene.apache.org/core/mirrors-core-latest-redir.html

Lucene 4.7.1 includes 14 bug fixes; one build improvement; and one change in runtime behavior: AutomatonQuery.equals is no longer implemented as "accepts same language".

See the CHANGES.txt file included with the release for a full list of changes and further details.

12 March 2014 - Apache Lucene 4.8 will require Java 7

The Apache Lucene committers voted by a large majority to require Java 7 for the next minor release of Apache Lucene (version 4.8)!

The next release will also contain some improvements for Java 7:

  • Better file handling (especially on Windows) in the directory implementations. Files can now be deleted on Windows while the index is still open, as has always been possible in Unix environments (delete-on-last-close semantics).

  • Speed improvements in sorting comparators: Sorting now uses Java 7's own comparators for integer and long sorts, which are highly optimized by the Hotspot VM.
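
As a rough illustration of the comparator point above, the sketch below assumes that "Java 7's own comparators" refers to the static Integer.compare and Long.compare methods added in Java 7. The class Java7CompareSketch and its method names are hypothetical and are not taken from Lucene's sorting code.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical illustration (not Lucene code) of the Java 7 compare methods. */
final class Java7CompareSketch {

    /** Hand-written idiom: the subtraction overflows for operands that are far apart. */
    static int bySubtraction(int a, int b) {
        return a - b;
    }

    /** Java 7 method: returns a correctly signed result for all int pairs. */
    static int byJava7(int a, int b) {
        return Integer.compare(a, b);
    }

    /** Sorts a copy of the input with a comparator that delegates to Long.compare. */
    static Long[] sortLongs(Long[] values) {
        Long[] copy = values.clone();
        Arrays.sort(copy, new Comparator<Long>() {
            @Override
            public int compare(Long x, Long y) {
                return Long.compare(x, y);
            }
        });
        return copy;
    }
}
```

Besides any speed benefit from the VM, the built-in methods avoid the overflow trap of the subtraction idiom: bySubtraction(Integer.MIN_VALUE, 1) yields a positive number even though the first argument is smaller.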

If you want to stay up-to-date with Lucene and Solr, you should upgrade your infrastructure to Java 7. Please be aware that you must use at least Java 7u1. The recommended version at the moment is Java 7u25. Later versions such as 7u40 and 7u45 have a bug causing index corruption. Ideally, use the Java 7u60 prerelease, which has fixed this bug. Once 7u60 is out, it will be the recommended version. In addition, Oracle/BEA JRockit is no longer available for Java 7; use the official Oracle Java 7 instead. JRockit never worked correctly with Lucene/Solr (it caused index corruption), so this should not be an issue. Please also review our list of JVM bugs: http://wiki.apache.org/lucene-java/JavaBugs

EDIT (as of 15 April 2014): The recently released Java 7u55 fixes the above bug causing index corruption. This version is now the recommended version for running Apache Lucene.

26 February 2014 - Lucene Core 4.7 Available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.7

Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.

This release contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at: https://lucene.apache.org/core/mirrors-core-latest-redir.html

See the CHANGES.txt file included with the release for a full list of details.

Lucene 4.7 Release Highlights:

  • When sorting by String (SortField.STRING), you can now specify whether missing values should be sorted first (the default), or last.

  • Add two memory resident dictionaries (FST terms dictionary and FSTOrd terms dictionary) to improve primary key lookups. The PostingsBaseFormat API is also changed so that term dictionaries get the ability to block encode term metadata, and all dictionary implementations can now plug in any PostingsBaseFormat.

  • NRT support for file systems that lack delete-on-last-close semantics or cannot delete files while they are still referenced.

  • Add LongBitSet for managing more than 2.1B bits (otherwise use FixedBitSet).

  • Speed up Lucene range faceting from O(N) per hit to O(log(N)) per hit using segment trees.

  • Add SearcherTaxonomyManager over search and taxonomy index directories (i.e. not only NRT).

  • Drilling down or sideways on a Lucene facet range (using Range.getFilter()) is now faster for costly filters (uses random access, not iteration); range facet counts now accept a fast-match filter to avoid computing the value for documents that are out of bounds, e.g. using a bounding box filter with distance range faceting.

  • Add Analyzer for Kurdish.

  • Add Payload support to FileDictionary (Suggest) and make it more configurable.

  • Add a new BlendedInfixSuggester, which is like AnalyzingInfixSuggester but boosts suggestions that matched tokens with lower positions.

  • Add SimpleQueryParser: parser for human-entered queries.

  • Add support for multi-term queries (wildcards, prefix, etc.) to PostingsHighlighter.

  • Upgrade to Spatial4j 0.4.1: Parses WKT (including ENVELOPE) with extension BUFFER; buffering a point results in a Circle. JTS isn't needed for WKT any more but remains required for Polygons. New Shapes: ShapeCollection and BufferedLineString.

  • Add spatial SerializedDVStrategy that serializes a binary representation of a shape into BinaryDocValues. It supports exact geometry relationship calculations.

  • Various bugfixes and optimizations since the 4.6.1 release.
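
The LongBitSet highlight above mentions a limit of roughly 2.1B bits, which matches the largest value of an int index (Integer.MAX_VALUE). A minimal sketch of the idea, assuming a plain long[] backing array addressed by a long index, could look like this. The class LongIndexedBitSetSketch is hypothetical and does not reproduce the API of Lucene's LongBitSet.

```java
/** Hypothetical sketch (not Lucene's LongBitSet): a bit set addressed by long indices. */
final class LongIndexedBitSetSketch {
    private final long[] words; // 64 bits per array element
    private final long numBits;

    LongIndexedBitSetSketch(long numBits) {
        if (numBits < 0) {
            throw new IllegalArgumentException("numBits must be >= 0");
        }
        this.numBits = numBits;
        // Round up to whole 64-bit words; toIntExact rejects sizes a Java array cannot hold.
        this.words = new long[Math.toIntExact((numBits + 63) >>> 6)];
    }

    void set(long index) {
        checkIndex(index);
        words[(int) (index >>> 6)] |= 1L << (index & 63);
    }

    void clear(long index) {
        checkIndex(index);
        words[(int) (index >>> 6)] &= ~(1L << (index & 63));
    }

    boolean get(long index) {
        checkIndex(index);
        return (words[(int) (index >>> 6)] & (1L << (index & 63))) != 0;
    }

    /** Number of bits currently set. */
    long cardinality() {
        long count = 0;
        for (long word : words) {
            count += Long.bitCount(word);
        }
        return count;
    }

    private void checkIndex(long index) {
        if (index < 0 || index >= numBits) {
            throw new IndexOutOfBoundsException("index=" + index + ", numBits=" + numBits);
        }
    }
}
```

Because each array element holds 64 bits, the word index still fits in an int even when the bit index does not, which is what lets a long-indexed set grow past the int-indexed limit.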