Update lucene to version 8.11.2 #16

Merged Nov 14, 2024 (27 commits; changes shown from 26 commits)
Commits
8258128
Compiles
JJK96 Oct 30, 2023
41a8b6d
Uncleaned version that supports regex searching
JJK96 Nov 13, 2023
fbeaac7
For regex queries search in full non-canonical text, while for other …
JJK96 Jan 8, 2024
982ce80
Add switch for regex search type
JJK96 Jan 8, 2024
4239e9c
Make Regex search case insensitive
JJK96 Feb 19, 2024
4c92c9c
Fix Thai analyzer
JJK96 Feb 19, 2024
a06ecda
Fix Hebrew analyser
JJK96 Feb 19, 2024
c784ccc
Fix Arabic
JJK96 Mar 18, 2024
7c43cca
Fix Persian
JJK96 Mar 18, 2024
d7616bc
Remove local.properties
JJK96 Mar 18, 2024
02fa61f
Fix analyzer references
JJK96 Mar 18, 2024
54c73b6
Fix tests
JJK96 Mar 18, 2024
a4f26c2
Add local.properties to gitignore
JJK96 Mar 18, 2024
c3933c7
Add smartcn analyzer
JJK96 Mar 18, 2024
d26a312
Fix Chinese and Japanese
JJK96 Jul 8, 2024
f00f512
Fix French stemmer test
JJK96 Jul 8, 2024
f355696
Fix all tests
JJK96 Jul 8, 2024
d830c48
Removed AbstractBookAnalyzer
JJK96 Aug 19, 2024
25ceeb8
All tests compiling, but not completely working yet
JJK96 Aug 19, 2024
c588094
Update test, stemming has been implemented now
JJK96 Aug 20, 2024
069667a
Make stopwording optional but disabled by default
JJK96 Aug 20, 2024
dd6c939
Make code cleaner
JJK96 Oct 14, 2024
4aaf655
Restructured
JJK96 Oct 14, 2024
89d6f45
Remove print
JJK96 Oct 14, 2024
9e6da6d
Apply range query to regex queries as well. Fixes bug where regex que…
JJK96 Oct 14, 2024
c2a7da0
Invalidate old Lucene indices
JJK96 Oct 14, 2024
b2989d4
Code review fixes
tuomas2 Nov 14, 2024
2 changes: 2 additions & 0 deletions .gitignore
@@ -15,3 +15,5 @@ rebel.xml
/.gradle/
/build/
atlassian-ide-plugin.xml
.DS_Store
local.properties
12 changes: 6 additions & 6 deletions build.gradle.kts
@@ -16,7 +16,7 @@ tasks.withType<Test>() {
}

group = "org.crosswire"
version = "2.3"
version = "2.4"

repositories {
mavenCentral()
@@ -25,13 +25,13 @@ repositories {
dependencies {
// implementation("org.jetbrains.kotlin:kotlin-stdlib")
implementation("org.apache.commons:commons-compress:1.12")
implementation("com.chenlb.mmseg4j:mmseg4j-analysis:1.8.6")
implementation("com.chenlb.mmseg4j:mmseg4j-dic:1.8.6")

implementation("org.jdom:jdom2:2.0.6.1")
implementation("org.apache.lucene:lucene-analyzers:3.6.2")
// To upgrade Lucene, change to
// implementation("org.apache.lucene:lucene-analyzers-common:x")
implementation("org.apache.lucene:lucene-analyzers-common:8.11.2")
implementation("org.apache.lucene:lucene-analyzers-smartcn:8.11.2")
implementation("org.apache.lucene:lucene-analyzers-kuromoji:8.11.2")

implementation("org.apache.lucene:lucene-queryparser:8.11.2")

//implementation("org.slf4j:slf4j-api:1.7.6")
implementation("org.slf4j:slf4j-api:1.7.6")
18 changes: 18 additions & 0 deletions notes.md
@@ -0,0 +1,18 @@
Functionality AbstractBookAnalyzer

Review comment from the PR author: this file should be deleted before merging, right?

Add Book to analyzer
Is this necessary
It is used to automatically get the right language from the book
Might be done a level higher, when selecting the right analyser
Set DoStemming
Stemming is done by default in language-specific analyzers
Set DoStopWords
Initializing with empty stopwords set should disable stopwording

Why is createComponents necessary?
I think to support optional stemming and stopwords
Both jSword itself and and-bible nowhere use these options, except when initializing them to the defaults

# TODO

Fix deleted tests
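
The note that an empty stop-word set should disable stop-wording can be illustrated without Lucene. The following is a minimal plain-Java sketch, not the PR's code: `StopWordSketch` and `filter` are hypothetical names, and the tokenization (lowercase, split on whitespace) is a simplifying assumption standing in for a real analyzer chain.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for an analyzer's stop-word step; not a Lucene class.
final class StopWordSketch {
    private StopWordSketch() {
    }

    // Lowercases the text, splits it on whitespace, and drops every token
    // that is contained in stopWords. An empty set therefore drops nothing.
    static List<String> filter(String text, Set<String> stopWords) {
        List<String> kept = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (!token.isEmpty() && !stopWords.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }
}
```

Calling `filter` with an empty set returns every token, while a set containing "in" and "the" keeps only the remaining words. This matches the commit "Make stopwording optional but disabled by default" only in spirit; how the PR wires the option into its analyzers is not shown here.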
2 changes: 1 addition & 1 deletion src/main/java/org/crosswire/common/util/CWClassLoader.java
@@ -65,7 +65,7 @@ public final class CWClassLoader extends ClassLoader {
* @return the CrossWire Class Loader
*/
public static CWClassLoader instance(Class<?> resourceOwner) {
return AccessController.doPrivileged(new PrivilegedLoader<CWClassLoader>(resourceOwner));
return new CWClassLoader(resourceOwner);
}

/**
Original file line number Diff line number Diff line change
@@ -36,6 +36,7 @@
import org.crosswire.jsword.book.sword.Backend;
import org.crosswire.jsword.book.sword.processing.NoOpRawTextProcessor;
import org.crosswire.jsword.book.sword.processing.RawTextToXmlProcessor;
import org.crosswire.jsword.index.IndexManagerFactory;
import org.crosswire.jsword.index.IndexStatus;
import org.crosswire.jsword.index.IndexStatusEvent;
import org.crosswire.jsword.index.IndexStatusListener;
@@ -186,6 +187,9 @@ public boolean match(String name) {
* @see org.crosswire.jsword.book.Book#getIndexStatus()
*/
public IndexStatus getIndexStatus() {
if (IndexManagerFactory.getIndexManager().needsReindexing(this)) {
return IndexStatus.INVALID;
}
return bmd.getIndexStatus();
}
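
The guard above reports an existing index as invalid whenever the index manager says it needs reindexing. The body of `needsReindexing` is not part of this diff, so the sketch below rests on an assumption: that the check compares the version recorded for the installed index with the latest index version, which is what the bump from 1.2 to 1.3 in `IndexMetadata` and `InstalledIndex` would rely on. The names `ReindexCheckSketch`, `Status`, and `statusFor` are hypothetical.

```java
// Hypothetical model of the check; the real needsReindexing is not shown in the diff.
final class ReindexCheckSketch {
    // Stand-in for IndexStatus, reduced to the two values this sketch needs.
    enum Status { DONE, INVALID }

    private ReindexCheckSketch() {
    }

    // Assumption: an index is stale when it was built with an older index version.
    static boolean needsReindexing(float installedVersion, float latestVersion) {
        return installedVersion < latestVersion;
    }

    // Mirrors the guard in getIndexStatus(): a stale index overrides the stored status.
    static Status statusFor(float installedVersion, float latestVersion, Status stored) {
        return needsReindexing(installedVersion, latestVersion) ? Status.INVALID : stored;
    }
}
```

Under this assumption, an index built at version 1.2 is reported as `INVALID` once the latest version is 1.3, and an index already at 1.3 keeps its stored status.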

1 change: 1 addition & 0 deletions src/main/java/org/crosswire/jsword/index/Index.java
@@ -44,6 +44,7 @@ public interface Index {
* @throws BookException
*/
Key find(String query) throws BookException;
Key find(String query, boolean full_text) throws BookException;
Review comment from the PR author, suggested change:
- Key find(String query, boolean full_text) throws BookException;
+ Key find(String query, boolean fullText) throws BookException;

/**
* An index must be able to create KeyLists for users in a similar way to
37 changes: 10 additions & 27 deletions src/main/java/org/crosswire/jsword/index/lucene/IndexMetadata.java
@@ -20,6 +20,7 @@
package org.crosswire.jsword.index.lucene;

import java.io.IOException;
import java.util.Objects;

import org.crosswire.common.util.PropertyMap;
import org.crosswire.common.util.ResourceUtil;
@@ -41,17 +42,20 @@
public final class IndexMetadata {

/** latest version on top */
public static final float INDEX_VERSION_1_2 = 1.2f;
public static final float INDEX_VERSION_1_3 = 1.3f;

/**
* A prior version.
*
* @deprecated do not use
*/
@Deprecated
public static final float INDEX_VERSION_1_2 = 1.2f;

@Deprecated
public static final float INDEX_VERSION_1_1 = 1.1f;

public static final String LATEST_INDEX_VERSION = "Latest.Index.Version";
public static final String LUCENE_VERSION = "Lucene.Version";

public static final String PREFIX_LATEST_INDEX_VERSION_BOOK_OVERRIDE = "Latest.Index.Version.Book.";
/**
@@ -69,38 +73,20 @@ public static IndexMetadata instance() {
return myInstance;
}

/**
* default Installed IndexVersion
*
* @return the index version
* @deprecated see InstalledIndex.java
*/
@Deprecated
public float getInstalledIndexVersion() {
String value = props.get(INDEX_VERSION, "1.1"); // todo At some point
// default should be 1.2
return Float.parseFloat(value);
}

// Default Latest IndexVersion : Default version number of Latest indexing
// schema: PerBook index version must be equal or greater than this
public float getLatestIndexVersion() {
String value = props.get(LATEST_INDEX_VERSION, "1.2");
return Float.parseFloat(value);
}
public String getLatestIndexVersionStr() {
String value = props.get(LATEST_INDEX_VERSION, "1.2");
return value;
String value = props.get(LATEST_INDEX_VERSION);
return (value == null) ? InstalledIndex.DEFAULT_INSTALLED_INDEX_VERSION : Float.parseFloat(value);
}

public float getLatestIndexVersion(Book b) {
if (b == null) {
return getLatestIndexVersion();
}

String value = props.get(PREFIX_LATEST_INDEX_VERSION_BOOK_OVERRIDE + IndexMetadata.getBookIdentifierPropSuffix(b.getBookMetaData()),
props.get(LATEST_INDEX_VERSION));
return Float.parseFloat(value);
String value = props.get(PREFIX_LATEST_INDEX_VERSION_BOOK_OVERRIDE + IndexMetadata.getBookIdentifierPropSuffix(b.getBookMetaData()));
return (value == null) ? getLatestIndexVersion() : Float.parseFloat(value);
}

// used in property keys e.g. Installed.Index.Version.Book.ESV[1.0.1]
@@ -112,9 +98,6 @@ public static String getBookIdentifierPropSuffix(BookMetaData meta) {
return meta.getInitials() + moduleVer;
}

public float getLuceneVersion() {
return Float.parseFloat(props.get(LUCENE_VERSION));
}
private IndexMetadata() {
try {
props = ResourceUtil.getProperties(getClass());
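
The rewritten getters replace hard-coded string defaults with explicit null checks, so a lookup falls through three levels: a per-book override, then the global `Latest.Index.Version` property, then `InstalledIndex.DEFAULT_INSTALLED_INDEX_VERSION`. A minimal sketch of that chain follows. It uses a plain `Map` in place of `PropertyMap` and a string suffix in place of `Book`; the class name `IndexVersionLookupSketch` and the sample property values are illustrative assumptions.

```java
import java.util.Map;

// Hypothetical, simplified model of the version lookup in IndexMetadata.
final class IndexVersionLookupSketch {
    // Value of the default after this PR (INDEX_VERSION_1_3).
    static final float DEFAULT_INSTALLED_INDEX_VERSION = 1.3f;
    static final String LATEST_INDEX_VERSION = "Latest.Index.Version";
    static final String PREFIX_BOOK_OVERRIDE = "Latest.Index.Version.Book.";

    private final Map<String, String> props;

    IndexVersionLookupSketch(Map<String, String> props) {
        this.props = props;
    }

    // Global latest version, or the compiled-in default when the property is absent.
    float getLatestIndexVersion() {
        String value = props.get(LATEST_INDEX_VERSION);
        return (value == null) ? DEFAULT_INSTALLED_INDEX_VERSION : Float.parseFloat(value);
    }

    // Per-book override, falling back to the global lookup when none is set.
    float getLatestIndexVersion(String bookSuffix) {
        if (bookSuffix == null) {
            return getLatestIndexVersion();
        }
        String value = props.get(PREFIX_BOOK_OVERRIDE + bookSuffix);
        return (value == null) ? getLatestIndexVersion() : Float.parseFloat(value);
    }
}
```

With an empty map, a lookup for a suffix such as `ESV[1.0.1]` yields 1.3. Setting `Latest.Index.Version` changes the result for all books, and setting `Latest.Index.Version.Book.ESV[1.0.1]` changes it for that book only.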
Original file line number Diff line number Diff line change
@@ -49,7 +49,7 @@ public final class InstalledIndex {
public static final String PREFIX_INSTALLED_INDEX_VERSION_BOOK_OVERRIDE = "Installed.Index.Version.Book.";
// TODO(Sijo): change this value on lucene upgrade
/** The Index version for new indexes */
public static final float DEFAULT_INSTALLED_INDEX_VERSION = IndexMetadata.INDEX_VERSION_1_2;
public static final float DEFAULT_INSTALLED_INDEX_VERSION = IndexMetadata.INDEX_VERSION_1_3;

/**
* All access through this single instance.