Reduce typo count. #2510

Open · wants to merge 1 commit into base: main
22 changes: 11 additions & 11 deletions CHANGELOG.md
@@ -29,8 +29,8 @@ Tantivy 0.23 will be backwards compatible with indices created with v0.22 and v0
- modify fastfield range query heuristic [#2375](https://github.com/quickwit-oss/tantivy/pull/2375)(@trinity-1686a)
- add FastFieldRangeQuery for explicit range queries on fast field (for `RangeQuery` it is autodetected) [#2477](https://github.com/quickwit-oss/tantivy/pull/2477)(@PSeitz)

- add format backwards-compatibiliy tests [#2485](https://github.com/quickwit-oss/tantivy/pull/2485)(@PSeitz)
- add columnar format compatibiliy tests [#2433](https://github.com/quickwit-oss/tantivy/pull/2433)(@PSeitz)
- add format backwards-compatibility tests [#2485](https://github.com/quickwit-oss/tantivy/pull/2485)(@PSeitz)
- add columnar format compatibility tests [#2433](https://github.com/quickwit-oss/tantivy/pull/2433)(@PSeitz)
- Improved snippet ranges algorithm [#2474](https://github.com/quickwit-oss/tantivy/pull/2474)(@gezihuzi)
- make find_field_with_default return json fields without path [#2476](https://github.com/quickwit-oss/tantivy/pull/2476)(@trinity-1686a)
- feat(query): Make `BooleanQuery` support `minimum_number_should_match` [#2405](https://github.com/quickwit-oss/tantivy/pull/2405)(@LebranceBW)
@@ -74,7 +74,7 @@ Tantivy 0.22 will be able to read indices created with Tantivy 0.21.
- Fix bug that can cause `get_docids_for_value_range` to panic. [#2295](https://github.com/quickwit-oss/tantivy/pull/2295)(@fulmicoton)
- Avoid 1 document indices by increase min memory to 15MB for indexing [#2176](https://github.com/quickwit-oss/tantivy/pull/2176)(@PSeitz)
- Fix merge panic for JSON fields [#2284](https://github.com/quickwit-oss/tantivy/pull/2284)(@PSeitz)
- Fix bug occuring when merging JSON object indexed with positions. [#2253](https://github.com/quickwit-oss/tantivy/pull/2253)(@fulmicoton)
- Fix bug occurring when merging JSON object indexed with positions. [#2253](https://github.com/quickwit-oss/tantivy/pull/2253)(@fulmicoton)
- Fix empty DateHistogram gap bug [#2183](https://github.com/quickwit-oss/tantivy/pull/2183)(@PSeitz)
- Fix range query end check (fields with less than 1 value per doc are affected) [#2226](https://github.com/quickwit-oss/tantivy/pull/2226)(@PSeitz)
- Handle exclusive out of bounds ranges on fastfield range queries [#2174](https://github.com/quickwit-oss/tantivy/pull/2174)(@PSeitz)
@@ -92,7 +92,7 @@ Tantivy 0.22 will be able to read indices created with Tantivy 0.21.
- Support to deserialize f64 from string [#2311](https://github.com/quickwit-oss/tantivy/pull/2311)(@PSeitz)
- Add a top_hits aggregator [#2198](https://github.com/quickwit-oss/tantivy/pull/2198)(@ditsuke)
- Support bool type in term aggregation [#2318](https://github.com/quickwit-oss/tantivy/pull/2318)(@PSeitz)
- Support ip adresses in term aggregation [#2319](https://github.com/quickwit-oss/tantivy/pull/2319)(@PSeitz)
- Support ip addresses in term aggregation [#2319](https://github.com/quickwit-oss/tantivy/pull/2319)(@PSeitz)
- Support date type in term aggregation [#2172](https://github.com/quickwit-oss/tantivy/pull/2172)(@PSeitz)
- Support escaped dot when addressing field [#2250](https://github.com/quickwit-oss/tantivy/pull/2250)(@PSeitz)

@@ -182,7 +182,7 @@ Tantivy 0.20
- Add PhrasePrefixQuery [#1842](https://github.com/quickwit-oss/tantivy/issues/1842) (@trinity-1686a)
- Add `coerce` option for text and numbers types (convert the value instead of returning an error during indexing) [#1904](https://github.com/quickwit-oss/tantivy/issues/1904) (@PSeitz)
- Add regex tokenizer [#1759](https://github.com/quickwit-oss/tantivy/issues/1759)(@mkleen)
- Move tokenizer API to seperate crate. Having a seperate crate with a stable API will allow us to use tokenizers with different tantivy versions. [#1767](https://github.com/quickwit-oss/tantivy/issues/1767) (@PSeitz)
- Move tokenizer API to separate crate. Having a separate crate with a stable API will allow us to use tokenizers with different tantivy versions. [#1767](https://github.com/quickwit-oss/tantivy/issues/1767) (@PSeitz)
- **Columnar crate**: New fast field handling (@fulmicoton @PSeitz) [#1806](https://github.com/quickwit-oss/tantivy/issues/1806)[#1809](https://github.com/quickwit-oss/tantivy/issues/1809)
- Support for fast fields with optional values. Previously tantivy supported only single-valued and multi-value fast fields. The encoding of optional fast fields is now very compact.
- Fast field Support for JSON (schemaless fast fields). Support multiple types on the same column. [#1876](https://github.com/quickwit-oss/tantivy/issues/1876) (@fulmicoton)
@@ -229,13 +229,13 @@ Tantivy 0.20
- Auto downgrade index record option, instead of vint error [#1857](https://github.com/quickwit-oss/tantivy/issues/1857) (@PSeitz)
- Enable range query on fast field for u64 compatible types [#1762](https://github.com/quickwit-oss/tantivy/issues/1762) (@PSeitz) [#1876]
- sstable
- Isolating sstable and stacker in independant crates. [#1718](https://github.com/quickwit-oss/tantivy/issues/1718) (@fulmicoton)
- Isolating sstable and stacker in independent crates. [#1718](https://github.com/quickwit-oss/tantivy/issues/1718) (@fulmicoton)
- New sstable format [#1943](https://github.com/quickwit-oss/tantivy/issues/1943)[#1953](https://github.com/quickwit-oss/tantivy/issues/1953) (@trinity-1686a)
- Use DeltaReader directly to implement Dictionnary::ord_to_term [#1928](https://github.com/quickwit-oss/tantivy/issues/1928) (@trinity-1686a)
- Use DeltaReader directly to implement Dictionnary::term_ord [#1925](https://github.com/quickwit-oss/tantivy/issues/1925) (@trinity-1686a)
- Add seperate tokenizer manager for fast fields [#2019](https://github.com/quickwit-oss/tantivy/issues/2019) (@PSeitz)
- Use DeltaReader directly to implement Dictionary::ord_to_term [#1928](https://github.com/quickwit-oss/tantivy/issues/1928) (@trinity-1686a)
- Use DeltaReader directly to implement Dictionary::term_ord [#1925](https://github.com/quickwit-oss/tantivy/issues/1925) (@trinity-1686a)
- Add separate tokenizer manager for fast fields [#2019](https://github.com/quickwit-oss/tantivy/issues/2019) (@PSeitz)
- Make construction of LevenshteinAutomatonBuilder for FuzzyTermQuery instances lazy. [#1756](https://github.com/quickwit-oss/tantivy/issues/1756) (@adamreichold)
- Added support for madvise when opening an mmaped Index [#2036](https://github.com/quickwit-oss/tantivy/issues/2036) (@fulmicoton)
- Added support for madvise when opening an mmapped Index [#2036](https://github.com/quickwit-oss/tantivy/issues/2036) (@fulmicoton)
- Rename `DatePrecision` to `DateTimePrecision` [#2051](https://github.com/quickwit-oss/tantivy/issues/2051) (@guilload)
- Query Parser
- Quotation mark can now be used for phrase queries. [#2050](https://github.com/quickwit-oss/tantivy/issues/2050) (@fulmicoton)
@@ -274,7 +274,7 @@ Tantivy 0.19
- Add support for phrase slop in query language [#1393](https://github.com/quickwit-oss/tantivy/pull/1393) (@saroh)
- Aggregation
- Add aggregation support for date type [#1693](https://github.com/quickwit-oss/tantivy/pull/1693)(@PSeitz)
- Add support for keyed parameter in range and histgram aggregations [#1424](https://github.com/quickwit-oss/tantivy/pull/1424) (@k-yomo)
- Add support for keyed parameter in range and histogram aggregations [#1424](https://github.com/quickwit-oss/tantivy/pull/1424) (@k-yomo)
- Add aggregation bucket limit [#1363](https://github.com/quickwit-oss/tantivy/pull/1363) (@PSeitz)
- Faster indexing
- [#1610](https://github.com/quickwit-oss/tantivy/pull/1610) (@PSeitz)
2 changes: 1 addition & 1 deletion TODO.txt
@@ -1,7 +1,7 @@
Make schema_builder API fluent.
fix doc serialization and prevent compression problems

u64 , etc. shoudl return Resutl<Option> now that we support optional missing a column is really not an error
u64 , etc. should return Result<Option> now that we support optional missing a column is really not an error
remove fastfield codecs
ditch the first_or_default trick. if it is still useful, improve its implementation.
rename FastFieldReaders::open to load
2 changes: 1 addition & 1 deletion columnar/README.md
@@ -31,7 +31,7 @@ restriction on 50% of the values (e.g. a 64-bit hash). On the other hand, a lot
# Columnar format

This columnar format may have more than one column (with different types) associated to the same `column_name` (see [Coercion rules](#coercion-rules) above).
The `(column_name, columne_type)` couple however uniquely identifies a column.
The `(column_name, column_type)` couple however uniquely identifies a column.
That couple is serialized as a column `column_key`. The format of that key is:
`[column_name][ZERO_BYTE][column_type_header: u8]`
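
For illustration, a minimal sketch of how a key with this layout could be built (not part of this diff; the helper name is hypothetical and the type header byte is just an assumed example value):

```rust
/// Builds a column key with the layout described above:
/// the column name bytes, a ZERO_BYTE separator, then the
/// one-byte column type header.
fn column_key(column_name: &str, column_type_header: u8) -> Vec<u8> {
    let mut key = Vec::with_capacity(column_name.len() + 2);
    key.extend_from_slice(column_name.as_bytes());
    key.push(0u8); // ZERO_BYTE separator
    key.push(column_type_header);
    key
}

fn main() {
    // Assumed type header value, purely for illustration.
    let key = column_key("price", 1u8);
    assert_eq!(key, [b'p', b'r', b'i', b'c', b'e', 0, 1]);
}
```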

6 changes: 3 additions & 3 deletions columnar/src/TODO.md
@@ -10,7 +10,7 @@

# Perf and Size
* remove alloc in `ord_to_term`
+ multivaued range queries restrat frm the beginning all of the time.
+ multivaued range queries restart from the beginning all of the time.
* re-add ZSTD compression for dictionaries
no systematic monotonic mapping
consider removing multilinear
@@ -30,7 +30,7 @@ investigate if should have better errors? io::Error is overused at the moment.
rename rank/select in unit tests
Review the public API via cargo doc
go through TODOs
remove all doc_id occurences -> row_id
remove all doc_id occurrences -> row_id
use the rank & select naming in unit tests branch.
multi-linear -> blockwise
linear codec -> simply a multiplication for the index column
@@ -43,5 +43,5 @@ isolate u128_based and uniform naming
# Other
fix enhance column-cli

# Santa claus
# Santa Claus
autodetect datetime ipaddr, plug customizable tokenizer.
4 changes: 2 additions & 2 deletions columnar/src/column_index/merge/mod.rs
@@ -173,7 +173,7 @@ mod tests {
.into();
let merged_column_index = merge_column_index(&column_indexes[..], &merge_row_order);
let SerializableColumnIndex::Multivalued(start_index_iterable) = merged_column_index else {
panic!("Excpected a multivalued index")
panic!("Expected a multivalued index")
};
let mut output = Vec::new();
serialize_multivalued_index(&start_index_iterable, &mut output).unwrap();
@@ -211,7 +211,7 @@

let merged_column_index = merge_column_index(&column_indexes[..], &merge_row_order);
let SerializableColumnIndex::Multivalued(start_index_iterable) = merged_column_index else {
panic!("Excpected a multivalued index")
panic!("Expected a multivalued index")
};
let mut output = Vec::new();
serialize_multivalued_index(&start_index_iterable, &mut output).unwrap();
2 changes: 1 addition & 1 deletion columnar/src/column_index/mod.rs
@@ -28,7 +28,7 @@ pub enum ColumnIndex {
Full,
Optional(OptionalIndex),
/// In addition, at index num_rows, an extra value is added
/// containing the overal number of values.
/// containing the overall number of values.
Multivalued(MultiValueIndex),
}
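
For intuition, here is a minimal sketch (illustrative only, not part of this diff) of how a multivalued index laid out this way is read: one start offset per row plus that trailing total, so the values of row `i` sit between offsets `i` and `i + 1`. The names below are assumptions, not tantivy's API:

```rust
// Hypothetical helper: the values of `row` live at
// start_offsets[row]..start_offsets[row + 1].
fn values_for_row(start_offsets: &[u32], row: usize) -> std::ops::Range<u32> {
    start_offsets[row]..start_offsets[row + 1]
}

fn main() {
    // 3 rows, 5 values overall; the final entry is the overall number of values.
    let start_offsets = [0u32, 2, 2, 5];
    assert_eq!(values_for_row(&start_offsets, 0), 0..2);
    assert_eq!(values_for_row(&start_offsets, 1), 2..2); // row 1 has no value
    assert_eq!(values_for_row(&start_offsets, 2), 2..5);
}
```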

@@ -184,7 +184,7 @@ impl CompactSpaceBuilder {

let mut covered_space = Vec::with_capacity(self.blanks.len());

// begining of the blanks
// beginning of the blanks
if let Some(first_blank_start) = self.blanks.first().map(RangeInclusive::start) {
if *first_blank_start != 0 {
covered_space.push(0..=first_blank_start - 1);
2 changes: 1 addition & 1 deletion columnar/src/column_values/u64_based/line.rs
@@ -122,7 +122,7 @@ impl Line {
line
}

/// Returns a line that attemps to approximate a function
/// Returns a line that attempts to approximate a function
/// f: i in 0..[ys.num_vals()) -> ys[i].
///
/// - The approximation is always lower than the actual value. Or more rigorously, formally
6 changes: 3 additions & 3 deletions columnar/src/columnar/merge/mod.rs
@@ -25,7 +25,7 @@ use crate::{
/// After merge, all columns belonging to the same category are coerced to
/// the same column type.
///
/// In practise, today, only Numerical colummns are coerced into one type today.
/// In practise, today, only Numerical columns are coerced into one type today.
///
/// See also [README.md].
///
@@ -63,8 +63,8 @@ impl From<ColumnType> for ColumnTypeCategory {
/// `require_columns` makes it possible to ensure that some columns will be present in the
/// resulting columnar. When a required column is a numerical column type, one of two things can
/// happen:
/// - If the required column type is compatible with all of the input columnar, the resulsting
/// merged columnar will simply coerce the input column and use the required column type.
/// - If the required column type is compatible with all of the input columnar, the resulting merged
/// columnar will simply coerce the input column and use the required column type.
/// - If the required column type is incompatible with one of the input columnar, the merged will
/// fail with an InvalidData error.
///
2 changes: 1 addition & 1 deletion columnar/src/columnar/writer/column_operation.rs
@@ -87,7 +87,7 @@ impl<V: SymbolValue> ColumnOperation<V> {
minibuf
}

/// Deserialize a colummn operation.
/// Deserialize a column operation.
/// Returns None if the buffer is empty.
///
/// Panics if the payload is invalid:
2 changes: 1 addition & 1 deletion examples/iterating_docs_and_positions.rs
@@ -28,7 +28,7 @@ fn main() -> tantivy::Result<()> {
let mut index_writer: IndexWriter = index.writer_with_num_threads(1, 50_000_000)?;
index_writer.add_document(doc!(title => "The Old Man and the Sea"))?;
index_writer.add_document(doc!(title => "Of Mice and Men"))?;
index_writer.add_document(doc!(title => "The modern Promotheus"))?;
index_writer.add_document(doc!(title => "The modern Prometheus"))?;
index_writer.commit()?;

let reader = index.reader()?;
12 changes: 6 additions & 6 deletions query-grammar/src/query_grammar.rs
@@ -833,7 +833,7 @@ fn aggregate_infallible_expressions(
if early_operand {
err.push(LenientErrorInternal {
pos: 0,
message: "Found unexpeted boolean operator before term".to_string(),
message: "Found unexpected boolean operator before term".to_string(),
});
}

@@ -856,7 +856,7 @@
_ => Some(Occur::Should),
};
if occur == &Some(Occur::MustNot) && default_op == Some(Occur::Should) {
// if occur is MustNot *and* operation is OR, we synthetize a ShouldNot
// if occur is MustNot *and* operation is OR, we synthesize a ShouldNot
clauses.push(vec![(
Some(Occur::Should),
ast.clone().unary(Occur::MustNot),
@@ -872,7 +872,7 @@
None => None,
};
if occur == &Some(Occur::MustNot) && default_op == Some(Occur::Should) {
// if occur is MustNot *and* operation is OR, we synthetize a ShouldNot
// if occur is MustNot *and* operation is OR, we synthesize a ShouldNot
clauses.push(vec![(
Some(Occur::Should),
ast.clone().unary(Occur::MustNot),
@@ -897,7 +897,7 @@
}
Some(BinaryOperand::Or) => {
if last_occur == Some(Occur::MustNot) {
// if occur is MustNot *and* operation is OR, we synthetize a ShouldNot
// if occur is MustNot *and* operation is OR, we synthesize a ShouldNot
clauses.push(vec![(Some(Occur::Should), last_ast.unary(Occur::MustNot))]);
} else {
clauses.push(vec![(last_occur.or(Some(Occur::Should)), last_ast)]);
@@ -1057,7 +1057,7 @@ mod test {
valid_parse("1", 1.0, "");
valid_parse("0.234234 aaa", 0.234234f64, " aaa");
error_parse(".3332");
// TODO trinity-1686a: I disagree that it should fail, I think it should succeeed,
// TODO trinity-1686a: I disagree that it should fail, I think it should succeed,
// consuming only "1", and leave "." for the next thing (which will likely fail then)
// error_parse("1.");
error_parse("-1.");
@@ -1467,7 +1467,7 @@ mod test {
}

#[test]
fn test_parse_query_to_triming_spaces() {
fn test_parse_query_to_trimming_spaces() {
test_parse_query_to_ast_helper(" abc", "abc");
test_parse_query_to_ast_helper("abc ", "abc");
test_parse_query_to_ast_helper("( a OR abc)", "(?a ?abc)");
2 changes: 1 addition & 1 deletion query-grammar/src/user_input_ast.rs
@@ -267,7 +267,7 @@ impl fmt::Debug for UserInputAst {
match *self {
UserInputAst::Clause(ref subqueries) => {
if subqueries.is_empty() {
// TODO this will break ast reserialization, is writing "( )" enought?
// TODO this will break ast reserialization, is writing "( )" enough?
write!(formatter, "<emptyclause>")?;
} else {
write!(formatter, "(")?;
2 changes: 1 addition & 1 deletion src/aggregation/agg_tests.rs
@@ -870,7 +870,7 @@ fn test_aggregation_on_json_object_mixed_types() {
.add_document(doc!(json => json!({"mixed_type": "blue", "mixed_price": 5.0})))
.unwrap();
index_writer.commit().unwrap();
// => Segment with all boolen
// => Segment with all boolean
index_writer
.add_document(doc!(json => json!({"mixed_type": true, "mixed_price": "no_price"})))
.unwrap();
10 changes: 5 additions & 5 deletions src/aggregation/bucket/term_agg.rs
@@ -25,7 +25,7 @@ use crate::aggregation::{format_date, Key};
use crate::error::DataCorruption;
use crate::TantivyError;

/// Creates a bucket for every unique term and counts the number of occurences.
/// Creates a bucket for every unique term and counts the number of occurrences.
/// Note that doc_count in the response buckets equals term count here.
///
/// If the text is untokenized and single value, that means one term per document and therefore it
@@ -158,7 +158,7 @@ pub struct TermsAggregation {
/// when loading the text.
/// Special Case 1:
/// If we have multiple columns on one field, we need to have a union on the indices on both
/// columns, to find docids without a value. That requires a special missing aggreggation.
/// columns, to find docids without a value. That requires a special missing aggregation.
/// Special Case 2: if the key is of type text and the column is numerical, we also need to use
/// the special missing aggregation, since there is no mechanism in the numerical column to
/// add text.
@@ -364,7 +364,7 @@ impl SegmentTermCollector {
let term_buckets = TermBuckets::default();

if let Some(custom_order) = req.order.as_ref() {
// Validate sub aggregtion exists
// Validate sub aggregation exists
if let OrderTarget::SubAggregation(sub_agg_name) = &custom_order.target {
let (agg_name, _agg_property) = get_agg_name_and_property(sub_agg_name);

@@ -1685,7 +1685,7 @@ mod tests {
res["my_texts"]["buckets"][2]["key"],
serde_json::Value::Null
);
// text field with numner as missing fallback
// text field with number as missing fallback
assert_eq!(res["my_texts2"]["buckets"][0]["key"], "Hello Hello");
assert_eq!(res["my_texts2"]["buckets"][0]["doc_count"], 5);
assert_eq!(res["my_texts2"]["buckets"][1]["key"], 1337.0);
@@ -1859,7 +1859,7 @@ mod tests {
res["my_texts"]["buckets"][2]["key"],
serde_json::Value::Null
);
// text field with numner as missing fallback
// text field with number as missing fallback
assert_eq!(res["my_texts2"]["buckets"][0]["key"], "Hello Hello");
assert_eq!(res["my_texts2"]["buckets"][0]["doc_count"], 4);
assert_eq!(res["my_texts2"]["buckets"][1]["key"], 1337.0);