Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stack overflow calling EqvalenceProperties::project with certain order #12700

Closed
alamb opened this issue Oct 1, 2024 · 16 comments · Fixed by #12759
Closed

Stack overflow calling EqvalenceProperties::project with certain order #12700

alamb opened this issue Oct 1, 2024 · 16 comments · Fixed by #12759
Assignees
Labels
bug Something isn't working

Comments

@alamb
Copy link
Contributor

alamb commented Oct 1, 2024

Describe the bug

Given equivalence sort orders like this:

[
  [a,b,c,d],
  [a,c,b,d], // note b,d are swapped
]

When EqvalenceProperties::project is called to project a, b, c, d the stack overflows

To Reproduce

Reproducer:

    #[test]
    fn project_equivalence_properties_test_multi() ->   Result<()> {
    // test multiple input orderings with equivalence properties
        let input_schema = Arc::new(Schema::new(vec![
            Field::new("a", DataType::Int64, true),
            Field::new("b", DataType::Int64, true),
            Field::new("c", DataType::Int64, true),
            Field::new("d", DataType::Int64, true),
        ]));

        let mut input_properties = EquivalenceProperties::new(Arc::clone(&input_schema));
        // add equivalent ordering [a, b, c, d]
        input_properties.add_new_ordering(vec![
            parse_sort_expr("a", &input_schema),
            parse_sort_expr("b", &input_schema),
            parse_sort_expr("c", &input_schema),
            parse_sort_expr("d", &input_schema),
        ]);

        // add equivalent ordering [a, c, b, d]
        input_properties.add_new_ordering(vec![
            parse_sort_expr("a", &input_schema),
            parse_sort_expr("c", &input_schema),
            parse_sort_expr("b", &input_schema), // NB b and c are swapped
            parse_sort_expr("d", &input_schema),
        ]);

        let proj_exprs = vec![
            (col("a", &input_schema)?, "a".to_string()),
            (col("b", &input_schema)?, "b".to_string()),
            (col("c", &input_schema)?, "c".to_string()),
            (col("d", &input_schema)?, "d".to_string()),
        ];
        let projection_mapping = ProjectionMapping::try_new(&proj_exprs, &input_schema)?;
        let out_properties = input_properties.project(&projection_mapping, input_schema);
        
        assert_eq!(out_properties.to_string(), "order: [[a@0 ASC,b@1 ASC,c@2 ASC]]");
        Ok(())
    }
Full Diff

andrewlamb@Andrews-MacBook-Pro-2:~/Software/datafusion2$ git diff
diff --git a/datafusion/physical-expr/src/equivalence/properties.rs b/datafusion/physical-expr/src/equivalence/properties.rs
index dc59a1eb8..dc3929188 100644
--- a/datafusion/physical-expr/src/equivalence/properties.rs
+++ b/datafusion/physical-expr/src/equivalence/properties.rs
@@ -1755,6 +1761,46 @@ mod tests {
         Ok(())
     }

+    #[test]
+    fn project_equivalence_properties_test_multi() ->   Result<()> {
+    // test multiple input orderings with equivalence properties
+        let input_schema = Arc::new(Schema::new(vec![
+            Field::new("a", DataType::Int64, true),
+            Field::new("b", DataType::Int64, true),
+            Field::new("c", DataType::Int64, true),
+            Field::new("d", DataType::Int64, true),
+        ]));
+
+        let mut input_properties = EquivalenceProperties::new(Arc::clone(&input_schema));
+        // add equivalent ordering [a, b, c, d]
+        input_properties.add_new_ordering(vec![
+            parse_sort_expr("a", &input_schema),
+            parse_sort_expr("b", &input_schema),
+            parse_sort_expr("c", &input_schema),
+            parse_sort_expr("d", &input_schema),
+        ]);
+
+        // add equivalent ordering [a, c, b, d]
+        input_properties.add_new_ordering(vec![
+            parse_sort_expr("a", &input_schema),
+            parse_sort_expr("c", &input_schema),
+            parse_sort_expr("b", &input_schema), // NB b and c are swapped
+            parse_sort_expr("d", &input_schema),
+        ]);
+
+        let proj_exprs = vec![
+            (col("a", &input_schema)?, "a".to_string()),
+            (col("b", &input_schema)?, "b".to_string()),
+            (col("c", &input_schema)?, "c".to_string()),
+            (col("d", &input_schema)?, "d".to_string()),
+        ];
+        let projection_mapping = ProjectionMapping::try_new(&proj_exprs, &input_schema)?;
+        let out_properties = input_properties.project(&projection_mapping, input_schema);
+
+        assert_eq!(out_properties.to_string(), "order: [[a@0 ASC,b@1 ASC,c@2 ASC]]");
+        Ok(())
+    }
+
     #[test]
     fn test_join_equivalence_properties() -> Result<()> {
         let schema = create_test_schema()?;
@@ -3108,4 +3154,36 @@ mod tests {
             assert!(lhs_orderings.contains(rhs_ordering), "{}", err_msg);
         }
     }
+
+    /// Converts a string to a physical sort expression
+    ///
+    /// # Example
+    /// * `"a"` -> (`"a"`, `SortOptions::default()`)
+    /// * `"a ASC"` -> (`"a"`, `SortOptions { descending: false, nulls_first: false }`)
+    fn parse_sort_expr(name: &str, schema: &SchemaRef) -> PhysicalSortExpr {
+        let mut parts = name.split_whitespace();
+        let name = parts.next().expect("empty sort expression");
+        let mut sort_expr = PhysicalSortExpr::new(
+            col(name, schema).expect("invalid column name"),
+            SortOptions::default(),
+        );
+
+        if let Some(options) = parts.next() {
+            sort_expr = match options {
+                "ASC" => sort_expr.asc(),
+                "DESC" => sort_expr.desc(),
+                _ => panic!(
+                    "unknown sort options. Expected 'ASC' or 'DESC', got {}",
+                    options
+                ),
+            }
+        }
+
+        assert!(
+            parts.next().is_none(),
+            "unexpected tokens in column name. Expected 'name' / 'name ASC' / 'name DESC' but got  '{name}'"
+        );
+
+        sort_expr
+    }

Then run

cargo test --package datafusion-physical-expr --lib equivalence::properties::tests::project_equivalence_properties_test_multi -- --exact
   Compiling datafusion-physical-expr v42.0.0 (/Users/andrewlamb/Software/datafusion2/datafusion/physical-expr)
    Finished `test` profile [unoptimized + debuginfo] target(s) in 1.65s
     Running unittests src/lib.rs (target/debug/deps/datafusion_physical_expr-a6fc7a49851ed9b5)

running 1 test

thread 'equivalence::properties::tests::project_equivalence_properties_test_multi' has overflowed its stack
fatal runtime error: stack overflow
error: test failed, to rerun pass `-p datafusion-physical-expr --lib`

Caused by:
  process didn't exit successfully: `/Users/andrewlamb/Software/datafusion2/target/debug/deps/datafusion_physical_expr-a6fc7a49851ed9b5 'equivalence::properties::tests::project_equivalence_properties_test_multi' --exact` (signal: 6, SIGABRT: process abort signal)

Expected behavior

Test should pass (expected output will need to be updated)

Additional context

I found this while working on #12446 / #12562

@alamb alamb added the bug Something isn't working label Oct 1, 2024
@alamb
Copy link
Contributor Author

alamb commented Oct 1, 2024

Here is the stack trace:

<std::hash::random::RandomState as core::hash::BuildHasher>::build_hasher random.rs:85
indexmap::map::IndexMap<K,V,S>::hash map.rs:673
indexmap::map::IndexMap<K,V,S>::get_index_of map.rs:744
indexmap::map::IndexMap<K,V,S>::get map.rs:696
<indexmap::map::IndexMap<K,V,S> as core::ops::index::Index<&Q>>::index map.rs:1328
datafusion_physical_expr::equivalence::properties::construct_orderings properties.rs:1467
datafusion_physical_expr::equivalence::properties::construct_orderings::{{closure}} properties.rs:1477
core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &mut F>::call_once function.rs:305
[Inlined] core::option::Option<T>::map option.rs:1110
.... <many more calls to construct_orderings>
datafusion_physical_expr::equivalence::properties::construct_orderings properties.rs:1474
datafusion_physical_expr::equivalence::properties::construct_orderings::{{closure}} properties.rs:1477
core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &mut F>::call_once function.rs:305
[Inlined] core::option::Option<T>::map option.rs:1110
<core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::next map.rs:108
<core::iter::adapters::fuse::Fuse<I> as core::iter::adapters::fuse::FuseImpl<I>>::next fuse.rs:359
[Inlined] <core::iter::adapters::fuse::Fuse<I> as core::iter::traits::iterator::Iterator>::next fuse.rs:50
<core::iter::adapters::flatten::FlattenCompat<I,U> as core::iter::traits::iterator::Iterator>::next flatten.rs:610
<core::iter::adapters::flatten::FlatMap<I,U,F> as core::iter::traits::iterator::Iterator>::next flatten.rs:66
<alloc::vec::Vec<T> as alloc::vec::spec_from_iter_nested::SpecFromIterNested<T,I>>::from_iter spec_from_iter_nested.rs:26
<alloc::vec::Vec<T> as alloc::vec::spec_from_iter::SpecFromIter<T,I>>::from_iter spec_from_iter.rs:33
<alloc::vec::Vec<T> as core::iter::traits::collect::FromIterator<T>>::from_iter mod.rs:2977
core::iter::traits::iterator::Iterator::collect iterator.rs:2005
datafusion_physical_expr::equivalence::properties::construct_orderings properties.rs:1474
datafusion_physical_expr::equivalence::properties::construct_prefix_orderings::{{closure}} properties.rs:1304
core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &mut F>::call_once function.rs:305
[Inlined] core::option::Option<T>::map option.rs:1110
<core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::next map.rs:108
<core::iter::adapters::fuse::Fuse<I> as core::iter::adapters::fuse::FuseImpl<I>>::next fuse.rs:359
[Inlined] <core::iter::adapters::fuse::Fuse<I> as core::iter::traits::iterator::Iterator>::next fuse.rs:50
<core::iter::adapters::flatten::FlattenCompat<I,U> as core::iter::traits::iterator::Iterator>::next flatten.rs:610
<core::iter::adapters::flatten::FlatMap<I,U,F> as core::iter::traits::iterator::Iterator>::next flatten.rs:66
alloc::vec::Vec<T,A>::extend_desugared mod.rs:3081
<alloc::vec::Vec<T,A> as alloc::vec::spec_extend::SpecExtend<T,I>>::spec_extend spec_extend.rs:17
<alloc::vec::Vec<T> as alloc::vec::spec_from_iter_nested::SpecFromIterNested<T,I>>::from_iter spec_from_iter_nested.rs:43
<alloc::vec::Vec<T> as alloc::vec::spec_from_iter::SpecFromIter<T,I>>::from_iter spec_from_iter.rs:33
<alloc::vec::Vec<T> as core::iter::traits::collect::FromIterator<T>>::from_iter mod.rs:2977
core::iter::traits::iterator::Iterator::collect iterator.rs:2005
datafusion_physical_expr::equivalence::properties::construct_prefix_orderings properties.rs:1301
datafusion_physical_expr::equivalence::properties::generate_dependency_orderings::{{closure}} properties.rs:1333
core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &mut F>::call_once function.rs:305
[Inlined] core::option::Option<T>::map option.rs:1110
<core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::next map.rs:108
<core::iter::adapters::fuse::Fuse<I> as core::iter::adapters::fuse::FuseImpl<I>>::next fuse.rs:359
[Inlined] <core::iter::adapters::fuse::Fuse<I> as core::iter::traits::iterator::Iterator>::next fuse.rs:50
<core::iter::adapters::flatten::FlattenCompat<I,U> as core::iter::traits::iterator::Iterator>::next flatten.rs:939
<core::iter::adapters::flatten::FlatMap<I,U,F> as core::iter::traits::iterator::Iterator>::next flatten.rs:66
<alloc::vec::Vec<T> as alloc::vec::spec_from_iter_nested::SpecFromIterNested<T,I>>::from_iter spec_from_iter_nested.rs:26
<alloc::vec::Vec<T> as alloc::vec::spec_from_iter::SpecFromIter<T,I>>::from_iter spec_from_iter.rs:33
<alloc::vec::Vec<T> as core::iter::traits::collect::FromIterator<T>>::from_iter mod.rs:2977
core::iter::traits::iterator::Iterator::collect iterator.rs:2005
datafusion_physical_expr::equivalence::properties::generate_dependency_orderings properties.rs:1330
datafusion_physical_expr::equivalence::properties::EquivalenceProperties::projected_orderings::{{closure}}::{{closure}} properties.rs:816
core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &mut F>::call_once function.rs:305
[Inlined] core::option::Option<T>::map option.rs:1110
<core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::next map.rs:108
<core::iter::adapters::fuse::Fuse<I> as core::iter::adapters::fuse::FuseImpl<I>>::next fuse.rs:359
[Inlined] <core::iter::adapters::fuse::Fuse<I> as core::iter::traits::iterator::Iterator>::next fuse.rs:50
<core::iter::adapters::flatten::FlattenCompat<I,U> as core::iter::traits::iterator::Iterator>::next flatten.rs:610
<core::iter::adapters::flatten::FlatMap<I,U,F> as core::iter::traits::iterator::Iterator>::next flatten.rs:66
core::ops::function::FnOnce::call_once function.rs:250
core::iter::adapters::flatten::and_then_or_clear flatten.rs:840
<core::iter::adapters::flatten::FlattenCompat<I,U> as core::iter::traits::iterator::Iterator>::next flatten.rs:607
<core::iter::adapters::flatten::FlatMap<I,U,F> as core::iter::traits::iterator::Iterator>::next flatten.rs:66
core::ops::function::FnOnce::call_once function.rs:250
core::iter::adapters::chain::and_then_or_clear chain.rs:333
<core::iter::adapters::chain::Chain<A,B> as core::iter::traits::iterator::Iterator>::next chain.rs:84
<core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::next map.rs:108
alloc::vec::Vec<T,A>::extend_desugared mod.rs:3081
<alloc::vec::Vec<T,A> as alloc::vec::spec_extend::SpecExtend<T,I>>::spec_extend spec_extend.rs:17
<alloc::vec::Vec<T> as alloc::vec::spec_from_iter_nested::SpecFromIterNested<T,I>>::from_iter spec_from_iter_nested.rs:43
<alloc::vec::Vec<T> as alloc::vec::spec_from_iter::SpecFromIter<T,I>>::from_iter spec_from_iter.rs:33
<alloc::vec::Vec<T> as core::iter::traits::collect::FromIterator<T>>::from_iter mod.rs:2977
core::iter::traits::iterator::Iterator::collect iterator.rs:2005
datafusion_physical_expr::equivalence::properties::EquivalenceProperties::projected_orderings properties.rs:847
datafusion_physical_expr::equivalence::properties::EquivalenceProperties::project properties.rs:900
datafusion_physical_expr::equivalence::properties::tests::project_equivalence_properties_test_multi properties.rs:1798
datafusion_physical_expr::equivalence::properties::tests::project_equivalence_properties_test_multi::{{closure}} properties.rs:1765
core::ops::function::FnOnce::call_once function.rs:250
[Inlined] core::ops::function::FnOnce::call_once function.rs:250
test::__rust_begin_short_backtrace lib.rs:624
[Inlined] test::run_test_in_process::{{closure}} lib.rs:647
[Inlined] <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once unwind_safe.rs:272
[Inlined] std::panicking::try::do_call panicking.rs:557
[Inlined] std::panicking::try panicking.rs:521
[Inlined] std::panic::catch_unwind panic.rs:350
[Inlined] test::run_test_in_process lib.rs:647
test::run_test::{{closure}} lib.rs:568
[Inlined] test::run_test::{{closure}} lib.rs:598
std::sys::backtrace::__rust_begin_short_backtrace backtrace.rs:152
[Inlined] std::thread::Builder::spawn_unchecked_::{{closure}}::{{closure}} mod.rs:538
[Inlined] <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once unwind_safe.rs:272
[Inlined] std::panicking::try::do_call panicking.rs:557
[Inlined] std::panicking::try panicking.rs:521
[Inlined] std::panic::catch_unwind panic.rs:350
[Inlined] std::thread::Builder::spawn_unchecked_::{{closure}} mod.rs:537
core::ops::function::FnOnce::call_once{{vtable.shim}} function.rs:250
[Inlined] <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once boxed.rs:2070
[Inlined] <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once boxed.rs:2070
std::sys::pal::unix::thread::Thread::new::thread_start thread.rs:108
_pthread_start 0x000000018f1b1f94

@alamb
Copy link
Contributor Author

alamb commented Oct 1, 2024

@mustafasrepo @ozankabak or @berkaysynnada does this code look familiar to you? Any hints on where to begin?

I looked into it briefly this morning and it appears the dependencies computed are cyclic

@ozankabak
Copy link
Contributor

Hmm, will take a look. cc @akurmustafa

@berkaysynnada
Copy link
Contributor

I looked into it briefly this morning and it appears the dependencies computed are cyclic (

You are right - construct_orderings() seems to be the source of the problem. Unfortunately, I am not able to dive into it today.

@ozankabak
Copy link
Contributor

If no one beats us to it, we will get to this after finishing some firefighting

@alamb
Copy link
Contributor Author

alamb commented Oct 2, 2024

Thanks -- I may also find time to work on it, but will post here before doing so

@akurmustafa
Copy link
Contributor

I couldn't look this in detail. However, posting my thoughts, in case that helps. Given orderings are [a,b,c,d]. [a,c,b,d]. One of the assumptions about the oeq_class is that, ordering representations inside it doesn't contain any redundant information. As an example, if ordering given is [a,b,a,c] then it would be converted to the [a,b,c] internally. As another example, if orderings given are [a,b,c,d] and [a,b]. oeq_class will only store [a,b,c,d] internally.

Similary, for the orderings [a,b,c,d] and [a,c,b,d]. Their most condensed representation would be [a,b,d] and [a,c,d]. If we fail to represent ordering with this condensed version, subsequent stages might behave unexpectedly. This might be the root cause of the problem. However, as again I didn't look into this problem in detail. I don't want to misdirect anyone, just posting my immature thoughts.

@alamb
Copy link
Contributor Author

alamb commented Oct 3, 2024

I am starting to check this out

@alamb
Copy link
Contributor Author

alamb commented Oct 4, 2024

Similary, for the orderings [a,b,c,d] and [a,c,b,d]. Their most condensed representation would be [a,b,d] and [a,c,d]. If we fail to represent ordering with this condensed version, subsequent stages might behave unexpectedly.

I am still trying to convince myself that this is the correct transformation for the most condensed ordering 🤔 I will keep thinking about it.

I poked around trying to find a function that could collapse the ordering [a,b,c,d] and [a,c,b,d] --> [a,b,d] and [a,c,d] but could not find one. OrderingEquivalenceClass::remove_redundant_entries does not appear to do this collapsing.

I will also try and make a PR to add some comments about the "no redundant information" invariant

@berkaysynnada
Copy link
Contributor

berkaysynnada commented Oct 4, 2024

Does this simplification make sense?

  1. [a,b,c,d] & [a,c,b,d] => [a,b,d] & [a,c,d]
  2. [b,c,d] & [c,b,d] => [b,d] & [c,d]
  3. [b,c] & [c,b] => [b] & [c]
  4. Add the removed terms => [a,b,c,d] & [a,c,b,d] => [a,b,d] & [a,c,d]

@alamb
Copy link
Contributor Author

alamb commented Oct 4, 2024

Here is a PR that fixes the issue #12759 -- however, given the discussions above there may well be a deeper issue we need to solve too

@berkaysynnada
Copy link
Contributor

I will investigate further into the case where:

        // add equivalent ordering [a, b, c, d]
        input_properties.add_new_ordering(vec![
            parse_sort_expr("a", &input_schema),
            parse_sort_expr("b", &input_schema),
            parse_sort_expr("c", &input_schema),
            parse_sort_expr("d", &input_schema),
        ]);

        // add equivalent ordering [a, c, b, d]
        input_properties.add_new_ordering(vec![
            parse_sort_expr("a", &input_schema),
            parse_sort_expr("c", &input_schema),
            parse_sort_expr("b", &input_schema), // NB b and c are swapped
            parse_sort_expr("d", &input_schema),
        ]);

as @akurmustafa mentioned, I have some questions about what the final state of ordering equivalence is and what it should be, FYI

@alamb
Copy link
Contributor Author

alamb commented Oct 8, 2024

as @akurmustafa mentioned, I have some questions about what the final state of ordering equivalence is and what it should be, FYI

Thank you

@berkaysynnada
Copy link
Contributor

I have worked through this comment in detail to determine whether there is an actual bug in the current implementation. Initially, my thoughts were similar to @akurmustafa's intuition, and I draft a few designs. However, I can now safely say that the current implementation is correct, and the tests are valid for both the add_new_orderings() and satisfy() APIs. Below are my brief findings:

The example that triggered the initial bug was [a, b, c, d] and [a, c, b, d]. We thought it could be simplified to [a, b, d] and [a, c, d], but this is a mistaken assumption.

Why?

Consider this table:

a b c d
1 1 1 1
1 1 2 2
1 2 3 3
1 3 4 1

This table is ordered by both [a, b, c, d] and [a, c, b, d]; also [a, b, d] and [a, c, d] are also valid orderings. However, when we add these orderings to the equivalence properties, can we make them stored as [a, b, d] and [a, c, d]?

The answer is no.

Consider this table:

a b c d
1 1 1 1
1 1 2 2
1 2 3 3
1 3 3 1

This table also satisfies both [a, b, c, d] and [a, c, b, d], but we cannot simplify them to [a, b, d] and [a, c, d]. In this case, [a, c, d] is not valid.

If someone needs to infer an ordering corresponding to the first table, it should be given as [a, b, d] and [a, c, d] because we can derive both [a, b, c, d] and [a, c, b, d] from these. However, the reverse is not true.

In short:

  • [a, c] + [b, c] satisfies both [a, b, c] and [b, a, c]. (Global prefixes can be inserted anywhere in other orderings, and we can safely remove the duplicate element closer to the right-hand side.)
  • [a, b, c] + [b, a, c] does not satisfy [a, c] or [b, c]. (It actually satisfies one of them, but since we cannot know which one, we say it does not satisfy either.) However, if we know that a is unique, we can ensure that [a, c] is valid. Similarly, if b is unique, then [b, c] is valid. This is a task for another day.

@alamb
Copy link
Contributor Author

alamb commented Oct 14, 2024

However, I can now safely say that the current implementation is correct, and the tests are valid for both the add_new_orderings() and satisfy() APIs. Below are my brief findings:

thank you for reviewing and double checking @berkaysynnada -- very much appreciated

@akurmustafa
Copy link
Contributor

akurmustafa commented Oct 14, 2024

I have worked through this comment in detail to determine whether there is an actual bug in the current implementation. Initially, my thoughts were similar to @akurmustafa's intuition, and I draft a few designs. However, I can now safely say that the current implementation is correct, and the tests are valid for both the add_new_orderings() and satisfy() APIs. Below are my brief findings:

Thanks @berkaysynnada for this finding. Nice catch. I hope, I didn't waste your time by misdirecting.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants