Avoid collecting entire partition(s), when no Comparator is provided with a window #177

Open
lukaseder opened this issue Jan 9, 2016 · 9 comments

Comments

@lukaseder
Member

When using only the frame clause (and possibly the partition function), subsequent window function evaluations can be done lazily, as they do not necessarily depend on the whole partition(s) being collected in advance.

For instance, when doing a sliding sum like this:

Seq.of(1, 2, 3, 4, 5, 6, 7...)
   .window(-1, 1)
   .map(w -> w.sum())
   .limit(3)
   .toList();

The result here is:

[Optional[3], Optional[6], Optional[9]]

There is no need to go beyond item no. 4 (limit + frame upper bound) in collecting the stream.
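To illustrate the point about only needing item no. 4, here is a minimal sketch in plain Java, without jOOλ. The class `LazySlidingSum`, the method `slidingSums`, and the `CountingIterator` helper are hypothetical names made up for this illustration and are not part of the jOOλ API; the sketch also returns plain integers rather than `Optional` values. It assumes a frame given as row offsets (1 preceding, 1 following) and no ordering or partitioning.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Stream;

final class LazySlidingSum {

    /** Hypothetical helper: counts how many elements were pulled from the source. */
    static final class CountingIterator implements Iterator<Integer> {
        private final Iterator<Integer> delegate;
        int pulled = 0;

        CountingIterator(Iterator<Integer> delegate) {
            this.delegate = delegate;
        }

        @Override
        public boolean hasNext() {
            return delegate.hasNext();
        }

        @Override
        public Integer next() {
            pulled++;
            return delegate.next();
        }
    }

    /**
     * Sums the frame [row - preceding, row + following] for the first `limit` rows,
     * pulling from the source only as far as the current frame requires.
     */
    static List<Integer> slidingSums(Iterator<Integer> source, int preceding, int following, int limit) {
        List<Integer> result = new ArrayList<>();
        Deque<Integer> frame = new ArrayDeque<>(); // buffered rows [first, first + frame.size())
        int first = 0;                             // index of the oldest buffered row

        for (int row = 0; row < limit; row++) {
            // Pull until the frame's upper bound is buffered or the source is exhausted.
            while (first + frame.size() <= row + following && source.hasNext())
                frame.addLast(source.next());

            // Drop rows that have fallen below the frame's lower bound.
            while (first < row - preceding && !frame.isEmpty()) {
                frame.removeFirst();
                first++;
            }

            // Stop if the current row itself does not exist in the source.
            if (first + frame.size() <= row)
                break;

            int sum = 0;
            for (int value : frame)
                sum += value;
            result.add(sum);
        }
        return result;
    }

    public static void main(String[] args) {
        // An infinite source 1, 2, 3, ... that would never terminate if collected eagerly.
        CountingIterator source = new CountingIterator(Stream.iterate(1, i -> i + 1).iterator());
        System.out.println(slidingSums(source, 1, 1, 3)); // [3, 6, 9]
        System.out.println(source.pulled);                // 4
    }
}
```

The buffer never holds more than the current frame, which is what makes this work on an unbounded source.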

@johnmcclean
Contributor

This would be awesome :)

@lukaseder
Member Author

Yes, it would indeed, although further research is needed because the semantics of some ranking functions are independent of the window frame, e.g. rowNumber(), rank(), or denseRank().

Also, thus far, I've documented only the convenience window() method. It is also possible to create up to 16 windows at the same time, in which case the stream needs to be buffered as soon as at least one window specification uses ordering.

By the way, I've seen your response on Stack Overflow:
http://stackoverflow.com/a/34712153/521799

Great to hear you're building on top of jOOλ! Would you be interested in publishing a guest post about this work on the jOOQ blog?

@johnmcclean
Contributor

Yeah sure, that sounds good. cyclops-streams adds reactive-streams type features on top of jOOλ (hotStreams, reactive-streams support, async execution, etc.), and simple-react then adds concurrency to that. There are a few operators that would probably be at home in Seq though (e.g. single & singleOptional; see http://stackoverflow.com/questions/22694884/filter-java-stream-to-1-and-only-1-element/34715168#34715168).

Your windowing is incredibly feature rich, I'm looking forward to getting a better handle on how to make full use of it.

@lukaseder
Member Author

> There are a few operators that would probably be at home in Seq though (e.g. single & singleOptional

Pull requests are very welcome! Although, in that case, what's the exact use-case?

> Your windowing is incredibly feature rich, I'm looking forward to getting a better handle on how to make full use of it.

Yep, they're very versatile. Usually, there are more generic FP constructs to do the same things, but I have not yet seen anything as concise as SQL window functions.

@johnmcclean
Contributor

I've put the reason I added it below (manipulating a single value asynchronously), but I think there are more general use cases: when you want to enforce that exactly one element in a dataset meets a criterion (by throwing an exception if that is not the case), or to provide a default if it is not.

       List<Footballer> players;
       // Seq.seq(Iterable) streams the list's elements; Seq.of(players) would wrap the list itself
       Goalkeeper goalie = Seq.seq(players)
                              .ofType(Goalkeeper.class)
                              .single();

       KeyController critical = Seq.seq(suppliedPlugins)
                                   .ofType(KeyController.class)
                                   .singleOptional() // misconfigured if Optional.empty
                                   .orElse(safeModeController);

       Seq.of(host1, host2, host3, host4, host5)
          .filter(host -> memberOfMajorityCluster(host))
          .single(host -> host.isElectedLeader());

The reason we have it: the API in simple-react is in many cases (at least for the authors) both simpler and more powerful than the direct CompletableFuture API, so it's useful to be able to manipulate a Stream of a single value. If you require that there is exactly one result, it is safer to call

        seq.single();

than

     seq.toList().get(0);

Here, even unit tests could hide the fact that the Stream contains more than one result. Otherwise, the equivalent code is something like

   List<T> values = seq.toList();
   if (values.size() == 1)
       return values.get(0);   // exactly one element: return it
   else
       throw new Exception();  // zero or several elements
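For comparison, the same checks can be sketched on top of a plain `java.util.stream.Stream` without collecting the whole stream into a list. This is only an illustration under assumptions: the class `SingleElement` and its static methods are hypothetical stand-ins, not the cyclops or jOOλ implementations, and the choice of exception types and of returning `Optional.empty()` for zero or several elements is assumed here.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Optional;
import java.util.stream.Stream;

final class SingleElement {

    /** Returns the only element; throws if the stream holds zero or more than one. */
    static <T> T single(Stream<T> stream) {
        Iterator<T> it = stream.iterator();
        if (!it.hasNext())
            throw new NoSuchElementException("expected exactly one element, found none");
        T value = it.next();
        // Looking one element ahead is enough; the rest of the stream is never read.
        if (it.hasNext())
            throw new IllegalStateException("expected exactly one element, found more");
        return value;
    }

    /** Returns the only element, or Optional.empty() for zero or several (assumes non-null elements). */
    static <T> Optional<T> singleOptional(Stream<T> stream) {
        Iterator<T> it = stream.iterator();
        if (!it.hasNext())
            return Optional.empty();
        T value = it.next();
        return it.hasNext() ? Optional.empty() : Optional.of(value);
    }

    public static void main(String[] args) {
        System.out.println(single(Stream.of("goalie")));      // goalie
        System.out.println(singleOptional(Stream.of(1, 2)));  // Optional.empty
    }
}
```

Because at most two elements are pulled, this also works on very long or infinite streams, unlike the `toList()` variant.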

@lukaseder
Member Author

Convinced! Although, I'll implement this subtly differently: #178

@lukaseder
Member Author

I'm curious to learn more from you. If there are any other methods that you would like to see in jOOλ, just open feature requests and we'll see if we can add something.

@johnmcclean
Contributor

Sure, will do. I think onEmptySwitch(Supplier<Seq<T>> switchTo) would also fit well in jOOλ, for example.
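For readers unfamiliar with the operator, here is a rough sketch of what such a method could do, written against a plain `java.util.stream.Stream` as a static helper. The class name `OnEmptySwitch` and the static signature are assumptions made for this illustration; this is not the cyclops implementation. Note that this simple version checks for emptiness eagerly, at call time, by looking ahead one element.

```java
import java.util.Iterator;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

final class OnEmptySwitch {

    /** Returns the given stream if it has at least one element, else the supplied alternative. */
    static <T> Stream<T> onEmptySwitch(Stream<T> stream, Supplier<Stream<T>> switchTo) {
        Iterator<T> it = stream.iterator();

        // The alternative stream is only created when the original turns out to be empty.
        if (!it.hasNext())
            return switchTo.get();

        // Re-wrap the iterator so that the element seen by hasNext() is not lost.
        return StreamSupport
            .stream(Spliterators.spliteratorUnknownSize(it, Spliterator.ORDERED), false)
            .onClose(stream::close);
    }

    public static void main(String[] args) {
        System.out.println(
            onEmptySwitch(Stream.<Integer>empty(), () -> Stream.of(1, 2))
                .collect(Collectors.toList())); // [1, 2]
        System.out.println(
            onEmptySwitch(Stream.of(9), () -> Stream.of(1, 2))
                .collect(Collectors.toList())); // [9]
    }
}
```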

@lukaseder
Member Author

(excuse my edit, needed to visualise the generics)

Hmm, yeah, I can see the point. I've created #179 with some criticism.
