Idea behind xt, xb, xq data split #1769
Hello, quick question: what is the idea behind xt (the training set) vs. xb (the base or database set)? In typical ML workflows there is no separate 'added' set like xb, or at least it is not obvious, so this is new to me. What is the general philosophy behind splitting a dataset into query, base, and training sets? Is the training set a subset of the base set? If so, is it just a subset large enough to train the model before the rest is added? If not, would you mind explaining the idea behind the split? Thanks for any guidance! Cheers!
Replies: 1 comment 4 replies
Basically, you want to index a set of vectors (`xb`) and then search it with a set of queries (`xq`). The `xt` set should be large enough for the index to train in a meaningful way, and it should have the same distribution as `xb`. `xt` and `xb` do not need to be disjoint.
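
To make the three roles concrete, here is a minimal NumPy sketch of a toy inverted-file index. It is not Faiss code: `ToyIVFIndex`, its `nlist` and `nprobe` parameters, and the random data are all assumptions made up for illustration. It only mirrors the call order you would use on a trainable Faiss index (`index.train(xt)`, `index.add(xb)`, `index.search(xq, k)`). Training learns a partition of the space from `xt`, adding stores `xb` in that partition, and searching looks up `xq` in it. Here `xt` is simply a slice of `xb`, which is allowed since the two need not be disjoint.

```python
import numpy as np


class ToyIVFIndex:
    """Toy inverted-file index (hypothetical, for illustration; not Faiss)."""

    def __init__(self, d, nlist, seed=0):
        self.d = d
        self.nlist = nlist
        self.rng = np.random.default_rng(seed)
        self.centroids = None                      # learned from xt in train()
        self.lists = [[] for _ in range(nlist)]    # ids of stored vectors per cell
        self.vectors = np.empty((0, d), dtype="float32")

    def _assign(self, x):
        # Index of the nearest centroid (squared L2) for every row of x.
        d2 = ((x[:, None, :] - self.centroids[None, :, :]) ** 2).sum(-1)
        return d2.argmin(axis=1)

    def train(self, xt, niter=10):
        # Plain k-means on the training set; no vector is stored for search here.
        picks = self.rng.choice(len(xt), self.nlist, replace=False)
        self.centroids = xt[picks].copy()
        for _ in range(niter):
            cell = self._assign(xt)
            for c in range(self.nlist):
                members = xt[cell == c]
                if len(members):
                    self.centroids[c] = members.mean(axis=0)

    def add(self, xb):
        # Route each database vector to a cell using the trained centroids.
        start = len(self.vectors)
        self.vectors = np.vstack([self.vectors, xb])
        for i, c in enumerate(self._assign(xb)):
            self.lists[c].append(start + i)

    def search(self, xq, k, nprobe=1):
        # For each query, scan only the nprobe closest cells; -1 pads missing hits.
        d2c = ((xq[:, None, :] - self.centroids[None, :, :]) ** 2).sum(-1)
        out = np.full((len(xq), k), -1, dtype=np.int64)
        for q, cells in enumerate(d2c.argsort(axis=1)[:, :nprobe]):
            ids = np.array([i for c in cells for i in self.lists[c]], dtype=np.int64)
            dist = ((self.vectors[ids] - xq[q]) ** 2).sum(-1)
            top = ids[dist.argsort()[:k]]
            out[q, :len(top)] = top
        return out


rng = np.random.default_rng(123)
d = 16
xb = rng.random((2000, d), dtype=np.float32)   # database: the vectors to be searched
xt = xb[:500]                                  # training set: here a subset of xb
xq = rng.random((5, d), dtype=np.float32)      # queries: same distribution as xb

index = ToyIVFIndex(d, nlist=8)
index.train(xt)                        # learn the partition from xt only
index.add(xb)                          # store the vectors that will be searched
I = index.search(xq, k=3, nprobe=2)    # ids into xb, one row per query
print(I.shape)                         # (5, 3)
```

The sketch also shows why the distribution matters: if `xt` looked nothing like `xb`, the centroids would land in the wrong places, most of `xb` would pile into a few cells, and search would be both slower and less accurate.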