-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathnotes.txt
954 lines (631 loc) · 30.2 KB
/
notes.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
audit
convert to expect...
.attr?
fork jasmine and put in pp alternative
also, inspect within arrays, etc
possibly change a different place in the code...
fill out backlog...
tests run in a browser
knit works in a browser
sqlite stuff works with html5 storage
aggregate
straight aggregation, group by
pivot
complete predicates: or, gt, gte, grouping, etc...
sqlite implementation of cost based on explain / cardinality (cost before-the-fact)
using
union
RA expressions as json...
then x-language AT's.
x-engine joins
is it even possible to implement "join" async?
stream vs async...
go stream? assume (synchronous) iterators...
...so far this is an exploration of what a RA library might look like.
it actually works too
but it's inefficient in many places
also incomplete (ex: predicate are and, eq)
assumes arrays where it maybe ought to assume streams
and what about async?
and x-engine joins?
aggregate(relation, avg(attr("person.age")))
aggregate(relation, [attr("person.name"), attr("person.foo")], [avg(attr("person.age")), max(attr("person.age"))])
...assumes a projection down to avg(age)
aggregate(relation, [attr("person.name"), avg(attr("person.age"))])
...assumes a projection down to person.name, avg(age) - group by person.name
readme
Examples
examples
state of the union
algebra
"compilation"
acceptance tests
vision
- tests are json?
future directions...
there's a lot to do, based on demand
- x-engine
- streaming / async
- json wire format
group by, aggregates
group(relation, [name, age])
is group like nest?
nest, then apply aggrgates?
collapse a nesting...
complete predicates.
die: also rows and objects...
delegate signatures
remove rows and objects methods from algebra classes
this is only available upon compilation
...or change them to default compile..?
how should I auto-compile the inner relations to get rows and proper attributes?
relations shouldn't need names. yet names are convenient.
are they so practically convenient that we should require them?
or is that limiting? and/or offensive, from an elegance POV?
convert memory to complier style
quacksLike Executable relation
algebra to algorithm depends into memory relation
make another relation?
or, extract memory relation?
should i reconsider the concept of memory engine?
that's a good name though
is it more about sets?
or algorithms?
...storage vs execution strategy
...ideally engines are about storage.
rename engine => storage?
storage.sqlite
storage.set
...but it's not simply storage.
...engine implies the ability to do interesting things with the storage...
...there's an execution capability.
more predicate match out of engine/memory?
does it belong with algorithm?
...executing an expression
collaboration between storage and strategy
algorithmic execution is a general strategy
sql execution is specific to things that understand sql...
sqlite supports algorithmic, sql strategies
memory/set supports algorithmic
...hybrid strategies? (sql + algorithmic) (e.g. to allow sql for most things but then also nest/unnest algorithmically)
"name" is out of control
async / iterator resultset style?
sqlite left/right outer join
more exotic join + other scenarios...
"compile" metaphor
transform high-level algebra into lower-level "executable" form
(algebra => sql object)
compiler.compile(expression
sqlite db ==> connection / conn
cross-language AT's
since knit expressions are now late-binding, assume a canonical json format of an expression and state the expression directly in json?
select(relation("person"), and(eq(attr("person.houseId"), 101), eq(attr("person.age"), 5)))
["select",
["relation","person"],
["and",
["eq", ["attr","person.houseId"], 101],
["eq", ["attr","person.age"], 5]
]
]
do relations have names, or not?
memory relations ought not have names, right?
you need to be able to refer to them...
they ought not have early-bound names.
toSql => toSqlObject?
getting perform* methods out of relation.js...
on the sql side, we'll be building a relation that holds a sql object (a query?)
later, if we want memory to behave the same way, build a separate memory relation with a function (delayed execution?)
...and think stream...
distinction between these relations and "base" relations (e.g. table)
introduce hashset into main vendor dir and use to back the in-memory rowstore
relation defaults? copy some other prototype?
"it's like this relation but different"
relation interface is inconsistently implemented
make sure there are quackslike tests for all relations
test for better quacksLike failures
extract this?
...plus assertions / jasmine matchers
[style] consider putting the notSame variations on the left side
align the expected and actual vertically? (chop down)
query (vs table)
both are relations
join a query with a table => new query
http://jscc.jmksf.com/
consider a string form
...hrm...do not like...
attribute signature should include type
ok now get the *sqlite* engine working
an attribute has a type
implement merge, a basic relation
forget about types? can I?
...or push types up?
drive toward working integration test
...may not be able to assume row order with sqlite...
use a set just for assertion purposes right now?
a translation converts a relation to another form
an engine "executes" a relation and brings back rows(tuples)
an engine may find it convenient to translate a relation and then execute the translation.
you might think of an engine as a relation implementation
or a relation evaluation/execution environment
...actually engine isn't a bad word
an engine may speak a certain dialect
translate to sql
...sql dialect is the translation classes specially modified?
...so, can translation classes go to a canonical sql?
yes
more complex sql translations
good errors. knit has guardrails
...anything you can put in a dsl expression should have good guardrails / error / type checking.
List places that need guardrails:
relation doesn't exist
attr doesn't exist
foo.bar <= needs to be in this format
stuff is null or not of the appropriate type
unknown column type
give up and convert to standard describe / it / expect
RA concepts:
relational division ("divide ?") http://en.wikipedia.org/wiki/Relational_algebra#Division
http://www.simple-talk.com/sql/t-sql-programming/divided-we-stand-the-sql-of-relational-division/
http://vadimtropashko.wordpress.com/why-relational-division-is-so-uncommon/
equijoin
semijoin
antijoin
full outer join
RF more complex functions using local functions
relation ref equality doesn't go both ways...
make attributes class faster/more efficient by specifying more features in CF creation (oh boy is this cool)
also need to make differ sort-based in CF to get rid of n^2
an attributes class up in the algebra to take care of common attributes issues
inspect
sameness
attr lookup
attributes...simulate star attrs as real attrs...
up ahead. do i need to build / find a js iterator library?
...all streamed?
cache / memoize?
...what about Set, Hash?
outer joins
also nesting eliminates rows that are fully null / missing values...
consider currying to further RF join methods in relation.js
finish app integration
engines export row iterators (not arrays)
chaining style (reuse dsl methods or vice versa)
knit([headers(string array)]) ==> mem relation, no rows
knit([headers(string array)], array) ==> mem relation with rows
knit(array of js objects) ==> mem relation with rows - headers are stripped out, etc
knit(sqlite table) => sqlite relation
knit._util.bindRelation
nest / unnest
cost
prove that cost is low...implement as n+1 strategy, then RF to sorts + weaving
...should it impose its own order.asc on the relation?
project - *, exceptions
convert before varibles from global to this.
http://esa-matti.suuronen.org/projects/underscore.strings/
reconsider apply
interesting that it's in the same place as rows and objects
turn this into a late-evaluating result set like sequel
go look at sequel internals
https://github.com/jeremyevans/sequel/blob/master/lib/sequel/dataset/actions.rb
also go make things stream/firehose-oriented now?
engine implementations provide an iterator (something with forEach?)
make sure underscore can use it
what about jquery? jQuery(objects()).each()... jQuery(rows()).each()...
good inspect capability
rows.inspect returns tab-delimited table
objects.inspect ... hmm, pretty-printed json? not sure. do nothing / just use rows.inspect...
later: rows and objects can take a callback method, jquery-style, for async support (callback with the iterator)
wrap any 2d array, assigning column names, and incorporate it into a binding with a label and go to town...
the knit equivalent of wrapping a dom element with jquery
possibly convert testrelation over to whatever this is...
or it's already there in the form of memory relation (probably this, actually)
quacksLike...anything out there? underscore method?
jasmine-node at-exit merge
style: property or function? name vs name(). cost, or cost() ? etc
general style audit
what's the point of knit.engine.memory?
organization...
implementations of different parts of the algebra
apply / rows / etc
if someone else is implementing, is it obvious where to put things?
code should guide you.
here are the buckets.
what about order push / split / merge?
order adds to cost
what's a sensible way to break apart memory relation?
=========
readme
examples of nesting and unnesting, examples
=========
features supported
some features won't be supported by some engines
use collection_function - style feature capability?
so, check the performXXX of the relation for an implementation? blow up if not supported...
NullRelation / ExplodingRelation
mix engine relation with exploding relation?
feature("foo")...
methods just simply aren't present if you don't have the right features...just like CF
...only pull in the code you need based on the features you want?
use the node tool plus some sort of declaration thing?
=========
acceptance suite
write the test titles the way you'd want the feature docs to read.
should have a way for engines to say "i support this feature" and "i don't support this feature"
supports
doesntSupport
instead of "test"
=========
rows()
rows().map(function(item){return "x" + item})
.collection()
new collection
map.into(collectionAppender)
...just a convenience wrapper for appending?
rows().all() (collection, renamed in knit)
rows().first()
_A.each(rows(), function(item) {
}) ... can handle a stream
rows()
rows()(10) ...makes a collection?
rows().into(appender)
rows().into(relation("foo").append)
...could internally optimize this in engine implementations. batch inserts in mysql.
stream = .rows()
stream() => row, stream() => row, stream() => row
stream.remaining() => collection
^ method name is configurable
...rename collection to all in knit
**?? using a sql engine, generate select into...
var map = rows().map(function(item){return "x" + item})
map() -> next thing
map() -> next thing
map.into(function(item){/*stuff*/})
stream...
.collection() -> read it all in and return a collection
collection is user-configurable...
=========
Knit needs to be stream-based
use jquery / sequel / datamapper results set style (rows, rows.first [applies limit 1 relation at the last minute?], rows.all)
engine implementers should have to implement something like forEach
Cross engine joins
think join (unix cmd)
- needs to assume the two sides are sorted on the join key.
...so, you need to implement forEach, isSortedOn(joinKeyPart1, joinKeyPart2), sortOn(joinKeyPart1, joinKeyPart2)
...you could of course lie about isSorted..., that just means x-engine joins using your engine will be wacky
Should be able to use this in node.js to programmatically do file joins like unix join does, and it should be pretty fast.
should the system "wake up" on forEach push events / callback from relation sources?
imagine doing a x-engine join between a file and remix, that's paged + not resource intensive.
=========
Next:
name and attributes - functions, not properties
tables have columns, not attributes (attributes alias is fine though)
fill out predicates...
gt, gte, lt, lte, disjunction/or, in, etc
project
order
other set ops
...or...go straight to sql engine / AST?
extract acceptance tests
sqlite/node
expand pushing and splitting...
they pass through joins and other stuff
optimize = split + push (everything, all the way down)
constraints
a language about the language
what sorts of things are allowable in a given context
you can have as many selects and projects as you want
but no joins
etc
can I define relation constraints in such a way that they cross language?
turn cost into a function
can you explain plan w/ the html5 local db?
json-format tests that prove out algebra?
.fromJson ... .fromStruct
========
realRelation.doJoin
doSelect
doProject
doRename
...etc
========
f = function(person, house){
return knit(function(){
return join(person, house)
})
}
f(memoryPerson, memoryHouse)
=========
rawRelationFunction = engine.join(engine.mutableRelation("person"))
rawRelationFunction().rows()
or
optimizedRelationFunction = rawRelationFunction.optimize()
optimizedRelationFunction()
so
I can take a selectFunction, where f = selectFunction, do
f.push() ==> true/false
or
f.optimize()
and calling
r = f()
might yield a Join (i.e. noun, object) (known as r)
...or some other relation type I can't predict. the caller should not know or care.
r.name() == "person__house"
r.attributes() == ...
r.rows() ==> tuples
then later...
r.toSql() => "select..."
=========
So then...
is there a clear division between the "core" *Function objects, and the more concrete objects they produce?
...however to do any work we need to know the structure of the seed relations...
...attributes, plus original relation...
personR = engine.mutableRelationFunction("person")
houseR = engine.mutableRelationFunction("house")
select = knit.select(personR, knit.eq(personR.attr("name"), "Jane"))
relationFunction = select.optimize()
r = relationFunction()
r.rows()
=========
functions are first-class. we are dealing in functions.
implications:
prove commutativiity:
join(a,b).equals(join(b,a))
join(a,b,a.x eq b.y).equals(join(b,a,b.y eq a.x))
selection splitting
select(r, conjunction(true, false)).
equals(select(select(r, true), false))
isEquivalent
isSame
=========
join predicate
apply predicate to combined tuple
if true then tuple is added to combinations
" set of all combinations "
=======
relation has a relation has a relation
...i have a join
===========
...you're replacing one of the relations in the join with one that's now wrapped differently
selection pushing:
join.relationOne = select(join.relationOne, predicate)
remove outer select
so you're lopping off parts of the predicate.
select with no predicate just returns the inner relation
pushing on to the join:
join.cross = predicate
remove outer select
6. 1. 3. Selection and cross product
Cross product is the costliest operator to evaluate. If the input relations have N and M rows, the result will contain rows. Therefore it is very important to do our best to decrease the size of both operands before applying the cross product operator.
This can be effectively done, if the cross product is followed by a selection operator, e.g. (R × P). Considering the definition of join, this is the most likely case. If the cross product is not followed by a selection operator, we can try to push down a selection from higher levels of the expression tree using the other selection rules.
In the above case we break up condition A into conditions B, C and D using the split rules about complex selection conditions, so that A = B C D and B only contains attributes from R, C contains attributes only from P and D contains the part of A that contains attributes from both R and P. Note, that B, C or D are possibly empty. Then the following holds:
=========
- do math
- selection, projection pushing
- conjunction splitting, etc
http://en.wikipedia.org/wiki/Relational_algebra#Use_of_algebraic_properties_for_query_optimization
- then make questions work
- isNaturalJoin()
- redo the object model
- class "style"
- mixins
- should I go to a functional style by default?
ra.project(ra.select(r, somePredicate), [name, age])
- unit test - memory equals, selection, projection, cartesian join (simple tests)
===========
Breaking up selections with complex conditions
A selection whose condition is a conjunction of simpler conditions is equivalent to a sequence of selections with those same individual conditions, and selection whose condition is a disjunction is equivalent to a union of selections. These identities can be used to merge selections so that fewer selections need to be evaluated, or to split them so that the component selections may be moved or optimized separately.
This can be effectively done, if the cross product is followed by a selection operator, e.g. σA(R × P). Considering the definition of join, this is the most likely case. If the cross product is not followed by a selection operator, we can try to push down a selection from higher levels of the expression tree using the other selection rules.
============
grand unification
- good_question is really about a couple of things: (-rb first)
- basic http request validation and parsing
- constructing and validating a relation (e.g. relation-rb)
- predicated becomes extensions in relation-rb, around the relation-rb predicate model (-rb first)
- port that stuff to js?
- validation/constraints on a relation (-rb first)
- constraint on join magnitude
- "this relation may not participate in a join in which more than 1000 tuples are used"
- prove out various engines using a common set of acceptance tests defined in json (-js first)
- relation-acceptance project. notes plus json.
- should all relation projects have a possible async style?
- what do you do in ruby? em or something?
- http://www.igvita.com/2008/09/05/asynchronous-database-access-in-ruby/
- http://github.com/tmm1/em-mysql
- http://www.espace.com.eg/neverblock/blog/2008/09/04/neverblock-instant-scaling-for-your-rails-apps/
- http://www.espace.com.eg/neverblock/blog/2008/08/28/neverblock-mysql-support/
========
inputs and outputs are tuples, expressed as arrays.
...so we need to insert
Async by default
what parts should be async? tuples() ? anything else?
insert, update, delete ...come back with what? boolean?
GOAL: you can learn about relational algebra by reading the code.
...so things like Rename being duck-typed to relation might be weird...
GOAL: DSL for naturally expressing a statement
inner join: "The result of the join can be defined as the outcome of first taking the Cartesian product (or cross-join) of all records in the tables (combining every record in table A with every record in table B)—then return all records which satisfy the join predicate."
- an equi join is a theta join that only allows equality compairsons in the join predicate
(sql "using" - though undoubtedly this is the kind of thing where specific
dbms's will want to accomplish this different ways)
- a natural join is an equi join that joins on all identical attributes
later:
allow join strategies (at least on inner join)
- hash merge
- sort merge
...this applies to in-memory cases only
calculate the "expense" of a join (carindality) - expose this,
use this value to acceptance-test the join strategy
Set operations. They do something to a relation, and yield a relation.
General set operations:
sort, set difference, intersection, union
should be a mixin.
Set operations specialized for RA:
projection, selection(restriction), join
=========
r l a js
ra
=========
DSL ideas
OO style / functional style
r2 = person.project("name", "age") / r2 = project(person, "name", "age")
r2 = person
.join(house)
.on("houseId", "id")
.end
/
r2 = join(person, "houseId", house, "id")
r2 = sort(person, ["name",ASC])
engine.relation("person").join(engine.relation("house"))
person
.join(house)
.on("houseId", "id")
.end
=========
noun / verb
relation
attributes, aka heading
attribute, aka domain
rename (relation / attribute)
projection / project
join / join
selection / select , alias restriction / restrict
union / union
intersection / intersect
difference / diff
=========
t * t2 [foo = bar]
t * t2 ()
=========
http://c2.com/cgi/wiki?NotesOnaCeePlusPlusRdbmsApi
void PrintEmployeeProjectManagers?(CDatabase& db)
{
CTupleList tl(db["Employees"]
.Join("Projects")
.Join(db["Employees"]
.Rename("ID", "ManagerID")
.Rename("Name", "ManagerName?"))
.DropAllBut?(attr("Name") | attr("ManagerName?")));
for (CTupleList::iterator i = tl.begin(); i != tl.end(); ++i)
std::cout << (*i)["Name"] << " " << (*i)["ManagerName?"] << std::endl;
}
=========
[topical!]
http://c2.com/cgi/wiki?RelationalLanguage
It is sad to see SQL being the only widespread data language, as it really adds a lot of mess to the relatively clean conceptual model of relational algebra. The strength of the relational model is its enormously simple data model: in addition to data types (ints, strings, etc.), it only has relations, operators (projection, selection, join, union, sorting, and some others) and integrity constraints.
It would be great to be able to use, in production code, a language where one could say 't' instead of 'select * from t' and 't * t2 [foo = bar]' instead of 'select * from t join t2 on foo=bar'.
In practice since very few would use a multiplication without a filter, I believe they should be combined into a single operator (join). (We can still keep multiplication.) Further, a functional style would be nice so that DBA's could more easily define their own operators without having to add new syntax. A new function that uses two existing functions is not new syntax, just a new function.
=========
from: http://c2.com/cgi/wiki?DatabaseType
I just realized recently that a RelVar is actually an instance of a DatabaseType (or, maybe more specifically, a Table Type)... for example, all variables are instances of a type (for those who think in types). So in fact a relvar is really an instance of some sort of grander type.. or RelType?. The specifications of the RelType? or DatabaseType are the schema, the constraints, etc. The types that determine attribute restraints (column restraints) are just like how a struct or record has sub fields with more type specifications per each struct or record field.
No, a relvar is an instance of a variable. A variable may possess a type which constrains the values it may contain, but a variable is not a type.
{Well it is a 'rel'var, not just any old 'var'. That implies it is named at the top level of a relational database, and that it names a mutable relation (if not mutable, not really a 'variable'). I would note that relation does not equal table... except in practice. Tables can't be infinite. Relations - subsets of cartesian products of potentially infinite domains - certainly can be.}
It is, in fact, "any old 'var'" that happens to contain a value which is a relation, and which is often but not necessarily persistent, as a relvar (short for "relation-valued variable" or "relation variable") may be temporary or local. A relvar is not necessarily at the top level of a relational database (i.e., it may be a local relvar), though the top level of a relational database consists of persistent relvars. Variables are named, relations are not. See RdbRelVar. By the way, the cardinality of a relation held by a relvar is immaterial to the definition of "relvar".
{A 'var' that necessarily identifies a relation isn't just "any old 'var'" - the proof being trivial, that I can point at a 'var' at random (e.g. one identifying an 'int') and it isn't a relvar. As far as the 'naming' goes, from my viewpoint variables never 'contain' values (values being immaterial and entirely un-containable). Variables do identify (or 'point at') and thereby give name to values - generally a temporally limited name based upon the lifetime of an object containing a representation of said value and an associated decodec. And I do know that the cardinality of a relation held by a relvar is immaterial to the definition of "relvar". That's why it would be a 'wrong answer' to say that a relvar has some sort of 'table type', because tables necessarily have finite cardinality, they being defined in terms of specifying exactly what they contain. A relvar identifies a relation, which could potentially have infinite cardinality, which is sufficient to say that it is not a 'table type'. (The statement to which you're responding was targeted originally at someone else rather than you, of course.)}
=========
http://www.cs.rochester.edu/~nelson/courses/csc_173/relations/algebra.html
http://infolab.stanford.edu/~ullman/fcdb/oracle/or-intro.html
=========
A table is a relation
it aliases heading to columns
=========
I'm projecting on a relation (that's a join)
join = join(relation_a, relation_b)
projection(rename(relation_a.attribute("age"), "AGE"), relation_b.attribute("height"))
selection(predicate) #aka restriction
relation
heading
attributes
===========
relation = heading(attributes) + tuples - it's the definition, not the content
an attribute OF a relation
you can *extend* a relation to have a new attribute
projection (select cols...)
selection (where) #aka restriction
natural join (inner join)
intersection, difference, cartesian product (natural join w/o matching attrs. special case of nat join), union
rename (table foo as far, column yyy as zzz)
"extend" a relation.
extend(foo * bar)
extend(max(foo))
the relation (name, age, weight)
the relation's projection over the attributes age, weight
produce a function that acts on a set and produces a result...
selection(relation) => function #aka restriction
projection(relation) => function
...should be a delay before you plug in the actual data. this is a specification.
(select) projection (from) relation (where) selection #aka restriction
(select) projection (join) relation1, relation2 (where) selection
as a matter of modeling, js prototypes might match up very well.
===============
The Relational algebra implemented via Prolog rules and queries.
Selection:
select( variables ) :- conditions on the constants.
Constants select rows in the relation.
Intersection:
r_1in_r2( Vars ) :- r_1( Vars ), r2( Vars ).
selects the entities that are in both r_1 and r2 (use the same variables).
Difference:
diff_r_1_r2( Vars ) :- r_1( Vars ), not r2( Vars ).
selects the entities in r_1 that are not in r2.
Projection:
pr( variables ) :- r( variables and don't cares ).
Don't cares represent columns to be deleted.
Cartesian product:
prod( variables ) :- r_1( vars ), r2( vars ).
prod variables is the list of variables both in r_1 and r2.
Union: (two rules are required to perform union.)
union( variables ) :- first_relation( variables ).
union( variables ) :- second_relation( variables ).
Natural Join: In the rule,
nat_join( variables shared variables ) :-
r_1(variables, shared variables),
r2(variables, shared variables).
the shared variables restrict search to common elements, reduced number of variables in the join eliminate multiple columns.
===================
Hmm. Why is there specialization at the engine level, and the same specialization expressed in a different manner in the "algebra" classes?
e.g.
def engine
relation1.engine != relation2.engine ? Memory::Engine.new : relation1.engine
end
def table_sql(formatter = Sql::TableReference.new(self))
relation1.externalize.table_sql(formatter)
end
def joins(environment, formatter = Sql::TableReference.new(environment))
seems suboptimal, sitting under the algebra folder.
also why is Arel::Array owned by/hardwired to memory engine?
maybe the engine should be a factory:
sessions = my_memory_engine.array(:session)
my_sql_engine.table(:user).join(my_memory_engine.table(:session)) => read
read.execute ==> result
...also you can just join in a raw array
my_sql_engine.table(:user).join([[1,'blue'],[2,'red'],[3,'green']]) => read
read.execute
? the joiner has to offer a couple of forms:
- SqlStatement
- Array
the joinee accepts one of them...?
or perhaps it's the joiner that makes the decision:
Create a temporary (in-memory?) table with these rows.
Then join using sql.
Then continue, but clean up the temporary table when we're done.
*** more scenarios ***
===========
set up all the namespace for a module in one place?
require("arel/attributes/namespace")
- unit tests on -x-straight v8-x- >> no, just use node. straight v8 is too bare.
- integration tests on memory engine
apply the same integration suite everywhere? possible?
sql engine
- integration tests that run on node-sqlite (i.e. incorporating node)
- then take those same integration tests and reuse on html5 client side db
(in-browser tests [html pages]?
what about headless safari? "run if darwin" could just be loading an html page and scraping the results, printing to command line (could even poll and puts...use ruby). use a plain-text jasmine runner. extract this?)
generate a minified arel-all-minified.js?
idea: jasmine plain-text html runner, plus rubycocoa safari stuff from schnauzer =
fast command-line browser-based js test run
...can the output be seen incrementally?
target: unit test output looks exactly like command line output
(failures look the same, etc).
do it with macruby?
==========
Arel bugs:
String likely does not convert false to "false"