-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathREADME.markdown.in
193 lines (132 loc) · 9.3 KB
/
README.markdown.in
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193

[http://www.flickr.com/photos/jodigreen/108225549/in/photostream/](http://www.flickr.com/photos/jodigreen/108225549/in/photostream/)
Relational algebra for javascript.
## Abstract
Expressions of relational algebra in pure form are rare in the programming world. It's a shame - the relational paradigm is a powerful and efficient way of organizing and manipulating data in general, not just within RDBs.
Knit follows projects like Arel<a href="#arel">*</a> and LINQ as an attempt to bring the power of relational algebra to programmers.
Knit is alpha quality and the api changes regularly.
## Examples
<!-- EXAMPLE:dsl -->
<!-- EXAMPLE:sqlite -->
## Examples, Continued: Acceptance Tests
Please see the suite of acceptance tests under test/acceptance, they are intended to be "executable documentation". They should give you an overview of what's possible with knit.
## DSL
The DSL uses a functional programming style. In fact it ought to read very much like this [explanation of relational algebra](http://www.cs.rochester.edu/~nelson/courses/csc_173/relations/algebra.html).
The pattern is as follows:
operation(relation, other_parameters...)
So starting with a relation representing a bunch of houses:
relation("house")
Select houses on Main street:
select(
relation("house"),
eq(attr("house.address"), "Main")
)
Houses on Main street and the people in them:
naturalJoin(
select(
relation("house"),
eq(attr("house.address"), "Main")
),
relation("person")
)
People found in houses on Main street:
project(
naturalJoin(
select(
relation("house"),
eq(attr("house.address"), "Main")
),
relation("person")
),
attr("person.name", "person.age")
)
You could of course extract the relations into variables:
var housesOnMainStreet = select(relation("house"), eq(attr("house.address"), "Main"))
var peopleAndHousesOnMainStreet = naturalJoin(housesOnMainStreet, relation("person"))
var peopleOnMainStreet = project(peopleAndHousesOnMainStreet, attr("person.name", "person.age"))
## Concepts, Lifecycle
Breaking down the in-memory example above. We started by creating a DSL function, providing it the base relation "bindings":
var $K = knit({
house:{attributes:["houseId", "address", "cityId"],
rows:[
[101, "Market", 1001],
[102, "Parnassus", 1001],
[103, "Canal", 1002]
]},
city:{attributes:["cityId", "name"],
rows:[
[1001, "San Francisco"],
[1002, "New Orleans"]
]}
})
Then we created a relational expression using knit's DSL:
var expression = $K(function(){
return project(
join(relation("house"), relation("city"),
eq(attr("house.cityId"), attr("city.cityId"))),
attr("house.address", "city.name")
)
})
"Expression" is an appropriate term here: the result has no "rows()" - it's an airy, abstract thing.
It needs to be bound with base relations that contain rows. I've chosen the compilation metaphor to express this (you're "compiling the expression down" to an algorithm, or to sql):
var rows = expression.compile().rows()
Here's what's going on when using the in-memory engine:
1. I define an expression (project(join(...)...) etc).
2. As the expression passes out of the DSL function, the [base relations](https://github.com/sconover/knit-js/blob/master/lib/knit/engine/memory/base_relation.js) and [attributes](https://github.com/sconover/knit-js/blob/master/lib/knit/engine/memory/attribute.js) are "linked up" in the appropriate places. relation("house") points to the in-memory relation "house".
3. I call .compile(). The expression is [converted](https://github.com/sconover/knit-js/blob/master/lib/knit/translation/algorithm/algebra_to_algorithm.js) to its equivalent "algorithm" - implementations of relational operations that execute in-memory ([code](https://github.com/sconover/knit-js/blob/master/lib/knit/algorithms.js), [tests](https://github.com/sconover/knit-js/tree/master/test/algorithms)). The result is an [executable relation](https://github.com/sconover/knit-js/blob/master/lib/knit/executable_relation.js).
4. I call .rows() on the executable relation. This pulls the rows from base relations, sends them through the appropriate algorithms, and returns the end result.
And when using the sqlite engine:
1. I define an expression (project(join(...)...) etc).
2. As the expression passes out of the DSL function, the [tables](https://github.com/sconover/knit-js/blob/master/lib/knit/engine/sqlite/table.js) and [columns](https://github.com/sconover/knit-js/blob/master/lib/knit/engine/sqlite/column.js) are "linked up" in the appropriate places. relation("house") points to the table "house".
3. I call .compile(). The expression is [converted](https://github.com/sconover/knit-js/blob/master/lib/knit/translation/sql/algebra_to_sql.js) to its equivalent [sql object](https://github.com/sconover/knit-js/blob/master/lib/knit/translation/sql/base.js). The result is an [executable relation](https://github.com/sconover/knit-js/blob/master/lib/knit/executable_relation.js), as before.
4. I call .rows() on the executable relation. The [sql query execution strategy](https://github.com/sconover/knit-js/blob/master/lib/knit/engine/sqlite/query.js) converts the sql object [into a sql statement](https://github.com/sconover/knit-js/blob/master/lib/knit/translation/sql/to_statement.js), executes this against the database, and returns the result.
## Status, Direction
Knit works. It's an exciting initial attempt at working out a multi-engine RA library.
But it's far from finished, and there are many interesting avenues to explore. Here's just a sampling of todos / possible directions:
- Beef up the sql/sqlite implementation.
- SQL transformation capability is naive
- A second database (e.g. MySQL) would force out dialects
- Knit ought to be usable with browser sqlite implementations
- More RA operations
- Aggregate ("group by"), equi and anti joins, and on and on...
- Exotic operations like "pivot"
- Fill out the predicates (or, grouping, gt, gte, etc)
- RA expressions to/from JSON
- ...which would allow us to express the acceptance tests in a manner that crosses languages
- Now RA expressions can be shipped across the wire, stored, etc. Neat!
- Cross-engine capability
- Efficiency. Many O(N) enhancements are possible.
- Error reporting. Knit could use better guardrails.
- Move from a array-of-rows orientation to a stream / iterator style
- Evaluate async style - would it make sense? (there are challenges - think of how join might be implemented give that the constituent relations are returning rows asynchronously)
- ...etc...
## Running the Tests
Sorry this is a little cumbersome, it's a todo to clean this up (presumably use npm).
### "Main" suite (everything but sqlite, should work with any modern node install)
1. Install [Node.js](https://github.com/joyent/node/wiki/Installation)
2. Clone knit-js
3. In a sibling directory, clone sconover's jasmine-node:
git clone https://github.com/sconover/jasmine-node.git
4. cd to knit-js. run
node test/main_suite.js
### Sqlite suite (everything but sqlite) - requires node 0.2.x for now
Regarding the node v0.2.6 requirement - the combination of a pretty rough world when it comes to node/sqlite support and my own requirements (namely, that the sqlite driver supports sync operation) means I need to use grumdig's node-sqlite, which segfaults on node > 0.2.6. Improving this state of affairs is a priority.
1. Install [Node.js](https://github.com/joyent/node/wiki/Installation) - IMPORTANT - must be v0.2.x!
2. Clone knit-js
3. In a sibling directory, clone sconover's jasmine-node:
git clone https://github.com/sconover/jasmine-node.git
4. In a sibling directory, clone grumdrig's node-sqlite:
git clone https://github.com/grumdrig/node-sqlite.git
5. Build node-sqlite - see the [node-sqlite installation instructions](http://grumdrig.com/node-sqlite/).
6. cd to knit-js. run
node test/suite.js
## Relational Algebra Resources
[Wikipedia: Relational Algebra](http://en.wikipedia.org/wiki/Relational_algebra)
[University of Rochester, CSC173, Relational Algebra Intro](http://www.cs.rochester.edu/~nelson/courses/csc_173/relations/algebra.html)
## Footnotes
<a name="arel">
*that is, Arel as originally envisioned. [Arel 1.0](https://github.com/nkallen/arel) had echoes of relational algebra (the terminology, implementations of the major RA operations). More importantly for Rails, it enabled a powerful composable style, and perhaps because of its success within the Rails project Rails developers reworked it as a focused SQL-oriented tool.
As of [version 2.0](https://github.com/rails/arel) Arel is really a SQL AST library, as [Aaron Patterson indicates](http://engineering.attinteractive.com/2010/12/architecture-of-arel-2-0/):
<blockquote>
Though ARel is billed as a “relational algebra” library, the current implementation is entirely built using patterns found in compiler technology. I think a more accurate description of ARel would be “a SQL compiler".
</blockquote>