Database
A Tensorlog DATABASE holds a set of unary and binary relations, which are encoded as scipy sparse matrices. The human-readable format for a database is a set of files with the .cfacts extension. Each line contains a predicate name followed by one or two tab-separated string constants. Some examples, from src/test/textcattoy.cfacts:
hasWord	dh	a
hasWord	dh	pricy
hasWord	dh	doll
hasWord	dh	house
hasWord	ft	a
hasWord	ft	little
hasWord	ft	red
hasWord	ft	fire
hasWord	ft	truck
...
label	pos
label	neg
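To make the sparse-matrix encoding concrete, here is a minimal sketch of how facts for a binary relation such as hasWord could be turned into (row, column, value) triples. This is not Tensorlog's own code: the function name build_coo, the id assignment order, and the value 1.0 for each fact are all assumptions made for illustration.

```python
# Hypothetical sketch, not Tensorlog's implementation: map each string
# constant to an integer id, then record one (row, col, value) triple
# per fact of a binary relation.
facts = [
    ("hasWord", "dh", "a"),
    ("hasWord", "dh", "pricy"),
    ("hasWord", "ft", "a"),
    ("hasWord", "ft", "truck"),
]


def build_coo(facts):
    """Return (ids, rows, cols, vals) for a list of binary facts."""
    ids = {}                      # constant -> integer id (assumed scheme)
    rows, cols, vals = [], [], []
    for _pred, first, second in facts:
        i = ids.setdefault(first, len(ids))
        j = ids.setdefault(second, len(ids))
        rows.append(i)            # row index: first argument
        cols.append(j)            # column index: second argument
        vals.append(1.0)          # assumed value for an unweighted fact
    return ids, rows, cols, vals


ids, rows, cols, vals = build_coo(facts)
```

If scipy is installed, triples like these can be passed to scipy.sparse.coo_matrix((vals, (rows, cols)), shape=(len(ids), len(ids))) to obtain a sparse matrix.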
An optional additional column holds a numeric weight for the fact. Because of this, don't use any constant that parses as a number in a .cfacts file, since it could be mistaken for a weight.
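The following sketch shows why numeric-looking constants are a problem. It is a hypothetical line parser, not the one Tensorlog uses: the name parse_cfact_line, the default weight of 1.0, and the rule "a trailing field that parses as a float is a weight" are assumptions chosen to illustrate the ambiguity.

```python
# Hypothetical parser for one .cfacts line; not Tensorlog's own code.
def parse_cfact_line(line):
    """Split a tab-separated fact into (predicate, args, weight)."""
    parts = line.rstrip("\n").split("\t")
    weight = 1.0  # assumed default when no weight column is present
    # Assumption: with three or more fields, a last field that parses
    # as a number is treated as the weight column.
    if len(parts) >= 3:
        try:
            weight = float(parts[-1])
            parts = parts[:-1]
        except ValueError:
            pass  # last field is an ordinary string constant
    pred, args = parts[0], tuple(parts[1:])
    if not 1 <= len(args) <= 2:
        raise ValueError("expected one or two constants: %r" % line)
    return pred, args, weight


unary = parse_cfact_line("label\tpos")
weighted = parse_cfact_line("hasWord\tdh\ta\t0.5")
# A numeric constant is indistinguishable from a weight:
ambiguous = parse_cfact_line("hasWord\tdh\t42")
```

Under these assumptions the last line is read as the unary fact hasWord(dh) with weight 42.0 rather than the binary fact hasWord(dh, 42), which is the kind of confusion the warning above is about.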
A database can be serialized; the serialized form should be stored in a directory with the extension .db. A serialized database is much smaller than the .cfacts files and can be loaded much more quickly.
To see what's in a database, serialized or not, you can use the 'list' module, for example:
python -m list --db test/textcattoy.cfacts
or
python -m list --db test/textcattoy.cfacts --mode hasWord/2