Rewritten RelEx2Logic rule engine #37

williampma · 2014-05-16T03:56:56Z

Rewritten rule engine, fixing issue #23, #32, #33, plus a few others I didn't report.

ruiting · 2014-05-16T04:37:08Z

Reviewed.

The fix modified the buggy graph traversing algorithm of the RelEx2Logic
rule engine, using RelationCallback to crawl the Feature Structure..

Ruiting Lian

On Fri, May 16, 2014 at 11:56 AM, william [email protected] wrote:

Rewritten rule engine, fixing issue #23 #23,
#32 #32, #33 #33,

plus a few others I didn't report.

You can merge this Pull Request by running

git pull https://github.com/williampma/relex master

Or view, comment on, or merge it at:

#37
Commit Summary

Rewritten rule engine

Misc cleanup

File Changes

M src/java/relex/output/LogicProcessor.javahttps://github.com/Rewritten RelEx2Logic rule engine #37/files#diff-0(315)

Patch Links:

https://github.com/opencog/relex/pull/37.patch

https://github.com/opencog/relex/pull/37.diff

—
Reply to this email directly or view it on GitHubhttps://github.com//pull/37
.

linas · 2014-05-16T18:58:14Z

Can you please fix the indentation, so that its clear what actually changed, and what didn't?

It seems like the original file used tab indentation, which you changed to spaces. Or something like that. I can't commit when the changeset is completely garbled like this.

You can do this on this same branch, you don;'t need to create a new branch; just push additional commits to this branch.

williampma · 2014-05-17T01:32:32Z

The changeset are in the "Rewritten rule engine" commit. The other one only fixes the spacing and changed all tabs to spaces according to OpenCog development standard on wiki.

linas · 2014-05-17T23:25:12Z

relex uses tabs not spaces.

On 16 May 2014 20:32, William Ma [email protected] wrote:

The changeset are in the "Rewritten rule engine" commit. The other one
only fixes the spacing and changed all tabs to spaces according to OpenCog
development standard on wiki.

—
Reply to this email directly or view it on GitHubhttps://github.com//pull/37#issuecomment-43394080
.

linas · 2014-05-17T23:26:06Z

the opencog development standard wiki is stupid. Its a bunch of really bad
rules for development. Read it, and do the opposite.

On 17 May 2014 18:24, Linas Vepstas [email protected] wrote:

relex uses tabs not spaces.

On 16 May 2014 20:32, William Ma [email protected] wrote:

The changeset are in the "Rewritten rule engine" commit. The other one
only fixes the spacing and changed all tabs to spaces according to OpenCog
development standard on wiki.

—
Reply to this email directly or view it on GitHubhttps://github.com//pull/37#issuecomment-43394080
.

linas · 2014-05-17T23:26:37Z

anyway, I cannot commit this until you fix the tabs to spaces issue.

bgoertzel · 2014-05-18T02:46:25Z

Ah, sorry William. As Linas feels so strongly on the tabs vs. spaces issue and is the primary maintainer of RelEx, let us make subsequent patches to RelEx using tabs not spaces ;-p

Linas -- as William pointed out, the whitespace changes are different from the "Additional Eclipse whitespace cleanup" commit. Can you just merge his changes prior to the Additional Eclipse Whitespace commit, or does he need to revert the latter so you can review the former?

In any case, we do need William's non-whitespace changes merged QUICKLY, because they fix serious bugs that were rendering RelEx2Logic close to un-usable.

Linas, I know you think "4-space indentation is stupid" (quote from one of your previous emails), but as you know there are famous programmers (and numerous programmers of varying levels of fame) on both sides of the debate. One day a superhuman AI kindergartner, while looking back thru archives of the Internet of the obsolete human species, will observe the amount of emotion some humans attached to the Tabs vs. Spaces debate and be really, really baffled.

William: For entertainment, you may wish to look at such discussions as http://programmers.stackexchange.com/questions/57/tabs-versus-spaceswhat-is-the-proper-indentation-character-for-everything-in-e and http://forums.xkcd.com/viewtopic.php?f=40&t=15328 and many others ... clearly Linas is not the only coder in the world with a strong view on the topic.

Linas: Regarding the OpenCog development guidelines, I'm curious what you protest to there besides the Tabs vs. Spaces issue? But this particular comment thread probably isn't the best place for such a discussion, it would be better on the OpenCog email list.

I note that William is a new employee on the OpenCog HK project, who is doing a good job and taking a professional approach. He just fixed a bunch of bugs in the RelEx2Logic output generator, some of them quite deep logic errors in the code. Regarding indentation, he simply looked at the OpenCog development guidelines on the wiki and figured following them would be the right thing. I doubt he has any strong emotional attachment to Tabs or Spaces ;-p ....

On the other hand, "the opencog development standard wiki is stupid. Its a bunch of really bad
rules for development. Read it, and do the opposite." are not very good new-contributor instructions.... For example, the guidelines advocate commenting code, so what is the opposite? Not commenting any code?

bgoertzel · 2014-05-18T05:28:57Z

?? Now Github says "bgoertzel closed this 3 hours ago" but I didn't mean to -- my browser / bad net connection are playing weird trix today ;o

linas · 2014-05-18T19:25:40Z

I am unable to actually review what the actual changes are, because the whitespace changes obscure everything. Its a form of code obfuscation.

The general guideline for modifying code, used by almost all projects, is to adhere to the coding style currently used in a file, rather than changing that style. I see no need to obfuscate or hide the changes being made!

This reverts commit 61fd762.

williampma · 2014-05-19T02:31:52Z

I have revert the "tabs to space" commit. For the record, there's nothing hidden there. It's just a "search and replace" of tabs and replace with 4 spaces.

githart · 2014-05-19T03:27:47Z

No worries, the 'obfuscate or hide' comments were just Linas being hyperbolic again :-P although his point stands about keeping whitespace changes unmingled with regular PRs and batches of commits whenever possible.

githart · 2014-05-19T14:57:33Z

If anyone needs these commits urgently before they're merged into master, I can help out with the necessary git commands. Clicking on below 'You can also merge branches on the command line' is also a starting point.

Using working branch naming as described at http://wiki.opencog.org/w/Development_standards#Git_HOWTO can make the process a bit less confusing ('master' becomes a less overloaded term).

linas · 2014-05-21T02:51:32Z

src/java/relex/output/LogicProcessor.java

+		}
+
+		/**
+		 * This method is used for "sf-links", which is needed for Stanford-relations such as _det.


this does not seem right, none of the relex2logic code should be using or applying any of the stanford relations. Thos relations don't do anything "interesting", they only generate output that is compatible with the stanford parser, and nothing more.

to be clear, by default, none of the stanfrd processing is applied unless you ask for it with a special flag ... I'm pretty sure you are not using that flag, and you shouldn't -- it will only slow things down.

Some of the RelEx2Logic rules requires this though. For example the det-rule is using relation _det that can only be reached by following sf-links (you will not find _det anywhere by following the normal "links"). sf-links are available on the Feature Structure I am testing on, so it seems be generated. On the other hand I was using the plain-text-server.sh script to start the Relex server, maybe it's different from using the opencog-server.sh script.

Hi William,

The _det is marked in the RelEx rule file as well

On Wednesday, May 21, 2014, William Ma [email protected] wrote:

In src/java/relex/output/LogicProcessor.java:

*/

public Boolean BinaryHeadCB(FeatureNode parentNode)

{

return ApplyToLink(parentNode, "links");

}

/**

\* This method is not needed.

*/

public Boolean BinaryRelationCB(String relName, FeatureNode srcNode, FeatureNode tgtNode)

{

return false;

}

/**

\* This method is used for "sf-links", which is needed for Stanford-relations such as _det.

Some of the RelEx2Logic rules requires this though. For example the
det-rule is using relation _det that can only be reached by following
sf-links (you will not find _det anywhere by following the normal "links").
sf-links are available on the Feature Structure I am testing on, so it
seems be generated. On the other hand I was using the plain-text-server.sh
script to start the Relex server, maybe it's different from using the
opencog-server.sh script.

—
Reply to this email directly or view it on GitHubhttps://github.com//pull/37/files#r12879478
.

Ruiting Lian

Hmm. Yes, I'm pretty sure no one sets the --stanford flag on the opencog server.

If you only need the det rule, maybe it should just be moved over to the usual relex ruleset. The problem is that each rule takes some number of milliseconds to run, and the more there are, the slower it all gets. its already at 3-5 seconds per sentence, which is shockingly slow.

Alright, it looks like plain-text-server.sh is slightly different from opencog-server.sh. Starting with opencog-server.sh I no longer gets sf-links. However, _det disappeared from the Feature Structure as well, so something strange is going on there.

well, things are placed in sf-links only if the --stanford flag is given on the command line. I think (not sure) that flag works with any of the servers. That flag causes the "stanford-algs.txt" to be applied.

p.s. you might find it easiest to not use a server during development, but just run relation-extractor.sh one or two sentences at a time.

I figured why I thought _det disappeared from the Feature Structure. In fact, it appears correctly for sentences such as "I ate those apples." for the word "those". However, for sentences such as "I ate an apple.", I used to get _det rules for the word "an" as well using sf-links. I suppose this is intentional.

Hi William,

Yes, it's intentional. RelEx only generates _det for definite article "the"
and "that/this/those/these/...", to specify the definite concepts from the
general concepts for the later stage's purpose..

Ruiting Lian

On Wed, May 21, 2014 at 6:27 PM, William Ma [email protected]:

In src/java/relex/output/LogicProcessor.java:

*/

public Boolean BinaryHeadCB(FeatureNode parentNode)

{

return ApplyToLink(parentNode, "links");

}

/**

\* This method is not needed.

*/

public Boolean BinaryRelationCB(String relName, FeatureNode srcNode, FeatureNode tgtNode)

{

return false;

}

/**

\* This method is used for "sf-links", which is needed for Stanford-relations such as _det.

I figured why I thought _det disappeared from the Feature Structure. In
fact, it appears correctly for sentences such as "I ate those apples." for
the word "those". However, for sentences such as "I ate an apple.", I used
to get _det rules for the word "an" as well using sf-links. I suppose this
is intentional.

—
Reply to this email directly or view it on GitHubhttps://github.com//pull/37/files#r12889519
.

Rewritten RelEx2Logic rule engine

linas · 2014-05-21T02:53:59Z

thanks for fixing the spacing issues.

william added 2 commits May 16, 2014 11:38

Rewritten rule engine

c7fe40d

Misc cleanup

f2ad1d3

Additional Eclipse whitespace cleanup

61fd762

bgoertzel closed this May 18, 2014

githart reopened this May 18, 2014

Revert "Additional Eclipse whitespace cleanup"

89076f4

This reverts commit 61fd762.

linas reviewed May 21, 2014
View reviewed changes

linas added a commit that referenced this pull request May 21, 2014

Merge pull request #37 from williampma/master

06c2170

Rewritten RelEx2Logic rule engine

linas merged commit 06c2170 into opencog:master May 21, 2014

linas mentioned this pull request May 26, 2014

RelEx2Logic rule engine not handling multi-conditional rule #33

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Rewritten RelEx2Logic rule engine #37

Rewritten RelEx2Logic rule engine #37

williampma commented May 16, 2014

ruiting commented May 16, 2014

plus a few others I didn't report.

linas commented May 16, 2014

williampma commented May 17, 2014

linas commented May 17, 2014

linas commented May 17, 2014

linas commented May 17, 2014

bgoertzel commented May 18, 2014

bgoertzel commented May 18, 2014

linas commented May 18, 2014

williampma commented May 19, 2014

githart commented May 19, 2014

githart commented May 19, 2014

linas May 21, 2014

linas May 21, 2014

williampma May 21, 2014

ruiting May 21, 2014

linas May 21, 2014

williampma May 21, 2014

linas May 21, 2014

linas May 21, 2014

williampma May 21, 2014

ruiting May 21, 2014

linas commented May 21, 2014

Rewritten RelEx2Logic rule engine #37

Rewritten RelEx2Logic rule engine #37

Conversation

williampma commented May 16, 2014

ruiting commented May 16, 2014

plus a few others I didn't report.

linas commented May 16, 2014

williampma commented May 17, 2014

linas commented May 17, 2014

linas commented May 17, 2014

linas commented May 17, 2014

bgoertzel commented May 18, 2014

bgoertzel commented May 18, 2014

linas commented May 18, 2014

williampma commented May 19, 2014

githart commented May 19, 2014

githart commented May 19, 2014

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

linas commented May 21, 2014