-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rewritten RelEx2Logic rule engine #37
Conversation
Reviewed. The fix modified the buggy graph traversing algorithm of the RelEx2Logic Ruiting Lian On Fri, May 16, 2014 at 11:56 AM, william [email protected] wrote:
|
Can you please fix the indentation, so that its clear what actually changed, and what didn't? It seems like the original file used tab indentation, which you changed to spaces. Or something like that. I can't commit when the changeset is completely garbled like this. You can do this on this same branch, you don;'t need to create a new branch; just push additional commits to this branch. |
The changeset are in the "Rewritten rule engine" commit. The other one only fixes the spacing and changed all tabs to spaces according to OpenCog development standard on wiki. |
relex uses tabs not spaces. On 16 May 2014 20:32, William Ma [email protected] wrote:
|
the opencog development standard wiki is stupid. Its a bunch of really bad On 17 May 2014 18:24, Linas Vepstas [email protected] wrote:
|
anyway, I cannot commit this until you fix the tabs to spaces issue. |
Ah, sorry William. As Linas feels so strongly on the tabs vs. spaces issue and is the primary maintainer of RelEx, let us make subsequent patches to RelEx using tabs not spaces ;-p Linas -- as William pointed out, the whitespace changes are different from the "Additional Eclipse whitespace cleanup" commit. Can you just merge his changes prior to the Additional Eclipse Whitespace commit, or does he need to revert the latter so you can review the former? In any case, we do need William's non-whitespace changes merged QUICKLY, because they fix serious bugs that were rendering RelEx2Logic close to un-usable. Linas, I know you think "4-space indentation is stupid" (quote from one of your previous emails), but as you know there are famous programmers (and numerous programmers of varying levels of fame) on both sides of the debate. One day a superhuman AI kindergartner, while looking back thru archives of the Internet of the obsolete human species, will observe the amount of emotion some humans attached to the Tabs vs. Spaces debate and be really, really baffled. William: For entertainment, you may wish to look at such discussions as http://programmers.stackexchange.com/questions/57/tabs-versus-spaceswhat-is-the-proper-indentation-character-for-everything-in-e and http://forums.xkcd.com/viewtopic.php?f=40&t=15328 and many others ... clearly Linas is not the only coder in the world with a strong view on the topic. Linas: Regarding the OpenCog development guidelines, I'm curious what you protest to there besides the Tabs vs. Spaces issue? But this particular comment thread probably isn't the best place for such a discussion, it would be better on the OpenCog email list. I note that William is a new employee on the OpenCog HK project, who is doing a good job and taking a professional approach. He just fixed a bunch of bugs in the RelEx2Logic output generator, some of them quite deep logic errors in the code. Regarding indentation, he simply looked at the OpenCog development guidelines on the wiki and figured following them would be the right thing. I doubt he has any strong emotional attachment to Tabs or Spaces ;-p .... On the other hand, "the opencog development standard wiki is stupid. Its a bunch of really bad |
?? Now Github says "bgoertzel closed this 3 hours ago" but I didn't mean to -- my browser / bad net connection are playing weird trix today ;o |
I am unable to actually review what the actual changes are, because the whitespace changes obscure everything. Its a form of code obfuscation. The general guideline for modifying code, used by almost all projects, is to adhere to the coding style currently used in a file, rather than changing that style. I see no need to obfuscate or hide the changes being made! |
This reverts commit 61fd762.
I have revert the "tabs to space" commit. For the record, there's nothing hidden there. It's just a "search and replace" of tabs and replace with 4 spaces. |
No worries, the 'obfuscate or hide' comments were just Linas being hyperbolic again :-P although his point stands about keeping whitespace changes unmingled with regular PRs and batches of commits whenever possible. |
If anyone needs these commits urgently before they're merged into master, I can help out with the necessary git commands. Clicking on below 'You can also merge branches on the command line' is also a starting point. Using working branch naming as described at http://wiki.opencog.org/w/Development_standards#Git_HOWTO can make the process a bit less confusing ('master' becomes a less overloaded term). |
} | ||
|
||
/** | ||
* This method is used for "sf-links", which is needed for Stanford-relations such as _det. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this does not seem right, none of the relex2logic code should be using or applying any of the stanford relations. Thos relations don't do anything "interesting", they only generate output that is compatible with the stanford parser, and nothing more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to be clear, by default, none of the stanfrd processing is applied unless you ask for it with a special flag ... I'm pretty sure you are not using that flag, and you shouldn't -- it will only slow things down.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some of the RelEx2Logic rules requires this though. For example the det-rule is using relation _det that can only be reached by following sf-links (you will not find _det anywhere by following the normal "links"). sf-links are available on the Feature Structure I am testing on, so it seems be generated. On the other hand I was using the plain-text-server.sh script to start the Relex server, maybe it's different from using the opencog-server.sh script.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi William,
The _det is marked in the RelEx rule file as well
On Wednesday, May 21, 2014, William Ma [email protected] wrote:
In src/java/relex/output/LogicProcessor.java:
*/
public Boolean BinaryHeadCB(FeatureNode parentNode)
{
return ApplyToLink(parentNode, "links");
}
/**
\* This method is not needed.
*/
public Boolean BinaryRelationCB(String relName, FeatureNode srcNode, FeatureNode tgtNode)
{
return false;
}
/**
\* This method is used for "sf-links", which is needed for Stanford-relations such as _det.
Some of the RelEx2Logic rules requires this though. For example the
det-rule is using relation _det that can only be reached by following
sf-links (you will not find _det anywhere by following the normal "links").
sf-links are available on the Feature Structure I am testing on, so it
seems be generated. On the other hand I was using the plain-text-server.sh
script to start the Relex server, maybe it's different from using the
opencog-server.sh script.—
Reply to this email directly or view it on GitHubhttps://github.com//pull/37/files#r12879478
.
Ruiting Lian
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. Yes, I'm pretty sure no one sets the --stanford flag on the opencog server.
If you only need the det rule, maybe it should just be moved over to the usual relex ruleset. The problem is that each rule takes some number of milliseconds to run, and the more there are, the slower it all gets. its already at 3-5 seconds per sentence, which is shockingly slow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, it looks like plain-text-server.sh is slightly different from opencog-server.sh. Starting with opencog-server.sh I no longer gets sf-links. However, _det disappeared from the Feature Structure as well, so something strange is going on there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, things are placed in sf-links only if the --stanford flag is given on the command line. I think (not sure) that flag works with any of the servers. That flag causes the "stanford-algs.txt" to be applied.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
p.s. you might find it easiest to not use a server during development, but just run relation-extractor.sh one or two sentences at a time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I figured why I thought _det disappeared from the Feature Structure. In fact, it appears correctly for sentences such as "I ate those apples." for the word "those". However, for sentences such as "I ate an apple.", I used to get _det rules for the word "an" as well using sf-links. I suppose this is intentional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi William,
Yes, it's intentional. RelEx only generates _det for definite article "the"
and "that/this/those/these/...", to specify the definite concepts from the
general concepts for the later stage's purpose..
Ruiting Lian
On Wed, May 21, 2014 at 6:27 PM, William Ma [email protected]:
In src/java/relex/output/LogicProcessor.java:
*/
public Boolean BinaryHeadCB(FeatureNode parentNode)
{
return ApplyToLink(parentNode, "links");
}
/**
\* This method is not needed.
*/
public Boolean BinaryRelationCB(String relName, FeatureNode srcNode, FeatureNode tgtNode)
{
return false;
}
/**
\* This method is used for "sf-links", which is needed for Stanford-relations such as _det.
I figured why I thought _det disappeared from the Feature Structure. In
fact, it appears correctly for sentences such as "I ate those apples." for
the word "those". However, for sentences such as "I ate an apple.", I used
to get _det rules for the word "an" as well using sf-links. I suppose this
is intentional.—
Reply to this email directly or view it on GitHubhttps://github.com//pull/37/files#r12889519
.
Rewritten RelEx2Logic rule engine
thanks for fixing the spacing issues. |
Rewritten rule engine, fixing issue #23, #32, #33, plus a few others I didn't report.