Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewritten RelEx2Logic rule engine #37

Merged
merged 4 commits into from
May 21, 2014
Merged

Conversation

williampma
Copy link
Member

Rewritten rule engine, fixing issue #23, #32, #33, plus a few others I didn't report.

@ruiting
Copy link
Contributor

ruiting commented May 16, 2014

Reviewed.

The fix modified the buggy graph traversing algorithm of the RelEx2Logic
rule engine, using RelationCallback to crawl the Feature Structure..

Ruiting Lian

On Fri, May 16, 2014 at 11:56 AM, william [email protected] wrote:

Rewritten rule engine, fixing issue #23#23,
#32 #32, #33#33,

plus a few others I didn't report.

You can merge this Pull Request by running

git pull https://github.com/williampma/relex master

Or view, comment on, or merge it at:

#37
Commit Summary

  • Rewritten rule engine
  • Misc cleanup

File Changes

Patch Links:


Reply to this email directly or view it on GitHubhttps://github.com//pull/37
.

@linas
Copy link
Member

linas commented May 16, 2014

Can you please fix the indentation, so that its clear what actually changed, and what didn't?

It seems like the original file used tab indentation, which you changed to spaces. Or something like that. I can't commit when the changeset is completely garbled like this.

You can do this on this same branch, you don;'t need to create a new branch; just push additional commits to this branch.

@williampma
Copy link
Member Author

The changeset are in the "Rewritten rule engine" commit. The other one only fixes the spacing and changed all tabs to spaces according to OpenCog development standard on wiki.

@linas
Copy link
Member

linas commented May 17, 2014

relex uses tabs not spaces.

On 16 May 2014 20:32, William Ma [email protected] wrote:

The changeset are in the "Rewritten rule engine" commit. The other one
only fixes the spacing and changed all tabs to spaces according to OpenCog
development standard on wiki.


Reply to this email directly or view it on GitHubhttps://github.com//pull/37#issuecomment-43394080
.

@linas
Copy link
Member

linas commented May 17, 2014

the opencog development standard wiki is stupid. Its a bunch of really bad
rules for development. Read it, and do the opposite.

On 17 May 2014 18:24, Linas Vepstas [email protected] wrote:

relex uses tabs not spaces.

On 16 May 2014 20:32, William Ma [email protected] wrote:

The changeset are in the "Rewritten rule engine" commit. The other one
only fixes the spacing and changed all tabs to spaces according to OpenCog
development standard on wiki.


Reply to this email directly or view it on GitHubhttps://github.com//pull/37#issuecomment-43394080
.

@linas
Copy link
Member

linas commented May 17, 2014

anyway, I cannot commit this until you fix the tabs to spaces issue.

@bgoertzel
Copy link

Ah, sorry William. As Linas feels so strongly on the tabs vs. spaces issue and is the primary maintainer of RelEx, let us make subsequent patches to RelEx using tabs not spaces ;-p

Linas -- as William pointed out, the whitespace changes are different from the "Additional Eclipse whitespace cleanup" commit. Can you just merge his changes prior to the Additional Eclipse Whitespace commit, or does he need to revert the latter so you can review the former?

In any case, we do need William's non-whitespace changes merged QUICKLY, because they fix serious bugs that were rendering RelEx2Logic close to un-usable.

Linas, I know you think "4-space indentation is stupid" (quote from one of your previous emails), but as you know there are famous programmers (and numerous programmers of varying levels of fame) on both sides of the debate. One day a superhuman AI kindergartner, while looking back thru archives of the Internet of the obsolete human species, will observe the amount of emotion some humans attached to the Tabs vs. Spaces debate and be really, really baffled.

William: For entertainment, you may wish to look at such discussions as http://programmers.stackexchange.com/questions/57/tabs-versus-spaceswhat-is-the-proper-indentation-character-for-everything-in-e and http://forums.xkcd.com/viewtopic.php?f=40&t=15328 and many others ... clearly Linas is not the only coder in the world with a strong view on the topic.

Linas: Regarding the OpenCog development guidelines, I'm curious what you protest to there besides the Tabs vs. Spaces issue? But this particular comment thread probably isn't the best place for such a discussion, it would be better on the OpenCog email list.

I note that William is a new employee on the OpenCog HK project, who is doing a good job and taking a professional approach. He just fixed a bunch of bugs in the RelEx2Logic output generator, some of them quite deep logic errors in the code. Regarding indentation, he simply looked at the OpenCog development guidelines on the wiki and figured following them would be the right thing. I doubt he has any strong emotional attachment to Tabs or Spaces ;-p ....

On the other hand, "the opencog development standard wiki is stupid. Its a bunch of really bad
rules for development. Read it, and do the opposite." are not very good new-contributor instructions.... For example, the guidelines advocate commenting code, so what is the opposite? Not commenting any code?

@bgoertzel bgoertzel closed this May 18, 2014
@bgoertzel
Copy link

?? Now Github says "bgoertzel closed this 3 hours ago" but I didn't mean to -- my browser / bad net connection are playing weird trix today ;o

@githart githart reopened this May 18, 2014
@linas
Copy link
Member

linas commented May 18, 2014

I am unable to actually review what the actual changes are, because the whitespace changes obscure everything. Its a form of code obfuscation.

The general guideline for modifying code, used by almost all projects, is to adhere to the coding style currently used in a file, rather than changing that style. I see no need to obfuscate or hide the changes being made!

@williampma
Copy link
Member Author

I have revert the "tabs to space" commit. For the record, there's nothing hidden there. It's just a "search and replace" of tabs and replace with 4 spaces.

@githart
Copy link
Member

githart commented May 19, 2014

No worries, the 'obfuscate or hide' comments were just Linas being hyperbolic again :-P although his point stands about keeping whitespace changes unmingled with regular PRs and batches of commits whenever possible.

@githart
Copy link
Member

githart commented May 19, 2014

If anyone needs these commits urgently before they're merged into master, I can help out with the necessary git commands. Clicking on below 'You can also merge branches on the command line' is also a starting point.

Using working branch naming as described at http://wiki.opencog.org/w/Development_standards#Git_HOWTO can make the process a bit less confusing ('master' becomes a less overloaded term).

}

/**
* This method is used for "sf-links", which is needed for Stanford-relations such as _det.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this does not seem right, none of the relex2logic code should be using or applying any of the stanford relations. Thos relations don't do anything "interesting", they only generate output that is compatible with the stanford parser, and nothing more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to be clear, by default, none of the stanfrd processing is applied unless you ask for it with a special flag ... I'm pretty sure you are not using that flag, and you shouldn't -- it will only slow things down.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of the RelEx2Logic rules requires this though. For example the det-rule is using relation _det that can only be reached by following sf-links (you will not find _det anywhere by following the normal "links"). sf-links are available on the Feature Structure I am testing on, so it seems be generated. On the other hand I was using the plain-text-server.sh script to start the Relex server, maybe it's different from using the opencog-server.sh script.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi William,

The _det is marked in the RelEx rule file as well

On Wednesday, May 21, 2014, William Ma [email protected] wrote:

In src/java/relex/output/LogicProcessor.java:

  •    */
    
  •   public Boolean BinaryHeadCB(FeatureNode parentNode)
    
  •   {
    
  •       return ApplyToLink(parentNode, "links");
    
  •   }
    
  •   /**
    
  •    \* This method is not needed.
    
  •    */
    
  •   public Boolean BinaryRelationCB(String relName, FeatureNode srcNode, FeatureNode tgtNode)
    
  •   {
    
  •       return false;
    
  •   }
    
  •   /**
    
  •    \* This method is used for "sf-links", which is needed for Stanford-relations such as _det.
    

Some of the RelEx2Logic rules requires this though. For example the
det-rule is using relation _det that can only be reached by following
sf-links (you will not find _det anywhere by following the normal "links").
sf-links are available on the Feature Structure I am testing on, so it
seems be generated. On the other hand I was using the plain-text-server.sh
script to start the Relex server, maybe it's different from using the
opencog-server.sh script.


Reply to this email directly or view it on GitHubhttps://github.com//pull/37/files#r12879478
.

Ruiting Lian

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. Yes, I'm pretty sure no one sets the --stanford flag on the opencog server.

If you only need the det rule, maybe it should just be moved over to the usual relex ruleset. The problem is that each rule takes some number of milliseconds to run, and the more there are, the slower it all gets. its already at 3-5 seconds per sentence, which is shockingly slow.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, it looks like plain-text-server.sh is slightly different from opencog-server.sh. Starting with opencog-server.sh I no longer gets sf-links. However, _det disappeared from the Feature Structure as well, so something strange is going on there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, things are placed in sf-links only if the --stanford flag is given on the command line. I think (not sure) that flag works with any of the servers. That flag causes the "stanford-algs.txt" to be applied.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

p.s. you might find it easiest to not use a server during development, but just run relation-extractor.sh one or two sentences at a time.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I figured why I thought _det disappeared from the Feature Structure. In fact, it appears correctly for sentences such as "I ate those apples." for the word "those". However, for sentences such as "I ate an apple.", I used to get _det rules for the word "an" as well using sf-links. I suppose this is intentional.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi William,

Yes, it's intentional. RelEx only generates _det for definite article "the"
and "that/this/those/these/...", to specify the definite concepts from the
general concepts for the later stage's purpose..

Ruiting Lian

On Wed, May 21, 2014 at 6:27 PM, William Ma [email protected]:

In src/java/relex/output/LogicProcessor.java:

  •    */
    
  •   public Boolean BinaryHeadCB(FeatureNode parentNode)
    
  •   {
    
  •       return ApplyToLink(parentNode, "links");
    
  •   }
    
  •   /**
    
  •    \* This method is not needed.
    
  •    */
    
  •   public Boolean BinaryRelationCB(String relName, FeatureNode srcNode, FeatureNode tgtNode)
    
  •   {
    
  •       return false;
    
  •   }
    
  •   /**
    
  •    \* This method is used for "sf-links", which is needed for Stanford-relations such as _det.
    

I figured why I thought _det disappeared from the Feature Structure. In
fact, it appears correctly for sentences such as "I ate those apples." for
the word "those". However, for sentences such as "I ate an apple.", I used
to get _det rules for the word "an" as well using sf-links. I suppose this
is intentional.


Reply to this email directly or view it on GitHubhttps://github.com//pull/37/files#r12889519
.

linas added a commit that referenced this pull request May 21, 2014
Rewritten RelEx2Logic rule engine
@linas linas merged commit 06c2170 into opencog:master May 21, 2014
@linas
Copy link
Member

linas commented May 21, 2014

thanks for fixing the spacing issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants