two ideas for taming robots #6093
Changes from 2 commits
310b6ed
58a7e20
a244b67
d01e8b2
c3e88f4
@@ -118,6 +118,8 @@
import java.util.TreeSet;
import java.util.stream.Collectors;

import static org.labkey.api.data.DataRegion.LAST_FILTER_PARAM;

public class WikiController extends SpringActionController
{
    private static final Logger LOG = LogManager.getLogger(WikiController.class);

@@ -1122,6 +1124,14 @@ public PageAction(ViewContext ctx, Wiki wiki, WikiVersion wikiversion)
        @Override
        public ModelAndView getView(WikiNameForm form, BindException errors)
        {
            // Don't index page with non default parameters (e.g. targeting webparts in the page)
            for (var e = getViewContext().getRequest().getParameterNames() ; e.hasMoreElements() ; )
            {
                String p = e.nextElement();
                if (p.contains(".") && !LAST_FILTER_PARAM.equals(p))
                    getPageConfig().setNoIndex();
Review comments on this line:

maybe getPageConfig().setNoFollow() as well?

What's the use case here? An HTML wiki that uses

Exactly. This certainly isn't the only place this could be relevant (portal pages are probably more important). But letting the crawler go nuts on the pages with multiple data regions seems to exacerbate the combinatorics.

Open to other ideas, just trying to keep up with the bots.

This seems like it's worth trying. It shouldn't hurt and might help.
            }

            String name = null != form.getName() ? form.getName().trim() : null;
            //if there's no name parameter, find default page and reload with parameter.
            //default page is not necessarily same page displayed in wiki web part

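For anyone who wants to poke at the parameter check outside the controller, here is a minimal standalone sketch of the same rule. The class name `NoIndexCheck`, the method `shouldNoIndex`, the value given to `LAST_FILTER_PARAM`, and the example parameter `query.sort` are all assumptions for illustration. The real constant comes from `DataRegion`, and the real code calls `getPageConfig().setNoIndex()` instead of returning a boolean.

```java
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

class NoIndexCheck
{
    // Assumed value, for illustration only; the PR imports the real constant
    // from org.labkey.api.data.DataRegion.
    static final String LAST_FILTER_PARAM = ".lastFilter";

    // Returns true when any parameter name contains a "." and is not the
    // last-filter parameter, mirroring the condition in the diff.
    static boolean shouldNoIndex(Enumeration<String> parameterNames)
    {
        while (parameterNames.hasMoreElements())
        {
            String p = parameterNames.nextElement();
            if (p.contains(".") && !LAST_FILTER_PARAM.equals(p))
                return true; // one match is enough, so stop early
        }
        return false;
    }

    public static void main(String[] args)
    {
        // Plain wiki page request: only the default "name" parameter.
        System.out.println(shouldNoIndex(Collections.enumeration(List.of("name"))));
        // Hypothetical data-region parameter targeting a webpart in the page.
        System.out.println(shouldNoIndex(Collections.enumeration(List.of("name", "query.sort"))));
        // The last-filter parameter alone is exempt.
        System.out.println(shouldNoIndex(Collections.enumeration(List.of("name", LAST_FILTER_PARAM))));
    }
}
```

Unlike the loop in the diff, this sketch returns on the first match. The result is the same, since setting noindex more than once should have no additional effect.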
It looks like this might help the Container Filter menu items? Looks like they're `href` and not JS handlers. They already have `rel="nofollow"` but that's not enough?

From my reading, nofollow doesn't actually mean "pretend this target link or page doesn't exist". It just means "don't give this link any weight in your magic SEO algorithm; I don't vouch for it". Google is going to crawl every link it can find.

I see. Why the split between the `href` and the `data-query`?

Yes, this is an important question. The interwebs seem to think Google is very good at finding links in attributes and JavaScript. I am not sure that this change is sufficient, but it seemed worth trying to separate the parts so that the varying part does not look like a URL.
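
To make the `href` / `data-query` split concrete, here is a rough sketch of what such a menu item could look like when rendered. This is not the markup from the PR. The class `MenuLinkSketch`, the method names, the example URL, and the query string are all hypothetical. The idea is only that the `href` stays constant across menu items while the varying part lives in a data attribute as a bare query fragment, to be combined by a click handler on the client.

```java
class MenuLinkSketch
{
    // Minimal HTML escaping for attribute values and text; '&' is replaced first
    // so that the entities produced by later replacements are not double-escaped.
    static String escape(String s)
    {
        return s.replace("&", "&amp;")
                .replace("\"", "&quot;")
                .replace("<", "&lt;")
                .replace(">", "&gt;");
    }

    // Renders an anchor whose href is the stable base URL and whose varying
    // query fragment is carried separately in data-query (hypothetical layout).
    static String renderMenuItem(String label, String baseUrl, String query)
    {
        return "<a href=\"" + escape(baseUrl) + "\" rel=\"nofollow\""
                + " data-query=\"" + escape(query) + "\">"
                + escape(label) + "</a>";
    }

    public static void main(String[] args)
    {
        // Both the path and the parameter names below are made up for the example.
        System.out.println(renderMenuItem(
                "Current folder",
                "/wiki/page.view",
                "query.containerFilterName=Current&name=default"));
    }
}
```

Whether a crawler still reassembles the two parts is exactly the open question raised above; the sketch only shows the separation, not evidence that it works.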