-
Notifications
You must be signed in to change notification settings - Fork 317
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace hpricot with nokogiri in wordpressdotcom importer #555
Conversation
948b826
to
c68ab98
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ruby changes LGTM except for some style issues in changes to the test file.
However I came across an issue on running the importer with the mock XML at my end. All the imported posts have the following markup which IMHO, would be best if addressed prior to merging:
<p><!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd"><br />
<!-- wp:paragraph --><html><body></p>
<!DOCTYPE>
, <html>
, <body>
as children of a <p>
tag is a 👎
A small UI/UX issue that could be addressed in a future pull request prior to release: Current
Ideal
|
Output now looks like this:
|
c8115e0
to
1335c26
Compare
I would tend to agree, but since this is an existing issue, it might be better to fix in a separate PR that's more focused on changing that. content = Nokogiri::HTML(item.text_for("content:encoded"))
...
f.puts Util.wpautop(content.to_html) Perhaps it would be better to use |
1335c26
to
a0424df
Compare
I am not in favor of having wp_global_styless (2s') printed instead of just wp_global_styles.. perhaps in a future pull request.. |
a0424df
to
d4d2bd6
Compare
I believe it's now fixed for that case.
|
d4d2bd6
to
92face8
Compare
hpricot is no longer supported and doesn't build on some modern systems like Mac. Our last use of Hpricot was wordpressdotcom importer, which is now converted to use nokogiri. Rephrase wordpressdotcom importer alternative caveat. The wordpressdotcom importer does import pages, posts, images, etc so previous statement was untrue.
92face8
to
b5a7e9a
Compare
@jekyllbot: merge +minor |
Parker Moore: Replace hpricot with nokogiri in wordpressdotcom (#555) Merge pull request 555
Fixes #554.
hpricot is no longer supported and doesn't build on some modern systems like Mac. Our last use of Hpricot was wordpressdotcom importer, which is now converted to use nokogiri.
I actually created a simple website on wordpress.com and exported the XML file to test this. It now matches the old system. I added the exported XML as a mock here so that we could run the importer in tests and assert on the output.