What about html(5?) support? #28

Open
trikko opened this issue Aug 1, 2016 · 7 comments

Comments

@trikko

trikko commented Aug 1, 2016

I wonder whether it would be too difficult to also support HTML5. IMO it would be a good idea for web-related applications.

@Hackerpilot

HTML is not XML. I don't think this is a reasonable feature request.

For further information about the madness that HTML supports, check out the spec here: https://www.w3.org/TR/html5/syntax.html#tree-construction. Note the gigantic state machine specified for parsing malformed tags.

@Hackerpilot

Of course if your HTML input also happens to be XHTML, then there shouldn't be a problem.
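
To make the HTML-versus-XHTML distinction concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree` as a stand-in for a strict XML parser. It is not this library's API; the snippets and the helper name `parses_as_xml` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Typical HTML5: the void element <br> is never closed, so this is not well-formed XML.
HTML5_SNIPPET = "<p>first line<br>second line</p>"
# The same content written as XHTML: <br/> is self-closed, so it is well-formed XML.
XHTML_SNIPPET = "<p>first line<br/>second line</p>"


def parses_as_xml(markup: str) -> bool:
    """Return True if a strict XML parser accepts the markup (hypothetical helper)."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


print(parses_as_xml(HTML5_SNIPPET))  # rejected: <br> is still open when </p> arrives
print(parses_as_xml(XHTML_SNIPPET))  # accepted
```

A strict XML parser has no notion of void elements or implied end tags, which is part of what the HTML tree-construction rules have to handle.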

@trikko
Author

trikko commented Aug 1, 2016

I know it's not the same. But maybe at least XHTML 5 could be interesting.

@trikko
Author

trikko commented Aug 1, 2016

(anyway: I don't care too much about parsing malformed HTML and fixing it. It would be interesting to have DOM-related functions for tree manipulation and for outputting valid HTML5)
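
As a rough sketch of that workflow (parse well-formed input, manipulate the tree, write HTML-style output), here is what it looks like with Python's standard `xml.etree.ElementTree`. This is only an analogy, not this library's API; the input markup and the helper name `to_html` are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Well-formed XHTML input (made-up example markup).
XHTML = '<ul id="menu"><li>Home</li><li>About<br/></li></ul>'

root = ET.fromstring(XHTML)

# Tree manipulation: append a new <li> child and set an attribute on the root.
item = ET.SubElement(root, "li")
item.text = "Contact"
root.set("class", "nav")


def to_html(element: ET.Element) -> str:
    """Serialize with HTML rules, e.g. void elements get no closing tag (hypothetical helper)."""
    return ET.tostring(element, encoding="unicode", method="html")


html_out = to_html(root)
print(html_out)
```

With `method="html"`, the serializer writes the void element as `<br>` rather than `<br/>`, which is the kind of output-side difference an XML library would need to handle in order to emit HTML5.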

@rjmcguire

On Mon, Aug 1, 2016 at 10:21 AM, Andrea Fontana wrote:

> (anyway: I don't care too much about parsing malformed HTML and fixing it.
> It would be interesting to have DOM-related functions for tree manipulation
> and for outputting valid HTML5)


+1. I think it would be irresponsible for the definitive standard XML
parser to fix up dodgy HTML or XML; there are tools for that.
That said, how hard it is to do HTML5 parsing and output with the standard
library will be important to validate during the experimental phase of this library.

@lodo1995
Owner

lodo1995 commented Aug 1, 2016

@trikko as @Hackerpilot said, it's not possible to parse all HTML with an XML parser. The idea is to keep the components of the library as independent and generic as possible. So, for example, the parser and cursor do not check for correct element nesting. The parser doesn't even need to parse attributes. So this library already provides some building blocks to parse HTML.
If your HTML happens to be XHTML, then you can even use this library to build a DOM. You can use the provided DOM implementation, which will have full DOM Level 3 support. Or you can create a custom DOM hierarchy with advanced HTML/SVG/whatever-you-need support, based on the provided one, and then have the provided DOMBuilder build it.
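
The layering idea can be sketched as follows, using Python's standard `html.parser.HTMLParser` purely as an analogy for a low-level component that emits events without checking nesting. None of these names come from this library; `EventCollector`, `tokenize`, and `is_well_nested` are hypothetical.

```python
from html.parser import HTMLParser


class EventCollector(HTMLParser):
    """Low-level layer: records tag and text events, with no nesting checks."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        if data.strip():
            self.events.append(("text", data))


def tokenize(markup):
    """Run the low-level layer and return its event list."""
    collector = EventCollector()
    collector.feed(markup)
    collector.close()
    return collector.events


def is_well_nested(events):
    """Optional higher layer: check that every end tag closes the most recent open tag."""
    stack = []
    for kind, name in events:
        if kind == "start":
            stack.append(name)
        elif kind == "end":
            if not stack or stack.pop() != name:
                return False
    return not stack


events = tokenize("<b><i>mixed</b></i>")
print(events)
print(is_well_nested(events))
```

Because the nesting check lives in a separate layer, the same low-level events can feed either a strict XML-style consumer or a more forgiving HTML-style one.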

@wilzbach

This issue was moved to dlang-community/experimental.xml#10
