DefaultDOMSource Document#getElementById() doesn't work #70

Open
soundasleep opened this issue Dec 13, 2021 · 1 comment

soundasleep commented Dec 13, 2021

I found that if you try to load a Document via DefaultDOMSource, #getElementById() always returns null.

As far as I can tell, this is because cssbox uses NekoHTML as its parser, with Xerces underneath. Xerces only lets getElementById() match attributes that have been typed as ID, which normally requires a DTD or schema that declares them, and NekoHTML isn't set up to do that, so id="..." attributes are never registered. I think?
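The same behaviour can be reproduced without NekoHTML, which suggests the cause is attribute typing and not anything specific to cssbox. The following is a minimal sketch using only the JDK's built-in DOM parser; the class name `IdLookupDemo` and its method names are made up for illustration, and it assumes the JDK's default (Xerces-derived) parser implementation:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class IdLookupDemo {
    /** Parses a tiny document with no DTD, so no attribute is typed as ID. */
    public static Document parse() throws Exception {
        String xml = "<html><body><div id=\"main\">hello</div></body></html>";
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    /** Looks up the element while its id attribute is still an ordinary attribute. */
    public static Element lookupUntyped() throws Exception {
        return parse().getElementById("main");
    }

    /** Flags the attribute as an ID (DOM Level 3) and then looks it up again. */
    public static Element lookupTyped() throws Exception {
        Document doc = parse();
        Element div = (Element) doc.getElementsByTagName("div").item(0);
        div.setIdAttribute("id", true);
        return doc.getElementById("main");
    }

    public static void main(String[] args) throws Exception {
        System.out.println("untyped lookup: " + lookupUntyped());
        System.out.println("typed lookup:   " + lookupTyped().getTagName());
    }
}
```

The untyped lookup comes back as null even though the element is in the tree, while the typed lookup finds the div. That matches the symptom described above.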

However, I did find a fix on SourceForge that works by adding a custom filter to NekoHTML:

    @Override
    public Document parse() throws SAXException, IOException
    {
        // temporary NekoHTML fix until nekohtml gets fixed
        if (!neko_fixed)
        {
            HTMLElements.Element li = HTMLElements.getElement(HTMLElements.LI);
            HTMLElements.Element[] oldparents = li.parent;
            li.parent = new HTMLElements.Element[oldparents.length + 1];
            for (int i = 0; i < oldparents.length; i++)
                li.parent[i] = oldparents[i];
            li.parent[oldparents.length] = HTMLElements.getElement(HTMLElements.MENU);
            neko_fixed = true;
        }
        
        // start tweak
        HTMLConfiguration config = new HTMLConfiguration();
        XMLDocumentFilter idEnhancer = new DefaultFilter() {
            @Override
            public void startElement(QName element, XMLAttributes attributes, Augmentations augs) throws XNIException {
                int idx = attributes.getIndex("id");
                if (idx > -1) {
                    attributes.setType(idx, "ID");
                    Augmentations attrsAugs = attributes.getAugmentations(idx);
                    attrsAugs.putItem(Constants.ATTRIBUTE_DECLARED, Boolean.TRUE);
                }
                super.startElement(element, attributes, augs);
            }
        };
        XMLDocumentFilter[] filters = { idEnhancer };
        config.setProperty("http://cyberneko.org/html/properties/filters", filters);
        // end tweak
        
        DOMParser parser = new DOMParser(config);
        parser.setProperty("http://cyberneko.org/html/properties/names/elems", "lower");
        if (charset != null)
            parser.setProperty("http://cyberneko.org/html/properties/default-encoding", charset);
        parser.parse(new org.xml.sax.InputSource(getDocumentSource().getInputStream()));
        return parser.getDocument();
    }

I think this could be added to DefaultDOMSource, or HTMLConfiguration, but I'd imagine you'd want to add test cases as well, and I'm not sure what the implications of this might be.

@soundasleep

Update: if you're trying to find IDs for elements that are naturally empty (such as <input>), it turns out the filter receives empty elements and normal elements through separate callbacks (emptyElement() and startElement()), so both need to be overridden. The XMLDocumentFilter should instead be:

XMLDocumentFilter idEnhancer = new DefaultFilter() {
	/**
	 * Makes #getElementById() work on any set of attributes
	 */
	private void possiblyAddIdAttribute(XMLAttributes attributes) {
		int idx = attributes.getIndex("id");
		if (idx > -1) {
			attributes.setType(idx, "ID");
			Augmentations attrsAugs = attributes.getAugmentations(idx);
			attrsAugs.putItem(Constants.ATTRIBUTE_DECLARED, Boolean.TRUE);
		}
	}
	
	@Override
	public void startElement(QName element, XMLAttributes attributes, Augmentations augs) throws XNIException {
		possiblyAddIdAttribute(attributes);
		super.startElement(element, attributes, augs);
	}

	@Override
	public void emptyElement(QName element, XMLAttributes attributes, Augmentations augs) throws XNIException {
		possiblyAddIdAttribute(attributes);
		super.emptyElement(element, attributes, augs);
	}
};
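For anyone who can't change how the document is parsed, a possible fallback is to type the id attributes after parsing. This is a different technique from the NekoHTML filter above: it walks the finished DOM and calls the standard DOM Level 3 `Element.setIdAttribute()` on each element. The sketch below uses the JDK parser so it runs on its own. The names `IdAttributeMarker`, `markIdAttributes` and `parseSample` are hypothetical, and I haven't checked how this interacts with cssbox itself:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class IdAttributeMarker {
    /**
     * Recursively flags every "id" attribute at or below the given node as an
     * ID attribute, and returns how many attributes were flagged.
     */
    public static int markIdAttributes(Node node) {
        int marked = 0;
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            Element element = (Element) node;
            if (element.hasAttribute("id")) {
                element.setIdAttribute("id", true);
                marked++;
            }
        }
        // Empty elements such as <input/> are ordinary nodes here, so no special case is needed.
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            marked += markIdAttributes(child);
        }
        return marked;
    }

    /** Parses a small sample containing both a normal and an empty element with ids. */
    public static Document parseSample() throws Exception {
        String xml = "<html><body><form id=\"f\"><input id=\"q\"/></form></body></html>";
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parseSample();
        int marked = markIdAttributes(doc);
        System.out.println("marked " + marked + " id attributes");
        System.out.println("lookup of q: " + doc.getElementById("q").getTagName());
    }
}
```

The trade-off is an extra pass over the whole tree after parsing, whereas the filter approach types the attributes while the document is being built.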
