Exclusion Question #67

Open
KnightmareVIIVIIXC opened this issue May 14, 2024 · 2 comments

@KnightmareVIIVIIXC

I'm trying to figure out whether there's an easy way, or something I'm overlooking, to keep only the subdomain entries and drop the main domain entry. I'm using doubleclick.net as an example. The JSON config below points at an exclusion file whose only entry is ||doubleclick.net^, because I thought that would prevent hostlistcompiler from compressing everything down to the top domain:

{
	"name": "Test",
	"sources": [{
		"source": "https://someonewhocares.org/hosts/zero/hosts",
		"transformations": ["Validate", "RemoveModifiers"]
	}],
	"transformations": ["Compress", "RemoveComments", "Deduplicate", "RemoveEmptyLines", "TrimLines"],
	"exclusions_sources": ["exclude.txt"]
}

However, that isn't what happens: it still compresses to ||doubleclick.net^ and removes the subdomain entries. I'm also very tired, so, like I said, I could be overlooking something very obvious.

@hagezi

hagezi commented May 14, 2024

@KnightmareVIIVIIXC

The exclusions seem to be applied to the source format, in your case hosts. Therefore, you must include 0.0.0.0 doubleclick.net in the exclude.txt:

hostlist-compiler -v -c test.json -o output.txt | grep 'doubleclick'
› 0.0.0.0 doubleclick.net excluded by 0.0.0.0 doubleclick.net
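
To illustrate why the adblock-style entry never matched, here is a toy model of the behavior described above. It assumes (this is not confirmed against the hostlist-compiler source) that a plain exclusion entry is compared as literal text against each raw source line, before any conversion happens. `apply_exclusions` is a hypothetical helper, and `sub1.doubleclick.net` is just a placeholder hostname.

```python
def apply_exclusions(lines, exclusions):
    """Drop every line that contains one of the exclusion strings.

    Assumption: plain exclusions are literal substring matches on the raw
    source line. This is a toy model, not hostlist-compiler's real code.
    """
    kept = []
    for line in lines:
        text = line.strip()
        if any(pattern in text for pattern in exclusions):
            continue  # the line matched an exclusion, so it is dropped
        kept.append(text)
    return kept


hosts_lines = [
    "0.0.0.0 doubleclick.net",
    "0.0.0.0 sub1.doubleclick.net",
]

# Adblock-style entry: matches nothing, because the source is in hosts format.
adblock_result = apply_exclusions(hosts_lines, ["||doubleclick.net^"])

# Hosts-style entry: removes the base domain and keeps the subdomain.
hosts_result = apply_exclusions(hosts_lines, ["0.0.0.0 doubleclick.net"])
```

Under that assumption, the first call leaves both lines in place and the second leaves only the subdomain line, which matches the log output shown above.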

@KnightmareVIIVIIXC

KnightmareVIIVIIXC commented Jun 30, 2024

I just thought of something that would make my train of thought work: a Convert transformation. It would do only the first step of the Compress process (converting hosts entries to adblock-style rules) but not the second (discarding rules made redundant by others). This way, I could convert the individual lists into the adblock format with all of the subdomains still present:


||doubleclick.net^
||sub1.doubleclick.net^
||sub2.doubleclick.net^
||sub3.doubleclick.net^
||sub4.doubleclick.net^

Then, when the global transformations at the end of the JSON file run, where I have my global exclude.txt file, hopefully it would read that file, see that I don't want ||doubleclick.net^, and keep only the subdomain entries, so the final result would be:


||sub1.doubleclick.net^
||sub2.doubleclick.net^
||sub3.doubleclick.net^
||sub4.doubleclick.net^
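
A rough sketch of the idea, to make the request concrete. The "Convert" step here is the proposal from this comment, not an existing hostlist-compiler transformation, and both function names are hypothetical. The exclusion is modeled as an exact match on the converted rule, which is also an assumption.

```python
def convert_hosts_line(line):
    """Turn one hosts-format line into adblock-style rules.

    Hypothetical model of the proposed "Convert" step; it is not an
    existing hostlist-compiler transformation.
    """
    text = line.split("#", 1)[0].strip()  # drop trailing comments
    parts = text.split()
    if len(parts) < 2:
        return []  # blank line, comment, or not a hosts entry
    # parts[0] is the IP address; every following token is a hostname
    return [f"||{host}^" for host in parts[1:]]


def convert_and_exclude(lines, excluded_rules):
    """Convert every line, then drop rules listed in excluded_rules."""
    rules = []
    for line in lines:
        rules.extend(convert_hosts_line(line))
    return [rule for rule in rules if rule not in excluded_rules]


hosts_lines = [
    "0.0.0.0 doubleclick.net",
    "0.0.0.0 sub1.doubleclick.net",
    "0.0.0.0 sub2.doubleclick.net",
]
result = convert_and_exclude(hosts_lines, {"||doubleclick.net^"})
```

With the base domain excluded after conversion, only the two subdomain rules remain in `result`.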

I've been using the exclusions_sources option incorrectly, thinking that hostlistcompiler would know that, while I don't want the base domain blocked, I do want to block the subdomains.

{
	"name": "Test",
	"sources": [{
		"source": "https://someonewhocares.org/hosts/zero/hosts",
		"transformations": ["Convert", "Validate", "RemoveModifiers"]
	}],
	"transformations": ["Compress", "RemoveComments", "Deduplicate", "RemoveEmptyLines", "TrimLines"],
	"exclusions_sources": ["exclude.txt"]
}

And if the converted list looks like this:


||doubleclick.net^
||sub.doubleclick.net^
||1.sub.doubleclick.net^
||2.sub.doubleclick.net^
||3.sub.doubleclick.net^
||4.sub.doubleclick.net^

The final result would be just ||sub.doubleclick.net^, since the deeper entries are already covered by it.
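
For what it's worth, this is how the expected outcome could be modeled: exclude the base domain first, then drop any rule already covered by a broader one. It's a minimal sketch of the redundancy step as described in this thread, not hostlist-compiler's implementation, and `domain_of` and `drop_redundant` are made-up names.

```python
def domain_of(rule):
    """Extract the hostname from a rule of the form ||host^."""
    return rule[2:-1]


def drop_redundant(rules):
    """Keep only rules that are not covered by a broader rule in the list.

    A rule for sub.example.org is treated as covering 1.sub.example.org.
    This is a sketch of the idea, not hostlist-compiler's implementation.
    """
    domains = {domain_of(rule) for rule in rules}
    kept = []
    for rule in rules:
        labels = domain_of(rule).split(".")
        # Parent domains: every proper suffix made of whole labels.
        parents = {".".join(labels[i:]) for i in range(1, len(labels))}
        if parents & domains:
            continue  # a broader rule already blocks this hostname
        kept.append(rule)
    return kept


rules = [
    "||doubleclick.net^",
    "||sub.doubleclick.net^",
    "||1.sub.doubleclick.net^",
    "||2.sub.doubleclick.net^",
]

# Exclude the base domain first, then remove redundant rules.
without_base = [r for r in rules if r != "||doubleclick.net^"]
final = drop_redundant(without_base)
```

Run on the full list without the exclusion, the same function collapses everything to the base domain, which is the behavior the original question ran into.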
