[Question] Instance generates gigabytes of incoming traffic while in idle #3234

Closed
Nerdmind opened this issue Aug 1, 2022 · 6 comments
Labels
question Further information is requested


@Nerdmind
Contributor

Nerdmind commented Aug 1, 2022

Hello.

After setting up my own Invidious instance I noticed that Invidious is generating approximately 2-3 GB of incoming traffic per day although no videos were playing and no user connections were made to the instance via web browser in this time span.

The instance was set up via manual installation on a Debian 11 GNU/Linux VPS. The commit used is: 210c2a8

To reproduce this, simply add some rules to the nftables firewall to count the number of packets and the network traffic in bytes generated by the UNIX user who is running Invidious. The relevant nftables rules for testing are:

table inet default {
	chain INPUT {
		type filter hook input priority 0; policy drop;
		iifname "eth0" skuid "invidious" counter;

		# [...] other non-relevant rules removed for readability
	}

	chain OUTPUT {
		type filter hook output priority 0; policy accept;
		oifname "eth0" skuid "invidious" counter;

		# [...] other non-relevant rules removed for readability
	}
}

After setting up the rules and reloading nftables, you can check the counters with nft list ruleset. In the output of this command, you'll find the number of counted packets and the number of bytes received/transmitted:

iifname "eth0" meta skuid "invidious" counter packets 974091 bytes 1334447333
oifname "eth0" meta skuid "invidious" counter packets 847414 bytes 68815416
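
A minimal sketch of how these counter lines could be turned into readable numbers. It assumes the two lines above are pasted in as plain text. The regular expression and the helper names (`parse_counters`, `to_megabytes`) are illustrative and not part of nftables or Invidious.

```python
import re

# Counter lines copied verbatim from the `nft list ruleset` output above.
NFT_OUTPUT = '''
iifname "eth0" meta skuid "invidious" counter packets 974091 bytes 1334447333
oifname "eth0" meta skuid "invidious" counter packets 847414 bytes 68815416
'''

# Hypothetical pattern: "iifname" marks the input rule, "oifname" the output rule.
COUNTER_RE = re.compile(
    r'^(?P<dir>[io])ifname\s+"(?P<iface>[^"]+)".*?'
    r'counter packets (?P<packets>\d+) bytes (?P<bytes>\d+)\s*$'
)


def parse_counters(text):
    """Return {"in": (packets, bytes), "out": (packets, bytes)}."""
    result = {}
    for line in text.splitlines():
        match = COUNTER_RE.match(line.strip())
        if match:
            direction = "in" if match.group("dir") == "i" else "out"
            result[direction] = (
                int(match.group("packets")),
                int(match.group("bytes")),
            )
    return result


def to_megabytes(n_bytes):
    """Convert bytes to decimal megabytes (1 MB = 1000 * 1000 bytes)."""
    return n_bytes / 1000 / 1000


counters = parse_counters(NFT_OUTPUT)
for direction, (packets, n_bytes) in counters.items():
    print(f"{direction}: {packets} packets, {to_megabytes(n_bytes):.1f} MB")
# in: 974091 packets, 1334.4 MB
# out: 847414 packets, 68.8 MB
```

With these numbers, incoming traffic is roughly 19 times the outgoing traffic, which fits the picture of a process that mostly downloads.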

The example above is from last night, when the Invidious service was running but the instance was definitely not in use, since I'm hosting it just for myself. There are no log messages in either the systemd journal or invidious.log (though I should mention that I forgot to set the log_level to debug, if this matters).

However, while testing this yesterday in an LXC container with the firewall rules mentioned above, I also used tcpdump to capture the network packets that Invidious was generating. The capture reveals that Invidious constantly connects (with some pauses in between) to the YouTube servers. Since the traffic is encrypted, I can't tell what is actually going on or why it generates 1334 MB (1334447333/1000/1000) of incoming traffic in about 12 hours while the instance is idle.

[Images: capture-1, capture-2]

I tested it again just now and re-created the PostgreSQL database to ensure that it contains no video information or other cached data. In the 48 minutes since the counters were reset and Invidious was started, it has already gradually generated 72 MB of incoming traffic.
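
As a rough cross-check, both measurements can be extrapolated to a full day. This is only a sketch: it assumes the traffic rate stays constant over time, and `mb_per_day` is a hypothetical helper name.

```python
def mb_per_day(megabytes, minutes):
    """Linearly extrapolate an observed transfer volume to 24 hours."""
    return megabytes / minutes * 24 * 60


# 72 MB in the 48-minute test run
print(f"{mb_per_day(72, 48):.0f} MB/day")         # 2160 MB/day

# about 1334 MB in the overnight run of about 12 hours
print(f"{mb_per_day(1334, 12 * 60):.0f} MB/day")  # 2668 MB/day
```

Both estimates fall within the 2-3 GB per day reported at the top of this issue.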

My question here is: Is this a bug or normal/expected behavior? If it is expected, what is its purpose? What is Invidious downloading from YouTube while the instance is idle?

Thank you!

@Nerdmind Nerdmind added the bug Something isn't working label Aug 1, 2022
@unixfox unixfox added question Further information is requested and removed bug Something isn't working labels Aug 1, 2022
@unixfox unixfox changed the title [Bug] Instance generates gigabytes of incoming traffic while in idle [Question] Instance generates gigabytes of incoming traffic while in idle Aug 1, 2022
@unixfox
Member

unixfox commented Aug 1, 2022

Every 30 minutes, Invidious refreshes the videos from the channels that the users of the instance are subscribed to. This may generate a lot of traffic, depending on the number of subscribed channels. To decrease the amount of data fetched per day, you can tune the frequency by changing the channel_refresh_interval parameter in config.yml.

There are other "jobs" that fetch some data from YouTube servers: https://github.com/iv-org/invidious/tree/master/src/invidious/jobs, but these shouldn't generate a lot of data.

To be 100% sure that it's the refresh of the videos from the subscribed channels that generates this amount of data, you can turn off this "job" by setting the channel_threads parameter to 0.

TL;DR: It's normal behavior.

@unixfox unixfox closed this as completed Aug 1, 2022
@Nerdmind
Contributor Author

Nerdmind commented Aug 6, 2022

Well, there are no users in the database because I'm hosting Invidious only for myself and have disabled registration and login. I also re-created the database before every test to make sure it was empty when I started monitoring the traffic usage. Even with your proposed settings, the amount of data doesn't change: it starts growing every minute after the service is started.

However, thanks for pointing me to the jobs directory. I've now found out that it is the UpdateDecryptFunctionJob that is causing this amount of data. This job only runs if decrypt_polling in the config is true, which is the default:

## Enable/Disable the polling job that keeps the decryption
## function (for "secured" videos) up to date.
##
## Note: This part of the code is currently broken, so changing
## this setting has no impact.
##
## Accepted values: true, false
## Default: true
##
#decrypt_polling: true

The comments state that "this part of the code is currently broken, so changing this setting has no impact", but if I change this setting to false, Invidious stops generating the previously mentioned amount of data. Over the whole night, not a single byte of data was transmitted or received while the instance was idle, which is what I would expect. So is this still not a bug?

@unixfox
Member

unixfox commented Aug 6, 2022

Oh yeah, there is this job too. The decrypt job is an essential component for loading encrypted videos such as music, copyrighted videos, and more.

Currently, this component is only used for videos that can't be fetched using a special user agent that gets all videos unencrypted (see TeamNewPipe/NewPipeExtractor#562). One example is age-restricted videos (see #2189).

If you don't care about these videos (encrypted age-restricted videos, such as age-restricted music videos), feel free to turn off decrypt_polling.

unixfox added a commit that referenced this issue Aug 6, 2022
And add notice about bandwidth usage, related to #3234
@Nerdmind
Contributor Author

Nerdmind commented Aug 6, 2022

Thank you. But I can watch the age-restricted example videos even when decrypt_polling is false. I didn't notice a difference when playing those videos. I'll leave it at false for now, until I notice that I can't play age-restricted videos anymore.

However, what about the comment stating that this setting currently has no impact because that part of the code is broken? Is this comment outdated, and should it be removed? I'm a bit confused. Setting it to true or false obviously makes a difference in traffic usage, but as far as I can see it doesn't affect playing age-restricted videos.

I can only tell you what I observe. I don't fully understand the code yet, so there might still be a misunderstanding on my part. I'll leave it for now. The problem with idle traffic usage is solved for me. Thanks for your help! 😄

EDIT:
Oh, I just saw your recent commit regarding the comment in the config file. 👍

@unixfox
Member

unixfox commented Aug 6, 2022

I'm talking about encrypted age-restricted videos, not all the age-restricted videos! Encrypted age-restricted videos are for example music videos that are age-restricted.
Those won't work without enabling decrypt_polling.

@unixfox
Member

unixfox commented Aug 6, 2022

After some testing and discussion with the Invidious team, it turns out that the decrypt function no longer works and is currently broken. We opened a pull request to disable it by default: #3244
