App firewall through netfilter & landlock #35

Closed
rcoscali opened this issue Jun 8, 2024 · 3 comments

rcoscali commented Jun 8, 2024

Hi Mickaël

First, thanks for your marvelous work.
This is not an issue, more a feature request.
Second, to be able to build an app firewall, it would be interesting to have a Landlock syscall for installing a process-bound netfilter firewall rule. Of course, a specific netfilter patch could be necessary (at least a netfilter module). It could also be interesting to let an application activate some packet validation modules (through queues that could then be consumed in userland by a dedicated app such as Suricata). Providing packet inspection should be easy at the app level.
Finally, a sandboxer with a GUI would be really valuable (I haven't checked the Rust sandboxer yet, but I will). It could generate the locking code.
I'd like to hear your thoughts on these ideas and whether you think they are achievable.
For the netfilter part I'll try to propose a PR soon, but it could also imply a netfilter PR (to manage the link between ruleset & nf table in some way) ...

Best

Author

rcoscali commented Jun 8, 2024

Just thinking a little bit more about the sandboxer GUI.
It should be able to start from a sandbox-free binary package of the app. This would allow setting up all the startup file access rules.
It should let the user select the kind of application to sandbox (console-based, GUI, server, etc.). These app templates would provide some startup rules (a console app should have access to a terminal, a GUI app should have access to an X11 session and its related network communication, and likewise for a server). Of course, the template could then be customized.
Perhaps the best way to customize such rules would be through a VS Code extension (for example).
We could also use ELF information to add rules for loadable libraries.
There could be a specific ruleset for each language (C, C++, Rust, and Go all have some specific needs).
Another interesting feature would be to enumerate all syscalls and add a rule limiting their use to the smallest possible set (attack surface reduction through a seccomp wrapper, or direct seccomp usage; I saw a pledge port based on seccomp BPF on GitHub). Another nice feature would be the ability to use a call graph to verify chains of syscalls (this is not easy, but could it be something supported by Landlock/seccomp in the future?).
Once the developer has finished the sandbox rules, an activate_sandbox function would be generated and could be added through a compiler feature (e.g. GNU_IFUNC, allowing several levels of sandboxing to be deployed; just another idea) or by rebuilding/relinking the app.

Hope some will find these ideas useful and use them ...

Member

l0kod commented Jun 9, 2024

> Hi Mickaël

Hi Rémi,

> First thanks for your marvelous job.

Thanks, I bootstrapped the project but there are now several of us working on Landlock. 😉

> This is not an issue, more a feature request. Second for being able to create an app firewall it would be interesting to be able to have a landlock syscall for installing a process bound netfilter firewall rule. Of course a specific netfilter patch could be necessary (at least a netfilter module). What could be also interesting would be to let an application activate some packet validation modules (through queues that could then be used in userland in a dedicated app as suricata). Providing a packet inspection should be easy at the app level.

An application firewall able to filter the content of packets would be interesting, but that poses some challenges. The main one is that Netfilter is not designed to handle rules written by attackers, so I think it would not be trivial to get such a guarantee, and of course the Netfilter maintainers need to be convinced this is a good idea.

Landlock's design is flexible enough that this should not require a new syscall, only new access rights and rules, but it's not clear to me how to create safe and simple filtering rules that would still be flexible.

Our thinking about Landlock's current TCP port filtering is that it is simple and addresses a lot of app firewall requirements. For instance, controlling remote peer addresses may not make sense without name resolution. There is also work in progress on socket creation control #6 and UDP port filtering #10. Help with this work is welcome!
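
For illustration only, here is a minimal sketch of what the existing TCP port filtering looks like from userspace, assuming Landlock ABI v4 (Linux 6.7 or later). The `ll_`/`LL_` names are hypothetical local mirrors of the definitions in `<linux/landlock.h>`, used only so the sketch compiles against older headers; real code should include the kernel header and handle older ABIs more gracefully.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical local mirrors of the UAPI types and constants from
 * <linux/landlock.h> (ABI v4). Prefer the real header in production code. */
struct ll_ruleset_attr {
	uint64_t handled_access_fs;
	uint64_t handled_access_net;
};

struct ll_net_port_attr {
	uint64_t allowed_access;
	uint64_t port;
};

static_assert(sizeof(struct ll_net_port_attr) == 16,
	      "must match the kernel's landlock_net_port_attr layout");

#define LL_CREATE_RULESET_VERSION (1U << 0)
#define LL_ACCESS_NET_BIND_TCP (1ULL << 0)
#define LL_ACCESS_NET_CONNECT_TCP (1ULL << 1)
#define LL_RULE_NET_PORT 2

/* Fallback syscall numbers (shared by most architectures) for old libcs. */
#ifndef SYS_landlock_create_ruleset
#define SYS_landlock_create_ruleset 444
#define SYS_landlock_add_rule 445
#define SYS_landlock_restrict_self 446
#endif

/* Returns the Landlock ABI version, or -1 if Landlock is unavailable. */
static int ll_abi_version(void)
{
	return (int)syscall(SYS_landlock_create_ruleset, NULL, 0,
			    LL_CREATE_RULESET_VERSION);
}

/* Builds a rule allowing outbound TCP connections to a single port. */
static struct ll_net_port_attr ll_connect_rule(uint16_t port)
{
	struct ll_net_port_attr rule = {
		.allowed_access = LL_ACCESS_NET_CONNECT_TCP,
		.port = port,
	};
	return rule;
}

/* Restricts the calling thread (and its future children): TCP connect() is
 * only allowed to allowed_port, and TCP bind() is denied on every port.
 * Returns 0 on success, -1 with errno set on failure. */
static int ll_allow_only_connect(uint16_t allowed_port)
{
	struct ll_ruleset_attr attr = {
		.handled_access_net =
			LL_ACCESS_NET_BIND_TCP | LL_ACCESS_NET_CONNECT_TCP,
	};
	struct ll_net_port_attr rule = ll_connect_rule(allowed_port);
	int ruleset_fd, saved_errno, ret = -1;

	if (ll_abi_version() < 4) {
		errno = ENOTSUP; /* no network access rights before ABI v4 */
		return -1;
	}
	ruleset_fd = (int)syscall(SYS_landlock_create_ruleset, &attr,
				  sizeof(attr), 0);
	if (ruleset_fd < 0)
		return -1;
	if (syscall(SYS_landlock_add_rule, ruleset_fd, LL_RULE_NET_PORT,
		    &rule, 0))
		goto out;
	/* Required for unprivileged callers before enforcing a ruleset. */
	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0))
		goto out;
	if (syscall(SYS_landlock_restrict_self, ruleset_fd, 0))
		goto out;
	ret = 0;
out:
	saved_errno = errno;
	close(ruleset_fd);
	errno = saved_errno;
	return ret;
}
```

A program would call something like `ll_allow_only_connect(443)` early at startup, before handling untrusted input. Note that the rule only names a port: there is no remote address in it, which matches the point above about peer addresses and name resolution.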

> And finally a sandboxer with a GUI would be something really valuable (didn't yet check the rust sandboxer, I will). It could generate the locking code. I'd like to hear your thoughts about these and if it is something you think achievable.

I started working on a new sandboxer, but there is a lot of foundational work required before thinking about a GUI.

> For the netfilter part I'll try to propose a PR soon but as it could imply also a netfilter PR (to manage the link between ruleset & nf table in some way) ...

I'm looking forward to this RFC! As a reminder, Linux development doesn't happen on GitHub with PRs. Please make sure the Netfilter community is in the loop.

> Just thinking a little bit more about the sandboxer GUI. It should be able to startup from a sandbox free binary package of the app. This will allow to setup all startup file access rules. It should be able to select a kind of application for which a sandbox is wanted (console based, GUI, server, etc ...). These app templates should allow to setup some startup rules (a console app should have access to a term, a GUI app should have access to an X11 session and its related network comm, same for a server). Of course this template can then be customized. Perhaps the best way to then customize such rules would be through a VScode extension (for ex). We could also use elf for adding rules for loadable libraries. Specific ruleset for each language (C, C++, rust, go, all have some specific needs). Another interesting feature would be to enumerate all syscalls and add a rule for limiting their use to the lowest possible set (attack surface reduction through a seccomp wrapper ?? or direct seccomp usage -- i saw a pledge port based on seccomp bpf in github). I think to another feature that would be nice, being able to use a call graph for verifying chains of syscalls (this is not an easy stuff, but could it be something supported by landlock/seccomp in the future...). Once the dev finished his sandbox rules, an activate_sandbox function is generated and could be added through a compiler feature (ex GNU_IFUNC allowing to deploy several levels of sandboxing ??? just another idea) or by rebuilding/relinking app.

> Hope some will find these ideas and use it ...

All this would be nice, but it's out of scope for Landlock, which is a kernel feature focused on sandboxing (i.e. access control).

However, feel free to share your progress on implementing such a sandboxing app using Landlock, seccomp, and other security features!

Author

rcoscali commented Jun 9, 2024

> All this would be nice but it's out of scope for Landlock which is a kernel feature focused on sandboxing (i.e. access control).

Sure, I was just in a silly mood ... but I think in the end it could become useful for developers and help with security.
In the meantime I'm looking at the Landlock code, and I'll try to get into it by looking at the issues you pointed out, #6 & #10.
👍

@l0kod l0kod closed this as completed Jun 10, 2024
l0kod pushed a commit that referenced this issue Dec 16, 2024
[ Upstream commit 4b7c3f6 ]

Ignore the userspace provided x2APIC ID when fixing up APIC state for
KVM_SET_LAPIC, i.e. make the x2APIC fully readonly in KVM.  Commit
a92e254 ("KVM: x86: use hardware-compatible format for APIC ID
register"), which added the fixup, didn't intend to allow userspace to
modify the x2APIC ID.  In fact, that commit is when KVM first started
treating the x2APIC ID as readonly, apparently to fix some race:

 static inline u32 kvm_apic_id(struct kvm_lapic *apic)
 {
-       return (kvm_lapic_get_reg(apic, APIC_ID) >> 24) & 0xff;
+       /* To avoid a race between apic_base and following APIC_ID update when
+        * switching to x2apic_mode, the x2apic mode returns initial x2apic id.
+        */
+       if (apic_x2apic_mode(apic))
+               return apic->vcpu->vcpu_id;
+
+       return kvm_lapic_get_reg(apic, APIC_ID) >> 24;
 }

Furthermore, KVM doesn't support delivering interrupts to vCPUs with a
modified x2APIC ID, but KVM *does* return the modified value on a guest
RDMSR and for KVM_GET_LAPIC.  I.e. no remotely sane setup can actually
work with a modified x2APIC ID.

Making the x2APIC ID fully readonly fixes a WARN in KVM's optimized map
calculation, which expects the LDR to align with the x2APIC ID.

  WARNING: CPU: 2 PID: 958 at arch/x86/kvm/lapic.c:331 kvm_recalculate_apic_map+0x609/0xa00 [kvm]
  CPU: 2 PID: 958 Comm: recalc_apic_map Not tainted 6.4.0-rc3-vanilla+ #35
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.2-1-1 04/01/2014
  RIP: 0010:kvm_recalculate_apic_map+0x609/0xa00 [kvm]
  Call Trace:
   <TASK>
   kvm_apic_set_state+0x1cf/0x5b0 [kvm]
   kvm_arch_vcpu_ioctl+0x1806/0x2100 [kvm]
   kvm_vcpu_ioctl+0x663/0x8a0 [kvm]
   __x64_sys_ioctl+0xb8/0xf0
   do_syscall_64+0x56/0x80
   entry_SYSCALL_64_after_hwframe+0x46/0xb0
  RIP: 0033:0x7fade8b9dd6f

Unfortunately, the WARN can still trigger for other CPUs than the current
one by racing against KVM_SET_LAPIC, so remove it completely.

Reported-by: Michal Luczaj <[email protected]>
Closes: https://lore.kernel.org/all/[email protected]
Reported-by: Haoyu Wu <[email protected]>
Closes: https://lore.kernel.org/all/[email protected]
Reported-by: [email protected]
Closes: https://lore.kernel.org/all/[email protected]
Signed-off-by: Sean Christopherson <[email protected]>
Message-ID: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Stable-dep-of: 73b42dc ("KVM: x86: Re-split x2APIC ICR into ICR+ICR2 for AMD (x2AVIC)")
Signed-off-by: Sasha Levin <[email protected]>

2 participants