x86/defconfig: Add kata containers defconfig #5

@devimc devimc commented Nov 23, 2017

kata_containers_common_defconfig is based on Clear Containers config
https://github.com/clearcontainers/packaging/blob/master/kernel/kernel-config-4.9.x
and was adapted to the 4.14.13 kernel by just accepting the default configurations

For example, to compile an x86 kernel with KVM enabled:

cat arch/x86/configs/kata_containers_common_defconfig arch/x86/configs/kata_containers_kvm_defconfig > arch/x86/configs/kata_containers_defconfig
make ARCH=x86 kata_containers_defconfig
make -j8

and to compile an x86 kernel with XEN enabled:

cat arch/x86/configs/kata_containers_common_defconfig arch/x86/configs/kata_containers_xen_defconfig > arch/x86/configs/kata_containers_defconfig
make ARCH=x86 kata_containers_defconfig
make -j8
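
The two recipes differ only in the hypervisor fragment, so they can be folded into one helper. This is a minimal sketch, not part of the PR: the function name `kata_build_cmds` and the optional job-count argument are assumptions made for illustration. It only prints the commands, so the output can be reviewed before anything is run.

```shell
#!/bin/sh
# Print the commands needed to build a Kata guest kernel for one hypervisor.
# Usage: kata_build_cmds kvm|xen [jobs]
# The function name and the "jobs" argument are hypothetical.
kata_build_cmds() (
    hv="$1"
    jobs="${2:-8}"
    cfg_dir="arch/x86/configs"
    case "$hv" in
        kvm|xen) ;;
        *) echo "unsupported hypervisor: $hv" >&2; exit 1 ;;
    esac
    # Common fragment first, hypervisor-specific fragment second.
    echo "cat $cfg_dir/kata_containers_common_defconfig $cfg_dir/kata_containers_${hv}_defconfig > $cfg_dir/kata_containers_defconfig"
    echo "make ARCH=x86 kata_containers_defconfig"
    echo "make -j$jobs"
)
```

Run from the top of the kernel tree, `kata_build_cmds xen 4` prints the three commands for a Xen build with four jobs; piping the output to `sh -e` would execute them.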

fixes #3

Signed-off-by: Julio Montes [email protected]

@grahamwhaley

Hi @devimc - as 2000+ lines of kernel config is quite tricky and takes some time to review ;-), can you give us a little background:

  • is this derived from the Clear Containers clearcontainers@9a69950 for instance?
  • did we adapt to the 4.13.3 kernel by just accepting the defaults for all the new CONFIG entries?

Also, I think you may have submitted this PR against master - should it be against/towards linux-kata-containers-4.13.3 instead?

@gnawux - if you have a kernel config expert, please point /cc them in here - we'd welcome input to share kernel config knowledge (but I also understand it can be quite a task to do a config compare or review, so maybe we will have to schedule that as a task for later).

Generally though, presuming we have tested this kernel in some form, and we get this PR aimed at the correct branch:
lgtm

@devimc devimc changed the base branch from master to linux-kata-containers-4.13.3 November 23, 2017 15:15
@devimc
Author

devimc commented Nov 23, 2017

Hi @grahamwhaley
oops! changing to linux-kata-containers-4.13.3

@devimc devimc force-pushed the config/kata_defconfig branch 3 times, most recently from 6d45e5d to 4feec8b Compare November 25, 2017 16:51
Member

@bergwolf bergwolf left a comment

The list is quite long to review. I might have missed something, but I already see the following differences between cc and hyper kernels:

  1. initramfs support
  2. loadable module support
  3. XEN PV guest
  4. veth support

We need to get consensus on them before this can be merged.

CONFIG_CGROUP_PERF=y
# CONFIG_CGROUP_DEBUG is not set
CONFIG_SOCK_CGROUP_DATA=y
CONFIG_CHECKPOINT_RESTORE=y
Member

why do we need checkpoint/restore?

I actually don't think we need it.

Author

@grahamwhaley we have two options to support live migration:

  1. using libcontainer to migrate only the process
  2. using qemu micro checkpoint to migrate all the VM

probably both options won't work because of 9p gaps

Author

anyway, I'll disable it

CONFIG_NET_NS=y
CONFIG_SCHED_AUTOGROUP=y
# CONFIG_SYSFS_DEPRECATED is not set
CONFIG_RELAY=y
Member

and relay? Do you want debugfs?

I suspect we don't need this - anybody know for sure?

We don't explicitly need it, but it may come as a dependency from something else.

Author

I'll try to disable it

CONFIG_SCHED_AUTOGROUP=y
# CONFIG_SYSFS_DEPRECATED is not set
CONFIG_RELAY=y
# CONFIG_BLK_DEV_INITRD is not set
Member

Can you please enable CONFIG_BLK_DEV_INITRD support? We want to use initramfs.

Yes, @devimc please enable INITRD.

Author

yes

CONFIG_SLABINFO=y
CONFIG_RT_MUTEXES=y
CONFIG_BASE_SMALL=0
# CONFIG_MODULES is not set
Member

Can you please enable module support? We'd like to use loadable modules to keep the kernel small.

Ah, here we can have an interesting discussion :-)
We have deliberately not enabled modules before for a couple of reasons:

  • size, speed, probing etc.
  • where will you store the modules (in the container images or on a volume??), and how will you ensure the modules that live on the host match the kernel that you are running inside the VM?

We have debated this a few times. Quite often a new network or filesystem or feature will be needed by a specific client, and having modules available would be one way to fix that.

If we turn on modules, then we should:

  • assess any size/speed impact
  • turn on kernel API checking (I believe), so we can try to avoid loading a module whose API does not match the kernel's
  • discuss signing of modules as well?

@grahamwhaley All fair points, that should be handled by osbuilder or the kata containers user tools for building its image. Provided that kernel modules support does not impact boot time for kernels that do not load any module at all, I think we should enable it.

Member

Currently we store modules in sandbox rootfs and bind mount it to container rootfs images. Modules are built together with the guest kernel and compressed in the initramfs.

Optional features are built as modules. A sample list includes crypto, nfs, vsock, ipv6, l2tp etc. They are not essential to run a pod and are not loaded if unneeded.

And providing loadable modules opens up the opportunity to provide guest modules if the default kernel config misses something important to a user. In such a case, we only need to provide additional kernel modules rather than rebuilding the entire kernel.
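
The concern raised earlier in this thread, that the modules must match the kernel running in the VM, can be checked before an image is shipped. The sketch below assumes the modules sit in the conventional `lib/modules/<release>` directory under an unpacked rootfs or initramfs; the thread does not state the actual layout, and both function names are hypothetical.

```shell
#!/bin/sh
# Succeed only if a module tree exists for the given kernel release.
# $1: root directory to inspect (for example an unpacked initramfs)
# $2: kernel release string, as printed by "uname -r" inside the guest
# Assumes the conventional lib/modules/<release> layout.
kata_modules_match() {
    [ -d "$1/lib/modules/$2" ]
}

# List the kernel releases for which a module tree is present under $1.
kata_module_releases() {
    for dir in "$1"/lib/modules/*/; do
        [ -d "$dir" ] && basename "$dir"
    done
    return 0
}
```

A build script could call `kata_modules_match "$rootfs" "$release"` and refuse to package the image on failure. This only catches a missing or misnamed module tree; it does not replace the kernel API checking or module signing mentioned above.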

CONFIG_PARAVIRT=y
# CONFIG_PARAVIRT_DEBUG is not set
# CONFIG_PARAVIRT_SPINLOCKS is not set
# CONFIG_XEN is not set
Member

We'd like to have XEN PV guest support.

Author

ok :)

# CONFIG_NET_POLL_CONTROLLER is not set
CONFIG_TUN=y
# CONFIG_TUN_VNET_CROSS_LE is not set
# CONFIG_VETH is not set
Member

we need CONFIG_VETH.

@bergwolf You need veth in the guest?

Member

@sameo not for the common case. But we got quite a few docker in hyperd and hyperd in hyperd requests, which is why we enabled KVM/cgroups support in guest kernel as well.

@bergwolf Fair enough, thanks.

@laijs

laijs commented Nov 27, 2017

Since we have https://github.com/kata-containers/osbuilder, IMO, it would be better to maintain the kernel config in it.

@grahamwhaley

@laijs that's a good question. My only concern would be separating the config away from any patches we are carrying. Having said that, if we end up supporting multiple configs for different performance characteristic/feature sets for instance, then maybe holding those in osbuilder makes sense.
/cc @jcvenegas

@sameo

sameo commented Nov 27, 2017

@laijs I believe we should keep a reference config file in the kernel and maintain variants in osbuilder, as you suggest.
It should not be mandatory to use osbuilder to build a kata containers kernel, imho.

@laijs

laijs commented Nov 28, 2017

@grahamwhaley @sameo Thanks, I got it.

@teawater
Member

Hi @devimc ,

We need CONFIG_VMAP_STACK.

Thanks,
Hui

@teawater
Member

Hi @devimc ,

I think the kata config should not be included in the kernel repo, because the kernel repo should only keep the patches that we want to upstream.

Thanks,
Hui

@sameo

sameo commented Nov 28, 2017

@teawater

> We need CONFIG_VMAP_STACK.

Have you benchmarked this? Having stacks vmalloc'ed can potentially introduce TLB misses from stack accesses and could hurt performance.

cc @mcastelino

@devimc devimc force-pushed the config/kata_defconfig branch from 4feec8b to 41e0e79 Compare November 28, 2017 14:30
@devimc
Author

devimc commented Nov 28, 2017

After applying the changes suggested by @bergwolf:

# CONFIG_CHECKPOINT_RESTORE is not set
# CONFIG_RELAY is not set
# CONFIG_KPROBES is not set
CONFIG_BLK_DEV_INITRD=y
CONFIG_RD_GZIP=y
CONFIG_RD_BZIP2=y
CONFIG_RD_LZMA=y
CONFIG_RD_XZ=y
CONFIG_RD_LZO=y
CONFIG_RD_LZ4=y
CONFIG_STRICT_MODULE_RWX=y
CONFIG_MODULES=y
CONFIG_XEN=y
CONFIG_XEN_PV=y
CONFIG_XEN_PV_SMP=y
CONFIG_XEN_DOM0=y
CONFIG_XEN_PVHVM=y
CONFIG_XEN_PVHVM_SMP=y
CONFIG_XEN_512GB=y
CONFIG_XEN_SAVE_RESTORE=y
CONFIG_VETH=y
CONFIG_HIBERNATE_CALLBACKS=y
CONFIG_PM_SLEEP=y
CONFIG_PM_SLEEP_SMP=y
CONFIG_PM=y
CONFIG_PM_CLK=y
CONFIG_ACPI_TABLE_UPGRADE=y
CONFIG_PCIE_PME=y
CONFIG_XEN_PCIDEV_FRONTEND=y
CONFIG_SYS_HYPERVISOR=y
CONFIG_XEN_BLKDEV_FRONTEND=y
CONFIG_XEN_NETDEV_FRONTEND=y
CONFIG_INPUT_XEN_KBDDEV_FRONTEND=y
CONFIG_HVC_IRQ=y
CONFIG_HVC_XEN=y
CONFIG_HVC_XEN_FRONTEND=y
CONFIG_VT_CONSOLE_SLEEP=y
CONFIG_ZLIB_INFLATE=y
CONFIG_LZO_DECOMPRESS=y
CONFIG_LZ4_DECOMPRESS=y
CONFIG_DECOMPRESS_GZIP=y
CONFIG_DECOMPRESS_BZIP2=y
CONFIG_DECOMPRESS_LZMA=y
CONFIG_DECOMPRESS_XZ=y
CONFIG_DECOMPRESS_LZO=y
CONFIG_DECOMPRESS_LZ4=y
#
# Xen driver support
#
CONFIG_XEN_BALLOON=y
CONFIG_XEN_SCRUB_PAGES=y
CONFIG_XEN_DEV_EVTCHN=y
CONFIG_XEN_BACKEND=y
CONFIG_XENFS=y
CONFIG_XEN_COMPAT_XENFS=y
CONFIG_XEN_SYS_HYPERVISOR=y
CONFIG_XEN_XENBUS_FRONTEND=y
CONFIG_XEN_GNTDEV=m
CONFIG_XEN_GRANT_DEV_ALLOC=m
CONFIG_SWIOTLB_XEN=y
CONFIG_XEN_PCIDEV_BACKEND=m
CONFIG_XEN_PRIVCMD=y
CONFIG_XEN_ACPI_PROCESSOR=m
CONFIG_XEN_HAVE_PVMMU=y
CONFIG_XEN_AUTO_XLATE=y
CONFIG_XEN_ACPI=y
CONFIG_XEN_SYMS=y
CONFIG_XEN_HAVE_VPMU=y

@teawater
Member

teawater commented Nov 29, 2017

@sameo ,
After enabling vmap stack, the execution time of the sysbench memory test decreased by about 7% on my side.
We also have a long history of using it to counter the fragmentation that builds up over long runs in low-memory environments.

Thanks,
Hui

@mcastelino

@teawater with vmap stack the performance degrades then. Did you check the kvm stats to see if you are seeing much higher ept violations?

@teawater
Member

Sorry, I made a mistake in my wording. I meant to say that performance increased, but I wrote that the execution time increased.

These are the clear test results of sysbench memory inside qemu:
Without vmap stack:
32.6117
31.169
29.1881
30.4291
31.5273

With vmap stack:
29.4660
29.2134
28.0476
28.9640
28.2300
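
To put the two sets of runs side by side, the means and the relative change can be computed directly from the numbers above. This is a sketch with hypothetical helper names (`mean`, `pct_change`); the thread does not state the unit, only that the values are execution times, so lower is taken to be better.

```shell
#!/bin/sh
# Mean of the numbers passed as arguments, printed with four decimals.
mean() {
    printf '%s\n' "$@" | awk '{ sum += $1; n++ } END { if (n) printf "%.4f\n", sum / n }'
}

# Relative change from a baseline ($1) to a new value ($2), in percent.
pct_change() {
    awk -v base="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (new - base) / base * 100 }'
}

# The five runs reported for each configuration.
without="$(mean 32.6117 31.169 29.1881 30.4291 31.5273)"
with="$(mean 29.4660 29.2134 28.0476 28.9640 28.2300)"

echo "without vmap stack: $without"
echo "with vmap stack:    $with"
echo "change:             $(pct_change "$without" "$with")%"
```

The means come out at 30.9850 and 28.7842, a change of about -7.1%, which agrees with the "about 7%" figure quoted earlier. Five runs per configuration is a small sample, so the spread between runs is worth keeping in mind.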

@teawater
Member

teawater commented Dec 5, 2017

The following is what we need:
CONFIG_VMAP_STACK
CONFIG_SERIAL_EARLYCON=y
CONFIG_SERIAL_8250=y
CONFIG_SERIAL_8250_CONSOLE=y
CONFIG_SERIAL_CORE=y
CONFIG_SERIAL_CORE_CONSOLE=y
CONFIG_SCSI=y
CONFIG_SCSI_DMA=y
CONFIG_SG_POOL=y
CONFIG_SCSI_LOWLEVEL=y
CONFIG_SCSI_VIRTIO=y
CONFIG_BLK_DEV_SD=y
CONFIG_CHR_DEV_SG=y
CONFIG_NFS_FS=m
CONFIG_NFS_V2=m
CONFIG_NFS_V3=m
CONFIG_NFS_V4=m

Thanks,
Hui

@devimc
Author

devimc commented Dec 6, 2017

@sameo @mcastelino @bergwolf do you agree with @teawater's changes?

@devimc devimc force-pushed the config/kata_defconfig branch from 41e0e79 to 6647ef5 Compare December 6, 2017 21:25
@devimc
Author

devimc commented Dec 6, 2017

Update

CONFIG_NR_CPUS=255

@sameo

sameo commented Dec 10, 2017

@teawater Why do you need all the serial options? Same question for NFS.

@mcastelino

The SCSI changes make sense for sure. We plan to move to SCSI for all block devices. The VMAP change also makes sense, as the data from @teawater seems to indicate that this does not degrade performance. It will give us some density benefits. However, we still need to run I/O benchmarks.

@grahamwhaley @GabyCT we should run PnP with VMAP enabled for storage and network I/O. We really care about I/O performance more than anything else.

@teawater
Member

teawater commented Dec 12, 2017

@sameo We use these kernel options with hyperstart.

Thanks,
Hui

@sameo

sameo commented Dec 12, 2017

@teawater I don't think this will be needed with the Kata agent. Could you please double check?

@bergwolf
Member

@sameo Yes, we are sure we need them. hyperstart supports NFS volumes. We'd expect the kata agent to support NFS volumes as well. And we need serial ports in case vsock is unavailable, right?

@sameo

sameo commented Dec 12, 2017

@bergwolf Yes, I figured you'd need NFS volumes, and that is fine with me, especially since it's going to be a module.
But for serial, the only thing we need is virtio console and tty.

@devimc devimc changed the base branch from linux-kata-containers-4.13.3 to linux-kata-containers-4.14.13 February 19, 2018 18:24
@devimc
Author

devimc commented Feb 19, 2018

@sameo @bergwolf changes applied

@sameo

sameo commented Feb 20, 2018

@egernst Yes, that's the idea.

@sameo

sameo commented Feb 20, 2018

@devimc Looks good. Maybe we should have a script to safely generate the final Kata config?
@bergwolf @egernst WDYT?
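
One possible reading of "safely" is that the script should refuse to concatenate fragments that contradict each other, instead of silently letting the later one win. The sketch below takes that approach; both function names are hypothetical and nothing here is the script that was eventually written. The upstream kernel tree also ships `scripts/kconfig/merge_config.sh` for merging fragments, which may be the better starting point.

```shell
#!/bin/sh
# List CONFIG symbols that two fragments set to different values.
# Prints one symbol per line; prints nothing when the fragments agree.
kata_config_conflicts() {
    awk '
        # Normalise "CONFIG_X=val" and "# CONFIG_X is not set" to (sym, val).
        function parse(line) {
            sym = ""; val = ""
            if (line ~ /^CONFIG_[A-Za-z0-9_]+=/) {
                sym = line; sub(/=.*/, "", sym)
                val = line; sub(/^[^=]*=/, "", val)
            } else if (line ~ /^# CONFIG_[A-Za-z0-9_]+ is not set$/) {
                sym = line; sub(/^# /, "", sym); sub(/ is not set$/, "", sym)
                val = "n"
            }
        }
        # First file: remember every symbol and its value.
        FNR == NR { parse($0); if (sym != "") first[sym] = val; next }
        # Second file: report symbols whose value differs from the first file.
        { parse($0); if (sym != "" && (sym in first) && first[sym] != val) print sym }
    ' "$1" "$2"
}

# Concatenate two fragments to stdout only when they do not contradict each other.
kata_merge_fragments() {
    conflicts="$(kata_config_conflicts "$1" "$2")"
    if [ -n "$conflicts" ]; then
        echo "conflicting symbols:" $conflicts >&2
        return 1
    fi
    cat "$1" "$2"
}
```

A caller would redirect the output into `arch/x86/configs/kata_containers_defconfig` and then run `make ARCH=x86 kata_containers_defconfig` as before. Whether a hypervisor fragment should be allowed to override the common one on purpose is a policy choice this sketch does not settle; it simply flags every disagreement.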

@jodh-intel

@sameo - I think that's a very good idea. I've raised kata-containers/packaging#8 (and kata-containers/packaging#7 for qemu config).

/cc @grahamwhaley.

@devimc
Author

devimc commented Feb 26, 2018

@sameo a script in packaging or here?

@jodh-intel

Personally, I think such scripts should live in the packaging repo as that's where they will be used. It also allows repos like this (and qemu) to remain as "vanilla" as possible.

@sameo

sameo commented Mar 13, 2018

@devimc
Author

devimc commented Apr 11, 2018

@kata-containers/linux can we merge this PR?

@grahamwhaley

Looks like this is mergeable @devimc, although the discussion about having a config-generating script in the packaging repo did not seem to conclude.
This PR is also very old. Is this config file up to date? I suspect it could do with a regeneration before a merge, maybe?

@jcvenegas
Member

Any update on this?

@jcvenegas
Member

btw it would be good to have the steps described in this PR documented somewhere.

cat arch/x86/configs/kata_containers_common_defconfig arch/x86/configs/kata_containers_kvm_defconfig > arch/x86/configs/kata_containers_defconfig
make ARCH=x86 kata_containers_defconfig
make -j8

@jodh-intel

tbh, I think we could start by simply having the kernel config in the packaging area, as we did for Clear Containers:

@WeiZhang555
Member

I am wondering whether it's better to put the kernel config in the kernel repo, and the qemu config in the qemu repo. This would allow us to compile our own kernel/qemu without the packaging repo.
Separating the kernel repo from the kernel config can confuse developers.

@jodh-intel

I can understand that viewpoint too. But I think as long as we clearly document the process, either would work. It's worth pointing out that we already have precedent for the hypervisor though:

- https://github.com/kata-containers/packaging/blob/master/scripts/configure-hypervisor.sh

@gnawux
Member

gnawux commented Apr 19, 2018

@WeiZhang555

> I am wondering whether it's better to put the kernel config in the kernel repo, and the qemu config in the qemu repo. This would allow us to compile our own kernel/qemu without the packaging repo.

That's reasonable. However, as linux and qemu are forked from upstream, we'd better keep anything that is not intended to go upstream outside the repo. A packaging script works too, as mentioned by @jodh-intel.

@WeiZhang555
Member

@gnawux @jodh-intel That's fine. I believe we can eliminate the confusion by providing some good documentation 😄

@devimc devimc force-pushed the config/kata_defconfig branch from f4e94c0 to 1355353 Compare April 19, 2018 20:26
@jcvenegas
Member

@grahamwhaley @jodh-intel @gnawux @laijs what about the proposed PR by @devimc in kata-containers/packaging#18

@jodh-intel

Thanks for the pointer @jcvenegas.

@devimc - do we need to add the do-not-merge label to this PR then?

@devimc devimc closed this Apr 20, 2018