Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle failure on context switch to #1

Open
andr2000 opened this issue Feb 1, 2017 · 1 comment
Open

Handle failure on context switch to #1

andr2000 opened this issue Feb 1, 2017 · 1 comment
Assignees

Comments

@andr2000
Copy link
Owner

andr2000 commented Feb 1, 2017

1.1. General fault vcoproc/coproc handler
1.2. Do not panic on ctx_switch_to failure
1.3. Add VCOPROC_GEN_FAULT state
1.4. Remove failed coproc from the scheduler's list
1.5. Add emulated IO handlers for the state of failure

@andr2000 andr2000 self-assigned this Feb 1, 2017
@aanisov
Copy link

aanisov commented Feb 1, 2017

1.1. +1
1.2. but handle fault in an appropriate way
1.3. +1
1.4. ... and stop its scheduler
1.5. This should be handled by platform code

andr2000 pushed a commit that referenced this issue Jun 12, 2017
When we concurrently try to unload and load crash
images we eventually get:

 Xen call trace:
    [<ffff82d08018b04f>] machine_kexec_add_page+0x3a0/0x3fa
    [<ffff82d08018b184>] machine_kexec_load+0xdb/0x107
    [<ffff82d080116e8d>] kexec.c#kexec_load_slot+0x11/0x42
    [<ffff82d08011724f>] kexec.c#kexec_load+0x119/0x150
    [<ffff82d080117c1e>] kexec.c#do_kexec_op_internal+0xab/0xcf
    [<ffff82d080117c60>] do_kexec_op+0xe/0x1e
    [<ffff82d08025c620>] pv_hypercall+0x20a/0x44a
    [<ffff82d080260116>] cpufreq.c#test_all_events+0/0x30

 Pagetable walk from ffff820040088320:
  L4[0x104] = 00000002979d1063 ffffffffffffffff
  L3[0x001] = 00000002979d0063 ffffffffffffffff
  L2[0x000] = 00000002979c7063 ffffffffffffffff
  L1[0x088] = 80037a91ede97063 ffffffffffffffff

The interesting thing is that the page bits (063) look legit.

The operation on which we blow up is us trying to write
in the L1 and finding that the L2 entry points to some
bizzare MFN. It stinks of a race, and it looks like
the issue is due to no concurrency locks when dealing
with the crash kernel space.

Specifically we concurrently call kimage_alloc_crash_control_page
which iterates over the kexec_crash_area.start -> kexec_crash_area.size
and once found:

  if ( page )
  {
      image->next_crash_page = hole_end;
      clear_domain_page(_mfn(page_to_mfn(page)));
  }

clears. Since the parameters of what MFN to use are provided
by the callers (and the area to search is bounded) the the 'page'
is probably the same. So #1 we concurrently clear the
'control_code_page'.

The next step is us passing this 'control_code_page' to
machine_kexec_add_page. This function requires the MFNs:
page_to_maddr(image->control_code_page).

And this would always return the same virtual address, as
the MFN of the control_code_page is inside of the
kexec_crash_area.start -> kexec_crash_area.size area.

Then machine_kexec_add_page updates the L1 .. which can be done
concurrently and on subsequent calls we mangle it up.

This is all a theory at this time, but testing reveals
that adding the hypercall_create_continuation() at the
kexec hypercall fixes the crash.

NOTE: This patch follows 5c5216 (kexec: clear kexec_image slot
when unloading kexec image) to prevent crashes during
simultaneous load/unloads.

NOTE: Consideration was given to using the existing flag
KEXEC_FLAG_IN_PROGRESS to denote a kexec hypercall in
progress. This, however, overloads the original intent of
the flag which is to denote that we are about-to/have made
the jump to the crash path. The overloading would lead to
failures in existing checks on this flag as the flag would
always be set at the top level in do_kexec_op_internal().
For this reason, the new flag KEXEC_FLAG_HC_IN_PROGRESS
was introduced.

While at it, fixed the #define mismatched spacing

Signed-off-by: Eric DeVolder <[email protected]>
Reviewed-by: Bhavesh Davda <[email protected]>
Reviewed-by: Konrad Rzeszutek Wilk <[email protected]>
Reviewed-by: Jan Beulich <[email protected]>
Reviewed-by: Andrew Cooper <[email protected]>
Reviewed-by: Daniel Kiper <[email protected]>
Release-acked-by: Julien Grall <[email protected]>
andr2000 pushed a commit that referenced this issue Oct 18, 2021
ASAN reported one issue when Live Updating Xenstored:

=================================================================
==873==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffc194f53e0 at pc 0x555c6b323292 bp 0x7ffc194f5340 sp 0x7ffc194f5338
WRITE of size 1 at 0x7ffc194f53e0 thread T0
    #0 0x555c6b323291 in dump_state_node_perms xen/tools/xenstore/xenstored_core.c:2468
    #1 0x555c6b32746e in dump_state_special_node xen/tools/xenstore/xenstored_domain.c:1257
    #2 0x555c6b32a702 in dump_state_special_nodes xen/tools/xenstore/xenstored_domain.c:1273
    #3 0x555c6b32ddb3 in lu_dump_state xen/tools/xenstore/xenstored_control.c:521
    #4 0x555c6b32e380 in do_lu_start xen/tools/xenstore/xenstored_control.c:660
    #5 0x555c6b31b461 in call_delayed xen/tools/xenstore/xenstored_core.c:278
    #6 0x555c6b32275e in main xen/tools/xenstore/xenstored_core.c:2357
    #7 0x7f95eecf3d09 in __libc_start_main ../csu/libc-start.c:308
    #8 0x555c6b3197e9 in _start (/usr/local/sbin/xenstored+0xc7e9)

Address 0x7ffc194f53e0 is located in stack of thread T0 at offset 80 in frame
    #0 0x555c6b32713e in dump_state_special_node xen/tools/xenstore/xenstored_domain.c:1232

  This frame has 2 object(s):
    [32, 40) 'head' (line 1233)
    [64, 80) 'sn' (line 1234) <== Memory access at offset 80 overflows this variable

This is happening because the callers are passing a pointer to a variable
allocated on the stack. However, the field perms is a dynamic array, so
Xenstored will end up to read outside of the variable.

Rework the code so the permissions are written one by one in the fd.

Fixes: ed6eebf ("tools/xenstore: dump the xenstore state for live update")
Signed-off-by: Julien Grall <[email protected]>
Reviewed-by: Juergen Gross <[email protected]>
Reviewed-by: Luca Fancellu <[email protected]>
andr2000 pushed a commit that referenced this issue Feb 1, 2022
…ning NULL

If we are in libxl_list_vcpu() and we are returning NULL, let's avoid
touching the output parameter *nr_vcpus_out, which the caller should
have initialized to 0.

The current behavior could be problematic if are creating a domain and,
in the meantime, an existing one is destroyed when we have already done
some steps of the loop. At which point, we'd return a NULL list of vcpus
but with something different than 0 as the number of vcpus in that list.
And this can cause troubles in the callers (e.g., nr_vcpus_on_nodes()),
when they do a libxl_vcpuinfo_list_free().

Crashes due to this are rare and difficult to reproduce, but have been
observed, with stack traces looking like this one:

#0  libxl_bitmap_dispose (map=map@entry=0x50) at libxl_utils.c:626
#1  0x00007fe72c993a32 in libxl_vcpuinfo_dispose (p=p@entry=0x38) at _libxl_types.c:692
#2  0x00007fe72c94e3c4 in libxl_vcpuinfo_list_free (list=0x0, nr=<optimized out>) at libxl_utils.c:1059
#3  0x00007fe72c9528bf in nr_vcpus_on_nodes (vcpus_on_node=0x7fe71000eb60, suitable_cpumap=0x7fe721df0d38, tinfo_elements=48, tinfo=0x7fe7101b3900, gc=0x7fe7101bbfa0) at libxl_numa.c:258
#4  libxl__get_numa_candidate (gc=gc@entry=0x7fe7100033a0, min_free_memkb=4233216, min_cpus=4, min_nodes=min_nodes@entry=0, max_nodes=max_nodes@entry=0, suitable_cpumap=suitable_cpumap@entry=0x7fe721df0d38, numa_cmpf=0x7fe72c940110 <numa_cmpf>, cndt_out=0x7fe721df0cf0, cndt_found=0x7fe721df0cb4) at libxl_numa.c:394
#5  0x00007fe72c94152b in numa_place_domain (d_config=0x7fe721df11b0, domid=975, gc=0x7fe7100033a0) at libxl_dom.c:209
#6  libxl__build_pre (gc=gc@entry=0x7fe7100033a0, domid=domid@entry=975, d_config=d_config@entry=0x7fe721df11b0, state=state@entry=0x7fe710077700) at libxl_dom.c:436
#7  0x00007fe72c92c4a5 in libxl__domain_build (gc=0x7fe7100033a0, d_config=d_config@entry=0x7fe721df11b0, domid=975, state=0x7fe710077700) at libxl_create.c:444
#8  0x00007fe72c92de8b in domcreate_bootloader_done (egc=0x7fe721df0f60, bl=0x7fe7100778c0, rc=<optimized out>) at libxl_create.c:1222
#9  0x00007fe72c980425 in libxl__bootloader_run (egc=egc@entry=0x7fe721df0f60, bl=bl@entry=0x7fe7100778c0) at libxl_bootloader.c:403
#10 0x00007fe72c92f281 in initiate_domain_create (egc=egc@entry=0x7fe721df0f60, dcs=dcs@entry=0x7fe7100771b0) at libxl_create.c:1159
#11 0x00007fe72c92f456 in do_domain_create (ctx=ctx@entry=0x7fe71001c840, d_config=d_config@entry=0x7fe721df11b0, domid=domid@entry=0x7fe721df10a8, restore_fd=restore_fd@entry=-1, send_back_fd=send_back_fd@entry=-1, params=params@entry=0x0, ao_how=0x0, aop_console_how=0x7fe721df10f0) at libxl_create.c:1856
#12 0x00007fe72c92f776 in libxl_domain_create_new (ctx=0x7fe71001c840, d_config=d_config@entry=0x7fe721df11b0, domid=domid@entry=0x7fe721df10a8, ao_how=ao_how@entry=0x0, aop_console_how=aop_console_how@entry=0x7fe721df10f0) at libxl_create.c:2075

Signed-off-by: Dario Faggioli <[email protected]>
Tested-by: James Fehlig <[email protected]>
Reviewed-by: Anthony PERARD <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants