Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x86-64 PCID and PMU Support #446

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

evanstella
Copy link
Contributor

Summary of this Pull Request (PR)

This PR is for an initial review of PCID support. This is not ready to be pulled but will need to be integrated with the namespace implementation that is being developed right now.

This PR adds support for PCID, a kernel API for using PCID for an ASID implementation, as well as PMU programming support for Intel's hardware PMU.

Particularly, I'd like your opinion on:

  • the kernel ASID API and whether it makes sense
  • The updated context switch code (chal_pgtbl_update())
  • (less important) The PMU programming API and whether this is actually an addition we would want to add to the kernel API

Intent for your PR

Choose one (Mandatory):

  • This PR is for a code-review and is intended to get feedback, but not to be pulled yet.
  • This PR is mature, and ready to be integrated into the repo.

Reviewers (Mandatory):

@gparmer

Code Quality

As part of this pull request, I've considered the following:

Style:

  • Comments adhere to the Style Guide (SG)
  • Spacing adhere's to the SG
  • Naming adhere's to the SG
  • All other aspects of the SG are adhered to, or exceptions are justified in this pull request
  • I have run the auto formatter on my code before submitting this PR (see doc/auto_formatter.md for instructions)

Code Craftsmanship:

  • I've made an attempt to remove all redundant code
  • I've considered ways in which my changes might impact existing code, and cleaned it up
  • I've formatted the code in an effort to make it easier to read (proper error handling, function use, etc...)
  • I've commented appropriately where code is tricky
  • I agree that there is no "throw-away" code, and that code in this PR is of high quality

Testing

I've tested the code using the following test programs (provide list here):

  • micro_booter
  • unit_pingpong
  • unit_schedtests
  • All test outlined in the x86-64 patch

Copy link
Collaborator

@gparmer gparmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks mostly great. Some comments that are mainly stream of thought.

@@ -0,0 +1,36 @@
[system]
description = "Simplest system with both capability manager and scheduler to test shared memory implementation"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

focus is now on PMU testing, right?

@@ -514,6 +514,7 @@ cos_init(void)
{
booter_init();
comps_init();

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does your editor auto-insert blank lines at the end of functions? Very strange behavior.

@@ -19,6 +19,16 @@ char *pong_test_strings[] = {

shm_bm_t shm;

static unsigned long
rdpmc (unsigned long cntr)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no space between fn name and (.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the question is: where should we put this function?

We'll need to consider that it will need to have a generic version (that just returns 0) for architectures that don't support it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, need to figure out what a higher level abstraction would look like

cos_pmu_enable_fixed_counter(BOOT_CAPTBL_SELF_INITHW_BASE, 0);
cos_pmu_enable_fixed_counter(BOOT_CAPTBL_SELF_INITHW_BASE, 1);
/* enable architecture specific counter events (reference https://perfmon-events.intel.com/) */
cos_pmu_program_event_counter(BOOT_CAPTBL_SELF_INITHW_BASE, 0, 0x49, 0x0E);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the power and flexibility of this -- the direct programming of the PMUs. For that reason, we likely want this implementation.

I don't like how this can't really be fit into a capabiliyt-based system were we can delegate in a controlled fashion out the ability to program a subset of the capabilities, including limiting the # of counters set (since the hardware has a limit). A problem for a future us.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thats actually how KVM does it from my understanding; they delegate a subset of the counters that the guest can use.

char *buf;
int i;

buf = (char *)memmgr_heap_page_allocn(NUM_PAGES);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check return value, at least with assert.

@@ -120,7 +121,7 @@ chal_cpu_init(void)
#if defined(__x86_64__)
u32_t low = 0, high = 0;

chal_cpu_cr4_set(cr4 | CR4_PSE | CR4_PGE);
chal_cpu_cr4_set(cr4 | CR4_PSE | CR4_PGE | CR4_PCE | CR4_PCIDE);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check in CPUID if the feature exists on the processor? Bomb out if it doesnt?


#if defined(__i386__)
/* x86-32 does not support PCID :( */
if (asid != 0) return -EINVAL;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe it is better to just proceed. x86-32 can pass in any asids it wants. The system will still be functionally correct. So it might be better to simply have this code for 32 be the same. In the kernel, we just ignore the value.

We don't want to have to maintain differnet client code for 32 bit vs. 64.

extern u8_t processor_num_pmc;

static inline void
wrmsr(u32_t msr_addr, u32_t in_lo, u32_t in_hi)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these already in chal_cpu?

/* fixed counter enable is in the high dword */
perf_global_ctrl_hi |= 1 << cntr;
/* enable OS and USR mode counting of event, maybe this should be made optional? */
perf_fixed_ctrl_lo |= (IA32_FIXED_CTR_CTRL_USR_EN | IA32_FIXED_CTR_CTRL_OS_EN) << (cntr * 4);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be useful to be able to watch only kernel mode events for microoptimization in the kernel. But I don't like having to be considered in the API. What you have here is probably sufficient for now.

asm volatile("invpcid %0, %1" : : "m"(desc), "r"(type) : "memory");
}

static inline unsigned long
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see where this and __write_cr3 are used. __write* I can see being used to switch page-tables. but __readcr3?

@gparmer
Copy link
Collaborator

gparmer commented Mar 10, 2022

You do need to add logic into pgtbl_deactivate to remove the asid from the per-cpu table. We need to discuss this as we might be deactivating on a different core than that on which the component executed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants