[WIP] LLVM-based Locality Optimization for Chapel programs #9

Open
wants to merge 24 commits into
base: master

Conversation

ahayashi

Summary

An initial version of an LLVM-based Locality Inference Pass (Locality Optimization Pass). At compile time, this pass tries to convert as many possibly-remote accesses (addrspace(100)* accesses) as possible into definitely-local accesses, which avoids the overhead of runtime affinity checks.

What I would expect

I would appreciate it if Chapel folks could give me some feedback, particularly on:

  • Detection of Chapel Runtime API calls

    The current implementation tries to recover local statements, Chapel array construction, and Chapel array accesses, since such information is lost by the time LLVM IR is generated. The recovery process can easily fail because it depends entirely on how the Chapel-LLVM frontend emits LLVM IR. I'd suggest that the frontend add annotations and/or attributes so that an LLVM-based PGAS optimization can easily recognize the high-level information. This would also help make the pass language-agnostic.

  • Coding style

    If there is a preferred coding style for this repository, please let me know.

Any other feedback is certainly appreciated! For more details of the locality optimization pass, please see below.

How it works

To infer locality, the locality optimization pass tries to use the following information (please also see test/local.ll):

Case 1. Scalar access enclosed by Chapel's LOCAL statement

proc localizeByLocalStmt(ref x) : int {
    var p: int = 1;
    local { p = x; }
    return p + x; // x is definitely local
}

The locality level of x is inferred by searching an SSA value graph, which is implemented in IGraph.[h|cpp]. When you specify debugThisFn, the pass generates a .dot file that can be visualized with Graphviz (http://www.graphviz.org/).
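To illustrate the idea of searching a value graph, here is a minimal standalone sketch. It is not the IGraph API: `ValueGraph`, `localSeeds`, `isDefinitelyLocal`, and `toDot` are hypothetical names, values are plain strings rather than llvm::Value pointers, and control flow is ignored. A value is treated as definitely local when it is connected to a value that a local statement has proven local.

```cpp
#include <cassert>
#include <map>
#include <queue>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the pass's value graph (not the real IGraph).
struct ValueGraph {
    // value -> values assumed to share its locality (undirected edges)
    std::map<std::string, std::vector<std::string> > edges;
    // values proven local, e.g. by a local statement
    std::set<std::string> localSeeds;

    void addEdge(const std::string& a, const std::string& b) {
        edges[a].push_back(b);
        edges[b].push_back(a);
    }

    // Breadth-first search: is `v` connected to any local seed?
    bool isDefinitelyLocal(const std::string& v) const {
        std::set<std::string> seen;
        seen.insert(v);
        std::queue<std::string> work;
        work.push(v);
        while (!work.empty()) {
            std::string cur = work.front();
            work.pop();
            if (localSeeds.count(cur)) return true;
            auto it = edges.find(cur);
            if (it == edges.end()) continue;
            for (const std::string& next : it->second)
                if (seen.insert(next).second) work.push(next);
        }
        return false;
    }

    // Emit the graph in Graphviz DOT syntax; local seeds are drawn filled.
    std::string toDot() const {
        std::ostringstream out;
        out << "graph locality {\n";
        for (const auto& kv : edges)
            for (const std::string& to : kv.second)
                if (kv.first < to)  // print each undirected edge once
                    out << "  \"" << kv.first << "\" -- \"" << to << "\";\n";
        for (const std::string& s : localSeeds)
            out << "  \"" << s << "\" [style=filled];\n";
        out << "}\n";
        return out.str();
    }
};
```

Writing the string returned by `toDot()` to a file and running `dot -Tpng` on it gives a picture comparable in spirit to the debug output described above.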

Case 2. Array access enclosed by Chapel's LOCAL statement

proc habanero(A) : int {
    A(1) = 1; // A(1) is definitely local
    local { A(1) = 2; }
    A(2) = 3; // A(2) is possibly remote
}

The locality optimization pass is element-sensitive. For example, A(1) is inferred to be definitely-local, but the pass leaves A(2) possibly-remote since there is not enough information about the locality of A(2). This is done by combining an array offset analysis with a reduced version of LLVM's global value numbering pass, which assigns a value number to variables and expressions (in ValueTable.[h|cpp]).
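As a rough sketch of what element sensitivity means here, the following standalone code keys locality facts on a pair of (value number of the array, constant element offset). It is a simplification under stated assumptions: the class names and methods are hypothetical, expressions are canonicalized as strings, and only constant offsets are handled. It does not reproduce the code in ValueTable.[h|cpp].

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>

// Hypothetical, simplified value table: equal expressions get equal numbers.
class ValueTable {
    std::map<std::string, unsigned> numbers_;
public:
    unsigned lookupOrAdd(const std::string& expr) {
        auto it = numbers_.find(expr);
        if (it != numbers_.end()) return it->second;
        unsigned n = static_cast<unsigned>(numbers_.size()) + 1;
        numbers_[expr] = n;
        return n;
    }
};

// Records which (array, element offset) pairs are known to be local.
class ElementLocality {
    ValueTable vt_;
    std::set<std::pair<unsigned, long> > local_;
public:
    // Called when a local statement touches array[offset].
    void markLocal(const std::string& array, long offset) {
        local_.insert(std::make_pair(vt_.lookupOrAdd(array), offset));
    }
    // True only for elements that were explicitly proven local.
    bool isDefinitelyLocal(const std::string& array, long offset) {
        return local_.count(std::make_pair(vt_.lookupOrAdd(array), offset)) > 0;
    }
};
```

With this model, marking ("A", 1) as local leaves ("A", 2) possibly-remote, which mirrors the habanero example above.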

Case 3. Locale-local array declaration

proc localizeByArrayDecl () {
    var A: [1..10] int;
    return A(5);
}

The locality of A(5) is definitely-local since the array A is declared in this scope. Note that, in this case, the pass is not element-sensitive so far.

Limitations

Chapel's Local statement Detection

Currently, we assume that gf.addr function calls correspond to Chapel's local statements, but this is not always true, because gf.addr is also used to extract a local pointer from a wide pointer. To work around this, we keep a std::vector named "NonLocals" that records each gf.addr return value that is also an argument of gf.make, and NonLocals is consulted during "exemptionTest". This heuristic may not always be correct. Ideally, a PGAS-LLVM frontend should tell the locality optimization pass which gf.addr call is a local statement.

Example :

call i64* @.gf.addr.1(i64 addrspace(100)* %x)       ; %x is definitely local
%y = call i64* @.gf.addr.1(i64 addrspace(100)* %x)  ; might not be definitely local
call i64 addrspace(100)* @.gf.make.1(..., %y)
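The exemption idea can be sketched on a simplified model of the calls above. This is an illustration only: the `Call` record and the helper functions are hypothetical, matching on callee-name prefixes is an assumption, and the real pass works on LLVM call instructions (see exemptionTest in llvmLocalityOptimization.cpp).

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified call record; the real pass inspects LLVM IR.
struct Call {
    std::string callee;             // e.g. ".gf.addr.1"
    std::string result;             // SSA name of the return value, "" if unused
    std::vector<std::string> args;  // SSA names of the arguments
};

static bool startsWith(const std::string& s, const std::string& prefix) {
    return s.compare(0, prefix.size(), prefix) == 0;
}

// Collect gf.addr return values that are also passed to gf.make.
std::vector<std::string> collectNonLocals(const std::vector<Call>& calls) {
    std::vector<std::string> addrResults, nonLocals;
    for (const Call& c : calls)
        if (startsWith(c.callee, ".gf.addr") && !c.result.empty())
            addrResults.push_back(c.result);
    for (const Call& c : calls) {
        if (!startsWith(c.callee, ".gf.make")) continue;
        for (const std::string& a : c.args)
            if (std::find(addrResults.begin(), addrResults.end(), a) !=
                addrResults.end())
                nonLocals.push_back(a);
    }
    return nonLocals;
}

// A gf.addr call is treated as a local statement unless it is exempted.
bool isLocalStatement(const Call& c, const std::vector<std::string>& nonLocals) {
    return startsWith(c.callee, ".gf.addr") &&
           std::find(nonLocals.begin(), nonLocals.end(), c.result) ==
               nonLocals.end();
}
```

Applied to the three calls in the example, only the first gf.addr call is kept as a local statement; the second is exempted because %y flows into gf.make.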

Chapel's Array Declaration Detection

We basically look for calls to chpl__convertRuntimeTypeToValue to detect Chapel array declarations. Please see analyzeCallInsn for more details.

TODOs and future work

The utilization of high-level information

The locality optimization pass has to recover high-level information, such as array accesses and local statements, from low-level LLVM IR. Ideally, a PGAS-LLVM frontend would add annotations that preserve this information, so that the locality optimization can easily recognize it and perform language-agnostic PGAS optimization.

Locality Inference considering if statements

The current implementation does not propagate a condition even if a local statement is enclosed by an if statement. Hence, we may fail to infer the locality in some cases like this:
if (condition) { local { p = x; } }

Make it inter-procedural pass

An inter-procedural pass could convert more possibly-remote accesses into definitely-local accesses.

More experiments with the latest version of the Chapel compiler (1.12.0)

I have mainly been working with the Chapel compiler 1.9.0. I still need to check whether the locality optimization pass works with the latest version.

Function* f = call->getCalledFunction();
if (f != NULL) {
/* *
* We are assuming that gf.addr function calls correspond to Chapel's local statements, but this is not always true because gf.addr is also used to extract a local pointer from a wide pointer. We work on this later pass (see exemptionTest in llvmLocalityOptimization.cpp).
Member

Could you split this comment line into multiple lines < 80 characters?

To infer the locality of each possibly-remote data access while considering the CFG, the inequality graph should be constructed accordingly. Since the original LLVM IR does not have any information on the def/use of locality (e.g. a local statement in Chapel), we first need to construct our own CFG that carries such information. Next, locality-SSA can be built in the same way that classic compilers build SSA, and finally the inequality graph can easily be constructed.

Here are the steps:
(Step 1: This commit) Construct our own CFG by recognizing def/use of locality (e.g. calling @.gf.addr and then doing load and store => local statement)
(Step 2: Locality-SSA Construction 1 TODO) Compute Dominator Tree and Dominance Frontier of the CFG
(Step 3: Locality-SSA Construction 2 TODO) Insert Phi-function
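Steps 2 and 3 are still TODO in this commit. As a rough illustration of what Step 2 has to compute, here is a standalone sketch of the textbook iterative dominator computation and the dominance frontier derived from it. It is not code from this pull request: the names `Preds`, `computeDominators`, and `computeFrontiers` are hypothetical, blocks are plain integers with block 0 as the entry, and every block is assumed to be reachable from the entry.

```cpp
#include <cassert>
#include <set>
#include <vector>

// preds[b] = predecessors of block b; block 0 is the entry block.
typedef std::vector<std::vector<int> > Preds;

// dom[b] = set of blocks that dominate b (including b itself).
std::vector<std::set<int> > computeDominators(const Preds& preds) {
    const int n = static_cast<int>(preds.size());
    std::set<int> all;
    for (int b = 0; b < n; ++b) all.insert(b);
    std::vector<std::set<int> > dom(n, all);
    if (n == 0) return dom;
    dom[0].clear();
    dom[0].insert(0);
    bool changed = true;
    while (changed) {  // iterate to a fixed point
        changed = false;
        for (int b = 1; b < n; ++b) {
            // Intersect the dominator sets of all predecessors, then add b.
            std::set<int> next = all;
            for (int p : preds[b]) {
                std::set<int> tmp;
                for (int d : next)
                    if (dom[p].count(d)) tmp.insert(d);
                next.swap(tmp);
            }
            next.insert(b);
            if (next != dom[b]) { dom[b] = next; changed = true; }
        }
    }
    return dom;
}

// DF(x) = blocks y such that x dominates a predecessor of y
// but does not strictly dominate y.
std::vector<std::set<int> > computeFrontiers(
        const Preds& preds, const std::vector<std::set<int> >& dom) {
    const int n = static_cast<int>(preds.size());
    std::vector<std::set<int> > df(n);
    for (int y = 0; y < n; ++y)
        for (int p : preds[y])
            for (int x : dom[p])
                if (x == y || !dom[y].count(x)) df[x].insert(y);
    return df;
}
```

For a diamond-shaped CFG (an if statement with a join block), the join block lands in the frontier of both branches, which is where Step 3 would insert a Phi-function for a locality definition made in only one branch.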
@ahayashi
Author

ahayashi commented Mar 5, 2016

Hi, Michael,

Thank you very much for the valuable comments; they are very helpful.
Sorry for the delay, but I've just committed an updated version.

My changes include:

  1. Wrap lines to 80 characters
  2. Add comments as much as possible
  3. Consider control flow for completeness and robustness.
  4. Refactor the inequality graph implementation to make the pass language-agnostic.

I've just sent you an email to give you more details including things that I would like to keep confidential for now.

Please let me know if there is anything I missed.

Thanks,

Akihiro
