[PM/Unswitch] Teach SimpleLoopUnswitch to do non-trivial unswitching,

making it no longer even remotely simple.

The pass will now be more of a "full loop unswitching" pass rather than
anything substantively simpler than any other approach. I plan to rename
it accordingly once the dust settles.

The key ideas of the new loop unswitcher are carried over for
non-trivial unswitching:
1) Fully unswitch a branch or switch instruction from inside of a loop
   to outside of it.
2) Update the CFG and IR. This avoids needing to "remember" the
   unswitched branches, and also avoids excessive cloning and reliance
   on complex parts of simplify-cfg to clean up the CFG.
3) Update the analyses (where we can) rather than just blowing them away
   or relying on something else updating them.

Sadly, #3 is somewhat compromised here as the dominator tree updates
were too complex for me to want to reason about. I will need to make
another attempt to do this now that we have a nice dynamic update API
for dominators. However, we do adhere to #3 w.r.t. LoopInfo.

This approach also adds an important principle specific to non-trivial
unswitching: not *all* of the loop will be duplicated when unswitching.
This fact allows us to compute the cost in terms of how much *duplicate*
code is inserted rather than just on raw size. Unswitching conditions
which essentially partition loops will work regardless of the total loop
size.

Some remaining issues that I will be addressing in subsequent commits:
- Handling unstructured control flow.
- Unswitching 'switch' cases instead of just branches.
- Moving to the dynamic update API for dominators.

Some high-level, interesting limitations that folks might want to push
on as follow-ups but that I don't have any immediate plans around:
- We could be much more clever about not cloning things that will be
  deleted. In fact, we should be able to delete *nothing* and do
  a minimal number of clones.
- There are many more interesting selection criteria for which branch to
  unswitch that we might want to look at. One that I'm particularly
  interested in is a set of conditions which all exit the loop and
  which can be merged into a single unswitched test of them.

Differential revision: https://reviews.llvm.org/D34200
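
As a source-level illustration of the "duplicate code" cost idea, here is a minimal hand-written sketch. It is not what the pass emits (the pass works on LLVM IR), and the names `sumOriginal` and `sumUnswitched` are hypothetical. The loop-invariant test on `scale` is hoisted out of the loop, and each side gets its own copy of the loop:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Original loop: `scale` never changes inside the loop, yet it is tested
// on every iteration.
long sumOriginal(const std::vector<int> &data, bool scale) {
  long total = 0;
  for (std::size_t i = 0; i < data.size(); ++i) {
    long v = data[i];
    if (scale)   // loop-invariant condition
      v *= 3;    // needed only when `scale` is true
    total += v;  // common code, needed on both sides
  }
  return total;
}

// Hand-unswitched form: the test runs once, outside the loop.
long sumUnswitched(const std::vector<int> &data, bool scale) {
  long total = 0;
  if (scale) {
    // Clone for the `true` side: multiply plus accumulate.
    for (std::size_t i = 0; i < data.size(); ++i)
      total += 3L * data[i];
  } else {
    // Clone for the `false` side: the multiply is absent, not duplicated.
    for (std::size_t i = 0; i < data.size(); ++i)
      total += data[i];
  }
  return total;
}
```

In this toy case only the accumulation and the loop control appear twice. The multiply lives in exactly one clone, which is why counting duplicated code can give a much smaller cost than counting the whole loop body.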
pcc · Nov 17, 2017 · f2f758d
1 parent fde947a
commit f2f758d
Showing 9 changed files with 4,372 additions and 100 deletions.
diff --git a/llvm/include/llvm/Analysis/LoopInfo.h b/llvm/include/llvm/Analysis/LoopInfo.h
@@ -146,11 +146,11 @@ template <class BlockT, class LoopT> class LoopBase {
   bool empty() const { return getSubLoops().empty(); }
 
   /// Get a list of the basic blocks which make up this loop.
-  const std::vector<BlockT *> &getBlocks() const {
+  ArrayRef<BlockT *> getBlocks() const {
     assert(!isInvalid() && "Loop not in a valid state!");
     return Blocks;
   }
-  typedef typename std::vector<BlockT *>::const_iterator block_iterator;
+  typedef typename ArrayRef<BlockT *>::const_iterator block_iterator;
   block_iterator block_begin() const { return getBlocks().begin(); }
   block_iterator block_end() const { return getBlocks().end(); }
   inline iterator_range<block_iterator> blocks() const {
@@ -165,6 +165,19 @@ template <class BlockT, class LoopT> class LoopBase {
     return Blocks.size();
   }
 
+  /// Return a direct, mutable handle to the blocks vector so that we can
+  /// mutate it efficiently with techniques like `std::remove`.
+  std::vector<BlockT *> &getBlocksVector() {
+    assert(!isInvalid() && "Loop not in a valid state!");
+    return Blocks;
+  }
+  /// Return a direct, mutable handle to the blocks set so that we can
+  /// mutate it efficiently.
+  SmallPtrSetImpl<const BlockT *> &getBlocksSet() {
+    assert(!isInvalid() && "Loop not in a valid state!");
+    return DenseBlockSet;
+  }
+
   /// Return true if this loop is no longer valid.  The only valid use of this
   /// helper is "assert(L.isInvalid())" or equivalent, since IsInvalid is set to
   /// true by the destructor.  In other words, if this accessor returns true,
@@ -314,6 +327,12 @@ template <class BlockT, class LoopT> class LoopBase {
     return Child;
   }
 
+  /// This removes the specified child from being a subloop of this loop. The
+  /// loop is not deleted, as it will presumably be inserted into another loop.
+  LoopT *removeChildLoop(LoopT *Child) {
+    return removeChildLoop(llvm::find(*this, Child));
+  }
+
   /// This adds a basic block directly to the basic block list.
   /// This should only be used by transformations that create new loops.  Other
   /// transformations should use addBasicBlockToLoop.
@@ -744,9 +763,16 @@ template <class BlockT, class LoopT> class LoopInfoBase {
 
   void verify(const DominatorTreeBase<BlockT, false> &DomTree) const;
 
-protected:
-  // Calls the destructor for \p L but keeps the memory for \p L around so that
-  // the pointer value does not get re-used.
+  /// Destroy a loop that has been removed from the `LoopInfo` nest.
+  ///
+  /// This runs the destructor of the loop object making it invalid to
+  /// reference afterward. The memory is retained so that the *pointer* to the
+  /// loop remains valid.
+  ///
+  /// The caller is responsible for removing this loop from the loop nest and
+  /// otherwise disconnecting it from the broader `LoopInfo` data structures.
+  /// Callers that don't naturally handle this themselves should probably call
+  /// `erase` instead.
   void destroy(LoopT *L) {
     L->~LoopT();
 

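The new mutable accessors and the pointer-based `removeChildLoop` overload suggest a usage pattern along the following lines. This is a standalone sketch under stated assumptions: `ToyLoop` and `removeBlocks` are hypothetical stand-ins rather than LLVM's `LoopBase`, blocks are plain integers, and `std::remove_if` is used where the doc comment mentions `std::remove`.

```cpp
#include <algorithm>
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a loop's bookkeeping; not LLVM's LoopBase.
struct ToyLoop {
  std::vector<int> Blocks;           // ordered block list
  std::unordered_set<int> BlockSet;  // fast membership test
  std::vector<ToyLoop *> SubLoops;   // direct children

  // Erase every block in `Dead` from both containers. The vector is
  // compacted in one linear pass with the erase-remove idiom, which needs
  // mutable access to the underlying container.
  void removeBlocks(const std::unordered_set<int> &Dead) {
    Blocks.erase(std::remove_if(Blocks.begin(), Blocks.end(),
                                [&](int B) { return Dead.count(B) != 0; }),
                 Blocks.end());
    for (int B : Dead)
      BlockSet.erase(B);
  }

  // Detach a child identified by pointer instead of by iterator. The child
  // object stays alive so the caller can re-parent it elsewhere.
  ToyLoop *removeChildLoop(ToyLoop *Child) {
    auto It = std::find(SubLoops.begin(), SubLoops.end(), Child);
    assert(It != SubLoops.end() && "not a child of this loop");
    SubLoops.erase(It);
    return Child;
  }
};
```

A read-only view such as `ArrayRef` cannot support this kind of in-place compaction, which is presumably why the mutable handles are exposed as separate accessors.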
diff --git a/llvm/include/llvm/Analysis/LoopInfoImpl.h b/llvm/include/llvm/Analysis/LoopInfoImpl.h
@@ -400,7 +400,7 @@ static void discoverAndMapSubloop(LoopT *L, ArrayRef<BlockT *> Backedges,
       // Discover a subloop of this loop.
       Subloop->setParentLoop(L);
       ++NumSubloops;
-      NumBlocks += Subloop->getBlocks().capacity();
+      NumBlocks += Subloop->getBlocksVector().capacity();
       PredBB = Subloop->getHeader();
       // Continue traversal along predecessors that are not loop-back edges from
       // within this subloop tree itself. Note that a predecessor may directly

diff --git a/llvm/include/llvm/Transforms/Scalar/LoopPassManager.h b/llvm/include/llvm/Transforms/Scalar/LoopPassManager.h
@@ -217,6 +217,19 @@ class LPMUpdater {
     // shouldn't impact anything.
   }
 
+  /// Restart the current loop.
+  ///
+  /// Loop passes should call this method to indicate the current loop has been
+  /// sufficiently changed that it should be re-visited from the beginning of
+  /// the loop pass pipeline rather than continuing.
+  void revisitCurrentLoop() {
+    // Tell the currently in-flight pipeline to stop running.
+    SkipCurrentLoop = true;
+
+    // And insert ourselves back into the worklist.
+    Worklist.insert(CurrentL);
+  }
+
 private:
   template <typename LoopPassT> friend class llvm::FunctionToLoopPassAdaptor;
 

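The effect of `revisitCurrentLoop()` on a pass pipeline can be sketched with a toy driver. Everything here is an assumption made for illustration: `ToyUpdater`, `ToyPass`, and `runPipeline` are hypothetical, loops are plain integers, and a `std::deque` stands in for the real worklist.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature of the updater; only the two steps shown in the
// diff are modeled.
struct ToyUpdater {
  std::deque<int> Worklist;
  int CurrentLoop = -1;
  bool SkipCurrentLoop = false;

  void revisitCurrentLoop() {
    SkipCurrentLoop = true;           // stop the passes still pending
    Worklist.push_back(CurrentLoop);  // queue the loop for a fresh visit
  }
};

using ToyPass =
    std::function<void(int, ToyUpdater &, std::vector<std::string> &)>;

// Run every pass over every queued loop, restarting a loop from the first
// pass whenever a pass asks for a revisit.
std::vector<std::string> runPipeline(std::deque<int> Loops,
                                     const std::vector<ToyPass> &Passes) {
  ToyUpdater U;
  U.Worklist = std::move(Loops);
  std::vector<std::string> Log;
  while (!U.Worklist.empty()) {
    U.CurrentLoop = U.Worklist.front();
    U.Worklist.pop_front();
    U.SkipCurrentLoop = false;
    for (const ToyPass &P : Passes) {
      P(U.CurrentLoop, U, Log);
      if (U.SkipCurrentLoop)
        break;  // abandon the rest of the pipeline for this visit
    }
  }
  return Log;
}
```

If the first of two passes requests a revisit on its first run only, the second pass is skipped for that visit and both passes run on the second visit. A pass that requests a revisit unconditionally would loop forever in this sketch, so the request should be tied to an actual change.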
diff --git a/llvm/include/llvm/Transforms/Scalar/SimpleLoopUnswitch.h b/llvm/include/llvm/Transforms/Scalar/SimpleLoopUnswitch.h
@@ -36,8 +36,10 @@ namespace llvm {
 /// of the loop, to make the unswitching opportunity obvious.
 ///
 class SimpleLoopUnswitchPass : public PassInfoMixin<SimpleLoopUnswitchPass> {
+  bool NonTrivial;
+
 public:
-  SimpleLoopUnswitchPass() = default;
+  SimpleLoopUnswitchPass(bool NonTrivial = false) : NonTrivial(NonTrivial) {}
 
   PreservedAnalyses run(Loop &L, LoopAnalysisManager &AM,
                         LoopStandardAnalysisResults &AR, LPMUpdater &U);
@@ -46,7 +48,7 @@ class SimpleLoopUnswitchPass : public PassInfoMixin<SimpleLoopUnswitchPass> {
 /// Create the legacy pass object for the simple loop unswitcher.
 ///
 /// See the documentation for `SimpleLoopUnswitchPass` for details.
-Pass *createSimpleLoopUnswitchLegacyPass();
+Pass *createSimpleLoopUnswitchLegacyPass(bool NonTrivial = false);
 
 } // end namespace llvm
 

diff --git a/llvm/lib/Analysis/LoopPass.cpp b/llvm/lib/Analysis/LoopPass.cpp
@@ -46,8 +46,7 @@ class PrintLoopPassWrapper : public LoopPass {
   }
 
   bool runOnLoop(Loop *L, LPPassManager &) override {
-    auto BBI = find_if(L->blocks().begin(), L->blocks().end(),
-                       [](BasicBlock *BB) { return BB; });
+    auto BBI = llvm::find_if(L->blocks(), [](BasicBlock *BB) { return BB; });
     if (BBI != L->blocks().end() &&
         isFunctionInPrintList((*BBI)->getParent()->getName())) {
       printLoop(*L, OS, Banner);