gpu.rst: animation, graphic-sw-stack.gv and opengl-flow.gv
Jonathan2251 committed Aug 14, 2023
1 parent d5b99a5 commit 183e3c8
Showing 15 changed files with 276 additions and 59 deletions.
Binary file added Fig/gpu/animation.key
Binary file not shown.
Binary file added Fig/gpu/animation.png
Binary file modified Fig/gpu/gpu-driver-role.key
Binary file not shown.
Binary file modified Fig/gpu/gpu-driver-role.png
17 changes: 17 additions & 0 deletions Fig/gpu/graphic-sw-stack.gv
@@ -0,0 +1,17 @@
digraph G {
rankdir=LR;

compound=true;
node [shape=record];
subgraph cluster_cpu {
label = "CPU (Client)";
CPU_SW [label=" 3D Model | JAVA | JOGL | { OpenGL API | Shaders \n (builtin-functions)} | <f1> Driver"];
}
subgraph cluster_gpu {
label = "GPU HW (Server)"
GPU_SW [label="<f1> 3D Rendering-pipeline \ndescribed in next section"];
}
CPU_SW:f1 -> GPU_SW:f1 [label=" Frame data, \n shader-executable-code"];

label = "Graphic SW Stack";
}
35 changes: 35 additions & 0 deletions Fig/gpu/opengl-flow.gv
@@ -0,0 +1,35 @@
digraph G {
rankdir=LR;

compound=true;
node [shape=record];
subgraph cluster_3d {
label = "3D/2D modeling software";
CodeGen [label="code-gen"];
subgraph cluster_code {
label = "Generated Code";
Api [label="<a> OpenGL API | <s> Shaders"];
}
Hand [label="hand-modifying"];
}
subgraph cluster_driver {
label = "Driver"
Compiler [label="On-line Compiler"];
Obj [label="obj"];
Linker [label="On-line binding (Linker)"];
Exe [label="exe"];
}
CodeGen -> Api [lhead ="cluster_code"];
Api -> Hand [ltail ="cluster_code"];
Hand -> Api [lhead ="cluster_code"];
Api:a -> Obj [lhead ="cluster_driver"];
Api:s -> Compiler;
Compiler -> Obj;
Obj -> Linker;
Linker -> Exe;
Exe -> GPU;
Exe -> CPU [ltail ="cluster_driver"];

label = "OpenGL Flow";
}

Binary file modified Fig/gpu/opengl_flow.odg
Binary file not shown.
Binary file modified Fig/gpu/opengl_flow.png
2 changes: 1 addition & 1 deletion lbdex/clean.sh
@@ -6,5 +6,5 @@ popd
pushd verilog
make clean
popd
rm -rf chapters preprocess tmp.txt
rm -rf output chapters preprocess tmp.txt

3 changes: 2 additions & 1 deletion lbdex/gen-docs-ref.sh
@@ -2,6 +2,7 @@

pushd ./lbdex
bash ./gen-chapters.sh
bash ./gen-ref-output.sh
# disable since removing output files of llvm-ir and asm from source/*.rst
#bash ./gen-ref-output.sh
popd

8 changes: 6 additions & 2 deletions source/about.rst
@@ -63,11 +63,15 @@ Revision history

Version 12.0.14, not released yet.

Version 12.0.13.2, Released July 24, 2023.
Version 12.0.13.3, Released August 13, 2023.

gpu.rst: animation, graphic-sw-stack.gv and opengl-flow.gv.

Version 12.0.13.2, Released August 7, 2023.

gpu.rst: Subsection of buffers, vao binding.

Version 12.0.13.1, Released August 7, 2023.
Version 12.0.13.1, Released July 24, 2023.

gpu.rst: Section of Basic geometry in computer graphics, a x b = -b x a in 2D,
The role of GPU driver. npu.rst: The role of GPU driver.
133 changes: 121 additions & 12 deletions source/c++.rst
@@ -17,13 +17,24 @@ The Chapter11_2 can be built and run with the C++ polymorphism example code of
ch12_inherit.cpp as follows,

.. rubric:: lbdex/input/ch12_inherit.cpp
.. literalinclude:: ../lbdex/input/ch12_inherit.cpp
:start-after: /// start
.. code-block:: c++

...
class CPolygon { // _ZTVN10__cxxabiv117__class_type_infoE for parent class
...
#ifdef COUT_TEST
// generates IR invoke, landingpad, resume and unreachable on iMac
{ cout << this->area() << endl; }
#else
{ printf("%d\n", this->area()); }
#endif
};
...

If cout is used instead of printf in ch12_inherit.cpp, no exception handler
IRs are generated on Linux, whereas invoke, landingpad, resume and
unreachable exception handler IRs are generated on iMac.
Example code, ch12_eh.cpp, which supports **try** and **catch** exception handler
Example code, ch12_eh.cpp, which includes **try** and **catch** exception handler
as the following will generate these exception handler IRs both on iMac and Linux.

.. rubric:: lbdex/input/ch12_eh.cpp
@@ -37,7 +48,26 @@ as the following will generate these exception handler IRs both on iMac and Linu
JonathantekiiMac:input Jonathan$ /Users/Jonathan/llvm/test/build/
bin/llvm-dis ch12_eh.bc -o -
.. literalinclude:: ../lbdex/output/ch12_eh.ll
.. rubric:: ../lbdex/output/ch12_eh.ll
.. code-block:: llvm
...
define dso_local i32 @_Z14test_try_catchv() #0 personality i8* bitcast (i32 (...
)* @__gxx_personality_v0 to i8*) {
entry:
...
invoke void @_Z15throw_exceptionii(i32 signext 2, i32 signext 1)
to label %invoke.cont unwind label %lpad
invoke.cont: ; preds = %entry
br label %try.cont
lpad: ; preds = %entry
%0 = landingpad { i8*, i32 }
catch i8* null
...
}
...
.. code:: console
@@ -67,7 +97,40 @@ exception C++ keywords. It can compile ch12_eh.bc as follows,
JonathantekiiMac:input Jonathan$ /Users/Jonathan/llvm/test/build/
bin/llc -march=cpu0 -relocation-model=static -filetype=asm ch12_eh.bc -o -
.. literalinclude:: ../lbdex/output/ch12_eh.cpu0.s
.. rubric:: ../lbdex/output/ch12_eh.cpu0.s
.. code:: text
.type _Z14test_try_catchv,@function
.ent _Z14test_try_catchv # @_Z14test_try_catchv
_Z14test_try_catchv:
...
$tmp0:
addiu $4, $zero, 2
addiu $5, $zero, 1
jsub _Z15throw_exceptionii
nop
$tmp1:
# %bb.1: # %invoke.cont
jmp $BB1_4
$BB1_2: # %lpad
$tmp2:
st $4, 16($fp)
st $5, 12($fp)
# %bb.3: # %catch
ld $4, 16($fp)
jsub __cxa_begin_catch
nop
addiu $2, $zero, 1
st $2, 20($fp)
jsub __cxa_end_catch
nop
jmp $BB1_5
$BB1_4: # %try.cont
addiu $2, $zero, 0
st $2, 20($fp)
$BB1_5: # %return
ld $2, 20($fp)
...
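The IR and Cpu0 assembly listings above can be related back to source with a minimal sketch. The function names test_try_catch and throw_exception are demangled from _Z14test_try_catchv and _Z15throw_exceptionii in the listings. The body of throw_exception and the use of std::runtime_error are assumptions made for illustration, not the contents of ch12_eh.cpp.

```cpp
#include <stdexcept>

// Hypothetical body: the listings show only the call
// throw_exception(2, 1), not what the function does.
void throw_exception(int a, int b) {
  if (a > b) {
    throw std::runtime_error("a > b");
  }
}

// A call inside a try block is lowered to `invoke`, with the unwind edge
// going to the landing pad (%lpad in the IR, $BB1_2 in the assembly).
int test_try_catch() {
  try {
    throw_exception(2, 1);
  } catch (...) {  // catch-all, matching `catch i8* null`
    return 1;      // the %catch block stores 1 as the return value
  }
  return 0;        // the %try.cont block stores 0
}
```

Under these assumptions the call throws, so control reaches the catch-all handler. That handler is the path bracketed by __cxa_begin_catch and __cxa_end_catch in the assembly.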
Thread variable
@@ -237,33 +300,79 @@ programming.
JonathantekiiMac:input Jonathan$ /Users/Jonathan/llvm/test/build/
bin/llvm-dis ch12_thread_var.bc -o -
.. literalinclude:: ../lbdex/output/ch12_thread_var.ll
.. rubric:: ../lbdex/output/ch12_thread_var.ll
.. code-block:: llvm
...
@a = dso_local thread_local global i32 0, align 4
@b = dso_local thread_local global i32 0, align 4
; Function Attrs: noinline nounwind optnone mustprogress
define dso_local i32 @_Z15test_thread_varv() #0 {
entry:
store i32 2, i32* @a, align 4
%0 = load i32, i32* @a, align 4
ret i32 %0
}
; Function Attrs: noinline nounwind optnone mustprogress
define dso_local i32 @_Z17test_thread_var_2v() #0 {
entry:
store i32 3, i32* @b, align 4
%0 = load i32, i32* @b, align 4
ret i32 %0
}
...
.. code-block:: console
JonathantekiiMac:input Jonathan$ /Users/Jonathan/llvm/test/build/
bin/llc -march=cpu0 -relocation-model=pic -filetype=asm ch12_thread_var.bc
-o -
-o ch12_thread_var.cpu0.pic.s
JonathantekiiMac:input Jonathan$ cat ch12_thread_var.cpu0.pic.s
.. literalinclude:: ../lbdex/output/ch12_thread_var.cpu0.pic.s
.. rubric:: ../lbdex/output/ch12_thread_var.cpu0.pic.s
.. code-block:: text
...
.ent _Z15test_thread_varv # @_Z15test_thread_varv
_Z15test_thread_varv:
...
ori $4, $gp, %tlsldm(a)
ld $t9, %call16(__tls_get_addr)($gp)
jalr $t9
nop
ld $gp, 8($fp)
lui $3, %dtp_hi(a)
addu $2, $3, $2
ori $2, $2, %dtp_lo(a)
...
In pic mode, the __thread variable is accessed by calling the function
__tls_get_addr with the address of the thread variable.
The C++11 standard thread_local variable is accessed by calling the function
_ZTW1b, which also calls __tls_get_addr to get the address of the thread_local
variable.
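As a rough sketch, source of the following shape would correspond to the two functions in the IR listing above. The function names are demangled from _Z15test_thread_varv and _Z17test_thread_var_2v. Declaring "a" with __thread is an assumption inferred from the prose, and the initializers are taken from the IR globals; this is not a copy of ch12_thread_var.cpp.

```cpp
// `__thread` is a GCC/Clang extension; `thread_local` is standard C++11.
// Assumption: "a" uses the extension and "b" uses the standard keyword.
__thread int a = 0;
thread_local int b = 0;

// Each thread that calls these functions writes and reads its own copy.
int test_thread_var() {
  a = 2;     // store i32 2, i32* @a
  return a;  // load i32, i32* @a
}

int test_thread_var_2() {
  b = 3;     // store i32 3, i32* @b
  return b;  // load i32, i32* @b
}
```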
In static mode, the thread variable is accessed by machine instructions as
follows,
In static mode, thread variables are accessed by computing the addresses of
"a" and "b" with machine instructions as follows,

.. code-block:: console
JonathantekiiMac:input Jonathan$ /Users/Jonathan/llvm/test/build/
bin/llc -march=cpu0 -relocation-model=static -filetype=asm
ch12_thread_var.bc -o -
ch12_thread_var.bc -o ch12_thread_var.cpu0.static.s
JonathantekiiMac:input Jonathan$ cat ch12_thread_var.cpu0.static.s
.. literalinclude:: ../lbdex/output/ch12_thread_var.cpu0.static.s
.. rubric:: ../lbdex/output/ch12_thread_var.cpu0.static.s
.. code-block:: text
...
lui $2, %tp_hi(a)
ori $2, $2, %tp_lo(a)
...
lui $2, %tp_hi(b)
ori $2, $2, %tp_lo(b)
...
While Mips uses the rdhwr instruction to access thread variables as below,
Cpu0 accesses thread variables without inventing any new instruction.
4 changes: 2 additions & 2 deletions source/conf.py
@@ -65,9 +65,9 @@
# built documents.
#
# The short X.Y version.
version = u'12.0.13.2'
version = u'12.0.13.3'
# The full version, including alpha/beta/rc tags.
release = u'12.0.13.2'
release = u'12.0.13.3'

# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
2 changes: 1 addition & 1 deletion source/doc.rst
@@ -458,7 +458,7 @@ set-llvm-lit % `diff -r origin modify &> set-llvm-lit.diff`
.. [#rst] http://docutils.sourceforge.net/rst.html
.. [#llvm-sphinx-quick] http://llvm.org/docs/SphinxQuickstartTemplate.html
.. [#llvm-sphinx-quick] http://llvm.org/docs/SphinxQuickstartTemplate.html If you need to show LLVM IR use the llvm code block. https://llvm.org/docs/SphinxQuickstartTemplate.html#code-blocks
.. [#sphinx-lexers] http://pygments.org/docs/lexers/