Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix/object allocation #3

Open
wants to merge 2 commits into
base: v1.8.2-RAI
Choose a base branch
from

Conversation

udesou
Copy link

@udesou udesou commented Jan 25, 2023

This PR changes the threshold for allocation Julia's large objects to follow immix's threshold. While this probably brings a benefit in terms of allocation (more objects can be allocated via fastpath), until get_so_size is optimised, changing the threshold can actually degrade performance, since getting the objects size of large objects is way less expensive (loading a field from the header) than getting the size of small objects (check the object type, and calculate the size based on that). It should be merged with mmtk/mmtk-julia#20.

qinsoon pushed a commit to qinsoon/julia that referenced this pull request May 2, 2024
Fixes: JuliaLang#33147
Replaces/Closes: JuliaLang#40445

The difference here, compared to past implementations, is that we use
the zero-cost `isiterable` check on every intermediate step, instead of
wrapping the call in a try/catch and then trying to re-approximate the
`isiterable` afterwards. Some samples:

```julia
julia> Dict(i for i in 1:3)                                                        
ERROR: ArgumentError: AbstractDict(kv): kv needs to be an iterator of 2-tuples or pairs                                                                               
Stacktrace:                                                                                                                                                           
 [1] _throw_dict_kv_error()                                                        
   @ Base ./dict.jl:118                                                                                                                                               
 [2] grow_to!                                                                      
   @ ./dict.jl:132 [inlined]                                                       
 [3] dict_with_eltype                                                                                                                                                 
   @ ./abstractdict.jl:592 [inlined]                                                                                                                                  
 [4] Dict(kv::Base.Generator{UnitRange{Int64}, typeof(identity)})                                                                                                     
   @ Base ./dict.jl:120                                                            
 [5] top-level scope                                                                                                                                                  
   @ REPL[1]:1                                                                     
                                                                                                                                                                      
julia> Dict(i => error("$i") for i in 1:3)                                         
ERROR: 1                                                                                                                                                              
Stacktrace:                                                                                                                                                           
 [1] error(s::String)                                                                                                                                                 
   @ Base ./error.jl:35                                                                                                                                               
 [2] (::var"mmtk#3#4")(i::Int64)                                                       
   @ Main ./none:0                                                                 
 [3] iterate                                                                                                                                                          
   @ ./generator.jl:48 [inlined]                                                   
 [4] grow_to!                                                                      
   @ ./dict.jl:124 [inlined]                                                       
 [5] dict_with_eltype                                                              
   @ ./abstractdict.jl:592 [inlined]                                               
 [6] Dict(kv::Base.Generator{UnitRange{Int64}, var"mmtk#3#4"})                                                                                                            
   @ Base ./dict.jl:120                                                                                                                                               
 [7] top-level scope                                                               
   @ REPL[2]:1                                                                                                                                                        
```

The other unrelated change here is that `dest = empty(dest, typeof(k),
typeof(v))` is made conditional, so we do not unconditionally construct
an empty Dict in order to discard it and allocate an exact duplicate of
it, but only do so if inference wasn't precise originally.

Co-authored-by: Curtis Vogt <[email protected]>
qinsoon pushed a commit to qinsoon/julia that referenced this pull request May 2, 2024
For example, we seek to eliminate the gc frame from this function, as
observed here:

```julia
julia> code_llvm((BitSet,), raw=true) do x; r = x.bits; GC.safepoint(); @inbounds r[1]; end
; Function Signature: var"mmtk#3"(Base.BitSet)
;  @ REPL[1]:1 within `mmtk#3`
define swiftcc i64 @"julia_#3_494"(ptr nonnull swiftself %pgcstack, ptr noundef nonnull align 8 dereferenceable(16) %"x::BitSet") #0 !dbg !5 {
top:
  call void @llvm.dbg.declare(metadata ptr %"x::BitSet", metadata !21, metadata !DIExpression()), !dbg !22
  %ptls_field = getelementptr inbounds ptr, ptr %pgcstack, i64 2
  %ptls_load = load ptr, ptr %ptls_field, align 8, !tbaa !23
  %0 = getelementptr inbounds ptr, ptr %ptls_load, i64 2
  %safepoint = load ptr, ptr %0, align 8, !tbaa !27
  fence syncscope("singlethread") seq_cst
  %1 = load volatile i64, ptr %safepoint, align 8, !dbg !22
  fence syncscope("singlethread") seq_cst
; ┌ @ Base.jl:49 within `getproperty`
   %"x::BitSet.bits" = load atomic ptr, ptr %"x::BitSet" unordered, align 8, !dbg !29, !tbaa !27, !alias.scope !33, !noalias !36, !nonnull !11, !dereferenceable !41, !align !42
; └
; ┌ @ gcutils.jl:253 within `safepoint`
   %ptls_load4 = load ptr, ptr %ptls_field, align 8, !dbg !43, !tbaa !23
   %2 = getelementptr inbounds ptr, ptr %ptls_load4, i64 2, !dbg !43
   %safepoint5 = load ptr, ptr %2, align 8, !dbg !43, !tbaa !27
   fence syncscope("singlethread") seq_cst, !dbg !43
   %3 = load volatile i64, ptr %safepoint5, align 8, !dbg !43
   fence syncscope("singlethread") seq_cst, !dbg !43
; └
; ┌ @ essentials.jl:892 within `getindex`
   %4 = load ptr, ptr %"x::BitSet.bits", align 8, !dbg !46, !tbaa !49, !alias.scope !52, !noalias !53
   %5 = load i64, ptr %4, align 8, !dbg !46, !tbaa !54, !alias.scope !57, !noalias !58
   ret i64 %5, !dbg !46
; └
}
```
udesou pushed a commit to udesou/julia that referenced this pull request Aug 19, 2024
The functions `toms`, `tons`, and `days` uses `sum` over a vector of
`Period`s to obtain the conversion of a `CompoundPeriod`. However, the
compiler cannot infer the return type because those functions can return
either `Int` or `Float` depending on the type of the `Period`. This PR
forces the result of those functions to be `Float64`, fixing the type
stability.

Before this PR we had:

```julia
julia> using Dates

julia> p = Dates.Second(1) + Dates.Minute(1) + Dates.Year(1)
1 year, 1 minute, 1 second

julia> @code_warntype Dates.tons(p)
MethodInstance for Dates.tons(::Dates.CompoundPeriod)
  from tons(c::Dates.CompoundPeriod) @ Dates ~/.julia/juliaup/julia-nightly/share/julia/stdlib/v1.12/Dates/src/periods.jl:458
Arguments
  #self#::Core.Const(Dates.tons)
  c::Dates.CompoundPeriod
Body::Any
1 ─ %1  = Dates.isempty::Core.Const(isempty)
│   %2  = Base.getproperty(c, :periods)::Vector{Period}
│   %3  = (%1)(%2)::Bool
└──       goto mmtk#3 if not %3
2 ─       return 0.0
3 ─ %6  = Dates.Float64::Core.Const(Float64)
│   %7  = Dates.sum::Core.Const(sum)
│   %8  = Dates.tons::Core.Const(Dates.tons)
│   %9  = Base.getproperty(c, :periods)::Vector{Period}
│   %10 = (%7)(%8, %9)::Any
│   %11 = (%6)(%10)::Any
└──       return %11


julia> @code_warntype Dates.toms(p)
MethodInstance for Dates.toms(::Dates.CompoundPeriod)
  from toms(c::Dates.CompoundPeriod) @ Dates ~/.julia/juliaup/julia-nightly/share/julia/stdlib/v1.12/Dates/src/periods.jl:454
Arguments
  #self#::Core.Const(Dates.toms)
  c::Dates.CompoundPeriod
Body::Any
1 ─ %1  = Dates.isempty::Core.Const(isempty)
│   %2  = Base.getproperty(c, :periods)::Vector{Period}
│   %3  = (%1)(%2)::Bool
└──       goto mmtk#3 if not %3
2 ─       return 0.0
3 ─ %6  = Dates.Float64::Core.Const(Float64)
│   %7  = Dates.sum::Core.Const(sum)
│   %8  = Dates.toms::Core.Const(Dates.toms)
│   %9  = Base.getproperty(c, :periods)::Vector{Period}
│   %10 = (%7)(%8, %9)::Any
│   %11 = (%6)(%10)::Any
└──       return %11


julia> @code_warntype Dates.days(p)
MethodInstance for Dates.days(::Dates.CompoundPeriod)
  from days(c::Dates.CompoundPeriod) @ Dates ~/.julia/juliaup/julia-nightly/share/julia/stdlib/v1.12/Dates/src/periods.jl:468
Arguments
  #self#::Core.Const(Dates.days)
  c::Dates.CompoundPeriod
Body::Any
1 ─ %1  = Dates.isempty::Core.Const(isempty)
│   %2  = Base.getproperty(c, :periods)::Vector{Period}
│   %3  = (%1)(%2)::Bool
└──       goto mmtk#3 if not %3
2 ─       return 0.0
3 ─ %6  = Dates.Float64::Core.Const(Float64)
│   %7  = Dates.sum::Core.Const(sum)
│   %8  = Dates.days::Core.Const(Dates.days)
│   %9  = Base.getproperty(c, :periods)::Vector{Period}
│   %10 = (%7)(%8, %9)::Any
│   %11 = (%6)(%10)::Any
└──       return %11
```

After this PR we have:

```julia
julia> using Dates

julia> p = Dates.Second(1) + Dates.Minute(1) + Dates.Year(1)
1 year, 1 minute, 1 second

julia> @code_warntype Dates.tons(p)
MethodInstance for Dates.tons(::Dates.CompoundPeriod)
  from tons(c::Dates.CompoundPeriod) @ Dates ~/.julia/juliaup/julia-nightly/share/julia/stdlib/v1.12/Dates/src/periods.jl:458
Arguments
  #self#::Core.Const(Dates.tons)
  c::Dates.CompoundPeriod
Body::Float64
1 ─ %1  = Dates.isempty::Core.Const(isempty)
│   %2  = Base.getproperty(c, :periods)::Vector{Period}
│   %3  = (%1)(%2)::Bool
└──       goto mmtk#3 if not %3
2 ─       return 0.0
3 ─ %6  = Dates.Float64::Core.Const(Float64)
│   %7  = Dates.sum::Core.Const(sum)
│   %8  = Dates.tons::Core.Const(Dates.tons)
│   %9  = Base.getproperty(c, :periods)::Vector{Period}
│   %10 = (%7)(%8, %9)::Any
│   %11 = (%6)(%10)::Any
│   %12 = Dates.Float64::Core.Const(Float64)
│   %13 = Core.typeassert(%11, %12)::Float64
└──       return %13


julia> @code_warntype Dates.toms(p)
MethodInstance for Dates.toms(::Dates.CompoundPeriod)
  from toms(c::Dates.CompoundPeriod) @ Dates ~/.julia/juliaup/julia-nightly/share/julia/stdlib/v1.12/Dates/src/periods.jl:454
Arguments
  #self#::Core.Const(Dates.toms)
  c::Dates.CompoundPeriod
Body::Float64
1 ─ %1  = Dates.isempty::Core.Const(isempty)
│   %2  = Base.getproperty(c, :periods)::Vector{Period}
│   %3  = (%1)(%2)::Bool
└──       goto mmtk#3 if not %3
2 ─       return 0.0
3 ─ %6  = Dates.Float64::Core.Const(Float64)
│   %7  = Dates.sum::Core.Const(sum)
│   %8  = Dates.toms::Core.Const(Dates.toms)
│   %9  = Base.getproperty(c, :periods)::Vector{Period}
│   %10 = (%7)(%8, %9)::Any
│   %11 = (%6)(%10)::Any
│   %12 = Dates.Float64::Core.Const(Float64)
│   %13 = Core.typeassert(%11, %12)::Float64
└──       return %13


julia> @code_warntype Dates.days(p)
MethodInstance for Dates.days(::Dates.CompoundPeriod)
  from days(c::Dates.CompoundPeriod) @ Dates ~/.julia/juliaup/julia-nightly/share/julia/stdlib/v1.12/Dates/src/periods.jl:468
Arguments
  #self#::Core.Const(Dates.days)
  c::Dates.CompoundPeriod
Body::Float64
1 ─ %1  = Dates.isempty::Core.Const(isempty)
│   %2  = Base.getproperty(c, :periods)::Vector{Period}
│   %3  = (%1)(%2)::Bool
└──       goto mmtk#3 if not %3
2 ─       return 0.0
3 ─ %6  = Dates.Float64::Core.Const(Float64)
│   %7  = Dates.sum::Core.Const(sum)
│   %8  = Dates.days::Core.Const(Dates.days)
│   %9  = Base.getproperty(c, :periods)::Vector{Period}
│   %10 = (%7)(%8, %9)::Any
│   %11 = (%6)(%10)::Any
│   %12 = Dates.Float64::Core.Const(Float64)
│   %13 = Core.typeassert(%11, %12)::Float64
└──       return %13
```
udesou pushed a commit to udesou/julia that referenced this pull request Oct 16, 2024
Rebase and extension of @alexfanqi's initial work on porting Julia to
RISC-V. Requires LLVM 19.

Tested on a VisionFive2, built with:

```make
MARCH := rv64gc_zba_zbb
MCPU := sifive-u74

USE_BINARYBUILDER:=0

DEPS_GIT = llvm
override LLVM_VER=19.1.1
override LLVM_BRANCH=julia-release/19.x
override LLVM_SHA1=julia-release/19.x
```

```julia-repl
❯ ./julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.12.0-DEV.1374 (2024-10-14)
 _/ |\__'_|_|_|\__'_|  |  riscv/25092a3982* (fork: 1 commits, 0 days)
|__/                   |

julia> versioninfo(; verbose=true)
Julia Version 1.12.0-DEV.1374
Commit 25092a3* (2024-10-14 09:57 UTC)
Platform Info:
  OS: Linux (riscv64-unknown-linux-gnu)
  uname: Linux 6.11.3-1-riscv64 #1 SMP Debian 6.11.3-1 (2024-10-10) riscv64 unknown
  CPU: unknown:
              speed         user         nice          sys         idle          irq
       #1  1500 MHz        922 s          0 s        265 s     160953 s          0 s
       mmtk#2  1500 MHz        457 s          0 s        280 s     161521 s          0 s
       mmtk#3  1500 MHz        452 s          0 s        270 s     160911 s          0 s
       mmtk#4  1500 MHz        638 s         15 s        301 s     161340 s          0 s
  Memory: 7.760246276855469 GB (7474.08203125 MB free)
  Uptime: 16260.13 sec
  Load Avg:  0.25  0.23  0.1
  WORD_SIZE: 64
  LLVM: libLLVM-19.1.1 (ORCJIT, sifive-u74)
Threads: 1 default, 0 interactive, 1 GC (on 4 virtual cores)
Environment:
  HOME = /home/tim
  PATH = /home/tim/.local/bin:/usr/local/bin:/usr/bin:/bin:/usr/games
  TERM = xterm-256color


julia> ccall(:jl_dump_host_cpu, Nothing, ())
CPU: sifive-u74
Features: +zbb,+d,+i,+f,+c,+a,+zba,+m,-zvbc,-zksed,-zvfhmin,-zbkc,-zkne,-zksh,-zfh,-zfhmin,-zknh,-v,-zihintpause,-zicboz,-zbs,-zvknha,-zvksed,-zfa,-ztso,-zbc,-zvknhb,-zihintntl,-zknd,-zvbb,-zbkx,-zkt,-zvkt,-zicond,-zvksh,-zvfh,-zvkg,-zvkb,-zbkb,-zvkned


julia> @code_native debuginfo=:none 1+2.
	.text
	.attribute	4, 16
	.attribute	5, "rv64i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_zicsr2p0_zifencei2p0_zmmul1p0_zba1p0_zbb1p0"
	.file	"+"
	.globl	"julia_+_3003"
	.p2align	1
	.type	"julia_+_3003",@function
"julia_+_3003":
	addi	sp, sp, -16
	sd	ra, 8(sp)
	sd	s0, 0(sp)
	addi	s0, sp, 16
	fcvt.d.l	fa5, a0
	ld	ra, 8(sp)
	ld	s0, 0(sp)
	fadd.d	fa0, fa5, fa0
	addi	sp, sp, 16
	ret
.Lfunc_end0:
	.size	"julia_+_3003", .Lfunc_end0-"julia_+_3003"

	.type	".L+Core.Float64#3005",@object
	.section	.data.rel.ro,"aw",@progbits
	.p2align	3, 0x0
".L+Core.Float64#3005":
	.quad	".L+Core.Float64#3005.jit"
	.size	".L+Core.Float64#3005", 8

.set ".L+Core.Float64#3005.jit", 272467692544
	.size	".L+Core.Float64#3005.jit", 8
	.section	".note.GNU-stack","",@progbits
```

Lots of bugs guaranteed, but with this we at least have a functional
build and REPL for further development by whoever is interested.

Also requires Linux 6.4+, since the fallback processor detection
used here relies on LLVM's `sys::getHostCPUFeatures`, which for
RISC-V is implemented using hwprobe introduced in 6.4. We could
probably add a fallback that parses `/proc/cpuinfo`, either by building
a CPU database much like how we've done for AArch64, or by parsing the
actual ISA string contained there. That would probably also be a good
place to add support for profiles, which are supposedly the way forward
to package RISC-V binaries. That can happen in follow-up PRs though.
For now, on older kernels, use the `-C` arg to Julia to specify an ISA.

Co-authored-by: Alex Fan <[email protected]>
udesou pushed a commit to udesou/julia that referenced this pull request Oct 16, 2024
E.g. this allows `finalizer` inlining in the following case:
```julia
mutable struct ForeignBuffer{T}
    const ptr::Ptr{T}
end
const foreign_buffer_finalized = Ref(false)
function foreign_alloc(::Type{T}, length) where T
    ptr = Libc.malloc(sizeof(T) * length)
    ptr = Base.unsafe_convert(Ptr{T}, ptr)
    obj = ForeignBuffer{T}(ptr)
    return finalizer(obj) do obj
        Base.@assume_effects :notaskstate :nothrow
        foreign_buffer_finalized[] = true
        Libc.free(obj.ptr)
    end
end
function f_EA_finalizer(N::Int)
    workspace = foreign_alloc(Float64, N)
    GC.@preserve workspace begin
        (;ptr) = workspace
        Base.@assume_effects :nothrow @noinline println(devnull, "ptr = ", ptr)
    end
end
```
```julia
julia> @code_typed f_EA_finalizer(42)
CodeInfo(
1 ── %1  = Base.mul_int(8, N)::Int64
│    %2  = Core.lshr_int(%1, 63)::Int64
│    %3  = Core.trunc_int(Core.UInt8, %2)::UInt8
│    %4  = Core.eq_int(%3, 0x01)::Bool
└───       goto mmtk#3 if not %4
2 ──       invoke Core.throw_inexacterror(:convert::Symbol, UInt64::Type, %1::Int64)::Union{}
└───       unreachable
3 ──       goto mmtk#4
4 ── %9  = Core.bitcast(Core.UInt64, %1)::UInt64
└───       goto mmtk#5
5 ──       goto mmtk#6
6 ──       goto mmtk#7
7 ──       goto mmtk#8
8 ── %14 = $(Expr(:foreigncall, :(:malloc), Ptr{Nothing}, svec(UInt64), 0, :(:ccall), :(%9), :(%9)))::Ptr{Nothing}
└───       goto mmtk#9
9 ── %16 = Base.bitcast(Ptr{Float64}, %14)::Ptr{Float64}
│    %17 = %new(ForeignBuffer{Float64}, %16)::ForeignBuffer{Float64}
└───       goto mmtk#10
10 ─ %19 = $(Expr(:gc_preserve_begin, :(%17)))
│    %20 = Base.getfield(%17, :ptr)::Ptr{Float64}
│          invoke Main.println(Main.devnull::Base.DevNull, "ptr = "::String, %20::Ptr{Float64})::Nothing
│          $(Expr(:gc_preserve_end, :(%19)))
│    %23 = Main.foreign_buffer_finalized::Base.RefValue{Bool}
│          Base.setfield!(%23, :x, true)::Bool
│    %25 = Base.getfield(%17, :ptr)::Ptr{Float64}
│    %26 = Base.bitcast(Ptr{Nothing}, %25)::Ptr{Nothing}
│          $(Expr(:foreigncall, :(:free), Nothing, svec(Ptr{Nothing}), 0, :(:ccall), :(%26), :(%25)))::Nothing
└───       return nothing
) => Nothing
```

However, this is still a WIP. Before merging, I want to improve EA's
precision a bit and at least fix the test case that is currently marked
as `broken`. I also need to check its impact on compiler performance.

Additionally, I believe this feature is not yet practical. In
particular, there is still significant room for improvement in the
following areas:
- EA's interprocedural capabilities: currently EA is performed ad-hoc
for limited frames because of latency reasons, which significantly
reduces its precision in the presence of interprocedural calls.
- Relaxing the `:nothrow` check for finalizer inlining: the current
algorithm requires `:nothrow`-ness on all paths from the allocation of
the mutable struct to its last use, which is not practical for
real-world cases. Even when `:nothrow` cannot be guaranteed, auxiliary
optimizations such as inserting a `finalize` call after the last use
might still be possible (JuliaLang#55990).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant