Adapter part2 #647

Open
wants to merge 13 commits into
base: develop
gaozhangfei
Collaborator

Adapter part 2. Replaces PR 642 and targets the develop branch.

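How to use

  1. Default, with no config file:
    The two highest-priority drivers are selected and run in primary mode. The highest-priority driver, worker[0], does the work; the next-priority driver, worker[1], takes over only on error. After running 10 times (looptime), the adapter tries to switch back to worker[0].
    The command is the same as before.
    example: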
    numactl --cpubind=0 --membind=0 uadk_tool benchmark --alg sm3 --mode instr --opt 0 --sync --pktlen 1024 --seconds 20 --multi 1 --thread 8 --ctxnum 8 --init2

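  2. With a config file: the listed drivers become worker[0], worker[1] in config-file order. Round robin mode can be configured, in which the workers take turns, each running 10 times (looptime is configurable).
    example: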
    vi uadk.conf
    mode=1
    driver_name=isa_ce_sm3
    driver_name=hisi_sec2
    export UADK_CONF=xxx/uadk.conf
    run the command

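  3. The config takes effect on every run, so the driver order can be changed between runs.
    Parameters:
    mode=1 (the default is primary mode, where worker[1] is chosen only on error; with mode=1, round robin, the two drivers take turns, each executing looptime times)
    looptime=5 (default 10)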

  4. How it works
    setting -> adapter -> workers[]
    sess->worker: alloc_sess selects worker[0]; sess->worker is switched later in some cases, on error or in round robin mode
    At work time, worker = sess->worker
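
The default primary-mode behaviour from item 1 can be sketched roughly as follows. This is only an illustration under stated assumptions: `demo_sess`, `demo_op`, `demo_primary_send`, and the two sample workers are hypothetical names, not the UADK data structures or API, and the retry-once-on-error policy is an assumption.

```c
#include <stddef.h>

/* Hypothetical names for illustration only; not the UADK definitions. */
#define DEMO_LOOPTIME 10

struct demo_sess {
    int cur;          /* active worker: 0 = primary, 1 = fallback */
    int fallback_cnt; /* operations run on the fallback since switching */
};

/* A worker operation returns 0 on success, negative on error. */
typedef int (*demo_op)(void);

/* Sample workers used to exercise the sketch. */
int demo_ok(void)   { return 0; }
int demo_fail(void) { return -1; }

int demo_primary_send(struct demo_sess *s, demo_op workers[2])
{
    int ret;

    /* After looptime runs on the fallback, try the primary again. */
    if (s->cur == 1 && s->fallback_cnt >= DEMO_LOOPTIME) {
        s->cur = 0;
        s->fallback_cnt = 0;
    }

    ret = workers[s->cur]();
    if (ret < 0 && s->cur == 0) {
        /* Primary failed: switch to the fallback and retry once (assumed policy). */
        s->cur = 1;
        s->fallback_cnt = 0;
        ret = workers[1]();
    }
    if (s->cur == 1)
        s->fallback_cnt++;

    return ret;
}
```

In this sketch a session that fell back to worker[1] stays there for `DEMO_LOOPTIME` operations before worker[0] is tried again, which mirrors the "run 10 times, then try to switch back" description.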

For details, see
[Tencent Docs] adapter2024.7.31
https://docs.qq.com/doc/DRWVVTWFQWU1QeVFz

Use the common function wd_find_msg_in_pool to replace the
per-algorithm wd_xxx_get_msg functions by passing the pool to
drv.priv->config.poll

Signed-off-by: Zhangfei Gao <[email protected]>
wd_util.c and wd_sched.c should be built into libwd.so.
There is no need to export them in every algo.so.

Signed-off-by: Zhangfei Gao <[email protected]>
Currently wd_init_sched copies the sched memory;
later, &sched will be used directly without a copy.
So use a static variable instead of a temporary one, which would be freed.

Signed-off-by: Zhangfei Gao <[email protected]>
poll_ctx gains a wd_sched parameter to pass parameters through,
since different drivers may use different pools.

Signed-off-by: Zhangfei Gao <[email protected]>
wd_find_drv finds a drv by name and alg;
it also takes idx to select the idx-th matching drv

Signed-off-by: Zhangfei Gao <[email protected]>
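
A minimal sketch of an "idx-th match" lookup of this kind, assuming a flat array of registered drivers. `demo_drv` and `demo_find_drv` are hypothetical names for illustration; the real wd_find_drv signature and driver list layout are not shown in this PR text and may differ.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical driver record; not the UADK struct. */
struct demo_drv {
    const char *drv_name;
    const char *alg_name;
};

/*
 * Return the idx-th driver (0-based) that matches alg_name and,
 * when drv_name is non-NULL, also matches drv_name. NULL if none.
 */
const struct demo_drv *demo_find_drv(const struct demo_drv *list, size_t n,
                                     const char *drv_name,
                                     const char *alg_name, int idx)
{
    int found = 0;

    for (size_t i = 0; i < n; i++) {
        if (strcmp(list[i].alg_name, alg_name) != 0)
            continue;
        if (drv_name && strcmp(list[i].drv_name, drv_name) != 0)
            continue;
        if (found++ == idx)
            return &list[i];
    }
    return NULL;
}
```

Calling it with increasing idx until it returns NULL walks every driver that supports a given alg, which is one way to fill a workers[] array.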
Use uadk_adapter_worker to record each driver and its resources,
so adding multiple drivers means adding workers.
The worker priority and mode can be set through the env UADK_CONF.

For export UADK_CONF=TO_PATH/uadk.conf

    Example: uadk.conf
    mode=1
    driver_name=isa_ce_sm3
    driver_name=hisi_sec2

If env UADK_CONF exists, only UADK_CONF is parsed and its drivers
are added to adapter->workers[] one by one, so the first driver
has the highest priority.

Otherwise, all drivers are parsed and the two highest-priority
drivers are added to adapter->workers[].

Signed-off-by: Zhangfei Gao <[email protected]>
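
As a rough illustration of reading a uadk.conf like the one above, the sketch below parses `key=value` lines into a small struct. All names (`demo_conf`, `demo_conf_parse_line`, `demo_conf_load`) are hypothetical, the two-driver limit and the "unknown keys are ignored" rule are assumptions, and the actual parser in this PR may behave differently.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_MAX_DRV  2   /* assumed limit, for illustration */
#define DEMO_NAME_LEN 32

struct demo_conf {
    int mode;      /* 0 = primary (default), 1 = round robin */
    int looptime;  /* default 10 */
    int drv_num;   /* number of driver_name entries seen */
    char drv[DEMO_MAX_DRV][DEMO_NAME_LEN];
};

void demo_conf_init(struct demo_conf *c)
{
    memset(c, 0, sizeof(*c));
    c->looptime = 10;
}

/* Parse one "key=value" line. Returns 0 on success, -1 if malformed. */
int demo_conf_parse_line(struct demo_conf *c, const char *line)
{
    char key[32], val[DEMO_NAME_LEN];

    if (sscanf(line, " %31[^= \t\n] = %31s", key, val) != 2)
        return -1;

    if (strcmp(key, "mode") == 0)
        c->mode = atoi(val);
    else if (strcmp(key, "looptime") == 0)
        c->looptime = atoi(val);
    else if (strcmp(key, "driver_name") == 0 && c->drv_num < DEMO_MAX_DRV)
        snprintf(c->drv[c->drv_num++], DEMO_NAME_LEN, "%s", val);
    /* Unknown keys are silently ignored in this sketch. */

    return 0;
}

/* Load the file at path (e.g. the value of getenv("UADK_CONF")). */
int demo_conf_load(struct demo_conf *c, const char *path)
{
    char line[128];
    FILE *f = fopen(path, "r");

    if (!f)
        return -1;
    while (fgets(line, sizeof(line), f))
        demo_conf_parse_line(c, line);
    fclose(f);
    return 0;
}
```

Because driver_name entries are stored in file order, the first entry lands in slot 0, matching the "first driver has the highest priority" rule described above.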
Use uadk_adapter_worker to record each driver and its resources,
so adding multiple drivers means adding workers.
Make sure the different workers can all work.

Signed-off-by: Zhangfei Gao <[email protected]>
case 1: sync or async error, switch worker
case 2: roundrobin mode, switch worker after worker->looptime (10)

Signed-off-by: Zhangfei Gao <[email protected]>
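
Case 2 (round robin) can be sketched as a per-session counter that advances to the next worker after looptime operations. `demo_rr_sess` and `demo_rr_next` are hypothetical names; where the counter actually lives in the adapter (session or worker) is an assumption here.

```c
/* Hypothetical per-session state; not the UADK struct. */
struct demo_rr_sess {
    int cur;    /* index of the worker currently bound to the session */
    int count;  /* operations issued on the current worker */
};

/* Return the worker index to use for the next operation. */
int demo_rr_next(struct demo_rr_sess *s, int worker_num, int looptime)
{
    if (s->count >= looptime) {
        /* Current worker has had its turn: move to the next one. */
        s->cur = (s->cur + 1) % worker_num;
        s->count = 0;
    }
    s->count++;
    return s->cur;
}
```

With two workers and looptime=10, the first ten calls return 0, the next ten return 1, and so on.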
By default each worker loops 10 times in roundrobin mode,
unless looptime is set in UADK_CONF

For example:
uadk.conf
looptime=5
mode=1

Signed-off-by: Zhangfei Gao <[email protected]>
Some drivers may not support async mode.
For them the operation is finished as soon as send returns,
so async mode can be simulated by just returning the correct number

Signed-off-by: Zhangfei Gao <[email protected]>
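
One way to picture the simulated async path: send runs the operation synchronously and bumps a completion counter, and poll simply reports how many operations have already finished. This is a sketch under assumptions; `demo_sync_drv`, `demo_send`, `demo_poll`, and `demo_noop` are hypothetical names, and the real driver may track completions differently.

```c
/* Hypothetical state for a sync-only driver; not the UADK struct. */
struct demo_sync_drv {
    unsigned int done;  /* operations finished but not yet reported */
};

/* Placeholder operation used to exercise the sketch. */
int demo_noop(void *req)
{
    (void)req;
    return 0;
}

/* "Async" send: the work is actually completed before returning. */
int demo_send(struct demo_sync_drv *d, int (*op)(void *), void *req)
{
    int ret = op(req);

    if (ret < 0)
        return ret;
    d->done++;
    return 0;
}

/* Poll: report up to expt completions; nothing is really pending. */
unsigned int demo_poll(struct demo_sync_drv *d, unsigned int expt)
{
    unsigned int n = d->done < expt ? d->done : expt;

    d->done -= n;
    return n;
}
```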
gaozhangfei force-pushed the part2 branch 2 times, most recently from 97736f1 to 4300a5d, January 6, 2025 07:05

gaozhangfei commented Jan 7, 2025

Performance, Jan 3. Thanks, Taoqi.

sm4 sync
numactl --cpubind=0 --membind=0 ./uadk_tool benchmark --alg sm4-128-ecb --mode sva --opt 0 --sync --pktlen 1024 --seconds 10 --multi 1 --thread 8 --ctxnum 8 --init2

sm4 async
numactl --cpubind=0 --membind=0 ./uadk_tool benchmark --alg sm4-128-ecb --mode sva --opt 0 --async --pktlen 1024 --seconds 10 --multi 1 --thread 8 --ctxnum 8 --prefetch --init2

sm3 sync
numactl --cpubind=0 --membind=0 uadk_tool benchmark --alg sm3 --mode sva --opt 0 --sync --pktlen 1024 --seconds 20 --multi 1 --thread 8 --ctxnum 8 --init2

sm3 async
numactl --cpubind=0 --membind=0 uadk_tool benchmark --alg sm3 --mode sva --opt 0 --async --pktlen 1024 --seconds 20 --multi 1 --thread 8 --ctxnum 8 --init2

sm4 sync:
hw: 3934165.80KiB/s
ce: 14277999.40KiB/s  
hw+ce: 6050629.80KiB/s

sm4 async:
hw: 5928097.90KiB/s

ce (simulated async): 1209859KiB/s
ce+hw: 3815037KiB/s

sm3 sync:
hw: 3395791.70KiB/s
ce: 3708896.95KiB/s
hw+ce: 3476048.45KiB/s
sve: 1699783.35KiB/s
hw+sve: 2238242.30KiB/s

sm3 async:
hw: 4825519.40KiB/s
sve: 185712.55KiB/s
hw+sve: 766422.70KiB/s

ce: 1072081KiB/s (simulated ce async)
ce+hw: 4606990KiB/s

sm3:
sync: ce > ce+hw > hw
sync: hw > sve+hw > sve
async: hw > sve+hw > sve
hw async > ce sync > hw sync > sve sync > sve async

Multi-threads:
SM3 1024B Performance
numactl --cpubind=0 --membind=0 ./uadk_tool benchmark --alg sm3 --mode instr --opt 0 --sync --pktlen 1024 --seconds 10 --multi 1 --thread tds --ctxnum tds --prefetch --init2
 
SM3 1024B Performance (MB/s)

tds    init1 (HW)    init2 (HW + CE)    increase
1      167           426                155%
2      842           856                1.6%
4      1672          1683               0%
8      3322          3387               1.5%
16     4739          6376               43%
32     4740          9476               100%

wd_xx_set_adapter_mode can change the cipher adapter mode
while the system is running

Signed-off-by: Zhangfei Gao <[email protected]>
Add UADK_ADAPT_MODE_THRESHOLD
if (pkg_len > threshold)
	choose hw
else
	choose sw

Signed-off-by: Zhangfei Gao <[email protected]>
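
The threshold rule in this commit message amounts to a one-line selection function. The sketch below uses hypothetical names (`demo_target`, `demo_choose`); how UADK_ADAPT_MODE_THRESHOLD stores and obtains the threshold value is not shown in the PR text.

```c
/* Hypothetical names for illustration; not the UADK definitions. */
enum demo_target { DEMO_SW = 0, DEMO_HW = 1 };

/* Packets larger than the threshold go to hardware, the rest to software. */
enum demo_target demo_choose(unsigned int pkg_len, unsigned int threshold)
{
    return pkg_len > threshold ? DEMO_HW : DEMO_SW;
}
```

Note that the comparison is strict, so a packet exactly at the threshold is handled in software, as in the pseudocode above.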