-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathsearch.json
1 lines (1 loc) · 767 KB
/
search.json
1
[{"categories":["Computer"],"content":"汇编语言","date":"2024-12-24","objectID":"/posts/computer/%E6%B1%87%E7%BC%96%E8%AF%AD%E8%A8%80/","tags":["Computer","汇编语言"],"title":"汇编语言","uri":"/posts/computer/%E6%B1%87%E7%BC%96%E8%AF%AD%E8%A8%80/"},{"categories":["Computer"],"content":"汇编语言 ","date":"2024-12-24","objectID":"/posts/computer/%E6%B1%87%E7%BC%96%E8%AF%AD%E8%A8%80/:1:0","tags":["Computer","汇编语言"],"title":"汇编语言","uri":"/posts/computer/%E6%B1%87%E7%BC%96%E8%AF%AD%E8%A8%80/"},{"categories":["Computer"],"content":"配置安装 apt install nasm gdb # 安装 # vscode 插件 The Netwide Assembler (NASM) 后缀改成nas或者asm nasm -f elf -o hello.o hello.s # 编译 ld -m elf_i386 -o hello.out hello.o # 链接 ASM = nasm LD = ld ASM_FLAGS = -f elf LD_FLAGS = -m elf_i386 # Source and output files TARGETS = code01 SRCS = $(wildcard *.s) OBJS = $(patsubst %.s, %.o, $(SRCS)) # Targets all: $(TARGETS) $(TARGETS): %: %.o $(LD) -o $@ $^ $(LD_FLAGS) %.o: %.s $(ASM) $(ASM_FLAGS) $\u003c -o $@ .PHONY: clean clean: rm -f *.o $(TARGETS) ","date":"2024-12-24","objectID":"/posts/computer/%E6%B1%87%E7%BC%96%E8%AF%AD%E8%A8%80/:2:0","tags":["Computer","汇编语言"],"title":"汇编语言","uri":"/posts/computer/%E6%B1%87%E7%BC%96%E8%AF%AD%E8%A8%80/"},{"categories":["Computer"],"content":"gdb # 启动配置文件 .gdbinit,需要设置安全设置生效 cat ~/.config/gdb/gdbinit add-auto-load-safe-path /home/username/compiler_test/lab01/ # .gdbinit break _start run set disassembly-flavor intel # 默认att, 设置intel风格 # set disassemble-next-line on layout asm layout reg ","date":"2024-12-24","objectID":"/posts/computer/%E6%B1%87%E7%BC%96%E8%AF%AD%E8%A8%80/:2:1","tags":["Computer","汇编语言"],"title":"汇编语言","uri":"/posts/computer/%E6%B1%87%E7%BC%96%E8%AF%AD%E8%A8%80/"},{"categories":["Computer"],"content":"常用gdb调试 # nm 查看符号表 # x /选项 地址 查看各个变量内存信息 const char* str = \"test\"; x str # 默认16进制显示,内存存储内容和“test\"相反(小端存储) 0x74736574 x /s str # 直接显示内容 ”test\" x /d str # 十进制显示 x /4d str # 十进制显示,显示宽度为4 # 变量非指针类型,如int, 先p \u0026value_name, 使用x查看 ","date":"2024-12-24","objectID":"/posts/computer/%E6%B1%87%E7%BC%96%E8%AF%AD%E8%A8%80/:2:2","tags":["Computer","汇编语言"],"title":"汇编语言","uri":"/posts/computer/%E6%B1%87%E7%BC%96%E8%AF%AD%E8%A8%80/"},{"categories":["Computer"],"content":"X86体系结构 ","date":"2024-12-24","objectID":"/posts/computer/%E6%B1%87%E7%BC%96%E8%AF%AD%E8%A8%80/:3:0","tags":["Computer","汇编语言"],"title":"汇编语言","uri":"/posts/computer/%E6%B1%87%E7%BC%96%E8%AF%AD%E8%A8%80/"},{"categories":["Computer"],"content":"寄存器 分类 英文全称 16 位 32 位 64 位 通用寄存器 Accumulator ax eax rax 通用寄存器 Base bx ebx rbx 通用寄存器 Counter cx ecx rcx 通用寄存器 Data dx edx rdx 指针寄存器:栈指针,指向当前栈顶 Stack Pointer sp esp rsp 指针寄存器:基址指针,通常用于访问栈帧中的局部变量 Base Pointer bp ebp rbp 变地址寄存器:源变址寄存器 Source Index si esi rsi 变地址寄存器:目的变址寄存器 Destination Index di edi rdi 控制寄存器:指令指针,指向下一条要执行的指令 Instruction Pointer ip eip rip 控制寄存器:标志寄存器,存储 CPU 的状态标志 Flag flag eflag rflag 段寄存器 Code Segment cs cs cs 段寄存器 Data Segment ds ds ds 段寄存器 Stack Segment ss ss ss 段寄存器 Extra Segment es es es ","date":"2024-12-24","objectID":"/posts/computer/%E6%B1%87%E7%BC%96%E8%AF%AD%E8%A8%80/:3:1","tags":["Computer","汇编语言"],"title":"汇编语言","uri":"/posts/computer/%E6%B1%87%E7%BC%96%E8%AF%AD%E8%A8%80/"},{"categories":["Computer"],"content":"基本汇编语法 ","date":"2024-12-24","objectID":"/posts/computer/%E6%B1%87%E7%BC%96%E8%AF%AD%E8%A8%80/:4:0","tags":["Computer","汇编语言"],"title":"汇编语言","uri":"/posts/computer/%E6%B1%87%E7%BC%96%E8%AF%AD%E8%A8%80/"},{"categories":["Computer"],"content":"Basic Instruction mov dest, src ; move src to desc mov eax, 4 ; 将立即数 (immidiate) 存入到 eax 寄存器中 mov bx, ax ; bx = ax add eax, 4 ; eax = eax + 4 sub ebx, edi ; ebx = ebx - edi inc ecx ; ecx++ mov eax, 10 mov ebx, 20 mul ebx ; edx:eax = eax * ebx mul无符号,imul有符号 edx:eax 表示 64 位的值 mov eax, 100 mov ebx, 20 div ebx ; eax = 5 (商), edx = 0 (余数) div无符号,idiv有符号 ","date":"2024-12-24","objectID":"/posts/computer/%E6%B1%87%E7%BC%96%E8%AF%AD%E8%A8%80/:4:1","tags":["Computer","汇编语言"],"title":"汇编语言","uri":"/posts/computer/%E6%B1%87%E7%BC%96%E8%AF%AD%E8%A8%80/"},{"categories":["Computer"],"content":"Directive Instruction 定义常量 symbol equ 128 aaa equ 8 %define SIZE 128 定义内存 L1 db 0 ; 定义一个字节, 并初始化为 0 L2 dw0 ; 定义一个字 (word), 并初始化为 0 L3 resb 4 ; 预留 4 个字节 L4 times 100 db 1 ; 100 次定义字节, 初始化成 1 ","date":"2024-12-24","objectID":"/posts/computer/%E6%B1%87%E7%BC%96%E8%AF%AD%E8%A8%80/:4:2","tags":["Computer","汇编语言"],"title":"汇编语言","uri":"/posts/computer/%E6%B1%87%E7%BC%96%E8%AF%AD%E8%A8%80/"},{"categories":["Computer"],"content":"寻址模式 立即寻址 mov eax, 10 ; 将立即数 10 加载到 eax 寄存器寻址 mov eax, ebx ; 将 ebx 的值加载到 eax 直接寻址 mov eax, [0x1000] ; 将地址 0x1000 处的值加载到 eax 寄存器间接寻址 mov eax, [ebx] ; 将 ebx 指向的内存地址的值加载到 eax 基址加变址寻址 mov eax, [ebx+ecx*4] ; 将 ebx + ecx * 4 处的值加载到 eax ","date":"2024-12-24","objectID":"/posts/computer/%E6%B1%87%E7%BC%96%E8%AF%AD%E8%A8%80/:4:3","tags":["Computer","汇编语言"],"title":"汇编语言","uri":"/posts/computer/%E6%B1%87%E7%BC%96%E8%AF%AD%E8%A8%80/"},{"categories":["Computer"],"content":"控制流 比较指令 Bit Label Description 0 CF Carry Flag(进位标志):运算结果的最高有效位有进位(加法)或借位(减法)时,进位标志置1 2 PF Parity Flag(奇偶标志):运算结果的所有位中1的个数是偶数置1 4 AF Auxiliary Carry flag(辅助进位标志位):第3位向第4位发生了进位,那么AF标志位置1 6 ZF Zero Flag:结果为0,置1 7 SF Sign Flag:结果为负数(最高位为1),置1 8 TF Trap Flag:陷阱标志位 ,用于调试,置 1 时单步执行。 9 IF Interrupt enable Flag:是否响应中断 10 DF Direction Flag(方向标志位)控制字符串操作的方向(0:递增,1:递减) 11 OF Overflow Flag(溢出标志位) 12-13 IOPL I/O privilege level:控制 I/O 指令的执行权限 14 NT Nested task 16 RF Resume Flag 用于调试,控制是否忽略断点 17 VM Virtual-8086 mode:置 1 时进入虚拟 8086 模式 18 AC Alignment check / Access Control:置 1 时启用对齐检查 19 VIF Virtual Interrupt Flag:虚拟模式下的中断标志 20 VIP Virtual Interrupt Pending:虚拟模式下的中断挂起状态。 21 ID ID Flag :支持 CPUID 指令的标志 cmp a, b ;计算 a-b 的值,并设置标志寄存器 对于无符号数字计算,存在以下场景: ZF(Zero Flag), CF(Carry Flag) a=b =\u003e ZF=1, CF=0 a\u003eb =\u003e ZF=0, CF=0 a\u003cb =\u003e ZF=0, CF=1 对于有符号数字计算,存在以下场景: ZF(Zero Flag), OF(Overflow Flag), SF(Sign Flag) a=b =\u003e ZF=1 a\u003eb =\u003e ZF=0, OF = SF a\u003cb =\u003e ZF=0, OF != SF 跳转指令 指令 条件 JZ branch only if ZF=1 JNZ branch only if ZF=0 JO branch only if OF=1 JNO branch only if OF=0 JS branch only if SF=1 JNS branch only if SF=0 JC branch only if CF=1 JNC branch only if CF=0 JP branch only if PF=1 JNP branch only if PF=0 jmp label 循环指令 loop label ; ecx--,如果 ecx != 0,跳转到 label loope label ; ecx--,如果 ecx != 0 且 ZF=1,跳转到 label loopne label ; ecx--,如果 ecx != 0 且 ZF=0,跳转到 label ","date":"2024-12-24","objectID":"/posts/computer/%E6%B1%87%E7%BC%96%E8%AF%AD%E8%A8%80/:4:4","tags":["Computer","汇编语言"],"title":"汇编语言","uri":"/posts/computer/%E6%B1%87%E7%BC%96%E8%AF%AD%E8%A8%80/"},{"categories":["Computer"],"content":"函数调用 call func01 ;调用函数 ret ;函数返回 ","date":"2024-12-24","objectID":"/posts/computer/%E6%B1%87%E7%BC%96%E8%AF%AD%E8%A8%80/:4:5","tags":["Computer","汇编语言"],"title":"汇编语言","uri":"/posts/computer/%E6%B1%87%E7%BC%96%E8%AF%AD%E8%A8%80/"},{"categories":["Computer"],"content":"参考阅读 汇编语言-B站 Arch Linux - v86 (copy.sh) ","date":"2024-12-24","objectID":"/posts/computer/%E6%B1%87%E7%BC%96%E8%AF%AD%E8%A8%80/:5:0","tags":["Computer","汇编语言"],"title":"汇编语言","uri":"/posts/computer/%E6%B1%87%E7%BC%96%E8%AF%AD%E8%A8%80/"},{"categories":null,"content":"fengchen 的个人介绍","date":"2024-12-01","objectID":"/about/","tags":null,"title":"About","uri":"/about/"},{"categories":null,"content":"fengchen 👍 ","date":"2024-12-01","objectID":"/about/:1:0","tags":null,"title":"About","uri":"/about/"},{"categories":["Other"],"content":"Record","date":"2024-12-01","objectID":"/posts/other/record/","tags":["Other"],"title":"Record","uri":"/posts/other/record/"},{"categories":["Other"],"content":"常用工具 ","date":"2024-12-01","objectID":"/posts/other/record/:1:0","tags":["Other"],"title":"Record","uri":"/posts/other/record/"},{"categories":["Other"],"content":"WSL wsl --install -d Ubuntu-22.04 # 安装后重启 wsl --shutdown # 使其stop wsl --export Ubuntu-22.04 D:\\wsl_ubuntu\\Ubuntu.tar # 导出备份 wsl --unregister Ubuntu-22.04 #删除当前安装的系统 wsl --import Ubuntu-22.04 D:\\wsl_ubuntu D:\\wsl_ubuntu\\Ubuntu.tar Ubuntu2204 config --default-user fengchen ","date":"2024-12-01","objectID":"/posts/other/record/:2:0","tags":["Other"],"title":"Record","uri":"/posts/other/record/"},{"categories":["Other"],"content":"VSCode远程连接AC平台 vscode1.86无法远程连接waiting the server log-CSDN博客 在windows系统上安装vscode,并在扩展中心搜索并安装Remote Development插件。 在windows系统上生成一对公钥和私钥,可以使用命令ssh-keygen -t rsa,然后一路回车。注意保存好您的私钥文件(id_rsa)和公钥文件(id_rsa.pub)的路径。 在linux服务器上安装ssh服务,如果已经安装了,请确保22端口是开放的。(AC平台忽略这步) 在linux服务器上在.ssh目录下创建一个名为authorized_keys的文件,然后将您的公钥文件(id_rsa.pub)的内容复制到该文件中。 在您的vscode中按F1键(或者左下角小图标-设置下面),输入ssh,然后选择打开SSH配置文件,编辑config文件,按照以下格式填写服务器信息: Host 主机名 HostName IP地址 Port 端口号 User 用户名 ForwardAgent yes 保存config文件后,点击左侧的小图标选择``连接到主机Remote-ssh`,选择配置好的服务器即可。 终端主题选择 复制到vscode中的settings.json中 \"workbench.colorCustomizations\": { \"terminal.background\": \"#1D2021\", \"terminal.foreground\": \"#D5C4A1\", \"terminalCursor.background\": \"#D5C4A1\", \"terminalCursor.foreground\": \"#D5C4A1\", \"terminal.ansiBlack\": \"#1D2021\", \"terminal.ansiBlue\": \"#83A598\", \"terminal.ansiBrightBlack\": \"#665C54\", \"terminal.ansiBrightBlue\": \"#83A598\", \"terminal.ansiBrightCyan\": \"#8EC07C\", \"terminal.ansiBrightGreen\": \"#B8BB26\", \"terminal.ansiBrightMagenta\": \"#D3869B\", \"terminal.ansiBrightRed\": \"#FB4934\", \"terminal.ansiBrightWhite\": \"#FBF1C7\", \"terminal.ansiBrightYellow\": \"#FABD2F\", \"terminal.ansiCyan\": \"#8EC07C\", \"terminal.ansiGreen\": \"#B8BB26\", \"terminal.ansiMagenta\": \"#D3869B\", \"terminal.ansiRed\": \"#FB4934\", \"terminal.ansiWhite\": \"#D5C4A1\", \"terminal.ansiYellow\": \"#FABD2F\" } ","date":"2024-12-01","objectID":"/posts/other/record/:3:0","tags":["Other"],"title":"Record","uri":"/posts/other/record/"},{"categories":["Other"],"content":"vscode 插件 vscode 集成 Neovim - 简书 (jianshu.com) ","date":"2024-12-01","objectID":"/posts/other/record/:3:1","tags":["Other"],"title":"Record","uri":"/posts/other/record/"},{"categories":["Other"],"content":"安装rocm环境 sudo apt update wget https://repo.radeon.com/amdgpu-install/6.2.3/ubuntu/jammy/amdgpu-install_6.2.60203-1_all.deb sudo apt install ./amdgpu-install_6.2.60203-1_all.deb sudo amdgpu-install --list-usecase # 显示可用用例的列表 amdgpu-install -y --usecase=wsl,rocm --no-dkms ","date":"2024-12-01","objectID":"/posts/other/record/:4:0","tags":["Other"],"title":"Record","uri":"/posts/other/record/"},{"categories":["Other"],"content":"MobaXterm MobaXterm的基本使用与快捷键介绍 - 木卯生十木 - 博客园 (cnblogs.com) ","date":"2024-12-01","objectID":"/posts/other/record/:5:0","tags":["Other"],"title":"Record","uri":"/posts/other/record/"},{"categories":["Other"],"content":"Source Insight New project -\u003e 新建工程名字;保存路径; project source directory:输入程序源代码的路径 add all 为工程添加文件 ,全部勾选;Show only known file types这一选项来选择显示其它类型的文件 ","date":"2024-12-01","objectID":"/posts/other/record/:6:0","tags":["Other"],"title":"Record","uri":"/posts/other/record/"},{"categories":["Other"],"content":"软件 键盘/🎮手柄按键 检测及历史记录显示工具 Windows11、Win10完美去除快捷方式小箭头的方法 - 知乎 (zhihu.com) # .bat 管理员运行 去除箭头 win11 reg add \"HKEY_LOCAL_MACHINE\\SOFTWARE\\Microsoft\\Windows\\CurrentVersion\\Explorer\\Shell Icons\" /v 29 /d \"%systemroot%\\system32\\imageres.dll,197\" /t reg_sz /f taskkill /f /im explorer.exe attrib -s -r -h \"%userprofile%\\AppData\\Local\\iconcache.db\" del \"%userprofile%\\AppData\\Local\\iconcache.db\" /f /q start explorer pause # win10 reg add \"HKEY_LOCAL_MACHINE\\SOFTWARE\\Microsoft\\Windows\\CurrentVersion\\Explorer\\Shell Icons\" /v 29 /d \"%systemroot%\\system32\\imageres.dll,197\" /t reg_sz /f taskkill /f /im explorer.exe start explorer pause # 恢复箭头 reg delete \"HKEY_LOCAL_MACHINE\\SOFTWARE\\Microsoft\\Windows\\CurrentVersion\\Explorer\\Shell Icons\" /v 29 /f taskkill /f /im explorer.exe start explorer pause ","date":"2024-12-01","objectID":"/posts/other/record/:7:0","tags":["Other"],"title":"Record","uri":"/posts/other/record/"},{"categories":["Other"],"content":"终端美化 ohmyzsh sudo apt install zsh # git也要安装 sh -c \"$(wget -O- https://install.ohmyz.sh/)\" ","date":"2024-12-01","objectID":"/posts/other/record/:7:1","tags":["Other"],"title":"Record","uri":"/posts/other/record/"},{"categories":["Other"],"content":"专利检索平台 访问网址 ","date":"2024-12-01","objectID":"/posts/other/record/:8:0","tags":["Other"],"title":"Record","uri":"/posts/other/record/"},{"categories":["Other"],"content":"pycharm激活插件 在File-Settings -\u003e Plugins 内手动添加第三方插件仓库地址:https://plugins.zhile.io IDE Eval Reset ","date":"2024-12-01","objectID":"/posts/other/record/:9:0","tags":["Other"],"title":"Record","uri":"/posts/other/record/"},{"categories":["Other"],"content":"Scientific Toolworks Understand安装 Setup x64.exe安装 安装目录的bin文件夹下替换crack的understand.exe Understand 6.4.1141破解_understand离线激活-CSDN博客 ","date":"2024-12-01","objectID":"/posts/other/record/:10:0","tags":["Other"],"title":"Record","uri":"/posts/other/record/"},{"categories":["Other"],"content":"typora 使用 Typora 画图(类图、流程图、时序图) - 知乎 (zhihu.com) 一不小心没保存可以在临时目录里找:C:\\Users\\用户名\\AppData\\Roaming\\Typora\\draftsRecover ","date":"2024-12-01","objectID":"/posts/other/record/:11:0","tags":["Other"],"title":"Record","uri":"/posts/other/record/"},{"categories":["Other"],"content":"Latex KaTeX 默认不支持 numcases 环境,还是使用cases吧。 katex在线 ","date":"2024-12-01","objectID":"/posts/other/record/:12:0","tags":["Other"],"title":"Record","uri":"/posts/other/record/"},{"categories":["Other"],"content":"Blog配置","date":"2024-11-24","objectID":"/posts/other/blog/","tags":["Other"],"title":"Blog配置","uri":"/posts/other/blog/"},{"categories":["Other"],"content":"安装Hugo winget install Hugo.Hugo.Extended 安装完成后查看hugo版本验证安装是否成功 hugo version ","date":"2024-11-24","objectID":"/posts/other/blog/:1:0","tags":["Other"],"title":"Blog配置","uri":"/posts/other/blog/"},{"categories":["Other"],"content":"配置博客源 ","date":"2024-11-24","objectID":"/posts/other/blog/:2:0","tags":["Other"],"title":"Blog配置","uri":"/posts/other/blog/"},{"categories":["Other"],"content":"使用hugo创建网站 hugo new site blog # 可以替换成任意你想要的名字 # github里创建同名仓库,到时候git push进去就行 ","date":"2024-11-24","objectID":"/posts/other/blog/:3:0","tags":["Other"],"title":"Blog配置","uri":"/posts/other/blog/"},{"categories":["Other"],"content":"安装主题 hugo 主题 根据主题文档安装,用的fixit主题 cd blog git init git submodule add https://github.com/hugo-fixit/FixIt.git themes/FixIt git submodule update --remote --merge themes/FixIt 测试只需要把主blog/themes/FixIt/demo放在blog下相对于的文件夹里就行,比如 cp themes/FixIt/demo/hugo.toml hugo.toml # blog/themes/FixIt/demo/content/posts 替换blog/content 其中修改hugo.toml里的baseurl修改成你的网站.github.io ","date":"2024-11-24","objectID":"/posts/other/blog/:4:0","tags":["Other"],"title":"Blog配置","uri":"/posts/other/blog/"},{"categories":["Other"],"content":"本地调试和预览 创建文件 hugo new posts/test/a.md 站点调试 hugo server --buildDrafts hugo server -D hugo server -D --disableFastRender ","date":"2024-11-24","objectID":"/posts/other/blog/:5:0","tags":["Other"],"title":"Blog配置","uri":"/posts/other/blog/"},{"categories":["Other"],"content":"配置Action settings -\u003e Developer Settings -\u003e Personal access tokens(Token classic) -\u003e generate new token里创建一个tokens,注意勾选repo和workflow权限 在博客源仓库的Settings -\u003e Secrets and variables -\u003e Actions -\u003e Repository secrets中添加一个NAME为ACTION_TOKEN(随便什么名字,后面要使用)内容为刚刚创建的tokens 创建 blog/.github/workflows/gh-pages.yml name: GitHub Pages on: push: branches: - main pull_request: jobs: deploy: runs-on: ubuntu-22.04 concurrency: group: ${{ github.workflow }}-${{ github.ref }} steps: - uses: actions/checkout@v4 with: submodules: true # Fetch Hugo themes (true OR recursive) fetch-depth: 0 # Fetch all history for .GitInfo and .Lastmod - name: Setup Hugo uses: peaceiris/actions-hugo@v3 with: hugo-version: '0.139.0' extended: true # 是否启用hugo extended - name: Build run: hugo --minify - name: Deploy uses: peaceiris/actions-gh-pages@v3 with: EXTERNAL_REPOSITORY: fengchen321/fengchen321.github.io # 你的Github Pages远程仓库名 PERSONAL_TOKEN: ${{ secrets.ACTION_TOKEN }} # setting 存放的名字而不是原始key PUBLISH_DIR: ./public PUBLISH_BRANCH: main push该博客源即可 ","date":"2024-11-24","objectID":"/posts/other/blog/:6:0","tags":["Other"],"title":"Blog配置","uri":"/posts/other/blog/"},{"categories":["Other"],"content":"参考阅读 使用 Hugo + Github Pages 部署个人博客 matrix-a ","date":"2024-11-24","objectID":"/posts/other/blog/:7:0","tags":["Other"],"title":"Blog配置","uri":"/posts/other/blog/"},{"categories":["C++"],"content":"C++并发","date":"2023-10-15","objectID":"/posts/c++/c++%E5%B9%B6%E5%8F%91/","tags":["C++"],"title":"C++并发","uri":"/posts/c++/c++%E5%B9%B6%E5%8F%91/"},{"categories":["C++"],"content":"C++ Concurrency 进程和线程的区别 ","date":"2023-10-15","objectID":"/posts/c++/c++%E5%B9%B6%E5%8F%91/:1:0","tags":["C++"],"title":"C++并发","uri":"/posts/c++/c++%E5%B9%B6%E5%8F%91/"},{"categories":["C++"],"content":"线程基础 ","date":"2023-10-15","objectID":"/posts/c++/c++%E5%B9%B6%E5%8F%91/:2:0","tags":["C++"],"title":"C++并发","uri":"/posts/c++/c++%E5%B9%B6%E5%8F%91/"},{"categories":["C++"],"content":"初始化线程对象 启动线程后要明确是等待线程结束join(),还是让其自主运行detach()。否则程序会终止(std::thread的析构函数会调用std::terminate())。 等待线程结束,来保证可访问的数据是有效的。 只能对一个线程使用一次join(),当对其使用joinable()时,将返回false。 void hello() { std::cout \u003c\u003c \"Hello world !\" \u003c\u003c std::endl; } std::thread t1(hello); t1.join(); class background_task { public: void operator()() { // 重载()运算符 hello(); } }; // my_thread被当作函数对象的定义,其返回类型为std::thread, 参数为函数指针background_task() // std::thread my_thread(background_task()); // 相当与声明了一个名为my_thread的函数 // 使用一组额外的括号,或使用新统一的初始化语法,可以避免其解释函数声明 (定义一个线程my_thread) std::thread my_thread_1((background_task())); std::thread my_thread_2{background_task()}; // lambda表达式 std::thread my_thread_3([](){ hello(); }); my_thread_3.join(); ","date":"2023-10-15","objectID":"/posts/c++/c++%E5%B9%B6%E5%8F%91/:2:1","tags":["C++"],"title":"C++并发","uri":"/posts/c++/c++%E5%B9%B6%E5%8F%91/"},{"categories":["C++"],"content":"detach 线程允许采用分离的方式在后台独自运行。 当oops调用后,局部变量some_local_state可能被释放。 通过智能指针传递参数。 (引用计数会随着赋值增加,可保证局部变量在使用期间不被释放) 将局部变量的值作为参数传递。(需要局部变量有拷贝复制的功能,而且拷贝耗费空间和效率) 将线程运行的方式修改为join。(可能会影响运行逻辑) struct func{ int\u0026 _i; func(int \u0026 i): _i(i){} void operator()(){ for (int i = 0; i \u003c 3; i++){ _i = i; std::cout \u003c\u003c \"_i is \" \u003c\u003c _i \u003c\u003c std::endl; std::this_thread::sleep_for(std::chrono::seconds(1)); } } }; void oops() { int some_locate_state = 0; func myfunc(some_locate_state); std::thread functhread(myfunc); // 访问局部变量。局部变量可能会随着}结束而回收或随着主线程退出而回收 functhread.detach(); } void use_join() { int some_locate_state = 0; func myfunc(some_locate_state); std::thread functhread(myfunc); functhread.join(); } oops(); // 防止主线程退出过快 std::this_thread::sleep_for(std::chrono::seconds(1)); // 使用join use_join(); ","date":"2023-10-15","objectID":"/posts/c++/c++%E5%B9%B6%E5%8F%91/:2:2","tags":["C++"],"title":"C++并发","uri":"/posts/c++/c++%E5%B9%B6%E5%8F%91/"},{"categories":["C++"],"content":"捕获异常 捕获异常,并且在异常情况下保证子线程稳定运行结束后,主线程抛出异常结束运行。 void catch_exception() { int some_locate_state = 0; func myfunc(some_locate_state); std::thread functhread(myfunc); try { std::this_thread::sleep_for(std::chrono::seconds(1)); } catch (std::exception\u0026 e) { functhread.join(); throw; } functhread.join(); } 线程守卫:采用RAII技术,保证线程对象析构的时候等待线程运行结束,回收资源 // RAII 资源获取初始化 class thread_guard { private: std::thread\u0026 _t; public: explicit thread_guard(std::thread\u0026 t): _t(t){} ~thread_guard() { // join只能调用一次 if (_t.joinable()){ _t.join(); } } thread_guard(thread_guard const\u0026) = delete; thread_guard\u0026 operator=(thread_guard const\u0026) = delete; }; void auto_guard() { int some_locate_state = 0; func myfunc(some_locate_state); std::thread functhread(myfunc); thread_guard g(functhread); std::cout \u003c\u003c \"auto guard finished\" \u003c\u003c std::endl; } ","date":"2023-10-15","objectID":"/posts/c++/c++%E5%B9%B6%E5%8F%91/:2:3","tags":["C++"],"title":"C++并发","uri":"/posts/c++/c++%E5%B9%B6%E5%8F%91/"},{"categories":["C++"],"content":"参数传递 void print_str(int i, std::string const\u0026 s) { std::cout \u003c\u003c \"i is \" \u003c\u003c i \u003c\u003c \", str is \" \u003c\u003c s \u003c\u003c std::endl; } void danger_oops(int som_param) { char buffer[1024]; sprintf(buffer, \"%i\", som_param); std::thread t(print_str, 3, buffer); // 局部变量buffer可能回收 t.detach(); std::cout \u003c\u003c \"danger oops finished\" \u003c\u003c std::endl; } void safe_oops(int som_param) { char buffer[1024]; sprintf(buffer, \"%i\", som_param); std::thread t(print_str, 3, std::string(buffer)); // 显示创建一个std::string对象 t.detach(); std::cout \u003c\u003c \"safe oops finished\" \u003c\u003c std::endl; } 当线程要调用的回调函数参数为引用类型时,需要将参数显示转化为引用对象传递给线程的构造函数。 void chage_param(int\u0026 param){ param++; } void ref_oops(int som_param) { std::cout \u003c\u003c \"before change, param is \" \u003c\u003c som_param \u003c\u003c std::endl; std::thread t(chage_param,std::ref(som_param));// 不加stds:ref会盲目复制,传递的是副本的引用即data副本(copy)的引用 t.join(); std::cout \u003c\u003c \"after change, param is \" \u003c\u003c som_param \u003c\u003c std::endl; } 绑定类的成员函数,必须添加\u0026 class X { public: void do_lengthy_work(){ std::cout \u003c\u003c \"do_lengthy_work \" \u003c\u003c std::endl; } }; void bind_class_oops() { X my_x; std::thread t(\u0026X::do_lengthy_work, \u0026my_x); t.join(); } 有时候传递给线程的参数是独占的(不支持拷贝赋值和构造),可以通过std::move的方式将参数的所有权转移给线程 void deal_unique(std::unique_ptr\u003cint\u003e p) { std::cout \u003c\u003c \"unique ptr data is \" \u003c\u003c *p \u003c\u003c std::endl; (*p)++; std::cout \u003c\u003c \"after unique ptr data is \" \u003c\u003c *p \u003c\u003c std::endl; } void move_oops() { auto p = std::make_unique\u003cint\u003e(100); std::thread t(deal_unique, std::move(p)); t.join(); } ","date":"2023-10-15","objectID":"/posts/c++/c++%E5%B9%B6%E5%8F%91/:2:4","tags":["C++"],"title":"C++并发","uri":"/posts/c++/c++%E5%B9%B6%E5%8F%91/"},{"categories":["C++"],"content":"线程归属 使用std::move移动归属; 不能将一个线程的管理权交给一个已经绑定线程的变量,会触发线程的terminate函数引发崩溃 void some_function(){ while (true) { std::this_thread::sleep_for(std::chrono::seconds(1)); } } void some_other_function(){ while (true) { std::this_thread::sleep_for(std::chrono::seconds(1)); } } std::thread t1(some_function); std::thread t2 = std::move(t1); t1 = std::thread(some_other_function); std::thread t3; t3 = std::move(t2); // t1 = std::move(t3); // 将一个线程的管理权交给一个已经绑定线程的变量,会触发线程的terminate函数引发崩溃 std::this_thread::sleep_for(std::chrono::seconds(10)); 自动join的线程类 joining_thread class joining_thread { std::thread _t; public: joining_thread() noexcept = default; template\u003ctypename Callable, typename ... Args\u003e explicit joining_thread(Callable\u0026\u0026 func, Args\u0026\u0026 ...args): _t(std::forward\u003cCallable\u003e(func), std::forward\u003cArgs\u003e(args)...){} explicit joining_thread(std::thread t) noexcept: _t(std::move(t)){} joining_thread(joining_thread\u0026\u0026 other) noexcept: _t(std::move(other._t)){} joining_thread\u0026 operator=(joining_thread\u0026\u0026 other) noexcept { if (joinable()) { join(); } _t = std::move(other._t); return *this; } joining_thread\u0026 operator=(std::thread other) noexcept { if (joinable()) { join(); } _t = std::move(other); return *this; } ~joining_thread() noexcept { if (joinable()) { join(); } } void swap(joining_thread\u0026 other) noexcept { _t.swap(other._t); } std::thread::id get_id() const noexcept { return _t.get_id(); } bool joinable() const noexcept { return _t.joinable(); } void join() { _t.join(); } void detach() { _t.detach(); } std::thread\u0026 as_thread() noexcept { return _t; } const std::thread\u0026 as_thread() const noexcept { return _t; } }; ","date":"2023-10-15","objectID":"/posts/c++/c++%E5%B9%B6%E5%8F%91/:2:5","tags":["C++"],"title":"C++并发","uri":"/posts/c++/c++%E5%B9%B6%E5%8F%91/"},{"categories":["C++"],"content":"容器存储 生成一批线程并等待它们完成。初始化多个线程存储在vector中, 采用的时emplace方式,可以直接根据线程构造函数需要的参数构造,这样就避免了调用thread的拷贝构造函数。 void param_function(int a) { std::cout \u003c\u003c \"param is \" \u003c\u003c a \u003c\u003c std::endl; std::this_thread::sleep_for(std::chrono::seconds(1)); } void use_vector() { unsigned int N = std::thread::hardware_concurrency(); std::vector\u003cstd::thread\u003e threads; for (unsigned int i = 0; i \u003c N; ++i) { threads.emplace_back(param_function, i); } for (auto\u0026 entry : threads) { if (entry.joinable()) { entry.join(); } } threads.clear(); } ","date":"2023-10-15","objectID":"/posts/c++/c++%E5%B9%B6%E5%8F%91/:2:6","tags":["C++"],"title":"C++并发","uri":"/posts/c++/c++%E5%B9%B6%E5%8F%91/"},{"categories":["C++"],"content":"选择运行数量 std::thread::hardware_concurrency()函数,它的返回值是一个指标,表示程序在各次运行中可真正并发的线程数量。 template\u003ctypename Iterator, typename T\u003e struct accumulate_block{ void operator()(Iterator first, Iterator last, T\u0026 result){ result = std::accumulate(first, last, result); } }; template\u003ctypename Iterator, typename T\u003e T parallel_accumulate(Iterator first, Iterator last, T init){ unsigned long const length = std::distance(first, last); if (!length) { // 1.输入为空,返回初始值init return init; } unsigned long const min_per_thread = 25; unsigned long const max_threads = (length + min_per_thread - 1) / min_per_thread; // 2.需要的线程最大数(向上取整) unsigned long const hardware_threads = std::thread::hardware_concurrency(); unsigned long const num_threads = std::min(hardware_threads!=0 ? hardware_threads : 2, max_threads); // 3.实际的线程选择数量 unsigned long const block_size = length / num_threads; // 4.每个线程待处理的条目数量,步长 std::vector\u003cT\u003e results(num_threads); std::vector\u003cstd::thread\u003e threads(num_threads - 1); // 5.初始化了(num_threads - 1)个大小的vector,因为主线程也参与计算 Iterator block_start = first; for (unsigned long i = 0; i \u003c num_threads - 1; ++i){ Iterator block_end = block_start; std::advance(block_end, block_size); // 6. 递进block_size迭代器到当前块的结尾 threads[i] = std::thread(accumulate_block\u003cIterator, T\u003e(), block_start, block_end, std::ref(results[i])); // 7.启动新的线程计算结果 block_start = block_end; // 8.更新起始位置 } accumulate_block\u003cIterator, T\u003e()( block_start, last, results[num_threads - 1]); // 9. 主线程计算,处理最后的块 for (auto\u0026 entry : threads){ if (entry.joinable()){ entry.join(); } } return std::accumulate(results.begin(), results.end(), init); // 10. 累加 } void use_parallel_acc(int N) { auto start = std::chrono::high_resolution_clock::now(); std::vector \u003cint\u003e vec(N); for (int i = 0; i \u003c N; i++) { vec.push_back(i); } int sum = 0; sum = parallel_accumulate\u003cstd::vector\u003cint\u003e::iterator, int\u003e(vec.begin(), vec.end(), sum); auto end = std::chrono::high_resolution_clock::now(); std::chrono::duration\u003cdouble\u003e timeDuration = end - start; double duration = timeDuration.count(); std::cout \u003c\u003c \"use_parallel_acc sum is \" \u003c\u003c sum \u003c\u003c \" duration: \" \u003c\u003c duration \u003c\u003c std::endl; } ","date":"2023-10-15","objectID":"/posts/c++/c++%E5%B9%B6%E5%8F%91/:2:7","tags":["C++"],"title":"C++并发","uri":"/posts/c++/c++%E5%B9%B6%E5%8F%91/"},{"categories":["C++"],"content":"识别线程 获取线程ID,根据线程id是否相同判断是否同一个线程 通过get_id()成员函数来获取 std::this_thread::get_id()获取 void do_subthread(){ std::cout \u003c\u003c \"do sub thread work \" \u003c\u003c std::this_thread::get_id() \u003c\u003c std::endl; } void thread_id(){ std::thread::id master_thread = std::this_thread::get_id(); std::thread t(do_subthread); std::cout \u003c\u003c \"do_subthread id: \" \u003c\u003c t.get_id() \u003c\u003c std::endl; // 线程可能没运行,可能会返回一个空的 std::thread::id t.join(); if (std::this_thread::get_id() == master_thread){ std::cout \u003c\u003c \"do master thread work: \"\u003c\u003c std::this_thread::get_id() \u003c\u003c std::endl; } std::cout \u003c\u003c \"do common thread work: \" \u003c\u003c std::this_thread::get_id() \u003c\u003c std::endl; } ","date":"2023-10-15","objectID":"/posts/c++/c++%E5%B9%B6%E5%8F%91/:2:8","tags":["C++"],"title":"C++并发","uri":"/posts/c++/c++%E5%B9%B6%E5%8F%91/"},{"categories":["C++"],"content":"锁 ","date":"2023-10-15","objectID":"/posts/c++/c++%E5%B9%B6%E5%8F%91/:3:0","tags":["C++"],"title":"C++并发","uri":"/posts/c++/c++%E5%B9%B6%E5%8F%91/"},{"categories":["C++"],"content":"避免竞争 lock_guard 保护机制封装数据结构 互斥量(mutex) —-lock加锁 和unlock解锁 std::lock_guard\u003cstd::mutex\u003e lock(mtx1):互斥量RAII惯用法,自动加锁和解锁 不要将对受保护数据的指针和引用传递到锁的范围之外 死锁 deadlock:每个线程都在等待另一个释放其mutex 使用相同顺序锁定两个mutex 将加锁和解锁的功能封装为独立的函数 使用两个互斥量,同时加锁 减少锁的使用范围 层级锁:同一个函数内部加多个锁的情况,要尽可能避免循环加锁,自定义一个层级锁来保证项目中对多个互斥量加锁时是有序的。 修改数据结构的设计及不变量 (无锁编程) ","date":"2023-10-15","objectID":"/posts/c++/c++%E5%B9%B6%E5%8F%91/:3:1","tags":["C++"],"title":"C++并发","uri":"/posts/c++/c++%E5%B9%B6%E5%8F%91/"},{"categories":["C++"],"content":"同时加锁 方法1: std::lock(objm1._mtx, objm2._mtx); std::lock_guard\u003cstd::mutex\u003e guard1(objm1._mtx, std::adopt_lock); //领养锁,只负责解锁,不负责加锁 std::lock_guard\u003cstd::mutex\u003e guard2(objm2._mtx, std::adopt_lock); 方法2: std::scoped_lock guard(objm1._mtx, objm2._mtx); // c++17 ","date":"2023-10-15","objectID":"/posts/c++/c++%E5%B9%B6%E5%8F%91/:3:2","tags":["C++"],"title":"C++并发","uri":"/posts/c++/c++%E5%B9%B6%E5%8F%91/"},{"categories":["C++"],"content":"层级锁 class hierarchical_mutex { public: explicit hierarchical_mutex(unsigned long value):_hierarchy_value(value), _previous_hierarchy_value(0){} hierarchical_mutex(const hierarchical_mutex\u0026) = delete; hierarchical_mutex\u0026 operator=(const hierarchical_mutex\u0026) = delete; void lock(){ check_for_hierarchy_violation(); _internal_mutex.lock(); // 实际锁定 update_hierarchy_value(); //更新层级值 } void unlock(){ if (_this_thread_hierarchy_value != _hierarchy_value) { throw std::logic_error(\"mutex hierarchy violated\"); } _this_thread_hierarchy_value = _previous_hierarchy_value; // 保存当前线程之前的层级值 _internal_mutex.unlock(); } bool try_lock(){ check_for_hierarchy_violation(); if (!_internal_mutex.try_lock()){ return false; } update_hierarchy_value(); return true; } private: std::mutex _internal_mutex; unsigned long const _hierarchy_value; // 当前层级值 unsigned long _previous_hierarchy_value; // 上一次层级值 static thread_local unsigned long _this_thread_hierarchy_value; // 当前线程记录的层级值 void check_for_hierarchy_violation(){ if (_this_thread_hierarchy_value \u003c= _hierarchy_value){ throw std::logic_error(\"mutex hierarchy violated\"); } } void update_hierarchy_value(){ _previous_hierarchy_value = _this_thread_hierarchy_value; _this_thread_hierarchy_value = _hierarchy_value; } }; thread_local unsigned long hierarchical_mutex::_this_thread_hierarchy_value(ULONG_MAX); //初始化为最大值 ","date":"2023-10-15","objectID":"/posts/c++/c++%E5%B9%B6%E5%8F%91/:3:3","tags":["C++"],"title":"C++并发","uri":"/posts/c++/c++%E5%B9%B6%E5%8F%91/"},{"categories":["C++"],"content":"unique_lock unique_lock:可以手动解锁。,通过unique_lock的owns_lock判断是否持有锁 std::mutex mtx; int shared_data = 0; void use_unique_owns() { std::unique_lock\u003cstd::mutex\u003e guard(mtx); if (guard.owns_lock()){ std::cout \u003c\u003c \"owns lock\" \u003c\u003c std::endl; } else { std::cout \u003c\u003c \"doesn't own lock\" \u003c\u003c std::endl; } shared_data++; guard.unlock(); if (guard.owns_lock()){ std::cout \u003c\u003c \"owns lock\" \u003c\u003c std::endl; } else { std::cout \u003c\u003c \"doesn't own lock\" \u003c\u003c std::endl; } } 支持领养和延迟加锁 将std::adopt_lock作为第二参数传入构造函数,对互斥量进行管理 将std::defer_lock作为第二参数传入构造函数,表明互斥量应保持解锁状态。 int a = 10, b = 100; std::mutex mtx1; std::mutex mtx2; void safe_swap_adopt(){ std::lock(mtx1, mtx2); std::unique_lock\u003cstd::mutex\u003e guard1(mtx1, std::adopt_lock); std::unique_lock\u003cstd::mutex\u003e guard2(mtx2, std::adopt_lock); std::swap(a,b); guard1.unlock(); // 可自动释放, 已经领养不能mtx1.unlock() guard2.unlock(); std::cout \u003c\u003c \"a = \" \u003c\u003c a \u003c\u003c \", b = \" \u003c\u003c b \u003c\u003c std::endl; } void safe_swap_defer(){ std::unique_lock\u003cstd::mutex\u003e guard1(mtx1, std::defer_lock); std::unique_lock\u003cstd::mutex\u003e guard2(mtx2, std::defer_lock); std::lock(guard1, guard2); std::swap(a,b); std::cout \u003c\u003c \"a = \" \u003c\u003c a \u003c\u003c \", b = \" \u003c\u003c b \u003c\u003c std::endl; } mutex是不支持移动和拷贝的,unique_lock可移动,不可赋值 std::unique_lock\u003cstd::mutex\u003e get_lock() { std::unique_lock\u003cstd::mutex\u003e lk(mtx); shared_data++; return lk; } void test_return() { std::unique_lock\u003cstd::mutex\u003e lk(get_lock()); shared_data++; } 锁的粒度:表示加锁的精细程度。 一个锁的粒度要足够大,以保证可以锁住要访问的共享数据。 一个锁的粒度要足够小,以保证非共享的数据不被锁住影响性能。 void precision_lock() { std::unique_lock\u003cstd::mutex\u003e lk(mtx); shared_data++; lk.unlock(); // 不涉及共享数据的耗时操作不在锁内执行; std::this_thread::sleep_for(std::chrono::seconds(1)); lk.lock(); shared_data++; lk.unlock(); } ","date":"2023-10-15","objectID":"/posts/c++/c++%E5%B9%B6%E5%8F%91/:3:4","tags":["C++"],"title":"C++并发","uri":"/posts/c++/c++%E5%B9%B6%E5%8F%91/"},{"categories":["C++"],"content":"shared_lock C++11标准没有共享互斥量,可以使用boost提供的boost::shared_mutex std::shared_mutex(c++17) 提供lock()、try_lock_for()和try_lock_until()用于获取互斥锁的函数 提供try_lock_shared()和lock_shared()用于获取共享锁的函数 当 std::shared_mutex 被锁定后,其他尝试获取该锁的线程将会被阻塞,直到该锁被解锁 std::shared_timed_mutex (c++14、17) 提供lock()、try_lock_for()和try_lock_until()用于获取互斥锁的函数 提供try_lock_shared()和lock_shared()用于获取共享锁的函数 (超时机制) 尝试获取共享锁时,如果不能立即获得锁,std::shared_timed_mutex 会设置一个超时,超时过后如果仍然没有获取到锁,则操作将返回失败。 写操作需要独占锁。而读操作需要共享锁。 class dns_cache { public: std::string find_entry(std::string const \u0026 domain) const { std::shared_lock\u003cstd::shared_mutex\u003e lk(entry_mutex); // 保护共享和只读权限 std::map\u003cstd::string, std::string\u003e::const_iterator const it = entries.find(domain); return (it == entries.end()) ? \"\" : it-\u003esecond; } void update_or_add_entry(std::string const \u0026 domain, std::string const\u0026 dns_details) { std::lock_guard\u003cstd::shared_mutex\u003e lk(entry_mutex); entries[domain] = dns_details; } private: std::map\u003cstd::string, std::string\u003e entries; mutable std::shared_mutex entry_mutex; }; ","date":"2023-10-15","objectID":"/posts/c++/c++%E5%B9%B6%E5%8F%91/:3:5","tags":["C++"],"title":"C++并发","uri":"/posts/c++/c++%E5%B9%B6%E5%8F%91/"},{"categories":["C++"],"content":"recursive_lock 出现一个接口调用另一个接口的情况,如果用普通的std::mutex就会出现卡死 class RecursiveDemo { public: RecursiveDemo() {} bool QueryStudent(std::string name) { // std::lock_guard\u003cstd::mutex\u003e mutex_lock(_mtx); std::lock_guard\u003cstd::recursive_mutex\u003e recursive_lock(_recursive_mtx); auto iter_find = _students_info.find(name); if (iter_find == _students_info.end()) { return false; } return true; } void AddScore(std::string name, int score) { // std::lock_guard\u003cstd::mutex\u003e mutex_lock(_mtx); std::lock_guard\u003cstd::recursive_mutex\u003e recursive_lock(_recursive_mtx); if (!QueryStudent(name)) { _students_info.insert(std::make_pair(name, score)); return; } _students_info[name] = _students_info[name] + score; } void AddScoreAtomic(std::string name, int score) { std::lock_guard\u003cstd::mutex\u003e mutex_lock(_mtx); // std::lock_guard\u003cstd::recursive_mutex\u003e recursive_lock(_recursive_mtx); auto iter_find = _students_info.find(name); if (iter_find == _students_info.end()){ _students_info.insert(std::make_pair(name, score)); return; } _students_info[name] = _students_info[name] + score; return; } private: std::map\u003cstd::string, int\u003e _students_info; std::mutex _mtx; std::recursive_mutex _recursive_mtx; }; ","date":"2023-10-15","objectID":"/posts/c++/c++%E5%B9%B6%E5%8F%91/:3:6","tags":["C++"],"title":"C++并发","uri":"/posts/c++/c++%E5%B9%B6%E5%8F%91/"},{"categories":["C++"],"content":"同步并发操作 ","date":"2023-10-15","objectID":"/posts/c++/c++%E5%B9%B6%E5%8F%91/:4:0","tags":["C++"],"title":"C++并发","uri":"/posts/c++/c++%E5%B9%B6%E5%8F%91/"},{"categories":["C++"],"content":"条件变量 std::condition_variable , 与std::mutex一起 std::condition_variable_any,与满足最低标准的互斥量一起 条件不满足时(num 不等于1 时)cvA.wait就会挂起,等待线程B通知通知线程A唤醒,线程B采用的是cvA.notifyone void ResonableImplemention() { std::thread t1([](){ while(true){ std::unique_lock\u003cstd::mutex\u003e lk(mtx_num); // 方法一 // while (num != 1) { // cvA.wait(lk); // } // 方法二 cvA.wait(lk, []() { return num == 1; }); std::cout \u003c\u003c \"thread A print 1.....\" \u003c\u003c std::endl; num++; cvB.notify_one(); } }); std::thread t2([](){ while(true){ std::unique_lock\u003cstd::mutex\u003e lk(mtx_num); cvB.wait(lk, []() { return num == 2; }); std::cout \u003c\u003c \"thread B print 2.....\" \u003c\u003c std::endl; num--; cvA.notify_one(); } }); t1.join(); t2.join(); } // 队列实现,和之前栈实现类似 template\u003ctypename T\u003e class threadsafe_queue { public: threadsafe_queue(){} threadsafe_queue(const threadsafe_queue\u0026 other) { std::lock_guard\u003cstd::mutex\u003e lk(other.mut); data_queue = other.data_queue; } threadsafe_queue\u0026 operator=(const threadsafe_queue\u0026) = delete; void push(T new_value) { std::lock_guard\u003cstd::mutex\u003e lk(mut); data_queue.push(new_value); data_cond.notify_one(); } void wait_and_pop(T\u0026 value) { std::unique_lock\u003cstd::mutex\u003e lk(mut); data_cond.wait(lk, [this]{return !data_queue.empty();}); value = data_queue.front(); data_queue.pop(); } std::shared_ptr\u003cT\u003e wait_and_pop() { std::unique_lock\u003cstd::mutex\u003e lk(mut); data_cond.wait(lk, [this]{return !data_queue.empty();}); std::shared_ptr\u003cT\u003e res(std::make_shared\u003cT\u003e(data_queue.front())); data_queue.pop(); return res; } bool try_pop(T\u0026 value) { std::lock_guard\u003cstd::mutex\u003e lk(mut); if (data_queue.empty()) { return false; } value = data_queue.front(); data_queue.pop(); return true; } std::shared_ptr\u003cT\u003e try_pop() { std::lock_guard\u003cstd::mutex\u003e lk(mut); if (data_queue.empty()) { return std::shared_ptr\u003cT\u003e(); } std::shared_ptr\u003cT\u003e res(std::make_shared\u003cT\u003e(data_queue.front())); data_queue.pop(); return res; } bool empty() const { std::lock_guard\u003cstd::mutex\u003e lk(mut); return data_queue.empty(); } private: mutable std::mutex mut; std::queue\u003cT\u003e data_queue; std::condition_variable data_cond; }; ","date":"2023-10-15","objectID":"/posts/c++/c++%E5%B9%B6%E5%8F%91/:4:1","tags":["C++"],"title":"C++并发","uri":"/posts/c++/c++%E5%B9%B6%E5%8F%91/"},{"categories":["C++"],"content":"async 用于异步执行函数的模板函数,它返回一个 std::future 对象,该对象用于获取函数的返回值。 类似std::thread,通过添加额外的调用参数,向函数传递额外的参数。 std::string fetchDataFromDB(std::string query) { std::this_thread::sleep_for(std::chrono::seconds(2)); return \"Data: \" + query; } void use_asyc() { // 使用 std::async 异步调用 fetchDataFromDB std::future\u003cstd::string\u003e resultFromDB = std::async(std::launch::async, fetchDataFromDB, \"Data\"); // 在主线程中做其他事情 std::cout \u003c\u003c \"Doing something else...\" \u003c\u003c std::endl; std::this_thread::sleep_for(std::chrono::seconds(4)); std::cout \u003c\u003c \"past 4s\" \u003c\u003c std::endl; // 从 future 对象中获取数据 std::string dbData = resultFromDB.get(); std::cout \u003c\u003c dbData \u003c\u003c std::endl; } std::async 创建了一个新的线程(或从内部线程池中挑选一个线程)并自动与一个 std::promise 对象相关联。std::promise 对象被传递给 fetchDataFromDB 函数,函数的返回值被存储在 std::future 对象中。在主线程中,使用 std::future::get 方法从 std::future 对象中获取数据。注意,在使用 std::async 的情况下,必须使用 std::launch::async 标志来明确表明希望函数异步执行。 启动策略:在std::launch枚举中定义。 enum class launch { async = 1, deferred = 2 }; std::launch::async:表明函数必须在其所在的独立线程上执行 std::launch::deferred:表明函数调用被延迟到std::future::get()或std::future::wait()时才执行。(要结果的时候才执行) std::launch::async | std::launch::deferred:(默认使用)任务可以在一个单独的线程上异步执行,也可以延迟执行,具体取决于实现。 ","date":"2023-10-15","objectID":"/posts/c++/c++%E5%B9%B6%E5%8F%91/:4:2","tags":["C++"],"title":"C++并发","uri":"/posts/c++/c++%E5%B9%B6%E5%8F%91/"},{"categories":["C++"],"content":"future 期望值 唯一期望值:std::futurte\u003c\u003e;只能与一个指定事件相关联 共享期望值:std::shared_future\u003c\u003e:可关联多个事件,所有实例同时变为就绪状态。 std::future::get():阻塞调用,用于获取并返回任务的结果;只能调用一次 std::future::wait(): 阻塞调用,只是等待任务完成;可以被多次调用 std::future::wait_for()和std::future::wait_until检查异步操作是否已完成,返回一个表示操作状态的std::future_status值 任务与future关联 std::packaged_task:是一个可调用对象,它包装了一个任务,该任务可以在另一个线程上运行。它可以捕获任务的返回值或异常,并将其存储在std::future对象中,以便以后使用。 创建一个std::packaged_task对象,该对象包装了要执行的任务。 调用std::packaged_task对象的get_future()方法,该方法返回一个与任务关联的std::future对象。 在另一个线程上调用std::packaged_task对象的operator(),以执行任务。 在需要任务结果的地方,调用与任务关联的std::future对象的get()方法,以获取任务的返回值或异常。 int my_task() { std::this_thread::sleep_for(std::chrono::seconds(5)); std::cout \u003c\u003c \"my task run 5 s\" \u003c\u003c std::endl; return 0; } void use_package() { std::packaged_task\u003cint()\u003e task(my_task); //创建一个`std::packaged_task`对象,该对象包装了要执行的任务。 std::future\u003cint\u003e result = task.get_future(); // // 获取与任务关联的 std::future 对象 std::thread t(std::move(task)); // 在另一个线程上执行任务 t.detach(); int value = result.get(); // 等待任务完成并获取结果 std::cout \u003c\u003c \"The result is: \" \u003c\u003c value \u003c\u003c std::endl; } 共享类型的future 多个线程等待同一个异步操作的结果 void myFunction(std::promise\u003cint\u003e\u0026\u0026 promise) { std::this_thread::sleep_for(std::chrono::seconds(1)); promise.set_value(42); // 设置 promise 的值 } void threadFunction(std::shared_future\u003cint\u003e future) { try { int result = future.get(); std::cout \u003c\u003c \"Result: \" \u003c\u003c result \u003c\u003c std::endl; } catch (const std::future_error\u0026 e) { std::cout \u003c\u003c \"Future error: \" \u003c\u003c e.what() \u003c\u003c std::endl; } } void use_shared_future() { std::promise\u003cint\u003e promise; std::shared_future\u003cint\u003e future = promise.get_future(); std::thread myThread1(myFunction, std::move(promise)); // 将 promise 移动到线程中 // 使用 share() 方法获取新的 shared_future 对象 std::thread myThread2(threadFunction, future); std::thread myThread3(threadFunction, future); myThread1.join(); myThread2.join(); myThread3.join(); } 异常处理 void may_throw() { throw std::runtime_error(\"Oops, something went wrong!\"); // 抛出一个异常 } void use_future_exception() { std::future\u003cvoid\u003e result(std::async(std::launch::async, may_throw)); // 创建一个异步任务 try { result.get(); // 获取结果(如果在获取结果时发生了异常,那么会重新抛出这个异常) } catch (const std::exception\u0026 e) { std::cerr \u003c\u003c \"Caught exception: \" \u003c\u003c e.what() \u003c\u003c std::endl; // 捕获并打印异常 } } ","date":"2023-10-15","objectID":"/posts/c++/c++%E5%B9%B6%E5%8F%91/:4:3","tags":["C++"],"title":"C++并发","uri":"/posts/c++/c++%E5%B9%B6%E5%8F%91/"},{"categories":["C++"],"content":"promise 承诺值 std::promise用于在某一线程中设置某个值或异常, std::promise::set_value():设置异步操作的结果值 std::promise::set_exception:设置异常情况 接受一个std::exception_ptr参数,该参数可以通过调用std::current_exception()方法获取 std::future则用于在另一线程中获取这个值或异常。 void set_value(std::promise\u003cint\u003e prom) { std::this_thread::sleep_for(std::chrono::seconds(5)); prom.set_value(10); std::cout \u003c\u003c \"promise set value success\" \u003c\u003c std::endl; } void use_promise_setvalue() { std::promise\u003cint\u003e prom; // 创建一个 promise 对象 std::future\u003cint\u003e fut = prom.get_future(); // 获取与 promise 相关联的 future 对象 std::thread t(set_value, std::move(prom)); // 在新线程中设置 promise 的值 std::cout \u003c\u003c \"Waiting for the thread to set the value...\\n\"; std::cout \u003c\u003c \"Value set by the thread: \" \u003c\u003c fut.get() \u003c\u003c '\\n'; // 在主线程中获取 future 的值 t.join(); } // 随着局部作用域}的结束,prom可能被释放也可能会被延迟释放,如果立即释放则fut.get()获取的值会报error_value的错误 void bad_promise_setvalue() { std::thread t; std::future\u003cint\u003e fut; { std::promise\u003cint\u003e prom; // 创建一个 promise 对象 fut = prom.get_future(); // 获取与 promise 相关联的 future 对象 t = std::thread(set_value, std::move(prom)); // 在新线程中设置 promise 的值 } std::cout \u003c\u003c \"Waiting for the thread to set the value...\\n\"; std::cout \u003c\u003c \"Value set by the thread: \" \u003c\u003c fut.get() \u003c\u003c '\\n'; // 在主线程中获取 future 的值 t.join(); } void set_exception(std::promise\u003cvoid\u003e prom) { try { throw std::runtime_error(\"An error occurred!\"); } catch (...) { prom.set_exception(std::current_exception()); } } // 注:子线程调用了set_exception,主线程一定要捕获这个异常,否则崩溃 void use_promise_setexception() { std::promise\u003cvoid\u003e prom; // 创建一个 promise 对象 std::future\u003cvoid\u003e fut = prom.get_future(); // 获取与 promise 相关联的 future 对象 std::thread t(set_exception, std::move(prom)); // 在新线程中设置 promise 的异常 try { std::cout \u003c\u003c \"Waiting for the thread to set the exception...\\n\"; fut.get(); } catch (const std::exception\u0026 e) { std::cout \u003c\u003c \"Exception set by the thread: \" \u003c\u003c e.what() \u003c\u003c '\\n'; } t.join(); } ","date":"2023-10-15","objectID":"/posts/c++/c++%E5%B9%B6%E5%8F%91/:4:4","tags":["C++"],"title":"C++并发","uri":"/posts/c++/c++%E5%B9%B6%E5%8F%91/"},{"categories":["C++"],"content":"快速排序实例 开辟一个一次性的线程执行并行任务,主线程可以通过future在合适的时机执行等待汇总结果。 template\u003ctypename T\u003e void quick_sort_recursive(T q[], int l, int r){ if (l \u003e= r) return; T x = q[(l + r + 1) \u003e\u003e 1]; int i = l - 1, j = r + 1; while(i \u003c j){ do i++; while(q[i] \u003c x); do j--; while(q[j] \u003e x); if (i \u003c j) std::swap (q[i], q[j]); } quick_sort_recursive(q, l, i - 1); quick_sort_recursive(q, i, r); } template\u003ctypename T\u003e void quick_sort(T q[], int len) { quick_sort_recursive(q, 0, len - 1); } 串行版本 template\u003ctypename T\u003e std::list\u003cT\u003e sequential_quick_sort(std::list\u003cT\u003e input) { if (input.empty()) { return input; } std::list\u003cT\u003e result; result.splice(result.begin(), input, input.begin()); // 将 input 列表中的第一个元素移动到 result 列表的起始位置,并且在 input 列表中删除该元素 T const\u0026 pivot = *result.begin(); // 取首元素作为 x // partition 分区函数,使得满足条件的元素排在不满足条件元素之前。divide_point指向的是input中第一个大于等于pivot的地址 auto divide_point = std::partition(input.begin(), input.end(), [\u0026](T const\u0026 t){return t \u003c pivot;}); std::list\u003cT\u003e lower_part; lower_part.splice(lower_part.end(), input, input.begin(), divide_point); // 小于pivot的元素放在lower_part里 auto new_lower(sequential_quick_sort(std::move(lower_part))); auto new_higher(sequential_quick_sort(std::move(input))); result.splice(result.end(), new_higher); result.splice(result.begin(), new_lower); return result; } 并行版本 template\u003ctypename T\u003e std::list\u003cT\u003e parallel_quick_sort(std::list\u003cT\u003e input) { if (input.empty()) { return input; } std::list\u003cT\u003e result; result.splice(result.begin(), input, input.begin()); T const\u0026 pivot = *result.begin(); auto divide_point = std::partition(input.begin(), input.end(), [\u0026](T const\u0026 t){return t \u003c pivot;}); std::list\u003cT\u003e lower_part; lower_part.splice(lower_part.end(), input, input.begin(), divide_point); std::future\u003cstd::list\u003cT\u003e\u003e new_lower(std::async(parallel_quick_sort\u003cT\u003e, std::move(lower_part))); auto new_higher(parallel_quick_sort(std::move(input))); result.splice(result.end(), new_higher); result.splice(result.begin(), new_lower.get()); return result; } ","date":"2023-10-15","objectID":"/posts/c++/c++%E5%B9%B6%E5%8F%91/:4:5","tags":["C++"],"title":"C++并发","uri":"/posts/c++/c++%E5%B9%B6%E5%8F%91/"},{"categories":["C++"],"content":"并发设计模式 Actor 参与者模式 系统由多个独立的并发执行的actor组成。每个actor都有自己的状态、行为和邮箱(用于接收消息)。Actor之间通过消息传递进行通信,而不是共享状态。 CSP(Communicating Sequential Processes)通信顺序进程 各个进程之间彼此独立,通过发送和接收消息进行通信,通道用于确保进程之间的同步。 生产者消费者模型 template\u003ctypename T\u003e class Channel { public: Channel(size_t capacity = 0):capacity_(capacity){} bool send(T value) { std::unique_lock\u003cstd::mutex\u003e lock(mtx_); cv_producer_.wait(lock, [this]() {return (capacity_ == 0 \u0026\u0026 queue_.empty()) || queue_.size() \u003c capacity_ || closed_;}); if (closed_) { return false; } queue_.push(value); cv_consumer_.notify_one(); return true; } bool receive(T\u0026 value) { std::unique_lock\u003cstd::mutex\u003e lock(mtx_); cv_consumer_.wait(lock, [this]() {return !queue_.empty() || closed_;}); if (closed_ \u0026\u0026 queue_.empty()) { return false; } value = queue_.front(); queue_.pop(); cv_producer_.notify_one(); return true; } void close() { std::unique_lock\u003cstd::mutex\u003e lock(mtx_); closed_ = true; cv_producer_.notify_all(); cv_consumer_.notify_all(); } private: std::queue\u003cT\u003e queue_; std::mutex mtx_; std::condition_variable cv_producer_; std::condition_variable cv_consumer_; size_t capacity_; bool closed_ = false; }; ATM实例 handle成员函数:当函数返回一个类类型的局部变量时会先调用移动构造,如果没有移动构造再调用拷贝构造。 ","date":"2023-10-15","objectID":"/posts/c++/c++%E5%B9%B6%E5%8F%91/:4:6","tags":["C++"],"title":"C++并发","uri":"/posts/c++/c++%E5%B9%B6%E5%8F%91/"},{"categories":["C++"],"content":"内存模型和原子类型 原子操作 无法拷贝构造,拷贝赋值 操作方式 可选顺序 store操作 存储操作 memory_order_relaxed,memory_order_release,memory_order_seq_cst Load操作 载入操作 memory_order_relaxed,memory_order_consume,memory_order_acquire,memory_order_seq_cst read-modify-write(读-改-写)操作 memory_order_relaxed,memory_order_consume,memory_order_acquire, memory_order_release,memory_order_acq_rel, memory_order_seq_cst 成员函数 说明 void store(T desired, std::memory_order order = std::memory_order_seq_cst) 写入(释放操作) T load(std::memory_order order = std::memory_order_seq_cst) 读取(获取操作) bool compare_exchange_weak(T\u0026 expected, T desired, std::memory_order order =std::memory_order_seq_cst) 当前值与期望值(expect)相等时,修改当前值为设定值(desired),返回true; 当前值与期望值(expect)不等时,将期望值(expect)修改为当前值,返回false; 读改写:比较-交换操作;可能保存失败,往往配合循环使用 bool compare_exchange_strong(T\u0026 expected, T desired, std::memory_order order =std::memory_order_seq_cst) 读改写:内部含循环,保存的值需要耗时计算(或体积较大的原子类型)选择其更合理 T exchange(T desired, std::memory_order order = std::memory_order_seq_cst) 读改写 内存顺序 获取-释放次序:存储操作采用memory_order_release次序,而载入操作采用memory_order_acquire次序,两者同步 内存序 说明 memory_order_relaxed 松散内存序,只用来保证对原子对象的操作是原子的,对顺序不做保证(允许指令重排) memory_order_consume 适用读操作,阻止对这个原子量有依赖的操作重排到前面去(限制读操作之后的部分操作,不允许指令重排) memory_order_acquire 适用读操作,在读取某原子对象时,当前线程的任何后面的读写操作都不允许重排到这个操作的前面去(读操作之后的部分,不允许指令重排) memory_order_release 适用写操作,在写入某原子对象时,当前线程的任何前面的读写操作都不允许重排到这个操作的后面去(写操作之前的部分,不允许指令重排) memory_order_acq_rel 适用读写操作,一个读-修改-写操作同时具有获得语义和释放语义,即它前后的任何读写操作都不允许重排(读写操作不允许指令重排) memory_order_seq_cst 顺序一致性语义,对于读操作相当于获取,对于写操作相当于释放,对于读-修改-写操作相当于获得释放,是所有原子操作的默认内存序(不允许指令重排) 自旋锁:当一个线程尝试获取锁时,如果锁已经被其他线程持有,那么该线程就会不断地循环检查锁的状态,直到成功获取到锁为止。 class Spinlock { public: Spinlock():flag(ATOMIC_FLAG_INIT){} void lock() { while (flag.test_and_set(std::memory_order_acquire));// 获取旧值并设置标志 } void unlock() { flag.clear(std::memory_order_release); // clear为存储操作,显示采用释放语义将标志清零 } private: std::atomic_flag flag; }; ","date":"2023-10-15","objectID":"/posts/c++/c++%E5%B9%B6%E5%8F%91/:5:0","tags":["C++"],"title":"C++并发","uri":"/posts/c++/c++%E5%B9%B6%E5%8F%91/"},{"categories":["C++"],"content":"环形队列 template\u003ctypename T, size_t Cap\u003e class CircularQueLk:private std::allocator\u003cT\u003e { public: CircularQueLk() :_max_size(Cap + 1), _data(std::allocator\u003cT\u003e::allocate(_max_size)), _head(0), _tail(0) {} CircularQueLk(const CircularQueLk\u0026) = delete; CircularQueLk\u0026 operator=(const CircularQueLk\u0026) volatile = delete; // 为什么拷贝复制有两个 CircularQueLk\u0026 operator=(const CircularQueLk\u0026) = delete; ~CircularQueLk() { std::lock_guard\u003cstd::mutex\u003e lock(_mtx); while(_head != _tail) { std::allocator\u003cT\u003e::destroy(_data + _head); _head = (_head + 1) % _max_size; } std::allocator\u003cT\u003e::deallocate(_data, _max_size); } template\u003ctypename ...Args\u003e bool emplace(Args\u0026\u0026 ...args) { std::lock_guard\u003cstd::mutex\u003e lock(_mtx); if ((_tail + 1) % _max_size == _head) { std::cout \u003c\u003c \"circular que full !\\n\"; return false; } // 尾部位置构造一个对象 std::allocator\u003cT\u003e::construct(_data + _tail, std::forward\u003cArgs\u003e(args)...); _tail = (_tail + 1) % _max_size; return true; } // 接受左值引用版本(加const:让其接受const类型也可以接受非const类型) bool push(const T\u0026 val) { std::cout \u003c\u003c \"called push const T\u0026 version\\n\"; return emplace(val); } bool push(T\u0026\u0026 val) { std::cout \u003c\u003c \"called push const T\u0026\u0026 version\\n\"; return emplace(std::move(val)); } bool pop(T\u0026 val) { std::lock_guard\u003cstd::mutex\u003e lock(_mtx); if (_head == _tail) { std::cout \u003c\u003c \"circular que empty !\\n\"; return false; } val = std::move(_data[_head]); _head = (_head + 1) % _max_size; return true; } private: size_t _max_size; T* _data; std::mutex _mtx; size_t _head = 0; size_t _tail = 0; }; ","date":"2023-10-15","objectID":"/posts/c++/c++%E5%B9%B6%E5%8F%91/:6:0","tags":["C++"],"title":"C++并发","uri":"/posts/c++/c++%E5%B9%B6%E5%8F%91/"},{"categories":["C++"],"content":"进程 fork前是多线程,fork后是不会继续运行多线程 ","date":"2023-10-15","objectID":"/posts/c++/c++%E5%B9%B6%E5%8F%91/:7:0","tags":["C++"],"title":"C++并发","uri":"/posts/c++/c++%E5%B9%B6%E5%8F%91/"},{"categories":["C++"],"content":"参考阅读 C++并发编程(中文版)(C++ Concurrency In Action) 恋恋风辰官方博客 -并发编程 对应B站视频 – 对应gitee ","date":"2023-10-15","objectID":"/posts/c++/c++%E5%B9%B6%E5%8F%91/:8:0","tags":["C++"],"title":"C++并发","uri":"/posts/c++/c++%E5%B9%B6%E5%8F%91/"},{"categories":["C++"],"content":"设计模式","date":"2023-09-15","objectID":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/","tags":["C++"],"title":"设计模式","uri":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/"},{"categories":["C++"],"content":"设计模式 创建型模式:单例模式,工厂方法模式,抽象工厂模式,建造者模式,原型模式 结构型模式:适配器模式,代理模式,桥接模式,装饰器模式,外观模式,组合模式,享元模式 行为型模式:观察者模式,策略模式,命令模式,中介者模式,备忘录模式,模板方法模式,迭代器模式,状态模式,责任链模式,解释器模式,访问者模式 ","date":"2023-09-15","objectID":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/:1:0","tags":["C++"],"title":"设计模式","uri":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/"},{"categories":["C++"],"content":"UML 类图分三层 第一层显示类的名称,如果是抽象类,则就用斜体显示。 第二层是类的特性,通常就是字段和属性。 第三层是类的操作,通常是方法或行为。 注意前面的符号,‘+’表示public,‘-’表示private,‘#’表示protected 继承(Inheritance)关系用空心三角形+实线来表示。 实现(Realization)接口用空心三角形+虚线来表示。 关联(Association)关系用实线箭头来表示。 依赖(Dependency)关系用虚线箭头来表示。 聚合(Aggregation)表示一种弱的‘拥有’关系,体现的是A对象可以包含B对象,但B对象不是A对象的一部分。聚合关系用空心的菱形+实线箭头来表示 。(汽车发动机是汽车的组成部分) classDiagram Car o--\u003e Engine : contains classDiagram Car o--\u003e Engine : contains组合(Composition)是一种强的‘拥有’关系,体现了严格的部分和整体的关系,部分和整体的生命周期一样。合成关系用实心的菱形+实线箭头来表示。(嘴巴是头的组成部分) classDiagram Head *--\u003e Mouth : has classDiagram Head *--\u003e Mouth : has","date":"2023-09-15","objectID":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/:2:0","tags":["C++"],"title":"设计模式","uri":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/"},{"categories":["C++"],"content":"设计原则 ","date":"2023-09-15","objectID":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/:3:0","tags":["C++"],"title":"设计模式","uri":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/"},{"categories":["C++"],"content":"单一职责原则 一个类只承担一个职责 ","date":"2023-09-15","objectID":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/:3:1","tags":["C++"],"title":"设计模式","uri":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/"},{"categories":["C++"],"content":"开放封闭原则 对扩展开放,对修改封闭。 指的是软件在适应需求变化时,应尽量通过扩展代码来实现变化,而不是通过修改已有代码来实现变化。 使用枚举和 if-else 实现多态,难以扩展,还要一直去修改原函数的底层实现,就违背了开闭原则。 应尽量通过扩展代码来实现变化,而不是通过*修改已有代码来实现变化。 ","date":"2023-09-15","objectID":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/:3:2","tags":["C++"],"title":"设计模式","uri":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/"},{"categories":["C++"],"content":"依赖倒置原则 一个封装好的函数或类,应该尽量依赖于抽象接口,而不是依赖于具体实现 面向过程编程范式中,函数指针就是那个抽象接口。 面向对象编程范式中,虚函数就是那个抽象接口。 函数式编程范式中,函数对象就是那个抽象接口。 模板元编程范式中,模板参数就是那个抽象接口。 ","date":"2023-09-15","objectID":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/:3:3","tags":["C++"],"title":"设计模式","uri":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/"},{"categories":["C++"],"content":"接口隔离原则 ","date":"2023-09-15","objectID":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/:3:4","tags":["C++"],"title":"设计模式","uri":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/"},{"categories":["C++"],"content":"里氏替换原则 ","date":"2023-09-15","objectID":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/:3:5","tags":["C++"],"title":"设计模式","uri":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/"},{"categories":["C++"],"content":"迪米特法则(最少知识原则) ","date":"2023-09-15","objectID":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/:3:6","tags":["C++"],"title":"设计模式","uri":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/"},{"categories":["C++"],"content":"创建型模式 ","date":"2023-09-15","objectID":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/:4:0","tags":["C++"],"title":"设计模式","uri":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/"},{"categories":["C++"],"content":"单例模式 ⼀个类只有⼀个实例,并提供⼀个全局访问点来访问这个实例。(对象创建型) 单例设计模式优点 全局控制:保证只有⼀个实例,这样就可以严格的控制客户怎样访问它以及何时访问它,简单的说就是对唯⼀实例的受控访问 节省资源:避免多次创建了相同的对象,节省系统资源,多个模块可以通过单例实例共享数据。 懒加载:单例模式可以实现懒加载,只有在需要时才实例化。 基本要求 classDiagram class Singleton{ -instance:Singleton -Singleton() +GetInstance() } classDiagram class Singleton{ -instance:Singleton -Singleton() +GetInstance() } 私有的构造函数:防止外部代码直接创建类的实例 私有的静态实例变量:保存该类的唯⼀实例 公有的静态方法:通过公有的静态方法来获取类的实例 单例设计模式实现 // c++11后是线程安全的 ( class Singleton { private: Singleton(){} Singleton(const Singleton\u0026) = delete; Singleton\u0026 operator=(const Singleton\u0026) = delete; public: static Singleton\u0026 GetInstance() { static Singleton singleton; return singleton; } }; 饿汉式 类加载时完成实例创建 // 饿汉式 class Singlehungry { private: Singlehungry(){} Singlehungry(const Singlehungry\u0026) = delete; Singlehungry\u0026 operator=(const Singlehungry\u0026) = delete; public: static Singlehungry* GetInstance() { if (singleton == nullptr) { singleton = new Singlehungry(); } return singleton; } private: static Singlehungry* singleton; }; // 饿汉式初始化 Singlehungry* Singlehungry::singleton = Singlehungry::GetInstance(); 懒汉式 请求实例时才创建:线程不安全,需要加锁,防止资源被重复初始化 // 懒汉式 class SingleLazy { private: SingleLazy(){} SingleLazy(const SingleLazy\u0026) = delete; SingleLazy\u0026 operator=(const Singlehungry\u0026) = delete; public: static SingleLazy* GetInstance() { if(singleton != nullptr) { return singleton; } _mutex.lock(); if (singleton != nullptr) { _mutex.unlock(); return singleton; } singleton = new SingleLazy(); _mutex.unlock(); return singleton; } private: static SingleLazy* singleton; static std::mutex _mutex; }; SingleLazy* SingleLazy::singleton = nullptr; std::mutex SingleLazy::_mutex; // 懒汉式 智能指针 自动回收 class SingleAuto { private: SingleAuto(){} SingleAuto(const SingleAuto\u0026) = delete; SingleAuto\u0026 operator=(const SingleAuto\u0026) = delete; public: ~SingleAuto() {std::cout \u003c\u003c \"single auto delete success \" \u003c\u003c std::endl; } static std::shared_ptr\u003cSingleAuto\u003e GetInstance() { if(singleton != nullptr) { return singleton; } _mutex.lock(); if (singleton != nullptr) { _mutex.unlock(); return singleton; } singleton = std::shared_ptr\u003cSingleAuto\u003e(new SingleAuto); _mutex.unlock(); return singleton; } private: static std::shared_ptr\u003cSingleAuto\u003e singleton; static std::mutex _mutex; }; std::shared_ptr\u003cSingleAuto\u003e SingleAuto::singleton = nullptr; std::mutex SingleAuto::_mutex; // 定义友元类帮忙回收 class SingleAutoSafe; class SafeDeletor { public: void operator()(SingleAutoSafe* sf) { std::cout \u003c\u003c \"this is safe deleter operator()\" \u003c\u003c std::endl; delete sf; } }; class SingleAutoSafe { private: SingleAutoSafe(){} SingleAutoSafe(const SingleAutoSafe\u0026) = delete; SingleAutoSafe\u0026 operator=(const SingleAutoSafe\u0026) = delete; ~SingleAutoSafe() {std::cout \u003c\u003c \"single auto delete success \" \u003c\u003c std::endl; } friend class SafeDeletor; public: static std::shared_ptr\u003cSingleAutoSafe\u003e GetInstance() { if(singleton != nullptr) { return singleton; } _mutex.lock(); if (singleton != nullptr) { _mutex.unlock(); return singleton; } singleton = std::shared_ptr\u003cSingleAutoSafe\u003e(new SingleAutoSafe, SafeDeletor()); _mutex.unlock(); return singleton; } private: static std::shared_ptr\u003cSingleAutoSafe\u003e singleton; static std::mutex _mutex; }; std::shared_ptr\u003cSingleAutoSafe\u003e SingleAutoSafe::singleton = nullptr; std::mutex SingleAutoSafe::_mutex; call_once call_once函数,配合一个局部的静态变量once_flag实现线程安全的初始化 多线程调用call_once函数时,会判断once_flag是否被初始化,如没被初始化则进入初始化流程,但是同一时刻只有一个线程能进入这个初始化函数 class SingleOnce { private: SingleOnce() = default; SingleOnce(const SingleOnce\u0026) = delete; SingleOnce\u0026 operator=(const SingleOnce\u0026) = delete; public: static std::shared_ptr\u003cSingleOnce\u003e GetInstance() { static std::once_flag _flag; std::call_once(_flag, [\u0026](){ singleton = std::shared_ptr\u003cSingleOnce\u003e(new SingleOnce); }); return singleton; } ~SingleOnce() {std::cout \u003c\u003c \"this is singleton destruct\" \u003c\u003c std::endl;} private: static std::shared_ptr\u003cSingleOnce\u003e singleton; }; std::shared_ptr\u003cSingleOnce\u003e SingleOnce::singleton = nullptr; // 通用模板类 template \u003ctypename T\u003e class Singleton { protected: Singleton() = default; Si","date":"2023-09-15","objectID":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/:4:1","tags":["C++"],"title":"设计模式","uri":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/"},{"categories":["C++"],"content":"工厂方法模式 定义一个用于创建对象的接口,让子类决定将哪一个类实例化。工厂方法模式让一个类的实例化延迟到其子类。(类创建型) 简单工厂模式 定义一个工厂类,可以根据参数不同返回不同类的实例,被创建的实例通常都具有共同的父类。(静态工厂方法,类创建型)) 优点:实现了对象创建和使用的分离; 缺点:工厂类集中所有产品创建逻辑,职责过重;增加系统中类的个数;拓展困难;静态工厂方法无法形成基于继承的等级结构。 classDiagram class Product class ConcreteProductA class ConcreteProductB class Factory{ +factoryMethod(String arg) Product } Product\u003c|--ConcreteProductA: Inheritance Product\u003c|--ConcreteProductB: Inheritance ConcreteProductA\u003c..Factory : Dependency ConcreteProductB\u003c..Factory : Dependency classDiagram class Product class ConcreteProductA class ConcreteProductB class Factory{ +factoryMethod(String arg) Product } Product\u003c|--ConcreteProductA: Inheritance Product\u003c|--ConcreteProductB: Inheritance ConcreteProductA\u003c..Factory : Dependency ConcreteProductB\u003c..Factory : Dependency 工厂:负责实现创建所有产品实例看i的内部逻辑,提供静态工厂方法,返回抽象产品类型 抽象产品:工厂类创建的所有对象的父类,封装各种产品对象的公有方法。 具体产品:具体产品类,继承抽象产品。 工厂方法模式优点 优点:工厂方法用来创建客户所需要的产品,同时还向客户隐藏了哪种具体产品类将被实例化这一细节;基于工厂角色和产品角色的多态性设计;在系统中加入新产品,只要添加一个具体工厂和具体产品(开闭原则)。 缺点:添加新产品时需要编写新的具体产品类和对应的具体工厂类,系统中类的个数将成对增加。 优化:使用反射机制:根据类名字符串生成对象;引入配置文件 基本要求 classDiagram class Factory{ +factoryMethod() Product } class ConcreteFactory{ +factoryMethod() Product } class ConcreteProduct class Product Factory \u003c|..ConcreteFactory : Realization ConcreteFactory..\u003eConcreteProduct : Dependency Product\u003c|..ConcreteProduct : Realization classDiagram class Factory{ +factoryMethod() Product } class ConcreteFactory{ +factoryMethod() Product } class ConcreteProduct class Product Factory \u003c|..ConcreteFactory : Realization ConcreteFactory..\u003eConcreteProduct : Dependency Product\u003c|..ConcreteProduct : Realization 抽象工厂:声明工厂方法,用于返回一个产品 具体工厂:抽象工厂子类,实现抽象工厂定义的工厂方法,创建具体产品实例。 抽象产品:定义产品的接口。 具体产品:实现抽象产品接口,是具体工厂创建的对象。 工厂方法模式实现 class ImageReader { public: virtual void readImage(const std::string\u0026 filepath) = 0; virtual ~ImageReader(){} }; class GIFReader: public ImageReader { public: void readImage(const std::string\u0026 filepath) override { std::cout \u003c\u003c \"Reading GIF image from file: \" \u003c\u003c filepath \u003c\u003c std::endl; } }; class JPGReader: public ImageReader { public: void readImage(const std::string\u0026 filepath) override { std::cout \u003c\u003c \"Reading JPG image from file: \" \u003c\u003c filepath \u003c\u003c std::endl; } }; class ImageReaderFactory { public: virtual ImageReader* createReader() = 0; virtual ~ImageReaderFactory(){} }; class GIFReaderFactory: public ImageReaderFactory { public: ImageReader* createReader() override { return new GIFReader(); } }; class JPGReaderFactory: public ImageReaderFactory { public: ImageReader* createReader() override { return new JPGReader(); } }; 工厂方法模式使用场景 客户端不知道其所需要的对象的类,只需要知道所对应的工厂即可,具体的产品对象由具体工厂类创建,可将具体工厂类的类名存储在配置文件或数据库中。 抽象工厂类通过其子类来指定创建哪个对象。 具体工厂类在实现工厂方法时除了创建具体产品对象之外,还可以负责产品对象的初始化工作以及一些资源和环境配置工作,例如连接数据库、创建文件等 ","date":"2023-09-15","objectID":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/:4:2","tags":["C++"],"title":"设计模式","uri":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/"},{"categories":["C++"],"content":"抽象工厂模式 提供一个创建一系列相关或相互依赖对象的接口,而无须指定它们具体的类。(对象创建型) 工厂方法模式通过引入工厂等级结构,解决了简单工厂模式中工厂类职责太重的问题。 抽象工厂模式通过将一些相关的产品组成一个“产品族”,由同一个工厂来统一生产,解决工厂方法模式中每个具体工厂只负责创建单一产品,存在大量工厂类的问题。 抽象工厂模式优点 优点:方便增加产品族;隔离了具体类的生成 缺点:增加新的产品等级结构麻烦 基本要求 classDiagram class Client class AbstractFactory { +createPeoductA() AbstractProductA +createPeoductB() AbstractProductB } class AbstractProductA class ConcreteProductA1 class ConcreteProductA2 class AbstractProductB class ConcreteProductB1 class ConcreteProductB2 class ConcreateFactory1 { +createPeoductA() AbstractProductA +createPeoductB() AbstractProductB } class ConcreateFactory2 { +createPeoductA() AbstractProductA +createPeoductB() AbstractProductB } Client..\u003eAbstractFactory : Dependency Client..\u003eAbstractProductA : Dependency Client..\u003eAbstractProductB : Dependency AbstractFactory\u003c|..ConcreateFactory1 : Realization AbstractFactory\u003c|..ConcreateFactory2 : Realization AbstractProductA\u003c|..ConcreteProductA1 : Realization AbstractProductA\u003c|..ConcreteProductA2 : Realization ConcreteProductA1\u003c..ConcreateFactory1 : Dependency ConcreateFactory1..\u003eConcreteProductB1: Dependency ConcreteProductA2\u003c..ConcreateFactory2 : Dependency ConcreateFactory2..\u003eConcreteProductB2 : Dependency AbstractProductB\u003c|..ConcreteProductB1 : Realization AbstractProductB\u003c|..ConcreteProductB2 : Realization classDiagram class Client class AbstractFactory { +createPeoductA() AbstractProductA +createPeoductB() AbstractProductB } class AbstractProductA class ConcreteProductA1 class ConcreteProductA2 class AbstractProductB class ConcreteProductB1 class ConcreteProductB2 class ConcreateFactory1 { +createPeoductA() AbstractProductA +createPeoductB() AbstractProductB } class ConcreateFactory2 { +createPeoductA() AbstractProductA +createPeoductB() AbstractProductB } Client..\u003eAbstractFactory : Dependency Client..\u003eAbstractProductA : Dependency Client..\u003eAbstractProductB : Dependency AbstractFactory\u003c|..ConcreateFactory1 : Realization AbstractFactory\u003c|..ConcreateFactory2 : Realization AbstractProductA\u003c|..ConcreteProductA1 : Realization AbstractProductA\u003c|..ConcreteProductA2 : Realization ConcreteProductA1\u003c..ConcreateFactory1 : Dependency ConcreateFactory1..\u003eConcreteProductB1: Dependency ConcreteProductA2\u003c..ConcreateFactory2 : Dependency ConcreateFactory2..\u003eConcreteProductB2 : Dependency AbstractProductB\u003c|..ConcreteProductB1 : Realization AbstractProductB\u003c|..ConcreteProductB2 : Realization 抽象工厂:声明一组用于创建一族产品的方法,每个方法对应一种产品。 具体工厂:实现抽象工厂声明的方法。 抽象产品:定义产品的接口。 具体产品:实现抽象产品接口,是具体工厂创建的对象。 抽象工厂模式实现 // 操作控制类:抽象产品类A class OperationController { public: virtual void controlGame() = 0; virtual ~OperationController(){} }; // 界面控制类:抽象产品类B class InterfaceController { public: virtual void displayInterface() = 0; }; // 具体产品类A1 class iOSOperationController : public OperationController { public: void controlGame() override { std::cout \u003c\u003c \"ios control game\" \u003c\u003c std::endl; } }; // 具体产品类A2 class AndroidOperationController : public OperationController { public: void controlGame() override { std::cout \u003c\u003c \"android control game\" \u003c\u003c std::endl; } }; // 具体产品类B1 class iOSInterfaceController : public InterfaceController { public: void displayInterface() override { std::cout \u003c\u003c \"ios display\" \u003c\u003c std::endl; } }; // 具体产品类B2 class AndroidInterfaceController : public InterfaceController { public: void displayInterface() override { std::cout \u003c\u003c \"android display\" \u003c\u003c std::endl; } }; // 抽象工厂类 class AbstractFactory { public: virtual OperationController* createOperationController() = 0; virtual InterfaceController* createInterfaceController() = 0; virtual ~AbstractFactory() {} }; // iOS工厂 :具体工厂1 class iOSFactory : public AbstractFactory { public: OperationController* createOperationController() override { return new iOSOperationController(); } InterfaceController* createInterfaceController() override { return new iOSInterfaceController(); } }; // Android工厂 :具体工厂2 class AndroidFactory : public AbstractFactory { public: OperationController* createOperationController() override { return new AndroidOperationController","date":"2023-09-15","objectID":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/:4:3","tags":["C++"],"title":"设计模式","uri":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/"},{"categories":["C++"],"content":"建造者模式 将一个复杂对象的构建与它的表示分离,使得同样的构建过程可以创建不同的表示。(对象创建型) 建造者模式优点 优点:客户端与包含多个部件的复杂对象的创建过程分离 缺点:建造者模式所创建的产品一般具有较多的共同点,其组成部分相似 基本要求 classDiagram class Director { -builder: Builder +construct() } class Builder { +buildPartA() +buildPartB() +buildPartC() +getResult() } class ConcreteBuilder { +buildPartA() +buildPartB() +buildPartC() +getResult() } class Product Director o--\u003e Builder : Aggregation Builder \u003c|-- ConcreteBuilder : Inheritance ConcreteBuilder..\u003eProduct : Dependency classDiagram class Director { -builder: Builder +construct() } class Builder { +buildPartA() +buildPartB() +buildPartC() +getResult() } class ConcreteBuilder { +buildPartA() +buildPartB() +buildPartC() +getResult() } class Product Director o--\u003e Builder : Aggregation Builder \u003c|-- ConcreteBuilder : Inheritance ConcreteBuilder..\u003eProduct : Dependency 抽象建造者:创建一个产品Product对象的各个部件指定抽象接口。一般声明两类方法:一类方法是buildPartX(),用于创建复杂对象的各个部件;另一类方法是getResult(),用于返回复杂对象 具体建造者:实现了Builder接口,实现各个部件的具体构造和装配方法,定义并明确其所创建的复杂对象 产品角色:被构建的复杂对象,包含多个组成部件 指挥者:负责安排复杂对象的建造次序 建造者模式实现 class ComputerComponent { public: virtual ~ComputerComponent() {} virtual void display() = 0; }; class CPU : public ComputerComponent { public: void display() override { std::cout \u003c\u003c \"\\tCPU \\n\"; } }; class GPU : public ComputerComponent { public: void display() override { std::cout \u003c\u003c \"\\tGPU \\n\"; } }; class ARM : public ComputerComponent { public: void display() override { std::cout \u003c\u003c \"\\tARM \\n\"; } }; // 抽象建造者类 class Builder { public: virtual void buildCPU() = 0; virtual void buildGPU() = 0; virtual void buildRAM() = 0; virtual std::vector\u003cComputerComponent*\u003e getResult() = 0; virtual ~Builder() {} }; // 具体建造者类 class ComputerBuilder : public Builder { public: void buildCPU() override { components.push_back(new CPU()); } void buildGPU() override { components.push_back(new GPU()); } void buildRAM() override { components.push_back(new ARM()); } std::vector\u003cComputerComponent*\u003e getResult() override { return components; } private: std::vector\u003cComputerComponent*\u003e components; }; // 指挥者类 class Director { public: void construct(Builder* builder) { builder-\u003ebuildCPU(); builder-\u003ebuildGPU(); builder-\u003ebuildRAM(); } }; 建造者模式场景 需要生成的产品对象的属性相互依赖,需要指定其生成顺序 ","date":"2023-09-15","objectID":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/:4:4","tags":["C++"],"title":"设计模式","uri":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/"},{"categories":["C++"],"content":"原型模式 使用原型实例指定创建对象的种类,并且通过克隆这些原型创建新的对象。(对象创建型) 原型模式优点 优点:提高创建效率,简化创建结构,可以使用深克隆方式保存对象状态(辅助实现撤销操作) 缺点:每个类配备一个克隆方法,且该方法位于类内。 基本要求 classDiagram class Client { +operation() } class Prototype { +clone() Prototype } class ConcreatPrototypeA { +clone() Prototype } class ConcreatPrototypeB { +clone() Prototype } Client--\u003ePrototype : Association Prototype\u003c|--ConcreatPrototypeA : Inheritance Prototype\u003c|--ConcreatPrototypeB : Inheritance classDiagram class Client { +operation() } class Prototype { +clone() Prototype } class ConcreatPrototypeA { +clone() Prototype } class ConcreatPrototypeB { +clone() Prototype } Client--\u003ePrototype : Association Prototype\u003c|--ConcreatPrototypeA : Inheritance Prototype\u003c|--ConcreatPrototypeB : Inheritance 抽象原型类:声明克隆方法的接口,是所有具体原型类的公共父类 具体原型类:实现在抽象原型类中声明的克隆方法,在克隆方法中返回自己的一个克隆对象。 客户类:让一个原型对象克隆自身从而创建一个新的对象 原型模式实现 class Address { public: Address(const std::string\u0026 s, const std::string\u0026 c, const std::string\u0026 co) : street(s), city(c), country(co) {} Address(const Address\u0026 other) : street(other.street), city(other.city), country(other.country) {}\\ ~Address() {} public: std::string street; std::string city; std::string country; }; // 抽象原型类 class Customer { public: Customer(const std::string\u0026 n, Address* addr) : name(n), address(addr) {} // 原型模式中的克隆接口 virtual Customer* clone() const = 0; virtual ~Customer(){} public: std::string name; Address* address; }; // 浅克隆 class ShallowCopyCustomer : public Customer { public: ShallowCopyCustomer(const std::string\u0026 n, Address* addr) : Customer(n, addr) {} ShallowCopyCustomer(const ShallowCopyCustomer\u0026 other) : Customer(other.name, other.address) {} Customer* clone() const override { return new ShallowCopyCustomer(*this); } }; // 深克隆 class DeepCopyCustomer : public Customer { public: DeepCopyCustomer(const std::string\u0026 n, Address* addr) : Customer(n, addr) {} DeepCopyCustomer(const DeepCopyCustomer\u0026 other) : Customer(other.name, new Address(*other.address)) {} Customer* clone() const override { return new DeepCopyCustomer(*this); } }; 原型模式场景 复制粘贴操作 配合备忘录模式 ","date":"2023-09-15","objectID":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/:4:5","tags":["C++"],"title":"设计模式","uri":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/"},{"categories":["C++"],"content":"结构型模式 ","date":"2023-09-15","objectID":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/:5:0","tags":["C++"],"title":"设计模式","uri":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/"},{"categories":["C++"],"content":"适配器模式 将一个接口转换成客户希望的另一个接口,使接口不兼容的那些类可以一起工作。(对象结构型,类结构型) 对象适配器模式中,适配器与适配者之间是关联关系; 在类适配器模式中,适配器与适配者之间是继承(或实现)关系。 适配器模式优点 优点:将目标类和适配者类解耦(无需修改原有结构),增加了类的透明性和复用性 基本要求 对象适配器 classDiagram class Client class Target { +request() } class Adaptee { +specificRequest() } class Adapter { +request() } Client ..\u003e Target : Dependency Target \u003c|-- Adapter : Inheritance Adaptee\u003c--Adapter : Association classDiagram class Client class Target { +request() } class Adaptee { +specificRequest() } class Adapter { +request() } Client ..\u003e Target : Dependency Target \u003c|-- Adapter : Inheritance Adaptee\u003c--Adapter : Association Target目标抽象类:定义客户所需接口 Adapter适配器类:调用另一个接口,作为转换器,对Adaptee和Target进行适配 Adaptee适配者类:被适配的角色 类适配器:类适配器模式与对象适配器模式最大的区别在于其适配器和适配者之间的关系是继承关系(多重继承) classDiagram class Client class Target { +request() } class Adaptee { +specificRequest() } class Adapter { +request() } Client ..\u003e Target : Dependency Target \u003c|.. Adapter : Realization Adaptee\u003c|--Adapter : Inheritance classDiagram class Client class Target { +request() } class Adaptee { +specificRequest() } class Adapter { +request() } Client ..\u003e Target : Dependency Target \u003c|.. Adapter : Realization Adaptee\u003c|--Adapter : Inheritance适配器模式实现 // Target:request 目标接口:加密解密接口 class EncryptorInterface { public: virtual ~EncryptorInterface() {} virtual void encrypt() = 0; virtual void decrypt() = 0; }; // Adaptee:specificRequest 第三方加密类 class ThirdPartyEncryptor { public: void encrypt() { std::cout \u003c\u003c \"ThirdParty Encrypted !\\n\"; } void decrypt() { std::cout \u003c\u003c \"ThirdParty Decrypted !\\n\"; } }; // Adapter:request class EncryptorAdapter : public EncryptorInterface { public: EncryptorAdapter() {} void encrypt() override { _thirdPartyEncryptor.encrypt(); } void decrypt() override { _thirdPartyEncryptor.decrypt(); } private: ThirdPartyEncryptor _thirdPartyEncryptor; }; 适配器模式场景 系统需要使用一些现有的类,而这些类的接口(例如方法名)不符合系统的需要,甚至没有这些类的源代码 想创建一个可以重复使用的类,用于与一些彼此之间没有太大关联的类,包括一些可能在将来引进的类一起工作 ","date":"2023-09-15","objectID":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/:5:1","tags":["C++"],"title":"设计模式","uri":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/"},{"categories":["C++"],"content":"桥接模式 将抽象部分与其实现部分分离,使它们都可以独立地变化。(对象结构性) 桥接模式优点 优点: 缺点: 基本要求 classDiagram class Client class Abstraction { +operation() } class Implementor { +operationImpl() } class RefinedAbstraction { +operation() } class ConcreteImplementorA { +operationImpl() } class ConcreteImplementorB { +operationImpl() } Client ..\u003e Abstraction : Dependency Client ..\u003e Implementor : Dependency Abstraction o--\u003e Implementor : Aggregation Abstraction \u003c|-- RefinedAbstraction : Inheritance Implementor \u003c|.. ConcreteImplementorA : Realization Implementor \u003c|.. ConcreteImplementorB : Realization classDiagram class Client class Abstraction { +operation() } class Implementor { +operationImpl() } class RefinedAbstraction { +operation() } class ConcreteImplementorA { +operationImpl() } class ConcreteImplementorB { +operationImpl() } Client ..\u003e Abstraction : Dependency Client ..\u003e Implementor : Dependency Abstraction o--\u003e Implementor : Aggregation Abstraction \u003c|-- RefinedAbstraction : Inheritance Implementor \u003c|.. ConcreteImplementorA : Realization Implementor \u003c|.. ConcreteImplementorB : Realization RefinedAbstraction(扩充抽象类):调用Implementor定义的业务方法 桥接模式实现 将具有两个独立变化维度的类的一些普通业务方法和与之关系最密切的维度设计为抽象类层次结构(抽象部分),而将另一个维度设计为实现类层次结构(实现部分) class DatabaseAccess; class DataConverter { public: DataConverter(DatabaseAccess* db) : database(db) {} virtual ~DataConverter() {} virtual void convertToFile(const std::string\u0026 filePath) = 0; protected: DatabaseAccess* database; // 引用数据库访问层,实现桥接 }; // 数据库访问抽象接口 class DatabaseAccess { public: virtual ~DatabaseAccess() {} virtual void fetchRecords() = 0; }; class MySQLDatabase : public DatabaseAccess { public: void fetchRecords() override { std::cout \u003c\u003c \"MySQLDatabase data records \"; } }; class PostgreSQLDatabase : public DatabaseAccess { public: void fetchRecords() override { std::cout \u003c\u003c \"PostgreSQLDatabase data records \"; } }; class TXTConverter : public DataConverter { public: TXTConverter(DatabaseAccess* db) : DataConverter(db) {} void convertToFile(const std::string\u0026 filePath) override { database-\u003efetchRecords(); std::cout \u003c\u003c \"convert to txt\\n\"; } }; class XMLConverter : public DataConverter { public: XMLConverter(DatabaseAccess* db) : DataConverter(db) {} void convertToFile(const std::string\u0026 filePath) override { database-\u003efetchRecords(); std::cout \u003c\u003c \"convert to xml\\n\"; } }; 桥接模式场景 一个类存在两个(或多个)独立变化的维度,且这两个(或多个)维度都需要独立进行扩展 系统类的个数急剧增加的系统 ","date":"2023-09-15","objectID":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/:5:2","tags":["C++"],"title":"设计模式","uri":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/"},{"categories":["C++"],"content":"组合模式 组合多个对象形成树形结构以表示具有“部分—整体”关系的层次结构。(对象结构型) 组合模式优点 优点: 缺点:在增加新构件时很难对容器中的构件类型进行限制。 基本要求 classDiagram class Client class Component { +operation() +add(Component c) +remove(Component c) +getChild(int i) } class Leaf { +operation() } class Composite { +operation() +add(Component c) +remove(Component c) +getChild(int i) } Client ..\u003e Component : Dependency Component \u003c|-- Leaf : Inheritance Component \u003c|-- Composite : Inheritance Component \u003c--o Composite : Aggregation classDiagram class Client class Component { +operation() +add(Component c) +remove(Component c) +getChild(int i) } class Leaf { +operation() } class Composite { +operation() +add(Component c) +remove(Component c) +getChild(int i) } Client ..\u003e Component : Dependency Component \u003c|-- Leaf : Inheritance Component \u003c|-- Composite : Inheritance Component \u003c--o Composite : Aggregation Compoent(抽象构件):为叶子构件和容器构件对象声明接口,在该角色中可以包含所有子类共有行为的声明和实现 Leaf(叶子构件):叶子节点对象,实现了在抽象构件中定义的行为 Composite(容器构件):提供一个集合用于存储子节点,实现了在抽象构件中定义的行为 组合模式实现 class Component { public: virtual ~Component() {} virtual void add(std::shared_ptr\u003cComponent\u003e) = 0; virtual void remove(std::shared_ptr\u003cComponent\u003e) = 0; virtual void display(int depth = 0) const = 0; }; class Button : public Component { public: void add(std::shared_ptr\u003cComponent\u003e) override { /* 不允许添加子控件 */ } void remove(std::shared_ptr\u003cComponent\u003e) override { /* 不允许删除子控件 */ } void display(int depth) const override { std::cout \u003c\u003c std::string(depth * 2, '-') \u003c\u003c \"Button\\n\"; } }; class TextBox : public Component { public: void add(std::shared_ptr\u003cComponent\u003e) override { /* 不允许添加子控件 */ } void remove(std::shared_ptr\u003cComponent\u003e) override { /* 不允许删除子控件 */ } void display(int depth) const override { std::cout \u003c\u003c std::string(depth * 2, '-') \u003c\u003c \"TextBox\\n\"; } }; class Panel : public Component { public: void add(std::shared_ptr\u003cComponent\u003e child) override { children.push_back(child); } void remove(std::shared_ptr\u003cComponent\u003e child) override { children.erase( std::remove_if(children.begin(), children.end(), [\u0026](const std::shared_ptr\u003cComponent\u003e\u0026 c) { return c == child; }), children.end()); } void display(int depth) const override { std::cout \u003c\u003c std::string(depth * 2, '-') \u003c\u003c \"Panel\\n\"; for (const auto\u0026 child : children) { child-\u003edisplay(depth + 1); } } private: std::vector\u003cstd::shared_ptr\u003cComponent\u003e\u003e children; }; 组合模式场景 需要构建具有部分-整体层次结构的场景,比如组织架构管理、文件系统的文件和文件夹组织等 ","date":"2023-09-15","objectID":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/:5:3","tags":["C++"],"title":"设计模式","uri":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/"},{"categories":["C++"],"content":"装饰模式 动态地给一个对象增加一些额外的职责。(对象结构型) 装饰模式优点 基本要求 classDiagram class Component { +operation() } class ConcreteComponent { +operation() } class Decorator { +operation() } class ConcreteDecoratorA { -addedState: +operation() } class ConcreteDecoratorB { +operation() +addedBehavior() } Component \u003c|-- ConcreteComponent : Inheritance Component \u003c|-- Decorator : Inheritance Component \u003c--o Decorator : Aggregation Decorator \u003c|-- ConcreteDecoratorA : Inheritance Decorator \u003c|-- ConcreteDecoratorB : Inheritance classDiagram class Component { +operation() } class ConcreteComponent { +operation() } class Decorator { +operation() } class ConcreteDecoratorA { -addedState: +operation() } class ConcreteDecoratorB { +operation() +addedBehavior() } Component \u003c|-- ConcreteComponent : Inheritance Component \u003c|-- Decorator : Inheritance Component \u003c--o Decorator : Aggregation Decorator \u003c|-- ConcreteDecoratorA : Inheritance Decorator \u003c|-- ConcreteDecoratorB : Inheritance Component(抽象构件): 声明了在具体构件中实现的业务方法 ConcreteComponent(具体构件):实现了在抽象构件中声明的方法,装饰器可以给它增加额外的职责 Decorator(抽象装饰类):维护一个指向抽象构件对象的引用,通过该引用可以调用装饰之前构件对象的方法,并通过其子类扩展该方法,以达到装饰的目的 ConcreteDecoratorA(具体装饰类):用在抽象装饰类中定义的方法,并可以增加新的方法用以扩充对象的行为 装饰模式实现 // Component 抽象组件:定义了加密接口 class Encryptor { public: virtual ~Encryptor() {} virtual std::string encrypt(const std::string \u0026input) const = 0; }; // ConcreteComponent 具体组件:原始字符串,没有加密 class PlainText : public Encryptor { public: std::string encrypt(const std::string \u0026input) const override { return input; } }; // Decorator 抽象装饰器:继承自 Encryptor,为装饰器定义接口 class EncryptDecorator : public Encryptor { public: explicit EncryptDecorator(std::unique_ptr\u003cEncryptor\u003e encryptor) : m_encryptor(std::move(encryptor)) {} std::string encrypt(const std::string \u0026input) const override { return m_encryptor-\u003eencrypt(input); } private: std::unique_ptr\u003cEncryptor\u003e m_encryptor; }; // ConcreteDecoratorA 具体装饰器:移位加密 class ShiftEncryptor : public EncryptDecorator { public: explicit ShiftEncryptor(std::unique_ptr\u003cEncryptor\u003e encryptor) : EncryptDecorator(std::move(encryptor)) {} std::string encrypt(const std::string \u0026input) const override { std::string encrypted = EncryptDecorator::encrypt(input); for (char \u0026c : encrypted) { if (isalpha(c)) { c = (c \u003e= 'a' \u0026\u0026 c \u003c= 'z') ? ((c - 'a' + 3) % 26 + 'a') : ((c - 'A' + 3) % 26 + 'A'); } } return encrypted; } }; 装饰模式场景 ","date":"2023-09-15","objectID":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/:5:4","tags":["C++"],"title":"设计模式","uri":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/"},{"categories":["C++"],"content":"外观模式 外部与一个子系统的通信通过一个统一的外观角色进行,为子系统中的一组接口提供一个一致的入口。(对象结构型) 基本要求 classDiagram Client ..\u003e Facde : Dependency Facde --\u003e SubSystemA : Association Facde --\u003e SubSystemB : Association Facde --\u003e SubSystemC : Association classDiagram Client ..\u003e Facde : Dependency Facde --\u003e SubSystemA : Association Facde --\u003e SubSystemB : Association Facde --\u003e SubSystemC : Association外观模式实现 // 通讯录备份实现 class ContactsBackup { public: bool backup() { std::cout \u003c\u003c \"Backing up contacts...\\n\"; return true; } }; // 短信备份实现 class SMSBackup { public: bool backup() { std::cout \u003c\u003c \"Backing up SMS...\\n\"; return true; } }; class BackupFacade { public: BackupFacade(ContactsBackup *contacts_backup = nullptr, SMSBackup *sms_backup = nullptr) { this-\u003econtacts_backup_ = contacts_backup ? : new ContactsBackup; this-\u003esms_backup_ = sms_backup ? : new SMSBackup; } ~BackupFacade() { delete contacts_backup_; delete sms_backup_; } bool oneClickBackup() { std::cout \u003c\u003c \"Starting one-click backup...\\n\"; bool backupSuccess = this-\u003econtacts_backup_-\u003ebackup() \u0026\u0026 this-\u003esms_backup_-\u003ebackup(); if (backupSuccess) { std::cout \u003c\u003c \"One-click backup completed successfully.\\n\"; } else { std::cout \u003c\u003c \"One-click backup failed.\\n\"; } return backupSuccess; } private: ContactsBackup *contacts_backup_; SMSBackup *sms_backup_; }; 外观模式场景 不修改客户端代码:引入一个抽象外观类,客户端针对抽象外观类编程,而在运行时再确定具体外观类 结合单例类,降低系统资源消耗 多业务对象交互场景;系统首页导航页面,菜单和工具栏 ","date":"2023-09-15","objectID":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/:5:5","tags":["C++"],"title":"设计模式","uri":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/"},{"categories":["C++"],"content":"享元模式 运用共享技术有效地支持大量细粒度对象的复用。(对象结构型) 基本要求 classDiagram class Flyweight { +operation(extrinsicState) } class FlyweightFactory{ - flyweights:HashMap +getFlyweight(String key) Flyweight } class ConcreteFlyweight { -intrinsicState: +operation(extrinsicState) } class UnsharedConcreteFlyweight { -allState: +operation(extrinsicState) } FlyweightFactory o--\u003e Flyweight: Aggregation Flyweight \u003c|.. ConcreteFlyweight: Realization Flyweight \u003c|.. UnsharedConcreteFlyweight: Realization classDiagram class Flyweight { +operation(extrinsicState) } class FlyweightFactory{ - flyweights:HashMap +getFlyweight(String key) Flyweight } class ConcreteFlyweight { -intrinsicState: +operation(extrinsicState) } class UnsharedConcreteFlyweight { -allState: +operation(extrinsicState) } FlyweightFactory o--\u003e Flyweight: Aggregation Flyweight \u003c|.. ConcreteFlyweight: Realization Flyweight \u003c|.. UnsharedConcreteFlyweight: Realization FlyWeight(抽象享元类) FlyweightFactory(享元工厂类):提供一个用于存储享元对象的享元池 ConcreteFlyweight(具体享元类) UnsharedConcreteFlyweight(非共享具体享元类) 享元模式实现 // FlyWeight 抽象享元类 class Media { public: virtual void display(int x, int y, int width, int height) const = 0; virtual ~Media() = default; }; // 具体享元类 - 图片 class Image : public Media { private: std::string filename; public: Image(const std::string\u0026 filename) : filename(filename) {} void display(int x, int y, int width, int height) const override { std::cout \u003c\u003c \"Displaying image \" \u003c\u003c filename \u003c\u003c \" at (\" \u003c\u003c x \u003c\u003c \", \" \u003c\u003c y \u003c\u003c \"), size: (\" \u003c\u003c width \u003c\u003c \"x\" \u003c\u003c height \u003c\u003c \").\\n\"; } }; // 具体享元类 - 视频 class Video : public Media { private: std::string filename; public: Video(const std::string\u0026 filename) : filename(filename) {} void display(int x, int y, int width, int height) const override { std::cout \u003c\u003c \"Playing video \" \u003c\u003c filename \u003c\u003c \" at (\" \u003c\u003c x \u003c\u003c \", \" \u003c\u003c y \u003c\u003c \"), size: (\" \u003c\u003c width \u003c\u003c \"x\" \u003c\u003c height \u003c\u003c \").\\n\"; } }; // FlyweightFactory class MediaFactory { public: std::shared_ptr\u003cMedia\u003e getMedia(const std::string\u0026 type, const std::string\u0026 filename) { std::string key = type + \":\" + filename; if (mediaPool.find(key) == mediaPool.end()) { if (type == \"image\") { mediaPool[key] = std::make_shared\u003cImage\u003e(filename); } else if (type == \"video\") { mediaPool[key] = std::make_shared\u003cVideo\u003e(filename); } } return mediaPool[key]; } private: std::unordered_map\u003cstd::string, std::shared_ptr\u003cMedia\u003e\u003e mediaPool; }; ","date":"2023-09-15","objectID":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/:5:6","tags":["C++"],"title":"设计模式","uri":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/"},{"categories":["C++"],"content":"代理模式 给某一个对象提供一个代理,并由代理对象控制对原对象的引用。(对象结构型) 基本要求 classDiagram class Client class Subject { + request() } class RealSubject { + request() } class Proxy { - realSubject : RealSubject + preRequest() + request(); + postRequest() } Client ..\u003e Subject : Dependency Subject \u003c|-- Proxy : Inheritance Subject \u003c|-- RealSubject : Inheritance Proxy --\u003e RealSubject : Association classDiagram class Client class Subject { + request() } class RealSubject { + request() } class Proxy { - realSubject : RealSubject + preRequest() + request(); + postRequest() } Client ..\u003e Subject : Dependency Subject \u003c|-- Proxy : Inheritance Subject \u003c|-- RealSubject : Inheritance Proxy --\u003e RealSubject : Association代理模式实现 // Subject 抽象主题 class Image { public: virtual void display() = 0; virtual ~Image() = default; }; // RealSubject 真实主题 网络图片 class RealImage : public Image { public: RealImage(const std::string\u0026 filename) : file_name{filename} {} void display() override { std::cout \u003c\u003c \"display image: \" \u003c\u003c file_name \u003c\u003c \".\\n\"; } private: std::string file_name; }; // Proxy 图标代理 class IconProxy : public Image { public: IconProxy(const std::string\u0026 filename) : file_name(filename) {} void display() override { if (real_subject_ == nullptr) { real_subject_ = std::make_unique\u003cRealImage\u003e(file_name); } std::cout \u003c\u003c \"Displaying icon for image: \" \u003c\u003c file_name \u003c\u003c std::endl; real_subject_-\u003edisplay(); } private: std::string file_name; std::unique_ptr\u003cRealImage\u003e real_subject_; }; // ImageManager 管理所有URL及其图片集合 class ImageManager { public: void addImage(const std::string\u0026 url, const std::string\u0026 filename) { view_url[url].emplace(std::make_unique\u003cIconProxy\u003e(filename)); } const std::set\u003cstd::unique_ptr\u003cImage\u003e\u003e\u0026 getImagesForUrl(const std::string\u0026 url) const { return view_url.at(url); } private: std::unordered_map\u003cstd::string, std::set\u003cstd::unique_ptr\u003cImage\u003e\u003e\u003e view_url; }; // ImageViewer 显示特定URL的图片 class ImageViewer { public: ImageViewer(const std::string\u0026 url, const ImageManager\u0026 manager) : url{url}, manager{manager} {} void displayImages() { const auto\u0026 images = manager.getImagesForUrl(url); for (const auto\u0026 image : images) { image-\u003edisplay(); std::cout \u003c\u003c \"\\n\"; } } private: std::string url; const ImageManager\u0026 manager; }; ","date":"2023-09-15","objectID":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/:5:7","tags":["C++"],"title":"设计模式","uri":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/"},{"categories":["C++"],"content":"行为型模式 ","date":"2023-09-15","objectID":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/:6:0","tags":["C++"],"title":"设计模式","uri":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/"},{"categories":["C++"],"content":"责任链模式 避免将请求发送者与接收者耦合在一起,让多个对象都有机会接收请求,将这些对象连接成一条链,并且沿着这条链传递请求,直到有对象处理它为止。(对象行为型) 基本要求 classDiagram class Client class Handler { # successor:Handler + handleRequest() } class ConcreteHanderA { + handleRequest() } class ConcreteHanderB { + handleRequest() } Client ..\u003e Handler : Dependency Handler o--\u003e Handler : Aggregation Handler \u003c|-- ConcreteHanderA : Inheritance Handler \u003c|-- ConcreteHanderB : Inheritance classDiagram class Client class Handler { # successor:Handler + handleRequest() } class ConcreteHanderA { + handleRequest() } class ConcreteHanderB { + handleRequest() } Client ..\u003e Handler : Dependency Handler o--\u003e Handler : Aggregation Handler \u003c|-- ConcreteHanderA : Inheritance Handler \u003c|-- ConcreteHanderB : Inheritance Handler(抽象处理者):定义了对下家的引用对象,以便将请求转发给下家 ConcreteHander(具体处理者):处理请求,转发请求 责任链模式实现 // 抽象处理者 class Approver { public: virtual void processLeaveApplication(int days) = 0; virtual ~Approver() = default; }; // 具体处理者:主任 class Director : public Approver { public: void processLeaveApplication(int days) override { if (days \u003c 3) { std::cout \u003c\u003c \"Director approved the leave application.\" \u003c\u003c std::endl; } else { std::cout \u003c\u003c \"Director cannot approve, passing to next approver.\" \u003c\u003c std::endl; if (nextApprover) { nextApprover-\u003eprocessLeaveApplication(days); } } } void setNextApprover(Approver* next) { nextApprover = next; } private: Approver* nextApprover = nullptr; }; // 具体处理者:经理 class Manager : public Approver { public: void processLeaveApplication(int days) override { if (days \u003e= 3 \u0026\u0026 days \u003c 10) { std::cout \u003c\u003c \"Manager approved the leave application.\" \u003c\u003c std::endl; } else { std::cout \u003c\u003c \"Manager cannot approve, passing to next approver.\" \u003c\u003c std::endl; if (nextApprover) { nextApprover-\u003eprocessLeaveApplication(days); } } } void setNextApprover(Approver* next) { nextApprover = next; } private: Approver* nextApprover = nullptr; }; // 具体处理者:总经理 class GeneralManager : public Approver { public: void processLeaveApplication(int days) override { if (days \u003e= 10 \u0026\u0026 days \u003c 30) { std::cout \u003c\u003c \"General Manager approved the leave application.\" \u003c\u003c std::endl; } else if (days \u003e= 30) { std::cout \u003c\u003c \"Leave application rejected: too many days requested.\" \u003c\u003c std::endl; } else { std::cout \u003c\u003c \"General Manager cannot approve, passing to next approver.\" \u003c\u003c std::endl; if (nextApprover) { nextApprover-\u003eprocessLeaveApplication(days); } } } void setNextApprover(Approver* next) { nextApprover = next; } private: Approver* nextApprover = nullptr; }; 责任链模式场景 分级审批,多对象可以处理同一请求,具体哪个对象处理该请求待运行时刻再确定。 ","date":"2023-09-15","objectID":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/:6:1","tags":["C++"],"title":"设计模式","uri":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/"},{"categories":["C++"],"content":"命令模式 将一个请求封装为一个对象,从而可用不同的请求对客户进行参数化;对请求排队或者记录请求日志,以及支持可撤销的操作。(对象行为型) 命令队列。支持撤销和恢复操作。宏命令。 基本要求 classDiagram class Client class Invoker class Command { + execute() } class ConcreteCommand { - state: + execute() } class Receiver { + action() } Client ..\u003e Invoker : Dependenvy Client ..\u003e Command : Dependenvy Invoker o--\u003e Command : Aggregation Command \u003c|-- ConcreteCommand : Inheritance ConcreteCommand --\u003e Receiver : Association classDiagram class Client class Invoker class Command { + execute() } class ConcreteCommand { - state: + execute() } class Receiver { + action() } Client ..\u003e Invoker : Dependenvy Client ..\u003e Command : Dependenvy Invoker o--\u003e Command : Aggregation Command \u003c|-- ConcreteCommand : Inheritance ConcreteCommand --\u003e Receiver : Association Command(抽象命令类) ConcreteCommand(具体命令类) Invoker(调用者):通过命令对象来执行请求 Receiver(接收者) 命令模式实现 class Command { public: virtual ~Command() {} virtual void execute() = 0; }; // Receiver 接收者 class BoardScreen { public: void open() { std::cout \u003c\u003c \"Opening board screen\" \u003c\u003c std::endl; } void create() { std::cout \u003c\u003c \"Creating board content\" \u003c\u003c std::endl; } void edit() { std::cout \u003c\u003c \"Editing board content\" \u003c\u003c std::endl; } }; // ConcreteCommand 具体命令类 class OpenCommand : public Command { public: OpenCommand(BoardScreen* screen) : boardScreen(screen) {} void execute() override { boardScreen-\u003eopen(); } private: BoardScreen* boardScreen; }; class CreateCommand : public Command { public: CreateCommand(BoardScreen* screen) : boardScreen(screen) {} void execute() override { boardScreen-\u003ecreate(); } private: BoardScreen* boardScreen; }; class EditCommand : public Command { public: EditCommand(BoardScreen* screen) : boardScreen(screen) {} void execute() override { boardScreen-\u003eedit(); } private: BoardScreen* boardScreen; }; // Invoker 调用者 class MenuItem { public: void setCommand(Command* command) { this-\u003ecommand = command; } void click() { if (command) { command-\u003eexecute(); } } private: Command* command = nullptr; }; ","date":"2023-09-15","objectID":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/:6:2","tags":["C++"],"title":"设计模式","uri":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/"},{"categories":["C++"],"content":"解释器模式 定义一个语言的文法,并且建立一个解释器来解释该语言中的句子。(类行为型) 基本要求 classDiagram class Client class Context class AbstractExpression { + interpret (Contest ctx) } class TerminalExpression { + interpret (Contest ctx) } class NonterminaltExpression { + interpret (Contest ctx) } Client ..\u003e Context : Dependency Client ..\u003e AbstractExpression : Dependency Context \u003c.. AbstractExpression : Dependency AbstractExpression \u003c|-- TerminalExpression :Inheritance AbstractExpression \u003c|-- NonterminaltExpression :Inheritance AbstractExpression \u003c--o NonterminaltExpression :Aggregation classDiagram class Client class Context class AbstractExpression { + interpret (Contest ctx) } class TerminalExpression { + interpret (Contest ctx) } class NonterminaltExpression { + interpret (Contest ctx) } Client ..\u003e Context : Dependency Client ..\u003e AbstractExpression : Dependency Context \u003c.. AbstractExpression : Dependency AbstractExpression \u003c|-- TerminalExpression :Inheritance AbstractExpression \u003c|-- NonterminaltExpression :Inheritance AbstractExpression \u003c--o NonterminaltExpression :Aggregation Context(环境类):用于存储解释器之外的一些全局信息,通常用来临时存储需要解释的语句 AbstractExpression(抽象表达式) TerminalExpression(终结符表达式):实现了与文法中的终结符相关联的解释操作,在句子中的每一个终结符都是该类的一个实例 NonterminaltExpression(非终结符表达式):其解释操作一般通过递归的方式来完成 解释器模式实现 解释器模式场景 正则表达式、XML文档解释等领域 ","date":"2023-09-15","objectID":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/:6:3","tags":["C++"],"title":"设计模式","uri":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/"},{"categories":["C++"],"content":"迭代器模式 提供一种方法来访问聚合对象,而不用暴露这个对象的内部表示。(对象行为型) 基本要求 classDiagram class Aggergate { +createIterator() } class ConcreteAggerate { +createIterator() } class Iterator { +first() +next() +hasNext() +currentItem() } class ConcreteIterator { +first() +next() +hasNext() +currentItem() } Aggergate \u003c|.. ConcreteAggerate : Realization Iterator \u003c|.. ConcreteIterator : Realization ConcreteAggerate ..\u003e ConcreteIterator : Dependency ConcreteAggerate \u003c-- ConcreteIterator : Association classDiagram class Aggergate { +createIterator() } class ConcreteAggerate { +createIterator() } class Iterator { +first() +next() +hasNext() +currentItem() } class ConcreteIterator { +first() +next() +hasNext() +currentItem() } Aggergate \u003c|.. ConcreteAggerate : Realization Iterator \u003c|.. ConcreteIterator : Realization ConcreteAggerate ..\u003e ConcreteIterator : Dependency ConcreteAggerate \u003c-- ConcreteIterator : Association Iterator(抽象迭代器):声明了用于遍历数据元素的方法 ConcreteIterator(具体迭代器):完成对聚合对象的遍历,同时在具体迭代器中通过游标来记录在聚合对象中所处的当前位置 Aggergate(抽象聚合类):用于存储和管理元素对象;将迭代器类设计为聚合类的内部类(迭代器访问聚合对象数据) ConcreteAggerate(具体聚合类):返回一个与该具体聚合类对应的具体迭代器 迭代器模式实现 template \u003ctypename T, typename U\u003e class PageIterator { public: typedef typename std::vector\u003cT\u003e::iterator iter_type; PageIterator(U *p_data, size_t page_size, size_t page_index) : m_p_data_(p_data), page_size_(page_size), current_index_(page_index * page_size) { m_it_ = m_p_data_-\u003em_data_.begin() + current_index_; } void NextPage() { current_index_ += page_size_; m_it_ = m_p_data_-\u003em_data_.begin() + current_index_; } bool IsDone() { return current_index_ \u003e= m_p_data_-\u003em_data_.size(); } std::vector\u003cT\u003e CurrentPage() { std::vector\u003cT\u003e page_elements; size_t remaining = std::min(page_size_, m_p_data_-\u003em_data_.size() - current_index_); for (size_t i = 0; i \u003c remaining; ++i) { page_elements.push_back(*m_it_); ++m_it_; } return page_elements; } private: U *m_p_data_; // 指向分页容器的指针 iter_type m_it_; // 当前迭代器位置 size_t page_size_; size_t current_index_; }; template \u003cclass T\u003e class PaginatedContainer { friend class PageIterator\u003cT, PaginatedContainer\u003cT\u003e\u003e; public: void Add(T a) { m_data_.push_back(a); } size_t Size() const { return m_data_.size(); } PageIterator\u003cT, PaginatedContainer\u003cT\u003e\u003e CreatePageIterator(size_t page_size, size_t page_index = 0) { return PageIterator\u003cT, PaginatedContainer\u003cT\u003e\u003e(this, page_size, page_index); } private: std::vector\u003cT\u003e m_data_; }; 迭代器模式场景 ","date":"2023-09-15","objectID":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/:6:4","tags":["C++"],"title":"设计模式","uri":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/"},{"categories":["C++"],"content":"中介者模式 用一个中介对象来封装一系列的对象交互。(对象行为型) 中介者使各对象不需要显式地相互引用,从而使其耦合松散,而且可以独立地改变它们之间的交互。 基本要求 classDiagram class Mediator class ConcreteMediator class Colleague class ConcreteColleageA class ConcreteColleageB Mediator \u003c|-- ConcreteMediator : Inheritance Mediator \u003c-- Colleague : Association Colleague \u003c|-- ConcreteColleageA : Inheritance Colleague \u003c|-- ConcreteColleageB : Inheritance ConcreteMediator --\u003e ConcreteColleageA : Association ConcreteMediator --\u003e ConcreteColleageB : Association classDiagram class Mediator class ConcreteMediator class Colleague class ConcreteColleageA class ConcreteColleageB Mediator \u003c|-- ConcreteMediator : Inheritance Mediator \u003c-- Colleague : Association Colleague \u003c|-- ConcreteColleageA : Inheritance Colleague \u003c|-- ConcreteColleageB : Inheritance ConcreteMediator --\u003e ConcreteColleageA : Association ConcreteMediator --\u003e ConcreteColleageB : Association Mediator(抽象中介者):用于与各同事对象之间进行通信。作用:中转(结构性),协调(行为性) ConcreteMediator(具体中介者):通过协调各个同事对象来实现协作行为,维持了对各个同事对象的引用 Colleague(抽象同事类):定义各个同事类公有的方法,同事维持了一个对抽象中介者类的引用 ConcreteColleageA(具体同事类):通过中介者来间接完成与其他同事类的通信 中介者模式实现 class Pane; // 中介者接口 class Mediator { public: virtual ~Mediator() {} virtual void PaneChanged(Pane* pane) = 0; }; // 窗格基类或接口 class Pane { public: virtual ~Pane() {} virtual void SetMediator(Mediator* mediator) = 0; virtual void Update() = 0; protected: bool updating = false; }; // 具体窗格类 TextPane class TextPane : public Pane { public: void SetMediator(Mediator* mediator) override { this-\u003emediator = mediator; } void Update() override { if (!updating) { updating = true; mediator-\u003ePaneChanged(this); updating = false; } } private: Mediator* mediator; }; // 具体窗格类 ListPane class ListPane : public Pane { public: void SetMediator(Mediator* mediator) override { this-\u003emediator = mediator; } void Update() override { if (!updating) { updating = true; mediator-\u003ePaneChanged(this); updating = false; } } private: Mediator* mediator; }; // 具体中介者类 WindowMediator class WindowMediator : public Mediator { public: void AddPane(Pane* pane) { panes.push_back(pane); pane-\u003eSetMediator(this); } void PaneChanged(Pane* pane) override { if (dynamic_cast\u003cTextPane*\u003e(pane)) { std::cout \u003c\u003c \"Text Pane updated. Updating other panes as needed.\" \u003c\u003c std::endl; } else if (dynamic_cast\u003cListPane*\u003e(pane)) { std::cout \u003c\u003c \"List Pane updated. Updating other panes as needed.\" \u003c\u003c std::endl; } NotifyOtherPanes(); } private: void NotifyOtherPanes() { for (Pane* p : panes) { if (p != currentPane) { p-\u003eUpdate(); } } } std::vector\u003cPane*\u003e panes; Pane* currentPane = nullptr; }; ","date":"2023-09-15","objectID":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/:6:5","tags":["C++"],"title":"设计模式","uri":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/"},{"categories":["C++"],"content":"备忘录模式 在不破坏封装的前提下,捕获一个对象的内部状态,并在该对象之外保存这个状态,用来将对象恢复到原先保存的状态。(对象行为型) 基本要求 classDiagram class Originator { -state: +restoreMemento(Memento m) +createMemento() } class Memento { -state: +getState() +setState() } class Caretaker Originator ..\u003e Memento : Dependency Memento \u003c--* Caretaker : Composition classDiagram class Originator { -state: +restoreMemento(Memento m) +createMemento() } class Memento { -state: +getState() +setState() } class Caretaker Originator ..\u003e Memento : Dependency Memento \u003c--* Caretaker : Composition Originator(原发器):创建一个备忘录,并存储其当前内部状态,也可以使用备忘录来恢复其内部状态 Memento(备忘录):存储原发器的内部状态 Caretaker(负责人):只负责存储对象,而不能修改对象,也无须知道对象的实现细节。 备忘录模式实现 // Memento class class Memento { public: Memento(const std::string\u0026 state) : state_(state) {} std::string getState() const { return state_; } void setState(const std::string\u0026 state) { state_ = state; } private: std::string state_; }; // Originator class class Originator { public: Originator() : state_(\"\") {} void setState(const std::string\u0026 state) { std::cout \u003c\u003c \"Originator: Setting state to '\" \u003c\u003c state \u003c\u003c \"'.\\n\"; state_ = state; } std::string getState() const { return state_; } Memento createMemento() { std::cout \u003c\u003c \"Originator: Creating memento.\\n\"; return Memento(state_); } void restoreMemento(const Memento\u0026 memento) { state_ = memento.getState(); std::cout \u003c\u003c \"Originator: State restored to '\" \u003c\u003c state_ \u003c\u003c \"'.\\n\"; } private: std::string state_; }; // Caretaker class class Caretaker { public: Caretaker(Originator* originator) : originator_(originator), current_(-1) {} void createMemento() { Memento memento = originator_-\u003ecreateMemento(); mementos_.push_back(memento); current_ = mementos_.size() - 1; std::cout \u003c\u003c \"Caretaker: Memento created.\\n\"; } void restore() { if (current_ \u003e= 0 \u0026\u0026 current_ \u003c mementos_.size()) { originator_-\u003erestoreMemento(mementos_[current_]); std::cout \u003c\u003c \"Caretaker: State restored.\\n\"; current_--; } else { std::cout \u003c\u003c \"Caretaker: No memento available to restore.\\n\"; } } void redo() { if (current_ + 1 \u003c mementos_.size()) { current_++; originator_-\u003erestoreMemento(mementos_[current_]); std::cout \u003c\u003c \"Caretaker: Redo to next state.\\n\"; } else { std::cout \u003c\u003c \"Caretaker: No redo available.\\n\"; } } private: Originator* originator_; std::vector\u003cMemento\u003e mementos_; size_t current_; }; ","date":"2023-09-15","objectID":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/:6:6","tags":["C++"],"title":"设计模式","uri":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/"},{"categories":["C++"],"content":"观察者模式 定义对象之间的一种一对多依赖关系,使得每当一个对象状态发生改变时,其相关依赖对象皆得到通知并被自动更新。(对象行为型) 基本要求 classDiagram class Subject { +attach(Observer obs) +detach(Observer obs) +notify() } class Observer { + update() } class ConcreteSubject { -subjectState: +getState() +setState() } class ConcreteObserver { -observerState: +upodate() } Subject --\u003e Observer : Association Subject \u003c|-- ConcreteSubject : Inheritance ConcreteSubject \u003c-- ConcreteObserver : Association Observer \u003c|.. ConcreteObserver : Realization classDiagram class Subject { +attach(Observer obs) +detach(Observer obs) +notify() } class Observer { + update() } class ConcreteSubject { -subjectState: +getState() +setState() } class ConcreteObserver { -observerState: +upodate() } Subject --\u003e Observer : Association Subject \u003c|-- ConcreteSubject : Inheritance ConcreteSubject \u003c-- ConcreteObserver : Association Observer \u003c|.. ConcreteObserver : Realization 观察者模式实现 // 抽象观察者类 class Investor { public: virtual void update(double price) = 0; }; // 具体观察者类,股民 class StockInvestor : public Investor { public: StockInvestor(std::string name) : name(name) {} void update(double price) override { std::cout \u003c\u003c name \u003c\u003c \" received price update: \" \u003c\u003c price \u003c\u003c std::endl; } private: std::string name; }; // 抽象主题类,股票 class Stock { public: virtual void attach(Investor* investor) = 0; virtual void detach(Investor* investor) = 0; virtual void notify(double price) = 0; virtual void setPrice(double price) = 0; }; // 具体主题类,具体股票 class ConcreteStock : public Stock { public: void attach(Investor* investor) override { investors.push_back(investor); } void detach(Investor* investor) override { investors.erase(std::remove(investors.begin(), investors.end(), investor), investors.end()); } void notify(double price) override { for (auto investor : investors) { investor-\u003eupdate(price); } } void setPrice(double price) override { if (priceChanged(price)) { notify(price); } } private: bool priceChanged(double price) { if (lastPrice == 0) { lastPrice = price; return false; } double change = (price - lastPrice) / lastPrice; if (std::abs(change) \u003e= 0.05) { lastPrice = price; return true; } return false; } double lastPrice = 0; std::vector\u003cInvestor*\u003e investors; }; ","date":"2023-09-15","objectID":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/:6:7","tags":["C++"],"title":"设计模式","uri":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/"},{"categories":["C++"],"content":"状态模式 允许一个对象在其内部状态改变时改变它的行为,对象看起来似乎修改了它的类。(对象行为型) 基本要求 classDiagram class Context { -state: State +request() +setState(State state) } class State { + handle() } class ConcreteStateA { +handle } class ConcreteStateB { +handle } Context o--\u003e State : Aggregation State \u003c|-- ConcreteStateA : Inheritance State \u003c|-- ConcreteStateB : Inheritance classDiagram class Context { -state: State +request() +setState(State state) } class State { + handle() } class ConcreteStateA { +handle } class ConcreteStateB { +handle } Context o--\u003e State : Aggregation State \u003c|-- ConcreteStateA : Inheritance State \u003c|-- ConcreteStateB : Inheritance Context(环境类):拥有多种状态的对象 State(抽象状态类):定义一个接口以封装与环境类的一个特定状态相关的行为 使用环境类实现状态转换 由具体状态类来负责状态之间的转换 状态模式实现 状态模式场景 公文状态的转换、游戏中角色的升级 ","date":"2023-09-15","objectID":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/:6:8","tags":["C++"],"title":"设计模式","uri":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/"},{"categories":["C++"],"content":"策略模式 定义一系列算法类,将每一个算法封装起来,并让它们可以相互替换(变化)。该模式使得算法可独立于使用他的客户程序(稳定)而变化(拓展,子类化)。(对象行为型) 基本要求 classDiagram class Context { -strategy:Strategy +algorithm() } class Strategy { +algorithm() } class ConcreteStartegyA { +algorithm() } class ConcreteStartegyB { +algorithm() } Context o--\u003e Strategy : Aggregation Strategy \u003c|-- ConcreteStartegyA : Inheritance Strategy \u003c|-- ConcreteStartegyB : Inheritance classDiagram class Context { -strategy:Strategy +algorithm() } class Strategy { +algorithm() } class ConcreteStartegyA { +algorithm() } class ConcreteStartegyB { +algorithm() } Context o--\u003e Strategy : Aggregation Strategy \u003c|-- ConcreteStartegyA : Inheritance Strategy \u003c|-- ConcreteStartegyB : Inheritance Context(环境类) :维持一个对抽象策略类的引用实例,用于定义所采用的策略。 策略模式实现 // 策略接口:飞机行为 class AircraftBehavior { public: virtual ~AircraftBehavior() = default; virtual std::string flightCharacteristics() const = 0; virtual std::string takeoffCharacteristics() const = 0; }; // 直升机的具体策略类 class HelicopterBehavior : public AircraftBehavior { public: std::string flightCharacteristics() const override { return \"亚音速飞行\"; } std::string takeoffCharacteristics() const override { return \"垂直起飞\"; } }; // 歼击机的具体策略类 class FighterJetBehavior : public AircraftBehavior { public: std::string flightCharacteristics() const override { return \"超音速飞行\"; } std::string takeoffCharacteristics() const override { return \"长距离起飞\"; } }; // 上下文类,用于操作飞机模拟系统 class AircraftSimulator { public: explicit AircraftSimulator(std::unique_ptr\u003cAircraftBehavior\u003e\u0026\u0026 behavior = nullptr) : behavior_(std::move(behavior)) { } void setBehavior(std::unique_ptr\u003cAircraftBehavior\u003e\u0026\u0026 behavior) { behavior_ = std::move(behavior); } void simulateFlight() const { if (behavior_) { std::cout \u003c\u003c \"Flight Characteristics: \" \u003c\u003c behavior_-\u003eflightCharacteristics() \u003c\u003c \"\\n\"; std::cout \u003c\u003c \"Takeoff Characteristics: \" \u003c\u003c behavior_-\u003etakeoffCharacteristics() \u003c\u003c \"\\n\"; } else { std::cout \u003c\u003c \"No aircraft behavior set.\\n\"; } } private: std::unique_ptr\u003cAircraftBehavior\u003e behavior_; }; 策略模式场景 JAVA SE; PetShop ","date":"2023-09-15","objectID":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/:6:9","tags":["C++"],"title":"设计模式","uri":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/"},{"categories":["C++"],"content":"模板方法模式 定义一个操作中算法的框架(稳定),而将一些步骤延迟(变化)到子类中。 模板方法模式使得子类可以不改变(复用)一个算法的结构即可重定义(重写)该算法的某些特定步骤。(类行为型) 基本要求 classDiagram class AbstractClass { + templateMethod() + primitiveOperation1() + primitiveOperation2() + primitiveOperation3() } class ConcreteClass { + primitiveOperation1() + primitiveOperation2() } AbstractClass \u003c|-- ConcreteClass : Inheritance classDiagram class AbstractClass { + templateMethod() + primitiveOperation1() + primitiveOperation2() + primitiveOperation3() } class ConcreteClass { + primitiveOperation1() + primitiveOperation2() } AbstractClass \u003c|-- ConcreteClass : Inheritance基本方法是实现算法各个步骤的方法,是模板方法的组成部分。 基本方法可以分为3种:抽象方法(Abstract Method)、具体方法(Concrete Method)和钩子方法(Hook Method) 模板方法模式实现 class DatabaseOperation { public: void runSQL(const std::string \u0026connectionString, const std::string\u0026 sql, bool isQuery) { connectDatabase(connectionString); openDatabase(); if (isQuery) { queryDatabase(sql); } else { updateDatabase(sql); } closeDatabase(); } protected: virtual void connectDatabase(const std::string\u0026 connectionString) = 0; virtual void openDatabase() = 0; virtual void queryDatabase(const std::string\u0026 sql) = 0; virtual void updateDatabase(const std::string\u0026 sql) = 0; virtual void closeDatabase() = 0; }; class JDBCDatabaseOperation : public DatabaseOperation { public: void connectDatabase(const std::string\u0026 connectionString) override { connectionStr = connectionString; std::cout \u003c\u003c \"Connecting to database: \" \u003c\u003c connectionStr \u003c\u003c std::endl; } void openDatabase() override { std::cout \u003c\u003c \"Opening database\" \u003c\u003c std::endl; } void queryDatabase(const std::string\u0026 sql) override { std::cout \u003c\u003c \"Executing query: \" \u003c\u003c sql \u003c\u003c std::endl; } void updateDatabase(const std::string\u0026 sql) override { std::cout \u003c\u003c \"Executing update: \" \u003c\u003c sql \u003c\u003c std::endl; } void closeDatabase() override { std::cout \u003c\u003c \"Closing database\" \u003c\u003c std::endl; } protected: std::string connectionStr; }; ","date":"2023-09-15","objectID":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/:6:10","tags":["C++"],"title":"设计模式","uri":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/"},{"categories":["C++"],"content":"访问者模式 提供一个作用于某对象结构中的各元素的操作表示,它使得可以在不改变各元素的类的前提下定义作用于这些元素的新操作。(对象行为型) 基本要求 classDiagram class Client class Visitor { +visitConcreteElementA(ConcreteElementA elementA) +visitConcreteElementB(ConcreteElementB elementB) } class ConcreteVisitorA { +visitConcreteElementA(ConcreteElementA elementA) +visitConcreteElementB(ConcreteElementB elementB) } class ConcreteVisitorB { +visitConcreteElementA(ConcreteElementA elementA) +visitConcreteElementB(ConcreteElementB elementB) } class ObjectStrucure class Element { + accept(Visitor visitor) } class ConcreteElementA { + accept(Visitor visitor) + +operationA() } class ConcreteElementB { + accept(Visitor visitor) + +operationB() } Client --\u003e Visitor : Association Visitor \u003c|.. ConcreteVisitorA : Realization Visitor \u003c|.. ConcreteVisitorB : Realization Client --\u003e ObjectStrucure : Association ObjectStrucure --\u003e Element : Association Element \u003c|.. ConcreteElementA : Realization Element \u003c|.. ConcreteElementB : Realization classDiagram class Client class Visitor { +visitConcreteElementA(ConcreteElementA elementA) +visitConcreteElementB(ConcreteElementB elementB) } class ConcreteVisitorA { +visitConcreteElementA(ConcreteElementA elementA) +visitConcreteElementB(ConcreteElementB elementB) } class ConcreteVisitorB { +visitConcreteElementA(ConcreteElementA elementA) +visitConcreteElementB(ConcreteElementB elementB) } class ObjectStrucure class Element { + accept(Visitor visitor) } class ConcreteElementA { + accept(Visitor visitor) + +operationA() } class ConcreteElementB { + accept(Visitor visitor) + +operationB() } Client --\u003e Visitor : Association Visitor \u003c|.. ConcreteVisitorA : Realization Visitor \u003c|.. ConcreteVisitorB : Realization Client --\u003e ObjectStrucure : Association ObjectStrucure --\u003e Element : Association Element \u003c|.. ConcreteElementA : Realization Element \u003c|.. ConcreteElementB : Realization Visitor(抽象访问者):为对象结构中每个具体元素类声明一个访问操作 ConcreteVisitor(具体访问者) Element(抽象元素):accept()通常以一个抽象访问者作为参数 ConcreteElement(具体元素):调用访问者的访问方法以便完成对一个元素的操作 ObjectStructure(对象结构):元素的集合,它用于存放元素对象,并且提供了遍历其内部元素的方法 访问者模式实现 class Teacher; class Student; // 访问者抽象类 class Visitor { public: virtual ~Visitor() {} virtual void visit(Teacher \u0026t) = 0; virtual void visit(Student \u0026s) = 0; }; // 候选人抽象类 class Candidate { public: virtual ~Candidate() {} virtual void accept(class Visitor \u0026v) = 0; }; // 教师类 class Teacher : public Candidate { public: Teacher(int papers, int feedback) : paperCount(papers), feedbackScore(feedback) {} int getPapers() const { return paperCount; } int getFeedbackScore() const { return feedbackScore; } void accept(Visitor \u0026v) override { v.visit(*this); } private: int paperCount; int feedbackScore; }; // 学生类 class Student : public Candidate { public: Student(int papers, double score) : paperCount(papers), averageScore(score) {} int getPapers() const { return paperCount; } double getAverageScore() const { return averageScore; } void accept(Visitor \u0026v) override { v.visit(*this); } private: int paperCount; double averageScore; }; // 奖励审批系统实现为具体访问者 class AwardChecker : public Visitor { public: void visit(Teacher \u0026t) override { if (t.getPapers() \u003e 10 || t.getFeedbackScore() \u003e= 90) { std::cout \u003c\u003c \"Teacher is eligible for Research Award.\" \u003c\u003c std::endl; } if (t.getFeedbackScore() \u003e= 90) { std::cout \u003c\u003c \"Teacher is eligible for Excellent Teaching Award.\" \u003c\u003c std::endl; } } void visit(Student \u0026s) override { if (s.getPapers() \u003e 2 || s.getAverageScore() \u003e= 90) { std::cout \u003c\u003c \"Student is eligible for Research Award.\" \u003c\u003c std::endl; } if (s.getAverageScore() \u003e= 90) { std::cout \u003c\u003c \"Student is eligible for Excellent Performance Award.\" \u003c\u003c std::endl; } } }; ","date":"2023-09-15","objectID":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/:6:11","tags":["C++"],"title":"设计模式","uri":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/"},{"categories":["C++"],"content":"拓展阅读 设计模式:可复用面向对象软件的基础 秒懂设计模式-bilibili 深入设计模式 (refactoringguru.cn) 设计模式的艺术 卡码网设计模式专题 卡码网-23种设计模式精讲 单例模式演变 类图 | Mermaid 中文网 (nodejs.cn) ","date":"2023-09-15","objectID":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/:7:0","tags":["C++"],"title":"设计模式","uri":"/posts/c++/%E8%AE%BE%E8%AE%A1%E6%A8%A1%E5%BC%8F/"},{"categories":["Computer"],"content":"Linx系统编程","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"Linx系统编程 在 Linux 中,手册节号通常被分为以下 8 个部分: 1:用户命令和可执行文件的手册页。 2:系统调用和内核函数的手册页。 3:C 库函数的手册页。 4:特殊文件的手册页,例如设备文件和驱动程序。 5:文件格式和约定的手册页。 6:游戏的手册页。 7:杂项手册页,例如惯例、宏包和协议等。(signal) 8:系统管理命令和守护进程的手册页。 ","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:1:0","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"文件与IO #include \u003cunistd.h\u003e #include \u003cstring.h\u003e #include \u003cstdio.h\u003e #include \u003cerrno.h\u003e int main(){ int ret; ret = close(10); if(ret == -1){ perror(\"close error\"); } if(ret == -1){ fprintf(stderr, \"close error: %s\\n\", strerror(errno)); } printf(\"EINTR desc = %s\\n\", strerror(EINTR)); // 系统调用被中断 // man 2 close // E2BIG 参数列表太长 EACCESS 权限不足 EAGAIN 重试 EBADF 错误的文件描述符 EBUSY 设备或资源忙 ECHILD 无子进程 // EDOM 数学参数不在函数域内 EEXIST 文件已存在 EFAULT 地址错误 EFBIG 文件太大 EINTR 系统调用被中断 return 0; } # 输出 close error: Bad file descriptor close error: Bad file descriptor EINTR desc = Interrupted system call ","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:2:0","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"文件描述符 Linux (int)非负整数 (文件描述符) C (FILE* fp)(文件指针) 0 (STDIN_FILENO) 标准输入 stdin 1 (STDOUT_FILENO) 标准输出 stdout 2 (STDERR_FILENO) 标准错误 stderr 相互转换函数: fileno: 将文件指针转换为文件描述符 fdopen: 将文件描述符转换为文件指针 #include \u003cstdlib.h\u003e #include \u003cstdio.h\u003e int main(){ printf(\"fileno(stdin) = %d\\n\", fileno(stdin)); return 0; } # 输出 fileno(stdin) = 0 ","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:2:1","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"文件系统调用 open man 2 open 打开文档 ulimit -n 一个进程能够打开的文件个数 cat /proc/sys/fs/file-max 查看当前系统中文件描述符的最大数量限制 功能:打开可能创建一个文件,得到了一个文件描述符 函数原型: int open(const char *path, int flags); int open(const char *path, int flags, mode_t mode); 函数参数 path:文件的名称,可以包含(绝对和相对)路径 flags:文件打开模式 必选项:以下三个常数中必须指定一个,且仅允许指定一个。通过#include \u003cfcntl.h\u003e访问 O_RDONLY 只读打开 O_WRONLY 只写打开 O_RDWR 可读可写打开 以下可选项可同时指定一个或多个,和必选项按位或起来作为flag参数,以下是几个常用选项: O_APPEND 表示追加,所写数据附加到文件末尾。 O_CREAT 若文件不存在则创建它,使用此选项需要提供第三个参数mode,表示该文件的访问权限。 注:文件最终权限:newmode = mode\u0026~umask O_EXCL 如果同时指定了O_CREAT,并且文件已存在,则出错返回。 O_TRUNC 如果文件已存在,清空文件内容,长度置为0。 O_NONBLOCK 对于设备文件,以O_NONBLOCK方式打开可以做非阻塞I/O(NonblockI/O)。 mode:用来规定对该文件的所有者,文件的用户组及其他用户的访问权限(除了使用数字,也可以用相关宏) 比如 0600 和 S_IRUSR | S_IWUSR 此时的0600需要使用 newmode = mode\u0026~umask` –\u003e0600 \u0026~0022=0600 返回值:0:成功;-1:失败,并设置errno值 #include \u003cstdio.h\u003e #include \u003cstdlib.h\u003e #include \u003cunistd.h\u003e #include \u003csys/types.h\u003e #include \u003csys/stat.h\u003e #include \u003cstring.h\u003e #include \u003cerrno.h\u003e #include \u003cfcntl.h\u003e // #define ERR_EXIT(m) (perror(m), exit(EXIT_FAILURE)) #define ERR_EXIT(m) \\ do \\ { \\ perror(m); \\ exit(EXIT_FAILURE); \\ } while(0) int main(){ int fd; fd = open(\"test.txt\", O_RDONLY); /* if(fd == -1){ fprintf(stderr, \"open error with errno=%d %s\\n\", errno, strerror(errno)); exit(EXIT_FAILURE); } */ /* if(fd == -1){ perror(\"open error\"); exit(EXIT_FAILURE); } */ if(fd == -1){ ERR_EXIT(\"open error\"); } close(fd); return 0; } # 输出 open error: No such file or directory CXX = gcc CXXFLAGS = -c -Wall -g TARGETS = open SRCS = $(wildcard *.c) OBJS = $(patsubst %.c, %.o, $(SRCS)) all: $(TARGETS) $(TARGETS): %: %.o $(CXX) -o $@ $^ %.o: %.c $(CXX) $(CXXFLAGS) $\u003c -o $@ .PHONY: clean clean: rm -f *.o $(TARGETS) read 功能:从该文件中读取文件 函数原型: ssize_t read(int fd, void *buffer, size_t count); 函数参数:fd:想要读的文件的文件描述符;buf:指向内存块的指针,从文件中读取来的字节放到这个内存块中;count:从该文件复制到buf中的字节数 注:读取的文件指针偏移,内核数据结构会维护 返回值:0:文件结束;-1:出现错误;复制到缓存区的字节数 write 功能:将数据写到一个文件中 函数原型: ssize_t write(int fd, void *buffer, size_t count); 函数参数:fd:想要写入的文件的文件描述符;buf:指向内存块的指针,从这个内存块中读取数据写入到文件中;count:要写入文件的字节数 返回值:写入的字节数:写入成功;-1:出现错误 close 功能:关闭文件 函数原型: int close(int fd); 函数参数:fd:文件描述符 返回值:0:成功;-1,失败,并设置errno值 简单版cp命令 #define BUFF_SIZE 1024 #define ERR_EXIT(m) \\ do { \\ perror(m); \\ exit(EXIT_FAILURE); \\ } while(0) int main(int argc, char *argv[]){ int infd, outfd; if (argc != 3){ fprintf(stderr, \"Usage %s src dest\\n\", argv[0]); exit(EXIT_FAILURE); } infd = open(argv[1], O_RDONLY); if ((infd = open(argv[1], O_RDONLY)) == -1){ ERR_EXIT(\"open src file error\"); } if ((outfd = open(argv[2], O_WRONLY | O_CREAT | O_TRUNC, 0644)) == -1){ // 等价creat(),creat不多见了 ERR_EXIT(\"open std file error\"); } char buff[BUFF_SIZE]; int numRead; while ((numRead = read(infd, buff, BUFF_SIZE)) \u003e 0) { write(outfd, buff, numRead); } if (close(infd) == -1){ ERR_EXIT(\"close src file error\"); } if (close(outfd) == -1){ ERR_EXIT(\"close dst file error\"); } exit(EXIT_SUCCESS); return 0; } lseek 功能:通过指定相对于开始位置、当前位置或末尾位置的字节数来重定位curp,这取决于Iseek()函数中指定的位置 函数原型: off_t lseek(int fd, off_t offset, int whence); 函数参数:fd:设置的文件描述符; offset:偏移量; whence:搜索的起始位置 SEEK_SET:从文件开始处计算偏移,offset必须为负数 SEEK_CUR:从当前文件的偏移值计算偏移 SEEK_END:从文件的结束处计算偏移 返回值:新的文件偏移值:成功;-1:错误 int main(){ int fd; fd = open(\"test.txt\", O_RDONLY); if (fd == -1){ ERR_EXIT(\"open error\"); } char buf[1024] = {0}; int ret = read(fd, buf, 5); if (ret == -1){ ERR_EXIT(\"read error\"); } printf(\"buf = %s \\n\", buf); ret = lseek(fd, 0, SEEK_CUR); if(ret == -1){ ERR_EXIT(\"lseek\"); } printf(\"current offset = %d \\n\", ret); close(fd); return 0; } int main(){ int fd; fd = open(\"test.txt\", O_WRONLY | O_CREAT | O_TRUNC, 0644); if (fd == -1){ ERR_EXIT(\"open error\"); } write(fd, \"hello\", 5); int ret = lseek(fd, 1024*1024*1024, SEEK_CUR); if(ret == -1){ ERR_EXIT(\"lseek\"); } write(fd, \"world\", 5); close(fd); return 0; } // od -c file 查看文件空格 readdir 功能:访问指定目录下一个连接的细节 函数原型: struct dirent* readdir(DIR *dirptr); 函数参数:dirptr:目录指针 返回值:一个指向dirent结构得指针,包含指定目录中下一个连接得细节:没有更多连接时返回0 简单版ls指令 int main(){ DIR *dir = opendir(\".\"); struct dirent *de; while((de = readdir(dir)) != NULL){ if(st","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:2:2","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"文件共享 文件描述表(1024个文件描述符) 文件状态转移 读、写、追加、同步、非阻塞等 当前文件偏移量 refcnt = 1(引用计数) v节点指针 v节点表 int main(int argc, char *argv[]){ int fd1, fd2; char buf1[BUFF_SIZE] = {0}; char buf2[BUFF_SIZE] = {0}; fd1 = open(\"test.txt\", O_RDONLY); if (fd1 == -1){ ERR_EXIT(\"open error\"); } read(fd1, buf1, 5); printf(\"buf1 = %s\\n\", buf1); fd2 = open(\"test.txt\", O_RDWR); if (fd2 == -1){ ERR_EXIT(\"open error\"); } read(fd2, buf2, 5); printf(\"buf2 = %s\\n\", buf2); write(fd2, \"world\", 5); memset(buf1, 0, sizeof(buf1)); read(fd1, buf1, 5); printf(\"buf1 = %s\\n\", buf1); close(fd1); close(fd2); return 0; } 重定向 dup 2\u003e\u00261:把标准错误(2)重定向到标准输出(1) dup:int dup(int oldfd); dup2:int dup2(int oldfd, int newfd); fcntl: int fcntl(int fd, int cmd, ...); int main(int argc, char *argv[]){ int fd; fd = open(\"test.txt\", O_WRONLY); if (fd == -1){ ERR_EXIT(\"open error\"); } /* close(1); dup(fd); // 0,1,2(输入,输出,错误)占用,默认返回3 */ /* dup2(fd, 1); // 如果由 newfd 参数所指定编号的文件描述符之前已经打开,那么 dup2()会首先将其关闭 */ close(1); if (fcntl(fd, F_DUPFD, 0) \u003c 0){ ERR_EXIT(\"dup fd error\"); } printf(\"hello\\n\"); return 0; } fcntl 功能:操纵文件描述符,改变已打开的文件的属性 函数原型: int fcntl(int fd, int cmd, ...); 函数参数: fd:文件描述符 cmd操作 复制文件描述符 F_DUPFD(Iong) 文件描述符标志 F_GETFD(void) F_SETFD(long) 文件状态标志 F_GETFL(void) F_SETFL(Iong) 文件锁 F GETLK F_SETLK, F_SETLKW 返回值:0:成功;-1:失败 void set_flag(int fd, int flags); void clr_flag(int fd, int flags); int main(int argc, char *argv[]){ char buf[BUFF_SIZE] = {0}; int ret; /* set_flag function int flags; flags = fcntl(0, F_GETFL, 0); if (flags == -1) { ERR_EXIT(\"fcntl get flag error\"); } ret = fcntl(0, F_SETFL, flags | O_NONBLOCK); if (ret == -1) { ERR_EXIT(\"fcntl set flag error\"); } */ set_flag(0, O_NONBLOCK); clr_flag(0, O_NONBLOCK); ret = read(0, buf, BUFF_SIZE); if (ret == -1) { ERR_EXIT(\"read error\"); } printf(\"buf = %s \\n\", buf); return 0; } void set_flag(int fd, int flags){ int val; val = fcntl(fd, F_GETFL, 0); if(val == -1){ ERR_EXIT(\"fcntl get flag error\"); } val |= flags; if(fcntl(fd, F_SETFL, val) \u003c 0){ ERR_EXIT(\"fcntl set flag error\"); } } void clr_flag(int fd, int flags){ int val; val = fcntl(fd, F_GETFL, 0); if(val == -1){ ERR_EXIT(\"fcntl get flag error\"); } val \u0026= ~flags; if(fcntl(fd, F_SETFL, val) \u003c 0){ ERR_EXIT(\"fcntl set flag error\"); } } 文件锁结构体查看 man 2 fcntl int main(int argc, char *argv[]){ int fd; fd = open(\"test.txt\", O_CREAT | O_RDWR | O_TRUNC, 0644); if (fd == -1){ ERR_EXIT(\"open error\"); } struct flock lock; memset(\u0026lock, 0, sizeof(lock)); lock.l_type = F_WRLCK; lock.l_whence = SEEK_SET; lock.l_start = 0; lock.l_len = 0; if(fcntl(fd, F_SETLK, \u0026lock) == 0){ printf(\"lock success\\n\"); printf(\"press any key to unlock\\n\"); lock.l_type = F_UNLCK; if (fcntl(fd, F_SETLK, \u0026lock) == 0){ printf(\"unlock success\\n\"); } else { ERR_EXIT(\"unlock fail\"); } } else { ERR_EXIT(\"lock fail\"); } return 0; } ","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:2:3","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"进程 代码段 + 数据段 + 堆栈段 + PCB(进程控制块process control block) ","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:3:0","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"进程状态变迁 进程状态变迁 进程创建 给新创建的进程分配一个内部标识,在内核中建立进程结构 复制父进程的环境 为进程分配资源,包括进程映像所需要的所有元素(程序、数据、用户栈等), 复制父进程地址空间的内容到该进程地址空间中。 置该进程的状态为就绪,插入就绪队列。 进程撤销 关闭软中断:因为进程即将终止而不再处理任何软中断信号 回收资源:释放进程分配的所有资源,如关闭所有已打开文件,释放进程相应的数据结构等 写记帐信息:将进程在运行过程中所产生的记帐数据(其中包括进程运行时的各种统计信息)记录到一个全局记帐文件中 置该进程为僵死状态:向父进程发送子进程死的软中断信号,将终止信息status送到指定的存储单元中: 转进程调度:因为此时CPU已经被释放,需要由进程调度进行CPU再分配。 终止进程 从main函数返回 调用exit 调用exit 调用abort 由信号终止 ","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:3:1","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"fork系统调用(写时复制) 功能:创建一个子进程。一次调用两次返回,创建一个进程副本,在各自的进程地址空间返回 函数原型: pid_t fork(void); 函数参数:无参数 返回值: 如果成功创建一个子进程,对于父进程来说返回子进程ID 如果成功创建一个子进程,对于子进程来说返回值为0 如果为-1表示创建失败 子进程和父进程的区别 父进程设置的锁,子进程不继承 各自的进程ID和父进程ID不同 子进程的未决告警被清除: 子进程的未决信号集设置为空集。 注意 fork系统调用之后,父子进程将交替执行 (孤儿进程,托孤给一号进程)如果父进程先退出,子进程还没退出,那么子进程的父进程将变为init进程。(注:任何一个进程都必须有父进程) (僵死进程,子进程先退出,父进程尚未查询子进程的退出状态)如果子进程先退出,父进程还没退出,那么子进程要等到父进程捕获到子进程的退出状态才真正结束,否则这个时候子进程就成为僵进程。 使用信号 signal(SIGCHLD, SIG_IGN),避免僵死进程 调用进程的进程号:pid_t getpid(void); 检索父进程的进程号:pid_t getppid(void); int main(int argc, char *argv[]){ signal(SIGCHLD, SIG_IGN); printf(\"before fork pid = %d\\n\", getpid()); pid_t pid; pid = fork(); if (oid == -1){ ERR_EXIT(\"fork fail\"); } if (pid \u003e 0){ printf(\"this is parent pid = %d, child pid = %d\\n\", getpid(), pid); sleep(1); } else if (pid == 0){ printf(\"this is child pid = %d, parent pid = %d\\n\", getpid(), getppid()); } return 0; } 查看进程的树状关系:pstree /proc/PID/status提供的PPid字段,查看每个进程的父进程。 系统支持的最大线程数:cat /proc/sys/kernel/threads-max 系统全局的 PID 号数值的限制:cat /proc/sys/kernel/pid_max vfork fork + exec(替换函数) –\u003e创建一个进程 + 替换(新的程序) 使用vfork,子进程必须执行_exit或者exec函数。_ int gval = 100; int main(int argc, char *argv[]){ signal(SIGCHLD, SIG_IGN); printf(\"before fork pid = %d\\n\", getpid()); pid_t pid; pid = vfork(); if (pid == -1){ ERR_EXIT(\"fork fail\"); } if (pid \u003e 0){ printf(\"this is parent pid = %d, child pid = %d, gval = %d\\n\", getpid(), pid, gval); sleep(1); } else if (pid == 0){ gval++; // copy on write printf(\"this is child pid = %d, parent pid = %d, gval = %d\\n\", getpid(), getppid(), gval); _exit(0); } return 0; } exit和_exit _exit:系统调用 exit:C库函数 会做缓存区清除操作:fflush(stdout); exit(); 调用终止处理程序(最多注册32个) 终止处理程序需注册:int atexit(void (*function)(void));调用和注册次序相反;即调用顺序和输出顺序相反。 int execve(const char *filename, char *const argv[], char *const rnvp[]); exec替换进程映像 替换后,不会运行之后的代码 man execlp int execl(const char *path, const char *arg, ...); int execlp(const char *file, const char *arg, ...); int execle(const char *path, const char *arg, ..., char * const envp[]); int execv(const char *path, char *const argv[]); int execvp(const char *file, char *const argv[]); int execvpe(const char *file, char *const argv[], char *const envp[]); printf(\"pid = %d \\n\", getpid()); int ret = execlp(\"./fork_pid\", \"fork_pid\", NULL); int ret = execl(\"/bin/ls\", \"ls\", \"-l\", NULL); // 指定全路径 等价于 int ret = execlp(\"ls\", \"ls\", \"-l\", NULL); 等价于 char *const args[] = {\"ls\", \"-l\", NULL}; int ret = execvp(\"ls\", args); char *const envp[] = {\"AA=11\", \"BB=22\", NULL}; // 配置环境变量, 但是没输出AA和BB很奇怪 int ret = execle(\"/bin/ls\", \"ls\", NULL, envp); if (ret == -1){ perror(\"ececlp error\"); } fcntl和exec int main(int argc, char *argv[]){ printf(\"Entering main ... \\n\"); int fret = fcntl(1, F_SETFD, FD_CLOEXEC); if (fret == -1){ perror(\"fcntl error\"); } int ret = execlp(\"./fork_pid\", \"fork_pid\", NULL); if (ret == -1){ perror(\"ececlp error\"); } printf(\"Entering main ... \\n\"); return 0; } # 输出 Entering main ... wait和waitpid 信号:异步通知事件 当子进程退出的时候,内核会向父进程发送SIGCHLD信号,子进程的退出是异步事件(子进程可以在父进程运行的任何时刻终止) 子进程退出时,内核将子进程置为僵尸状态,这个进程称为僵尸进程,它只保留最小的一些内核数据结构,以便父进程查询子进程的退出状态。 wait 功能:父进程查询子进程的退出状态 函数原型: pid_t wait(int *status); 函数参数:status:该参数可以获得等待子进程的信息 返回值:如果成功,返回等待子进程的ID int main(int argc, char *argv[]){ pid_t pid; printf(\"before fork pid = %d\\n\", getpid()); pid = fork(); if (pid == -1){ ERR_EXIT(\"fork fail\"); } if (pid \u003e 0){ printf(\"this is parent pid = %d, child pid = %d\\n\", getpid(), pid); } if (pid == 0){ sleep(3); printf(\"this is child pid = %d, parent pid = %d\\n\", getpid(), getppid()); // exit(1); abort(); // 异常终止 } printf(\"this is parent\\n\"); int ret, status; ret = wait(\u0026status); // 等待子进程退出 printf(\"ret = %d, pid = %d\\n\", ret, pid); // wait返回值子进程PIDS return 0; } // 状态信息 if (WIFEXITED(status)){ printf(\"child exited normal, exit status = %d\\n\", WEXITSTATUS(status)); } else if (WIFSIGNALED(status)){ printf(\"child exited abnormal, signal number = %d\\n\", WTERMSIG(status));// 通过kill -l查看信号 man 7 signal } else if (WIFSTOPPED(status)){ printf(\"child stoped , signal number = %d\\n\", WTERMSIG(status));","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:3:2","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"信号 man 7 signal ","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:4:0","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"信号和中断 中断过程 中断信号 中断源 保护现场 中断处理程序 恢复现场 中断源–\u003e中断屏蔽–\u003e 保护现场–\u003e中断处理程序–\u003e恢复现场 中断向量表:保存固定个数的中断处理程序入口地址 中断分类 硬件中断(外部中断) 外部中断是指由外部设各通过硬件请求的方式产生的中断,也称为硬件中断 软件中断(内部中断) 内部中断是由CPU运行程序错误或执行内部程序调用引起的一种中断,也称为软件中断。 信号是系统响应某些状况而产生的事件,进程在接收到信号时会采取相应的行动。 信号是在软件层次上对中断的一种模拟,所以通常把它称为是软中断 kill -l查看信号 信号分类 可靠信号(实时信号,支持排队,SIGRT开头); 非可靠信号(非实时信号,不支持排队) 信号与中断的相似点 采用了相同的异步通信方式 当检测出有信号或中断请求时,都暂停正在执行的程序而转去执行相应的处理程序 都在处理完毕后返回到原来的断点 对信号或中断都可进行屏蔽。 信号与中断的区别 中断有优先级,而信号没有优先级,所有的信号都是平等的 信号处理程序是在用户态下运行的,而中断处理程序是在核心态下运行 中断响应是及时的,而信号响应通常都有较大的时间延迟 进程对信号的三种响应 忽略信号:不采取任何操作,有两个信号不能忽略,也不能捕获:SIGKILL和SIGSTOP即-9和19 捕获并处理信号:内核中断正在执行的代码,转去执行先前注册过的处理程序。 执行默认操作:默认操作通常是终止进程,这取决于被发送的信号 ","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:4:1","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"signal SIGINT (crtl + c); SIGQUIT (crtl + \\) 功能:安装信号 函数原型: __sighandler_t signal(int signum, __sighandler_t handler); 函数参数:signum:中断号,handler:中断处理程序 准备捕捉或屏蔽的信号由参数signum给出,接收到指定信号时将要调用的函数由handler给出 handler:这个函数必须有一个int类型的参数(即接收到的信号代码),它本身的类型是void handler也可以是两个特殊值:SIG_IGN:屏蔽该信号;SIG_DFL:恢复默认行为 返回值:上一次所处理的程序 void handler(int sig); void func(int numLoops, char ch, char *pass); int main(int argc, char *argv[]){ if (signal(SIGINT, handler) == SIG_ERR){ ERR_EXIT(\"signal error\"); } int numLoops = 0; char ch = '\\0'; func(numLoops, ch, \"pass\"); return 0; } void handler(int sig){ printf(\"\\nrecieve a signal = %d\\n\", sig); } void func(int numLoops, char ch, char *pass){ while (1) { printf(\"Press ENTER to test (loop %d )...\", numLoops); numLoops++; ch = getchar(); if (ch == '\\n'){ printf(\"%s\\n\", pass); } else { break; } } } ","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:4:2","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"信号发送 kill 功能:发送信号 函数原型: int kill(pid_t pid, int sig); 函数参数:pid:进程号,sig:信号 pid \u003e 0: 信号sig发送给进程号等于pid的进程 pid = 0: 信号sig被发送给调用者所在组的每一个进程 pid = -1: 信号sig被发送给调用者进程有权限发送的每一个进程,除了1号进程和自己之外 pid \u003c -1: 信号sig被发送给进程组等于-pid的每一个进程 返回值:0: 成功。 -1 :设置 errno 表示错误 void handler(int sig); int main(int argc, char *argv[]){ if (signal(SIGUSR1, handler) == SIG_ERR){ ERR_EXIT(\"signal error\"); } pid_t pid = fork(); if (pid == -1){ ERR_EXIT(\"fork error\"); } if (pid == 0){ // kill(getppid(), SIGUSR1); // exit(EXIT_SUCCESS); pid = getpgrp(); kill(-pid, SIGUSR1); exit(EXIT_SUCCESS); } int n = 5; do { n = sleep(n); } while (n \u003e 0); return 0; } void handler(int sig){ printf(\"recv a sig = %d\\n\", sig); } pause 功能:使调用者进程挂起,直到一个信号被捕获 函数原型: int pause(void); 返回值:0: 成功。 -1 :设置 errno 表示错误 void handler(int sig); int main(int argc, char *argv[]){ if (signal(SIGINT, handler) == SIG_ERR){ ERR_EXIT(\"signal error\"); } alarm(1); for (;;){ pause(); printf(\"pause return\\n\"); } return 0; } void handler(int sig){ printf(\"recv a sig = %d\\n\", sig); alarm(1); } kill -ALRM `ps aux | grep demo | grep -v vi | grep -v grep | awk '{print $2}'` raise 功能:给自己发信号。raise(sig)等价于kill(getpid(), sig) killpg 功能:给进程组发信号。killpg(pgrp, sig)等价于kill(-pgrp, sig) sigqueue 功能:给进程发送信号,支持排队,可以附带信号 函数原型: int sigqueue(pid_t pid, int sig, const union sigval value); 参数:pid:进程号, sig:信号;value:信号传递的参数 返回值:-1:失败;0:成功 // 接收 void handler(int sig, siginfo_t *info, void *ctx); int main(int argc, char *argv[]){ struct sigaction act; act.sa_sigaction = handler; sigemptyset(\u0026act.sa_mask); act.sa_flags = SA_SIGINFO; if (sigaction(SIGINT, \u0026act, NULL) \u003c 0){ ERR_EXIT(\"sigaction error\"); } for (;;){ pause(); } return 0; } void handler(int sig, siginfo_t *info, void *ctx){ printf(\"recv a sig = %d data = %d\\n\", sig, info-\u003esi_value.sival_int); } // 发送 int main(int argc, char *argv[]){ if (argc != 2){ fprintf(stderr, \"Usage %s pid\\n\", argv[0]); exit(EXIT_FAILURE); } pid_t pid = atoi(argv[1]); union sigval v; v.sival_int = 100; sigqueue(pid, SIGINT, v); return 0; } // 运行 ./sigqueue_send `ps aux | grep sigqueue_recv | grep -v vi | grep -v grep | awk '{print $2}'` ","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:4:3","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"可重入函数 使用不可重入函数,进程可能会修改原来进程中不应该被修改的数据,是不安全的 多数是不可重入函数的,一般条件如下: 使用静态数据结构 函数实现时调用了malloc或free函数 实现了使用标准IO函数 typedef struct { int a; int b; } TEST; TEST g_data; int main(int argc, char *argv[]){ TEST zeros ={0, 0}; TEST ones = {1, 1}; if (signal(SIGALRM, handler) == SIG_ERR){ ERR_EXIT(\"signal error\"); } g_data = zeros; alarm(1); for (;;){ g_data = zeros; g_data = ones; } return 0; } void unsafe_fun(){ printf(\"%d %d\\n\", g_data.a, g_data.b); } void handler(int sig){ unsafe_fun(); alarm(1); } ","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:4:4","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"信号未决(pending) 执行信号的处理动作称为信号递达,信号从产生到递达之间的状态,称为信号未决。 信号集操作函数 int sigemptyset(sigset_t *set); int sigfillset(sigset_t *set); int sigaddset(sigset_t *set, int signum); int sigdelset(sigset_t *set, int signum); int sigismember(const sigset_t *set, int signum); sigprocmask 功能:读取或更改进程的信号屏蔽字 函数原型: int sigprocmask(int how, const sigset_t *set, sigset_t *oldset); 函数参数:以屏蔽子mask为例 SIG_BLOCK 被阻塞的信号集是当前信号集和信号集参数的集合。mask = mask | set SIG_UNBLOCK 从当前阻塞信号集中删除 set 中的信号。 允许尝试解锁未被屏蔽的信号。 mask = mask \u0026 ~set SIG_SETMASK 阻塞信号集被设置为参数 set。mask = set 如果 oldset 是非空指针,则读取进程的当前信号屏蔽字通过oldset参数传出。 返回值:0: 成功。 -1: 出错 void handler(int sig); void printsigset(sigset_t *set); int main(int argc, char *argv[]){ sigset_t pset, bset; sigemptyset(\u0026bset); sigaddset(\u0026bset, SIGINT); if (signal(SIGINT, handler) == SIG_ERR){ ERR_EXIT(\"signal error\"); } if (signal(SIGQUIT, handler) == SIG_ERR){ ERR_EXIT(\"signal error\"); } sigprocmask(SIG_BLOCK, \u0026bset, NULL); for(;;){ sigpending(\u0026pset); printsigset(\u0026pset); sleep(1); } return 0; } void handler(int sig){ if (sig == SIGINT){ printf(\"recv a sig = %d\\n\", sig); } else { sigset_t uset; sigemptyset(\u0026uset); sigaddset(\u0026uset, SIGINT); sigprocmask(SIG_UNBLOCK, \u0026uset, NULL); } } void printsigset(sigset_t *set){ int i; for (i = 1; i \u003c NSIG; ++i){ if (sigismember(set, i)){ putchar('1'); } else { putchar('0'); } } } sigaction 功能:改变进程接收到特定信号后的行为 原型: int sigaction(int signum, const struct sigaction *act, struct sigaction *oldact); 参数:signum:信号值;act:指向结构sigaction的实例指针loldact:oldact指向的对象用来保存原来相对应信号的处理。 struct sigaction { void (*sa_handler)(int); void (*sa_sigaction)(int, siginfo_t *, void *); sigset_t sa_mask; int sa_flags; void (*sa_restorer)(void); }; 返回值:-1:失败;0:成功 oid handler(int sig); __sighandler_t my_signal(int sig, __sighandler_t handler); int main(int argc, char *argv[]){ // struct sigaction act; // act.sa_handler = handler; // sigemptyset(\u0026act.sa_mask); // act.sa_flags = 0; // if (sigaction(SIGINT, \u0026act, NULL) \u003c 0){ // ERR_EXIT(\"sigaction error\"); // } my_signal(SIGINT, handler); for (;;){ pause(); } return 0; } __sighandler_t my_signal(int sig, __sighandler_t handler){ struct sigaction act; struct sigaction oldact; act.sa_handler = handler; sigemptyset(\u0026act.sa_mask); act.sa_flags = 0; if (sigaction(SIGINT, \u0026act, \u0026oldact) \u003c 0){ return SIG_ERR; } return oldact.sa_handler; } void handler(int sig){ printf(\"recv a sig = %d\\n\", sig); } ","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:4:5","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"时间 不同精度下的休眠 秒:unsigned int sleep(unsigned int seconds); time_t 微秒:int usleep(useconds_t usec); struct timeval{ long tv_sec; long tv_usec; } 纳秒:int nanosleep(const struct timespec *req, struct timespec *rem); struct timespec { time_t tv_sec; /* seconds */ long tv_nsec; /* nanoseconds */ }; setitimer 功能:将 which 指定的定时器当前值存储到 value 指向的结构体中 函数原型: int setitimer(int which, const struct itimerval *restrict value, struct itimerval *restrict ovalue); 参数:which:指定定时器类型 IYIMER_REAL: 经过指定时间后,内核发送SIGALARM信号给本进程 ITIMER_VIRTUAL: 在用户空间执行指定的时间后,内核发送 SIGVTALRM 信号给本进程。 ITIMER_PROF: 在用户空间与内核空间执行指定的时间后,内核发送 SIGPROF 信号给本进程。 返回值:-1失败,0成功 void handler(int sig); int main(int argc, char *argv[]){ if (signal(SIGINT, handler) == SIG_ERR){ ERR_EXIT(\"signal error\"); } struct timeval tv_interval = {1, 0}; struct timeval tv_value = {1, 0}; struct itimerval it; it.it_interval = tv_interval; it.it_value = tv_value; setitimer(ITIMER_REAL, \u0026it, NULL); // for(;;); for (int i = 0; i \u003c 10000; i++); struct itimerval oit; setitimer(ITIMER_REAL, \u0026it, \u0026oit); printf(\"%d %d %d %d\\n\", (int)oit.it_interval.tv_sec, (int)oit.it_interval.tv_usec, (int)oit.it_value.tv_sec, (int)oit.it_value.tv_usec); return 0; } void handler(int sig){ printf(\"recv a sig = %d\\n\", sig); } ","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:4:6","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"管道 管道 ","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:5:0","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"匿名管道 pipe 功能:创建一无名管道 (在具有共同祖先的进程间通信) 原型:int pipe(int fd[2]); 参数:fd:文件描述符数组,0:读端,1写端 返回:0:成功,错误代码:失败 int main(int argc, char *argv[]){ int pipefd[2]; if (pipe(pipefd) == -1){ ERR_EXIT(\"pipe error\"); } pid_t pid; pid = fork(); if (pid == -1){ ERR_EXIT(\"fork error\"); } if (pid == 0){ close(pipefd[0]); write(pipefd[1], \"hello\", 5); close(pipefd[1]); exit(EXIT_SUCCESS); } close(pipefd[1]); char buf[10] = {0}; read(pipefd[0], buf, 10); printf(\"buf = %s\\n\", buf); return 0; } ls | wc -w int main(int argc, char *argv[]){ int pipefd[2]; if (pipe(pipefd) == -1){ ERR_EXIT(\"pipe error\"); } pid_t pid; pid = fork(); if (pid == -1){ ERR_EXIT(\"fork error\"); } if (pid == 0){ dup2(pipefd[1], STDOUT_FILENO); close(pipefd[1]); close(pipefd[0]); execlp(\"ls\", \"ls\", NULL); fprintf(stderr, \"error execute ls\\n\"); exit(EXIT_FAILURE); } dup2(pipefd[0], STDIN_FILENO); close(pipefd[0]); close(pipefd[1]); execlp(\"wc\", \"wc\", \"-w\", NULL); fprintf(stderr, \"error execute ls\\n\"); exit(EXIT_FAILURE); return 0; } cp int main(int argc, char *argv[]){ close(0); open(\"makefile\", O_RDONLY); close(1); open(\"test\", O_WRONLY | O_CREAT | O_TRUNC, 0644); execlp(\"cat\", \"cat\", NULL); return 0; } ","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:5:1","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"管道读写规则 当没有数据可读时 O_NONBLOCK disable:read调用阻塞,即进程暂停执行,一直等到有数据来到为止。 O_NONBLOCK enable:read调用返回-1,errno值为EAGAIN。 如果所有管道写端对应的文件描述符被关闭,则read返回0 如果所有管道读端对应的文件描述符被关闭,则write操作会产生信号SIGPIPE 当要写入的数据量不大于PIPE_BUF时,linux将保证写入的原子性。 当要写入的数据量大于PIPE_BUF时,linux将不再保证写入的原子性。 int main(int argc, char *argv[]){ int pipefd[2]; if (pipe(pipefd) == -1){ ERR_EXIT(\"pipe error\"); } pid_t pid; pid = fork(); if (pid == -1){ ERR_EXIT(\"fork error\"); } if (pid == 0){ sleep(3); close(pipefd[0]); write(pipefd[1], \"hello\", 5); close(pipefd[1]); exit(EXIT_SUCCESS); } close(pipefd[1]); char buf[10] = {0}; int flags = fcntl(pipefd[0], F_GETFL); fcntl(pipefd[0], F_SETFL, flags | O_NONBLOCK); int ret = read(pipefd[0], buf, 10); if (ret == -1){ ERR_EXIT(\"read error\"); } printf(\"buf = %s\\n\", buf); return 0; } // 输出:read error: Resource temporarily unavailable int main(int argc, char *argv[]){ int pipefd[2]; if (pipe(pipefd) == -1){ ERR_EXIT(\"pipe error\"); } pid_t pid; pid = fork(); if (pid == -1){ ERR_EXIT(\"fork error\"); } if (pid == 0){ close(pipefd[1]); exit(EXIT_SUCCESS); } close(pipefd[1]); char buf[10] = {0}; int ret = read(pipefd[0], buf, 10); if (ret == -1){ ERR_EXIT(\"read error\"); } printf(\"ret = %d\\n\", ret); return 0; } // 输出: ret = 0 void handler(int sig){ printf(\"recv a sig = %d\\n\", sig); } int main(int argc, char *argv[]){ signal(SIGPIPE, handler); int pipefd[2]; if (pipe(pipefd) == -1){ ERR_EXIT(\"pipe error\"); } pid_t pid; pid = fork(); if (pid == -1){ ERR_EXIT(\"fork error\"); } if (pid == 0){ close(pipefd[0]); exit(EXIT_SUCCESS); } close(pipefd[0]); sleep(1); int ret = write(pipefd[1], \"hello\", 5); if (ret == -1){ ERR_EXIT(\"write error\"); } return 0; } // 输出: recv a sig = 13 write error: Broken pipe // 管道大小 65536 int main(int argc, char *argv[]){ int pipefd[2]; if (pipe(pipefd) == -1){ ERR_EXIT(\"pipe error\"); } int ret; int count = 0; int flags = fcntl(pipefd[1], F_GETFL); fcntl(pipefd[1], F_SETFL, flags | O_NONBLOCK); while(1){ ret = write(pipefd[1], \"A\", 1); if (ret == -1){ printf(\"err = %s\\n\", strerror(errno)); break; } count++; } printf(\"pipe size = %d\\n\", count); return 0; } int main(int argc, char *argv[]){ char a[TEST_SIZE]; char b[TEST_SIZE]; memset(a, 'A', sizeof(a)); memset(b, 'B', sizeof(b)); int pipefd[2]; int ret = pipe(pipefd); if(ret == -1){ ERR_EXIT(\"pipe error\"); } pid_t pid = fork(); if (pid == 0){ close(pipefd[0]); ret = write(pipefd[1], a, sizeof(a)); printf(\"apid = %d write %d bytes to pipe\\n\", getpid(), ret); exit(0); } pid = fork(); if (pid == 0){ close(pipefd[0]); ret = write(pipefd[1], b, sizeof(b)); printf(\"bpid = %d write %d bytes to pipe\\n\", getpid(), ret); exit(0); } close(pipefd[1]); sleep(1); int fd = open(\"test.txt\", O_WRONLY | O_CREAT | O_TRUNC, 0644); char buf[1024*4] = {0}; int n = 1; while(1){ ret = read(pipefd[0], buf, sizeof(buf)); if (ret == 0) break; printf(\"n = %02d pid = %d read %d bytes from pipe buf[4095] = %c\\n\", n++, getpid(), ret, buf[4095]); write(fd, buf, ret); } return 0; } ","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:5:2","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"命名管道FIFO mkfifo 功能:创建一命名管道 ,可在不相关的进程间进行通信,命令行创建mkfifo filename 原型:int mkfifo(const char *pathname, mode_t mode); 参数:pathname:文件名;mode:文件状态模式 返回:0:成功,-1:失败 命名管道打开规则 如果当前打开操作是为读而打开FIFO时 O_NONBLOCK disable:阻塞直到有相应进程为写而打开FIFO O_NONBLOCK enable:立刻返回成功 如果当前打开操作是为写而打开FIFO时 O_NONBLOCK disable:阻塞直到有相应进程为读而打开FIFO O_NONBLOCK enable:立刻返回失败,错误码为ENXIO int main(int argc, char *argv[]){ int fifo; fifo = mkfifo(\"p1\", 0644); if (fifo == -1){ ERR_EXIT(\"FIFO create fail\"); } int fd; fd = open(\"p1\", O_RDONLY | O_NONBLOCK); if (fd == -1){ ERR_EXIT(\"open error\"); } printf(\"open success\\n\"); return 0; } // 输出:open success int main(int argc, char *argv[]){ int fd; fd = open(\"p1\", O_WRONLY | O_NONBLOCK); if (fd == -1){ ERR_EXIT(\"open error\"); } printf(\"open success\\n\"); return 0; } // 输出open error: No such device or address cp int main(int argc, char *argv[]){ mkfifo(\"tp\", 0644); int infd; infd = open(\"makefile\", O_RDONLY); if (infd == -1){ ERR_EXIT(\"open error\"); } int outfd; outfd = open(\"tp\", O_WRONLY); if (outfd == -1){ ERR_EXIT(\"open error\"); } char buf[1024]; int n; while ((n = read(infd, buf, 1024)) \u003e 0){ write(outfd, buf, n); } close(infd); close(outfd); return 0; } int main(int argc, char *argv[]){ int infd; infd = open(\"tp\", O_RDONLY); if (infd == -1){ ERR_EXIT(\"open error\"); } int outfd; outfd = open(\"test\", O_WRONLY | O_CREAT | O_TRUNC, 0644); if (outfd == -1){ ERR_EXIT(\"open error\"); } char buf[1024]; int n; while ((n = read(infd, buf, 1024)) \u003e 0){ write(outfd, buf, n); } close(infd); close(outfd); unlink(\"tp\"); return 0; } ","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:5:3","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"minishell实践 ","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:6:0","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"parse_command // 单条命令解析 ls -l -a char *cp = cmdline; char *avp = avline; int i = 0; while(*cp != '\\0'){ while (*cp == ' ' || *cp == '\\t'){ cp++; // 过滤空格 } if (*cp == '\\0' || *cp == '\\n'){ break; // 行尾跳出 } cmd.args[i] = avp; while (*cp != '\\0' \u0026\u0026 *cp !=' ' \u0026\u0026 *cp != '\\t' \u0026\u0026 *cp !='\\n'){ *avp++ = *cp++; } *avp++ = '\\0'; // printf(\"[%s]\\n\", cmd.args[i]); i++; } // 单条命令执行 pid_t pid = fork(); // 让子进程执行命令execvp(execvp是替换程序) if (pid == -1){ ERR_EXIT(\"fork\"); } int ret; if (pid == 0){ ret = execvp(cmd.args[0], cmd.args); if (ret == -1){ ERR_EXIT(\"execvp\"); } } wait(NULL); cmd [\u003c filename][| cmd] ... [or filename][\u0026] 方括号可选 省略号(…)表示前面可重复0次或者多次 其中or可以是\u003e 或者\u003e\u003e /* cat \u003c test.txt | grep -n public \u003e test2.txt \u0026 */ /* 1. 解析第一条简单命令 2. 判定是否有输入重定向符 3. 判定是否有管道 4. 判定是否有输出重定向符 5. 判定是否后台作业 6. 判断命令结束 '\\n' */ ","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:6:1","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"Linux网络编程 ","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:7:0","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"TCP/IP 直接看 图解网络介绍 | 小林coding (xiaolincoding.com) TCP_IP TCP/IP四层模型 TCP/IP四层模型 封装过程 封装 ","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:8:0","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"链路层 最大传输单元(MTU):链路层数据帧的最大长度, 两台通信主机路径中的最小MTU叫路径MTU。 ICMP协议:用于传递差错信息、时间、回显、网络信息等控制数据,在IP报文中 ARP地址解析协议:广播机制传播,回复ARP请求,ARP缓存区映射 RARP反向地址解析协议 ","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:8:1","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"传输控制层 TCP 基于字节流,面向连接,可靠传输,缓冲传输,全双工,流量控制 最长报文大小MSS 保证可靠性 差错:校验和 丢包:超时重传+确认 失序:seq 重复:seq TCP 头格式 三次握手 TCP 三次握手 四次挥手 客户端主动关闭连接 —— TCP 四次挥手 ","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:8:2","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"滑动窗口协议 流量控制:窗口维护 ","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:8:3","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"Socket 看成是用户进程与内核网络协议栈的编程接口。 不仅可以用于本机的进程间通信,还可以用于网络上不同主机的进程间通信。 套接口的地址结构,包括PV4和通用地址结构,以及如何在不同主机和协议之间进行通信。 man 7 ip查看 套接口必须有地址属性来标识一个端点,TCP/IP协议用IP地址、端口号和地址家族来表达。 struct sockaddr_in { sa_family_t sin_family; /* address family: AF_INET */ in_port_t sin_port; /* port in network byte order */ struct in_addr sin_addr; /* internet address */ }; ","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:9:0","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"基础概念 网络字节序 字节序 大端字节序 (Big Endian) 最高有效位(MSB:Most Significant Bit)存储于最低内存地址处,最低有效位(LSB:Lowest Significant Bit)存储于最高内存地址处。 小端字节序(Little Endian) 最高有效位存储于最高内存地址处,最低有效位存储于最低内存地址处。 主机字节序 不同的主机有不同的字节序,如x86为小端字节序,Motorola6800为大端字节序,ARM字节序是可配置的。 网络字节序 网络字节序规定为大端字节序 字节序转换函数 h:host;n:network;s:short; l:long; #include \u003carpa/inet.h\u003e uint32_t htonl(uint32_t hostlong);// 4字节的主机字节序转为网络字节序 uint16_t htons(uint16_t hostshort); uint32_t ntohl(uint32_t netlong); uint16_t ntohs(uint16_t netshort); #include \u003cstdio.h\u003e #include \u003cstdlib.h\u003e #include \u003carpa/inet.h\u003e int main() { unsigned int x = 0x12345678; unsigned char *p = (unsigned char *)(\u0026x); printf(\"%0x %0x %0x %0x\\n\", p[0], p[1], p[2], p[3]); unsigned int y = htonl(x); p = (unsigned char *)(\u0026y); printf(\"%0x %0x %0x %0x\\n\", p[0], p[1], p[2], p[3]); return 0; } // 输出 // 78 56 34 12 // 12 34 56 78 地址转换函数 int inet_aton(const char *cp, struct in_addr *inp);// 点分十进制的IPv4地址字符串转struct in_addr结构体类型的二进制表示 in_addr_t inet_addr(const char *cp); in_addr_t inet_network(const char *cp); // 点分十进制的IPv4地址字符串转对应的网络地址的二进制表示 char *inet_ntoa(struct in_addr in); // 网络字节序表示的struct in_addr类型的IPv4地址转换为点分十进制的字符串表示 int main() { struct in_addr ipaddr; inet_aton(\"192.168.0.123\", \u0026ipaddr); printf(\"%u\\n\", ntohl(ipaddr.s_addr)); printf(\"IPv4地址: %s\\n\", inet_ntoa(ipaddr)); unsigned long addr2 = inet_addr(\"192.168.0.123\"); printf(\"%u\\n\", ntohl(addr2)); struct in_addr ipaddr_1; ipaddr_1.s_addr = addr2; printf(\"IPv4地址: %s\\n\", inet_ntoa(ipaddr_1)); in_addr_t ip; ip = inet_network(\"192.168.0.123\"); printf(\"%u\\n\", ip); ip = ntohl(ip); struct in_addr ipaddr_2; ipaddr_2.s_addr = ip; printf(\"IPv4地址: %s\\n\", inet_ntoa(ipaddr_2)); return 0; } 套接字类型 流式套接字(SOCK_STREAM) 提供面向连接的、可靠的数据传输服务,数据无差错,无重复的发送,且按发送顺序接收。 数据报式套接字(SOCK_DGRAM) 提供无连接服务。不提供无错保证,数据可能丢失或重复,并且接收顺序混乱 原始套接字(SOCK RAW) ","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:9:1","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"socket函数 socket 功能:创建一个套接字用于通信 函数原型:int socket(int domain, int type, int protocol); 参数:domain:通信协议族(protocol family);type:socket类型;protocol:协议类型 返回值:成功:非负整数;失败:-1. bind 功能:绑定一个本地地址到套接字 函数原型:int bind(int socket, const struct sockaddr *address, socklen_t address_len); 参数:socket:函数返回的套接字;address:要绑定的地址;address_len:地址长度 返回值:成功:0;失败:-1. listen 功能:将套接字用于监听进入的连接;将socket从主动套接字变为被动套接字 函数原型:int listen(int socket, int backlog); 参数:socket:函数返回的套接字;backlog:规定内核为此套接字排队的最大连接个数 返回值:成功:0;失败:-1. accept 功能:从已完成连接队列返回第一个连接,如果已完成连接队列为空,则阻塞 函数原型:int accept(int socket, struct sockaddr *restrict address, socklen_t *restrict address_len); 参数:socket:函数返回的套接字;address:将返回对等方的套接字地址;address_len:地址长度 返回值:成功:非负整数;失败:-1. connect 功能:建立一个连接至addr所指定的套接字 函数原型:int connect(int socket, const struct sockaddr *address, socklen_t address_len); 参数:socket:未连接的套接字;address:要连接的套接字地址;address_len:地址长度 返回值:成功:0;失败:-1. 属性 getsockname 功能:获取本地地址 函数原型:int getsockname(int sockfd, struct sockaddr *addr, socklen_t *addrlen); 参数:socket:套接字;addr:本地地址;addrlen:地址长度 返回值:成功:0;失败:-1. struct sockaddr_in localaddr; socklen_t addrlen = sizeof(localaddr); if ((getsockname(sock, (struct sockaddr*)\u0026localaddr, \u0026addrlen) \u003c 0)){ ERR_EXIT(\"getsockname fail\"); } printf(\"ip = %s port = %d\\n\", inet_ntoa(localaddr.sin_addr), ntohs(localaddr.sin_port)); getpeername :获取对等方地址 int getpeername(int sockfd, struct sockaddr *addr, socklen_t *addrlen); gethostname:获取主机名 int gethostname(char *name, size_t len); gethostbyname:通过主机名获取IP地址 struct hostent *gethostbyname(const char *name); gethostbyaddr:通过IP地址获取主机的完整信息 struct hostent *gethostbyaddr(const void *addr, socklen_t len, int type); ","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:9:2","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"TCP客户/服务器模型 netstat -an | grep TIME_WAIT查看等待状态的网络 TCP客户/服务器模型 回射客户/服务器 回射客户/服务器 int main() { int listenfd; // listenfd = socket(PF_INET, SOCK_STREAM, 0); listenfd = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP); // 指定TCP if (listenfd \u003c 0){ ERR_EXIT(\"socket fail\"); } // init struct sockaddr_in servaddr; memset(\u0026servaddr, 0, sizeof(servaddr)); servaddr.sin_family = AF_INET; servaddr.sin_port = htons(5188); servaddr.sin_addr.s_addr = htonl(INADDR_ANY); // servaddr.sin_addr.s_addr = inet_addr(\"127.0.0.1\"); // 指定地址 // inet_aton(\"127.0.0.1\", \u0026servaddr.sin_addr); if (bind(listenfd, (struct sockaddr*)\u0026servaddr, sizeof(servaddr)) \u003c 0){ ERR_EXIT(\"bind fail\"); } if (listen(listenfd, SOMAXCONN) \u003c 0){ ERR_EXIT(\"listen fail\"); } struct sockaddr_in peeraddr; socklen_t peerlen = sizeof(peeraddr); int conn; conn = accept(listenfd, (struct sockaddr*)\u0026peeraddr, \u0026peerlen); if (conn \u003c 0){ ERR_EXIT(\"accept fail\"); } char recvbuf[1024]; while(1){ memset(recvbuf, 0, sizeof(recvbuf)); int ret = read(conn, recvbuf, sizeof(recvbuf)); fputs(recvbuf, stdout); write(conn, recvbuf, ret); memset(recvbuf, 0, sizeof(recvbuf)); } close(conn); close(listenfd); return 0; } int main() { int sock; // listenfd = socket(PF_INET, SOCK_STREAM, 0); sock = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP); // 指定TCP if (sock \u003c 0){ ERR_EXIT(\"socket fail\"); } // init struct sockaddr_in servaddr; memset(\u0026servaddr, 0, sizeof(servaddr)); servaddr.sin_family = AF_INET; servaddr.sin_port = htons(5188); servaddr.sin_addr.s_addr = inet_addr(\"127.0.0.1\"); // 指定地址 // inet_aton(\"127.0.0.1\", \u0026servaddr.sin_addr); int ret; ret = connect(sock, (struct sockaddr*)\u0026servaddr, sizeof(servaddr)); if (ret \u003c 0){ ERR_EXIT(\"connect fail\"); } char sendbuf[1024] = {0}; char recvbuf[1024] = {0}; while (fgets(sendbuf, sizeof(sendbuf), stdin) != NULL) { write(sock, sendbuf, strlen(sendbuf)); read(sock, recvbuf, sizeof(recvbuf)); fputs(recvbuf, stdout); memset(sendbuf, 0, sizeof(sendbuf)); memset(recvbuf, 0, sizeof(recvbuf)); } close(sock); return 0; } 处理多客户连接 (process-per-connection) 一个连接一个进程来处理并发。 父进程接受客户端连接,子进程用来处理和客户端的通信细节。 void do_service(int conn, struct sockaddr_in peeraddr){ char recvbuf[1024]; while(1){ memset(recvbuf, 0, sizeof(recvbuf)); int ret = read(conn, recvbuf, sizeof(recvbuf)); if (ret == 0){ printf(\"client ip = %s port = %d close\\n\", inet_ntoa(peeraddr.sin_addr), ntohs(peeraddr.sin_port)); break; } else if (ret == -1){ ERR_EXIT(\"read fail\"); } fputs(recvbuf, stdout); write(conn, recvbuf, ret); memset(recvbuf, 0, sizeof(recvbuf)); } } int main() { int listenfd; // listenfd = socket(PF_INET, SOCK_STREAM, 0); listenfd = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP); // 指定TCP if (listenfd \u003c 0){ ERR_EXIT(\"socket fail\"); } // init struct sockaddr_in servaddr; memset(\u0026servaddr, 0, sizeof(servaddr)); servaddr.sin_family = AF_INET; servaddr.sin_port = htons(5188); servaddr.sin_addr.s_addr = htonl(INADDR_ANY); int on = 1; // 在TIME_WAIT还没消失的情况,允许服务器重启 if (setsockopt(listenfd, SOL_SOCKET, SO_REUSEADDR, \u0026on, sizeof(on)) \u003c 0){ ERR_EXIT(\"setsocketopt\"); } if (bind(listenfd, (struct sockaddr*)\u0026servaddr, sizeof(servaddr)) \u003c 0){ ERR_EXIT(\"bind fail\"); } if (listen(listenfd, SOMAXCONN) \u003c 0){ ERR_EXIT(\"listen fail\"); } struct sockaddr_in peeraddr; socklen_t peerlen = sizeof(peeraddr); int conn; pid_t pid; while(1){ conn = accept(listenfd, (struct sockaddr*)\u0026peeraddr, \u0026peerlen); if (conn \u003c 0){ ERR_EXIT(\"accept fail\"); } printf(\"ip = %s port = %d\\n\", inet_ntoa(peeraddr.sin_addr), ntohs(peeraddr.sin_port)); pid = fork(); if (pid == -1){ ERR_EXIT(\"fork fail\"); } if (pid == 0){ close(listenfd); do_service(conn, peeraddr); exit(EXIT_SUCCESS); } else{ close(conn); } } return 0; } 点对点聊天 双方维护一个套接字 void handler(int sig) { printf(\"recv a sig = %d\\n\", sig); exit(EXIT_SUCCESS); } int main() { int listenfd; // listenfd = socket(PF_INET, SOCK_STREAM, 0); listenfd = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP); // 指定TCP if (listenfd \u003c 0){ ERR_EXIT(\"socket fail\"); } // init struct sockaddr_in servaddr; memset(\u0026servaddr, 0, sizeof(servaddr)","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:9:3","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"Select 五种I/O模型 阻塞I/O 非阻塞I/O I/O复用(select 和 poll) select管理多个文件描述符 信号驱动I/O 异步I/O select函数 功能:检测多个文件描述符中是否有可读、可写或异常事件 函数原型:int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout); 参数 nfds:表示所有文件描述符的范围,即最大的文件描述符加1。(后面集合最大值加1) readfds、writefds、exceptfds:分别是指向可读、可写和异常等事件的文件描述符集合的指针。 timeout:表示超时时间。 返回值:失败:1;成功:超时没检测到为0,检测到的事件个数 void FD_CLR(int fd, fd_set *set); // 移除 int FD_ISSET(int fd, fd_set *set); // fd是否在集合中 void FD_SET(int fd, fd_set *set); // 添加集合 void FD_ZERO(fd_set *set); // 清空集合 可读事件发生条件 套接口缓冲区有数据可读 连接的读一半关闭,即接收到FIN段,读操作将返回O 如果是监听套接口,已完成连接队列不为空时 套接口上发生了一个错误待处理,错误可以通过getsockopt指定SO_ERROR选项来获取。 可写事件发生条件 套接口发送缓冲区有空间容纳数据。 连接的写一半关闭。即收到RST段之后,再次调用write操作。 套接口上发生了一个错误待处理,错误可以通过getsockopt指定SO_ERROR选项来获取。 异常事件发生条件 套接口存在带外数据 select改进回射客户/服务器 void echo_client(int sock){ fd_set rset; FD_ZERO(\u0026rset); int nready; int maxfd; int fd_stdin = fileno(stdin); maxfd = (fd_stdin \u003e sock) ? fd_stdin: sock; char sendbuf[1024] = {0}; char recvbuf[1024] = {0}; while(1){ FD_SET(fd_stdin, \u0026rset); FD_SET(sock, \u0026rset); nready = select(maxfd + 1, \u0026rset, NULL, NULL, NULL); if (nready == -1){ ERR_EXIT(\"select fail\"); } if (nready == 0){ continue; } if (FD_ISSET(sock, \u0026rset)){ int ret = readline(sock, recvbuf, sizeof(recvbuf)); if (ret == -1){ ERR_EXIT(\"readline fail\"); } else if (ret == 0){ printf(\"server close\\n\"); break; } fputs(recvbuf, stdout); memset(recvbuf, 0, sizeof(recvbuf)); } if (FD_ISSET(fd_stdin, \u0026rset)){ if (fgets(sendbuf, sizeof(sendbuf), stdin) == NULL){ break; } writen(sock, sendbuf, strlen(sendbuf)); memset(sendbuf, 0, sizeof(sendbuf)); } } close(sock); } // 服务器端 int client[FD_SETSIZE]; int maxi = -1; // 遍历整个FD_SETSIZE太费时间,记录最大得fd位置,遍历到那个位置即可 int i; for (i = 0; i \u003c FD_SETSIZE; i++){ client[i] = -1; } int nready; int maxfd = listenfd; fd_set rset; fd_set allset; FD_ZERO(\u0026rset); FD_ZERO(\u0026allset); FD_SET(listenfd, \u0026allset); while(1){ rset = allset; // select会修改fd_set,所以每次需要重新赋值一份 nready = select(maxfd + 1, \u0026rset, NULL, NULL, NULL); if (nready == -1){ if (errno == EINTR){ // select被信号中断需要重新执行 continue; } ERR_EXIT(\"select fail\"); } if (nready == 0){ continue; } if (FD_ISSET(listenfd, \u0026rset)){ peerlen = sizeof(peeraddr); conn = accept(listenfd, (struct sockaddr*)\u0026peeraddr, \u0026peerlen); if (conn \u003c 0){ ERR_EXIT(\"accept fail\"); } for (i = 0; i \u003c FD_SETSIZE; i++){ if (client[i] \u003c 0){ client[i] = conn; if (i \u003e maxi){ maxi = i; } break; } } if (i == FD_SETSIZE){ fprintf(stderr, \"too many clients\\n\"); exit(EXIT_FAILURE); } printf(\"ip = %s port = %d\\n\", inet_ntoa(peeraddr.sin_addr), ntohs(peeraddr.sin_port)); FD_SET(conn, \u0026allset); if (conn \u003e maxfd){ maxfd = conn; } if (--nready \u003c= 0){ continue; } } for (i = 0; i \u003c= maxi; i++){ conn = client[i]; if (conn == -1){ continue; } if (FD_ISSET(conn, \u0026rset)){ char recvbuf[1024]; int ret = readline(conn, recvbuf, sizeof(recvbuf)); if (ret == -1){ ERR_EXIT(\"readline fail\"); } if (ret == 0){ struct sockaddr_in peer_addr; socklen_t peer_len = sizeof(peer_addr); getpeername(conn, (struct sockaddr*)\u0026peer_addr, \u0026peer_len); printf(\"client ip = %s port = %d close\\n\", inet_ntoa(peer_addr.sin_addr), ntohs(peer_addr.sin_port)); FD_CLR(conn, \u0026allset); client[i] = -1; if (i == maxi){// 可能删除得i是当前得maxi,要优化到第二大的位置 for(int j = maxi - 1; i \u003e= 0; j--){ if (client[j] != -1){ maxi = j; break; } } } } fputs(recvbuf, stdout); writen(conn, recvbuf, strlen(recvbuf)); memset(\u0026recvbuf, 0, sizeof(recvbuf)); if (--nready \u003c= 0){ break; } } } } close和shutdown close终止了数据传送的两个方向 shutdown可以有选择的终止某个方向的数据传送或者终止数据传送的两个方向 int shutdown(int sockfd, int how); shutdown how=1可以保证对等方接收到一个E0F字符,而不管其他进程是否已经打开了套接字。而close不能保证,直到套接字引用计数减为0时才发送。即直到所有的进程都关闭了套接字。 I/O超时 alarm 套接字选项 SO_SNDTIMEO SO_RCVTIMEO select // 闹钟冲突,一般不用 void handler(int sig){ return; } signal(SIGALARM, handler); alarm(5); int ret = read(fd, buf, sizeof(buf)); if (ret == -1 \u0026\u0026 errno == EINTR){ errno = ETIMEDOUT; } else if (ret \u003e= 0){ aralm(0); } setsockopt(sock, SOL_SOCKET, SO_RCVTIMEO,5); int ret = read(sock, buf, sizeof(buf)); if (ret == -1 \u0026\u0026 errno","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:9:4","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"poll select的限制 用select实现的并发服务器,能达到的并发数,受两方面限制 一个进程能打开的最大文件描述符限制。这可以通过调整内核参数。 ulimit -n 1024调整 只能修改当前进程以及子进程 select中的fd_set集合容量的限制(FD SETSIZE),这需要重新编译内核。 int getrlimit(int resource, struct rlimit *rlim); int setrlimit(int resource, const struct rlimit *rlim); resource 设置 RLIMIT_NOFILE select和poll共同点:内核要遍历所有文件描述符,直到找到发生事件的文件描述符 poll 一个进程能打开的最大文件描述符限制。系统所有打开的最大文件描述个数也是有限的,跟内存大小有关 poll函数 功能:检测多个文件描述符中是否有可读、可写或异常事件 函数原型:int poll(struct pollfd *fds, nfds_t nfds, int timeout); 参数 fds:指向一个struct pollfd结构体数组的指针,每个结构体描述一个待检测的文件描述符及其关注的事件。 nfds:表示fds数组中结构体的数量。 timeout:表示超时时间。 返回值:成功:发生事件的文件描述符数,如果超时返回0,如果出错返回-1,并将errno设置为相应的错误码 poll函数支持的文件描述符数目更大(nfds参数没有上限),并且不需要像select那样使用位图处理多个文件描述符的状态。 不用维护maxfd 不用使用FD、ZERO、FD_SET、FD_CLR、FD_ISSET函数 // serve struct pollfd client[CLIENT_SIZE]; int maxi = 0; // 遍历整个FD_SETSIZE太费时间,记录最大得fd位置,遍历到那个位置即可 int i; for (i = 0; i \u003c CLIENT_SIZE; i++){ client[i].fd = -1; } int nready; client[0].fd = listenfd; client[0].events = POLLIN;// 对监听套接口的可读事件感兴趣 while(1){ nready = poll(client, maxi + 1, -1); if (nready == -1){ if (errno == EINTR){ continue; } ERR_EXIT(\"poll fail\"); } if (nready == 0){ continue; } if (client[0].revents \u0026 POLLIN){ // 如果产生了可读事件 peerlen = sizeof(peeraddr); conn = accept(listenfd, (struct sockaddr*)\u0026peeraddr, \u0026peerlen); if (conn \u003c 0){ ERR_EXIT(\"accept fail\"); } for (i = 0; i \u003c CLIENT_SIZE; i++){ if (client[i].fd \u003c 0){ client[i].fd = conn; if (i \u003e maxi){ maxi = i; } break; } } if (i == CLIENT_SIZE){ fprintf(stderr, \"too many clients\\n\"); exit(EXIT_FAILURE); } printf(\"ip = %s port = %d\\n\", inet_ntoa(peeraddr.sin_addr), ntohs(peeraddr.sin_port)); client[i].events = POLLIN; if (--nready \u003c= 0){ continue; } } for (i = 1; i \u003c= maxi; i++){ conn = client[i].fd; if (conn == -1){ continue; } if(client[i].events \u0026 POLLIN){ char recvbuf[1024]; int ret = readline(conn, recvbuf, sizeof(recvbuf)); if (ret == -1){ ERR_EXIT(\"readline fail\"); } if (ret == 0){ struct sockaddr_in peer_addr; socklen_t peer_len = sizeof(peer_addr); getpeername(conn, (struct sockaddr*)\u0026peer_addr, \u0026peer_len); printf(\"client ip = %s port = %d close\\n\", inet_ntoa(peer_addr.sin_addr), ntohs(peer_addr.sin_port)); client[i].fd = -1; if (i == maxi){// 可能删除得i是当前得maxi,要优化到第二大的位置 for(int j = maxi - 1; i \u003e= 0; j--){ if (client[j].fd != -1){ maxi = j; break; } } } } fputs(recvbuf, stdout); writen(conn, recvbuf, strlen(recvbuf)); memset(\u0026recvbuf, 0, sizeof(recvbuf)); if (--nready \u003c= 0){ break; } } } } // client void echo_client(int sock){ struct pollfd client_fd[2]; int nready; int fd_stdin = fileno(stdin); char sendbuf[1024] = {0}; char recvbuf[1024] = {0}; while(1){ client_fd[0].fd = fd_stdin; client_fd[0].events = POLLIN; client_fd[1].fd = sock; client_fd[1].events = POLLIN; nready = poll(client_fd, 2, -1); if (nready == -1){ ERR_EXIT(\"poll fail\"); } if (nready == 0){ continue; } if (client_fd[1].revents \u0026 POLLIN){ int ret = readline(sock, recvbuf, sizeof(recvbuf)); if (ret == -1){ ERR_EXIT(\"readline fail\"); } else if (ret == 0){ printf(\"server close\\n\"); break; } fputs(recvbuf, stdout); memset(recvbuf, 0, sizeof(recvbuf)); } if (client_fd[0].revents \u0026 POLLIN){ if (fgets(sendbuf, sizeof(sendbuf), stdin) == NULL){ break; } writen(sock, sendbuf, strlen(sendbuf)); memset(sendbuf, 0, sizeof(sendbuf)); } } close(sock); } epoll函数 epoll的优点 相比于select与poll,epoll最大的好处在于它不会随着监听fd数目的增长而降低效率。 内核中的select与poll的实现是采用轮询来处理的,轮询的fd数目越多,耗时越多。 epoll的实现是基于回调的,如果fd有期望的事件发生就通过回调函数将其加入epoll就绪队列中。(只关心“活跃”的fd,与fd数目无关) 内核把fd消息通知给用户空间呢?``select/poll采取内存拷贝方法。而epoll采用共享内存`的方式。 epoll能直接定位事件,而不必遍历整个fd集合。因为epoll不仅会告诉应用程序有I/O事件到来,还会告诉应用程序相关的信息,这些信息是应用程序填充的。 int epoll_create(int size); // 创建epoll实例 哈希表 int epoll_create1(int flags); // 红黑树 int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event); // 将I/O 添加到epoll管理 int epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout); // 等待事件 epoll模式 EPOLLLT:电平 完全靠kernel epoll驱动,应用程序只需要处理从epoll_wait返回的fds。(这些fds认为处于就绪状态) EPOLLET:边沿 仅仅通知应用程序哪些fds变成了就绪状态,一旦fd变成就绪状态,epoll将不再关注这个fd的在何状态信息,(从epo队列移除)直到应用程序通过读写操作触发EAG","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:9:5","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"UDP UDP特点 无连接 基于消息的数据传输服务 不可靠 一般情况下UDP更加高效 UDP客户/服务器模型 UDP客户/服务器模型 回射客户/服务器模型 回射客户/服务器模型 void echo_service(int sock){ char recvbuf[1024] = {0}; struct sockaddr_in peeraddr; socklen_t peerlen; while(1){ peerlen = sizeof(peeraddr); memset(recvbuf, 0, sizeof(recvbuf)); int ret = recvfrom(sock, recvbuf, sizeof(recvbuf), 0, (struct sockaddr *)\u0026peeraddr, \u0026peerlen); if (ret == -1){ if (errno == EINTR){ continue; } ERR_EXIT(\"recvfrom fail\"); } else if (ret \u003e 0){ fputs(recvbuf, stdout); sendto(sock, recvbuf, ret, 0, (struct sockaddr *)\u0026peeraddr, peerlen); memset(recvbuf, 0, sizeof(recvbuf)); } } close(sock); } int main() { int sock; sock = socket(PF_INET, SOCK_DGRAM, IPPROTO_UDP); // 指定UDP if (sock \u003c 0){ ERR_EXIT(\"socket fail\"); } // init struct sockaddr_in servaddr; memset(\u0026servaddr, 0, sizeof(servaddr)); servaddr.sin_family = AF_INET; servaddr.sin_port = htons(5188); servaddr.sin_addr.s_addr = htonl(INADDR_ANY); if (bind(sock, (struct sockaddr*)\u0026servaddr, sizeof(servaddr)) \u003c 0){ ERR_EXIT(\"bind fail\"); } echo_service(sock); return 0; } void echo_client(int sock, struct sockaddr_in servaddr){ char sendbuf[1024] = {0}; char recvbuf[1024] = {0}; while (fgets(sendbuf, sizeof(sendbuf), stdin) != NULL) { sendto(sock, sendbuf, strlen(sendbuf), 0, (struct sockaddr*)\u0026servaddr, sizeof(servaddr)); recvfrom(sock, recvbuf, sizeof(recvbuf), 0, NULL, NULL); fputs(recvbuf, stdout); memset(sendbuf, 0, sizeof(sendbuf)); memset(recvbuf, 0, sizeof(recvbuf)); } close(sock); } int main() { int sock; sock = socket(PF_INET, SOCK_DGRAM, IPPROTO_UDP); // UDP if (sock \u003c 0){ ERR_EXIT(\"socket fail\"); } // init struct sockaddr_in servaddr; memset(\u0026servaddr, 0, sizeof(servaddr)); servaddr.sin_family = AF_INET; servaddr.sin_port = htons(5188); servaddr.sin_addr.s_addr = inet_addr(\"127.0.0.1\"); // 指定地址 echo_client(sock, servaddr); return 0; } UDP注意点 UDP报文可能会丢失、重复 UDP报文可能会乱序 UDP缺乏流量控制 UDP协议数据报文截断 recvfrom返回0,不代表连接关闭,因为udp是无连接的。 ICMP异步错误 UDP connect UDP外出接口的确定 UDP聊天室 UDP聊天室 ","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:9:6","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"UNIX域 UNIX域特点 在同一台主机的传输速度是TCP的两倍 可以在同一台主机上各进程之间传递描述符。 UNX域套接字与传统套接字的区别是用路径名来表示协议族的描述。 UNIX域地址结构 man 7 UNIX #define UNIX_PATH_MAX 108 struct sockaddr_un { sa_family_t sun_family; /* AF_UNIX */ char sun_path[UNIX_PATH_MAX]; /* pathname */ }; 回射客户/服务器模型 void echo_srver(int conn){ char recvbuf[1024]; while(1){ memset(recvbuf, 0, sizeof(recvbuf)); int ret = read(conn, recvbuf, sizeof(recvbuf)); if (ret == 0){ printf(\"client close\\n\"); break; } else if (ret == -1){ ERR_EXIT(\"read fail\"); } fputs(recvbuf, stdout); write(conn, recvbuf, ret); memset(recvbuf, 0, sizeof(recvbuf)); } } int main() { int listenfd; listenfd = socket(PF_UNIX, SOCK_STREAM, 0); // UNIUX if (listenfd \u003c 0){ ERR_EXIT(\"socket fail\"); } // init unlink(\"/tmp/test_socket\"); struct sockaddr_un servaddr; memset(\u0026servaddr, 0, sizeof(servaddr)); servaddr.sun_family = AF_UNIX; strcpy(servaddr.sun_path, \"/tmp/test_socket\"); if (bind(listenfd, (struct sockaddr*)\u0026servaddr, sizeof(servaddr)) \u003c 0){ ERR_EXIT(\"bind fail\"); } if (listen(listenfd, SOMAXCONN) \u003c 0){ ERR_EXIT(\"listen fail\"); } int conn; pid_t pid; while(1){ conn = accept(listenfd, NULL, NULL); if (conn == -1){ ERR_EXIT(\"accept fail\"); } pid = fork(); if (pid == -1){ ERR_EXIT(\"fork fail\"); } if (pid == 0){ close(listenfd); echo_srver(conn); exit(EXIT_SUCCESS); } else { close(conn); } } return 0; } void echo_client(int sock){ char sendbuf[1024] = {0}; char recvbuf[1024] = {0}; while (fgets(sendbuf, sizeof(sendbuf), stdin) != NULL) { write(sock, sendbuf, strlen(sendbuf)); read(sock, recvbuf, sizeof(recvbuf)); fputs(recvbuf, stdout); memset(sendbuf, 0, sizeof(sendbuf)); memset(recvbuf, 0, sizeof(recvbuf)); } close(sock); } int main() { int sock; sock = socket(PF_UNIX, SOCK_STREAM, 0); if (sock \u003c 0){ ERR_EXIT(\"socket fail\"); } // init struct sockaddr_un servaddr; memset(\u0026servaddr, 0, sizeof(servaddr)); servaddr.sun_family = AF_UNIX; strcpy(servaddr.sun_path, \"/tmp/test_socket\"); int ret; ret = connect(sock, (struct sockaddr*)\u0026servaddr, sizeof(servaddr)); if (ret \u003c 0){ ERR_EXIT(\"connect fail\"); } echo_client(sock); return 0; } UNIX注意点 bind成功将会创建一个文件,权限为0777\u0026~umask sun path最好用一个绝对路径:一般放在/tmp/路径下 UNIX域协议支持流式套接口(粘包问题)与报式套接口 UNIX域流式套接字connect发现监听队列满时,会立刻返回一个ECONNREFUSED。 socketpair 功能:创建一个全双工的流管道 原型:int socketpair(int domain, int type, int protocol, int sv[2]); 参数:domain:协议家族;type:套接字类型;protocol:协议类型;sv:返回套接字对 返回值:成功:0;失败:-1 int main(){ int sockfds[2]; if (socketpair(PF_UNIX, SOCK_STREAM, 0, sockfds) \u003c 0){ ERR_EXIT(\"socketpair\"); } pid_t pid; pid = fork(); if (pid == -1){ ERR_EXIT(\"fork fail\"); } if (pid \u003e 0){ // 父进程 int val = 0; close(sockfds[1]); while(1){ ++val; printf(\"parent process sending data : %d\\n\", val); write(sockfds[0], \u0026val, sizeof(val)); // 本机通信,不转网络字节序 read(sockfds[0], \u0026val, sizeof(val)); printf(\"parent process received data : %d\\n\", val); sleep(1); } } else if (pid == 0){ int val = 0; close(sockfds[0]); while(1){ read(sockfds[1], \u0026val, sizeof(val)); //printf(\"subprocess received data : %d\\n\", val); ++val; write(sockfds[1], \u0026val, sizeof(val)); //printf(\"subprocess sending data : %d\\n\", val); } } return 0; } sendmsg和recvmsg struct iovec { /* Scatter/gather array items */ void *iov_base; /* Starting address */ size_t iov_len; /* Number of bytes to transfer */ }; struct msghdr { void *msg_name; /* optional address */ socklen_t msg_namelen; /* size of address */ struct iovec *msg_iov; /* scatter/gather array */ size_t msg_iovlen; /* # elements in msg_iov */ void *msg_control; /* ancillary data, see below */ size_t msg_controllen; /* ancillary data buffer len */ int msg_flags; /* flags on received message */ }; // msg_control struct cmsghdr { size_t cmsg_len; /* Data byte count, including header (type is socklen_t in POSIX) */ int cmsg_level; /* Originating protocol */ int cmsg_type; /* Protocol-specific type */ /* followed by unsigned char cmsg_data[]; */ }; sendmsg 功能:通过socket发送消息的系统调用 原型:ssize_t sendmsg(int sockfd, const struct msghdr *msg, int flags); 参数:sockfd:socket文件描述符;mag:需要发送的消息内容和相关元数据信息;flags:标志位参数,用于控制消息","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:9:7","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"进程间通信 顺序和并发 顺序程序:顺序性、封闭性(运行环境)、确定性、可再现性 并发程序:共享性、并发性、随机性 互斥和同步:信号量实现 进程互斥:矛盾 进程同步:协作 进程间通信目的 数据传输 资源共享 通知事件 进程控制 进程间通信分类 文件、文件锁、管道pipe和命名管道FIFO、信号 signal、消息队列、共享内存、信号量、互斥量、条件变量、读写锁、套接字socket ","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:10:0","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"死锁 死锁产生的必要条件 互斥条件:进程对资源进行排他性使用,即在一段时间内某资源仅为一个进程所占用。 请求和保持条件:当进程因请求资源而阻塞时,对已获得的资源保持不放。 不可剥夺条件:进程已获得的资源在未使用之前不能被剥夺,只能在使用完时由自己释放。 环路等待条件:各个进程组成封闭的环形链,每个进程都等待下一个进程所占用的资源 防止死锁办法 资源一次性分配:破坏请求和保持条件 可剥夺资源:破坏不可剥夺条件 资源有序分配:破坏环路等待条件 死锁避免 银行家算法 ","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:10:1","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"信号量 互斥:P、V在同一个进程中 同步:P、V在不同进程中 信号量值S S \u003e 0:S表示可用资源个数 S = 0:表示无可用资源,无等待进程 S \u003c 0:|S|表示等待队列中进程个数 less /usr/include/sys/sem.h查看semaphore struct semaphore{ int value; pointer_PCB queue; } PV原语 P(s){ s.value = s.value--; if (s.value \u003c 0){ 该进程状态置为等待状态, 该进程的PCB插入相应的等待队列s.queue末尾 } } V(s){ s.value = s.value++; if (s.value \u003c= 0){ 唤醒相应等待队列s.queue中等待的一个进程, 改变其状态为就绪态, 并将其插入就绪队列 } } ","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:10:2","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"System V System V 消息队列 消息队列 每个消息的最大长度有上限(MSGMAX),每个消息队列的总字节数是有上限的(MSGMNB),系统上消息队列的总数也有上限(MSGMNI)。 cat /proc/sys/kernel/msgmax 消息队列数据结构 ipc_perm : IPC对象数据结构 man 2 msgctl查看 struct ipc_perm { key_t __key; /* Key supplied to msgget(2) */ uid_t uid; /* Effective UID of owner */ gid_t gid; /* Effective GID of owner */ uid_t cuid; /* Effective UID of creator */ gid_t cgid; /* Effective GID of creator */ unsigned short mode; /* Permissions */ unsigned short __seq; /* Sequence number */ }; struct msqid_ds { struct ipc_perm msg_perm; /* Ownership and permissions */ time_t msg_stime; /* Time of last msgsnd(2) */ time_t msg_rtime; /* Time of last msgrcv(2) */ time_t msg_ctime; /* Time of last change */ unsigned long __msg_cbytes; /* Current number of bytes in queue (nonstandard) */ msgqnum_t msg_qnum; /* Current number of messages in queue */ msglen_t msg_qbytes; /* Maximum number of bytes allowed in queue */ pid_t msg_lspid; /* PID of last msgsnd(2) */ pid_t msg_lrpid; /* PID of last msgrcv(2) */ }; 消息队列函数 ipcrm -q msqid:删除消息队列 ipcs查看 msgget 功能:用来创建和访问一个消息队列 函数原型:int msgget(key_t key, int msgflg); 参数:key:某个消息队列的名字;msgflg:由9个权限标志构成,和mode一样 返回值:成功:消息队列的标识码;失败:-1. int main(int argc, char *args[]){ int msgid; msgid = msgget(1234, 0666 | IPC_CREAT); // msgid = msgget(1234, 0666 | IPC_CREAT | IPC_EXCL); // msgid = msgget(IPC_PRIVATE, 0666 | IPC_CREAT | IPC_EXCL); // msgid = msgget(IPC_PRIVATE, 0666); // msgid = msgget(IPC_PRIVATE, 0); if (msgid == -1){ ERR_EXIT(\"msg_get error\"); } printf(\"msgget success\\n\"); } msgctl 功能:消息队列的控制函数 函数原型:int msgctl(int msqid, int cmd, struct msqid_ds *buf); 参数:msqid:由msgget函数返回的消息队列标识码;cmd:采取的动作(IPC_STAT、IPC_SET、IPC_RMID);buf:动作所需要传递的参数 返回值:成功:0;失败:-1 int main(int argc, char *args[]){ int msgid; msgid = msgget(1234, 0666 | IPC_CREAT); if (msgid == -1){ ERR_EXIT(\"msg_get error\"); } printf(\"msgget success\\n\"); printf(\"msgid = %d\\n\", msgid); // msgctl(msgid, IPC_RMID, NULL); // 删除消息队列 /* struct msqid_ds buf; msgctl(msgid, IPC_STAT, \u0026buf); // 获取消息队列状态 printf(\"mode = %o, bytes = %ld, number = %d, msgmnb = %d\\n\", buf.msg_perm.mode, buf.__msg_cbytes, (int)buf.msg_qnum, (int)buf.msg_qbytes); */ struct msqid_ds buf; msgctl(msgid, IPC_STAT, \u0026buf); printf(\"original msg_perm.mode: %o\\n\", buf.msg_perm.mode); sscanf(\"600\", \"%ho\", \u0026buf.msg_perm.mode); msgctl(msgid, IPC_SET, \u0026buf);// 修改消息队列状态 printf(\"new msg_perm.mode: %o\\n\", buf.msg_perm.mode); return 0; } msgsnd 功能:把一条消息添加到消息队列中 函数原型:int msgsnd(int msqid, const void *msgp, size_t msgsz, int msgflg); 参数: msqid:由msgget函数返回的消息队列标识码; msgp:指针,指针指向准备发送的信息; msgsz:是msgp指向的消息长度,这个长度不含保存消息类型的long int长整型; msgflg:控制当前消息队列满或系统上限时将要发生的事 IPC_NOWAIT表示队列满不等待,返回EAGAIN错误。 返回值:成功:0;失败:-1. int main(int argc, char *argv[]){ if (argc != 3){ fprintf(stderr,\"Usage: %s \u003cbytes\u003e \u003ctype\u003e\\n\", argv[0]); } int len = atoi(argv[1]); int type = atoi(argv[2]); int msgid = msgget(1234, 0); if (msgid == -1){ ERR_EXIT(\"msgget error\"); } struct msgbuf *ptr; ptr = (struct msgbuf*)malloc(sizeof(long) + len); ptr-\u003emtype = type; if (msgsnd(msgid, ptr, len, 0) \u003c 0){ ERR_EXIT(\"msgsnd error\"); } return 0; } msgrcv 功能:从一个消息队列接收消息 函数原型:ssize_t msgrcv(int msqid, void *msgp, size_t msgsz, long msgtyp, int msgflg); 参数: msqid:由msgget函数返回的消息队列标识码; msgp:指针,指针指向准备接收的信息; msgsz:是msgp指向的消息长度,这个长度不含保存消息类型的long int长整型; msgtype:实现接收优先级的简单形式 msgtype=0:返回队列第一条信息 msgtype\u003e0:返回队列第一条类型等于msgtype的消息 msgtype\u003c 0 :返回队列第一条类型小于等于msgtype绝对值的消息 msgtype\u003e0且msgflg=MSC_EXCEPT,接收类型不等于msgtype的第一条消息。 msgflg:控制当队列中没有相应类型的消息可供接收时要发生的事 msgflg=IPC_NOWAIT,队列没有可读消息不等待,返回ENOMSG错误 msgflg=MSG_NOERROR,消息大小超过msgszl时被截断 返回值:成功:接收缓冲区的字符个数;失败:-1。 struct msgbuf { long mtype; /* message type, must be \u003e 0 */ char mtext[1]; /* message data */ }; #define MSGMAX 8192 int main(int argc, char *argv[]){ int flag = 0; int type = 0; int opt; while(1){ opt = getopt(argc, argv, \"nt:\"); if (opt == '?'){ exit(EXIT_FAILURE); } if (opt == -1){ break; } switch(opt){ case 'n': flag |= IPC_NOWAIT; break; case 't': type = atoi(optarg); } } int msgid = msgget(1234, 0); if (msgid == -1){ ERR_EXIT","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:10:3","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"POSIX 消息队列,共享内存,信号量,互斥锁,条件变量,读写锁,自旋锁,文件锁 消息队列 需要链接-lrt 使用查看 man 7 mq_overview,查看消息队列 mkdir /dev/mqueue mount -t mqueue none /dev/mqueue mq_open 功能:用来创建和访问一个消息队列 原型: mqd_t mq_open(const char *name, int oflag); mqd_t mq_open(const char *name, int oflag, mode_t mode, struct mq_attr *attr); 参数: name:某个消息队列的名字 oflag:和open函数类似O_RDONLY、O_WRONLY、O_RDWR、O_CREAT、O_EXCL、O_NONBLOCK mode:如果指定了O_CREAT,需要设置mode。 attr:指定消息队列属性 返回值:成功:消息队列文件描述符;失败:-1 mq_close 功能:关闭消息队列 原型:int mq_close(mqd_t mqdes); 参数:mqdes:消息队列描述符 返回值:成功:0;失败:-1 int main(int argc, char *argv[]) { mqd_t mqid; mqid = mq_open(\"/abc\", O_CREAT | O_RDWR, 0666, NULL); if (mqid == (mqd_t)-1){ ERR_EXIT(\"mq_open\"); } printf(\"mq_open success\\n\"); mq_close(mqid); return 0; } mq_unlink 功能:删除消息队列 原型:int mq_unlink(const char *name); 参数:name:消息队列的名字 返回值:成功:0;失败:-1 mq_getattr/mq_setattr 功能:获取/设置消息队列属性 原型: int mq_getattr(mqd_t mqdes, struct mq_attr *attr); int mq_setattr(mqd_t mqdes, struct mq_attr *newattr, struct mq_attr *oldattr); struct mq_attr { long mq_flags; /* Flags: 0 or O_NONBLOCK */ long mq_maxmsg; /* Max. # of messages on queue */ long mq_msgsize; /* Max. message size (bytes) */ long mq_curmsgs; /* # of messages currently in queue */ }; 返回值:成功:0;失败:-1 mq_send 功能:发送消息 原型:int mq_send(mqd_t mqdes, const char *msg_ptr, size_t msg_len, unsigned msg_prio); 参数:mqdes:消息队列描述符;msg_ptr:指向消息的指针;msg_len:消息长度;msg_prio:消息优先级 返回值:成功:0;失败:-1 mq_receive 功能:接收消息 原型:ssize_t mq_receive(mqd_t mqdes, char *msg_ptr, size_t msg_len, unsigned *msg_prio); 参数:mqdes:消息队列描述符;msg_ptr:返回接收到的消息;msg_len:消息长度;msg_prio:消息优先级 返回值:成功:接收的消息字节数;失败:-1 int main(int argc, char *argv[]) { mqd_t mqid; mqid = mq_open(\"/abc\", O_RDONLY); if (mqid == (mqd_t)-1){ ERR_EXIT(\"mq_open\"); } STU stu; unsigned prio; ssize_t result; struct mq_attr attr; mq_getattr(mqid, \u0026attr); size_t size = attr.mq_msgsize; // 每条消息的最大长度值 result = mq_receive(mqid, (char*)\u0026stu, size, \u0026prio); if (result == -1){ ERR_EXIT(\"mq_receive\"); } printf(\"receive bytes %ld\\n\", result); printf(\"name = %s age = %d prio = %u\\n\", stu.name, stu.age, prio); mq_close(mqid); return 0; } mq_notify 功能:建立或者删除消息达到通知事件 原型:int mq_notify(mqd_t mqdes, const struct sigevent *sevp); 参数:mqdes:消息队列描述符;sevp:非空表示当消息到达且消息队列先前为空,将得到通知;NULL表示撤销已注册的通知 返回值:成功:0;失败:-1 通知方式 产生一个信号 创建一个线程执行一个指定的函数 mqd_t mqid; size_t size; struct sigevent sigev; void handle_signusr1(int sig){ mq_notify(mqid, \u0026sigev); STU stu; unsigned prio; ssize_t result; result = mq_receive(mqid, (char*)\u0026stu, size, \u0026prio); if (result == -1){ ERR_EXIT(\"mq_receive\"); } printf(\"name = %s age = %d prio = %u\\n\", stu.name, stu.age, prio); } int main(int argc, char *argv[]) { mqid = mq_open(\"/abc\", O_RDONLY); if (mqid == (mqd_t)-1){ ERR_EXIT(\"mq_open\"); } struct mq_attr attr; if (mq_getattr(mqid, \u0026attr) == -1) { ERR_EXIT(\"mq_getattr\"); } size = attr.mq_msgsize; // 每条消息的最大长度值 signal(SIGUSR1, handle_signusr1); sigev.sigev_notify = SIGEV_SIGNAL; sigev.sigev_signo = SIGUSR1; mq_notify(mqid, \u0026sigev); for(;;){ pause(); } mq_close(mqid); return 0; } 共享内存 查看 /dev/shm shm_open 功能:用来创建和打开一个共享内存对象 原型:int shm_open(const char *name, int oflag, mode_t mode); 参数: name:共享内存对象的名字 oflag:和open函数类似O_RDONLY、O_WRONLY、O_RDWR、O_CREAT、O_EXCL、O_NONBLOCK mode:如果没有指定了O_CREAT,可以指定为0 返回值:成功:消息队列文件描述符;失败:-1 ftruncate 功能:修改共享内存对象大小 原型:int ftruncate(int fd, off_t length); 参数:fd:文件描述符;length:长度 返回值:成功:0;失败:-1 fstat 功能:获取共享内存对象信息 原型:int fstat(int fd, struct stat *buf); 参数:fd:文件描述符;buf:返回共享内存状态 返回值:成功:0;失败:-1 int main(int argc, char *argv[]) { int shmid; shmid = shm_open(\"/xyz\", O_CREAT | O_RDWR, 0666); if (shmid == -1){ ERR_EXIT(\"shm_open\"); } printf(\"shm_open success\\n\"); if (ftruncate(shmid, sizeof(STU)) == -1){ ERR_EXIT(\"ftruncate\"); } struct stat buf; if (fstat(shmid,\u0026buf) == -1){ ERR_EXIT(\"fstat\"); } printf(\"size = %ld, mode = %o\\n\", buf.st_size, buf.st_mode \u0026 0777); // umask close(shmid); return 0; } shm_unlink 功能:删除共享内存对象 原型:int shm_unlink(const char *name); 参数:name:共享内存对象的名字 返回值:成功:0;失败:-1 mmap // write int main(int argc, char *argv[]) { int shmid; shmid = shm_op","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:10:4","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"线程 进程是资源竞争的基本单位;线程是程序运行的最小单位 线程共享进程数据,但也拥有自己的一部分数据 线程ID;一组寄存器;栈;errno;信号状态;优先级 线程优点:代价小;占用资源少;可充分利用多处理器并行数量;可同时等待不同的I/O操作。 线程优点:性能损失(增加额外的同步和调度开销而可用资源不变);健壮性降低(线程之间缺乏保护);缺乏访问控制; 线程模型 N:1用户线程模型 1:1核心线程模型 N:Mh混合线程模型 ","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:11:0","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"POSIX线程 链接 -lpthread pthread_create 功能:创建一个新的线程 原型: int pthread_create(pthread_t *thread, const pthread_attr_t *attr, void *(*start_routine) (void *), void *arg); 参数: thread:返回线程ID; attr:设置线程的属性,attr为NULL表示使用默认属性; start_routine:是个函数地址,线程启动后要执行的函数; arg:传给线程启动函数的参数 返回值:成功:0;失败:错误码 pthread_exit 功能:线程终止 原型:void pthread_exit(void *retval) 参数:retval:不要指向一个局部变量 pthread_join 功能:等待线程结束 原型:int pthread_join(pthread_t thread, void **retval); 参数:thread:线程ID;retval:指向一个指针 返回值:成功:0;失败:错误码 void* thread_routine(void *arg){ for (int i = 0; i \u003c 20; i++){ printf(\"B\"); fflush(stdout); usleep(20); if(i == 3){ pthread_exit(\"ABC\"); } } return 0; } int main(){ pthread_t tid; int ret; ret = pthread_create(\u0026tid, NULL, thread_routine, NULL); if (ret != 0){ fprintf(stderr, \"pthread_create: %s\\n\", strerror(ret)); exit(EXIT_FAILURE); } for (int i = 0; i \u003c 20; i++){ printf(\"A\"); fflush(stdout); usleep(20); } void *value; if (pthread_join(tid, \u0026value) != 0){ fprintf(stderr, \"pthread_join: %s\\n\", strerror(ret)); exit(EXIT_FAILURE); } printf(\"\\n\"); printf(\"return msg = %s\\n\", (char*)value); return 0; } pthread_self 功能:返回线程id 原型:pthread_t pthread_self(void); 返回值:成功:0 pthread_cancel 功能:取消一个执行中的线程 原型:int pthread_cancel(pthread_t thread); 返回值:成功:0;失败:错误码 回射客户/服务器 进程改线程 // server void* thread_routine(void *arg){ pthread_detach(pthread_self()); int conn = *((int*)arg); free(arg); echo_service(conn); printf(\"exiting thread ... \\n\"); return NULL; } int main() { int listenfd; // listenfd = socket(PF_INET, SOCK_STREAM, 0); listenfd = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP); // 指定TCP if (listenfd \u003c 0){ ERR_EXIT(\"socket fail\"); } // init struct sockaddr_in servaddr; memset(\u0026servaddr, 0, sizeof(servaddr)); servaddr.sin_family = AF_INET; servaddr.sin_port = htons(5188); servaddr.sin_addr.s_addr = htonl(INADDR_ANY); int on = 1; // 在TIME_WAIT还没消失的情况,允许服务器重启 if (setsockopt(listenfd, SOL_SOCKET, SO_REUSEADDR, \u0026on, sizeof(on)) \u003c 0){ ERR_EXIT(\"setsocketopt\"); } if (bind(listenfd, (struct sockaddr*)\u0026servaddr, sizeof(servaddr)) \u003c 0){ ERR_EXIT(\"bind fail\"); } if (listen(listenfd, SOMAXCONN) \u003c 0){ ERR_EXIT(\"listen fail\"); } struct sockaddr_in peeraddr; socklen_t peerlen = sizeof(peeraddr); int conn; // pid_t pid; while(1){ conn = accept(listenfd, (struct sockaddr*)\u0026peeraddr, \u0026peerlen); if (conn \u003c 0){ ERR_EXIT(\"accept fail\"); } printf(\"ip = %s port = %d\\n\", inet_ntoa(peeraddr.sin_addr), ntohs(peeraddr.sin_port)); int ret; pthread_t tid; int *p = malloc(sizeof(int)); *p = conn; // ret = pthread_create(\u0026tid, NULL, thread_routine, (void*)conn); ret = pthread_create(\u0026tid, NULL, thread_routine, p); // 可移植 if (ret != 0){ fprintf(stderr, \"pthread_create:%s\\n\", strerror(ret)); exit(EXIT_FAILURE); } } return 0; } 线程属性 初始化与销毁 int pthread_attr_init(pthread_attr_t *attr); int pthread_attr_destroy(pthread_attr_t *attr); 获取与设置分离 int pthread_attr_setdetachstate(pthread_attr_t *attr, int detachstate); int pthread_attr_getdetachstate(pthread_attr_t *attr, int *detachstate); 获取与设置栈大小 int pthread_attr_setstacksize(pthread_attr_t *attr, size_t stacksize); int pthread_attr_getstacksize(pthread_attr_t *attr, size_t *stacksize); 获取与设置栈溢出保护区大小 int pthread_attr_setguardsize(pthread_attr_t *attr, size_t guardsize); int pthread_attr_getguardsize(pthread_attr_t *attr, size_t *guardsize); 获取与设置线程竞争范围 int pthread_attr_setscope(pthread_attr_t *attr, int scope); int pthread_attr_getscope(pthread_attr_t *attr, int *scope); 获取与设置调度策略 int pthread_attr_setschedpolicy(pthread_attr_t *attr, int policy); int pthread_attr_getschedpolicy(pthread_attr_t *attr, int *policy); 获取与设置继承的调度策略 int pthread_attr_setinheritsched(pthread_attr_t *attr, int inheritsched); int pthread_attr_getinheritsched(pthread_attr_t *attr, int *inheritsched); 获取与设置调度参数 int pthread_attr_setschedparam(pthread_attr_t *attr, const struct sched_param *param); int pthread_attr_getschedparam(pthread_attr_t *attr, struct sched_param *param);并发级别 并发级别:获取与设置并发级别 int pthread_setconcurrency(int new_level); int pthread_getconcurrency(void); 线程特定数据(TSD) int pthread","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:11:1","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"POSIX信号量 sem_open sem_close sem_unlink sem_init sem_destroy sem_wait sem_post ","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:11:2","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"POSIX锁 互斥锁 pthread_mutex_init pthread_mutex_lock pthread_mutex_unlock pthread_mutex_destroy 自旋锁 自旋锁与互斥锁很重要的一个区别在于,线程在申请自旋锁的时候,线程不会被挂起,它处于忙等待的状态。 pthread_spin_init pthread_spin_lock pthread_spin_unlock pthread_spin_destroy 读写锁 只要没有线程持有给定的读写锁用于写,那么任意数目的线程可以持有读写锁用于读 仅当没有线程持有某个给定的读写锁用于读或用于写时,才能分配读写锁用于写 读写锁用于读称为共享锁,读写锁用于写称为排它锁 pthread_rwlock_init pthread_rwlock_destroy int pthread_rwlock_rdlock int pthread_rwlock_wrlock int pthread_rwlock_unlock ","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:11:3","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"生产者消费者模型实践 #define CONSUMERS_COUNT 1 // 消费者 #define PRODUCERS_COUNT 5 // 生产者 #define BUFFSIZE 10 int g_buffer[BUFFSIZE]; // 缓冲区 unsigned short in = 0; // 生产位置 unsigned short out = 0; // 消费位置 unsigned short produce_id = 0; // 当前生产的产品位置 unsigned short consume_id = 0; // 当前消费的产品位置 sem_t g_sem_full; sem_t g_sem_empty; pthread_mutex_t g_mutex; pthread_t g_thread[CONSUMERS_COUNT + PRODUCERS_COUNT]; void* consume(void *arg){ int num = (int)arg; while(1){ printf(\"%d wait buffer not empty\\n\", num); sem_wait(\u0026g_sem_empty); pthread_mutex_lock(\u0026g_mutex); // 打印信息 for (int i = 0; i \u003c BUFFSIZE; i++){ printf(\"%02d \", i); if (g_buffer[i] == -1){ printf(\"%s\", \"null\"); } else { printf(\"%d\", g_buffer[i]); } if (i == out){ printf(\"\\t\u003c--consume\"); } printf(\"\\n\"); } consume_id = g_buffer[out]; printf(\"%d begin consume product %d\\n\", num, consume_id); g_buffer[out] = -1; out = ( out + 1) % BUFFSIZE; printf(\"%d end consume product %d\\n\", num, consume_id); pthread_mutex_unlock(\u0026g_mutex); sem_post(\u0026g_sem_full); sleep(5); } return NULL; } void* produce(void *arg){ int num = (int)arg; while(1){ printf(\"%d wait buffer not full\\n\", num); sem_wait(\u0026g_sem_full); pthread_mutex_lock(\u0026g_mutex); for (int i = 0; i \u003c BUFFSIZE; i++){ printf(\"%02d \", i); if (g_buffer[i] == -1){ printf(\"%s\", \"null\"); } else { printf(\"%d\", g_buffer[i]); } if (i == in){ printf(\"\\t\u003c--produce\"); } printf(\"\\n\"); } printf(\"%d begin produce product %d\\n\", num, produce_id); g_buffer[in] = produce_id; in = ( in + 1) % BUFFSIZE; printf(\"%d end produce product %d\\n\", num, produce_id++); pthread_mutex_unlock(\u0026g_mutex); sem_post(\u0026g_sem_empty); sleep(1); } return NULL; } int main(){ sem_init(\u0026g_sem_full, 0, BUFFSIZE); sem_init(\u0026g_sem_empty, 0, 0); pthread_mutex_init(\u0026g_mutex, NULL); int i; for (i = 0; i \u003c BUFFSIZE; i++){ g_buffer[i] = -1; } for (i = 0; i \u003c CONSUMERS_COUNT; i++){ pthread_create(\u0026g_thread[i], NULL, consume, (void*)i); } for (i = 0; i \u003c PRODUCERS_COUNT; i++){ pthread_create(\u0026g_thread[CONSUMERS_COUNT + i], NULL, produce, (void*)i); } for (i = 0; i \u003c CONSUMERS_COUNT + PRODUCERS_COUNT; i++){ pthread_join(g_thread[i], NULL); } sem_destroy(\u0026g_sem_full); sem_destroy(\u0026g_sem_empty); pthread_mutex_destroy(\u0026g_mutex); return 0; } ","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:11:4","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"POSIX条件变量 int pthread_cond_init(pthread_cond_t *cond,pthread_condattr_t *cond_attr); int pthread_cond_wait(pthread_cond_t *cond,pthread_mutex_t *mutex); int pthread_cond_timewait(pthread_cond_t *cond,pthread_mutex *mutex,const timespec *abstime); int pthread_cond_destroy(pthread_cond_t *cond); int pthread_cond_signal(pthread_cond_t *cond); int pthread_cond_broadcast(pthread_cond_t *cond); //向所有等待线程发起通知 使用规范 等待条件变量代码 pthread_mutex_lock(\u0026mutex); while (条件为假) pthread_cond_wait(\u0026cond, \u0026mutex); 修改条件 pthread_mutex_unlock(\u0026mutex); pthread_cond_wait(cond, mutex) 对mutex进行解锁; 等待条件,直到有线程向他发起通知 重新对mutex进行加锁操作 为什么用while? pthread_cond_wait会产生信号,有两种情况, 一种是pthread_cond_wait会自动重启,好像这个信号没有发生一样; 第二种pthread_cond_wait可能会被虚假唤醒,因此还需要重新判断。 给条件信号发送信号代码 pthread_mutex_lock(\u0026mutex); while (条件为真); pthread_cond_signal(\u0026cond); 修改条件 pthread_mutex_unlock(\u0026mutex); pthread_cond_signal(\u0026cond) 向第一个等待条件的线程发起通知,如果没有任何一个线程处于等待条件的状态,这个通知将被忽略。 #define CONSUMERS_COUNT 1 // 消费者 #define PRODUCERS_COUNT 4 // 生产者 pthread_cond_t g_cond; pthread_mutex_t g_mutex; pthread_t g_thread[CONSUMERS_COUNT + PRODUCERS_COUNT]; int nready = 0; // 当前缓冲区产品个数 void* consume(void *arg) { int num = (int)arg; while(1) { pthread_mutex_lock(\u0026g_mutex); while(nready == 0) { printf(\"%d begin wait a contition ...\\n\", num); pthread_cond_wait(\u0026g_cond, \u0026g_mutex); } printf(\"%d end wait a condtion...\\n\", num); printf(\"%d begin consume product\\n\", num); --nready; printf(\"%d end consume product\\n\", num); pthread_mutex_unlock(\u0026g_mutex); sleep(1); } return NULL; } void* produce(void *arg) { int num = (int)arg; while(1) { pthread_mutex_lock(\u0026g_mutex); printf(\"%d begin produce product\\n\", num); ++nready; printf(\"%d end produce product\\n\", num); printf(\"%d signal ....\\n\", num); pthread_cond_signal(\u0026g_cond); pthread_mutex_unlock(\u0026g_mutex); sleep(1); } return NULL; } int main(){ pthread_cond_init(\u0026g_cond,NULL); pthread_mutex_init(\u0026g_mutex, NULL); int i; for (i = 0; i \u003c CONSUMERS_COUNT; i++){ pthread_create(\u0026g_thread[i], NULL, consume, (void*)i); } sleep(1); for (i = 0; i \u003c PRODUCERS_COUNT; i++){ pthread_create(\u0026g_thread[CONSUMERS_COUNT + i], NULL, produce, (void*)i); } for (i = 0; i \u003c CONSUMERS_COUNT + PRODUCERS_COUNT; i++){ pthread_join(g_thread[i], NULL); } pthread_mutex_destroy(\u0026g_mutex); pthread_cond_destroy(\u0026g_cond); return 0; } ","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:11:5","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"简单线程池 用于执行大量相对短暂的任务 当任务增加的时候能够动态的增加线程池中线程的数量直到达到一个阈值。 当任务执行完毕的时候,能够动态的销毁线程池中的线程 该线程池的实现本质上也是生产者与消费模型的应用。生产者线程向任务队列中添加任务,一旦队列有任务到来,如果有等待线程就唤醒来执行任务,如果没有等待线程并且线程数没有达到阈值,就创建新线程来执行任务。 计算密集型任务:线程个数 = CPU个数 I/O密集型任务: 线程个数 \u003e CPU个数 //任务结构体,将任务放入队列,由线程池中的线程来执行 typedef struct task { void *(*run)(void *arg); // 任务回调函数 void *arg; // 回调函数参数 struct task *next; } task_t; // 线程池结构体 typedef struct threadpool { condition_t ready; // 任务准备就绪或者线程池销毁通知 task_t *first; // 任务队列头指针 task_t *last; // 任务队列尾指针 int counter; // 线程池中当前线程数 int idle; // 线程池中当前正在等待任务的线程数 int max_threads; // 线程池中最大允许的线程数 int quit; // 销毁线程池的时候置1 } threadpool_t; // 初始化线程池 void threadpool_init(threadpool_t *pool, int threads); // 往线程池中添加任务 void threadpool_add_task(threadpool_t *pool, void *(*run)(void *arg), void *arg); // 销毁线程池 void threadpool_destroy(threadpool_t *pool); ","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:11:6","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"miniftpd实践 ","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:12:0","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"参考阅读 Linux-UNIX系统编程手册(上、下册) (Michael Kerrisk) Linux系统编程-bilibili up对应课件 [Linux系统编程/网络编程] 笔记目录 关于Linux的系统编程总结 linux系统编程 CSDN linux网络编程_chmy1992的博客-CSDN博客 ","date":"2023-08-20","objectID":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/:13:0","tags":["Computer","Linux"],"title":"Linx系统编程","uri":"/posts/computer/linx%E7%B3%BB%E7%BB%9F%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"SQLite","date":"2023-07-20","objectID":"/posts/computer/sqlite/","tags":["Computer","C++","Linux"],"title":"SQLite","uri":"/posts/computer/sqlite/"},{"categories":["Computer"],"content":"SQLite ","date":"2023-07-20","objectID":"/posts/computer/sqlite/:1:0","tags":["Computer","C++","Linux"],"title":"SQLite","uri":"/posts/computer/sqlite/"},{"categories":["Computer"],"content":"基本流程 //打开数据库 sqlite3 *sqldb = NULL; int res = sqlite3_open(db_file_name.c_str(), \u0026sqldb); if (res != SQLITE_OK) { fprintf(stderr, \"Can not open database: %s\\n\", sqlite3_errmsg(sqldb)); return false; } // 关闭数据库 sqlite3_close(sqldb); ","date":"2023-07-20","objectID":"/posts/computer/sqlite/:2:0","tags":["Computer","C++","Linux"],"title":"SQLite","uri":"/posts/computer/sqlite/"},{"categories":["Computer"],"content":"常用语句 SELECT column1, column2, columnN FROM table_name; --从表 table_name 中选择column1, column2, columnN列 SELECT * FROM table_name; --选择所有列 SELECT * FROM COMPANY WHERE AGE IN ( 25, 27 ); --列出了 AGE 的值为 25 或 27 的所有记录 ID NAME AGE ADDRESS SALARY ---------- ---------- ---------- ---------- ---------- 2 Allen 25 Texas 15000.0 4 Mark 25 Rich-Mond 65000.0 5 David 27 Texas 85000.0 --直接看菜鸟教程吧,很详细 ","date":"2023-07-20","objectID":"/posts/computer/sqlite/:3:0","tags":["Computer","C++","Linux"],"title":"SQLite","uri":"/posts/computer/sqlite/"},{"categories":["Computer"],"content":"常用函数 sqlite3_exec(),称为便捷函数,封装了好多任务。 int sqlite3_exec(sqlite3*, const char* sql, sqlite_callback, void* data, char** errmmsg); sqlite3* 表示指向数据库的指针; sql 为执行的sql语句; callback回调函数 typedef int (*sqlite3_callback) (void*, int, char**,char**); void *是为sqlite3_exec()第四个参数提供的数据, int代表列的数目, char** 一个指向指针数组的指针,每个指针指向一个结果行中的列数据, char**一个指向指针数组的指针,其中每个指针指向一个结果行中的列名。 声明的回调函数,如果这个函数是类成员函数,把它声明成static C++成员函数实际上隐藏了一个参数:this,C++调用类的成员函数的时候,隐含把类指针当成函数的第一个参数传递进去。 结果,这造成跟前面说的 sqlite 回调函数的参数不相符。只有当把成员函数声明成static时,它才没有多余的隐含的this参数。 void *data为回调函数的第一个参数指向提供给回调函数的应用程序特定的数据,也是回调函数的第一个参数; errmsg 为错误信息,是指向错误消息字符串的指针 。sqlite_exec() 有两个错误消息来源,返回值和可读的字符串errmsg。 sqlite3_prepare_v2() int sqlite3_prepare_v2(sqlite3* db, const char* sql, int sql_len, sqlite3_stmt** stmt, const char** tail); sqlite3* 表示指向数据库的指针; sql 为执行的sql语句; sql_len表示SQL 语句的长度。如果设置为 -1,则 SQLite 将自动计算字符串的长度 stmt用于存储编译后的语句对象 tail用于存储 SQL 语句中未使用的部分 sqlite3_get_table() int sqlite3_get_table(sqlite3*, const char *sql, char ***resultp, int *nrow, int *ncolumn, char **errmsg ); resultp表示查询结果,一维数组 nrow表示查询记录(行数) ncolumn表示列数 ","date":"2023-07-20","objectID":"/posts/computer/sqlite/:4:0","tags":["Computer","C++","Linux"],"title":"SQLite","uri":"/posts/computer/sqlite/"},{"categories":["Computer"],"content":"SQL 必知必会 file-\u003eOpen DataBase URI…选择书籍sqlite数据库下载的.sqlite文件。 各种语句–直接看菜鸟教程吧,很详细 ","date":"2023-07-20","objectID":"/posts/computer/sqlite/:5:0","tags":["Computer","C++","Linux"],"title":"SQLite","uri":"/posts/computer/sqlite/"},{"categories":["Computer"],"content":"检索 SELECT:从 SQLite 数据库表中获取数据,以结果表的形式返回数据 SELECT column1, column2, columnN FROM table_name; SELECT * FROM table_name; --检索所有列 DISTINCT:只返回不同的值(检索去重)(作用于所有列,不仅仅是其后的那一列) SELECT DISTINCT column1, column2,.....columnN FROM table_name LIMIT:限制由 SELECT 语句返回的数据数量。 可搭配OFFSET SELECT column1, column2, columnN FROM table_name LIMIT [no of rows] OFFSET [row num] --LIMIT后为检索的行数,OFFSET后指从那行开始(检索行从0开始) ORDER BY:基于一个或多个列按升序或降序顺序排列数据。 ASC 默认值,从小到大,升序排列 DESC 从大到小,降序排列 (对每列都进行降序,需要对每一列指定DESC关键字) SELECT column-list FROM table_name [ORDER BY column1, column2, .. columnN] [ASC | DESC]; 挑战题答案 #lesson 2 --1 select cust_id from Customers --2 select distinct prod_id from OrderItems --3 /*select */ select cust_id from Customers #lesson 3 --1 select cust_name from Customers order by cust_name desc --2 select cust_id, order_num from Orders order by cust_id, order_date desc --3 select quantity, item_price from OrderItems order by quantity desc, item_price desc --4 多,少by select vend_name from Vendors order by vend_name desc ","date":"2023-07-20","objectID":"/posts/computer/sqlite/:6:0","tags":["Computer","C++","Linux"],"title":"SQLite","uri":"/posts/computer/sqlite/"},{"categories":["Computer"],"content":"过滤 WHERE:用于指定从一个表或多个表中获取数据的条件 (搜索、过滤条件,FROM子句之后) SELECT column1, column2, columnN FROM table_name WHERE [condition] 条件操作符 操作符 说明 操作符 说明 = 等于 \u003e 大于 \u003c\u003e 不等于 \u003e= 大于等于 != 不等于 !\u003e 不大于 \u003c 小于 BETWEEN [] AND [] 两值之间 \u003c= 小于等于 IS NULL 为NULL值 !\u003c 不小于 AND/OR:连接运算符 SELECT column1, column2, columnN FROM table_name WHERE [condition1] AND [condition2]...AND [conditionN]; ----------------------------------------------------- SELECT column1, column2, columnN FROM table_name WHERE [condition1] OR [condition2]...OR [conditionN] IN:用于把某个值与一系列指定列表的值进行比较。 SELECT * FROM COMPANY WHERE AGE IN ( 25, 27 ) --- 列出了 AGE 的值为 25 或 27 的所有记录 NOT:否定其后跟的条件 SELECT * FROM COMPANY WHERE AGE NOT IN ( 25, 27 ) --- 列出了 AGE 的值既不是 25 也不是 27 的所有记录 LIKE:匹配通配符指定模式的文本值 %:代表0,1或多个字符 _:代表单个字符 []:指定一个字符集 WHERE condition LIKE '[JM]%' 表示J或M开头 挑战题答案 #lesson 4 --1 select prod_id, prod_name from Products where prod_price=9.49 --2 select prod_id, prod_name from Products where prod_price\u003e=9 --3 select distinct order_num from OrderItems where quantity\u003e=100 --4 select prod_name, prod_price from Products where prod_price between 3 and 6 order by prod_price #lesson 5 --1 select vend_name from Vendors where vend_country = 'USA' and vend_state = 'CA' --2 select order_num, prod_id, quantity from OrderItems where prod_id in ('BR01', 'BR02', 'BR03') and quantity \u003e= 100 --3 lesson 4中第4题 select prod_name, prod_price from Products where (prod_price \u003e= 3 and prod_price \u003c= 6) order by prod_price --4 select vend_name from Vendors where vend_country = 'USA' and vend_state = 'CA' order by vend_name #lesson 6 --1 select prod_name, prod_desc from Products where prod_desc like '%toy%' --2 select prod_name, prod_desc from Products where not prod_desc like '%toy%' order by prod_name --3 select prod_name, prod_desc from Products where prod_desc like '%toy%' and prod_desc like '%carrots%' --4 select prod_name, prod_desc from Products where prod_desc like '%toy%carrots%' ","date":"2023-07-20","objectID":"/posts/computer/sqlite/:7:0","tags":["Computer","C++","Linux"],"title":"SQLite","uri":"/posts/computer/sqlite/"},{"categories":["Computer"],"content":"函数 AS:取别名;把表或列重命名 SELECT column_name AS alias_name FROM table_name WHERE [condition]; 运算:+-*/ 拼接 Concat()或者+或者|| 文本处理函数 函数 说明 LENGTH() 返回字符串长度 LOWER() 转换为小写 UPPER() 转化为大写 LEFT() 返回字符串左边的字符 RIGHT() 返回字符串右边的字符 LTRIM() 去掉字符串左边的空格 RTRIM() 去掉字符串右边的空格 SUBSTR() 提取字符串组成 SOUNDEX() 返回字符串的SOUNDEX值(发音类似) 日期时间处理函数 SQLite 日期 \u0026 时间 数值处理函数 ABS() COS() EXP() PI() SIN() TAN() SQRT() 聚集汇总函数 COUNT():计算一个数据库表中的行数。 MAX/MIN() :某列的最大值/最小值。 AVG():某列的平均值。 SUM() :某列计算总和。 挑战题答案 #lesson 7 --1 select vend_id, vend_name as vname, vend_address as vaddress, vend_city as vcity from Vendors order by vname --2 select prod_id, prod_price, prod_price*0.9 as sale_price from Products #lesson 8 --1 select cust_id, cust_name, upper(substr(cust_contact,1,2) || substr(cust_city,1,3)) as user_login from Customers --2 select order_num, order_date from Orders where (strftime('%Y', order_date)='2020' and strftime('%m', order_date)='01') order by order_date #lesson 9 --1 select sum(quantity) as items_ordered from OrderItems --2 select sum(quantity) as items_ordered from OrderItems where prod_id = 'BR01' --3 select max(prod_price) from Products where prod_price \u003c= 10 ","date":"2023-07-20","objectID":"/posts/computer/sqlite/:8:0","tags":["Computer","C++","Linux"],"title":"SQLite","uri":"/posts/computer/sqlite/"},{"categories":["Computer"],"content":"分组 ","date":"2023-07-20","objectID":"/posts/computer/sqlite/:9:0","tags":["Computer","C++","Linux"],"title":"SQLite","uri":"/posts/computer/sqlite/"},{"categories":["Computer"],"content":"直接使用 sqlite3 ./***.db sqlite\u003e .schema sqlite\u003e .exit ","date":"2023-07-20","objectID":"/posts/computer/sqlite/:10:0","tags":["Computer","C++","Linux"],"title":"SQLite","uri":"/posts/computer/sqlite/"},{"categories":["Computer"],"content":"拓展阅读 官方 SQLite中文网 菜鸟教程-SQLite SQLite3 API编程手册 玩转SQLite6:使用C语言来读写数据库 ","date":"2023-07-20","objectID":"/posts/computer/sqlite/:11:0","tags":["Computer","C++","Linux"],"title":"SQLite","uri":"/posts/computer/sqlite/"},{"categories":["C++"],"content":"CMake教程","date":"2023-07-15","objectID":"/posts/c++/cmake%E6%95%99%E7%A8%8B/","tags":["C++","工具"],"title":"CMake教程","uri":"/posts/c++/cmake%E6%95%99%E7%A8%8B/"},{"categories":["C++"],"content":"CMake教程 CMakeLists.txt cmake_minimum_required(VERSION 3.10) project(hello) add_executable(hello main.cpp factorial.cpp printhello.cpp) CMake 2.x 一般方便删除多余文件,新建Build文件夹用来生成文件 mkdir build cd build cmake .. -DCMAKE_BUILD_TYPE=Release make -j4 make install ./hello cd ../ rm -rf build cmake .. -DCMAKE_INSTALL_PREFIX= CMake 3.x cmake -B build -DCMAKE_BUILD_TYPE=Release cmake --build build --parallel 4 cmake --build build --target install ","date":"2023-07-15","objectID":"/posts/c++/cmake%E6%95%99%E7%A8%8B/:1:0","tags":["C++","工具"],"title":"CMake教程","uri":"/posts/c++/cmake%E6%95%99%E7%A8%8B/"},{"categories":["C++"],"content":"基础知识 ","date":"2023-07-15","objectID":"/posts/c++/cmake%E6%95%99%E7%A8%8B/:2:0","tags":["C++","工具"],"title":"CMake教程","uri":"/posts/c++/cmake%E6%95%99%E7%A8%8B/"},{"categories":["C++"],"content":"简单项目 cmake_minimum_required(VERSION major.minor[.patch[.tweak]]) 指定了项目所需的CMake的最低版本 强制设置将CMake行为匹配到对应版本 # cmake_minimum_required(VERSION 3.10) 指定版本 3.10 project(projectName [VERSION major[.minor[.patch[.tweak]]]] [LANGUAGES languageName ...] ) projectName:项目名称 LANGUAGES:项目编程语言: C、CXX、JAVA、…多种语言空格分开;默认C和CXX 3.0版本之前不支持LANGUAGES关键字,project(hello CXX) # project(hello) hello项目 可执行文件 add_executable(targetName [WIN32] [MACOSX_BUNDLE] [EXCLUDE_FROM_ALL] source1 [source2 ...] ) 为一组源文件创建一个可执行文件 # add_executable(test main.cpp) 为main.cpp构建test可执行文件 定义库 add_library(targetName [STATIC | SHARED | MODULE] [EXCLUDE_FROM_ALL] source1 [source2 ...] ) STATIC | SHARED | MODULE:静态库,动态库,动态加载库 cmake -DBUILD_SHARED_LIBS=YES /path/to/source:-D选项设置是否构建为动态库,否则为静态库。 set(BUILD_SHARED_LIBS YES):要在 add_library() 命令之前设置,(灵活性较差) add_library(collector src1.cpp) 目标链接 考虑A库依赖于B库,因此将A链接到B target_link_libraries(targetName \u003cPRIVATE|PUBLIC|INTERFACE\u003e item1 [item2 ...] [\u003cPRIVATE|PUBLIC|INTERFACE\u003e item3 [item4 ...]] ... ) 小总结 1 目标名称与项目名称无关,最好将项目名称和可执行文件名称分开 命名库的目标时,不要用lib作为名称的开头或结尾。lib会自动成为前缀 尽量避免直接将库指定 STATIC 或 SHARED 目标在调用 target_link_libraries() 时需要指定 PRIVATE 、 PUBLIC 和/或 INTERFACE ","date":"2023-07-15","objectID":"/posts/c++/cmake%E6%95%99%E7%A8%8B/:2:1","tags":["C++","工具"],"title":"CMake教程","uri":"/posts/c++/cmake%E6%95%99%E7%A8%8B/"},{"categories":["C++"],"content":"变量 基本变量 set(varName value... [PARENT_SCOPE]) 将所有变量都作为字符串处理,给定多个值,这些值将用 分号连接在一起。可以使用转义字符表示引号\\\" set(myVar a b c) # myVar = \"a;b;c\" set(myVar a;b;c) # myVar = \"a;b;c\" set(myVar \"a b c\") # myVar = \"a b c\" set(myVar a b;c) # myVar = \"a;b;c\" set(myVar a \"b c\") # myVar = \"a;b c\" 环境变量 set(ENV{PATH} \"$ENV{PATH}:/opt/myDir\") 只会影响当前的CMake实例,很少用到 缓存变量 set(varName value... CACHE type \"docstring\" [FORCE]) 布尔缓存变量使用optio()代替set() option(optVar heilpstring [initialValue]) # initialValue默认OFF set(optVar initialValue CACHE BOOL hilpstring) # 等价option 调试变量和诊断 message([mode] msg1 [msg2]...) # 打印记录信息 variable_watch(myVar [command]) # 监控变量,很少用 处理字符串 查找和替换操作、正则表达式匹配、大小写转换、删除空格和其他常见字符串操作 查找 string(FIND inputString subString outVar [REVERSE]) et(longStr abcdefabcdef) set(shortBit def) string(FIND ${longStr} ${shortBit} fwdIndex) string(FIND ${longStr} ${shortBit} revIndex REVERSE) message(\"fwdIndex = ${fwdIndex}, revIndex = ${revIndex}\") # 输出 fwdIndex = 3, revIndex = 9 # 代表子字符串被找到,第一次找到的时候索引为3;最后一次找到的索引为9 替换 string(REPLACE matchString replaceWith outVar input [input...]) 将使用 replaceWith 替换输入字符串中每个 matchString ,并将结果存储在 outVar 正则表达式 string(REGEX MATCH regex outVar input [input...]) string(REGEX MATCHALL regex outVar input [input...]) string(REGEX REPLACE regex replaceWith outVar input [input...]) 列表 list(LENGTH listVar outVar) # 统计 list(GET listVar index [index...]) outVar # 检索 list(APPEND listVar item [item...]) # 追加 list(INSERT listVar index item [item...]) # 插入 list(FIND myList value outVar) # 查找 # 删除 list(REMOVE_ITEM myList value [value...]) # 从列表中删除一个或多个目标项。如果目标项不在列表中,也不会出错 list(REMOVE_AT myList index [index...]) # 指定一个或多个要删除的索引,超过索引报错 list(REMOVE_DUPLICATES myList) # 将确保列表不包含重复项。 # 排序 (按字母顺序) list(REVERSE myList) list(SORT myList) set(myList a b c) # Creates the list \"a;b;c\" list(LENGTH myList len) message(\"length = ${len}\") list(GET myList 2 1 letters) message(\"letters = ${letters}\") list(APPEND myList d e f) message(\"myList (first) = ${myList}\") list(INSERT myList 2 X Y Z) message(\"myList (second) = ${myList}\") list(FIND myList d index) message(\"index = ${index}\") # 输出 length = 3 letters = c;b myList (first) = a;b;c;d;e;f myList (second) = a;b;X;Y;Z;c;d;e;f index = 6 数学表达式 math(EXPR outVar mathExpr) 第一个参数必须使用关键字 EXPR ,而 mathExpr 定义要计算的表达式,结果将存储在 outVar 中 set(x 3) set(y 7) math(EXPR z \"(${x}+${y}) / 2\") message(\"result = ${z}\") # 输出 result = 5 ","date":"2023-07-15","objectID":"/posts/c++/cmake%E6%95%99%E7%A8%8B/:2:2","tags":["C++","工具"],"title":"CMake教程","uri":"/posts/c++/cmake%E6%95%99%E7%A8%8B/"},{"categories":["C++"],"content":"控制流 if() if(expression1) # commands ... elseif(expression2) # commands ... else() # commands ... endif() 循环 foreach(loopVar IN [LISTS listVar1 ...] [ITEMS item1 ...]) # ... endforeach() while(condition) # ... endwhile() ","date":"2023-07-15","objectID":"/posts/c++/cmake%E6%95%99%E7%A8%8B/:2:3","tags":["C++","工具"],"title":"CMake教程","uri":"/posts/c++/cmake%E6%95%99%E7%A8%8B/"},{"categories":["C++"],"content":"子目录 add_subdirectory() add_subdirectory(sourceDir [ binaryDir ] [ EXCLUDE_FROM_ALL ]) 允许项目将另一个目录引入到构建中 CMAKE_SOURCE_DIR :源的最顶层目录(最顶层CMakeLists.txt所在位置) CMAKE_BINARY_DIR:构建的最顶层目录 就是build CMAKE_CURRENT_SOURCE_DIR:当前处理的CMakeLists.txt文件的目录 CMAKE_CURRENT_BINARY_DIR:当前处理的CMakeLists.txt文件对应的构建目录 include() ","date":"2023-07-15","objectID":"/posts/c++/cmake%E6%95%99%E7%A8%8B/:2:4","tags":["C++","工具"],"title":"CMake教程","uri":"/posts/c++/cmake%E6%95%99%E7%A8%8B/"},{"categories":["C++"],"content":"参考阅读 官方手册 Professional-CMake cmake cookbook CMake的链接选项:PRIVATE,INTERFACE,PUBLIC - 知乎 (zhihu.com) An Introduction to Modern CMake vscode 插件 : clangd ","date":"2023-07-15","objectID":"/posts/c++/cmake%E6%95%99%E7%A8%8B/:3:0","tags":["C++","工具"],"title":"CMake教程","uri":"/posts/c++/cmake%E6%95%99%E7%A8%8B/"},{"categories":["C++"],"content":"Make命令 python版本的makefile: SCons make规则:“目标\"是必需的,不可省略;“前置条件\"和\"命令\"都是可选的,但是两者之中必须至少存在一个。 每条规则就明确两件事:构建目标的前置条件是什么,以及如何构建。 \u003c目标\u003e : \u003c前置条件\u003e \u003c命令\u003e 文本文件的编写 // functions.h #ifndef _FUNCTIONS_H_ #define _FUNCTIONS_H_ void printhello(); int factorial(int n); #endif // factorial.cpp #include \"functions.h\" int factorial(int n){ if(n == 1) return 1; else return n * factorial(n - 1); } //printhello.cpp #include \u003ciostream\u003e #include \"functions.h\" using namespace std; void printhello(){ int i; cout \u003c\u003c \"Hello world!\" \u003c\u003c endl; } //main.cpp #include \u003ciostream\u003e #include \"functions.h\" using namespace std; int main(){ printhello(); cout \u003c\u003c \"This is main: \" \u003c\u003c endl; cout \u003c\u003c \"This factorial of 5 is: \" \u003c\u003c factorial(5) \u003c\u003c endl; return 0; } 正常编译 g++ *.cpp -o hello ./hello 写makefile文件,管理工程,实现自动化编译(.o) # VERDION 1 hello: main.cpp printhello.cpp factorial.cpp g++ -o hello main.cpp printhello.cpp factorial.cpp # VERDION 2 CXX = g++ TARGET = hello OBJ = main.o printhello.o factorial.o $(TARGET): $(OBJ) $(CXX) -o $(TARGET) $(OBJ) main.o: main.cpp $(CXX) -c main.cpp printhello.o: printhello.cpp $(CXX) -c printhello.cpp factorial.o: factorial.cpp $(CXX) -c factorial.cpp # VERDION 3 CXX = g++ TARGET = hello OBJ = main.o printhello.o factorial.o CXXFLAGS = -c -Wall # Wall 打开警告信息 $(TARGET): $(OBJ) $(CXX) -o $@ $^ %.o: %.cpp $(CXX) $(CXXFLAGS) $\u003c -o $@ .PHONY: clean clean: rm -f *.o $(TARGET) # VERDION 4 CXX = g++ TARGET = hello SRC = $(wildcard *.cpp) OBJ = $(patsubst %.cpp, %.o, $(SRC)) CXXFLAGS = -c -Wall $(TARGET): $(OBJ) $(CXX) -o $@ $^ %.o: %.cpp $(CXX) $(CXXFLAGS) $\u003c -o $@ .PHONY: clean clean: rm -f *.o $(TARGET) # del -f *.o $(TARGET).exe # windows下要想在Makefile中通过命令行删除中间文件,需要将rm替换为del 使用make命令执行makefile文件中的指令集 make # make VERBOSE=1 查看make具体指令 在当前目录下执行main程序 ./hello make clean make -n # 打印make查看具体指令,不运行 .PHONY # 强制每次执行 ","date":"2023-07-15","objectID":"/posts/c++/cmake%E6%95%99%E7%A8%8B/:4:0","tags":["C++","工具"],"title":"CMake教程","uri":"/posts/c++/cmake%E6%95%99%E7%A8%8B/"},{"categories":["C++"],"content":"参考阅读 Makefile Tutorial By Example GNU make ","date":"2023-07-15","objectID":"/posts/c++/cmake%E6%95%99%E7%A8%8B/:5:0","tags":["C++","工具"],"title":"CMake教程","uri":"/posts/c++/cmake%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"正则表达式","date":"2023-07-15","objectID":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/","tags":["Computer","Linux"],"title":"正则表达式","uri":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/"},{"categories":["Computer"],"content":"正则表达式 ","date":"2023-07-15","objectID":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/:1:0","tags":["Computer","Linux"],"title":"正则表达式","uri":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/"},{"categories":["Computer"],"content":"正则表达式语法 ","date":"2023-07-15","objectID":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/:2:0","tags":["Computer","Linux"],"title":"正则表达式","uri":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/"},{"categories":["Computer"],"content":"元字符 元字符 描述 . 句号匹配任意单个字符除了换行符。 \".ar\" =\u003e The car parked in the garage. [ ] 匹配方括号内的任意字符 \"[Tt]he\" =\u003e The car parked in the garage. [^ ] 否定的字符种类。匹配除了方括号里的任意字符 \"[^c]ar=\u003e The car parked in the garage. * 匹配\u003e=0个重复的在*号之前的字符。 \"[a-z]*\" =\u003e The car parked in the garage #21. + 匹配\u003e=1个重复的+号前的字符。 \"c.+t\" =\u003e The fat cat sat on the mat. ? 标记?之前的字符为可选. \"[T]?he\" =\u003e The car is parked in the garage. {n,m} 匹配num个大括号之前的字符或字符集 (n \u003c= num \u003c= m). 0~9之间匹配最少2位,最多3位的数字:\"[0-9]{2,3}\" =\u003e The number was 9.9997 but we rounded it off to 10.0. 0~9之间匹配只是2位的数字:\"[0-9]{2,}\" =\u003e The number was 9.9997 but we rounded it off to 10.0. 0~9之间匹配3位数字:\"[0-9]{3}\" =\u003e The number was 9.9997 but we rounded it off to 10.0. (xyz) 字符集,匹配与 xyz 完全相等的字符串. \"`(c | 或运算符,匹配符号前或后的字符. “`(T \\ 转义字符,用于匹配一些保留的字符 [ ] ( ) { } . * + ? ^ $ \\ | \"`(f ^ 从开始行开始匹配 `[T $ 从末端开始匹配 \"(at\\.)” =\u003eThe fat cat. sat. on the mat. \"(at\\.$)\"=\u003eThe fat cat. sat. on the mat. ","date":"2023-07-15","objectID":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/:2:1","tags":["Computer","Linux"],"title":"正则表达式","uri":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/"},{"categories":["Computer"],"content":"简写字符集 简写 描述 . 除换行符外的所有字符 \\w 匹配所有字母数字,等同于 [a-zA-Z0-9_] \\W 匹配所有非字母数字,即符号,等同于: [^\\w] \\d 匹配数字: [0-9] \\D 匹配非数字: [^\\d] \\s 匹配所有空格字符,等同于: [\\t\\n\\f\\r\\p{Z}] \\S 匹配所有非空格字符: [^\\s] \\f 匹配一个换页符 \\n 匹配一个换行符 \\r 匹配一个回车符 \\t 匹配一个制表符 \\v 匹配一个垂直制表符 \\p 匹配 CR/LF(等同于 \\r\\n),用来匹配 DOS 行终止符 ","date":"2023-07-15","objectID":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/:2:2","tags":["Computer","Linux"],"title":"正则表达式","uri":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/"},{"categories":["Computer"],"content":"零宽度断言 符号 描述 ?= 正先行断言-存在 “`(T ?! 负先行断言-排除 “`(T ?\u003c= 正后发断言-存在 “`(?\u003c=(T ?\u003c! 负后发断言-排除 “`(?\u003c!(T ","date":"2023-07-15","objectID":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/:2:3","tags":["Computer","Linux"],"title":"正则表达式","uri":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/"},{"categories":["Computer"],"content":"标志(可选项) 标志 描述 i 忽略大小写。 \"The/gi\" =\u003e The fat cat sat on the mat. g 全局搜索。 \".(at)/gi\" =\u003e The fat cat sat on the mat. m 多行修饰符:锚点元字符 ^ $ 工作范围在每行的起始。 ","date":"2023-07-15","objectID":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/:2:4","tags":["Computer","Linux"],"title":"正则表达式","uri":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/"},{"categories":["Computer"],"content":"贪婪与惰性匹配 默认贪婪匹配,意味着会匹配尽可能长的子串 ?转为惰性匹配,则遇到就停 “(.*at)\"=\u003eThe fat cat sat on the mat. “(.*?at)\"=\u003eThe fat cat sat on the mat. ","date":"2023-07-15","objectID":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/:2:5","tags":["Computer","Linux"],"title":"正则表达式","uri":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/"},{"categories":["Computer"],"content":"正则表达式操作 ","date":"2023-07-15","objectID":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/:3:0","tags":["Computer","Linux"],"title":"正则表达式","uri":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/"},{"categories":["Computer"],"content":"匹配 #include \u003cregex\u003e bool regex_match (const basic_string\u003ccharT,ST,SA\u003e\u0026 s, const basic_regex\u003ccharT,traits\u003e\u0026 rgx, regex_constants::match_flag_type flags = regex_constants::match_default); /** 第一个参数s为:需要用正则表达式去匹配的字符串,简言之就是要处理的字符串。 第二个参数rgx为:为一个basic_regex的一个对象,进行匹配的模式,用正则字符串表示,其声明为: (1)typedef basic_regex\u003cchar\u003e regex;//正常字符处理(常用) (2)typedef basic_regex\u003cwchar_t\u003e wregex;//宽字符处理 第三个参数flags是控制第二个参数如何去匹配,第三个参数处可以设置一个或多个常量去控制,一般设置有默认值 返回值为:如果匹配成功,返回True,否则返回False */ ","date":"2023-07-15","objectID":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/:3:1","tags":["Computer","Linux"],"title":"正则表达式","uri":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/"},{"categories":["Computer"],"content":"搜索 bool regex_search (const basic_string\u003ccharT,ST,SA\u003e\u0026 s, const basic_regex\u003ccharT,traits\u003e\u0026 rgx, regex_constants::match_flag_type flags = regex_constants::match_default); //参数含义与regex_match一致,此方法不返回匹配成功的字符串,只是确定里面是否有满足正则式的字句 bool regex_search (const basic_string\u003ccharT,ST,SA\u003e\u0026 s, match_results\u003ctypename basic_string\u003ccharT,ST,SA\u003e::const_iterator,Alloc\u003e\u0026 m, const basic_regex\u003ccharT,traits\u003e\u0026 rgx, regex_constants::match_flag_type flags = regex_constants::match_default); //其他参数含义一样,多了一个m参数,其含义为此处为一个match_results的类型,其作用是存储匹配的结果或者满足子表达式匹配的结果,返回结果为一个迭代器 ","date":"2023-07-15","objectID":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/:3:2","tags":["Computer","Linux"],"title":"正则表达式","uri":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/"},{"categories":["Computer"],"content":"替换 basic_string\u003ccharT,ST,SA\u003e regex_replace (const basic_string\u003ccharT,ST,SA\u003e\u0026 s, const basic_regex\u003ccharT,traits\u003e\u0026 rgx, const charT* fmt, regex_constants::match_flag_type flags = regex_constants::match_default); //第一个参数s表示要被操作的字符串对象 //第二个参数rgx为匹配正则表达式 //第三个参数fmt为以何种方式进行替换 //第四个参数flags为一种方式,代表怎样去替换 //返回值为:如果匹配成功返回已经替换成功的字符串,否则匹配失败,返回原字符串 ","date":"2023-07-15","objectID":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/:3:3","tags":["Computer","Linux"],"title":"正则表达式","uri":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/"},{"categories":["Computer"],"content":"py-正则表达式操作 re 模块的一般使用步骤 使用compile函数将正则表达式的字符串形式编译为一个 Pattern 对象 通过 Pattern 对象提供的一系列方法对文本进行匹配查找,获得匹配结果(一个 Match 对象) 最后使用 Match 对象提供的属性和方法获得信息,根据需要进行其他的操作 ","date":"2023-07-15","objectID":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/:4:0","tags":["Computer","Linux"],"title":"正则表达式","uri":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/"},{"categories":["Computer"],"content":"compile函数 用于编译正则表达式,生成一个 Pattern 对象 import re re.compile(pattern[, flag]) pattern: 匹配的正则表达式 flag :一个可选参数,表示匹配模式,比如忽略大小写,多行模式等 ","date":"2023-07-15","objectID":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/:4:1","tags":["Computer","Linux"],"title":"正则表达式","uri":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/"},{"categories":["Computer"],"content":"match 从字符串的起始位置匹配一个模式,如果不是起始位置匹配成功的话,match()就返回none。必须从字符串开头匹配 re.match(pattern,string,flags=0) pattern:匹配的正则表达式 string:要匹配的字符串 flags:标志位,用于控制正则表达式的匹配方式,如:是否区分大小写,多行匹配等等 返回一个匹配的对象,而不是匹配的内容。从起始位置开始没有匹配成功,即便其他部分包含需要匹配的内容,re.match()也会返回None。 一般一个小括号括起来就是一个捕获组。使用group()来提取每组匹配到的字符串。group()会返回一个包含所有小组字符串的元组,从 0 到 所含的小组号。 0:表示正则表达式中符合条件的字符串。 1:表示正则表达式中符合条件的字符串中的第一个() 中的字符串。 2:表示正则表达式中符合条件的字符串中的第二个() 中的字符串。 … import re msg = 'name:Alice,age:6,score:80' obj = re.match('name:(\\w+),age:(\\d+)', msg) print(obj.group(0)) # name:Alice,age:6 符合条件的字符串 print(obj.group(1)) # Alice 第一匹配 print(obj.group(2)) # 6 第二匹配 print(obj.groups()) # ('Alice', '6') print(obj.span()) # (0, 16) 返回结果的范围 ","date":"2023-07-15","objectID":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/:4:2","tags":["Computer","Linux"],"title":"正则表达式","uri":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/"},{"categories":["Computer"],"content":"search ","date":"2023-07-15","objectID":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/:4:3","tags":["Computer","Linux"],"title":"正则表达式","uri":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/"},{"categories":["Computer"],"content":"findall ","date":"2023-07-15","objectID":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/:4:4","tags":["Computer","Linux"],"title":"正则表达式","uri":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/"},{"categories":["Computer"],"content":"finditer ","date":"2023-07-15","objectID":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/:4:5","tags":["Computer","Linux"],"title":"正则表达式","uri":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/"},{"categories":["Computer"],"content":"split ","date":"2023-07-15","objectID":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/:4:6","tags":["Computer","Linux"],"title":"正则表达式","uri":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/"},{"categories":["Computer"],"content":"sub 用于替换字符串中的匹配项 def sub(pattern, repl, string, count=0, flags=0): return _compile(pattern, flags).sub(repl, string, count) pattern:该参数表示正则中的模式字符串; repl:该参数表示要替换的字符串(即匹配到pattern后替换为repl),也可以是个函数; string:该参数表示要被处理(查找替换)的原始字符串; count:可选参数,表示是要替换的最大次数,而且必须是非负整数,该参数默认为0,即所有的匹配都会被替换; flags:可选参数,表示编译时用的匹配模式(如忽略大小写、多行模式等),数字形式,默认为0。 ","date":"2023-07-15","objectID":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/:4:7","tags":["Computer","Linux"],"title":"正则表达式","uri":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/"},{"categories":["Computer"],"content":"subn ","date":"2023-07-15","objectID":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/:4:8","tags":["Computer","Linux"],"title":"正则表达式","uri":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/"},{"categories":["Computer"],"content":"参考阅读 标准库头文件\u003cregex\u003e 最全的常用正则表达式大全——包括校验数字、字符、一些特殊的需求等等 学习正则表达式 C++正则表达式 在线正则表达式 正则表达式可视化 re.sub()用法的详细介绍 菜鸟教程:python正则表达式 ","date":"2023-07-15","objectID":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/:5:0","tags":["Computer","Linux"],"title":"正则表达式","uri":"/posts/computer/%E6%AD%A3%E5%88%99%E8%A1%A8%E8%BE%BE%E5%BC%8F/"},{"categories":["Computer"],"content":"Docker教程","date":"2023-07-10","objectID":"/posts/computer/docker%E6%95%99%E7%A8%8B/","tags":["Computer","工具"],"title":"Docker教程","uri":"/posts/computer/docker%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"docker教程 ","date":"2023-07-10","objectID":"/posts/computer/docker%E6%95%99%E7%A8%8B/:1:0","tags":["Computer","工具"],"title":"Docker教程","uri":"/posts/computer/docker%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"1 docker安装 官方教程 ssh abcserver # abcserver tmux # tmux里安装 # 1. 更新 apt 包索引并安装包以允许 apt 通过 HTTPS 使用存储库: sudo apt-get update sudo apt-get install \\ ca-certificates \\ curl \\ gnupg # 2. 添加 Docker 的官方 GPG 密钥: sudo mkdir -m 0755 -p /etc/apt/keyrings curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg # 3. 使用以下命令设置存储库: echo \\ \"deb [arch=\"$(dpkg --print-architecture)\" signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \\ \"$(. /etc/os-release \u0026\u0026 echo \"$VERSION_CODENAME\")\" stable\" | \\ sudo tee /etc/apt/sources.list.d/docker.list \u003e /dev/null # 4. 更新apt软件包索引: sudo apt-get update # 5. 安装Docker引擎,容器和Docker组成 sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin # 6. 通过运行Hello-World Image来验证Docker Engine安装成功: # sudo docker run hello-world docker --version ","date":"2023-07-10","objectID":"/posts/computer/docker%E6%95%99%E7%A8%8B/:2:0","tags":["Computer","工具"],"title":"Docker教程","uri":"/posts/computer/docker%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"2 docker教程 将当前用户添加到docker用户组 为了避免每次使用docker命令都需要加上sudo权限,可以将当前用户加入安装中自动创建的docker用户组(可以参考官方文档):执行完此操作后,需要退出服务器,再重新登录回来,才可以省去sudo权限。 sudo usermod -aG docker $USER 镜像(images) docker pull ubuntu:20.04:拉取一个镜像 docker images:列出本地所有镜像 docker image rm ubuntu:20.04 或 docker rmi ubuntu:20.04:删除镜像ubuntu:20.04 没有名称直接用id docker [container] commit CONTAINER IMAGE_NAME:TAG:创建某个container的镜像 []内容为可选 docker save -o ubuntu_20_04.tar ubuntu:20.04:将镜像ubuntu:20.04导出到本地文件ubuntu_20_04.tar中 docker load -i ubuntu_20_04.tar:将镜像ubuntu:20.04从本地文件ubuntu_20_04.tar中加载出来 容器(container) docker run -d -t -v /home/待映射路径:/data --privileged --device=/dev/kfd/ --device=/dev/dri/ --network=host --group-add video --name 容器名字 镜像地址 docker [container] create -it ubuntu:20.04:利用镜像ubuntu:20.04创建一个容器 docker ps -a:查看本地的所有容器 docker ps:显示所有在运行的容器 docker ps -a –format “table {{.ID}}\\t{{.Names}}\\t{{.Image}}\\t{{.Status}}\\t{{.Size}}” docker [container] start CONTAINER:启动容器 docker [container] stop CONTAINER:停止容器 docker [container] restart CONTAINER:重启容器 docker [contaienr] run -itd ubuntu:20.04:创建并启动一个容器 docker [contaienr] run -it ubuntu:20.04:创建启动并进入一个容器 docker [container] attach CONTAINER:进入容器 先按Ctrl+p,再按Ctrl+q可以挂起容器 docker [container] exec CONTAINER COMMAND:在容器中执行命令 docker exec -it CONTAINER /bin/bash 查看环境 docker [container] rm CONTAINER:删除容器 docker container prune:删除所有已停止的容器 docker export -o xxx.tar CONTAINER:将容器CONTAINER导出到本地文件xxx.tar中 docker import xxx.tar image_name:tag:将本地文件xxx.tar导入成镜像,并将镜像命名为image_name:tag docker export/import与docker save/load的区别: export/import会丢弃历史记录和元数据信息,仅保存容器当时的快照状态 save/load会保存完整记录,体积更大 docker top CONTAINER:查看某个容器内的所有进程 docker stats:查看所有容器的统计信息,包括CPU、内存、存储、网络等信息 docker system df -v:容器,镜像占用空间显示 docker inspect CONTAINER : 可视化查看容器配置信息 #!/bin/bash # 列出 /var/lib/docker/overlay2 下的目录,筛选出以 G 结尾的行 du -h -d 1 /var/lib/docker/overlay2 | grep -E \"G\\b\" | head -n -1 | awk '{print $1, $2}' | while read size dir; do # 获取与当前目录匹配的容器 ID container_id=$(docker ps -qa | xargs -I {} sh -c \"docker inspect --format='{{.GraphDriver.Data.MergedDir}}' {} | grep '$dir' \u003e/dev/null \u0026\u0026 echo {}\") if [ -n \"$container_id\" ]; then # 获取容器名称 container_name=$(docker ps -a --filter id=\"$container_id\" --format \"{{.Names}}\") # 输出所需格式 echo \"$size $dir $container_id $container_name\" fi done docker cp xxx CONTAINER:xxx 或 docker cp CONTAINER:xxx xxx:在本地和容器间复制文件 docker rename CONTAINER1 CONTAINER2:重命名容器 docker update CONTAINER --memory 500MB:修改容器限制 ","date":"2023-07-10","objectID":"/posts/computer/docker%E6%95%99%E7%A8%8B/:3:0","tags":["Computer","工具"],"title":"Docker教程","uri":"/posts/computer/docker%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"git教程","date":"2023-07-10","objectID":"/posts/computer/git%E6%95%99%E7%A8%8B/","tags":["Computer","工具"],"title":"git教程","uri":"/posts/computer/git%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"git教程 git可视化的学习网站 ","date":"2023-07-10","objectID":"/posts/computer/git%E6%95%99%E7%A8%8B/:1:0","tags":["Computer","工具"],"title":"git教程","uri":"/posts/computer/git%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"1 git基本概念 工作区:仓库的目录。工作区是独立于各个分支的。 暂存区:数据暂时存放的区域,类似于工作区写入版本库前的缓存区。暂存区是独立于各个分支的。 版本库:存放所有已经提交到本地仓库的代码版本。 版本结构:树结构,树中每个节点代表一个代码版本。 ","date":"2023-07-10","objectID":"/posts/computer/git%E6%95%99%E7%A8%8B/:2:0","tags":["Computer","工具"],"title":"git教程","uri":"/posts/computer/git%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"2 git常用命令 ","date":"2023-07-10","objectID":"/posts/computer/git%E6%95%99%E7%A8%8B/:3:0","tags":["Computer","工具"],"title":"git教程","uri":"/posts/computer/git%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"全局设置 git config --global user.name xxx:设置全局用户名,信息记录在~/.gitconfig文件中 git config --global user.email [email protected]:设置全局邮箱地址,信息记录在~/.gitconfig文件中 git init:将当前目录配置成git仓库,信息记录在隐藏的.git文件夹中 git config --global core.autocrlf true:忽略linux和windows换行差别。 ","date":"2023-07-10","objectID":"/posts/computer/git%E6%95%99%E7%A8%8B/:3:1","tags":["Computer","工具"],"title":"git教程","uri":"/posts/computer/git%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"常用命令 git add XX :将XX文件添加到暂存区 git add .:将所有待加入暂存区的文件加入暂存区 git commit -m \"给自己看的备注信息\":将暂存区的内容提交到当前分支 git commit --amend: 修改最近一次提交commit git status:查看仓库状态 git log:查看当前分支的所有版本 git push -u (第一次需要-u以后不需要) :将当前分支推送到远程仓库 git clone [email protected]:xxx/XXX.git:将远程仓库XXX下载到当前目录下 git branch:查看所有分支和当前所处分支 ","date":"2023-07-10","objectID":"/posts/computer/git%E6%95%99%E7%A8%8B/:3:2","tags":["Computer","工具"],"title":"git教程","uri":"/posts/computer/git%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"查看命令 git diff XX:查看XX文件相对于暂存区修改了哪些内容 git status:查看仓库状态 git log:查看当前分支的所有版本,按q退出 先用工作区中的XX与缓存区中的XX进行比较,若缓存区中没有XX,再用工作区中的XX与当前head指向的版本中的XX进行比较。 git log --pretty=oneline:用一行来显示 git reflog:查看HEAD指针的移动历史(包括被回滚的版本) 回滚其他版本后想回到回滚前版本使用reflog找到对应编号 git branch:查看所有分支和当前所处分支 git pull :将远程仓库的当前分支与本地仓库的当前分支合并 ","date":"2023-07-10","objectID":"/posts/computer/git%E6%95%99%E7%A8%8B/:3:3","tags":["Computer","工具"],"title":"git教程","uri":"/posts/computer/git%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"删除命令 git rm --cached XX:将文件从仓库索引目录中删掉,不希望管理这个文件 git restore --staged xx:将xx从暂存区里移除 git restore -- xx 将xx从暂存区恢复到工作区,其中 – 可以不加 git checkout — XX或git restore XX:将XX文件尚未加入暂存区的修改全部撤销 ","date":"2023-07-10","objectID":"/posts/computer/git%E6%95%99%E7%A8%8B/:3:4","tags":["Computer","工具"],"title":"git教程","uri":"/posts/computer/git%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"代码回滚 git reset --hard HEAD^ 或``git reset –hard HEAD~ `:将代码库回滚到上一个版本 git reset --hard HEAD^^:往上回滚两次,以此类推 git reset --hard HEAD~100:往上回滚100个版本 git reset --hard 版本号:回滚到某一特定版本 版本号为哈希值前7位 git reset --soft HEAD^:撤销commit ","date":"2023-07-10","objectID":"/posts/computer/git%E6%95%99%E7%A8%8B/:3:5","tags":["Computer","工具"],"title":"git教程","uri":"/posts/computer/git%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"远程仓库 git remote add origin [email protected]:xxx/XXX.git:将本地仓库关联到远程仓库 git remote prune origin:清理远程已删除的分支 git push -u (第一次需要-u以后不需要) :将当前分支推送到远程仓库 git push origin branch_name:将本地的某个分支推送到远程仓库 git clone [email protected]:xxx/XXX.git:将远程仓库XXX下载到当前目录下 git push --set-upstream origin branch_name:设置本地的branch_name分支对应远程仓库的branch_name分支 git push -d origin branch_name:删除远程仓库的branch_name分支 git checkout -t origin/branch_name: 将远程的branch_name分支拉取到本地 git pull :将远程仓库的当前分支与本地仓库的当前分支合并 git pull origin branch_name:将远程仓库的branch_name分支与本地仓库的当前分支合并 git branch --set-upstream-to=origin/branch_name1 branch_name2:将远程的branch_name1分支与本地branch_name2分支对应 ","date":"2023-07-10","objectID":"/posts/computer/git%E6%95%99%E7%A8%8B/:3:6","tags":["Computer","工具"],"title":"git教程","uri":"/posts/computer/git%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"分支命令 git branch branch_name:创建新分支 git branch:查看所有分支和当前所处分支 git checkout -b branch_name:创建并切换到branch_name这个分支 git checkout branch_name:切换到branch_name这个分支 git merge branch_name:将分支branch_name合并到当前分支上 git branch -d branch_name:删除本地仓库的branch_name分支 git push --set-upstream origin branch_name:设置本地的branch_name分支对应远程仓库的branch_name分支 git push -d origin branch_name:删除远程仓库的branch_name分支 git checkout -t origin/branch_name 将远程的branch_name分支拉取到本地 git pull :将远程仓库的当前分支与本地仓库的当前分支合并 git pull origin branch_name:将远程仓库的branch_name分支与本地仓库的当前分支合并 git branch --set-upstream-to=origin/branch_name1 branch_name2:将远程的branch_name1分支与本地的branch_name2分支对应 git rebase -i HEAD^/HEAD~x用来调整commit顺序,或者删除中间某个/些commit 执行后会进入一个类似vim的界面, 可以修改/删除commit, 然后通过:wq保存即可 git cherry-pick \u003ccommitHash\u003e 把commit复制到当前分支作为一个新的commit a -\u003e b -\u003e c(HEAD)(master) | -\u003e d -\u003e e(branch B) git cherry-pick \u003ccommitHash d\u003e a -\u003e b -\u003e c -\u003e d(HEAD)(master) | -\u003e d -\u003e e(branch B) 遇到冲突时可以用–continue(处理冲突后继续操作)/–abort(放弃操作) ","date":"2023-07-10","objectID":"/posts/computer/git%E6%95%99%E7%A8%8B/:3:7","tags":["Computer","工具"],"title":"git教程","uri":"/posts/computer/git%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"stash暂存 git stash:将工作区和暂存区中尚未提交的修改存入栈中 git stash apply:将栈顶存储的修改恢复到当前分支,但不删除栈顶元素 git stash drop:删除栈顶存储的修改 git stash pop:将栈顶存储的修改恢复到当前分支,同时删除栈顶元素 git stash list:查看栈中所有元素 ","date":"2023-07-10","objectID":"/posts/computer/git%E6%95%99%E7%A8%8B/:3:8","tags":["Computer","工具"],"title":"git教程","uri":"/posts/computer/git%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"3 其他 云端复制到本地 rm project -rf git clone [email protected]:abc/project .git 连接gitlab时:fatal: remote origin already exists. git remote remove origin git remote add origin [email protected]:xxx/XXX.git 将本地仓库关联到远程仓库 个人开发一般流程 git init git pull git add . git commit git push origin branch_name 多人提交 第一个提交后,第二个提交先pull新版本,合并本地修改冲突,再push自己的 ","date":"2023-07-10","objectID":"/posts/computer/git%E6%95%99%E7%A8%8B/:4:0","tags":["Computer","工具"],"title":"git教程","uri":"/posts/computer/git%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"常用 git restore --staged xx # 将xx从暂存区里移除 git checkout -b new_branch origin/old_branch # 新建分支 git push origin branch_name # 提交到分支 # 合并分支 git checkout -b branch_name origin/branch_name # 远端分支下载 git checkout develop git merge branch_name git status 修改commit注释 # 修改最后一次提交的注释 git commit --amend # 修改退出后,查看一下git log git push --force origin branch_name # 强制提交 # 修改以前提交的注释 git rebase -i HEAD~2 # 数字指的是倒数第n次提交记录的注释 # pick 改成 edit 后退出 git commit --amend # 修改 git rebase --continue git push --force origin branch_name # 强制提交 拉取远程分支 git fetch origin master git merge FETCH_HEAD #FETCH_HEAD 是一个 Git 内部的引用,表示最近一次 git fetch 命令拉取的内容 [Git].gitignore文件的配置使用 - 知乎 (zhihu.com) ","date":"2023-07-10","objectID":"/posts/computer/git%E6%95%99%E7%A8%8B/:5:0","tags":["Computer","工具"],"title":"git教程","uri":"/posts/computer/git%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"GDB调试","date":"2023-06-18","objectID":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/","tags":["Computer","Linux","C++"],"title":"GDB调试","uri":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"GDB调试 gcc -o main main.c -g 或者-G生成Debug可调试版本 mkdir build \u0026\u0026 cd build cmake .. -DCMAKE_BUILD_TYPE=Debug make gdb filename 指定调试的文件 hip程序 hipcc demo.cpp -o demo -g,然后hipgdb hipcc demo.cpp -o demo -gdwarf-4,然后gdb 打日志 HIP_LOG_LEVEL=4 HIP_MODULE_MASK=0x7fffffff HIP_ENABLE_LIST=\"hip:hsa:thunk\" ./demo 设置了HIP的日志级别和模块掩码,生成demo_xxx.nano文件 hiplogdump sort demo_xxx.nano \u003elog.txt gdb [exec file] [core file]:调式coredump内核转储文件,直接进去bt查看调用栈信息。 gdb attach -p \u003c进程ID\u003e :进行进程调试并附加到正在运行的进程 r 程序运行 ","date":"2023-06-18","objectID":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/:1:0","tags":["Computer","Linux","C++"],"title":"GDB调试","uri":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"断点操作 ","date":"2023-06-18","objectID":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/:2:0","tags":["Computer","Linux","C++"],"title":"GDB调试","uri":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"设置断点 b demo.cpp:123 # b 文件:行号 b function_name # 所有同名函数 b class1::function_name # 指定函数 b +5 # 偏移量打断点,当前73行, +5到78行 b demo.cpp:123 if i==3 # b 断点 条件 满足条件命中断点 b *0x11111 # 指令地址设置断点(调试程序没有符号信息时)使用p function_name获得函数地址0x11111,在断点然后r运行 rb funtion_nam* # 正则表达式 tb # 临时断点 ","date":"2023-06-18","objectID":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/:2:1","tags":["Computer","Linux","C++"],"title":"GDB调试","uri":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"删除断点(delete是全局的) delete #删除所有断点 delete 5 # 删除5号断点 delete 5 6 # 删除5号和6号断点 delete 5-8 11-13 #删除指定范围断点 clear function_name # 删除指定函数断点 clear demo.cpp:123 # 删除指定行号断点 ","date":"2023-06-18","objectID":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/:2:2","tags":["Computer","Linux","C++"],"title":"GDB调试","uri":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"查看断点 info b # 常用 info: info i 两种方式 breakpoint: breakpoint break b 三种方式 ","date":"2023-06-18","objectID":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/:2:3","tags":["Computer","Linux","C++"],"title":"GDB调试","uri":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"启用/禁用断点 enable/disabele 断点编号 #启用/禁用断点,可以单一编号如`1`,也可以范围`2-7` enable once 断点编号 # 启用一次后自动禁用 enable delete 断点编号 # 启用后删除 enable count 数量N 断点编号 # 启用断点并命中N次 ignore 断点编号 次数N # 忽略断点前N次命中 ","date":"2023-06-18","objectID":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/:2:4","tags":["Computer","Linux","C++"],"title":"GDB调试","uri":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"观察点 watch 变量/表达式 # 观察点,监视变量 rwatch 变量/表达式 # 读取观察点,变量或表达式被读取时,程序中断 awatch 变量/表达式 # 读写观察点,无论变量或表达式被读取还是写入时,程序都中断 info watchpoints # 查看所有观察点 delete/disable/enable 观察点编号 # 删除/禁用/启用观察点 ","date":"2023-06-18","objectID":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/:2:5","tags":["Computer","Linux","C++"],"title":"GDB调试","uri":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"捕获点 C++异常 动态库载入 catch 事件 ","date":"2023-06-18","objectID":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/:2:6","tags":["Computer","Linux","C++"],"title":"GDB调试","uri":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"执行 s 步进 finish跳出 n :跳过 (next) c :继续(continue) jump N:跳到第N行 ,或者函数 where :显示当前执行的具体函数和代码行 ","date":"2023-06-18","objectID":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/:3:0","tags":["Computer","Linux","C++"],"title":"GDB调试","uri":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"查看显示 info args :进入一个函数查看参数信息 info locals :查看局部变量值 info functions :查看有哪些函数 ","date":"2023-06-18","objectID":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/:4:0","tags":["Computer","Linux","C++"],"title":"GDB调试","uri":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"窗口显示 gdb -tui filename # 显示代码窗口 tui enable # 显示 crtl + x 再按a关闭打开窗口 layout src # 显示源码 layout asm # 显示汇编 layout split # 显示源代码和汇编 layout regs # 显示寄存器 refresh # 刷新屏幕 crtl + l update # 更新源代码 ","date":"2023-06-18","objectID":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/:4:1","tags":["Computer","Linux","C++"],"title":"GDB调试","uri":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"查看源代码 l # 查看上下文(list) 默认当前代码行的前5行和后5行 set listsize 20 # 设置显示行数 20行 list demo.cpp:123 # 查看指定文件指定行代码 list function_name # 查看指定函数的源代码 # 搜索源代码 serach 正则表达式 forward-search 正则表达式 # 正向搜索 reverse-search 正则表达式 # 反向搜索 ","date":"2023-06-18","objectID":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/:4:2","tags":["Computer","Linux","C++"],"title":"GDB调试","uri":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"查看/修改变量的值 print 变量 # 打印变量 p 变量 p 变量名=值 #修改查看的变量值 # 一些内嵌函数 p sizeof(a) p strcmp(\"123\". \"12\") p strlen(\"string\") # 查看结构体/ 类的值 set print null-stop # 设置字符串显示规则,遇到结结束符时停止显示 set print pretty # 美化,格式化结构体 p new_node-\u003eName # 查看结构体/类单个成员 p *new_node # 查看整个结构体/类 # 查看数组 set print array # 控制数组显示 set print array-indexes # 显示数组索引 # 查看联合体 set print union # 自动显示变量值,和断点类似 display 变量名 display {var1, var2, var3} # 多变量名时,长度要相同 undisplay 变量编号 # 取消自动显示,info display可查看编号 enabele/disable display 变量编号 # 启用/禁用自动显示 ","date":"2023-06-18","objectID":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/:4:3","tags":["Computer","Linux","C++"],"title":"GDB调试","uri":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"查看变量类型 # ptype /选项 变量或类型 查看各个变量类型 ptype node_head # 查看变量类型,显示成员名称和类型 # 选项 /r # 原始数据显示,不会代替一些typedef定义 /m # 查看类时,只显示类的成员变量 /M # 显示类的方法(默认) /t # 不打印类中的typedef数据 /o # 打印结构体字段偏移量和大小信息 whatis 变量或表达式 # 查看变量类型 ","date":"2023-06-18","objectID":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/:4:4","tags":["Computer","Linux","C++"],"title":"GDB调试","uri":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"查看内存 # x /选项 地址 查看各个变量内存信息 const char* str = \"test\"; x str # 默认16进制显示,内存存储内容和“test\"相反(小端存储) 0x74736574 x /s str # 直接显示内容 ”test\" x /d str # 十进制显示 x /4d str # 十进制显示,显示宽度为4 # 变量非指针类型,如int, 先p \u0026value_name, 使用x查看 ","date":"2023-06-18","objectID":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/:4:5","tags":["Computer","Linux","C++"],"title":"GDB调试","uri":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"查看寄存器 指针寄存器$rip (32位EIP,64RIP)指向当前执行的代码位置 栈指针寄存器$rsp指向当前栈顶 通用寄存器存储一些变量值,函数参数及返回值等 info registers # 简写 i r info registers rax # 显示特定寄存器值 info all-registers # 显示所有寄存器值 function_test(intc a, const char* str) 调用function_test(10, \"test\") 第一个参数存储在寄存器rdi,第二个参数存储在rsi中,是字符串指针 i r rdi i r rsi x /s $rsi # 查看寄存器值 ","date":"2023-06-18","objectID":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/:4:6","tags":["Computer","Linux","C++"],"title":"GDB调试","uri":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"查看汇编 starti #开始执行程序并停在第一个汇编指令处 layout asm #显示汇编窗口 si #单步 set disassembly-flavor intel disassemble /mr ./demo #查看反汇编代码 ","date":"2023-06-18","objectID":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/:4:7","tags":["Computer","Linux","C++"],"title":"GDB调试","uri":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"查看调用栈 bt # 查看回溯 backtrace bt 2 # 只显示两个栈帧 f 2 # frame切换栈帧,查看调试位置 up/down 2 # 基于当前帧来切换 f 帧地址 # 通过帧地址切换 info frame # 查看帧信息 ","date":"2023-06-18","objectID":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/:4:8","tags":["Computer","Linux","C++"],"title":"GDB调试","uri":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"多线程调试 info threads # 查看线程, *号表示当前线程 thread N # 切换线程 b M thread N # 为N号线程M行设置断点 thread apply N command # 为N号线程执行command命令 thread apply all bt # 查看所用线程堆栈信息 ","date":"2023-06-18","objectID":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/:5:0","tags":["Computer","Linux","C++"],"title":"GDB调试","uri":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"多进程调试 info inferiors # 查看进程 inferior N # 切换相应进程 set follow-fork-mode child # 设置调试子进程 set detach-on-fork off # 对所有的进程进行调试 ","date":"2023-06-18","objectID":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/:6:0","tags":["Computer","Linux","C++"],"title":"GDB调试","uri":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"内存检查 # 需要安装 libasan 即AddressSanitizer g++ -fsanitize=address -g -o demo ./demo.cpp ","date":"2023-06-18","objectID":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/:7:0","tags":["Computer","Linux","C++"],"title":"GDB调试","uri":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"coredump调试 # 在调试界面生成coredump文件 ps aux | grep ./demo # 查看进程号 gdb attach -p \u003c进程ID\u003e # 附加到进程 gcore ****.core detach q # 配置并生成coredump文件 /etc/security/limits.conf添加 soft core unlimited echo -e \"/root/corefile/core-%e-%s-%p-%t\" \u003e /proc/sys/kernel/core_pattern # %e进程名称。%s崩溃信号,%p进程id,%t时间戳 # 调试 gdb [exec file] [core file] # 调式coredump内核转储文件,直接进去`bt`查看调用栈信息。 ","date":"2023-06-18","objectID":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/:8:0","tags":["Computer","Linux","C++"],"title":"GDB调试","uri":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"发行版调试 # 从调试版中提取调试符号 objcopy --only-keep-debug demo demo.symbol # 生成调试符号表 gdb --symbol=demo.symbol -exec=demo_release # 加上调试符号调试发行版 gdb --symbol=demo -exec=demo_release # 直接使用调试版作为符号源 ","date":"2023-06-18","objectID":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/:9:0","tags":["Computer","Linux","C++"],"title":"GDB调试","uri":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"其他 // int 3是用于触发调试中断的指令 asm{ int 3; } ","date":"2023-06-18","objectID":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/:10:0","tags":["Computer","Linux","C++"],"title":"GDB调试","uri":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"参考阅读 C/C++代码调试的艺术(第2版) (豆瓣) (douban.com) GDB 高级调试-多线程、后台调试、多进程、反向调试 GDB调试-从入门实践到原理 100个gdb小技巧 GDB官方手册 ","date":"2023-06-18","objectID":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/:11:0","tags":["Computer","Linux","C++"],"title":"GDB调试","uri":"/posts/computer/gdb%E8%B0%83%E8%AF%95%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"Shell编程","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"Shell编程 ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:1:0","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"概论 shell是通过命令行与操作系统沟通的语言。 Linux中常见的shell脚本有很多种,常见的有: Bourne Shell (/usr/bin/sh或/bin/sh) Bourne Again Shell (/bin/bash) C Shell (/usr/bin/csh) K Shell (/usr/bin/ksh) zsh … Linux系统中一般默认使用bash,文件开头需要写#! /bin/bash,指明bash为脚本解释器。 #! /bin/bash echo Hello World # echo类似于C++的字符串 运行方式 # 作为可执行文件运行 chmod +x test.sh # 增加执行权限 可执行文件为绿色 ./test.sh # 当前路径下执行 /home/acs/test.sh # 绝对路径下执行 ~/test.sh # 家目录路径下执行 # 解释器执行 bash test.sh nohup bash test.sh # 后台运行 ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:2:0","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"注释 单行注释:# # 这是一行注释 echo Hello World 多行注释: :\u003c\u003cstring string :\u003c\u003cEOF 注释1 注释2 注释3 EOF # EOF可以替换成其它任意字符串 :\u003c\u003cabc 注释4 注释5 注释6 abc ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:3:0","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"变量 ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:4:0","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"定义变量 不需要加$符号,等号两边不能有空格 name1='abc' # 单引号定义字符串 name2=\"abc\" # 双引号定义字符串 name3=abc # 也可以不加引号,同样表示字符串 ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:4:1","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"使用变量 需要加上$符号,或者${}符号。花括号是可选的,主要为了帮助解释器识别变量边界。 name=abc echo $name # 输出abc echo ${name} # 输出abc echo ${name}acwing # 输出abcacwing,等价于echo \"${name}acwing\" ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:4:2","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"只读变量 使用readonly或者declare将变量变为只读 name=abc readonly name declare -r name # 两种写法均可 name=abc # 会报错,因为此时name只读 type+命令可以解释该命令的来源(内嵌命令。第三方命令等) type readonly #readonly is a shell builtin(shell内部命令) ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:4:3","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"删除变量 unset删除变量 name=abc unset name echo $name # 输出空行 ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:4:4","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"变量类型 自定义变量(局部变量),子进程不能访问的变量 环境变量(全局变量),子进程可以访问的变量 自定义变量改成环境变量: name=abc # 定义变量 export name # 第一种方法 declare -x name # 第二种方法 环境变量改为自定义变量: export name=abc # 定义环境变量 declare +x name # 改为自定义变量 ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:4:5","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"字符串 单引号与双引号的区别: 单引号中的内容会原样输出,不会执行、不会取变量; 双引号中的内容可以执行、可以取变量; name=abc # 不用引号 echo 'hello, $name \\\"hh\\\"' # 单引号字符串,输出 hello, $name \\\"hh\\\" echo \"hello, $name \\\"hh\\\"\" # 双引号字符串,输出 hello, abc \"hh\" echo ${#name} # 获取字符串长度,输出3 echo ${name:0:5} # 提取子串,提取从0开始的5个字符 ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:4:6","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"默认变量 文件参数变量 在执行shell脚本时,可以向脚本传递参数。$1是第一个参数,$2是第二个参数,以此类推。特殊的,$0是文件名(包含路径) 传递参数,超过用大括号,如${10} 其它参数相关变量 参数 说明 $# 代表文件传入的参数个数,如上例值为4 $* 由所有参数构成的用空格隔开的字符串,如上例值为\"$1 $2 $3 $4\" $@ 每个参数分别用双引号括起来的字符串,如上例中值为\"$1\" “$2” “$3” “$4” $$ 脚本当前运行的进程ID $? 上一条命令的退出状态(注意不是stdout,而是exit code)。0表示正常退出,其他值表示错误 $(command) 返回command这条命令的stdout(可嵌套) 获取退出状态 command 返回command这条命令的stdout(不可嵌套)获取输出 #! /bin/bash echo \"文件名:\"$0 echo \"第一个参数:\"$1 echo \"第二个参数:\"$2 echo \"第三个参数:\"$3 echo \"第四个参数:\"$4 echo $# echo $* echo $@ echo $$ echo $? echo $(ls) echo `ls` # 执行 ./test.sh 1 2 3 4 文件名:./test.sh 第一个参数:1 第二个参数:2 第三个参数:3 第四个参数:4 4 1 2 3 4 1 2 3 4 1313 0 test.sh test.sh ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:4:7","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"数组 数组中可以存放多个不同类型的值,只支持一维数组,初始化时不需要指明数组大小。 数组下标从0开始。 # 定义数组用小括号表示,元素之间用空格隔开。 array=(1 abc \"def\" abc) array[0]=1 array[1]=abc array[2]=\"def\" array[3]=abc # 读取数组中某个元素的值 ${array[index]} array=(1 abc \"def\" yxc) echo ${array[0]} echo ${array[1]} echo ${array[2]} echo ${array[3]} # 读取整个数组 ${array[@]} # 第一种写法 ${array[*]} # 第二种写法 array=(1 abc \"def\" abc) echo ${array[@]} # 第一种写法 echo ${array[*]} # 第二种写法 # 数组长度,类似于字符串 ${#array[@]} # 第一种写法 ${#array[*]} # 第二种写法 array=(1 abc \"def\" abc) echo ${#array[@]} # 第一种写法 echo ${#array[*]} # 第二种写法 ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:4:8","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"expr 命令 expr 命令用于求表达式的值,格式为:expr 表达式 用空格隔开每一项 用反斜杠放在shell特定的字符前面(发现表达式运行错误时,可以试试转义) 对包含空格和其他特殊字符的字符串要用引号括起来 expr会在stdout中输出结果。如果为逻辑关系表达式,则结果为真时,stdout输出1,否则输出0。 expr的exit code:如果为逻辑关系表达式,则结果为真时,exit code为0,否则为1。 ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:5:0","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"字符串表达式 length string:返回string的长度 index string charset charset中任意单个字符在string 中最前面的字符位置,下标从1开始。如果在string 中完全不存在charset中的字符,则返回0。 substr string postion length 返回STRING字符串中从postion 开始,长度最大为length的子串。如果postion 或length为负数,0或非数值,则返回空字符串。 str=\"Hello World!\" echo `expr length $str` #等价于echo `expr length Hello World!`;中间有空格,length只船一个参数,一般加双引号字符串传入 syntax error:unexpecter argument 'World!'. echo `expr length \"$str\"` # ``不是单引号,表示执行该命令,输出12 echo `expr index \"$str\" aWd` # 输出7,下标从1开始 echo `expr substr \"$str\" 2 3` # 输出 ell ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:5:1","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"整数表达式 expr支持普通的算术操作,算术表达式优先级低于字符串表达式,高于逻辑关系表达式。 +-:加减运算。两端参数会转换为整数,如果转换失败则报错。 / %:乘,除,取模运算。两端参数会转换为整数,如果转换失败则报错。 () 可以改变优先级,但需要用反斜杠转义 a=3 b=4 echo `expr $a + $b` # 输出7 echo `expr $a - $b` # 输出-1 echo `expr $a \\* $b` # 输出12,*需要转义 echo `expr $a / $b` # 输出0,整除 echo `expr $a % $b` # 输出3 echo `expr \\( $a + 1 \\) \\* \\( $b + 1 \\)` # 输出20,值为(a + 1) * (b + 1) ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:5:2","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"逻辑关系表达式 | 如果第一个参数非空且非0,则返回第一个参数的值,否则返回第二个参数的值,但要求第二个参数的值也是非空或非0,否则返回0。如果第一个参数是非空或非0时,不会计算第二个参数。 \u0026 如果两个参数都非空且非0,则返回第一个参数,否则返回0。如果第一个参为0或为空,则不会计算第二个参数。 \u003c \u003c= = == != \u003e= \u003e 比较两端的参数,如果为true,则返回1,否则返回0。”==”是”=”的同义词。”expr”首先尝试将两端参数转换为整数,并做算术比较,如果转换失败,则按字符集排序规则做字符比较。 () 可以改变优先级,但需要用反斜杠转义 \u0026\u0026 表示与,|| 表示或 二者具有短路原则: expr1 \u0026\u0026 expr2:当expr1为假时,直接忽略expr2 expr1 || expr2:当expr1为真时,直接忽略expr2 表达式的exit code为0,表示真;为非零,表示假。(与C/C++中的定义相反) a=3 b=4 # 除了转义字符还可直接加单引号 echo `expr $a \\\u003e $b` # 输出0,\u003e需要转义 echo `expr $a '\u003c' $b` # 输出1,也可以将特殊字符用引号引起来 echo `expr $a '\u003e=' $b` # 输出0 echo `expr $a \\\u003c\\= $b` # 输出1 c=0 d=5 echo `expr $c \\\u0026 $d` # 输出0 echo `expr $a \\\u0026 $b` # 输出3 echo `expr $c \\| $d` # 输出5 echo `expr $a \\| $b` # 输出3 ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:5:3","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"read命令 read命令用于从标准输入中读取单行数据。当读到文件结束符时,exit code为1,否则为0。 -p: 后面可以接提示信息 -t:后面跟秒数,定义输入字符的等待时间,超过等待时间后会自动忽略此命令 acs@9e0ebfcd82d7:~$ read name # 读入name的值 acwing yxc # 标准输入 acs@9e0ebfcd82d7:~$ echo $name # 输出name的值 acwing yxc #标准输出 acs@9e0ebfcd82d7:~$ read -p \"Please input your name: \" -t 30 name # 读入name的值,等待时间30秒 Please input your name: acwing yxc # 标准输入 acs@9e0ebfcd82d7:~$ echo $name # 输出name的值 acwing yxc # 标准输出 ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:6:0","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"echo命令 echo用于输出字符串。命令格式:echo string # 显示普通字符串 echo \"Hello AC Terminal\" echo Hello AC Terminal # 引号可以省略 # 显示转义字符 echo \"\\\"Hello AC Terminal\\\"\" # 注意只能使用双引号,如果使用单引号,则不转义 echo \\\"Hello AC Terminal\\\" # 也可以省略双引号 # 显示变量 name=abc echo \"My name is $name\" # 输出 My name is abc # 显示换行 echo -e \"Hi\\n\" # -e 开启转义 echo \"acwing\" # 显示不换行 echo -e \"Hi \\c\" # -e 开启转义 \\c 不换行 echo \"acwing\" # 显示结果定向至文件 echo \"Hello World\" \u003e output.txt # 将内容以覆盖的方式输出到output.txt中 # 原样输出字符串,不进行转义或取变量(用单引号) name=acwing echo '$name\\\"' # 显示命令的执行结果 echo `date` # Thu Apr 6 15:30:20 CST 2023 ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:7:0","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"printf 命令 printf命令用于格式化输出,类似于C/C++中的printf函数。默认不会在字符串末尾添加换行符。 命令格式:printf format-string [arguments...] printf \"%10d.\\n\" 123 # 占10位,右对齐 printf \"%-10.2f.\\n\" 123.123321 # 占10位,保留2位小数,左对齐 printf \"My name is %s\\n\" \"abc\" # 格式化输出字符串 printf \"%d * %d = %d\\n\" 2 3 `expr 2 \\* 3` # 表达式的值作为参数 # 输出结果 123. 123.12 . My name is abc 2 * 3 = 6 ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:8:0","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"test命令 在命令行中输入man test,可以查看test命令的用法。 test命令用于判断文件类型,以及对变量做比较。 test命令用exit code返回结果,而不是使用stdout。0表示真,非0表示假。 expr stdout输出 1表示真,0表示假。 test exit code退出状态 0表示真,非0表示假。 acs@a1d78bab476e:~learn$ test 2 -lt 3 acs@a1d78bab476e:~learn$ echo $? 0 acs@a1d78bab476e:~learn$ ls # 列出当前目录下的所有文件 output.txt test.sh # 使用\u0026\u0026 ||实现if判断 acs@a1d78bab476e:~learn$ test -e test.sh \u0026\u0026 echo \"exist\" || echo \"Not exist\" exist # test.sh 文件存在 acs@a1d78bab476e:~learn$ test -e tesh.sh \u0026\u0026 echo \"exist\" || echo \"Not exist\" Not exist # tesh.sh 文件不存在 test -e filename # 判断文件是否存在 test -f filename # 是否为文件 test -d filename # 是否为目录 # 权限判断 test -r filename# 判断文件是否可读 test -w filename# 判断文件是否可写 test -x filename# 判断文件是否可执行 test -s filename# 判断是否为非空文件 # 整数比较 test $a -eq $b # a是否等于b equal(相等) test $a -ne $b # a是否不等于b not equal(不等) test $a -gt $b # a是否大于b greater than(大于) test $a -lt $b # a是否小于b less than(小于) test $a -ge $b # a是否大于等于b greater than or equal(大于或等于) test $a -le $b # a是否小于等于b less than or equal(小于或等于) # 字符串比较 test -z STRING # 判断STRING是否为空,如果为空,则返回true test -n STRING # 判断STRING是否非空,如果非空,则返回true(-n可以省略) test str1 == str2 # 判断str1是否等于str2 test str1 != str2 # 判断str1是否不等于str2 # 多重条件判定 test -r filename -a -x filename test -r filename -o -x filename test ! -r filename -a # 两条件是否同时成立 -o # 两条件是否至少一个成立 ! # 取反。如 test ! -x file,当file不可执行时,返回true 判断符号[] []与test用法几乎一模一样,更常用于if语句中。[[]]是[]的加强版,支持的特性更多。 acs@a1d78bab476e:~learn$ [2 -lt 3] acs@a1d78bab476e:~learn$ echoi $? 0 acs@a1d78bab476e:~learn$ ls # 列出当前目录下的所有文件 output.txt test.sh acs@a1d78bab476e:~learn$ [ -e test.sh ] \u0026\u0026 echo \"exist\" || echo \"Not exist\" exist # test.sh 文件存在 acs@a1d78bab476e:~learn$ [ -e tesh.sh ] \u0026\u0026 echo \"exist\" || echo \"Not exist\" Not exist # tesh.sh 文件不存在 注意: []内的每一项都要用空格隔开 []]内的变量,最好用双引号括起来 []]内的常数,最好用单或双引号括起来 name=\"acwing abc\" [ $name == \"acwing abc\" ] # 错误,等价于 [ acwing abc == \"acwing abc\" ],前面参数太多 [ \"$name\" == \"acwing abc\" ] # 正确 ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:9:0","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"判断语句 if..then形式 ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:10:0","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"单层if if condition then 语句1 语句2 ... fi # 实例 a=3 b=4 if [ \"$a\" -lt \"$b\" ] \u0026\u0026 [ \"$a\" -gt 2 ] then echo ${a}在范围内 fi # 输出 3在范围内 ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:10:1","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"单层if-else 命令格式 if condition then 语句1 语句2 ... else 语句1 语句2 ... fi # 实例 a=3 b=4 if ! [ \"$a\" -lt \"$b\" ] then echo ${a}不小于${b} else echo ${a}小于${b} fi # 输出 3小于4 ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:10:2","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"多层if-elif-elif-else if condition then 语句1 语句2 ... elif condition then 语句1 语句2 ... elif condition then 语句1 语句2 else 语句1 语句2 ... fi #示例: a=4 if [ $a -eq 1 ] then echo ${a}等于1 elif [ $a -eq 2 ] then echo ${a}等于2 elif [ $a -eq 3 ] then echo ${a}等于3 else echo 其他 fi # 输出 其他 ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:10:3","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"case…esac形式 类似于C/C++中的switch语句。 case $变量名称 in 值1) 语句1 语句2 ... ;; # 类似于C/C++中的break 值2) 语句1 语句2 ... ;; *) # 类似于C/C++中的default 语句1 语句2 ... ;; esac # 示例: a=4 case $a in 1) echo ${a}等于1 ;; 2) echo ${a}等于2 ;; 3) echo ${a}等于3 ;; *) echo 其他 ;; esac # 输出:其他 ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:10:4","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"循环语句 ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:11:0","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"for…in…do…done for var in val1 val2 val3 do 语句1 语句2 ... done # 示例1,输出a 2 cc,每个元素一行: for i in a 2 cc do echo $i done #示例2,输出当前路径下的所有文件名,每个文件名一行: for file in `ls` do echo $file done # 示例3,输出1-10;seq返回一个序列 for i in $(seq 1 10) do echo $i done # 示例4,使用{1..10} 或者 {a..z} for i in {a..z} do echo $i done ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:11:1","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"for ((…;…;…)) do…done for ((expression; condition; expression)) do 语句1 语句2 done # 示例,输出1-10,每个数占一行: for ((i=1; i\u003c=10; i++)) do echo $i done ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:11:2","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"while…do…done循环 while condition do 语句1 语句2 ... done # 示例,文件结束符为Ctrl+d,输入文件结束符后read指令返回false。 while read name do echo $name done ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:11:3","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"until…do…done循环 当条件为真时结束。 until condition do 语句1 语句2 ... done # 示例,当用户输入yes或者YES时结束,否则一直等待读入。 until [ \"${word}\" == \"yes\" ] || [ \"${word}\" == \"YES\" ] do read -p \"Please input yes/YES to stop this program: \" word done ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:11:4","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"break命令 跳出当前一层循环,注意与C/C++不同的是:break不能跳出case语句。 # 示例 while read name do for ((i=1;i\u003c=10;i++)) do case $i in 8) break ;; *) echo $i ;; esac done done # 该示例每读入非EOF的字符串,会输出一遍1-7。 # 该程序可以输入Ctrl+d文件结束符来结束,也可以直接用Ctrl+c杀掉该进程。 ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:11:5","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"continue命令 跳出当前循环。 # 示例: for ((i=1;i\u003c=10;i++)) do if [ `expr $i % 2` -eq 0 ] then continue fi echo $i done # 该程序输出1-10中的所有奇数。 ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:11:6","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"死循环的处理方式 如果AC Terminal可以打开该程序,则输入Ctrl+c即可。 否则可以直接关闭进程: 使用top命令找到进程的PID或者ps aux返回当前打开的所有进程。shift+M按照内存排序 输入kill -9 PID即可关掉此进程 ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:11:7","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"函数 bash中的函数类似于C/C++中的函数,但return的返回值与C/C++不同,返回的是exit code,取值为0-255,0表示正常结束。 如果想获取函数的输出结果,可以通过echo输出到stdout中,然后通过$(function_name)来获取stdout中的结果。 函数的return值可以通过$?来获取。 [function] func_name() { # function关键字可以省略 语句1 语句2 ... } 不获取 return值和stdout值 func() { name=abc echo \"Hello $name\" } func # 输出结果:Hello abc 获取 return值和stdout值 不写return时,默认return 0。 func() { name=abc echo \"Hello $name\" return 123 } output=$(func) ret=$? echo \"output = $output\" echo \"return = $ret\" # 输出结果: # output = Hello abc # return = 123 函数的输入参数 在函数内,$1表示第一个输入参数,$2表示第二个输入参数,依此类推。 函数内的$0仍然是文件名,而不是函数名。 func() { # 递归计算 $1 + ($1 - 1) + ($1 - 2) + ... + 0 word=\"\" while [ \"${word}\" != 'y' ] \u0026\u0026 [ \"${word}\" != 'n' ] do read -p \"要进入func($1)函数吗?请输入y/n:\" word done if [ \"$word\" == 'n' ] then echo 0 return 0 fi if [ $1 -le 0 ] then echo 0 return 0 fi sum=$(func $(expr $1 - 1)) echo $(expr $sum + $1) } echo $(func 10) # 输出结果:55 函数内的局部变量 可以在函数内定义局部变量,作用范围仅在当前函数内。 可以在递归函数中定义局部变量。 local 变量名=变量值 #! /bin/bash func() { local name=abc echo $name } func echo $name # 输出结果:abc # 第一行为函数内的name变量,第二行为函数外调用name变量,会发现此时该变量不存在。 ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:12:0","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"exit命令 exit命令用来退出当前shell进程,并返回一个退出状态;使用$?可以接收这个退出状态。 exit命令可以接受一个整数值作为参数,代表退出状态。如果不指定,默认状态值是 0。 exit退出状态只能是一个介于 0~255 之间的整数,其中只有 0 表示成功,其它值都表示失败。 #! /bin/bash if [ $# -ne 1 ] # 如果传入参数个数等于1,则正常退出;否则非正常退出。 then echo \"arguments not valid\" exit 1 else echo \"arguments valid\" exit 0 fi # 执行该脚本: ./test.sh acwing arguments valid echo $? # 传入一个参数,则正常退出,exit code为0 0 ./test.sh arguments not valid echo $? # 传入参数个数不是1,则非正常退出,exit code为1 1 ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:13:0","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"文件重定向 每个进程默认打开3个文件描述符: stdin标准输入,从命令行读取数据,文件描述符为0 stdout标准输出,向命令行输出数据,文件描述符为1 stderr标准错误输出,向命令行输出数据,文件描述符为2 可以用文件重定向将这三个文件重定向到其他文件中。 命令 说明 command \u003e file 将stdout重定向到file中 command \u003c file 将stdin重定向到file中 command » file 将stdout以追加方式重定向到file中 command n\u003e file 将文件描述符n重定向到file中 command n» file 将文件描述符n以追加方式重定向到file中 输入和输出重定向 echo -e \"Hello \\c\" \u003e output.txt # 将stdout重定向到output.txt中 echo \"World\" \u003e\u003e output.txt # 将字符串追加到output.txt中 read str \u003c output.txt # 从output.txt中读取字符串 echo $str # 输出结果:Hello World 同时重定向stdin和stdout #! /bin/bash read a read b echo $(expr \"$a\" + \"$b\") 创建input.txt,里面的内容为: 3 4 # 执行命令: ./test.sh \u003c input.txt \u003e output.txt # 从input.txt中读取内容,将输出写入output.txt中 cat output.txt # 查看output.txt中的内容 7 ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:14:0","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"引入外部脚本 类似于C/C++中的include操作,bash也可以引入其他文件中的代码。 . filename # 注意点和文件名之间有一个空格 source filename # test1.sh #! /bin/bash name=abc # 定义变量name # 创建test2.sh #! /bin/bash source test1.sh # 或 . test1.sh echo My name is: $name # 可以使用test1.sh中的变量 # 执行命令: ./test2.sh My name is: abc ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:15:0","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"文本处理三剑客 ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:16:0","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"grep 适合单纯的查找或匹配文本 grep -l './hip-prof-3912.db' log_*.txt:查找文本 ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:16:1","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"sed 更适合编辑匹配到的文本 ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:16:2","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"awk 更适合格式化文本,对文本进行较复杂格式处理 AWK执行的流程:读(Read)、执行(Execute)与重复(Repeat) 读(Read):从输入流(文件、管道或标准输入)中读取一行,然后将其存入内存中。 执行(Execute):对于每一行的输入,所有的AWK命令按顺序执行。 重复(Repeat):一直重复上述两个过程,直到文件结束。 程序结构 开始块(BEGIN block):启动,只执行一次;BEGIN是关键字需大写;可选(程序可以没有开始块) 主体块(Body block):输入行,执行命令 结束块(END block):介素执行,END是关键字需大写;可选(程序可以没有开始块) [user553@login05 shell]$ cat stu.txt 1) 张三 物理 60 2) 李四 数学 70 3) 王五 英语 80 4) 赵六 语文 90 5) 孙七 化学 100 [user553@login05 shell]$ awk '{print}' stu.txt 1) 张三 物理 60 2) 李四 数学 70 3) 王五 英语 80 4) 赵六 语文 90 5) 孙七 化学 100 [user553@login05 shell]$ awk 'BEGIN{printf \"编号\\t姓名\\t科目\\t成绩\\n\"} {print}' stu.txt 编号 姓名 科目 成绩 1) 张三 物理 60 2) 李四 数学 70 3) 王五 英语 80 4) 赵六 语文 90 5) 孙七 化学 100 语法 awk [options] ' Pattern{Action} '\u003cfile\u003e awk中最常用的动作Action就是print和printf 逐行处理的,默认以换行符为标记,识别每一行;awk会按照用户指定的分隔符去分割当前行,如果没有指定分隔符,默认使用空格作为分隔符。 $0和$NF均为内置变量。$NF表示当前行分割后的最后一列。 内置变量不能加双引号,否则会当文本输出 [user553@login05 shell]$ echo aaa | awk '{print $1}' aaa [user553@login05 shell]$ echo aaa | awk '{print \"$1\"}' $1 # 通过管道输出磁盘信息;printf中可以格式化输出的字符串,确保输出是等宽字符显示。 [user553@login05 shell]$ df -h | awk '{print $1\"\\t\"$2\"\\t\"%5}' Filesystem Size0 /dev/sda3 422G0 devtmpfs 126G0 /dev/sda1 2.0G0 ParaStor_01_work 27P0 ParaStor_01_home 215T0 /dev/sdc1 15T0 ParaStor_01_nvme 388T0 [user553@login05 shell]$ df -h | awk '{printf \"%20s\\t %s\\t %s\\t\\n\", $1,$2,$5}' Filesystem Size Use% /dev/sda3 422G 16% devtmpfs 126G 0% /dev/sda1 2.0G 9% ParaStor_01_work 27P 15% ParaStor_01_home 215T 59% /dev/sdc1 15T 1% ParaStor_01_nvme 388T 2% Pattern,其实就是选择的条件 AWK支持正则表达式;正则表达式被放入两个斜线中:/正则表达式/。 # 从/etc/passwd文件中找出以root开头的行 [user553@login05 shell]$ grep \"^root\" /etc/passwd root:x:0:0:root:/root:/bin/bash [user553@login05 shell]$ awk '/^root/ {print $0}' /etc/passwd root:x:0:0:root:/root:/bin/bash [Options]可选参数。最常用的是: -F, 用于指定输入分隔符; -v varname=value 变量名区分字符大小写, 用于设置变量的值 $ awk -v myvar='hello world!' 'BEGIN {print myvar}' hello world! [user553@login05 shell]$ awk -F\" \" 'NR==2 {print $0}' stu.txt 2) 李四 数学 70 awk if语句:必须用在{}中,且比较内容用()括起来。 # 统计uid小于等于500和大于500的用户个数 [user553@login05 shell]$ awk -F: 'BEGIN {i=0;j=0} {if($3\u003c=500) {i++} else {j++}} END{print i, j}' /etc/passwd 41 24 for循环 [user553@login05 shell]$ awk 'BEGIN {for(i=1; i\u003c=10;i++){if(i%2!=0) continue;print i}}' 2 4 6 8 10 ","date":"2023-06-17","objectID":"/posts/computer/shell%E7%BC%96%E7%A8%8B/:16:3","tags":["Computer","Linux"],"title":"Shell编程","uri":"/posts/computer/shell%E7%BC%96%E7%A8%8B/"},{"categories":["Computer"],"content":"tmux教程","date":"2023-06-16","objectID":"/posts/computer/tmux%E6%95%99%E7%A8%8B/","tags":["Computer","工具","Linux"],"title":"tmux教程","uri":"/posts/computer/tmux%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"tmux教程 功能 分屏。 允许断开Terminal连接后,继续运行进程。 结构 一个tmux可以包含多个session,一个session可以包含多个window,一个window可以包含多个pane。 ","date":"2023-06-16","objectID":"/posts/computer/tmux%E6%95%99%E7%A8%8B/:1:0","tags":["Computer","工具","Linux"],"title":"tmux教程","uri":"/posts/computer/tmux%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"常用操作 tmux:新建一个session,其中包含一个window,window中包含一个pane,pane里打开了一个shell对话框。 tmux kill-server :关闭所有session 按下Ctrl + B 后松开,然后按% (shift + 5):将当前pane左右平分成两个pane 按下Ctrl + B 后松开,然后按\":将当前pane上下平分成两个pane。 Ctrl + D:关闭当前pane; 如果当前window的所有pane均已关闭,则自动关闭window;如果当前session的所有window均已关闭,则自动关闭session。 鼠标点击可以选pane。(我鼠标动不了) 按下Ctrl + B 后松开,然后按方向键:选择相邻的pane。 鼠标拖动pane之间的分割线,可以调整分割线的位置。 按住Ctrl + B 的同时按方向键,可以调整pane之间分割线的位置。 按下Ctrl + B 后松开,然后按z:将当前pane全屏/取消全屏。 按下Ctrl + B 后松开,然后按d:挂起当前session。 tmux a:打开之前挂起的session。 按下Ctrl + B 后松开,然后按s:选择其它session。 方向键 —— 上:选择上一项 session/window/pane 方向键 —— 下:选择下一项 session/window/pane 方向键 —— 右:展开当前项 session/window 方向键 —— 左:闭合当前项 session/window 按下Ctrl + B 后松开,然后按c:在当前session中创建一个新的window。 按下Ctrl + B 后松开,然后按w:选择其他window,操作方法与(12)完全相同。 按下Ctrl + B 后松开,然后按PageUp:翻阅当前pane内的内容。 鼠标滚轮:翻阅当前pane内的内容。 在tmux中选中文本时,需要按住shift键。(仅支持Windows和Linux,不支持Mac) tmux中复制/粘贴文本的通用方式: 按下Ctrl + B 后松开,然后按[ 用鼠标选中文本,被选中的文本会被自动复制到tmux的剪贴板 按下Ctrl + B 后松开,然后按],会将剪贴板中的内容粘贴到光标处 ","date":"2023-06-16","objectID":"/posts/computer/tmux%E6%95%99%E7%A8%8B/:2:0","tags":["Computer","工具","Linux"],"title":"tmux教程","uri":"/posts/computer/tmux%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"安装tmux 无网络使用appImages版本: apps – AppImages Releases · nelsonenzo/tmux-appimage (github.com) chmod +x ./tmux.appimage # 下载后,添加权限 cp tmux.appimage /usr/local/bin/tmux # 放到PATH环境变量记录的文件夹下,以便在任意地方直接调用 ","date":"2023-06-16","objectID":"/posts/computer/tmux%E6%95%99%E7%A8%8B/:3:0","tags":["Computer","工具","Linux"],"title":"tmux教程","uri":"/posts/computer/tmux%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"配置文件 ~/.tmux.conf 令其生效两种方式: 1:tmux source-file ~/.tmux.conf 2:在tmux窗口中,先按下Ctrl+b指令前缀,然后按下系统指令:,进入到命令模式后输入source-file ~/.tmux.conf,回车后生效。 ctrl + B 改成 ctrl + A set-option -g status-keys vi setw -g mode-keys vi setw -g monitor-activity on # setw -g c0-change-trigger 10 # setw -g c0-change-interval 100 # setw -g c0-change-interval 50 # setw -g c0-change-trigger 75 set-window-option -g automatic-rename on set-option -g set-titles on set -g history-limit 100000 #set-window-option -g utf8 on # set command prefix set-option -g prefix C-a unbind-key C-b bind-key C-a send-prefix bind h select-pane -L bind j select-pane -D bind k select-pane -U bind l select-pane -R bind -n M-Left select-pane -L bind -n M-Right select-pane -R bind -n M-Up select-pane -U bind -n M-Down select-pane -D bind \u003c resize-pane -L 7 bind \u003e resize-pane -R 7 bind - resize-pane -D 7 bind + resize-pane -U 7 bind-key -n M-l next-window bind-key -n M-h previous-window set -g status-interval 1 # status bar set -g status-bg black set -g status-fg blue #set -g status-utf8 on set -g status-justify centre set -g status-bg default set -g status-left \" #[fg=green]#S@#H #[default]\" set -g status-left-length 20 # mouse support # for tmux 2.1 # set -g mouse-utf8 on set -g mouse on # # for previous version #set -g mode-mouse on #set -g mouse-resize-pane on #set -g mouse-select-pane on #set -g mouse-select-window on #set -g status-right-length 25 set -g status-right \"#[fg=green]%H:%M:%S #[fg=magenta]%a %m-%d #[default]\" # fix for tmux 1.9 bind '\"' split-window -vc \"#{pane_current_path}\" bind '%' split-window -hc \"#{pane_current_path}\" bind 'c' new-window -c \"#{pane_current_path}\" # run-shell \"powerline-daemon -q\" # vim: ft=conf ","date":"2023-06-16","objectID":"/posts/computer/tmux%E6%95%99%E7%A8%8B/:4:0","tags":["Computer","工具","Linux"],"title":"tmux教程","uri":"/posts/computer/tmux%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"Vim教程","date":"2023-06-16","objectID":"/posts/computer/vim%E6%95%99%E7%A8%8B/","tags":["Computer","工具","Linux"],"title":"Vim教程","uri":"/posts/computer/vim%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"Vim教程 功能 命令行模式下的文本编辑器。 根据文件扩展名自动判别编程语言。支持代码缩进、代码高亮等功能。 使用方式:vim filename 模式 一般命令模式 编辑模式 在一般命令模式里按下i,会进入编辑模式。 按下ESC会退出编辑模式,返回到一般命令模式。 命令行模式 在一般命令模式里按下:/?三个字母中的任意一个,会进入命令行模式。 命令行在最下面。 可以查找、替换、保存、退出、配置编辑器等。 异常处理 每次用vim编辑文件时,会自动创建一个.filename.swp的临时文件。 如果打开某个文件时,该临时文件swp文件已存在,则会报错。此时解决办法有两种: 找到正在打开该文件的程序,并退出 直接删掉该swp文件即可 ","date":"2023-06-16","objectID":"/posts/computer/vim%E6%95%99%E7%A8%8B/:1:0","tags":["Computer","工具","Linux"],"title":"Vim教程","uri":"/posts/computer/vim%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"常用操作 i:进入编辑模式 i:在光标所在字符前开始插入 a:在光标所在字符后开始插入 o:在光标所在行的下面另起一新行插入 s:删除光标所在的字符并开始插入 ESC:进入一般命令模式 移动操作 h 或 左箭头键:光标向左移动一个字符 j 或 向下箭头:光标向下移动一个字符 k 或 向上箭头:光标向上移动一个字符 l 或 向右箭头:光标向右移动一个字符 w : 跳下一个单词 b:回退上个单词 %:跳括号 固定行,在行间移动 n\u003cSpace\u003e:n表示数字,按下数字后再按空格,光标会向右移动这一行的n个字符 0 或 功能键[Home]:光标移动到本行开头 $或 功能键[End]:光标移动到本行末尾 全文内容,移动到某行 gg:光标移动到第一行,相当于1G G:光标移动到最后一行 :n 或 nG:n为数字,光标移动到第n行 n\u003cEnter\u003e:n为数字,光标向下移动n行 查找操作 /word:向光标之下寻找第一个值为word的字符串。 ?word:向光标之上寻找第一个值为word的字符串。 :n1,n2s/word1/word2/g:n1与n2为数字,在第n1行与n2行之间寻找word1这个字符串,并将该字符串替换为word2 :1,$s/word1/word2/g:将全文的word1替换为word2 :1,$s/word1/word2/gc:将全文的word1替换为word2,且在替换前要求用户确认。 n:重复前一个查找操作 N:反向重复前一个查找操作 复制粘贴,删除操作 v:选中文本 d:删除选中的文本; daw删除单词 dd: 删除当前行 y:复制选中的文本 yy: 复制当前行 p: 将复制的数据在光标的下一行/下一个位置粘贴 u:撤销 Ctrl + r:取消撤销 保存操作 :w保存 :w! 强制保存 :q 退出 :q! 强制退出 :wq 保存并退出 格式化操作 set paste 设置成粘贴模式,取消代码自动缩进 :set nopaste 取消粘贴模式,开启代码自动缩进 \u003e:将选中的文本整体向右缩进一次 \u003c:将选中的文本整体向左缩进一次 :set nu 显示行号 :set nonu 隐藏行号 gg=G:将全文代码格式化 :noh 关闭查找关键词高亮 Ctrl + q:当vim卡死时,可以取消当前正在执行的命令 ","date":"2023-06-16","objectID":"/posts/computer/vim%E6%95%99%E7%A8%8B/:2:0","tags":["Computer","工具","Linux"],"title":"Vim教程","uri":"/posts/computer/vim%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"常用技巧 ","date":"2023-06-16","objectID":"/posts/computer/vim%E6%95%99%E7%A8%8B/:3:0","tags":["Computer","工具","Linux"],"title":"Vim教程","uri":"/posts/computer/vim%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":".范式 # h $a;\u003cesc\u003e # $ -\u003e 到行尾; a -\u003e 在光标所在字符后开始插入 $a等价A j$. # j -\u003e 光标下移; .:-\u003e 重复操作相当于(a;\u003cesc\u003e) # 在一个字符前后添加一个空格,示例+号 f+ # f -\u003e 当前行中向后查找下一个指定字符,这里是+ s\u003cspace\u003e+\u003cspace\u003e\u003cesc\u003e # s -\u003e 删除并进入插入模式 ;. # ; -\u003e 重复上一次f查找命令的字符 # 查找替换 将content替换成copy; 也可以使用 :%s/content/copy/gc /content # /content -\u003e 查找content字符 * # * -\u003e 搜索命令,用于在当前光标位置向后查找下一个匹配项; cw # cw -\u003e 删除当前单词并进入插入模式 copy\u003cesc\u003e n. # n -\u003e 重复上一次搜索命令(*) . -\u003e (cw copy\u003cesc\u003e) ","date":"2023-06-16","objectID":"/posts/computer/vim%E6%95%99%E7%A8%8B/:3:1","tags":["Computer","工具","Linux"],"title":"Vim教程","uri":"/posts/computer/vim%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"vi + ctags ctags -R # 全局代码索引文件 crtl+] # 跳转函数或变量定义 g crtl+] # 跳转相同函数或变量定义 crtl+o # 返回 vim -t function_name # 直接跳转到该函数 ","date":"2023-06-16","objectID":"/posts/computer/vim%E6%95%99%E7%A8%8B/:3:2","tags":["Computer","工具","Linux"],"title":"Vim教程","uri":"/posts/computer/vim%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"配置文件 \"语法高亮 syntax on \"启用鼠标 if has(\"autocmd\") au BufReadPost * if line(\"'\\\"\") \u003e 1 \u0026\u0026 line(\"'\\\"\") \u003c= line(\"$\") | exe \"normal! g'\\\"\" | endif set mouse=a endif \"显示括号匹配 set showmatch \"缩进 set cindent set autoindent set shiftwidth=4 \"显示终端 set showcmd \" 设置tab set ts=4 set expandtab \"括号匹配 inoremap { {}\u003cLeft\u003e noremap {\u003cCR\u003e {\u003cCR\u003e}\u003cEsc\u003eO inoremap { { inoremap {} {} ","date":"2023-06-16","objectID":"/posts/computer/vim%E6%95%99%E7%A8%8B/:4:0","tags":["Computer","工具","Linux"],"title":"Vim教程","uri":"/posts/computer/vim%E6%95%99%E7%A8%8B/"},{"categories":["Computer"],"content":"Linx常用指令","date":"2023-06-15","objectID":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/","tags":["Computer","Linux"],"title":"Linx常用指令","uri":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/"},{"categories":["Computer"],"content":"Linux基础 ","date":"2023-06-15","objectID":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/:1:0","tags":["Computer","Linux"],"title":"Linx常用指令","uri":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/"},{"categories":["Computer"],"content":"1. Linux启动 内核引导。 运行 init。 系统初始化。 建立终端 。 用户登录系统。 ","date":"2023-06-15","objectID":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/:2:0","tags":["Computer","Linux"],"title":"Linx常用指令","uri":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/"},{"categories":["Computer"],"content":"2. 常用文件管理命令 ls: 列出当前目录下所有文件,蓝色的是文件夹,白色的是普通文件,绿色的是可执行文件 ls -a:查看所有文件包括隐藏文件(以.开头的文件就是隐藏文件) ls -l:查看当前路径下文件的读、写、执行权限 ls | wc -l:查看ls下有多少个文件 pwd: 显示当前路径 cd XXX: 进入XXX目录下, cd .. 返回上层目录 .:当前目录; ..:上级目录 ~:家目录,回到路径/home/acs下 cd -:返回改变路径前的路径 cp XXX YYY: 将XXX文件复制成YYY,XXX和YYY可以是一个路径 mkdir XXX: 创建目录XXX mkdir -p:如果文件夹不存在,则创建 rm XXX: 删除普通文件; rm XXX -r: 删除文件夹; rm *./txt:删除所有同类文件比如txt格式 rm *:删除所有文件(不包括文件夹) rm * -r :删除所有文件夹 rmkdir:删除一个空的目录 mv XXX YYY: 将XXX文件移动到YYY,和cp命令一样,XXX和YYY可以是一个路径;重命名也是用这个命令 touch XXX: 创建一个文件 cat XXX: 展示文件XXX中的内容 tac最后一行开始显示 ; nl显示同时带行号 复制文本:windows/Linux下:Ctrl + insert,Mac下:command + c 粘贴文本:windows/Linux下:Shift + insert,Mac下:command + v clear清屏 history:查看历史输入指令 tree:以树形显示文件目录结构 file xxx:查看文件属性 where/which xxx:查看xxx在哪 ","date":"2023-06-15","objectID":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/:3:0","tags":["Computer","Linux"],"title":"Linx常用指令","uri":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/"},{"categories":["Computer"],"content":"3. 环境变量 环境变量类似于全局变量,可以被各个进程访问到。我们可以通过修改环境变量来方便地修改系统配置。 查看 列出当前环境下的所有环境变量: env # 显示当前用户的变量 set # 显示当前shell的变量,包括当前用户的变量; export # 显示当前导出成用户变量的shell变量 输出某个环境变量的值: echo $PATH 修改 为了将对环境变量的修改应用到未来所有环境下,可以将修改命令放到~/.bashrc文件中。 修改完~/.bashrc文件后,记得执行source ~/.bashrc,来将修改应用到当前的bash环境下。 declare设置环境变量 declare [+/-][选项] [变量名=变量值] unset \u003c待清除环境变量\u003e 常见环境变量 HOME:用户的家目录。 PATH:可执行文件(命令)的存储路径。路径与路径之间用:分隔。当某个可执行文件同时出现在多个路径中时,会选择从左到右数第一个路径中的执行。下列所有存储路径的环境变量,均采用从左到右的优先顺序。 LD_LIBRARY_PATH:用于指定动态链接库(.so文件)的路径,其内容是以冒号分隔的路径列表。 C_INCLUDE_PATH:C语言的头文件路径,内容是以冒号分隔的路径列表。 CPLUS_INCLUDE_PATH:CPP的头文件路径,内容是以冒号分隔的路径列表。 PYTHONPATH:Python导入包的路径,内容是以冒号分隔的路径列表。 JAVA_HOME:jdk的安装目录。 CLASSPATH:存放Java导入类的路径,内容是以冒号分隔的路径列表。 ","date":"2023-06-15","objectID":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/:4:0","tags":["Computer","Linux"],"title":"Linx常用指令","uri":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/"},{"categories":["Computer"],"content":"module工具箱 常用命令 说明 module av | avail 显示可使用模块 module li | list 显示已加载模块 module load |add [modulefile] 加载模块 module unload |rm [modulefile] 卸载模块 module purge 清理说有已加载模块 module show|display [modulefile] 查询配置信息 module use [modulefile] 使用其他module path module switch|swap [modulefile1] [modulefile2] 切换modulefile ","date":"2023-06-15","objectID":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/:4:1","tags":["Computer","Linux"],"title":"Linx常用指令","uri":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/"},{"categories":["Computer"],"content":"4. 常用命令 linux-command Linux命令手册 ","date":"2023-06-15","objectID":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/:5:0","tags":["Computer","Linux"],"title":"Linx常用指令","uri":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/"},{"categories":["Computer"],"content":"常用终端快捷键 ctrl c: 取消命令,并且换行 ctrl d:关闭终端 crtl l: 清空终端 (也可输入clear) ctrl u: 擦除从当前光标位置到行首的全部内容 ctrl k: 擦除从当前光标位置到行尾的全部内容 crtl w: 擦除光标位置前的单词 crtl y: 粘贴使用前三种查出方式的文本。(误删的时候派上用场) crtl a: 移动光标到行首 crtl e: 移动光标到行尾 tab键:可以补全命令和文件名,如果补全不了快速按两下tab键,可以显示备选选项 ","date":"2023-06-15","objectID":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/:5:1","tags":["Computer","Linux"],"title":"Linx常用指令","uri":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/"},{"categories":["Computer"],"content":"4.1 系统状况 top:查看所有进程的信息(Linux的任务管理器) 打开后,输入M:按使用内存排序 打开后,输入P:按使用CPU排序 打开后,输入q:退出 df -h:查看硬盘使用情况 free -h:查看内存使用情况 du -sh:查看当前目录占用的硬盘空间 du -shc *:查看当前子目录占用的硬盘空间 du -h --max-depth=1 ~/:home 目录下的所有文件和文件夹的大小,但只显示一层目录深度 du -h -d 1 /var/lib/docker/overlay2 | grep -E \"G\\b\":只显示大小中包含以 G 结尾的整个单词 ps aux:查看所有进程 ps aux | grep xxx:使用管道查看具体xxx进程 ps -ef | grep xxx:更详细 kill -9 pid:杀死编号为pid的进程 传递某个具体的信号:kill -s SIGTERM pid netstat -nt:查看所有网络连接 netstat -nlp | grep 22:查看22端口号是否被占用 查看网络设置:netsh wlan show profiles 具体设置:wlan show profiles wifi名称 key=clear w:列出当前登陆的用户 ping www.baidu.com:检查是否连网 ","date":"2023-06-15","objectID":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/:5:2","tags":["Computer","Linux"],"title":"Linx常用指令","uri":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/"},{"categories":["Computer"],"content":"4.2 文件权限 chown:更改文件属性 chown bin xxx:将xxx的拥有者改为bin账号 chown root:root xxx:将xxx拥有者与群组改回root chmod:修改文件权限 drwxrwxrwx10位 第一位d是不是文件夹,超链接 第一组rwx:自己——可读,可写,可执行 二进制(rwx 111对应7) 第二组rwx:同组——可读,可写,可执行 第三组rwx:其他——可读,可写,可执行 chmod +x xxx:给xxx添加可执行权限 chmod -x xxx:去掉xxx的可执行权限 chmod 777 xxx:将xxx的权限改成777 chmod 777 xxx -R:递归修改整个文件夹的权限 chgrp:更改文件属组 file: 查看文件类型 stat:查看文件属性 ","date":"2023-06-15","objectID":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/:5:3","tags":["Computer","Linux"],"title":"Linx常用指令","uri":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/"},{"categories":["Computer"],"content":"4.3 文件检索 find /path/to/directory/ -name '*.py':搜索某个文件路径下的所有*.py文件 find . -path \"./envs\" -prune -o -name \"libatomic.a\" -print:同时排除特定的目录envs grep xxx:从stdin中读入若干行数据,如果某行中包含xxx,则输出该行;否则忽略该行。 find XXX/ -name '*.cpp' | xargs cat | grep xxx在XXX文件夹的cpp文件搜索xxx wc:统计行数、单词数、字节数 既可以从stdin中直接读入内容;也可以在命令行参数中传入文件名列表; wc -l:统计行数 wc -w:统计单词数 wc -c:统计字节数 tree:展示当前目录的文件结构 tree /path/to/directory/:展示某个目录的文件结构 tree -a:展示隐藏文件 ag xxx:搜索当前目录下的所有文件,检索xxx字符串 ag命令的帮助文档 cut:分割一行内容 从stdin中读入多行数据 echo $PATH | cut -d ':' -f 3,5:输出PATH用:分割后第3、5列数据 echo $PATH | cut -d ':' -f 3-5:输出PATH用:分割后第3-5列数据 echo $PATH | cut -c 3,5:输出PATH的第3、5个字符 echo $PATH | cut -c 3-5:输出PATH的第3-5个字符 sort:将每行内容按字典序排序 可以从stdin中读取多行数据 可以从命令行参数中读取文件名列表 xargs:将stdin中的数据用空格或回车分割成命令行参数 find . -name '*.py' | xargs cat | wc -l:统计当前目录下所有python文件的总行数 ","date":"2023-06-15","objectID":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/:5:4","tags":["Computer","Linux"],"title":"Linx常用指令","uri":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/"},{"categories":["Computer"],"content":"4.4 查看文件内容 more:浏览文件内容 回车:下一行 空格:下一页 b:上一页 q:退出 less:与more类似,功能更全 回车:下一行 y:上一行 Page Down:下一页 Page Up:上一页 q:退出 head -3 xxx:展示xxx的前3行内容 同时支持从stdin读入内容 tail -3 xxx:展示xxx末尾3行内容 同时支持从stdin读入内容 tail -n 5 xxx:查看xxx尾部5行内容 (常用于日志) tail -f xxx:实时追踪该xxx文档的所有更新 (常用于 flume 采集数据) ","date":"2023-06-15","objectID":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/:5:5","tags":["Computer","Linux"],"title":"Linx常用指令","uri":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/"},{"categories":["Computer"],"content":"4.5 用户相关 history:展示当前用户的历史操作。内容存放在~/.bash_history中 终端中粘贴字符时头尾出现“0~“和“1~“的特殊字符 : printf \"\\e[?2004l\" ","date":"2023-06-15","objectID":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/:5:6","tags":["Computer","Linux"],"title":"Linx常用指令","uri":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/"},{"categories":["Computer"],"content":"4.6 工具 md5sum:计算md5哈希值 可以从stdin读入内容 也可以在命令行参数中传入文件名列表; time command:统计command命令的执行时间 ipython3:交互式python3环境。可以当做计算器,或者批量管理文件。 ! echo \"Hello World\":!表示执行shell脚本 watch -n 0.1 command:每0.1秒执行一次command命令 tar:压缩文件 tar -zcvf xxx.tar.gz /path/to/file/*:压缩 tar -zxvf xxx.tar.gz:解压缩 tar -zxvf xxx.tar.gz -C yyy:解压到指定目录 yyy 中 diff xxx yyy:查找文件xxx与yyy的不同点 rpm2cpio demo.rpm | cpio -idmv :解压demo.rmp文件 ln -s /usr/home/file /usr/home/abc :软连接: 在目录/usr/home下建立一个符号链接文件abc,使它指向目录/usr/home/file scp source@host:filename targt@host:filename:远程拷贝文件 [源文件] [目标文件] sz demo:下载文件 strace ./demo:用于跟踪进程的系统调用以及接收和发送到内核的信号 ","date":"2023-06-15","objectID":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/:5:7","tags":["Computer","Linux"],"title":"Linx常用指令","uri":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/"},{"categories":["Computer"],"content":"4.7 安装软件 sudo command:以root身份执行command命令 apt-get install xxx:安装软件 pip install xxx --user --upgrade:安装python包 yum常用命令 列出所有可更新的软件清单命令:yum check-update 更新所有软件命令:yum update 仅安装指定的软件命令:yum install \u003cpackage_name\u003e 仅更新指定的软件命令:yum update \u003cpackage_name\u003e 列出所有可安裝的软件清单命令:yum list 删除软件包命令:yum remove \u003cpackage_name\u003e 查找软件包命令:yum search \u003ckeyword\u003e ","date":"2023-06-15","objectID":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/:5:8","tags":["Computer","Linux"],"title":"Linx常用指令","uri":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/"},{"categories":["Computer"],"content":"5. 管道 管道类似于文件重定向,可以将前一个命令的stdout重定向到下一个命令的stdin。 管道命令仅处理stdout,会忽略stderr。 管道右边的命令必须能接受stdin。 多个管道命令可以串联。 与文件重定向的区别 文件重定向左边为命令,右边为文件。 管道左右两边均为命令,左边有stdout,右边有stdin。 |是管道链接符 用于两个管道之间的链接与通信 # 统计当前目录下所有python文件的总行数,其中find、xargs、wc等命令可以参考常用命令这一节内容。 find . -name '*.py' | xargs cat | wc -l # find . -name “.py | cat:获取为.py结尾的文件名 # find . -name “.py | xargs cat:获取.py文件的内容 # wc -l:统计行数 # xargs将stdin的内容用空行隔开,作为cat的命令行参数,传给cat ","date":"2023-06-15","objectID":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/:6:0","tags":["Computer","Linux"],"title":"Linx常用指令","uri":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/"},{"categories":["Computer"],"content":"6. 用户组的管理 添加新的用户账号 :useradd 选项 用户名 删除帐号:userdel 选项 用户名 修改帐号:usermod 选项 用户名 增加一个新的用户组:groupadd 选项 用户组 删除一个已有的用户组:groupdel 用户组 修改用户组的属性:groupmod 选项 用户组 ","date":"2023-06-15","objectID":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/:7:0","tags":["Computer","Linux"],"title":"Linx常用指令","uri":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/"},{"categories":["Computer"],"content":"7. 定时任务 atd一次性定时任务配置 at \u003coptions\u003e \u003ctime\u003e at now +2 minutes # 2分钟执行ls命令并把命令执行结果输出到/tmp/ls.txt文件 at\u003e ls -al \u003e ~/temp/ls.txt crtl + d atq # 查询 cron周期性定时任务配置 crontab \u003coptions\u003e \u003cfile\u003e crontab -e ***** /test.sh crontab -l ","date":"2023-06-15","objectID":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/:8:0","tags":["Computer","Linux"],"title":"Linx常用指令","uri":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/"},{"categories":["Computer"],"content":"常用情况 windows换行符\\r对应的显示^M;转换为unix格式的\\n 方法1:dos2unix filename 方法2:vi filename打开文件,执行 :set ff=unix 设置文件为unix,然后执行:wq,保存成unix格式。 方法3:使用sed命令sed -i \"s/\\r//\" filename 或 sed -i \"s/^M//\" filename直接替换结尾符 ","date":"2023-06-15","objectID":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/:9:0","tags":["Computer","Linux"],"title":"Linx常用指令","uri":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/"},{"categories":["Computer"],"content":"常用C++指令 readelf readelf -d ./demo:查看动态段信息,比如共享库依赖、动态符号、重定位信息、程序入口点等 readelf -a ./demo :显示 ELF 文件的 所有 可用信息,依次输出所有的 ELF 文件头、节区头、符号表、动态段、重定位信息等内容。 readelf –h ./demo:显示 ELF 文件的 文件头信息 readelf --segments ./demo:显示 ELF 文件的 程序头信息 readelf -s --wide ./demo:显示 ELF 文件的 符号表信息,--wide 选项用于避免输出被截断 readelf -S ./demo : 显示 ELF 文件的 节区头信息,查看demo是否是debug模式编译文件,看section里有没有.debug_info 也可用--sections c++filt 解析C++符号工具 反汇编操作 llvm-objdump -d # 反汇编命令,默认cpu, extractkernel -i # 反汇编命令,dcu llvm-amdgpu-objdump --inputs= # 反汇编命令,amdgpu ","date":"2023-06-15","objectID":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/:10:0","tags":["Computer","Linux"],"title":"Linx常用指令","uri":"/posts/computer/linux%E5%B8%B8%E7%94%A8%E6%8C%87%E4%BB%A4/"},{"categories":["Deep Learning"],"content":"BERT","date":"2023-06-10","objectID":"/posts/deeplearning/transformer_nlp/bert/","tags":["Deep Learning","Transformer","NLP"],"title":"BERT","uri":"/posts/deeplearning/transformer_nlp/bert/"},{"categories":["Deep Learning"],"content":"BERT 文章标题:BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding 作者:Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova 发表时间:(NAACL-HLT 2019) 官方代码 ==Transformer一统NLP的开始== BERT: 用深的、双向的、transformer 来做预训练,用来做语言理解的任务。 pre-training: 在一个大的数据集上训练好一个模型 pre-training,模型的主要任务是用在其它任务 training 上 deep bidirectional transformers: 深的双向 transformers language understanding: 更广义,transformer 主要用在机器翻译 MT ","date":"2023-06-10","objectID":"/posts/deeplearning/transformer_nlp/bert/:1:0","tags":["Deep Learning","Transformer","NLP"],"title":"BERT","uri":"/posts/deeplearning/transformer_nlp/bert/"},{"categories":["Deep Learning"],"content":"Abstract 新的语言表征模型 BERT: Bidirectional Encoder Representations from Transformers Elmo_GPT_Bert ELMo:使用左右侧的上下文信息 ;基于RNN,应用下游任务需要一点点调整架构 GPT:使用左边的上下文信息,预测未来 BERT:使用左右侧的上下文信息 ;基于Transformer,应用下游任务只需要调整最上层 从无标注的文本中(jointly conditioning 联合左右的上下文信息)预训练得到无标注文本的 deep bidirectional representations BERT = ELMo 的 bidirectional 信息 + GPT 的新架构 transformer ","date":"2023-06-10","objectID":"/posts/deeplearning/transformer_nlp/bert/:2:0","tags":["Deep Learning","Transformer","NLP"],"title":"BERT","uri":"/posts/deeplearning/transformer_nlp/bert/"},{"categories":["Deep Learning"],"content":"Introduction NLP任务分两类 sentence-level tasks :句子情绪识别、两个句子的关系; token-level tasks :NER (人名、街道名) 需要 fine-grained output BERT训练方法 通过 MLM 带掩码的语言模型作为预训练的目标,来减轻语言模型的单向约束。inspired by the Close task 1953 **MLM ** (masked language model):每次随机选输入的词源 tokens, 然后 mask 它们,目标函数是预测被 masked 的词;15%的词汇mask 假设输入里面的第二个词汇是被盖住的,把其对应的embedding输入到一个多分类模型中,来预测被盖住的单词。类似挖空填词、完形填空 standard language model:只看左边的信息 NSP: (next sentence prediction ):预测下一个句子;判断两个句子是随机采样的 or 原文相邻,学习sentence-level 的信息。 把两句话连起来,中间加一个[SEP]作为两个句子的分隔符。而在两个句子的开头,放一个[CLS]标志符,将其得到的embedding输入到二分类的模型,输出两个句子是不是接在一起的。 在训练BERT的时候,这两个任务是同时训练的。所以,BERT的损失函数是把这两个任务的损失函数加起来的,是一个「多任务」训练 贡献 bidirectional 双向信息的重要性 BERT 首个微调模型,在 sentence-level and token-level task效果好 好的预训练模型,不用对特定任务做一些模型架构的改动 ","date":"2023-06-10","objectID":"/posts/deeplearning/transformer_nlp/bert/:3:0","tags":["Deep Learning","Transformer","NLP"],"title":"BERT","uri":"/posts/deeplearning/transformer_nlp/bert/"},{"categories":["Deep Learning"],"content":"Related Work Unsupervised Feature-based approaches 非监督的基于特征表示的工作:词嵌入、ELMo等 Unsupervised Fine-tuning approaches 非监督的基于微调的工作:GPT等 Transfer Learning from Supervised Data 在有标签的数据上做迁移学习。 ","date":"2023-06-10","objectID":"/posts/deeplearning/transformer_nlp/bert/:4:0","tags":["Deep Learning","Transformer","NLP"],"title":"BERT","uri":"/posts/deeplearning/transformer_nlp/bert/"},{"categories":["Deep Learning"],"content":"Bert 预训练 + 微调 bert_stage pre-training:使用 unlabeled data 训练 fine-tuning:微调的 BERT 使用预训练的参数 初始化,所有的权重参数通过下游任务的 labeled data 进行微调。 每一个下游任务会创建一个 新的 BERT 模型,(由预训练参数初始化),但每一个下游任务会根据自己任务的labeled data 来微调自己的 BERT 模型。 model name L H A Total Parameters $BERT_{base}$ 12 768 12 110M $BERT_{base}$ 24 1024 16 340M L:transform blocks的个数 H:hidden size 隐藏层大小 A:自注意力机制 multi-head 中 head 头的个数 BERT 模型复杂度和层数 L 是 linear, 和宽度 H 是 平方关系。 深度变成了以前的两倍,在宽度上面也选择一个值,使得这个增加的平方大概是之前的两倍。 $H_{large}=\\sqrt {2} H_{base}=\\sqrt 2 \\times 768=1086$ H = 16,因为每个 head 的维度都固定在了64。所以宽度增加了, head 数也增加了 $H = 64 \\times A:\\ \\ 768=64\\times 12;\\ \\ 1024=64\\times 16$ 嵌入层:输入字典大小30k,输出H transformer blocks($H^2\\times 12$):self-attention($H^2\\times 4$) + MLP ($H^2\\times 8$) Transformer block: 多头Q,K,V投影矩阵合并$H(64\\times A)$+输出后再H*H投影 MLP 的 2个全连接层: 第一个全连接层输入是 H,输出是 4 * H; 第二个全 连接层输入是 4 * H,输出是 H。 $Total \\ Parameters = 30K\\times H + 12 \\times H^2 \\times L$ $BERT_{base} = 30000\\times 768 + 12 \\times 768^2 \\times 12 = 107.97M$ $BERT_{large} = 30000\\times 1024+ 12 \\times 1024^2\\times 24= 332.71M$ ","date":"2023-06-10","objectID":"/posts/deeplearning/transformer_nlp/bert/:5:0","tags":["Deep Learning","Transformer","NLP"],"title":"BERT","uri":"/posts/deeplearning/transformer_nlp/bert/"},{"categories":["Deep Learning"],"content":"Input/Output Representation(预训练\u0026微调共通部分) BERT 的输入和 transformer 区别 transformer 预训练时候的输入是一个序列对。编码器和解码器分别会输入一个序列。 BERT 只有一个编码器,为了使 BERT 能处理两个句子的情况,需要把两个句子并成一个序列。 BERT切词 WordPiece, 把一个出现概率低的词切开,只保留一个词出现频率高的子序列,30k token 经常出现的词(子 序列)的字典。 否则,空格切词 –\u003e 一个词是一个 token。数据量打的时候,词典会特别大,到百万级别。可学习的参数基 本都在嵌入层了。 BERT 的输入序列构成 [ CLS ] + [ SEP ] 输入序列 Token embeddings: 词源的embedding层,整成的embedding层, 每一个 token 有对应的词向量。 Segement embeddings: 这个 token 属于第一句话 A还是第二句话 B。 Position embedding 的输入是 token 词源在这个序列 sequence 中的位置信息。(和Transformer不一样,这是学习出来的) BERT 的 segment embedding (属于哪个句子)和 position embedding (位置在哪里)是学习得来的, transformer 的 position embedding 是给定的。 序列开始:[CLS] 输出的是句子层面的信息 sequence representation BERT 使用的是 transformer 的 encoder,self-attention layer 会看输入的每个词和其它所有词的关系。 就算 [ CLS ] 这个词放在我的第一个的位置,他也是有办法能看到之后所有的词。所以他放在第一个是没关 系的,不一定要放在最后。 区分两个合在一起的句子的方法: 每个句子后 + [ SEP ] 表示 seperate 学一个嵌入层 来表示整个句子是第一句还是第二句 ","date":"2023-06-10","objectID":"/posts/deeplearning/transformer_nlp/bert/:5:1","tags":["Deep Learning","Transformer","NLP"],"title":"BERT","uri":"/posts/deeplearning/transformer_nlp/bert/"},{"categories":["Deep Learning"],"content":"Pre-training BERT 预训练的 key factors: 目标函数,预训练的数据 MLM 由 WordPiece 生成的词源序列中的词源,它有 15% 的概率会随机替换成一个掩码。但是对于特殊的词源不 做替换 15% 计划被 masked 的词:80% 的概率被替换为 [MASK], 10% 换成 random token,10% 不改变原 token。 特殊的词源:第一个词源 [ CLS ] 和中间的分割词源 [SEP]。 问题:预训练和微调看到的数据不一样 预训练的输入序列有 15% [MASK],微调时的数据没有 [MASK]. 为什么要Mask 语言模型会根据前面单词来预测下一个单词,但是self-attention的注意力只会放在自己身上,那么这样100%预测到自己,毫无意义,所以用Mask,把需要预测的词给挡住。 Mask方式优缺点: 1)被随机选择15%的词当中以10%的概率用任意词替换去预测正确的词,相当于文本纠错任务,为BERT模型赋予了一定的文本纠错能力; 2)被随机选择15%的词当中以10%的概率保持不变,缓解了finetune时候与预训练时候输入不匹配的问题(预训练时候输入句子当中有mask,而finetune时候输入是完整无缺的句子,即为输入不匹配问题)。 3)针对有两个及两个以上连续字组成的词,随机mask字割裂了连续字之间的相关性,使模型不太容易学习到词的语义信息。主要针对这一短板,因此google此后发表了BERT-WWM,国内的哈工大联合讯飞发表了中文版的BERT-WWM。 NSP 输入序列有 2 个句子 A 和 B,50% 正例,50%反例 50% B 在 A 之后,是一对连续句子,标记为 IsNext;50% 是语料库中 a random sentence 随机采样的,标记为 NotNext。 NSP flight ## less:flightless 出现概率不高,WordPiece 分成了 2 个出现频率高的子序列,## 表示 less 是flightless 的一部分。 ","date":"2023-06-10","objectID":"/posts/deeplearning/transformer_nlp/bert/:5:2","tags":["Deep Learning","Transformer","NLP"],"title":"BERT","uri":"/posts/deeplearning/transformer_nlp/bert/"},{"categories":["Deep Learning"],"content":"Fine-tuning BERT BERT 经过微小的改造(增加一个小小的层),就可以用于各种各样的语言任务。 (a,b)与 Next Sentence Prediction类似,通过在 「[CLS]」 标记的 Transformer 输出顶部添加分类层,完成诸如情感分析之类的**「分类」**任务 (c)在问答任务(例如 SQuAD v1.1)中,会收到一个关于文本序列的问题,并需要在序列中标记答案。使用 BERT,可以通过学习标记答案开始和结束的两个额外向量来训练问答模型。 (d)在命名实体识别 (NER) 中,接收文本序列,并需要标记文本中出现的各种类型的实体(人、组织、日期等)。使用 BERT,可以通过将每个标记的输出向量输入到预测 NER 标签的分类层来训练 NER 模型 differernt tasks ","date":"2023-06-10","objectID":"/posts/deeplearning/transformer_nlp/bert/:5:3","tags":["Deep Learning","Transformer","NLP"],"title":"BERT","uri":"/posts/deeplearning/transformer_nlp/bert/"},{"categories":["Deep Learning"],"content":"拓展阅读 BERT 论文逐段精读【论文精读】 李宏毅:ELMO, BERT, GPT The Illustrated BERT, ELMo, and co. (How NLP Cracked Transfer Learning) BERT 详解(附带 ELMo、GPT 介绍) BERT 科普文 作者对双向的回应 ACL 2019:What does BERT learn about the structure of language?:BERT的低层网络就学习到了短语级别的信息表征,BERT的中层网络就学习到了丰富的语言学特征,而BERT的高层网络则学习到了丰富的语义信息特征 BERT Rediscovers the Classical NLP Pipeline 关于BERT:你不知道的事 ","date":"2023-06-10","objectID":"/posts/deeplearning/transformer_nlp/bert/:6:0","tags":["Deep Learning","Transformer","NLP"],"title":"BERT","uri":"/posts/deeplearning/transformer_nlp/bert/"},{"categories":["Deep Learning"],"content":"Transformer","date":"2023-06-10","objectID":"/posts/deeplearning/transformer_nlp/transformer/","tags":["Deep Learning","Transformer","NLP"],"title":"Transformer","uri":"/posts/deeplearning/transformer_nlp/transformer/"},{"categories":["Deep Learning"],"content":"Transformer 文章标题:Attention Is All You Need 作者:Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin 发表时间:(NIPS 2017) 继MLP、CNN、RNN后的第四大类架构 ","date":"2023-06-10","objectID":"/posts/deeplearning/transformer_nlp/transformer/:1:0","tags":["Deep Learning","Transformer","NLP"],"title":"Transformer","uri":"/posts/deeplearning/transformer_nlp/transformer/"},{"categories":["Deep Learning"],"content":"Introduction sequence transduction: 序列转录,序列到序列的生成。input一个序列,output一个序列。 机器翻译:输入一句中文,输出一句英文。 RNN :从左往右一步一步计算,对第 t 个状态 $h_t$,由 $h_{t-1}$(历史信息)和 当前词 t 计算。 难以并行。 通过 factorization 分解 tricks 和 conditional computation 并行化来提升计算效率 过早的历史信息可能被丢掉。时序信息是一步一步往后传递的 时序长的时候一个大的 $h_t$存历史信息。每一个 计算步都需要存储,内存开销大 ","date":"2023-06-10","objectID":"/posts/deeplearning/transformer_nlp/transformer/:2:0","tags":["Deep Learning","Transformer","NLP"],"title":"Transformer","uri":"/posts/deeplearning/transformer_nlp/transformer/"},{"categories":["Deep Learning"],"content":"Background CNN(局部像素–\u003e全部像素;多通道 –\u003e multi-head) Transformer 的 attention mechanism 每一次看到所有的像素,一层能够看到整个序列。 Transformer 的 multi-head self-attention 模拟 CNNs 多通道输出的效果。 自注意力,是一种将单个序列的不同位置关联起来以计算序列表示的注意力机制 ","date":"2023-06-10","objectID":"/posts/deeplearning/transformer_nlp/transformer/:3:0","tags":["Deep Learning","Transformer","NLP"],"title":"Transformer","uri":"/posts/deeplearning/transformer_nlp/transformer/"},{"categories":["Deep Learning"],"content":"Model Architecture The Transformer - model architecture 先将输入Input使用embedding algorithm转成向量。 编码器的都会接收到一个list(每个元素都是512维的词向量)。list的尺寸是可以设置的超参,通常是训练集的最长句子的长度。 加入位置编码Positional Encoding RNN :把上一时刻的输出 作为下一个时刻的输入,来传递时序信息 Attention: 在输入里面加入时序信息 –\u003e positional encoding output 是 value 的加权和(权重是 query 和 key 之间的距离,和序列信息无关) 一个词在嵌入层表示成一个 512 维的向量,用另一个 512 维的向量来表示一个数字代表位置信息 Transformer_positional_Encoding positional encoding 是 cos 和 sin 的一个函数,在 [-1, +1] 之间抖动的。 $PE_{(pos,2i)}=sin(pos/10000^{2i/d_{model}})$ $PE_{(pos,2i+1)}=cos(pos/10000^{2i/d_{model}})$ 矩阵第pos行第2i列;行代表词元在序列中的位置,列代表位置编码的不同维度 为啥设计这样的函数,参考位置编码 $input\\ embedding * \\sqrt{d_{model}}$ 学 embedding 的时候,会把每一个向量的 L2 Norm 学的比较小。 乘上$\\sqrt{d_{model}}$使得 embedding 和 positional encoding 的 scale 也是在差不多的 [-1, +1] 数值区间,可以做加法 加入位置编码后再进行dropout=0.1。 ","date":"2023-06-10","objectID":"/posts/deeplearning/transformer_nlp/transformer/:4:0","tags":["Deep Learning","Transformer","NLP"],"title":"Transformer","uri":"/posts/deeplearning/transformer_nlp/transformer/"},{"categories":["Deep Learning"],"content":"Encoder Transformer的编码器是由多(N=6)个相同的层叠加而成的,每个层都有两个子层(子层表示为sublayer)。 第一个子层是**[多头自注意力](#####Multi-Head Attention)(multi-head self-attention)**; 输入key、value 和 query 其实就是一个东西,就是自己本身 第二个子层是基于位置的前馈网络(position-wise feed-forward network)。 作用在最后一个维度的 MLP Point-wise: 把一个 MLP 对每一个词 (position)作用一次,对每个词作用的是是同一个多层感知机(MLP) $FFN(x)=max(0,xW_1+b_1)W_2+b_2$:512–\u003e2048–\u003e512 每个子层都采用了残差连接(residual connection)和层规范化(layer normalization) $LayerNorm(x+Sublayer(x))$ Transformer_LayerNor residual connections 需要输入输出维度一致,不一致需要做投影。简单起见,固定每一层的输出维度$d_{model }$= 512 简单设计:只需调 2 个参数: $d_{model }$ 每层维度有多大 和 N 多少层,影响后续一系列网络的设计,BERT、GPT。 层规范化(layer normalization) H:句长,W:词向量长 N:Batch Layer Normalization:是在一个句上的进行归一化。 Batch Normalization:是把每个Batch中每句话的第一个字的同一维度看成一组做归一化。 LayerNorm 每个样本自己算均值和方差,不需要存全局的均值和方差。 LayerNorm 更稳定,不管样本长还是短,均值和方差是在每个样本内计算。 Attention 注意力函数是 一个将一个 query 和一些 key - value 对 映射成一个输出的函数,其中所有的 query、key、value 和 output 都是一些向量。 output 是 value 的一个加权和 –\u003e 输出的维度 == value 的维度。 query改变,权值分配不一样,输出不一样 query 和 key 的长度是等长的,都等于 dk。value 的维度是 dv,输出也是 dv。 query 和 key 可以不等长,可用加性的注意力机制处理。 Transformer_attention Scaled Dot-product Attention Transformer_attention 注意力的具体计算是:对每一个 query 和 key 做内积,然后把它作为相似度。 两个向量做内积:用来衡量两向量的相似度。内积的值越大,它的余弦值越大,这两个向量的相似度就越高。如果你的内积的值为 0 ,这两个向量正交了,没有相似度。 $$ Attention(Q,K,V)=softmax(\\frac{QK^T}{\\sqrt {d_k}})V $$一个 query 对所有 key 的内积值,然后再除以$\\sqrt{d_k}$, 再做 softmax。 softmax 是对每一行的值做 softmax,然后每一行之间是独立的,会得到权重。 除以$\\sqrt{d_k}$:防止softmax函数的梯度消失。 2 个向量的长度比较长的时候,点积的值可能会比较大,相对的差距会变大,导致最大值 softmax会更加靠近于1,剩下那些值就会更加靠近于0。值就会更加向两端靠拢,算梯度的时候,梯度比较小。 Mask机制 padding mask:对输入序列进行对齐。 具体来说,就是给在较短的序列后面填充 0。但是如果输入的序列太长,则是截取左边的内容,把多余的直接舍弃。 操作和Sequence mask一致。 Sequence mask:避免在 t 时刻,看到 t 时刻以后的东西。(选择使用,在decoder时使用) 操作实现:把$ Q_t $和 $K_t $和他们之后的值换成一个很大的负数,进入 softmax 后,权重为0。 和 V 矩阵做矩阵乘法时,没看到 t 时刻以后的内容,只看 t 时刻之前的 key - value pair。 mask是个 0 1矩阵,和attention(scale QK)size一样,t 时刻以后 mask 为 0。 Multi-Head Attention 多头机制扩大了模型对不同位置的关注能力 多头机制赋予attention多种子表达方式 先投影到低维,投影的 w 是可以学习的;multi-head attention 给 h 次机会去学习 不一样的投影的方法,使得在投影进去的度量空间里面能够去匹配不同模式需要的一些相似函数,然后把 h 个 heads 拼接起来,最后再做一次投影。 输入:原始的 value、key、query 进入一个Linear层,把 value、key、query 投影到比较低的维度。然后再做一个 scaled dot product 。执行 h 次会得到 h 个输出,再把 h 个 输出向量全部合并 concat 在一起,最后做一次线性的投影 Linear。 投影维度 $d_v = d_{model} / h = 512 / 8 = 64$,每个 head 得到 64 维度,concat,再投影回 $d_{model}$。 concat过程 ","date":"2023-06-10","objectID":"/posts/deeplearning/transformer_nlp/transformer/:4:1","tags":["Deep Learning","Transformer","NLP"],"title":"Transformer","uri":"/posts/deeplearning/transformer_nlp/transformer/"},{"categories":["Deep Learning"],"content":"Decoder Decoder 是 auto-regressive 自回归。当前时刻的输入是之前一些时刻的输出。做预测时,decoder 不能看到之后时刻的输出。 Transformer解码器也是由多(N=6)个相同的层叠加而成的,每个层都有三个子层(子层表示为sublayer)。 attention mechanism 每一次能看完完整的输入,要避免这个情况的发生。 第一个子层是带掩码的多头自注意力(Masked multi-head self-attention); 输入qkv复制 3 份 masked 体现在,在预测第 t 个时刻的输出的时候,看不到 t 时刻以后的输入,具体操作看Mask机制,两个Mask相加。 保留了自回归(auto-regressive)属性,确保预测仅依赖于已生成的输出词元。 第二个子层是**[多头自注意力](#####Multi-Head Attention)(multi-head self-attention)**; 不再是 self-attention。 key - value 来自 encoder 的输出。 query 是来自 decoder 里 masked multi-head attention 的输出。 attention:query 注意到当前的 query 感兴趣的东西,对当前的 query的不感兴趣的内容,可以忽略掉。 在 encoder 和 decoder 之间传递信息 第三个子层是基于位置的前馈网络(position-wise feed-forward network)。 每个子层都采用了残差连接(residual connection)和层规范化(layer normalization) 关于序列到序列模型(sequence-to-sequence model),在训练阶段,其输出序列的所有位置的词元都是已知的;然而,在预测阶段,其输出序列的词元是逐个生成的。因此,只有生成的词元才能用于解码器的自注意力计算中。流程如下(包含解码器Decoder的shifted right 输入状况): decoder_step1 Decoder_step_end ","date":"2023-06-10","objectID":"/posts/deeplearning/transformer_nlp/transformer/:4:2","tags":["Deep Learning","Transformer","NLP"],"title":"Transformer","uri":"/posts/deeplearning/transformer_nlp/transformer/"},{"categories":["Deep Learning"],"content":"The Final Linear and Softmax Layer 线性层是个简单的全连接层,将解码器的最后输出映射到一个非常大的logits向量上。 假设模型已知有1万个单词(输出的词表)从训练集中学习得到。那么,logits向量就有1万维,每个值表示是某个词的可能倾向值。 softmax层将这些分数转换成概率值(都是正值,且加和为1),最高值对应的维上的词就是这一步的输出单词。 ","date":"2023-06-10","objectID":"/posts/deeplearning/transformer_nlp/transformer/:4:3","tags":["Deep Learning","Transformer","NLP"],"title":"Transformer","uri":"/posts/deeplearning/transformer_nlp/transformer/"},{"categories":["Deep Learning"],"content":"拓展阅读 哈佛注释版:The Annotated Transformer 斯坦福100+作者的200+页综述 对LayerNorm的新研究 对Attention在Transformer里面作用的研究 B站:Transformer论文逐段精读【论文精读】 B站:Transformer中Self-Attention以及Multi-Head Attention详解 B站:Transformer模型(1/2): 剥离RNN,保留Attention The Illustrated Transformer Transformer 论文详细解读:多配图 详解Transformer中Self-Attention以及Multi-Head Attention 知乎:【Transformer】10分钟学会Transformer | Pytorch代码讲解 | 代码可运行 知乎:深度学习attention机制中的Q,K,V分别是从哪来的? 芦苇的机器学习笔记:Self-Attention和Transformer 李沐:动手学深度学习——10.7. Transformer ","date":"2023-06-10","objectID":"/posts/deeplearning/transformer_nlp/transformer/:5:0","tags":["Deep Learning","Transformer","NLP"],"title":"Transformer","uri":"/posts/deeplearning/transformer_nlp/transformer/"},{"categories":["Deep Learning"],"content":"MAE","date":"2023-06-09","objectID":"/posts/deeplearning/transformer_cv/mae/","tags":["Deep Learning","Transformer","CV"],"title":"MAE","uri":"/posts/deeplearning/transformer_cv/mae/"},{"categories":["Deep Learning"],"content":"MAE 文章标题:Masked Autoencoders Are Scalable Vision Learners 作者:Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, Ross Girshick 发表时间:2021 官方代码 BERT的CV版 Masked Autoencoders are scalable vision learners 带掩码的自编码器 是可扩展的视觉学习器 两个词的用法 scalable:可扩展的,模型比较大 efficient:算法特别快 vision learners:一个 backbone 的模型 masked:来源于 BERT: 每次挖掉一些东西,然后去预测挖掉的东西 Auto-encoder: auto “自”,ML模型 auto 自模型; 样本 x 和 标号 y 来自于同样的句子里面的词 –\u003e auto 加 auto 在 encoder之前,MAE 的图片标号是图片本身,区分于其它工作 ","date":"2023-06-09","objectID":"/posts/deeplearning/transformer_cv/mae/:1:0","tags":["Deep Learning","Transformer","CV"],"title":"MAE","uri":"/posts/deeplearning/transformer_cv/mae/"},{"categories":["Deep Learning"],"content":"What makes masked autoencoding different between vision and language? 什么使得 带掩码的自编码器模型在 CV 和 NLP 处理上的不一样呢? CV 使用 CNN,卷积窗口不好将 mask 放进去 CNN 在一张图片上,使用一个卷积窗口、不断地平滑,来汇聚一些像素上面的信息 + 模式识别 Transformer 的一个 mask 对应的是一个特定的词,会一直保留,和别的词区分开来 卷积上不好做掩码:图片的一块盖住 by 像素替换成一个特定的值,卷积窗口扫过来、扫过去时,无法区分边界,无法保持 mask 的特殊性,无法拎出来 mask;最后从掩码信息很难还原出来 卷积不好加入位置编码? 不那么充分 Transformer 需要位置编码:attention 机制没有位置信息 卷积自带位置信息,不断平移时,不需要加入位置信息 语言和图片的信息密度不同 NLP 的一个词是一个语义的实体,一个词在字典里有很长的解释;一句话去掉几个词,任务很难,i.e., 完形填空 –\u003e BERT 的 mask 比例不能过高 CV 的图片的Mask Mask块太少,直接通过对邻居的像素值进行插值还原,太简单 随机去掉很高比例的块,极大降低图片的冗余性,迫使模型学习更好的表征:nontrivial 任务,使模型去看 一张图片的 holistic 全局信息,而不仅关注局部 The autoencoder‘s decoder CV 还原图片的原始像素:低层次的表示 NLP 还原句子里的词:语义层次更高,i.e., BERT 的一个全连接层还原词 图片分类、目标检测的 decoder:一个全连接层 语义分割(像素级别的输出):一个全连接层不够,很有可能使用一个转置的卷积神经网络、来做一个比较大解码器。 ","date":"2023-06-09","objectID":"/posts/deeplearning/transformer_cv/mae/:2:0","tags":["Deep Learning","Transformer","CV"],"title":"MAE","uri":"/posts/deeplearning/transformer_cv/mae/"},{"categories":["Deep Learning"],"content":"Approach 随机盖住图片里的一些块(patch, image 的一个块),再重构缺失的像素。 MAE 预训练流程:input –\u003e patches –\u003e masked –\u003e unmasked patches in encoder –\u003e unmasked + masked 按位置排列进 decoder –\u003e decoder 重构 masked patches 的像素 patches + masked:一张红色鸟图片进来,切成 patches,masked 块 (3/4) 是 灰色的。 unmasked patches,encoder:没有 masked (1 / 4) 的块 进入 encoder (ViT),得到每一块的特征(蓝色)。 encoder 的输出 和 masked tokens 按照在图片中的原始位置排列成一长条向量 (包含位置信息)。 长条向量 进入 decoder,解码器尝试重构缺失的像素信息,还原原始图片 解码器的最后一层: a linear projection 一个 patch 是 16 * 16 像素的话,线性层会投影到长为 256 的维度,再 reshape(16, 16), 还原原始像素信息 损失函数: MSE,像素值相减,再平方和(只作用于非可见块的损失,和 BERT 一样) encoder 比 decoder 高:计算量主要来自于 encoder,对图片的像素进行编码 用 MAE 做一个 CV 的任务,只需要用编码器。一张图片进来,不需要做掩码,直接切成 patches 格子块,然后得到所有 patches 的特征表示,当成是这张图片的特征表达,用来做 CV 的任务 ","date":"2023-06-09","objectID":"/posts/deeplearning/transformer_cv/mae/:3:0","tags":["Deep Learning","Transformer","CV"],"title":"MAE","uri":"/posts/deeplearning/transformer_cv/mae/"},{"categories":["Deep Learning"],"content":"Simple implementation 对每一个输入 patch 生成 a token:一个一个 patch 的线性投影 + 位置信息 随机采样:randomly shuffle 随机打断序列,把最后一块拿掉。 从头部均匀的、没有重置的样本采样 25% 意味着 随机 shuffle, 只保留前 25% after encoding 解码时:append 跟以前长度一样的这些掩码的一些词源 mask tokens (一个可以学习的向量 + 位置信息),重新 unshuffle 还原到原来的顺序 MSE 算误差时,跟原始图的 patches 对应 ","date":"2023-06-09","objectID":"/posts/deeplearning/transformer_cv/mae/:4:0","tags":["Deep Learning","Transformer","CV"],"title":"MAE","uri":"/posts/deeplearning/transformer_cv/mae/"},{"categories":["Deep Learning"],"content":"拓展阅读 MAE 论文逐段精读【论文精读】 ","date":"2023-06-09","objectID":"/posts/deeplearning/transformer_cv/mae/:5:0","tags":["Deep Learning","Transformer","CV"],"title":"MAE","uri":"/posts/deeplearning/transformer_cv/mae/"},{"categories":["Deep Learning"],"content":"VIT","date":"2023-06-09","objectID":"/posts/deeplearning/transformer_cv/vit/","tags":["Deep Learning","Transformer","CV"],"title":"VIT","uri":"/posts/deeplearning/transformer_cv/vit/"},{"categories":["Deep Learning"],"content":"[toc] ","date":"2023-06-09","objectID":"/posts/deeplearning/transformer_cv/vit/:0:0","tags":["Deep Learning","Transformer","CV"],"title":"VIT","uri":"/posts/deeplearning/transformer_cv/vit/"},{"categories":["Deep Learning"],"content":"Vision Transformer (VIT) 文章标题:An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale 作者:Alexey Dosovitskiy; Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn,Xiaohua Zhai 发表时间:(ICLR 2021) Transformer杀入CV界 官方代码 每一个方格都是 16 * 16 大小,图片有很多 16 * 16 方格 patches –\u003e an image is worth 16 * 16 words 一个 224 * 224 图片 变成一个 196 个的 16 * 16 图片块(words in NLP)。 vit ","date":"2023-06-09","objectID":"/posts/deeplearning/transformer_cv/vit/:1:0","tags":["Deep Learning","Transformer","CV"],"title":"VIT","uri":"/posts/deeplearning/transformer_cv/vit/"},{"categories":["Deep Learning"],"content":"Introdouction Transformer 应用在 CV 的难点 计算像素的 self-attention,序列长,维度爆炸 Trnasformer 的计算复杂度是序列长度 n 的 平方即 $O(n^2)$ 224 分辨率的图片,有 50176 个像素点,(2d 图片 flatten)序列长度是 BERT(512) 的近 100 倍。 CV 如何用attention( 降低序列长度) CNN 结构 + self-attention Non-Local Network, 网络中的特征图当作输入 Transformer DETR attention 替代卷积 stand-alone attention 孤立自注意力 用 local window 局部小窗口控制 transformer 的计算复杂度 axial attention 轴注意力 2d变成2个1d 顺序操作,降低计算复杂度 Transformer 比 CNN 少 inductive biases 归纳偏置(先验知识 or 提前的假设) CNN 的 inductive biases 是 locality 和 平移等变性 translation equaivariance(平移不变性 spatial invariance) locality: CNN用滑动窗口在图片上做卷积。假设是图片相邻的区域有相似的特征。 translation equaivariance:f (g(x)) = g( f(x) );f 和 g 函数的顺序不影响结果。 CNN 的卷积核 像一个 template 模板,同样的物体无论移动到哪里,遇到了相同的卷积核,它的输出一致 Transformer 没有这些先验信息,只能从图片数据里,自己学习对视觉世界的感知。 ViT 用了图片 2d 结构 的 inductive bias 地方:resolution adjustment 尺寸改变 和 patch extraction 抽patches ","date":"2023-06-09","objectID":"/posts/deeplearning/transformer_cv/vit/:2:0","tags":["Deep Learning","Transformer","CV"],"title":"VIT","uri":"/posts/deeplearning/transformer_cv/vit/"},{"categories":["Deep Learning"],"content":"Related work ICLR 2020 从输入图片里抽取 2 * 2 patches。 CIFAR-10 32 * 32 图片,2 * 2足够,16 * 16 会过大。 抽好 patch 之后,在 patches 上 做 self-attention。 ","date":"2023-06-09","objectID":"/posts/deeplearning/transformer_cv/vit/:3:0","tags":["Deep Learning","Transformer","CV"],"title":"VIT","uri":"/posts/deeplearning/transformer_cv/vit/"},{"categories":["Deep Learning"],"content":"VIT Model ViT-B/16为例 Model Patch_size Layers Hidden_size D MLP_size 4D Heads Params $Vit_{base}$ $16\\times16$ 12 768 3071 12 86M $Vit_{large}$ $16\\times16$ 24 1024 4096 16 307M $Vit_{huge}$ $14\\times14$ 32 1280 5120 16 632M 划分 patches,flatten patches 的线性投影 + patches 的位置信息,得到输入transformer 的 tokens 将图像$224×224×3$划分成大小$16×16$的patch(小方块),每个patch块可以看做是一个token(词向量),共有$(224/16)^2=196$个token,每个token的长度为$16×16×3=768$。[16, 16, 3] -\u003e [768] 在代码实现中,直接使用一个卷积核大小为16x16,步距为16,卷积核个数为768的卷积来实现。通过卷积[224, 224, 3] -\u003e [14, 14, 768],然后把H以及W两个维度展平即可[14, 14, 768] -\u003e [196, 768] 如果改变图像的输入大小,ViT不会改变patchs的大小,那么patchs的数量会发生变化,那么之前学习的pos_embed就维度对不上了,ViT采用的方案是通过插值来解决这个问题 [class] token:可训练的参数,长度为768的向量,Concat([1, 768], [196, 768]) -\u003e [197, 768] 所有的 tokens 在做两两的交互信息。因此,[CLS] 也会和所有的图片patches 的token 交互,从而 [CLS] 从图片 patches + position 的 embedding 学到有用信息,最后用**[CLS]** 做分类判断。 CV 通常的全局特征:feature map (14 * 14) –\u003e GAP globally average-pooling 全局平均池化 –\u003e a flatten vector 全局的图片特征向量 –\u003e MLP 分类 同样的,Transformer 的输出元素 + GAP也可以用做全局信息 + 分类,效果差异不大; ViT 对齐 标准的 transformer,选用 NLP 里常用的 CLS 和 1d position embedding Position Embedding:采用的是一个可训练的参数(1D Pos. Emb.) Add([197, 768], [197, 768]) -\u003e [197, 768] 选择不同位置编码几乎没有差异,原因是Transformer是直接在patch上操作而不是基于像素级,较少数量的 patches 之间的相对位置信息,容易学到,因此,空间信息编码方式差异没那么重要 vit_Position_Embedding ","date":"2023-06-09","objectID":"/posts/deeplearning/transformer_cv/vit/:4:0","tags":["Deep Learning","Transformer","CV"],"title":"VIT","uri":"/posts/deeplearning/transformer_cv/vit/"},{"categories":["Deep Learning"],"content":"Vit Architecture vit_Architecture MLP Head 整个Encoder的输出为[197, 768]我们仅仅保留最前面的CLS token作为全连接的输入[1, 768],然后接上全连接层及分类数n_class,使用交叉熵损失函数计算损失,反向传播更新网络的权重和参数。 在训练ImageNet21K时是由Linear+tanh激活函数+Linear组成。但是迁移到ImageNet1K上或者你自己的数据上时,只用一个Linear即可 ","date":"2023-06-09","objectID":"/posts/deeplearning/transformer_cv/vit/:4:1","tags":["Deep Learning","Transformer","CV"],"title":"VIT","uri":"/posts/deeplearning/transformer_cv/vit/"},{"categories":["Deep Learning"],"content":"Hybrid Architecture 前 CNN + 后 Transformer R50不同之处 R50的卷积层采用的StdConv2d不是传统的Conv2d 所有的BatchNorm层替换成GroupNorm层。 在原Resnet50网络中,stage堆叠次数 [3,4,6,3]。R50中,把stage4中的3个Block移至stage3中,变成 [3,4,9]。 通过R50 Backbone进行特征提取后,得到的特征矩阵shape是[14, 14, 1024],接着再输入Patch Embedding层,注意Patch Embedding中卷积层Conv2d的kernel_size和stride都变成了1,只是用来调整channel。 Hybrid_Architecture ","date":"2023-06-09","objectID":"/posts/deeplearning/transformer_cv/vit/:4:2","tags":["Deep Learning","Transformer","CV"],"title":"VIT","uri":"/posts/deeplearning/transformer_cv/vit/"},{"categories":["Deep Learning"],"content":"拓展阅读 代码 State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow timm版vit lucidrains/vit-pytorch含动图 B站:ViT论文逐段精读【论文精读】 B站:Vision Transformer详解 视觉Transformer(ViT)模型创新思路总结 ","date":"2023-06-09","objectID":"/posts/deeplearning/transformer_cv/vit/:5:0","tags":["Deep Learning","Transformer","CV"],"title":"VIT","uri":"/posts/deeplearning/transformer_cv/vit/"},{"categories":["Deep Learning"],"content":"Swin Transformer 文章标题:Swin Transformer: Hierarchical Vision Transformer using Shifted Windows 作者:Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, Baining Guo 发表时间:(ICCV 2021) 官方代码 多层次的Vision Transformer Swin Transformer是一个用了移动窗口的层级式的Vision Transformer Swin:来自于 Shifted Windows 更大的效率 通过 shifting 移动的这个操作,能够让相邻的两个窗口之间有了交互,所以上下层之间就可以有 cross-window connection,从而变相的达到了一种全局建模的能力 层级式 Hierarchical 减少序列长度方式 用后续的特征图来当做Transformer的输入, 把图片打成 patch 把图片画成一个一个的小窗口,然后在窗口里面去做自注意力 借鉴了很多卷积神经网络的设计理念以及先验知识 采取了在小窗口之内算自注意力 利用了卷积神经网络里的 Locality 的 Inductive bias,就是利用了局部性的先验知识,同一个物体的不同部位或者语义相近的不同物体还是大概率会出现在相连的地方 提出来了一个类似于池化的操作叫做 patch merging 把相邻的小 patch 合成一个大 patch,这样合并出来的这一个大patch其实就能看到之前四个小patch看到的内容,它的感受野就增大了,同时也能抓住多尺寸的特征 ","date":"2023-06-09","objectID":"/posts/deeplearning/transformer_cv/vit/:6:0","tags":["Deep Learning","Transformer","CV"],"title":"VIT","uri":"/posts/deeplearning/transformer_cv/vit/"},{"categories":["Deep Learning"],"content":"Overall Architecture swin_vit swin win. sz. 7x7表示使用的窗口(Windows)的大小 dim表示feature map的channel深度(或者说token的向量长度) head表示多头注意力模块中head的个数 swin_Architecture patch partition:将图像$224×224×3$划分成大小$4×4$的patch(小方块),得到$56\\times56\\times48$大小。 ($224/4=56,\\ 4\\times4\\times3=48$) [224, 224, 3] -\u003e [56, 56, 48] Linear Embedding:要把向量的维度变成一个预先设置好的值C,对于 Swin tiny来说$C=96$ [56, 56, 48] -\u003e[56, 56, 96] Patch Partition 和 Linear Embedding 就相当于是 ViT 里的Patch Projection 操作,而在代码里也是用一次卷积操作就完成 $56\\times56=3136$太长,引入了基于窗口的自注意力计算,每个窗口按照默认来说,都只有$M^2=7^2=49$个 patch,所以说序列长度就只有49就相当小了 共有$ (56/7)\\times(56/7)=8\\times8=64 $个窗口。 Stage 1 经过 2 个 Swin Transformer Block,做了窗口滑动后输出的尺寸依然为 $56\\times56\\times96$。 Stage 2 经过 Patch Merging后,尺寸减半,通道数翻倍,变成了$ 28\\times28\\times192$,再经过 2 个 Swin Transformer Block,输出$ 28\\times28\\times192$。 Stage 3 经过Patch Merging后,尺寸减半,通道数翻倍,变成了$ 14\\times14\\times384$,再经过 6 个 Swin Transformer Block,也就是窗口滑动了 3 次,输出 $ 14\\times14\\times384$。 Stage 4 经过Patch Merging后,尺寸减半,通道数翻倍,变成了$ 7\\times7\\times768$,再经过 2 个 Swin Transformer Block,输出$ 7\\times7\\times768$。 ","date":"2023-06-09","objectID":"/posts/deeplearning/transformer_cv/vit/:7:0","tags":["Deep Learning","Transformer","CV"],"title":"VIT","uri":"/posts/deeplearning/transformer_cv/vit/"},{"categories":["Deep Learning"],"content":"Path Merging $H\\times W \\times C -\u003e \\ \\frac{H}{2}\\times \\frac{W}{2} \\times 4C-\u003e \\ \\frac{H}{2}\\times \\frac{W}{2} \\times 2C$ swin_Path_Merging ","date":"2023-06-09","objectID":"/posts/deeplearning/transformer_cv/vit/:7:1","tags":["Deep Learning","Transformer","CV"],"title":"VIT","uri":"/posts/deeplearning/transformer_cv/vit/"},{"categories":["Deep Learning"],"content":"Swin Transformer Block W-MSA MSA W_MSA 拿stage1举例:尺寸为$56\\times56\\times96$;每个窗口按照默认来说,都只有$M^2=7^2=49$个 patch;共有$ (56/7)\\times(56/7)=8\\times8=64 $个窗口。这64个窗口里分别去算它们的自注意力。 SW-MSA SW_MSA SW_MSA 移动窗口就是把原来的窗口往右下角移动一半窗口(M/2)的距离 如果Transformer是上下两层连着做这种操作,先是 window再是 shifted window 的话,就能起到窗口和窗口之间互相通信的目的。 两个结构是成对使用的,先使用一个W-MSA结构再使用一个SW-MSA结构。所以堆叠Swin Transformer Block的次数都是偶数。 $$ Attention(Q,K,V)=softmax(\\frac{Q\\dot K^T}{\\sqrt d}V) $$SA模块 $X^{hw\\times C} \\cdot W^{C\\times C}_q = Q^{hw\\times C}$ 矩阵运算量计算:$hw\\times C \\times C$ $X^{hw\\times C} $:将所有像素(token)拼接在一起得到的矩阵(一共有hw个像素,每个像素的深度为C) $W^{C\\times C}_q$:生成query的变换矩阵 同理K,V的生成也是$hw\\times C \\times C$,共$3hwC^2$ $Q\\cdot K^T$:$(hw \\times C )\\cdot(C \\times hw)-\u003e(hw)^2C$ $\\frac{Q\\dot K^T}{\\sqrt d}V$:$(hw \\times hw )\\cdot(hw \\times C)-\u003e(hw)^2C$ 一共$3hwC^2+2(hw)^2C$ MSA模块 多头注意力模块相比单头注意力模块的计算量多最后一个线性投影层$(hw \\times C )\\cdot(C \\times C)-\u003ehwC^2$ 一共$4hwC^2+2(hw)^2C$ W_MSA模块 对每个窗口内使用多头注意力模块,一共有$\\frac{h}{M}\\times \\frac{w}{M}$个窗口,窗口高宽M 计算量:$\\frac{h}{M}\\times \\frac{w}{M} \\times(4hwC^2+2(hw)^2C)=4hwC^2+2M^2hwC$ ","date":"2023-06-09","objectID":"/posts/deeplearning/transformer_cv/vit/:7:2","tags":["Deep Learning","Transformer","CV"],"title":"VIT","uri":"/posts/deeplearning/transformer_cv/vit/"},{"categories":["Deep Learning"],"content":"Shifted Window Attention 通过对特征图移位,并给Attention设置mask来间接实现的 swin_shift_mask 特征图移位:使用torch.roll (x, shifts=-1, dims=0)将第一排数值移动到最下面,再使用torch.roll (x, shifts=-1, dims=1)将变换后的第二张图中的第一列移动到最右边 swin_shift swin_mask Mask_1 Mask_2 Mask_3 上图黑色区域是需要的,白色需要Mask,加上较大负数如-100即可。 最后还要恢复位置reverse cyclic shift ","date":"2023-06-09","objectID":"/posts/deeplearning/transformer_cv/vit/:8:0","tags":["Deep Learning","Transformer","CV"],"title":"VIT","uri":"/posts/deeplearning/transformer_cv/vit/"},{"categories":["Deep Learning"],"content":"Relative position bias $$ Attention(Q,K,V)=softmax((\\frac{Q\\dot K^T}{\\sqrt d}+B)V) $$ Relative position bias 上图中的窗口中有 2*2 个 patch,分别给这四个位置标上绝对位置索引,分别为 (0,0)、(0,1)、(1,0)、(1,1),第一个序号代表行,第二个序号代表列。以蓝色像素为参考点。用蓝色像素的绝对位置索引与其他位置索引进行相减,就得到其他位置相对蓝色像素的相对位置索引。我们将各个相对位置索引展开成一个行向量,再进行拼接得到了下面的矩阵。 Relative position bias 将该矩阵加上一个 M-1,M 为窗口的大小,在 Swin Transformer 中为 7,这里为 2。再将每一个行标都乘以 2M-1,最后将行标和列标求和,就得到最后一个矩阵的值,这个矩阵中的值就是相对位置索引 Relative position bias 这个相对位置索引需要去索引的值会有一个相对位置偏置表 (relative position bias table);这个表的元素的个数为 (2M-1)*(2M-1)。 Relative position bias ","date":"2023-06-09","objectID":"/posts/deeplearning/transformer_cv/vit/:9:0","tags":["Deep Learning","Transformer","CV"],"title":"VIT","uri":"/posts/deeplearning/transformer_cv/vit/"},{"categories":["Deep Learning"],"content":"拓展阅读 Pytorch实现代码 B站:Swin Transformer论文精读【论文精读】 B站:Swin-Transformer网络结构详解 从零开始学视觉Transformer Swin Transformer 介绍 使用动图深入解释微软的Swin Transformer 知乎:图解Swin Transformer ","date":"2023-06-09","objectID":"/posts/deeplearning/transformer_cv/vit/:10:0","tags":["Deep Learning","Transformer","CV"],"title":"VIT","uri":"/posts/deeplearning/transformer_cv/vit/"},{"categories":["Deep Learning"],"content":"ALBEF","date":"2023-06-08","objectID":"/posts/deeplearning/multimodal-learning/albef/","tags":["Deep Learning","多模态学习"],"title":"ALBEF","uri":"/posts/deeplearning/multimodal-learning/albef/"},{"categories":["Deep Learning"],"content":"ALBEF 文章标题:Align before Fuse: Vision and Language Representation Learning with Momentum Distillation 作者:Junnan Li, Ramprasaath R. Selvaraju, Akhilesh Deepak Gotmare, Shafiq Joty, Caiming Xiong, Steven Hoi 发表时间:(NIPS 2021) offical code ALBEF ALBEF 包含一个图像编码器 (ViT-B/16)、一个文本编码器(前 6 层 BERT)和一个多模态编码器(后 6 层 BERT,带有额外的交叉注意层)。 image input打成patch,通过patch embedding layer,在通过12层 Vision Transformer $224\\times224-$-\u003e $(196+1)\\times 768=197\\times768$ BERT前六层去做文本编码,剩下的六层transformer encoder直接当成multi-model fusion的过程 Loss Image-Text Contrastive Learning (ITC)。类似于CLIP,增大同(正)样本对的similarity,减小负样本对的similarity。 CLS Token当做全局特征,图像和文本各一个$768\\times1$的一个向量;通过downsample和normalization变成$256\\times 1$ (MoCo实现) Masked Language Modeling (MLM,generative)。类似于BERT,遮盖住一些单词,然后预测出来。 Image-Text Matching (ITM,contrastive)。二分类任务,判断图-文对是否匹配。 动量蒸馏 momentum distillation ","date":"2023-06-08","objectID":"/posts/deeplearning/multimodal-learning/albef/:1:0","tags":["Deep Learning","多模态学习"],"title":"ALBEF","uri":"/posts/deeplearning/multimodal-learning/albef/"},{"categories":["Deep Learning"],"content":"拓展阅读 ALBEF offical blog ","date":"2023-06-08","objectID":"/posts/deeplearning/multimodal-learning/albef/:2:0","tags":["Deep Learning","多模态学习"],"title":"ALBEF","uri":"/posts/deeplearning/multimodal-learning/albef/"},{"categories":["Deep Learning"],"content":"VLMo 文章标题:VLMo: Unified Vision-Language Pre-Training with Mixture-of-Modality-Experts 作者:Hangbo Bao, Wenhui Wang, Li Dong, Qiang Liu, Owais Khan Mohammed, Kriti Aggarwal, Subhojit Som, Furu Wei 发表时间:(NIPS 2022) offical code VLMo ","date":"2023-06-08","objectID":"/posts/deeplearning/multimodal-learning/albef/:3:0","tags":["Deep Learning","多模态学习"],"title":"ALBEF","uri":"/posts/deeplearning/multimodal-learning/albef/"},{"categories":["Deep Learning"],"content":"CLIP","date":"2023-06-08","objectID":"/posts/deeplearning/multimodal-learning/clip/","tags":["Deep Learning","多模态学习"],"title":"CLIP","uri":"/posts/deeplearning/multimodal-learning/clip/"},{"categories":["Deep Learning"],"content":"CLIP 文章标题:Learning Transferable Visual Models From Natural Language Supervision 作者:Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, Ilya Sutskever 发表时间:(ICML 2021) offical code 代码只是可以用来做推理并没有开源 图片和文本之间的对比学习 CLIP:Con-trastive Language-Image Pre-training 利用自然语言的这种监督信号去学习一个迁移性能好的视觉网络 优点 不需要再去标注数据 图片-文本对这种多模态特征适合zero-shot迁移学习 单模态的对比学习:MoCo;单模态的掩码学习:MAE;只能学到视觉特征,很难zero-shot迁移学习 局限性: ResNet50打平手但是离SOTA还很远,扩大模型和数据集能提高预计资源$\\times 1000$,代价太大 在有些数据集上的zero-shot效果也不好:细分类数据集,抽象概念 推理时,目标数据集out-of-distribution,CLIP泛化照样差 不能做成生成式模型(GPT)(对比学习的目标函数和生成式的目标函数结合) 数据利用不高效(数据大)减少数据用量:数据增强;自监督;伪标签 下游任务数据集测试调参带入偏见:创建一个用来测试各种各样的zero-shot的迁移能力的数据集 网上爬的未清洗,可能带有社会偏见 提供一些训练样本反而效果变差(Few Shot效果不好) 不使用ImageNet的训练集的情况下直接Zero-shot 做推理就获得和之前监督训练好ResNet50同样的效果 使用超大规模 web Image Text 数据集 ","date":"2023-06-08","objectID":"/posts/deeplearning/multimodal-learning/clip/:1:0","tags":["Deep Learning","多模态学习"],"title":"CLIP","uri":"/posts/deeplearning/multimodal-learning/clip/"},{"categories":["Deep Learning"],"content":"Related work Learning visual n-grams from web data:和CLIP相似,没有transformer和大规模数据集,效果很差 VirTex (CVPR 2021) 自回归的预测方式去做模型的预训练 ICMLM (ECCV 2020) 用这种完形填空的方式去做预训练 ConVIRT (MLHC 2022) 和CLIP类似,只在医疗图像上做了实验 ","date":"2023-06-08","objectID":"/posts/deeplearning/multimodal-learning/clip/:2:0","tags":["Deep Learning","多模态学习"],"title":"CLIP","uri":"/posts/deeplearning/multimodal-learning/clip/"},{"categories":["Deep Learning"],"content":"Methods CLIP 模型总览图 (1) 模型的输入是一个图片和文字的配对;图片通过了一个图片编码器 Image Encoder 得到了一些特征 $I_1,I_2,…,I_N$;句子通过一个文本编码器 Text Encoder 得到一些文本的特征 $T_1,T_2,…,T_N$。 正样本:对角线上文本和图片配对的元素 $N$ 负样本:其他 $N^2-N$ (2) prompt template 提示模板 把Image Net 里的1,000个类变成1000个句子;句子通过预训练好的文本编码器得到1,000个文本的特征 如何变成句子?用物体类别去替代图里的 object 变成 A photo of a (object). 为什么要prompt template ?只用一个单词去做 prompt 经常出现歧异性(不同语境下意思不同)。由于模型预训练时,图片和句子成对使用,推理时直接用类别单词得到的文本特征(distribution gap),效果就会稍有下降。 prompt engineering :为每个任务定制提示文本可以显着提高零样本性能(缩小解空间) prompt ensemble:80个模板结果综合 (3) Zero-shot 推理时,输入一张图片通过预训练好的图片编码器得到图片的特征 $I_1$,$I_1 $ 和所有的文本特征做cosine similarity (相似性比较),得到文本特征最相似的句子$I_1T_3$。 摆脱了categorical label 的限制 不论是训练还是推理,都不需要提前定好一个标签列表。 任意一张照片可以通过给模型输入不同的文本句子从而知道这张图片里到底有没有感兴趣的物体 Numpy-like pseudocode for the core of an implementation of CLIP. 两个输入:一个是图片的输入;一个是文本的输入。通过编码器输出图像特征和文本特征 线性投射层 W 学习一下如何从单模态转变为多模态,再做一次 L2 归一化 投射层 线性还是非线性 没太大关系(数据集大,多模态) 数据增强只使用随机裁剪 计算consine similarity 交叉熵目标函数 一个是 Image loss;一个是 text loss; 把两个 loss 加起来取平均 ","date":"2023-06-08","objectID":"/posts/deeplearning/multimodal-learning/clip/:3:0","tags":["Deep Learning","多模态学习"],"title":"CLIP","uri":"/posts/deeplearning/multimodal-learning/clip/"},{"categories":["Deep Learning"],"content":"推荐阅读 官方博客 CLIP 论文精读 style CLIP (ICCV 2021): CLIP + style GAN 想通过文字上的改变从而去引导图像生成 CLIP draw 不需要任何训练,CLIPDraw在矢量笔画上操作,而不是在像素图像上操作,使绘画偏向于更简单的人类可识别的形状。 视频检索:CLIP模型把检索对象(一句话表示)变成文本特征,把视频里的每一帧都变成视觉上的特征,然后一帧一帧的去跟文本特征做对比然后挑出相似性最高的那一帧展现出来 How to Train Really Large Models on Many GPUs? ","date":"2023-06-08","objectID":"/posts/deeplearning/multimodal-learning/clip/:4:0","tags":["Deep Learning","多模态学习"],"title":"CLIP","uri":"/posts/deeplearning/multimodal-learning/clip/"},{"categories":["Deep Learning"],"content":"LSeg 文章标题:Language-driven Semantic Segmentation 作者:Boyi Li, Kilian Q. Weinberger, Serge Belongie, Vladlen Koltun, René Ranftl 发表时间:(ICLR 2022) offical code CLIP做图像分割:像素级别的分类 Lseg overview 模型的输入是一个图片和文字的配对;图片通过了一个图片编码器 Image Encoder 得到了一些密集特征$C\\times \\tilde H \\times \\tilde W$矩阵 ,各元素为$I_{11},I_{12},…,I_{\\tilde H \\tilde W}$;文本通过一个文本编码器 Text Encoder 得到一些文本的特征$N\\times C$矩阵,各元素为 $T_1,T_2,…,T_N$。 图片编码器:dpt的结构-vision Transformer + decoder decoder目的:把bottleneck feature慢慢upscale;特征维度$C$一般是512或者768 使用原始的ViT或者dit的预训练参数 文本编码器:CLIP里的文本编码器 图片特征和文本特征做点积得到$N\\times \\tilde H \\times \\tilde W$矩阵,各元素为$F_{11},F_{12},…,F_{\\tilde H \\tilde W}$;拿输出特征和最后的ground truth去做cross entropy loss spetial regularization block 文本和视觉特征交互,加两个这种block效果最好 局限性 目标函数不是对比学习;也不是无监督学习的框架;依赖于手工标注的segametation mask ","date":"2023-06-08","objectID":"/posts/deeplearning/multimodal-learning/clip/:5:0","tags":["Deep Learning","多模态学习"],"title":"CLIP","uri":"/posts/deeplearning/multimodal-learning/clip/"},{"categories":["Deep Learning"],"content":"GroupViT 文章标题:GroupViT: Semantic Segmentation Emerges from Text Supervision 作者:Jiarui Xu, Shalini De Mello, Sifei Liu, Wonmin Byeon, Thomas Breuel, Jan Kautz, Xiaolong Wang 发表时间:(CVPR 2022) offical code CLIP做图像分割:监督信号来自于文本 为什么叫group? 视觉做无监督分割经常就是用一类方法叫做grouping(一种自下而上的方式) 类似于有一些聚类中心点,从这个点开始发散,把附近周围相似的点逐渐扩充成一个group,那这个group相当是一个segametation mask。 ","date":"2023-06-08","objectID":"/posts/deeplearning/multimodal-learning/clip/:6:0","tags":["Deep Learning","多模态学习"],"title":"CLIP","uri":"/posts/deeplearning/multimodal-learning/clip/"},{"categories":["Deep Learning"],"content":"Methods ViT + grouping block + 可学习的group tokens The Architecture and Training Pipeline of GroupViT 图像编码器:Vision Transformer(12层Transformer layers) 两部分输入 原始图像的patch embedding 大小$224\\times224$的图片,patch size选择$16\\times16$;就有一个$14\\times14=196$序列长度的一个序列 然后经过这个linear projection就得到了patch embedding,维度为$196\\times384$(ViT small) 可学习的group tokens 开始设的是$64\\times384$:64个聚类中心;384为了保持维度和patch embedding进行拼接 grouping block (6层Transfor Layer之后加了一个grouping block) 类似于自注意力的方式先算一个相似度矩阵,用这个相似的矩阵去帮助原来的这个image token 做聚类中心的分配,从而完成了输入$(196+64)\\times384$降到这个$64\\times 384$ 合并成为更大的group,做一次聚类的分配 降低序列长度,模型的计算复杂度,训练时间相应的都减少了 第9层Transformer Layer 之后又加了一次grouping block:$64\\times 384$降到这个$8\\times 384$ 文本编码器得到文本特在$z^T$;图像编码器输出$8\\times 384$进行average pooling得到$1\\times384$,在通过MLP得到图片特征$z^I$ 后续和CLIP一样对比学习 zero shot推理 给定一个图片首先经过group ViT 得到最后8个group Embedding 再把有可能这些标签通过这个文本编码器得到一系列的这个文本特征 计算这些图像的Group Embedding和这些文本的特征之间的相似度 局限性:最多只能检测到8类;没有很好的利用dense prediction的特性;CLIP 这种训练方式 没有办法学到这些背景类(语义太模糊) ","date":"2023-06-08","objectID":"/posts/deeplearning/multimodal-learning/clip/:7:0","tags":["Deep Learning","多模态学习"],"title":"CLIP","uri":"/posts/deeplearning/multimodal-learning/clip/"},{"categories":["Deep Learning"],"content":"VILD 文章标题:Open-vocabulary Object Detection via Vision and Language Knowledge Distillation 作者:Xiuye Gu, Tsung-Yi Lin, Weicheng Kuo, Yin Cui 发表时间:(ICLR 2022) offical code ","date":"2023-06-08","objectID":"/posts/deeplearning/multimodal-learning/clip/:8:0","tags":["Deep Learning","多模态学习"],"title":"CLIP","uri":"/posts/deeplearning/multimodal-learning/clip/"},{"categories":["Deep Learning"],"content":"Methods VILD (a) baseline 就是一个maskRCNN(定位+分类) 两阶段的分类器:第一阶段RPN抽取 $N$个region Proposal ;第二阶段就是根据着$N$个 Proposal 通过detection head得到一些region embedding ,最后再通过一些分类头判断类别 (b) ViLD-text:和a类似得到N个region embedding之后,和base category基类+背景类的text embedding去做点乘计算相似度,得到一个81维的向量,将这个向量做softmax,再去和ground truth做交叉熵,得到的结果即为ViLD的损失函数 text embedding:经过CLIP的文本编码器得到的,不参与训练的。(类别通过prompt生成一个句子进入编码器输出) 在b中需要改动的参数有两处,一是图像处理模块,也即抽取图像特征的backbone需要训练;二是背景类的embedding。 背景类:不在基础类里的所有别的类别 (c) ViLD-image:利用CLIP的图像编码器对自己的视觉backbone进行知识蒸馏,让backbone输出的region embedding 尽可能地靠近CLIP的image embedding 一些抽好的Proposal 做一些resize的操作 c 中输入的是M个pre-computed proposal,和a、b不同(加快训练) 预先把所有图像的proposal算出来,然后一次性扔到CLIP图像编码器中先抽好存到硬盘中,这样在训练的时候就直接把这些存好的embedding取出来就可以了。 损失函数:常用的L1 Loss。需要注意的是,作者在把一个proposal送入CLIP的图像编码器时,是将其1x和1.5x分别送入进行编码,最后再把这两个embedding加起来。 损失函数:常用的L1 Loss (d) ViLD:ViLD-image和ViLD-text两个的合体 左侧将N+M个proposal同时输入进目标检测框架,然后分开,n个Embedding去算cross entropy loss 然后m 个 precomputer embedding去算这个蒸馏的L_1 loss。 右侧为teacher网络,只有训练的时候用,测试的时候用不到。 VILD_ensemble VILD 训练阶段 图片先通过一个RPN得到一些region Proposal 然后通过RoI Align 和一些Conv层得到一些region embedding $R_1,R_2$; 绿色的基础类先通过一个prompt然后通过文本编码器得到绿色的文本编码和$R_1,R_2$做点乘,再和ground truth做cross entropy loss; 把已经抽取好的region Proposal 通过CLIP model得到一些CLIP的iamge embedding $ I_1, I_2$;使用蒸馏计算$L_1$ loss 希望$R_1, R_2$呢尽可能的跟$I_1, I_2 $去接近 推理阶段 不论是基础类还是新类都通过prompt再通过这个文本编码器得到所有的这些text embedding;然后让Mask RCNN抽取的region embedding去和text embedding做相似度计算,计算结果最大的那个,就是模型输出的检测到的类型。 ","date":"2023-06-08","objectID":"/posts/deeplearning/multimodal-learning/clip/:9:0","tags":["Deep Learning","多模态学习"],"title":"CLIP","uri":"/posts/deeplearning/multimodal-learning/clip/"},{"categories":["Deep Learning"],"content":"拓展阅读 利用图像文本的知识蒸馏来进行开放词表目标检测 ","date":"2023-06-08","objectID":"/posts/deeplearning/multimodal-learning/clip/:10:0","tags":["Deep Learning","多模态学习"],"title":"CLIP","uri":"/posts/deeplearning/multimodal-learning/clip/"},{"categories":["Deep Learning"],"content":"GLIP 文章标题:Grounded Language-Image Pre-training 作者:Liunian Harold Li, Pengchuan Zhang, Haotian Zhang, Jianwei Yang, Chunyuan Li, Yiwu Zhong 发表时间:(CVPR 2022) offical code object detection 目标检测:给定图片,把bounding box 给找出来 phrase grounding:给定图片和文本,根据文本把物体找出来 定位 loss 部分差不多 分类 loss 部分 detection:它的标签是一个或者两个单词是one-hot的这种标签 给定图片通过backbone得到$N\\times D$的region embedding (n个bounding box,每个bounding box Embedding的维度是d);通过$C\\times D$矩阵的分类头;MNS把bounding box筛选一下,然后再去跟ground Truth 去算cross entropy loss Vision grounding:标签是一个句子。 给定图片通过backbone得到了一些region feature;一个句子prompt通过文本编码器得到文本的embedding,进行相似度计算。(类似ViLD-text) 目标检测和Vision grounding 结合 判断一下什么时候算是一个positive match;什么时候算是一个negative match GLIP 图片通过图像编码器得到一些region embedding;文本通过文本编码器得到一些text embedding 用Cross Attention啊把这个文本和图像的特征交互一下 ","date":"2023-06-08","objectID":"/posts/deeplearning/multimodal-learning/clip/:11:0","tags":["Deep Learning","多模态学习"],"title":"CLIP","uri":"/posts/deeplearning/multimodal-learning/clip/"},{"categories":["Deep Learning"],"content":"拓展阅读 ","date":"2023-06-08","objectID":"/posts/deeplearning/multimodal-learning/clip/:12:0","tags":["Deep Learning","多模态学习"],"title":"CLIP","uri":"/posts/deeplearning/multimodal-learning/clip/"},{"categories":["Deep Learning"],"content":"CLIPasso 文章标题:CLIPasso: Semantically-Aware Object Sketching 作者:Yael Vinker, Ehsan Pajouheshgar, Jessica Y. Bo, Roman Bachmann, Amit Haim Bermano, Daniel Cohen-Or, Amir Zamir, Ariel Shamir 发表时间:(SIGGRAPH 2022) (Best Paper Award) 主页介绍 + code CLIPasso 贝兹曲线 通过一系列的2维的点控制的一个曲线 基于saliency的一个初始化的方式 把图片扔给已经训练好的Vision Transformer,然后把最后一层的多头自注意力取加权平均做成了一个siliancy map;在这个siliancy map上去看哪些区域更显著,这些显著的区域上去采点。 定义了这几个曲线,也就这里说的$S_1$到$S_N$就是n个笔画,通过光栅化器Rasterizer得到简笔画。 Loss 选择 $L_s$ 基于语义性的目标函数:简笔画生成的特征和原始图像生成的特征尽可能的接近 $L_g$ 基于geometric的目标函数:resnet的 2 3 4各阶段特征拿出来算loss,而不是用最后的那个2048维的特征。 保证最后生成的简笔画无论是在几何形状上,位置上跟原有的图像尽可能的一致;而且在语义信息上也能尽可能的保持一致 局限性 图像有背景,效果就会大打折扣。必须是一个物体然后处在一个纯白色的背景上 先把一张带背景的图片,把这个物体抠出来,背景是一个白色幕布的图片,扔给CLIPasso去生成简笔画(两阶段) 初始化的笔画都是同时生成的而不是序列生成的(怎样才能一笔一画) 通过控制笔画数去控制图片的抽象程度 (手动–优化参数) ","date":"2023-06-08","objectID":"/posts/deeplearning/multimodal-learning/clip/:13:0","tags":["Deep Learning","多模态学习"],"title":"CLIP","uri":"/posts/deeplearning/multimodal-learning/clip/"},{"categories":["Deep Learning"],"content":"拓展阅读 Multimodal Neurons in Artificial Neural Networks 可视化分析 CLIP ","date":"2023-06-08","objectID":"/posts/deeplearning/multimodal-learning/clip/:14:0","tags":["Deep Learning","多模态学习"],"title":"CLIP","uri":"/posts/deeplearning/multimodal-learning/clip/"},{"categories":["Deep Learning"],"content":"CLIP4Clip 文章标题:CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval 作者:Huaishao Luo, Lei Ji, Ming Zhong, Yang Chen, Wen Lei, Nan Duan, Tianrui Li 发表时间:( 2021) offical code 视频领域 CLIP4Clip 对含有时序的视频特征处理,假设10帧 10个图像的特征直接取平均 (没有考虑到这个时序的特性) 一个是一个人逐渐的在坐下,另外一个是一个人逐渐的站起来;只是取一个这个平均的话,这两个动作无法区分 late fusion: 最原始的lstm把这10个特征扔给一个lstm,把最后的输出拿出来 (时序建模:Transformer替代) early fusion:把文本和这个图像帧的特征一起在学习 ","date":"2023-06-08","objectID":"/posts/deeplearning/multimodal-learning/clip/:15:0","tags":["Deep Learning","多模态学习"],"title":"CLIP","uri":"/posts/deeplearning/multimodal-learning/clip/"},{"categories":["Deep Learning"],"content":"ActionCLIP 文章标题:ActionCLIP: A New Paradigm for Video Action Recognition 作者:Mengmeng Wang, Jiazheng Xing, Yong Liu 发表时间:( 2021) offical code 动作识别 ActionCLIP 视频的输入通过一个视频编码器得到一些特征,把标签当做文本给一个文本编码器得到一些文本的特征;去计算文本和图像之间的相似度;相似度矩阵和提前定义好的ground truth算一个loss。 把cross entropy loss换成KL divergence Overview of ActionCLIP ","date":"2023-06-08","objectID":"/posts/deeplearning/multimodal-learning/clip/:16:0","tags":["Deep Learning","多模态学习"],"title":"CLIP","uri":"/posts/deeplearning/multimodal-learning/clip/"},{"categories":["Deep Learning"],"content":"PointCLIP 文章标题:PointCLIP: Point Cloud Understanding by CLIP 作者:Renrui Zhang, Ziyu Guo, Wei Zhang, Kunchang Li 发表时间:(CVPR 2022) offical code 3D点云 PointCLIP 把3D点云投射到2D平面上变成了2D的深度图,扔给clip的视觉编码器得到视觉表征。 文本端通过prompt变成了句子point cloud depth Map of a 『CLASS』 ","date":"2023-06-08","objectID":"/posts/deeplearning/multimodal-learning/clip/:17:0","tags":["Deep Learning","多模态学习"],"title":"CLIP","uri":"/posts/deeplearning/multimodal-learning/clip/"},{"categories":["Deep Learning"],"content":"DepthCLIP 文章标题:Can Language Understand Depth? 作者:Renrui Zhang, Ziyao Zeng, Ziyu Guo, Yafeng Li 发表时间:(CVPR 2022) offical code 用文本跨界估计深度 DepthCLIP 把深度估计看成了一个分类问题,强制性的把深度距离分成了7大类 ","date":"2023-06-08","objectID":"/posts/deeplearning/multimodal-learning/clip/:18:0","tags":["Deep Learning","多模态学习"],"title":"CLIP","uri":"/posts/deeplearning/multimodal-learning/clip/"},{"categories":["Deep Learning"],"content":"VILT","date":"2023-06-08","objectID":"/posts/deeplearning/multimodal-learning/vilt/","tags":["Deep Learning","多模态学习"],"title":"VILT","uri":"/posts/deeplearning/multimodal-learning/vilt/"},{"categories":["Deep Learning"],"content":"VILT 文章标题:ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision 作者:Wonjae Kim, Bokyung Son, Ildoo Kim 发表时间:(ICML 2021) offical code 第一个摆脱了目标检测的视觉文本模型 ","date":"2023-06-08","objectID":"/posts/deeplearning/multimodal-learning/vilt/:1:0","tags":["Deep Learning","多模态学习"],"title":"VILT","uri":"/posts/deeplearning/multimodal-learning/vilt/"},{"categories":["Deep Learning"],"content":"Abstract Vision and Language Pre-training(VLP) 当前的工作主要集中在图像特征抽取上,一般来讲,图像特征抽取的越好,下游任务中的表现就越好。 效率太低,速度太慢,抽取图像特征花费大量时间,比多模态融合都多。 用一个预训练好的模型去抽取特征,表达能力受限。 目标检测数据集不够大,规模不够大。如果模型不是端到端学习,只是从预训练模型抽取特征,大概率来说不是最优解。 ","date":"2023-06-08","objectID":"/posts/deeplearning/multimodal-learning/vilt/:2:0","tags":["Deep Learning","多模态学习"],"title":"VILT","uri":"/posts/deeplearning/multimodal-learning/vilt/"},{"categories":["Deep Learning"],"content":"Relate work Four categories of vision-and-language models 第一类,代表作VSE,文本端较为简单,图像比较贵,融合端也是简单的神经网络。 第二类,代表作CLIP,图像和文本的计算力度等价,融合的时候将两种特征直接点乘,非常轻量。 第三类,代表作ViLBERT、UNITER占据了大部分工作,文本端非常轻量。图像端使用目标检测的系统,非常贵。融合端也使用了Transformer,相当于两个大模型。 第四类,代表作ViLT,基于ViT对图像使用patch embedding,模态融合部分做得比较大。 Visual comparison of conventional VLP architectures and ViLT 模态融合方法 signal-stream approach:将两种特征拼接起来,用一个模型处理两个输入。 dual-stream approach:两个模型分别对两种模态信息进行处理,充分挖掘每种模态包含的信息,然后再融合。 两种模型表现差不多,但是dual-stream approach参数多一些,VILT 采用signal-stream approaches。 文本编码端都是用预训练的BERT里的tokenizer 视觉编码端 Region Feature:经过一个Backbone抽取特征,然后经过RPN网络生成proposal,经过非极大值抑制 NMS 筛选边界框,最后经过ROI head得到图像序列。把一张图像变成了离散的bound-box,每个边界框内都含有明确的类别语义信息。(目标检测) Grid Feature:仅基于Backbone Patch Projection:基于 ViT 直接将图像打成patch,,得到一个有语义信息的离散的序列。 VILT 把模态的特征抽取做到了极小化,主要计算量在模态融合部分,提高模型推理速度。移除了Region feature ","date":"2023-06-08","objectID":"/posts/deeplearning/multimodal-learning/vilt/:3:0","tags":["Deep Learning","多模态学习"],"title":"VILT","uri":"/posts/deeplearning/multimodal-learning/vilt/"},{"categories":["Deep Learning"],"content":"Methods Four categories of vision-and-language models 文本端有$L$个长为$H$ 的序列,$L$为一个句子中单词数量,$H$为序列长度。 图像端图像被打成 $N $个patch,每个patch也对应长为$H$的序列。 Modal-type embedding 模态信息(文本为0,图像为1),Token position embedding 文本位置信息,Patch position embedding 图像位置信息。 Modal-type embedding + position embedding + word embedding 不是拼接,是加在一起 Transformer Encoder的输入为$(N+L+2)\\times H$的矩阵。* 代表 [CLS] token,$(N+L+2)\\times H$中2代表两种模态的[CLS]。 使用了两个loss,分别是Image Text Matching和Mask Laguage Modeling。加个小loss :Word Patch Alignment Image Text Matching:文字,图片配对 (文本与图像是否匹配) Mask Laguage Modeling:NLP的完形填空 Word Patch Alignment :利用最优运输理论计算相似度(分布距离) Transformer 的输出为$1\\times H$的矩阵,经过$H\\times H$的pooler(权重矩阵)得到仍是$1\\times H$的矩阵,最后经过一个FC层进行二分类任务。 ","date":"2023-06-08","objectID":"/posts/deeplearning/multimodal-learning/vilt/:4:0","tags":["Deep Learning","多模态学习"],"title":"VILT","uri":"/posts/deeplearning/multimodal-learning/vilt/"},{"categories":["Deep Learning"],"content":"Whole word masking 例如giraffe长颈鹿这个单词,由三个词根组成,分别是gi,raf,fe,如果mask 的时候mask “raf”这个token。由于开头为gi结尾为fe的单词不多,模型就记住了中间一定是raf,就相当于模型学到了shortcut,这样泛化性就不好。 直接mask “giraffe” 整个单词。这样就需要借助图像信息,因此就加强了图像文本的联系。 ","date":"2023-06-08","objectID":"/posts/deeplearning/multimodal-learning/vilt/:4:1","tags":["Deep Learning","多模态学习"],"title":"VILT","uri":"/posts/deeplearning/multimodal-learning/vilt/"},{"categories":["Deep Learning"],"content":"Image Augmentation 为什么前边的研究没有使用数据增强? 多模态学习要考虑图像文本匹配的问题,数据增强可能会改变图像语义 使用预训练模型,无法进行数据增强 不适用color inversion和cutout避免与文本信息不匹配。 ","date":"2023-06-08","objectID":"/posts/deeplearning/multimodal-learning/vilt/:4:2","tags":["Deep Learning","多模态学习"],"title":"VILT","uri":"/posts/deeplearning/multimodal-learning/vilt/"},{"categories":["Deep Learning"],"content":"Experiments 预训练所用的数据集叫4million(4个数据集图片加起来这个数) MSCOCO:113K图片 567K 长标题 VG: 108K图片 5.41M 短标题 GCC:3.01M图片对 SBU:867K图片对 ","date":"2023-06-08","objectID":"/posts/deeplearning/multimodal-learning/vilt/:5:0","tags":["Deep Learning","多模态学习"],"title":"VILT","uri":"/posts/deeplearning/multimodal-learning/vilt/"},{"categories":["Deep Learning"],"content":"Future work scalability:transformer都是越大越好,数据集越大越好(做的更大) Align before Fuse: Vision and Language Representation Learning with Momentum Distillation 用14million Masked Modeling for Visual Inputs:图像重建 (NLP里进行Mask重建,图像肯定也有用) Augmentation Strategies:数据增强 MixGen: A New Multi-Modal Data Augmentation ","date":"2023-06-08","objectID":"/posts/deeplearning/multimodal-learning/vilt/:6:0","tags":["Deep Learning","多模态学习"],"title":"VILT","uri":"/posts/deeplearning/multimodal-learning/vilt/"},{"categories":["Deep Learning"],"content":"推荐阅读 ViLT 论文精读 后续改进,时间提升,更少时间训练 Align before Fuse: Vision and Language Representation Learning with Momentum Distillation 单机8卡训练2-3天 BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation Masked Unsupervised Self-training for Zero-shot Image Classification ","date":"2023-06-08","objectID":"/posts/deeplearning/multimodal-learning/vilt/:7:0","tags":["Deep Learning","多模态学习"],"title":"VILT","uri":"/posts/deeplearning/multimodal-learning/vilt/"},{"categories":["Deep Learning"],"content":"BYOL","date":"2023-06-07","objectID":"/posts/deeplearning/contrastive-learning/byol/","tags":["Deep Learning","对比学习"],"title":"BYOL","uri":"/posts/deeplearning/contrastive-learning/byol/"},{"categories":["Deep Learning"],"content":"SwAV 文章标题:Unsupervised Learning of Visual Features by Contrasting Cluster Assignments 作者:Mathilde Caron, Ishan Misra, Julien Mairal, Priya Goyal, Piotr Bojanowski, Armand Joulin 发表时间:(NIPS 2020) 对比学习和聚类结合 ","date":"2023-06-07","objectID":"/posts/deeplearning/contrastive-learning/byol/:1:0","tags":["Deep Learning","对比学习"],"title":"BYOL","uri":"/posts/deeplearning/contrastive-learning/byol/"},{"categories":["Deep Learning"],"content":"methods 给定同样一张图片,如果生成不同的视角,不同的 views 的话,希望可以用一个视角得到的特征去预测另外一个视角得到的特征 SwAV 网路 左边:一个图片 $ X$,做两次数据增强得到了$X_1、X_2$,然后所有的样本通过一个编码器 $f_{\\theta}$,输出一个特征$Z_1、Z_2$,用这些特征做一个对比学习的 loss MoCo从memory bank取负样本6万个:这是一种近似做法 直接拿所有图片的特征跟特征做对比有点原始而且有点费资源 SwAV:跟聚类的中心 $C$ (prototype) 比 C 的维度是$d\\times k$,d是特征的维度,k是聚类中心个数3,000 一个图片 $ X$,做两次数据增强得到了$X_1、X_2$,然后所有的样本通过一个编码器 $f_{\\theta}$,输出一个特征$Z_1、Z_2$,先通过clustering让特征 $Z$ 和prototype $C$ 生成目标$Q_1、Q_2$;C点乘$Z_1$去预测$Q_2$,换位预测 ","date":"2023-06-07","objectID":"/posts/deeplearning/contrastive-learning/byol/:2:0","tags":["Deep Learning","对比学习"],"title":"BYOL","uri":"/posts/deeplearning/contrastive-learning/byol/"},{"categories":["Deep Learning"],"content":"multi crop 思想:全局的和这个局部的特征都要关注 过去的方法:用的两个crop,一个正样本对$X_1、X_2$两个图片 一个图片$X$,先把它resize 到$256\\times 256$,然后随机crop两个$224\\times 224$的图片当成 $X_1、X_2$ SwAV:大的crop抓住的是整个场景的特征,如果更想学习这些局部物体的特征,最好能多个 crop,去图片里crop一些区域,这样就能关注到一些局部的物体 但是增加crop,会增加模型的计算复杂度,因为相当于使用了更多的正样本 进行取舍:把这个crop变得小一点,变成160 ,取2个160的crop去学全局的特征;然后为了增加正样本的数量,为了学一些局部的特征,再去随机选4个小一点crop,大小为$96\\times96$ SwAV_multi_crop 实验 基线模型 2 个$224\\times224$,multi crop 2个$160\\times160$+4个$96\\times96$ SimCLR+ multi crop 涨了2.4个点,如果把 multi crop这个技术用到 BYOL 上有可能BYOL会比SwAV的效果高 如果没有这个multi crop的这个技术其实SwAV的性能也就跟MoCo v2是差不多的 ","date":"2023-06-07","objectID":"/posts/deeplearning/contrastive-learning/byol/:3:0","tags":["Deep Learning","对比学习"],"title":"BYOL","uri":"/posts/deeplearning/contrastive-learning/byol/"},{"categories":["Deep Learning"],"content":"BYOL 文章标题:Bootstrap your own latent: A new approach to self-supervised Learning 作者:Jean-Bastien Grill, Florian Strub, Florent Altché, Corentin Tallec 发表时间:(2020) 没有负样本 openmmlab ","date":"2023-06-07","objectID":"/posts/deeplearning/contrastive-learning/byol/:4:0","tags":["Deep Learning","对比学习"],"title":"BYOL","uri":"/posts/deeplearning/contrastive-learning/byol/"},{"categories":["Deep Learning"],"content":"标题 Bootstrap your own latent: A new approach to self-supervised Learning Bootstrap: If you bootstrap an organization or an activity, you set it up or achieve it alone, using very few resources. latent: 特征 hidden、feature、embedding 只有正样本;目的:让所有相似的物体,特征也尽可能的相似 缺陷:有一个躺平解 如果一个模型不论什么输入,都返回同样的输出,那所有的特征都是一模一样的,loss就都是 0 而只有加上负样本的约束,不光相似的物体要有相似的特征;不相似的物体也要有不相似的特征;模型才有动力去继续学(防止模型学到这个躺平解) 如果输出的所有特征都一样,那在负样本的 loss 无穷大;模型更新让正样本和负样本的 loss 都往下降,达到一个最优解 ","date":"2023-06-07","objectID":"/posts/deeplearning/contrastive-learning/byol/:5:0","tags":["Deep Learning","对比学习"],"title":"BYOL","uri":"/posts/deeplearning/contrastive-learning/byol/"},{"categories":["Deep Learning"],"content":"methods BYOL 网络流程 前向过程 一个mini-batch 式的图片 $x$,做两次数据增强得到了$v、v’$; $v$ 通过编码器 $f_\\theta$ 得到特征$y_\\theta$;$v’$ 通过编码器 $f_\\xi$ 得到特征$y’_\\xi$;输出2048维(ResNet50) $f_\\theta$ 和 $f_\\xi$ 使用同样的网络架构(ResNet50);参数不同。$f_\\theta$ 随着梯度更新而更新;$f_\\xi$ 跟 MoCo 一样,使用动量编码器,以 moving average 形式更新 $y_\\theta$通过 $g_\\theta$ 得到特征$z_\\theta$; $y’\\xi$ 通过 $g\\xi$ 得到特征$z’_\\xi$;输出256维 $g_\\theta$ 和 $g_\\xi$ 使用同样的网络架构 (fc + BN+ ReLU + fc );参数不同 SimCLR 使用projection head 输出是128维 BYOL使用projector 输出是256维 (两者都是MLP层) $z_\\theta$ 通过 $q_\\theta$ 得到新的特征 $q_\\theta (z_\\theta)$; $q_\\theta (z_\\theta)$ 和 $sg(z’_\\xi)$ 尽可能一致 sg:stop gradient $g_\\theta$ 和 $q_\\theta$ 使用同样的网络架构 用自己一个视角的特征去预测另外一个视角的特征 2048维的 $y_\\theta$ 做下游任务;损失函数:mean square error loss BYOL草图 ","date":"2023-06-07","objectID":"/posts/deeplearning/contrastive-learning/byol/:6:0","tags":["Deep Learning","对比学习"],"title":"BYOL","uri":"/posts/deeplearning/contrastive-learning/byol/"},{"categories":["Deep Learning"],"content":"推荐阅读 Understanding self-supervised and contrastive learning with “Bootstrap Your Own Latent”(BYOL) 跟BN后的平均图片mode 做对比 使用 BN 会产生样本信息泄漏 原作解释:BYOL works even without batch statistics BYOL 不需要 batch norm 提供的那些 batch 的这个统计量照样能工作,回应之前博客里提出来假设 ","date":"2023-06-07","objectID":"/posts/deeplearning/contrastive-learning/byol/:7:0","tags":["Deep Learning","对比学习"],"title":"BYOL","uri":"/posts/deeplearning/contrastive-learning/byol/"},{"categories":["Deep Learning"],"content":"SimSiam 文章标题:Exploring Simple Siamese Representation Learning 作者: Xinlei Chen, Kaiming He 发表时间:(2020) offical code 没有负样本,不需要大的batch size, 不需要动量编码器 可以看成是一种 EM 算法,通过这种逐步更新的方式避免模型坍塌 ","date":"2023-06-07","objectID":"/posts/deeplearning/contrastive-learning/byol/:8:0","tags":["Deep Learning","对比学习"],"title":"BYOL","uri":"/posts/deeplearning/contrastive-learning/byol/"},{"categories":["Deep Learning"],"content":"methods simsiam 网络 算法 前向过程 一个mini-batch 式的图片 $x$,做两次数据增强得到了$x_1、x_2’$; $x_1, x_2$ 通过编码器 $f$ 得到特征 $z_1, z_2$ ; $z_1,z_2$ 通过predictor $h$ 得到 $p_1,p_2$; 不同的对比学习模型 SimCLR :两编码器都有梯度回传;对比任务 SwAV :没有跟负样本;跟聚类中心去比;对比任务 BYOL :用左边呢去预测右边;同时使用了动量编码器;预测任务 SimSiam :没有负样本,不需要大的batch size, 不需要动量编码器;预测任务 不同的对比学习模型ImageNet实验 batch size 只有 MoCo v2 和 SimSiam 是可以用256的;其它工作都是要用更大的 batch size 负样本 SimCLR 和 MoCo v2 要用负样本 动量编码器 SimCLR 没有用;SimCLR v2用了 SwAV 没有用 epoch越大,Simsiam就不行了。 ","date":"2023-06-07","objectID":"/posts/deeplearning/contrastive-learning/byol/:9:0","tags":["Deep Learning","对比学习"],"title":"BYOL","uri":"/posts/deeplearning/contrastive-learning/byol/"},{"categories":["Deep Learning"],"content":"Barlow Twins 文章标题: Barlow Twins: Self-Supervised Learning via Redundancy Reduction 作者: Jure Zbontar, Li Jing, Ishan Misra, Yann LeCun, Stéphane Deny 发表时间: (ICML 2021) ","date":"2023-06-07","objectID":"/posts/deeplearning/contrastive-learning/byol/:10:0","tags":["Deep Learning","对比学习"],"title":"BYOL","uri":"/posts/deeplearning/contrastive-learning/byol/"},{"categories":["Deep Learning"],"content":"methods Barlow Twins 网络 算法 损失函数 生成了一个关联矩阵cross correlation matrix;希望这个矩阵能跟一个单位矩阵 identity matrix尽量的相似 希望正样本的相似性尽量都逼近于1;跟别的样本相似性尽可能是0 ","date":"2023-06-07","objectID":"/posts/deeplearning/contrastive-learning/byol/:11:0","tags":["Deep Learning","对比学习"],"title":"BYOL","uri":"/posts/deeplearning/contrastive-learning/byol/"},{"categories":["Deep Learning"],"content":"DINO 文章标题:Emerging Properties in Self-Supervised Vision Transformers 作者: Mathilde Caron, Hugo Touvron, Ishan Misra, Hervé Jégou, Julien Mairal, Piotr Bojanowski, Armand Joulin 发表时间: (2021) offical code transformer加自监督 一个完全不用任何标签信息训练出的 Vision Transformer ;如果把它的自注意力图进行可视化;发现它能非常准确的抓住每个物体的轮廓 (媲美图像分割) ","date":"2023-06-07","objectID":"/posts/deeplearning/contrastive-learning/byol/:12:0","tags":["Deep Learning","对比学习"],"title":"BYOL","uri":"/posts/deeplearning/contrastive-learning/byol/"},{"categories":["Deep Learning"],"content":"methods MoCo:左边的网络叫做 query 编码器;右边叫做 key 编码器 BYOL :左边的网络叫做 online network;右边叫做 target network DINO :左边的网络叫做 student network;右边叫做 teacher network DINO 网络 算法 避免模型坍塌:centering 操作 把整个 batch 里的样本都算一个均值然后减掉这个均值 MoCoV3:随机初始化了一个 patch projection 层;然后冻结使得整个训练过程中都不变 ","date":"2023-06-07","objectID":"/posts/deeplearning/contrastive-learning/byol/:13:0","tags":["Deep Learning","对比学习"],"title":"BYOL","uri":"/posts/deeplearning/contrastive-learning/byol/"},{"categories":["Deep Learning"],"content":"InstDisc","date":"2023-06-07","objectID":"/posts/deeplearning/contrastive-learning/instdisc/","tags":["Deep Learning","对比学习"],"title":"InstDisc","uri":"/posts/deeplearning/contrastive-learning/instdisc/"},{"categories":["Deep Learning"],"content":"InstDisc 文章标题:Unsupervised Feature Learning via Non-Parametric Instance-level Discrimination 作者:Zhirong Wu, Yuanjun Xiong, Stella Yu, Dahua Lin 发表时间:(CVPR 2018) 这篇论文提出了个体判别任务以及memory bank 把每一个 instance都看成是一个类别,也就是每一张图片都看作是一个类别,目标是能学一种特征能把每一个图片都区分开来 ","date":"2023-06-07","objectID":"/posts/deeplearning/contrastive-learning/instdisc/:1:0","tags":["Deep Learning","对比学习"],"title":"InstDisc","uri":"/posts/deeplearning/contrastive-learning/instdisc/"},{"categories":["Deep Learning"],"content":"Approach InstDisc 网络 通过一个卷积神经网络把所有的图片都编码成一个特征,这些特征在最后的特征空间里能够尽可能的分开 训练这个卷积神经网络使用的是对比学习 需要有正样本和负样本,根据个体判别这个任务,正样本就是这个图片本身(可能经过一些数据增强),负样本就是数据集里所有其它的图片 把所有图片的特征全都存到memory bank 里,也就是一个字典(ImageNet数据集有128万的图片,memory bank里要存128万行,也就意味着每个特征的维度不能太高,否则存储代价太大了,本文用的是128维) 前向过程: 假如batch size是256,有256个图片进入到编码器中,通过一个 ResNet50,最后的特征维度是2048维,然后把它降维降到128维,这就是每个图片的特征大小 batch size 是 256 的话意味着有256个正样本,负样本从 memory bank 里随机地抽一些负样本出来。本文负样本个数4096 用NCE loss 计算对比学习的目标函数 更新网络后,把 mini batch里的数据样本所对应的那些特征,在 memory bank 里进行更新;不停更新,最后学到这个特征尽可能的有区分性 ","date":"2023-06-07","objectID":"/posts/deeplearning/contrastive-learning/instdisc/:2:0","tags":["Deep Learning","对比学习"],"title":"InstDisc","uri":"/posts/deeplearning/contrastive-learning/instdisc/"},{"categories":["Deep Learning"],"content":"CPC 文章标题:Representation Learning with Contrastive Predictive Coding) 作者:Aaron van den Oord, Yazhe Li, Oriol Vinyals 发表时间:(2018) CPC 网络 CPC不仅可以处理音频,还可以处理图片、文字以及在强化学习里使用 输入 $x$(一个持续的序列),$t$ 表示当前时刻,$t-i$ 表示过去的时刻,$t+i$ 表示未来的时刻 把之前时刻的输入通过编码器$g_{enc}$,这个编码器返回一些特征,然后把这些特征放进一个自回归的模型$g_{ar}$,每一步最后的输出,就会得到图中红色的方块$c_t$(context representation,代表上下文的一个特征表示),如果这个上下文的特征表示足够好(它真的包含了当前和之前所有的这些信息),那它应该可以做出一些合理的预测,所以就可以用$c_t$预测未来时刻的这个$z_{t +1}、z_{t + 2}$(未来时刻的特征输出) 一般常见的自回归模型,就是 RNN 或者 LSTM的模型 对比学习的体现 正样本:未来的输入通过编码器以后得到的未来时刻的特征输出,这相当于做的预测是 query,而真正未来时刻的输出是由输入决定的,相对于预测来说是正样本; 负样本:比较广泛,比如可以任意选取输入通过这个编码器得到输出,它都应该跟预测是不相似的。 CPC V2用了更大的模型、用了更大的图像块、做了更多方向上的预测任务,把batch norm 换成了 layer norm,而使用了更多的数据增强。 ","date":"2023-06-07","objectID":"/posts/deeplearning/contrastive-learning/instdisc/:3:0","tags":["Deep Learning","对比学习"],"title":"InstDisc","uri":"/posts/deeplearning/contrastive-learning/instdisc/"},{"categories":["Deep Learning"],"content":"InvaSpread 文章标题:Unsupervised Embedding Learning via Invariant and Spreading Instance Feature 作者:Mang Ye, Xu Zhang, Pong C. Yuen, Shih-Fu Chang 发表时间:(CVPR 2019) 一个编码器的端到端对比学习 可以被理解成是 SimCLR 的一个前身,它没有使用额外的数据结构去存储大量的负样本,它的正负样本就是来自于同一个 mini bach,只用一个编码器进行端到端的学习。 为什么它没有取得 SimCLR 那么好的结果呢?字典必须足够大,也就是说在做对比学习的时候,负样本最好是足够多,而本文的的 batch size 就是256,也就意味着它的负样本只有500多个,再加上它还缺少像 SimCLR 那样那么强大的数据增广以及最后提出的那个 mlp projector。 InvaSpread 思想 同样的图片通过编码器以后,它的特征应该很类似,不同的图片,它的特征出来就应该不类似,这就是题目中说的invariant和 spreading 对于相似的图片、相似的物体,特征应该保持不变性,但是对于不相似的物体或者完全不沾边的物体,特征应该尽可能的分散开 ","date":"2023-06-07","objectID":"/posts/deeplearning/contrastive-learning/instdisc/:4:0","tags":["Deep Learning","对比学习"],"title":"InstDisc","uri":"/posts/deeplearning/contrastive-learning/instdisc/"},{"categories":["Deep Learning"],"content":"Method InvaSpread 网络 前向过程: 如果 batch size 是256,一共有256个图片,经过数据增强,又得到了256张图片 对于 $x_1 $这张图片来说, $\\hat x_1$就是它的正样本,它的负样本是所有剩下的这些图片(包括原始的图片以及经过数据增强后的图片), 正样本是256,负样本是$(256 - 1) \\times 2$,就是除去样本本身之外 mini-batch 剩下的所有样本以及它经过数据增强后的样本。 和 InstDisc 的区别:InstDisc中,正样本虽然是256,负样本却是从一个 memory bank 里抽出来的,用的负样本是4096甚至还可以更大 通过编码器以后,再过一层全连接层进行降维至128维;图中绿色的球在最后的特征空间上应该尽可能的接近,但是这个绿色的球跟别的颜色的特征应该尽可能的拉远 所用的目标函数也是 NCE loss 的一个变体 ","date":"2023-06-07","objectID":"/posts/deeplearning/contrastive-learning/instdisc/:5:0","tags":["Deep Learning","对比学习"],"title":"InstDisc","uri":"/posts/deeplearning/contrastive-learning/instdisc/"},{"categories":["Deep Learning"],"content":"CMC 文章标题:Contrastive Multiview Coding 作者:Yonglong Tian, Dilip Krishnan, Phillip Isola 发表时间:(2019) 多视角下的对比学习 CMC正样本:一个物体的很多个视角 工作目的就是去增大互信息(所有的视角之间的互信息) CMC 四个视角正样本和负样本 选取的是 NYU RGBD 这个数据集(这个数据集有同时4个view,也就是有四个视角:原始的图像$V_1$、这个图像对应的深度信息$V_2$(每个物体离观察者到底有多远)、SwAV ace normal $V_3$、这个物体的分割图像$V_4$) CMC是第一个或者说比较早的工作去做这种多视角的对比学习,它不仅证明了对比学习的灵活性,而且证明了这种多视角、多模态的这种可行性。 open AI的clip模型:有一个图片,还有一个描述这个图片的文本,那这个图像和文本就可以当成是一个正样本对,就可以拿来做多模态的对比学习 局限性:当处理不同的视角或者说不同的模态时候,可能需要不同的编码器,因为不同的输入可能长得很不一样,这就有可能会导致使用几个视角,有可能就得配几个编码器,在训练的时候这个计算代价就有点高 Transformer有可能能同时处理不同模态的数据 ","date":"2023-06-07","objectID":"/posts/deeplearning/contrastive-learning/instdisc/:6:0","tags":["Deep Learning","对比学习"],"title":"InstDisc","uri":"/posts/deeplearning/contrastive-learning/instdisc/"},{"categories":["Deep Learning"],"content":"Moco","date":"2023-06-07","objectID":"/posts/deeplearning/contrastive-learning/moco/","tags":["Deep Learning","对比学习"],"title":"Moco","uri":"/posts/deeplearning/contrastive-learning/moco/"},{"categories":["Deep Learning"],"content":"MoCo 文章标题:Momentum Contrast for Unsupervised Visual Representation Learning 作者:Kaiming He, Haoqi Fan, Yuxin Wu, Saining Xie, Ross Girshick 发表时间:(CVPR 2020) offical code 视觉 + 对比学习的里程碑式的工作 无监督学习的目的:在一个很大的无标注的数据集上训练,模型学到的特征可以很好的迁移到下游任务。 对比学习:通过对比去学习模型,只需要知道图 A和 图 B相似,图 A、图 B和 图 C不相似;而不需要真的知道各个图片的具体类别。 3 张图进入一个网络 M 得到特征 f1、f2、f3,在一个学习好的特征空间 embedding space 中,f1、f2 的特征尽量近,和 f3 的特征尽量远离。 对比学习学到的很好的特征:类似物体在这个特征空间 相邻,不类似的物体在特征空间 远离 Q: 图 1 和 图 2 相似,和图 3 都不相似,难道不是有监督学习吗?Why 对比学习在 CV 领域被认为是无监督训练呢? CV 领域 设计巧妙的代理任务 pre-text task,人为设立一些规则 —— 定义哪些图片相似、哪些图片不相似,为自监督学习提供监督信号,从而自监督训练 一个无标注的数据集,n 张图片,$x_1, x_2, …, x_n$, 随机选取一张图片,做 transformation。 以 $x_1 $图片为例,$x_1 $随机裁剪 + 数据增广 得到 $x_i^1$, $x_i^2 $(看起来和 $x_1$ 有区别的 2 张照片, $x_1$ 的正样本),数据集中的其它图片 $x_j$, $j ≠ i $是 $x_1$ 的负样本 基于图片和图片本身的变换是正样本,和其它图片是负样本 ImageNet-1K 此时不是 1000 个类别,而是 100w 个类别。每个图片都是它自己的正样本,其余都是负样本。 对比学习的框架:灵活性–定义正负样本的规则 同一个视频里的任意两帧是正样本,和其它视频的所有帧是负样本 NLP, simCSE 把同样的句子扔给模型,但是做 2 次 forward,通过不同的 dropout 得到一个句子的 2 个特征;和其它所有句子的特征都是负样本。 CMC 论文:一个物体的不同视角 view(正面、背面;RGB 图像、深度图像)作为不同形式的正样本。 多模态领域:Open AI 的 CLIP 模型 ","date":"2023-06-07","objectID":"/posts/deeplearning/contrastive-learning/moco/:1:0","tags":["Deep Learning","对比学习"],"title":"Moco","uri":"/posts/deeplearning/contrastive-learning/moco/"},{"categories":["Deep Learning"],"content":"MoCo标题 Momentum Contrast for Unsupervised Visual Representation Learning 动量对比学习的方法做无监督视觉特征学习 Momentum Contrast: 动量对比学习 动量:(指数)加权移动平均值 $y_t = m * y_{t-1} + (1 - m) * x_t$ m: 动量的超参数 $y_{t-1}$: 上一个时刻的输出 $x_t$: 当前时刻的输入 m 趋近于 1,$y_t $改变缓慢,当前时刻的输入 $x_t $没什么影响 m 趋近于 0, $y_t $更多依赖于当前时刻的输入。 MoCo 利用动量的特性,缓慢的更新一个编码器,从而让中间学习到的字典中的特征尽可能保持一致。 MoCo 从什么角度做对比学习呢? dictionary look-up, 字典查询任务, a dynamic dictionary with a queue and a moving-averaged encoder 动态字典 一个队列:队列中的样本无需梯度回传,可以放很多负样本,让字典变得很大 一个移动平均的编码器:让字典的特征尽可能的保持一致 一个大的、一致的字典,有利于无监督的对比学习训练。 ","date":"2023-06-07","objectID":"/posts/deeplearning/contrastive-learning/moco/:2:0","tags":["Deep Learning","对比学习"],"title":"Moco","uri":"/posts/deeplearning/contrastive-learning/moco/"},{"categories":["Deep Learning"],"content":"Introduction NLP 的离散单词更具语义性,CV的连续、高维信号不好构建字典 无监督在 CV 不成功的原因是什么? 原始信号空间的不同 NLP 原始信号是离散的,词、词根、词缀,容易构建 tokenized dictionaries 做无监督学习 tokenized: 把一个词对应成某一个特征 Why tokenized dictionaries 有助于无监督学习? 把字典的 key 认为是一个类别,有类似标签的信息帮助学习 NLP 无监督学习很容易建模,建好的模型也好优化 CV 原始信号是连续的、高维的,不像单词具有浓缩好的、简洁的语义信息,不适合构建一个字典 如果没有字典,无监督学习很难建模 给CV 无监督对比学习 构建一个 ==大 (by queue)==+ ==一致 (momentum encoder)==的字典 $f_{11} $当成 query 在 $f_{12}, f_2, f_3, …, f_n$ 组成的字典的 key 特征条目 $k_1, k_2, …$ 里面查找,dictionary look-up 靠近 $f_{12}$, 远离 $f_2, f_3, …$ 从动态字典的角度看对比学习,什么样的字典才适合呢? 大 + 一致性 large 从连续高维空间做更多的采样。字典 key 越多,表示的视觉信息越丰富,匹配时更容易找到具有区分性的本质特征。 如果 字典小、key 少,模型可能学到 shortcut 捷径,不能泛化 consistent 字典里的 key ($k_0, k_1, k_2, …, k_N$) 应该由相同的 or 相似的编码器生成 如果字典的 key 是由不同的编码器得到的,query q 做字典查询时,很有可能 找到和 query 使用同一个 or 相似编码器生成的 key,而不是语义相似的 key。另一种形式的 shortcut solution ","date":"2023-06-07","objectID":"/posts/deeplearning/contrastive-learning/moco/:3:0","tags":["Deep Learning","对比学习"],"title":"Moco","uri":"/posts/deeplearning/contrastive-learning/moco/"},{"categories":["Deep Learning"],"content":"Related work 两个可以做的点:pretext tasks and loss functions 代理任务:为了学习一个好的数据特征表示 损失函数:和代理任务可以分开研究。 MoCo 的创新点在损失函数,又大又一致的字典 影响 infoNCE 目标函数的计算 损失函数:判别式、生成式、对比学习、对抗学习 对比学习的损失:目标不固定,训练过程中不断改变。目标有编码器抽出来的特征(MoCo 的字典)而决定 判别式:预测 8 个位置中的哪一个方位(九宫格) 生成式:重建整张图 对比学习的目标:测量样本对 在特征空间的相似性。相似样本离得近,不相似样本离得远 对抗学习的目标:衡量两个概率分布之间的差异 对比学习和代理任务的关系: 不同的代理任务 可以和 某种形式的对比学习的目标函数 配对使用 Instance discrimination 个体判别方法 —– examplar based 代理任务很相关 CPC contrastive predictive coding 用上下文信息预测未来 —– context auto-encoding 上下文自编码 CMC contrastive multiview coding 利用一个物体的不同视角做对比 —– olorization 图片上色(同一个图片的 2 个视角:黑白 和 彩色) ","date":"2023-06-07","objectID":"/posts/deeplearning/contrastive-learning/moco/:4:0","tags":["Deep Learning","对比学习"],"title":"Moco","uri":"/posts/deeplearning/contrastive-learning/moco/"},{"categories":["Deep Learning"],"content":"Methods 对比学习和最近的发展,都可以看成是一个训练一个 encoder 来做 字典查询 的任务 $$ L_q=-log\\ \\frac{exp(q\\cdot k_+ / \\tau)}{\\sum_{i=0}^K exp(q\\cdot k_i / \\tau)} $$ q : 编码查询,;${k_0, k_1, k_2, …}$: 一组键 key 编码样本 。假设字典中只有一个键$(k_+)$与 q 匹配。 $\\tau$: 温度超参数; K : 负样本个数。 试图将 q 分类为 k+ 的 (K+1)-way softmax-based 分类器的对数损失。 MoCo 框架图 end to end: 编码器都能通过反向传播实时更新学习。(特征高度一致) 局限性:字典大小 == mini-batch size 硬件限制;大 batch size 优化难,难收敛 memory bank: 只有query 编码器可以进行梯度回传更新。 把所有特征存在memory bank里, 从memory bank中随机抽取 key 当作字典;没有反向传播,所以它可以支持大的字典大小。 局限性:不同时刻编码器(梯度回传更新)得到的特征缺乏一致性 MoCo: 采用队列形式实现一个字典( 不受batch size限制 ),使用动量编码器(提高一致性) queue 数据结构: 剥离字典的大小 和 显卡内存的限制,让字典的大小和模型每次做前向传播的 batch size 的大小分开 当前 mini-batch 入队,最早进入队列的 mini-batch 出队 队列的大小 == 字典的大小,但是每次做 iteration 更新,并不需要更新字典中所有 key 元素的值。 momentum encoder:使用 queue,只有当前 mini-batch 的特征是由当前的编码器得到的;之前的 key 是由不同时刻的编码器抽取的特征,如何保持 consistent 呢? momentum encoder 由当前时刻的 encoder 初始化而来 $\\theta_k = m * \\theta_{k-1}+ (1-m) * \\theta_q$ 动量系数: $m\\in[0,1)$。只有参数 θq 通过反向传播更新。 动量参数 m 较大时,$\\theta_k $的更新缓慢,不过多的依赖于 $\\theta_q $当前时刻的编码器,即不随着当前时刻的编码器快速改变,尽可能保证 字典里的 key 都是由相似的编码器生成的特征,保证特征的 consistent MoCo pytorch 算法 Shuffling BN 小 trick ,后续没再用 BN会阻止模型学习好的表征,可能是由于样本中间的信息(由BN引起)泄露。 操作:配给GPU之前打乱样本顺序,用多个GPU进行训练,并对每个GPU的样本独立进行BN。再把顺序恢复后算loss。 ","date":"2023-06-07","objectID":"/posts/deeplearning/contrastive-learning/moco/:5:0","tags":["Deep Learning","对比学习"],"title":"Moco","uri":"/posts/deeplearning/contrastive-learning/moco/"},{"categories":["Deep Learning"],"content":"拓展阅读 MoCo 论文逐段精读【论文精读】 ","date":"2023-06-07","objectID":"/posts/deeplearning/contrastive-learning/moco/:6:0","tags":["Deep Learning","对比学习"],"title":"Moco","uri":"/posts/deeplearning/contrastive-learning/moco/"},{"categories":["Deep Learning"],"content":"MoCo-V2 文章标题:Improved Baselines with Momentum Contrastive Learning 作者: Xinlei Chen, Haoqi Fan, Ross Girshick, Kaiming He 发表时间: (Arxiv 2021) 技术报告 (2页) MoCo v2发现SimCLR里的那些技术都是即插即用型的,引入了mlp projection head以及使用更多的数据增强,就又刷新ImageNet 上的最好成绩。 加了一个 mlp 层 没有 batch norm ;直接 fc + Relu + fc + Relu 加了更多的数据增强 训练的时候用了cosine的 learning rate schedule 训练更长的 epoch,从200变到了800 ","date":"2023-06-07","objectID":"/posts/deeplearning/contrastive-learning/moco/:7:0","tags":["Deep Learning","对比学习"],"title":"Moco","uri":"/posts/deeplearning/contrastive-learning/moco/"},{"categories":["Deep Learning"],"content":"MoCo-V3 文章标题:An Empirical Study of Training Self-Supervised Vision Transformers 作者: Xinlei Chen, Saining Xie, Kaiming He 发表时间: (ICCV 2021) MoCo v2 + SimSiam offical code MoCoV3 pytorch 算法 残差网络换成 ViT 当这个 batch size 变大了以后曲线会抖动,效果变差 方法:观察了一下训练时,每一层这个回传梯度的情况;发现每次 loss 有大幅震动 导致这个准确度大幅下降的时候,梯度也会有一个波峰(发生在第一层) 第一层:patch projection 解决:随机初始化了一个 patch projection 层;然后冻结使得整个训练过程中都不变 ","date":"2023-06-07","objectID":"/posts/deeplearning/contrastive-learning/moco/:8:0","tags":["Deep Learning","对比学习"],"title":"Moco","uri":"/posts/deeplearning/contrastive-learning/moco/"},{"categories":["Deep Learning"],"content":"SimCLR","date":"2023-06-07","objectID":"/posts/deeplearning/contrastive-learning/simclr/","tags":["Deep Learning","对比学习"],"title":"SimCLR","uri":"/posts/deeplearning/contrastive-learning/simclr/"},{"categories":["Deep Learning"],"content":"SimCLR-V1 文章标题:A Simple Framework for Contrastive Learning of Visual Representations 作者:Ting Chen, Simon Kornblith, Mohammad Norouzi, Geoffrey Hinton 发表时间:(ICML 2020) offical code SimCLR V1 如果有一个mini-batch的图片 x,对 x 做不同的数据增强就会得到 $\\hat x_i$ 和 $\\hat x_j$ 一个图片延伸得到的两个图片就是正样本,如果batch size是N的话,正样本个数就是N,负样本的个数就是这个 batch size 剩下所有的样本以及它们数据增强过后的样本是 $2\\times (N - 1)$ (和InvaSpread一样) 通过编码器 $f$ 进行编码(共享权重),得到的 $h$ (特征表示)是2048维 SimCLR的重大创新点 在特征之后又加了一个projector,也就是上图中的 $g$ 函数,它就是一个mlp层(fc 层+ Relu),变成128维 让最后学到的特征在ImageNet 这个分类任务上直接提点将近10个点 不论是32、64还是2048其实都没太大区别 衡量一下正样本之间是不是能达到最大的一致性,采用的是normalized temperature-scaled的交叉熵函数。 normalized:特征后面进行了 L2 归一化,temperature-scaled :温度超参数 $g$ 函数只有在训练的时候才用,只用$h$这个特征去做下游任务 SimCLR和InvaSpread区别 用了更多的数据增强 从最开始的原始图片,到裁剪,到改变色彩,到旋转,使用 cutout,使用高斯的噪声和高斯 blur,以及最后使用sobel滤波器 进行了消融实验发现最有效的两个数据增强就是随机裁剪和随机色彩变换 加了一个$g$函数(一个可以学习的分线性的变换,就是一个 mlp层) 用了更大的batch size ,而且训练的时间更久 ","date":"2023-06-07","objectID":"/posts/deeplearning/contrastive-learning/simclr/:1:0","tags":["Deep Learning","对比学习"],"title":"SimCLR","uri":"/posts/deeplearning/contrastive-learning/simclr/"},{"categories":["Deep Learning"],"content":"SimCLR-V2 文章标题:Big Self-Supervised Models are Strong Semi-Supervised Learners 作者:Ting Chen, Simon Kornblith, Kevin Swersky, Mohammad Norouzi, Geoffrey Hinton 发表时间:(NIPS 2020) 模型总览图 使用了更大的模型,换了一个152层的残差网络,同时用了这个selective kernels,也就是SKNet 加深了V1里的projection head:,原来是 fc + relu,现在是 fc + relu + fc + relu 引入了MoCo的动量编码器 ","date":"2023-06-07","objectID":"/posts/deeplearning/contrastive-learning/simclr/:2:0","tags":["Deep Learning","对比学习"],"title":"SimCLR","uri":"/posts/deeplearning/contrastive-learning/simclr/"},{"categories":["Deep Learning"],"content":"分支通道注意力","date":"2023-06-06","objectID":"/posts/deeplearning/attention/branch_channel-attention/","tags":["Deep Learning","注意力机制"],"title":"Branch Channel Attention","uri":"/posts/deeplearning/attention/branch_channel-attention/"},{"categories":["Deep Learning"],"content":"different branches ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/branch_channel-attention/:1:0","tags":["Deep Learning","注意力机制"],"title":"Branch Channel Attention","uri":"/posts/deeplearning/attention/branch_channel-attention/"},{"categories":["Deep Learning"],"content":"SKNet 文章标题:Selective Kernel Networks 作者:Xiang Li, Wenhai Wang, Xiaolin Hu, Jian Yang 发表时间:(CVPR 2019) Official Code SK_module 用multiple scale feature汇总的information来channel-wise地指导如何分配侧重使用哪个kernel的表征 一种非线性方法来聚合来自多个内核的信息,以实现神经元的自适应感受野大小 Split:生成具有不同内核大小的多条路径,这些路径对应于不同感受野(RF,receptive field) 大小的神经元 $X\\in R^{H’\\times W’\\times C’} $ $\\tilde F:X\\to \\tilde U \\in R^{H\\times W\\times C} $ kernel size $3\\times3$ $\\hat F:X\\to \\hat U \\in R^{H\\times W\\times C}$ kernel size $5\\times5$:使用空洞卷积$3\\times3$,空洞系数为2。 Fuse:聚合来自多个路径的信息,以获得选择权重的全局和综合表示。 $$ U=\\tilde U+\\hat U\\\\ s_c=F_{gp}(U_c)=\\frac{1}{H\\times W}\\sum_{i=1}^H\\sum_{j=1}^WU_c(i,j)\\\\ z=F_{fc}(s)=\\delta(B(Ws)) 降维处理\\\\ $$ $s\\in R^c$;$\\delta$:ReLU;$z\\in R^{d\\times1}$;$W\\in R^{d\\times C}$:批量归一化; $d=max(C/r,L)$ L:d的最小值,本文设置32 Select:根据选择权重聚合不同大小内核的特征图 $$ a_c=\\frac{e^{A_cz}}{e^{A_cz}+e^{B_cz}}\\\\ b_c=\\frac{e^{B_cz}}{e^{A_cz}+e^{B_cz}}\\\\ $$ $ A,B ∈R^{C\\times d}$ ,$ a,b$ 分别表示 $\\tilde U,\\hat U$的软注意力向量。$A_c ∈ R^{1\\times d }$是 A 的第$ c $行,$a_c$ 是 a 的第 $c $个元素,同理$B_c,b_c$。 $$ V_c=a_c\\cdot\\tilde U_c + b_c\\cdot \\hat U_c\\\\\\ a_c+b_c=1\\\\ V_c\\in R^{H\\times W} $$ Selective Kernel Convolution三分支 $SK[M,G,r]\\to SK[2,32,16]$ M:确定要聚合的不同内核的选择数量 G:控制每条路径的基数的组号 r:reduction ratio $$ U_k=F_k(X) \\\\ U = \\sum_{k=1}^K U_k \\\\ z = \\delta(BN(WGAP(U))) \\\\ s_k^{(c)} = \\frac{e^{W_k^{(c)}z}}{\\sum_{k=1}^K e^{W_k^{(c)}z}} \\\\ Y=\\sum_{k=1}^K s_kU_k \\\\ global\\ average\\ pooling\\rightarrow MLP\\rightarrow softmax $$ class SKAttention(nn.Module): def __init__(self, channel=512,kernels=[1,3,5,7],reduction=16,group=1,L=32): super().__init__() self.d=max(L,channel//reduction) self.convs=nn.ModuleList([]) for k in kernels: self.convs.append( nn.Sequential(OrderedDict([ ('conv',nn.Conv2d(channel,channel,kernel_size=k,padding=k//2,groups=group)), ('bn',nn.BatchNorm2d(channel)), ('relu',nn.ReLU()) ])) ) self.fc=nn.Linear(channel,self.d) self.fcs=nn.ModuleList([]) for i in range(len(kernels)): self.fcs.append(nn.Linear(self.d,channel)) self.softmax=nn.Softmax(dim=0) def forward(self, x): bs, c, _, _ = x.size() conv_outs=[] ### split for conv in self.convs: conv_outs.append(conv(x)) feats=torch.stack(conv_outs,0)#k,bs,channel,h,w ### fuse U=sum(conv_outs) #bs,c,h,w ### reduction channel S=U.mean(-1).mean(-1) #bs,c Z=self.fc(S) #bs,d ### calculate attention weight weights=[] for fc in self.fcs: weight=fc(Z) weights.append(weight.view(bs,c,1,1)) #bs,channel attention_weughts=torch.stack(weights,0)#k,bs,channel,1,1 attention_weughts=self.softmax(attention_weughts)#k,bs,channel,1,1 ### fuse V=(attention_weughts*feats).sum(0) return V ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/branch_channel-attention/:2:0","tags":["Deep Learning","注意力机制"],"title":"Branch Channel Attention","uri":"/posts/deeplearning/attention/branch_channel-attention/"},{"categories":["Deep Learning"],"content":"different conv kernels ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/branch_channel-attention/:3:0","tags":["Deep Learning","注意力机制"],"title":"Branch Channel Attention","uri":"/posts/deeplearning/attention/branch_channel-attention/"},{"categories":["Deep Learning"],"content":"CondConv 文章标题:CondConv: Conditionally Parameterized Convolutions for Efficient Inference 作者:Brandon Yang, Gabriel Bender, Quoc V. Le, Jiquan Ngiam 发表时间:(NIPS 2019) official code pytorch版 Condconv $$ \\alpha = \\sigma(W_r(GAP(X))) \\\\ Y = (\\alpha _1W_1+\\dots +\\alpha_nW_n) *X \\\\ global\\ average\\ pooling\\rightarrow linear \\ layer\\rightarrow sigmoid $$","date":"2023-06-06","objectID":"/posts/deeplearning/attention/branch_channel-attention/:4:0","tags":["Deep Learning","注意力机制"],"title":"Branch Channel Attention","uri":"/posts/deeplearning/attention/branch_channel-attention/"},{"categories":["Deep Learning"],"content":"DynamicConv 文章标题:Dynamic Convolution: Attention over Convolution Kernels 作者:Yinpeng Chen, Xiyang Dai, Mengchen Liu, Dongdong Chen, Lu Yuan, Zicheng Liu 发表时间:(CVPR 2020) pytorch版 DynamicConv $$ s = softmax(W_2\\delta(W_1GAP(X))) \\\\ DyConv = \\sum_{i=1}^K s_kConv_k \\\\Y = DyConv(X) $$","date":"2023-06-06","objectID":"/posts/deeplearning/attention/branch_channel-attention/:5:0","tags":["Deep Learning","注意力机制"],"title":"Branch Channel Attention","uri":"/posts/deeplearning/attention/branch_channel-attention/"},{"categories":["Deep Learning"],"content":"通道,空间注意力","date":"2023-06-06","objectID":"/posts/deeplearning/attention/channlespatial-attention/","tags":["Deep Learning","注意力机制"],"title":"Channel \u0026\u0026 Spatial Attention","uri":"/posts/deeplearning/attention/channlespatial-attention/"},{"categories":["Deep Learning"],"content":"Directly estimate 3D attention map ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/channlespatial-attention/:1:0","tags":["Deep Learning","注意力机制"],"title":"Channel \u0026\u0026 Spatial Attention","uri":"/posts/deeplearning/attention/channlespatial-attention/"},{"categories":["Deep Learning"],"content":"Residual Attention 文章标题:Residual Attention Network for Image Classification 作者:Fei Wang, Mengqing Jiang, Chen Qian, Shuo Yang, Cheng Li, Honggang Zhang, Xiaogang Wang, Xiaoou Tang 发表时间:(CVPR 2017) pytorch code ICCV2021-Residual Attention另一篇不同的记得看 Residual_Attention 每个注意力模块可以分为掩码分支和主干分支。 主干分支处理特征,可以换其他先进模块用f表示。 掩码分支使用bottom-up top-down的结构来学习相同大小的掩码,该掩码对来自主干分支的输出特征进行软加权。 bottom-up结构,在残差单元之后使用几次 maxpooling 来增加感受野, top-down部分,使用线性插值来保持输出大小与输入特征图相同。两部分之间也有跳跃连接 在两个 1 × 1 卷积层之后,一个 sigmoid 层将输出归一化为 [0, 1]。 采用由多个卷积组成的bottom-up top-down的结构来生成 3D(高度、宽度、通道)注意力图。 $$ s = \\sigma(Conv_2^{1\\times1}(Conv_1^{1\\times1}(h_{up}(h_{down}(X))))) \\\\ X_{out} = sf(X)+f(X) \\\\ top\\_down\\ network\\rightarrow bottom\\_down\\ network\\rightarrow 1\\times1Conv\\rightarrow Sigmoid $$","date":"2023-06-06","objectID":"/posts/deeplearning/attention/channlespatial-attention/:2:0","tags":["Deep Learning","注意力机制"],"title":"Channel \u0026\u0026 Spatial Attention","uri":"/posts/deeplearning/attention/channlespatial-attention/"},{"categories":["Deep Learning"],"content":"SimAM 文章标题:Simam: A simple, parameter-free attention module for convolutional neural networks 作者:Lingxiao Yang, Ru-Yuan Zhang, Lida Li, Xiaohua Xie , 发表时间:(ICML 2021) pytorch code simam 无参模型,基于数学与神经科学 class simam_module(torch.nn.Module): def __init__(self, channels = None, e_lambda = 1e-4): super(simam_module, self).__init__() self.activaton = nn.Sigmoid() self.e_lambda = e_lambda def forward(self, x): b, c, h, w = x.size() n = w * h - 1 x_minus_mu_square = (x - x.mean(dim=[2,3], keepdim=True)).pow(2) y = x_minus_mu_square / (4 * (x_minus_mu_square.sum(dim=[2,3], keepdim=True) / n + self.e_lambda)) + 0.5 return x * self.activaton(y) ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/channlespatial-attention/:3:0","tags":["Deep Learning","注意力机制"],"title":"Channel \u0026\u0026 Spatial Attention","uri":"/posts/deeplearning/attention/channlespatial-attention/"},{"categories":["Deep Learning"],"content":"Strip Pooling 文章标题:Strip Pooling: Rethinking spatial pooling for scene parsing 作者:Qibin Hou, Li Zhang, Ming-Ming Cheng, Jiashi Feng (一作Coordinate Attention) 发表时间:(CVPR 2020) official code Strip_Pooling #即对应文中的MPM模块 class StripPooling(nn.Module): \"\"\" Reference: \"\"\" def __init__(self, in_channels, pool_size, norm_layer, up_kwargs): super(StripPooling, self).__init__() #空间池化 self.pool1 = nn.AdaptiveAvgPool2d(pool_size[0]) self.pool2 = nn.AdaptiveAvgPool2d(pool_size[1]) #strip pooling self.pool3 = nn.AdaptiveAvgPool2d((1, None)) self.pool4 = nn.AdaptiveAvgPool2d((None, 1)) inter_channels = int(in_channels/4) self.conv1_1 = nn.Sequential(nn.Conv2d(in_channels, inter_channels, 1, bias=False), norm_layer(inter_channels), nn.ReLU(True)) self.conv1_2 = nn.Sequential(nn.Conv2d(in_channels, inter_channels, 1, bias=False), norm_layer(inter_channels), nn.ReLU(True)) self.conv2_0 = nn.Sequential(nn.Conv2d(inter_channels, inter_channels, 3, 1, 1, bias=False), norm_layer(inter_channels)) self.conv2_1 = nn.Sequential(nn.Conv2d(inter_channels, inter_channels, 3, 1, 1, bias=False), norm_layer(inter_channels)) self.conv2_2 = nn.Sequential(nn.Conv2d(inter_channels, inter_channels, 3, 1, 1, bias=False), norm_layer(inter_channels)) self.conv2_3 = nn.Sequential(nn.Conv2d(inter_channels, inter_channels, (1, 3), 1, (0, 1), bias=False), norm_layer(inter_channels)) self.conv2_4 = nn.Sequential(nn.Conv2d(inter_channels, inter_channels, (3, 1), 1, (1, 0), bias=False), norm_layer(inter_channels)) self.conv2_5 = nn.Sequential(nn.Conv2d(inter_channels, inter_channels, 3, 1, 1, bias=False), norm_layer(inter_channels), nn.ReLU(True)) self.conv2_6 = nn.Sequential(nn.Conv2d(inter_channels, inter_channels, 3, 1, 1, bias=False), norm_layer(inter_channels), nn.ReLU(True)) self.conv3 = nn.Sequential(nn.Conv2d(inter_channels*2, in_channels, 1, bias=False), norm_layer(in_channels)) # bilinear interpolate options self._up_kwargs = up_kwargs def forward(self, x): _, _, h, w = x.size() x1 = self.conv1_1(x) x2 = self.conv1_2(x) x2_1 = self.conv2_0(x1) x2_2 = F.interpolate(self.conv2_1(self.pool1(x1)), (h, w), **self._up_kwargs) x2_3 = F.interpolate(self.conv2_2(self.pool2(x1)), (h, w), **self._up_kwargs) x2_4 = F.interpolate(self.conv2_3(self.pool3(x2)), (h, w), **self._up_kwargs) x2_5 = F.interpolate(self.conv2_4(self.pool4(x2)), (h, w), **self._up_kwargs) #PPM分支的输出结果 x1 = self.conv2_5(F.relu_(x2_1 + x2_2 + x2_3)) #strip pooling的输出结果 x2 = self.conv2_6(F.relu_(x2_5 + x2_4)) #拼接+1x1卷积 out = self.conv3(torch.cat([x1, x2], dim=1)) return F.relu_(x + out) ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/channlespatial-attention/:4:0","tags":["Deep Learning","注意力机制"],"title":"Channel \u0026\u0026 Spatial Attention","uri":"/posts/deeplearning/attention/channlespatial-attention/"},{"categories":["Deep Learning"],"content":"SCNet 文章标题:Improving convolutional networks with self-calibrated convolutions 作者:Jiang-Jiang Liu, Qibin Hou, Ming-Ming Cheng, Changhu Wang, Jiashi Feng 发表时间:(CVPR 2020) official code SC_conv class SCConv(nn.Module): def __init__(self, inplanes, planes, stride, padding, dilation, groups, pooling_r, norm_layer): super(SCConv, self).__init__() self.k2 = nn.Sequential( nn.AvgPool2d(kernel_size=pooling_r, stride=pooling_r), nn.Conv2d(inplanes, planes, kernel_size=3, stride=1, padding=padding, dilation=dilation, groups=groups, bias=False), norm_layer(planes), ) self.k3 = nn.Sequential( nn.Conv2d(inplanes, planes, kernel_size=3, stride=1, padding=padding, dilation=dilation, groups=groups, bias=False), norm_layer(planes), ) self.k4 = nn.Sequential( nn.Conv2d(inplanes, planes, kernel_size=3, stride=stride, padding=padding, dilation=dilation, groups=groups, bias=False), norm_layer(planes), ) def forward(self, x): identity = x out = torch.sigmoid(torch.add(identity, F.interpolate(self.k2(x), identity.size()[2:]))) # sigmoid(identity + k2) out = torch.mul(self.k3(x), out) # k3 * sigmoid(identity + k2) out = self.k4(out) # k4 return out ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/channlespatial-attention/:5:0","tags":["Deep Learning","注意力机制"],"title":"Channel \u0026\u0026 Spatial Attention","uri":"/posts/deeplearning/attention/channlespatial-attention/"},{"categories":["Deep Learning"],"content":"VAN 文章标题:Visual Attention Network 作者:Meng-Hao Guo, Cheng-Ze Lu, Zheng-Ning Liu, Ming-Ming Cheng, Shi-Min Hu 发表时间:2022 official code LKA VAN_stage class LKA(nn.Module): def __init__(self, dim): super().__init__() self.conv0 = nn.Conv2d(dim, dim, 5, padding=2, groups=dim) self.conv_spatial = nn.Conv2d(dim, dim, 7, stride=1, padding=9, groups=dim, dilation=3) self.conv1 = nn.Conv2d(dim, dim, 1) def forward(self, x): u = x.clone() attn = self.conv0(x) attn = self.conv_spatial(attn) attn = self.conv1(attn) return u * attn class Attention(nn.Module): def __init__(self, d_model): super().__init__() self.proj_1 = nn.Conv2d(d_model, d_model, 1) self.activation = nn.GELU() self.spatial_gating_unit = LKA(d_model) self.proj_2 = nn.Conv2d(d_model, d_model, 1) def forward(self, x): shorcut = x.clone() x = self.proj_1(x) x = self.activation(x) x = self.spatial_gating_unit(x) x = self.proj_2(x) x = x + shorcut return ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/channlespatial-attention/:6:0","tags":["Deep Learning","注意力机制"],"title":"Channel \u0026\u0026 Spatial Attention","uri":"/posts/deeplearning/attention/channlespatial-attention/"},{"categories":["Deep Learning"],"content":"split channel and spitial attention ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/channlespatial-attention/:7:0","tags":["Deep Learning","注意力机制"],"title":"Channel \u0026\u0026 Spatial Attention","uri":"/posts/deeplearning/attention/channlespatial-attention/"},{"categories":["Deep Learning"],"content":"CBAM 文章标题:CBAM: Convolutional Block Attention Modul 作者:Sanghyun Woo, Jongchan Park, Joon-Young Lee, In So Kweon 发表时间:(ECCV 2018) pytorch code CBAM 空间域中的池化操作 $$ F_{avg}^c=GAP^s(F) \\\\ F_{max}^c=GMP^s(F)\\\\ s_c(X)=\\sigma(W_1\\delta(W_0(F_{avg}^c))+W_1\\delta(W_0(F_{max}^c)))\\\\ M_c(F)=s_cF $$ 通道域中的池化操作 $$ F_{avg}^s=GAP^c(F)\\\\ F_{max}^s=GMP^c(F)\\\\ s_s =\\sigma(f^{7\\times7}([F_{avg}^s;F_{max}^s]))\\\\ M_s(F)=s_sF $$$$ F' = M_c(F)\\\\ Y=M_s(F') $$ 它将通道注意力图和空间注意力图解耦以提高计算效率,并通过引入全局池化来利用空间全局信息 缺点:CBAM 采用卷积来生成空间注意力图,因此空间子模块可能会受到有限的感受野的影响 ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/channlespatial-attention/:8:0","tags":["Deep Learning","注意力机制"],"title":"Channel \u0026\u0026 Spatial Attention","uri":"/posts/deeplearning/attention/channlespatial-attention/"},{"categories":["Deep Learning"],"content":"BAM 文章标题:BAM: Bottleneck Attention Module 作者:Jongchan Park, Sanghyun Woo, Joon-Young Lee, In So Kweon (同CBAM作者) 发表时间:(BMCV 2018) pytorch code BAM $$ M_c(F)=BN(W_1(W_0AvgPool(F)+b_0)+b_1)\\\\ M_s(F)=BN(f_3^{1\\times1}(f_2^{3\\times3}(f_1^{3\\times3}(f_0^{1\\times1}(F)))))\\\\ M(F)=\\sigma(M_c(F)+M_s(F)) $$它使用扩张卷积来扩大空间注意力子模块的感受野,并按照 ResNet 的建议构建瓶颈结构以节省计算成本 为了有效地利用上下文信息,空间注意力分支结合了瓶颈结构和扩张卷积 缺点:尽管扩张卷积有效地扩大了感受野,但它仍然无法捕获远程上下文信息以及编码跨域关系 ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/channlespatial-attention/:9:0","tags":["Deep Learning","注意力机制"],"title":"Channel \u0026\u0026 Spatial Attention","uri":"/posts/deeplearning/attention/channlespatial-attention/"},{"categories":["Deep Learning"],"content":"scSE 文章标题:Recalibrating Fully Convolutional Networks with Spatial and Channel ‘Squeeze \u0026 Excitation’ Blocks Concurrent Spatial and Channel Squeeze \u0026 Excitation in Fully Convolutional Networks (MICCAI 2018) 作者:Abhijit Guha Roy, Nassir Navab, Christian Wachinger 发表时间:(TMI 2018) pytorch code scSE $$ \\hat U_{cSE} = U *\\sigma((W_s\\delta(W_1GAP(U)))) \\\\ \\hat U_{sSE} = U *\\sigma((Conv^{1\\times1}(U)) \\\\ \\hat U_{scSE} = f(\\hat U_{cSE},\\hat U_{sSE}) $$f 表示融合函数,可以是最大值、加法、乘法或串联 ''' https://github.com/qubvel/segmentation_models.pytorch/blob/a6e1123983548be55d4d1320e0a2f5fd9174d4ac/segmentation_models_pytorch/base/modules.py ''' class SCSEModule(nn.Module): def __init__(self, in_channels, reduction=16): super().__init__() self.cSE = nn.Sequential( nn.AdaptiveAvgPool2d(1), nn.Conv2d(in_channels, in_channels // reduction, 1), nn.ReLU(inplace=True), nn.Conv2d(in_channels // reduction, in_channels, 1), nn.Sigmoid(), ) self.sSE = nn.Sequential(nn.Conv2d(in_channels, 1, 1), nn.Sigmoid()) def forward(self, x): return x * self.cSE(x) + x * self.sSE(x) ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/channlespatial-attention/:10:0","tags":["Deep Learning","注意力机制"],"title":"Channel \u0026\u0026 Spatial Attention","uri":"/posts/deeplearning/attention/channlespatial-attention/"},{"categories":["Deep Learning"],"content":"PSA Polarized Self-Attention: Towards High-quality Pixel-wise Regression ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/channlespatial-attention/:11:0","tags":["Deep Learning","注意力机制"],"title":"Channel \u0026\u0026 Spatial Attention","uri":"/posts/deeplearning/attention/channlespatial-attention/"},{"categories":["Deep Learning"],"content":"Cross-dimension interaction ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/channlespatial-attention/:12:0","tags":["Deep Learning","注意力机制"],"title":"Channel \u0026\u0026 Spatial Attention","uri":"/posts/deeplearning/attention/channlespatial-attention/"},{"categories":["Deep Learning"],"content":"Triplet Attention 文章标题:Rotate to attend: Convolutional triplet attention module 作者:Diganta Misra, Trikay Nalamada, Ajay Uppili Arasanipalai, Qibin Hou 发表时间: (WACV 2021) official code Structural Design of Triplet Attention Module. (a). Squeeze Excitation Block. (b). Convolution Block Attention Module (CBAM) . (c). Global Context (GC) block. (d). Triplet Attention 使用三个分支,每个分支都在捕获来自 H、W 和 C 的任意两个域之间的跨域交互。 在每个分支中,沿不同轴的旋转操作应用于输入,然后一个 Z-pool 层负责聚合第零维的信息。 最后,内核大小为 k × k 的标准卷积层对最后两个域之间的关系进行建模。 $$ X_1=Pm_1(X) \\\\X_2=Pm_2(X) \\\\s_0=\\sigma(Conv_0(ZPool(X))) \\\\s_1=\\sigma(Conv_1(ZPool(X_1))) \\\\s_2=\\sigma(Conv_2(ZPool(X_2))) \\\\ Y=\\frac{1}{3}(s_0X+Pm_1^{-1}(s_1X_1)+Pm_2^{-1}(s_2X_2)) $$其中 $P_{m1},P_{m2}$ 分别表示绕 H 轴和 W 轴逆时针旋转 90°,而$P_{mi}^{-1}$ 表示逆时针旋转。 Z-Pool 沿第零维连接最大池化和平均池化 triplet attention 强调捕获跨域交互的重要性,而不是独立计算空间注意力和通道注意力。这有助于捕获丰富的判别特征表示。 GCBlock = SEBlock + Simplified selfattention ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/channlespatial-attention/:12:1","tags":["Deep Learning","注意力机制"],"title":"Channel \u0026\u0026 Spatial Attention","uri":"/posts/deeplearning/attention/channlespatial-attention/"},{"categories":["Deep Learning"],"content":"Long-range dependencies ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/channlespatial-attention/:13:0","tags":["Deep Learning","注意力机制"],"title":"Channel \u0026\u0026 Spatial Attention","uri":"/posts/deeplearning/attention/channlespatial-attention/"},{"categories":["Deep Learning"],"content":"Coordinate Attention 文章标题:Coordinate attention for efficient mobile network design 作者:Qibin Hou, Daquan Zhou, Jiashi Feng 发表时间: (CVPR 2021) official code (a) Squeeze-and-Excitation block (b) CBAM (C) Coordinate attention block 将位置信息嵌入到通道注意中,使网络以很少的计算成本关注重要区域 coordinate information embedding 池化内核的两个空间范围$(H,1),(1,W)$对每个通道进行水平和垂直编码。 $z_c^h = GAP^h(X)=\\frac{1}{W}\\sum_{0\\leq i\u003cW}x_c(h,i)$ $z_c^w = GAP^w(X)=\\frac{1}{H}\\sum_{0\\leq j\u003cH}x_c(j,w)$ coordinate attention generation $$ f=\\sigma(BN(Conv_1^{1\\times1}([z_c^h;z_c^w]))) \\\\ f^h,f^w=Split(f) \\\\s^h=\\sigma(Conv_h^{1\\times1}(f^h)) \\\\s^w=\\sigma(Conv_w^{1\\times1}(f^w)) \\\\Y=Xs^hs^w $$ class CoordAtt(nn.Module): def __init__(self, inp, oup, reduction=32): super(CoordAtt, self).__init__() self.pool_h = nn.AdaptiveAvgPool2d((None, 1)) self.pool_w = nn.AdaptiveAvgPool2d((1, None)) mip = max(8, inp // reduction) self.conv1 = nn.Conv2d(inp, mip, kernel_size=1, stride=1, padding=0) self.bn1 = nn.BatchNorm2d(mip) self.act = h_swish() self.conv_h = nn.Conv2d(mip, oup, kernel_size=1, stride=1, padding=0) self.conv_w = nn.Conv2d(mip, oup, kernel_size=1, stride=1, padding=0) def forward(self, x): identity = x n,c,h,w = x.size() x_h = self.pool_h(x) x_w = self.pool_w(x).permute(0, 1, 3, 2) y = torch.cat([x_h, x_w], dim=2) y = self.conv1(y) y = self.bn1(y) y = self.act(y) x_h, x_w = torch.split(y, [h, w], dim=2) x_w = x_w.permute(0, 1, 3, 2) a_h = self.conv_h(x_h).sigmoid() a_w = self.conv_w(x_w).sigmoid() out = identity * a_w * a_h return out ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/channlespatial-attention/:13:1","tags":["Deep Learning","注意力机制"],"title":"Channel \u0026\u0026 Spatial Attention","uri":"/posts/deeplearning/attention/channlespatial-attention/"},{"categories":["Deep Learning"],"content":"DANet 文章标题:Dual Attention Network for Scene Segmentation 作者:Jun Fu, Jing Liu, Haijie Tian, Yong Li, Yongjun Bao, Zhiwei Fang,and Hanqing Lu 发表时间: (CVPR 2019) official code danet danet Position attention–\u003e selfattention $$ Q,K,V=W_qX,W_kX,W_vX \\\\Y^{pos} = X+V*Softmax(Q^TK) \\\\ Y^{chn} = X + X * Softmax(X^TX) \\\\ Y = Y^{pos}+Y^{chn} $$","date":"2023-06-06","objectID":"/posts/deeplearning/attention/channlespatial-attention/:13:2","tags":["Deep Learning","注意力机制"],"title":"Channel \u0026\u0026 Spatial Attention","uri":"/posts/deeplearning/attention/channlespatial-attention/"},{"categories":["Deep Learning"],"content":"Relation-aware attention ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/channlespatial-attention/:14:0","tags":["Deep Learning","注意力机制"],"title":"Channel \u0026\u0026 Spatial Attention","uri":"/posts/deeplearning/attention/channlespatial-attention/"},{"categories":["Deep Learning"],"content":"RGA 文章标题:Relation-Aware Global Attention for Person Re-identification 作者:Zhizheng Zhang, Cuiling Lan, Wenjun Zeng, Xin Jin, Zhibo Chen 发表时间: (CVPR 2020) official code RGA $$ Q =\\delta(W^Q X)\\\\ K =\\delta(W^K X)\\\\ R=Q^TK\\\\ r_i=[R(i,:);R(:,i)]\\\\ Y_i=[g_{avg}^c(\\delta(W^{\\varphi}x_i));\\delta(W^{\\phi}r_i)]\\\\ a_i=\\sigma(W_2\\delta(W_1y_i)) $$channel和spital形式一样。位置上的空间注意力得分$a_i$ 建议按顺序联合使用它们以更好地捕捉空间和跨通道关系。 ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/channlespatial-attention/:14:1","tags":["Deep Learning","注意力机制"],"title":"Channel \u0026\u0026 Spatial Attention","uri":"/posts/deeplearning/attention/channlespatial-attention/"},{"categories":["Deep Learning"],"content":"通道注意力","date":"2023-06-06","objectID":"/posts/deeplearning/attention/channel-attention/","tags":["Deep Learning","注意力机制"],"title":"Channel Attention","uri":"/posts/deeplearning/attention/channel-attention/"},{"categories":["Deep Learning"],"content":"SENet 文章标题:Squeeze-and-Excitation Networks 作者:Jie Hu, Li Shen, Samuel Albanie, Gang Sun, Enhua Wu 发表时间:(CVPR 2018) Official Code External-Attention-pytorch senet.pytorch Diagram of a Squeeze-and-Excitation building block. $$ u_c=v_c*X=\\sum_{s=1}^{C'}v_c^s*x^s $$ 输入:$X=[x^1,x^2,…,x^{C’}]$ 输出:$U=[u_1,u_2,…,u_C]$ $v_c=[v_c^1,v_c^2,…,v_c^{C’}] $;$v_c^s$是一个二维空间内核,表示作用于 $X $的相应通道的 $v_c$的单个通道。 卷积核的集合:$V=[v_1,v_2,…,v_C]$ 压缩(Squeeze):经过(全局平均池化)压缩操作后特征图被压缩为1×1×C向量;也可以采用更复杂的策略**(收集全局空间信息)** 为什么用平均池化:卷积计算:参数量比较大;最大池化:可能用于检测等其他任务,输入的特征图是变化的,能量无法保持 $$ z_c=F_{sq}(u_c)=\\frac{1}{H\\times W}\\sum_{i=1}^H\\sum_{j=1}^Wu_c(i,j) $$ 激励(Excitation):将特征维度降低到输入的 1/16$(r)$,然后经过 ReLu 激活后再通过一个 Fully Connected 层升回到原来的维度,然后通过一个 Sigmoid 的门获得 0~1 之间归一化的权重(捕获通道级关系并输出注意向量) $$ s=F_{ex}(z,W)=\\sigma(g(z,W)) =\\sigma(W_2\\delta(W_1z)) $$ $\\delta$:ReLU;$\\sigma$:sigmoid激活,$W_1\\in R^{\\frac{C}{r}\\times C}$:降维层;$W_2\\in R^{C \\times\\frac{C}{r}}$:升维层 比直接用一个 Fully Connected 层的好处在于 1)具有更多的非线性,可以更好地拟合通道间复杂的相关性; 2)极大地减少了参数量和计算量 c可能很大,所以需要降维 scale操作:最后通过一个 Scale 的操作来将归一化后的权重加权到每个通道的特征上 $$ \\tilde x_c = F_{scale}(u_c,s_c)=s_cu_c $$$$ s = F_{se}(X,\\theta) = \\sigma(W_2\\delta(W_1 GAP(X))) \\\\ Y = sX \\\\ global\\ average\\ pooling\\rightarrow MLP\\rightarrow sigmoid $$ 缺点:在挤压模块中,全局平均池(一阶统计信息)太过简单,无法捕获复杂的全局信息。在激励模块中,全连接层增加了模型的复杂性。 GAP(全局平均池化)在某些情况下会失效,如将SE模块部署在LN层之后,因为LN固定了每个通道的平均数,对于任意输入,GAP的输出都是恒定的。 import torch from torch import nn class SEAttention(nn.Module): def __init__(self, channel=512,reduction=16): super().__init__() self.avg_pool = nn.AdaptiveAvgPool2d(1) self.fc = nn.Sequential( nn.Linear(channel, channel // reduction, bias=False), nn.ReLU(inplace=True), nn.Linear(channel // reduction, channel, bias=False), nn.Sigmoid() ) def forward(self, x): b, c, _, _ = x.size() y = self.avg_pool(x).view(b, c) y = self.fc(y).view(b, c, 1, 1) return x * y.expand_as(x) CV27 Momenta研发总监 孙刚 Squeeze and Excitation Networks上 CV27 Momenta研发总监 孙刚 Squeeze and Excitation Networks下 ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/channel-attention/:1:0","tags":["Deep Learning","注意力机制"],"title":"Channel Attention","uri":"/posts/deeplearning/attention/channel-attention/"},{"categories":["Deep Learning"],"content":"改进挤压模块 ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/channel-attention/:2:0","tags":["Deep Learning","注意力机制"],"title":"Channel Attention","uri":"/posts/deeplearning/attention/channel-attention/"},{"categories":["Deep Learning"],"content":"EncNet 文章标题:Context Encoding for Semantic Segmentation 作者:Hang Zhang, Kristin Dana, Jianping Shi, Zhongyue Zhang, Xiaogang Wang, Ambrish Tyagi, Amit Agrawal 发表时间:(CVPR 2018) Official Code (看不懂) EncModule $$ e_k = \\frac{\\sum_{i=1}^N e^{-s_k||X_i-d_k||^2}(X_i-d_k)}{\\sum_{i=1}^K e^{-s_j||X_i-d_j||^2}} \\\\ e = \\sum_{k=1}^K \\phi(e_k) \\\\ s = \\sigma(We) \\\\ Y = sX \\\\ encoder\\rightarrow MLP\\rightarrow sigmoid $$ class EncModule(nn.Module): def __init__(self, in_channels, nclass, ncodes=32, se_loss=True, norm_layer=None): super(EncModule, self).__init__() self.se_loss = se_loss self.encoding = nn.Sequential( nn.Conv2d(in_channels, in_channels, 1, bias=False), norm_layer(in_channels), nn.ReLU(inplace=True), Encoding(D=in_channels, K=ncodes), norm_layer(ncodes), nn.ReLU(inplace=True), Mean(dim=1)) self.fc = nn.Sequential( nn.Linear(in_channels, in_channels), nn.Sigmoid()) if self.se_loss: self.selayer = nn.Linear(in_channels, nclass) def forward(self, x): en = self.encoding(x) b, c, _, _ = x.size() gamma = self.fc(en) y = gamma.view(b, c, 1, 1) outputs = [F.relu_(x + x * y)] if self.se_loss: outputs.append(self.selayer(en)) return tuple(outputs) ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/channel-attention/:3:0","tags":["Deep Learning","注意力机制"],"title":"Channel Attention","uri":"/posts/deeplearning/attention/channel-attention/"},{"categories":["Deep Learning"],"content":"GSoP-Net 文章标题:Global Second-order Pooling Convolutional Networks 作者:Zilin Gao, Jiangtao Xie, Qilong Wang, Peihua Li 发表时间:(CVPR 2019) Official Code GSoP-block 压缩(Squeeze) 使用 1x1 卷积将输入特征通道降维(c’-\u003ec) 计算通道间的协方差矩阵($c\\times c$) 由于二次运算涉及到改变数据的顺序,因此对协方差矩阵执行逐行归一化,保留固有的结构信息 激励(Excitation) 对协方差特征图进行非线性逐行卷积得到4c的结构信息 用一个全连接层调整到输入的通道数c ′维度, 通过sigmoid 函数得到注意力向量与输入进行逐通道相乘,得到输出特征 $$ s = F_{gsop}(X,\\theta) = \\sigma(WRC(Cov(Conv(X)))) \\\\ Y=sX \\\\ 2nd\\ order\\ pooling\\rightarrow convolution\\\u0026MLP\\rightarrow sigmoid $$在收集全局信息的同时,使用全局二阶池化(GSoP)块对高阶统计数据建模 self.isqrt_dim = 256 self.layer_reduce = nn.Conv2d(512 * block.expansion, self.isqrt_dim, kernel_size=1, stride=1, padding=0, bias=False) self.layer_reduce_bn = nn.BatchNorm2d(self.isqrt_dim) self.layer_reduce_relu = nn.ReLU(inplace=True) self.fc = nn.Linear(int(self.isqrt_dim * (self.isqrt_dim + 1) / 2), num_classes) # forward x = self.layer_reduce(x) x = self.layer_reduce_bn(x) x = self.layer_reduce_relu(x) x = MPNCOV.CovpoolLayer(x) x = MPNCOV.SqrtmLayer(x, 3) x = MPNCOV.TriuvecLayer(x) ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/channel-attention/:4:0","tags":["Deep Learning","注意力机制"],"title":"Channel Attention","uri":"/posts/deeplearning/attention/channel-attention/"},{"categories":["Deep Learning"],"content":"FcaNet 文章标题:FcaNet: Frequency Channel Attention Networks 作者:Zequn Qin, Pengyi Zhang, Fei Wu, Xi Li 发表时间:(ICCV 2021) official code fca_module GAP是DCT(二维离散余弦变换)的特例 将图像特征分解为不同频率分量的组合。GAP操作仅利用到了其中的一个频率分量。 首先,将输入 X 按通道维度划分为n部分,其中n必须能被通道数整除。 对于每个部分,分配相应的二维DCT频率分量,其结果可作为通道注意力的预处理结果(类似于GAP) 2D DCT可以使用预处理结果来减少计算 $$ s = F_{fca}(X,\\theta) = \\sigma(W_2\\delta(W_1[(DCT(Group(X)))])) \\\\ Y=sX \\\\ discrete\\ cosine\\ transform\\rightarrow MLP\\rightarrow sigmoid $$ # https://github.com/cfzd/FcaNet/blob/aa5fb63505575bb4e4e094613565379c3f6ada33/model/layer.py#L29 class MultiSpectralAttentionLayer(torch.nn.Module): def __init__(self, channel, dct_h, dct_w, reduction = 16, freq_sel_method = 'top16'): super(MultiSpectralAttentionLayer, self).__init__() self.reduction = reduction self.dct_h = dct_h self.dct_w = dct_w mapper_x, mapper_y = get_freq_indices(freq_sel_method) self.num_split = len(mapper_x) mapper_x = [temp_x * (dct_h // 7) for temp_x in mapper_x] mapper_y = [temp_y * (dct_w // 7) for temp_y in mapper_y] # make the frequencies in different sizes are identical to a 7x7 frequency space # eg, (2,2) in 14x14 is identical to (1,1) in 7x7 self.dct_layer = MultiSpectralDCTLayer(dct_h, dct_w, mapper_x, mapper_y, channel) self.fc = nn.Sequential( nn.Linear(channel, channel // reduction, bias=False), nn.ReLU(inplace=True), nn.Linear(channel // reduction, channel, bias=False), nn.Sigmoid() ) def forward(self, x): n,c,h,w = x.shape x_pooled = x if h != self.dct_h or w != self.dct_w: x_pooled = torch.nn.functional.adaptive_avg_pool2d(x, (self.dct_h, self.dct_w)) # If you have concerns about one-line-change, don't worry. :) # In the ImageNet models, this line will never be triggered. # This is for compatibility in instance segmentation and object detection. y = self.dct_layer(x_pooled) y = self.fc(y).view(n, c, 1, 1) return x * y.expand_as(x) ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/channel-attention/:5:0","tags":["Deep Learning","注意力机制"],"title":"Channel Attention","uri":"/posts/deeplearning/attention/channel-attention/"},{"categories":["Deep Learning"],"content":"Billinear attention 文章标题:Bilinear Attention Networks for Person Retrieval 作者:Pengfei Fang , Jieming Zhou , Soumava Kumar Roy , Lars Petersson , Mehrtash Harandi, 发表时间:(ICCV 2019) Billinear_attention 线性注意块(双注意),以捕获每个通道内的局部成对特征交互,同时保留空间信息。 双注意采用注意中注意(AiA)机制来捕获二阶统计信息:从内部通道注意的输出计算外部逐点通道注意向量。形式上,给定输入特征映射X,bi注意首先使用双线性池来捕获二阶信息 $$ \\widetilde x = Bi(\\phi(X))=Vec(Utri(\\phi(X)\\phi(X)^T)) \\\\ \\hat x = \\omega (GAP(\\widetilde x))\\varphi(\\widetilde x) \\\\ s = \\sigma(\\hat x) \\\\ Y =sX $$","date":"2023-06-06","objectID":"/posts/deeplearning/attention/channel-attention/:6:0","tags":["Deep Learning","注意力机制"],"title":"Channel Attention","uri":"/posts/deeplearning/attention/channel-attention/"},{"categories":["Deep Learning"],"content":"改进激励模块 ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/channel-attention/:7:0","tags":["Deep Learning","注意力机制"],"title":"Channel Attention","uri":"/posts/deeplearning/attention/channel-attention/"},{"categories":["Deep Learning"],"content":"ECANet 文章标题:ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks 作者:Qilong Wang, Banggu Wu, Pengfei Zhu, Peihua Li, Wangmeng Zuo, Qinghua Hu 发表时间:(CVPR 2020) official code eca_module $$ s=F_{eca}(X,\\theta) = \\sigma(Conv1D(GAP(X))) \\\\ Y = sX \\\\ global\\ average\\ pooling\\rightarrow conv1d\\rightarrow sigmoid $$使用1D卷积来确定通道之间的相互作用,而不是全连接降维。 只考虑每个通道与其k近邻之间的直接交互,而不是间接对应,以控制模型复杂度 使用交叉验证从通道维度C自适应确定内核大小k,而不是通过手动调整 $k = \\psi(C)=|\\frac{log_{2}(C)}{\\gamma}+\\frac{b}{\\gamma}|_{odd}$ $\\gamma, b$超参数;$|x|_{odd}$:最近的奇数 kernel_size = int(abs((math.log(channel, 2) + b) / gamma)) kernel_size = kernel_size if kernel_size % 2 else kernel_size + 1 # 为啥源码ResNet固定kernel_size: https://github.com/BangguWu/ECANet/issues/24 import torch from torch import nn from torch.nn.parameter import Parameter class eca_layer(nn.Module): \"\"\"Constructs a ECA module. Args: channel: Number of channels of the input feature map k_size: Adaptive selection of kernel size \"\"\" def __init__(self, channel, k_size=3): super(eca_layer, self).__init__() self.avg_pool = nn.AdaptiveAvgPool2d(1) self.conv = nn.Conv1d(1, 1, kernel_size=k_size, padding=(k_size - 1) // 2, bias=False) self.sigmoid = nn.Sigmoid() def forward(self, x): # feature descriptor on the global spatial information y = self.avg_pool(x) # Two different branches of ECA module y = self.conv(y.squeeze(-1).transpose(-1, -2)).transpose(-1, -2).unsqueeze(-1) # Multi-scale information fusion y = self.sigmoid(y) return x * y.expand_as(x) ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/channel-attention/:8:0","tags":["Deep Learning","注意力机制"],"title":"Channel Attention","uri":"/posts/deeplearning/attention/channel-attention/"},{"categories":["Deep Learning"],"content":"RCAN 文章标题:Image Super-Resolution Using Very Deep Residual Channel Attention Networks 作者:Yulun Zhang, Kunpeng Li 发表时间:(ECCV 2018) official Code CA $$ s=F_{rca}(X,\\theta) = \\sigma(Conv_U(\\delta(Conv_D(GAP(X)))) \\\\ Y = sX \\\\ global\\ average\\ pooling\\rightarrow conv2d\\rightarrow Relu \\rightarrow conv2d\\rightarrow sigmoid $$ class CALayer(nn.Module): def __init__(self, channel, reduction=16): super(CALayer, self).__init__() # global average pooling: feature --\u003e point self.avg_pool = nn.AdaptiveAvgPool2d(1) # feature channel downscale and upscale --\u003e channel weight self.conv_du = nn.Sequential( nn.Conv2d(channel, channel // reduction, 1, padding=0, bias=True), nn.ReLU(inplace=True), nn.Conv2d(channel // reduction, channel, 1, padding=0, bias=True), nn.Sigmoid() ) def forward(self, x): y = self.avg_pool(x) y = self.conv_du(y) return x * y ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/channel-attention/:9:0","tags":["Deep Learning","注意力机制"],"title":"Channel Attention","uri":"/posts/deeplearning/attention/channel-attention/"},{"categories":["Deep Learning"],"content":"DIANet 文章标题:DIANet: Dense-and-Implicit Attention Network 作者:Zhongzhan Huang, Senwei Liang, Mingfu Liang, Haizhao Yang 发表时间:(AAAI 2020) official Code DIA_module $$ s = F_{dia}(X,\\theta) = \\delta (LSTM(GAP(X))) \\\\ Y = sX + X \\\\ global\\ average\\ pooling\\rightarrow LSTM\\rightarrow Relu $$ class DIA_Attention(nn.Module): def __init__(self, ModuleList, block_idx): super(DIA_Attention, self).__init__() self.ModuleList = ModuleList if block_idx == 1: self.lstm = LSTMCell(64, 64, 1) elif block_idx == 2: self.lstm = LSTMCell(128, 128, 1) elif block_idx == 3: self.lstm = LSTMCell(256, 256, 1) self.GlobalAvg = nn.AdaptiveAvgPool2d((1, 1)) self.relu = nn.ReLU(inplace=True) self.block_idx = block_idx def forward(self, x): for idx, layer in enumerate(self.ModuleList): x, org = layer(x) # 64 128 256 BatchSize * NumberOfChannels * 1 * 1 # BatchSize * NumberOfChannels if idx == 0: seq = self.GlobalAvg(x) # list = seq.view(seq.size(0), 1, seq.size(1)) seq = seq.view(seq.size(0), seq.size(1)) ht = torch.zeros(1, seq.size(0), seq.size(1)).cuda() # 1 mean number of layers ct = torch.zeros(1, seq.size(0), seq.size(1)).cuda() ht, ct = self.lstm(seq, (ht, ct)) # 1 * batch size * length # ht = self.sigmoid(ht) x = x * (ht[-1].view(ht.size(1), ht.size(2), 1, 1)) x += org # x = selrelu(x) else: seq = self.GlobalAvg(x) # list = torch.cat((list, seq.view(seq.size(0), 1, seq.size(1))), 1) seq = seq.view(seq.size(0), seq.size(1)) ht, ct = self.lstm(seq, (ht, ct)) # ht = self.sigmoid(ht) x = x * (ht[-1].view(ht.size(1), ht.size(2), 1, 1)) x += org # x = self.relu(x) # print(self.block_idx, idx, ht) return x #, list ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/channel-attention/:10:0","tags":["Deep Learning","注意力机制"],"title":"Channel Attention","uri":"/posts/deeplearning/attention/channel-attention/"},{"categories":["Deep Learning"],"content":"同时改进挤压、激励模块 ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/channel-attention/:11:0","tags":["Deep Learning","注意力机制"],"title":"Channel Attention","uri":"/posts/deeplearning/attention/channel-attention/"},{"categories":["Deep Learning"],"content":"SRM 文章标题:SRM : A Style-based Recalibration Module for Convolutional Neural Networks 作者:HyunJae Lee, Hyo-Eun Kim, Hyeonseob Nam 发表时间:(ICCV 2019) Code SRM $$ s = F_{srm}(X,\\theta) = \\sigma(BN(CFC(SP(X)))) \\\\ Y = sX \\\\ style\\ pooling\\rightarrow convolution\\\u0026 MLP\\rightarrow sigmoid $$利用输入特征的平均值和标准偏差来提高捕获全局信息的能力 class SRMLayer(nn.Module): def __init__(self, channel, reduction=None): # Reduction for compatibility with layer_block interface super(SRMLayer, self).__init__() # CFC: channel-wise fully connected layer self.cfc = nn.Conv1d(channel, channel, kernel_size=2, bias=False, groups=channel) self.bn = nn.BatchNorm1d(channel) def forward(self, x): b, c, _, _ = x.size() # Style pooling mean = x.view(b, c, -1).mean(-1).unsqueeze(-1) std = x.view(b, c, -1).std(-1).unsqueeze(-1) u = torch.cat((mean, std), -1) # (b, c, 2) # Style integration z = self.cfc(u) # (b, c, 1) z = self.bn(z) g = torch.sigmoid(z) g = g.view(b, c, 1, 1) return x * g.expand_as(x) ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/channel-attention/:12:0","tags":["Deep Learning","注意力机制"],"title":"Channel Attention","uri":"/posts/deeplearning/attention/channel-attention/"},{"categories":["Deep Learning"],"content":"GCT 文章标题:Gated Channel Transformation for Visual Recognition 作者:Zongxin Yang, Linchao Zhu, Yu Wu, Yi Yang 发表时间:(CVPR 2020) official Code GCT GCT模块可以促进 shallow layer 特征间的合作,同时,促进 deep layer 特征间的竞争。这样,浅层特征可以更好的获取通用的属性,深层特征可以更好的获取与任务相关的 discriminative 特征 通过计算每个通道的l2范数来收集全局信息。 利用可学习向量$\\alpha$对特征进行缩放。然后通过通道归一化,采用竞争机制来实现信道间的交互。 $$ s = F_{gct}(X,\\theta)=tanh(\\gamma CN(\\alpha Norm(X))+\\beta) \\\\ Y = sX+X \\\\ computer\\ L2norm\\ on \\ spatial\\rightarrow channel\\ normalization\\rightarrow tanh $$ class GCT(nn.Module): def __init__(self, num_channels, epsilon=1e-5, mode='l2', after_relu=False): super(GCT, self).__init__() self.alpha = nn.Parameter(torch.ones(1, num_channels, 1, 1)) self.gamma = nn.Parameter(torch.zeros(1, num_channels, 1, 1)) self.beta = nn.Parameter(torch.zeros(1, num_channels, 1, 1)) self.epsilon = epsilon self.mode = mode self.after_relu = after_relu def forward(self, x): if self.mode == 'l2': embedding = (x.pow(2).sum((2,3), keepdim=True) + self.epsilon).pow(0.5) * self.alpha norm = self.gamma / (embedding.pow(2).mean(dim=1, keepdim=True) + self.epsilon).pow(0.5) elif self.mode == 'l1': if not self.after_relu: _x = torch.abs(x) else: _x = x embedding = _x.sum((2,3), keepdim=True) * self.alpha norm = self.gamma / (torch.abs(embedding).mean(dim=1, keepdim=True) + self.epsilon) else: print('Unknown mode!') sys.exit() gate = 1. + torch.tanh(embedding * norm + self.beta) return x * gate ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/channel-attention/:13:0","tags":["Deep Learning","注意力机制"],"title":"Channel Attention","uri":"/posts/deeplearning/attention/channel-attention/"},{"categories":["Deep Learning"],"content":"SoCA 文章标题:Second-order Attention Network for Single Image Super-Resolution 作者:Tao Dai1,2, Jianrui Cai , Yongbing Zhang 发表时间:(CVPR 2019) 基于RCAN official code SoCA $$ s=F_{soca}(X,\\theta) = \\sigma(Conv_U(\\delta(Conv_D(GCP(X)))) \\\\ Y = sX \\\\ global\\ covariance\\ pooling\\rightarrow conv2d\\rightarrow Relu \\rightarrow conv2d\\rightarrow sigmoid $$ class SOCA(nn.Module): def __init__(self, channel, reduction=8): super(SOCA, self).__init__() # global average pooling: feature --\u003e point # self.avg_pool = nn.AdaptiveAvgPool2d(1) # self.max_pool = nn.AdaptiveMaxPool2d(1) self.max_pool = nn.MaxPool2d(kernel_size=2) # feature channel downscale and upscale --\u003e channel weight self.conv_du = nn.Sequential( nn.Conv2d(channel, channel // reduction, 1, padding=0, bias=True), nn.ReLU(inplace=True), nn.Conv2d(channel // reduction, channel, 1, padding=0, bias=True), nn.Sigmoid() ) def forward(self, x): batch_size, C, h, w = x.shape # x: NxCxHxW N = int(h * w) min_h = min(h, w) h1 = 1000 w1 = 1000 if h \u003c h1 and w \u003c w1: x_sub = x elif h \u003c h1 and w \u003e w1: # H = (h - h1) // 2 W = (w - w1) // 2 x_sub = x[:, :, :, W:(W + w1)] elif w \u003c w1 and h \u003e h1: H = (h - h1) // 2 # W = (w - w1) // 2 x_sub = x[:, :, H:H + h1, :] else: H = (h - h1) // 2 W = (w - w1) // 2 x_sub = x[:, :, H:(H + h1), W:(W + w1)] ## MPN-COV cov_mat = MPNCOV.CovpoolLayer(x_sub) cov_mat_sqrt = MPNCOV.SqrtmLayer(cov_mat,5) ## cov_mat_sum = torch.mean(cov_mat_sqrt,1) cov_mat_sum = cov_mat_sum.view(batch_size,C,1,1) y_cov = self.conv_du(cov_mat_sum) return y_cov*x ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/channel-attention/:14:0","tags":["Deep Learning","注意力机制"],"title":"Channel Attention","uri":"/posts/deeplearning/attention/channel-attention/"},{"categories":["Deep Learning"],"content":"空间注意力","date":"2023-06-06","objectID":"/posts/deeplearning/attention/spatial-attention/","tags":["Deep Learning","注意力机制"],"title":"Spatial Attention","uri":"/posts/deeplearning/attention/spatial-attention/"},{"categories":["Deep Learning"],"content":"基于 RNN 的方法 ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/spatial-attention/:1:0","tags":["Deep Learning","注意力机制"],"title":"Spatial Attention","uri":"/posts/deeplearning/attention/spatial-attention/"},{"categories":["Deep Learning"],"content":"RAM 文章标题: 作者: 发表时间:() ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/spatial-attention/:2:0","tags":["Deep Learning","注意力机制"],"title":"Spatial Attention","uri":"/posts/deeplearning/attention/spatial-attention/"},{"categories":["Deep Learning"],"content":"使用子网络来显式预测相关区域 ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/spatial-attention/:3:0","tags":["Deep Learning","注意力机制"],"title":"Spatial Attention","uri":"/posts/deeplearning/attention/spatial-attention/"},{"categories":["Deep Learning"],"content":"STN ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/spatial-attention/:4:0","tags":["Deep Learning","注意力机制"],"title":"Spatial Attention","uri":"/posts/deeplearning/attention/spatial-attention/"},{"categories":["Deep Learning"],"content":"使用子网络来隐式预测软掩码以选择重要区域 ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/spatial-attention/:5:0","tags":["Deep Learning","注意力机制"],"title":"Spatial Attention","uri":"/posts/deeplearning/attention/spatial-attention/"},{"categories":["Deep Learning"],"content":"GENet ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/spatial-attention/:6:0","tags":["Deep Learning","注意力机制"],"title":"Spatial Attention","uri":"/posts/deeplearning/attention/spatial-attention/"},{"categories":["Deep Learning"],"content":"自注意力相关 ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/spatial-attention/:7:0","tags":["Deep Learning","注意力机制"],"title":"Spatial Attention","uri":"/posts/deeplearning/attention/spatial-attention/"},{"categories":["Deep Learning"],"content":"Non-local 文章标题:Non-local Neural Networks 作者:Xiaolong Wang, Ross Girshick, Abhinav Gupta, Kaiming He 发表时间:(CVPR 2018) official code Non-local_pytorch Non-Local-NN-Pytorch Non-Local $$ y = softmax(x^TW_\\theta^T W_\\phi x)g(x) \\\\ y_i=\\frac{1}{C(x)}\\sum_{\\forall j}f(x_i,x_j)g(x_j) \\\\ z_i= W_zy_i+x_i \\\\ Q,K \\ dot\\ product\\rightarrow softmax $$ f 函数式计算i和j的相似度;g 函数计算feature map在j位置的表示;最终的y是通过响应因子C(x) 进行标准化处理以后得到的 Non-Local neural networks的理解与实现 ","date":"2023-06-06","objectID":"/posts/deeplearning/attention/spatial-attention/:8:0","tags":["Deep Learning","注意力机制"],"title":"Spatial Attention","uri":"/posts/deeplearning/attention/spatial-attention/"},{"categories":["Deep Learning"],"content":"CycleGAN","date":"2023-06-05","objectID":"/posts/deeplearning/generative-model/cyclegan/","tags":["Deep Learning","生成模型"],"title":"CycleGAN","uri":"/posts/deeplearning/generative-model/cyclegan/"},{"categories":["Deep Learning"],"content":"CycleGAN 文章标题:Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks 作者:Jun-Yan Zhu, Taesung Park, Phillip Isola Alexei A. Efros 发表时间:(ICCV 2017) 论文主页 ","date":"2023-06-05","objectID":"/posts/deeplearning/generative-model/cyclegan/:1:0","tags":["Deep Learning","生成模型"],"title":"CycleGAN","uri":"/posts/deeplearning/generative-model/cyclegan/"},{"categories":["Deep Learning"],"content":"pix2pix 文章标题:Image-to-Image Translation with Conditional Adversarial Nets 作者:Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, Alexei A. Efros 发表时间:(CVPR 2017) 论文主页 官方代码 Image-to-Image Demo 床单充电线作画Gommy Sunday:https://vimeo.com/260612034 调色板生成:http://colormind.io/blog/ 人脸简笔画转肖像画:https://twitter.com/quasimondo/status/826065030944870400 ","date":"2023-06-05","objectID":"/posts/deeplearning/generative-model/cyclegan/:2:0","tags":["Deep Learning","生成模型"],"title":"CycleGAN","uri":"/posts/deeplearning/generative-model/cyclegan/"},{"categories":["Deep Learning"],"content":"代码 https://colab.research.google.com/github/tensorflow/docs/blob/master/site/en/tutorials/generative/pix2pix.ipynb https://github.com/TommyZihao/MMGeneration_Tutorials/blob/main/%E3%80%90E%E3%80%91%E5%9B%BE%E5%83%8F%E8%BD%AC%E8%AF%91-pix2pix.ipynb ","date":"2023-06-05","objectID":"/posts/deeplearning/generative-model/cyclegan/:3:0","tags":["Deep Learning","生成模型"],"title":"CycleGAN","uri":"/posts/deeplearning/generative-model/cyclegan/"},{"categories":["Deep Learning"],"content":"DALL·E·2","date":"2023-06-05","objectID":"/posts/deeplearning/generative-model/dall_e2/","tags":["Deep Learning","生成模型"],"title":"DALL·E·2","uri":"/posts/deeplearning/generative-model/dall_e2/"},{"categories":["Deep Learning"],"content":"DALL·E·2 文章标题:Hierarchical Text-Conditional Image Generation with CLIP Latents | 作者:Aditya Ramesh, Prafulla Dhariwal, Alex Nichol, Casey Chu, Mark Chen 发表时间:(2022) dalle-mini pyorch code CLIP + Diffusion models ","date":"2023-06-05","objectID":"/posts/deeplearning/generative-model/dall_e2/:1:0","tags":["Deep Learning","生成模型"],"title":"DALL·E·2","uri":"/posts/deeplearning/generative-model/dall_e2/"},{"categories":["Deep Learning"],"content":"title 使用CLIP训练好的特征做层级式的依托于文本的图像生成 层级式:先生成一个小分辨率的图片再多次上采样成高清大图 ","date":"2023-06-05","objectID":"/posts/deeplearning/generative-model/dall_e2/:2:0","tags":["Deep Learning","生成模型"],"title":"DALL·E·2","uri":"/posts/deeplearning/generative-model/dall_e2/"},{"categories":["Deep Learning"],"content":"Methods DALL_E2 or unclip 先训练好一个CLIP模型,然后找到图片和文本对$(x,y)$之间的关系之后;给定一个文本,CLIP的文本编码器就可以把这个文本变成一个文本特征$z_t$; 训练一个prior模型$ P(z_i|y)$,输入文本特征,输出类似于CLIP的图像特征 CLIP生成的对应的图像特征$z_i$是用来训练prior做ground truth用的 方法:auto regressive自回归模型和扩散模型(选择了扩散模型) decoder解码器 $P(x|z_i,y)$ 输入图像特征生成一个完整的图像 扩散模型生成图像;扩散模型大部分时候是U-Net 通过将 CLIP 输出编码和添加 timestep embedding,并将 CLIP编码投影到四个额外的文本token中,token连接到 GLIDE 文本编码器的输出序列 使用classifier-free guidance guidance信号有10%的时间内把这个CLIP的特征呢设成0,在训练的时候有50%的时间内随机删除文本特征。 $$ P(x|y)=P(x,z_i|y)=P(x|z_i,y)P(z_i|y) $$","date":"2023-06-05","objectID":"/posts/deeplearning/generative-model/dall_e2/:3:0","tags":["Deep Learning","生成模型"],"title":"DALL·E·2","uri":"/posts/deeplearning/generative-model/dall_e2/"},{"categories":["Deep Learning"],"content":"拓展阅读 DALL·E 2【论文精读】 ","date":"2023-06-05","objectID":"/posts/deeplearning/generative-model/dall_e2/:4:0","tags":["Deep Learning","生成模型"],"title":"DALL·E·2","uri":"/posts/deeplearning/generative-model/dall_e2/"},{"categories":["Deep Learning"],"content":"GAN","date":"2023-06-05","objectID":"/posts/deeplearning/generative-model/gan/","tags":["Deep Learning","生成模型"],"title":"GAN","uri":"/posts/deeplearning/generative-model/gan/"},{"categories":["Deep Learning"],"content":"GAN 文章标题:Generative Adversarial Networks 作者:Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio 发表时间:(NIPS 2014) 官方代码 ——Learnable cost function损失函数时从数据学习得到的 ——Mini-Max game based on Nash Equilibrium 博弈论 ——Hard to training -no guarantee to equilibrium GAN GAN理论 噪音 z 一维标量 (均匀分布采样得来),数据 x 也是 一维标量(图中黑点,高斯分布)。 G对噪声z的映射 D对噪声z的映射到数据空间的判断线 (a):G 把 均匀分布采样得到的噪声 z 映射到 绿色高斯分布 (b):D 尽可能地去学习如何判断 真实数据均值更靠左,将左侧的数据判断为真 1, 将右侧数据判断为假 0 (c):G 尽可能地去糊弄 D,把高斯分布的均值中心往左挪动 (d):G 把 来自均匀分布的噪声,映射成和真实 data x 相似的高斯分布。D 对来自真实采样的数据 or G 生成的数据无法判断来源,概率都是 0.5 Step1:Fix generator G , and update discriminator D. 训练D:D learns to assign high scores to real objects and low scores to generated objects.(提高判断力,分辨真假,生成数据为假) Step2:Fix discriminator D, and update generator G . 训练G:G learns to“fool”the discriminator.(提高造假能力,改进生成数据,使其接近真实数据) GAN_Alogorithm 流程 k 不能太小 –\u003e 保证 判别器 D 有足够的更新 k 不能太大 –\u003e D 过于完美,$D(G(z^{x(i)}) = 0$,对值为 0 求导,G 梯度更新有困难 k 的设计目标:使得 D 和 G 的更新进度差不多 ","date":"2023-06-05","objectID":"/posts/deeplearning/generative-model/gan/:1:0","tags":["Deep Learning","生成模型"],"title":"GAN","uri":"/posts/deeplearning/generative-model/gan/"},{"categories":["Deep Learning"],"content":"全局最优解证明 GAN $P_{data}$:${x_i}_{i=1}^N$ $P_g(x;\\theta_g)$:generator,$G(z;\\theta_g)$ $y|x$:discriminater,$P(y=1|x)=D(x)$,$P(y=0|x)=1-D(x)$ $$ \\begin{equation} V(D,G) = \\mathbb{E}_{x\\sim P_{data}}[\\log D(x)] + \\mathbb{E}_{x\\sim P_{z}}[\\log (1-D(G(z)))] \\end{equation} $$$$ \\begin{equation} \\theta_g = \\arg\\max_{\\theta_g} \\sum_{i=1}^N \\log P_g(x_i) \\end{equation} $$$$ \\arg\\min_{\\theta_g} \\text{KL}(P_{data}\\|P_g) $$$$ \\begin{equation} \\begin{split} \\max_D V(D,G) = \u0026 \\int P_{\\text{data}} \\log D dx + \\int P_g \\log(1-D) dx \\\\ =\u0026 \\int (P_{\\text{data}} \\log D + P_g \\log(1-D)) dx \\end{split} \\end{equation} $$$$ \\begin{equation} \\begin{split} \\frac{\\partial \\max_D V(D,G)}{\\partial D} = \u0026 \\frac{\\partial}{\\partial D} \\int \\left[ P_{\\text{data}} \\log D + P_g \\log(1-D) \\right]dx \\\\ = \u0026 \\int \\frac{\\partial}{\\partial D} \\left[ P_{\\text{data}} \\log D + P_g \\log(1-D) \\right]dx \\\\ = \u0026 \\int P_{\\text{data} }\\cdot \\frac{1}{D} + P_g \\frac{-1}{1-D} dx \\\\ = \u0026 0 \\\\ \\end{split} \\end{equation} $$$$ \\begin{equation} D_G^\\ast = \\frac{P_{\\text{data}}}{P_{\\text{data}} + P_{g}} \\end{equation} $$$$ \\begin{equation} \\begin{split} \\min_G \\max_D V(D,G) = \u0026 \\min_G V(D_G^\\ast,G) \\\\ = \u0026 \\min_G \\mathbb{E}_{x\\sim P_{\\text{data}}} \\left[ \\log \\frac{P_{\\text{data}}}{P_{\\text{data}} + P_{g}}\\right] + \\mathbb{E}_{x\\sim P_{g}} \\left[ \\log \\frac{P_g}{P_{\\text{data}} + P_{g}}\\right] \\\\ = \u0026 \\min_G \\mathbb{E}_{x\\sim P_{\\text{data}}} \\left[ \\log \\frac{P_{\\text{data}}}{\\frac{P_{\\text{data}} + P_{g}}{2}} \\cdot \\frac{1}{2}\\right] + \\mathbb{E}_{x\\sim P_{g}} \\left[ \\log \\frac{P_g}{\\frac{P_{\\text{data}} + P_{g}}{2}} \\cdot \\frac{1}{2}\\right] \\\\ = \u0026 \\min_G \\text{KL}\\left[ P_{\\text{data}} \\| \\frac{P_{\\text{data}} + P_{g}}{2} \\right] + log\\frac{1}{2} +\\text{KL}\\left[ P_{g} \\| \\frac{P_{\\text{data}} + P_{g}}{2} \\right]+ log\\frac{1}{2} \\\\ = \u0026 \\min_G (-log 4+2\\cdot JS(P_{data}||P_g))\\\\ \\geq \u0026 - \\log 4 \\end{split} \\end{equation} $$当且仅当$P_{\\text{data}} = P_{g} = \\frac{P_{\\text{data}} + P_{g}}{2}$时,等号成立。此时,$P^\\ast_g = P_d,P^\\ast_d = \\frac{1}{2}$。 KL散度: $$ KL(P||Q)=\\sum P(x) log\\frac{P(x)}{Q(x)} or \\int P(x) log\\frac{P(x)}{Q(x)} dx $$JS散度: $$ JS(P_1 || P_2)=\\frac{1}{2}KL(P_1||\\frac{P_1+P_2}{2})+\\frac{1}{2}KL(P_2||\\frac{P_1+P_2}{2}) $$","date":"2023-06-05","objectID":"/posts/deeplearning/generative-model/gan/:2:0","tags":["Deep Learning","生成模型"],"title":"GAN","uri":"/posts/deeplearning/generative-model/gan/"},{"categories":["Deep Learning"],"content":"拓展阅读 趣味demo 交互式可视化GAN Lab:https://poloclub.github.io/ganlab 交互式可视化GAN Lab视频讲解:https://www.bilibili.com/video/BV1R44y1377T 照片转动漫AnimeGAN:https://github.com/TachibanaYoshino/AnimeGAN 英伟达GauGAN2神笔马良:http://gaugan.org/gaugan2/ 不存在的人脸:https://thispersondoesnotexist.com/ 不存在的花瓶:https://thisvesseldoesnotexist.com 不存在的梗图:https://imgflip.com/ai-meme 黑白老照片上色:https://deepai.org/machine-learning-model/colorizer 黑白老照片上色:https://github.com/jantic/DeOldify 黑白老照片上色:https://www.myheritage.com/incolor OpenAI DALLE-2:https://openai.com/dall-e-2 B站:GAN论文逐段精读【论文精读】 B站:李宏毅生成对抗网络公开课 B站:【机器学习】白板推导系列(三十一) ~ 生成对抗网络(GAN) 白板推导系列笔记 GAN动物园 OpenMMLab开源图像生成算法库MMGeneration 各种GAN的Keras实现 An Annotated Proof of Generative Adversarial Networks with Implementation Notes Understanding Generative Adversarial Networks ","date":"2023-06-05","objectID":"/posts/deeplearning/generative-model/gan/:3:0","tags":["Deep Learning","生成模型"],"title":"GAN","uri":"/posts/deeplearning/generative-model/gan/"},{"categories":["Deep Learning"],"content":"SSD","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/ssd/","tags":["Deep Learning","目标检测"],"title":"SSD","uri":"/posts/deeplearning/object-detection/ssd/"},{"categories":["Deep Learning"],"content":"SSD 文章标题:SSD: Single Shot MultiBox Detector 作者:Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy 发表时间:(ECCV 2016) ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/ssd/:1:0","tags":["Deep Learning","目标检测"],"title":"SSD","uri":"/posts/deeplearning/object-detection/ssd/"},{"categories":["Deep Learning"],"content":"论文结构 Abstract: 提出了SSD目标检测算法: 基于端对端学习的目标检测 将边界框的输出空间离散化为一组默认框 预测结果来自于不同层次的特征图 介绍了SSD的性能: 300*300分辨率的输入,VOC2007测试集上获得74.3%mAP和59FPS 512*512分辨率的输入,获得76.9%mAP,超过了最先进的Faster Rcnn Introduction:介绍了当前目标检测的常用方法;介绍了当前目标检测方法的缺点;.简单介绍了SSD的改进,成果和贡献。 Our improvements include using a small convolutional filter to predict object categories and offsets in bounding box locations, using separate predictors (filters) for different aspect ratio detections, and applying these filters to multiple feature maps from the later stages of a network in order to perform detection at multiple scales. SSD的改进: 不需要预先提出候选框 不需要像素或者特征的重采样 采用了多尺度的fliters 在多个特征图上预测 SSD的贡献: SSD 方法的核心是使用小卷积滤波器来预测特征图上固定的一组默认边界框的类别分数和位置偏移。 为了实现高检测精度,我们从不同尺度的特征图产生不同尺度的预测,并且通过宽高比来明确地分离预测。 总之,这些设计特性得到了简单的端到端训练和高精度,进一步提高速度和精度的权衡,即使输入相对低分辨率图像。 The Single Shot Detector (SSD): Model: 算法流程: 输入一张图片,经过CNN提取特征,并生成feautre map; 抽取其中六层feautre map,然后在feautre map的每个点生成default box; 将生成的default box所有集合,丢到NMS(非极大值抑制)中,输出筛选后的default box。 SSD网络 说明:Classifier:Conv:3×3×4×(Classes+4):3×3卷积核,4boxes,classes类别,4 offsets. SSD网络 Multi-scale feature maps for detection多尺度特征图检测 在基础网络(VGG16)结构后,添加了额外的卷积层,这些卷积层的大小是逐层递减的,可以在多尺度下进行 predictions。 Convolutional predictors for detection检测的卷积预测器 每个添加的特征层(或可选的基础网络的现有特征层)可以使用一组卷积滤波器产生固定的预测集合。对于具有p 个通道的大小为m×n 的特征层,使用3×3×p 卷积核卷积操作,产生类别的分数或相对于默认框的坐标偏移。在每个应用卷积核运算的m×n 大小位置处,产生一个输出值。边界框相对于默认框测量的偏移输出值。 Default boxes and aspect ratios默认框与宽高比 将一组默认边界框与顶层网络每个特征图单元关联。默认框对特征图作卷积运算,使得每个框实例相对于其对应单元格的位置是固定的。在每个特征映射单元中,我们预测相对于单元格中的默认框形状的偏移,以及每个框中实例的每类分数。具体来说,对于在给定位置的k 个框中每个框,我们计算c 类分数(位置和置信度)和相对于原始默认框的4 个偏移量。于是,在 feature map 中的每一个 feature map cell 上,就需要有$ (c+4)\\times k$ 个 filters。对于一张$ m×n$大小的 feature map,即会产生$(c+4)\\times k\\times m\\times n$ 个输出结果。 默认框与宽高比 feature map cell 就是将 feature map 切分成 8×8 或者 4×4 之后的一个格子; 而 default box 就是每一个格子上,一系列固定大小的 box,即图中虚线所形成的一系列 boxes。 Training Matching strateg匹配策略 (MutiBox ):the best jaccard overlap(IOU)匹配每一个ground truth box 和 default box. 与MutiBox不同的是:只要 jaccard overlap(IOU)大于一个threshold阈值(0.5)即配对成功。 Why:使得有多个重叠默认框时网络预测获得高置信度,不单单只找匹配度做高的框。 Training objective $x^p_{ij}={1,0}$;第$i$个default box与类别$p$的第$j$个ground truth box相匹配时值为1。 根据匹配策略,必有$\\sum_i x^p_{ij}\\geq1$。 objective loss function总体目标损失函数是位置损失(loc)和置信损失(conf)的加权和: :$L(x,c,l,g)=\\frac{1}{N}(L_{conf}(x,c)+\\alpha L_{loc}(x,l,g))$ N是与ground truth box相匹配的default boxes的个数 $L_{loc}$是类似于Fast RCNN 的预测框(l)和真实标签值框(g)参数之间的Smooth L1 loss. $L_{conf}$置信损失是softmax 损失对多类别置信$(c)$和权重项$\\alpha$设置为1 的交叉验证。 Choosing scales and aspect ratios for default boxes选择默认框的比例和宽高比 越底层的layers,保留的图像细节越多。 lower and upper feature maps for detection. $s_k=s_{min}+\\frac{s_{max}-s_{min}}{m-1}(k-1), \\ k\\in[1,m]$ $s_k$为每一个特征图中默认框尺寸大小,m 为feature maps,其中$s_{min}$ 是0.2,$s_{max}$ 是0.95,意味着最低层具有0.2 的刻度,最高层具有0.95 的刻度,并且其间的所有层是规则间隔的。 宽高比$a _r={1,2,3,\\frac{1}{2},\\frac{1}{3}}$;width:$w^a_k=s_k\\sqrt a_r$;height:$h^a_k=s_k/\\sqrt a_r$ 当$a_r$为1时,增加一个默认框,$s’k=\\sqrt{s_ks{k+1}}$。每个feature map location有6个默认框。 每个default box的中心:$(\\frac{i+0.5}{|f_k|},\\frac{j+0.5}{|f_k|})$ ,$|f_k|$为第k个feature map的大小,$i,j\\in[0,|f_k|]$ Hard negative mining 生成的预测框负样本Negative boxes远多于Positive boxes.(不平衡,训练难以收敛) 解决方案:每个默认框的最高置信度对它们进行排序,并选择前面的那些,使得正负样本之间的比率最多为3:1,以代替使用所有的负样本。 Data augmentation数据增广 Experimental Results: PASCAL VOC2007 Model analysis PASCAL VOC2012 COCO Preliminary ILSVRC results Data Augmentation for Small Object Accuracy Inference time ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/ssd/:1:1","tags":["Deep Learning","目标检测"],"title":"SSD","uri":"/posts/deeplearning/object-detection/ssd/"},{"categories":["Deep Learning"],"content":"YOLO","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"[toc] ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:0:0","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"YOLO V1 文章标题:You Only Look Once:Unified, Real-Time Object Detection 作者:Joseph Redmon, Santosh Divvalay, Ross Girshick, Ali Farhadi 发表时间:(CVPR 2016) YOLO算法是单阶段目标检测的经典算法,能实现快速、实时、高精度的图像识别和目标检测。 ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:1:0","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"Abstract 介绍yolo算法及其速度快的优点 将检测变为一个 regression problem,YOLO 从输入的图像,仅仅经过一个 neural network,直接得到 bounding boxes 以及每个 bounding box 所属类别的概率。正因为整个的检测过程仅仅有一个网络,所以它可以直接 end-to-end 的优化。 速度快:标准的 YOLO 版本每秒可以实时地处理 45 帧图像。一个较小版本:Fast YOLO,可以每秒处理 155 帧图像,它的 mAP(mean Average Precision) 依然可以达到其他实时检测算法的两倍。 出现较多coordinate errors定位误差,但YOLO 有更少的 background errors背景误差。 ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:2:0","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"Introduction yolo简单原理图;与R-CNN相比yolo的优点;与传统检测算法相比yolo的优点 yolo流程图 Resize image.将图片尺寸变为448*448 Run convolutional network.输入到神经网络中 Non-max suppression.使用非极大值抑制到最后结果 ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:3:0","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"Unified Detection one stage detection算法的原理与细节 yolo算法原理 将图片隐式的分为S*S个网格(grid cell) 物体的中心落在哪个网格内,哪个网格就负责预测这个物体 每个网格需要预测B个bounding box,C个类别(这B个框预测的为一个类别,一个物体) 如果一个网格内出现两个物体中心? 一个网格里包含了很多小物体? yolo对靠的很近的物体以及小目标群体检测效果不是很好 每个框包含了位置信息和置信度(x,y,w,h,confidence) xy表示bounding box的中心相对于cell左上角坐标偏移 宽高则是相对于整张图片的宽高进行归一化的。(物体相对grid cell的大小) 图中框线粗细表示confidence的大小 一张图预测的信息有S*S*(B*5+C)(注意:class信息是针对每个网格的,confidence信息是针对每个bounding box的。) Comfidence Score:指的是一个边界框中包含某个物体的可能性大小以及位置的准确性(即是否恰好包裹这个物体)。 Pr(object)是bounding box内存在对象的概率。Pr(object)并不管是哪个对象,它表示的是有或没有对象的概率。如果有object落在一个grid cell里,第一项取1,否则取0。第二项是预测的bounding box和实际的groundtruth之间的IoU值。其中IOU表示了预测的bbox与真实bbox(GT)的接近程度。置信度高表示这里存在一个对象且位置比较准确,置信度低表示可能没有对象或即便有对象也存在较大的位置偏差。 训练阶段: Pr(object)标签值非0即1;$IOU^{truth}_{pred}$按实际计算 两者乘积即为Comfidence Score的标签值 对于负责预测物体的box,这个便签值就是$IOU^{truth}_{pred}$ 预测阶段: 回归多少就是多少 隐含包含两者 YOLO的bbox是没有设定大小和形状的,只是对两个bbox进行预测,保留预测比较准的bbox。YOLO的2个bounding box事先并不知道会在什么位置,只有经过前向计算,网络会输出2个bounding box,这两个bounding box与样本中对象实际的bounding box计算IOU。 ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:4:0","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"Network design yolo网络结构图 yolo网络结构 24层卷积层提取图像特征 2层全连接层回归得到$7\\times7\\times30$的Tensor ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:4:1","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"Training yolo训练方法,损失函数及参数 最后一层用线性激活函数,其他层用leaky ReLU; 相比于ReLU,leaky并不会让负数直接为0,而是乘以一个很小的系数(恒定),保留负数输出,但是衰减负数输出 损失函数 设计目标就是让坐标(x,y,w,h),confidence,classification 这个三个方面达到很好的平衡。 yolov1损失函数 $\\mathbb I_{i j}^{obj}$:第$i$个grid cell的第$j$个bounding box若==负责==预测物体则为1,否则为0; $\\mathbb I_{i j}^{nobj}$:第$i$个grid cell的第$j$个bounding box若==不负责==预测物体则为1,否则为0; $\\mathbb I_{i }^{obj}$:第$i$个grid cell是否包含物体,即是否有ground truth 框的中心点落在此grid cell中,若有则为1,否则为0 全部采用sum-squared error loss存在的问题: 第一,8维的localization error和20维的classification error同等重要显然是不合理的; 第二,如果一个网格中没有object(一幅图中这种网格很多),那么就会将这些网格中的box的confidence push到0,相比于较少的有object的网格,这种做法是overpowering的,这会导致网络不稳定甚至发散。 解决办法: 更重视8维的坐标预测,给这些损失前面赋予更大的loss weight,记为$\\lambda_{coord}$在pascal VOC训练中取5。 对没有object的box的confidence loss,赋予小的loss weight,记为$\\lambda_{noobj}$在pascal VOC训练中取0.5。 有object的box的confidence loss和类别的loss的loss weight正常取1。 对不同大小的bbox预测中,相比于大bbox预测偏一点,小box预测偏一点更不能忍受。而sum-square error loss中对同样的偏移loss是一样。 为了缓和这个问题,作者用了一个比较取巧的办法,就是将box的width和height取平方根代替原本的height和width。 如下图:small bbox的横轴值较小,发生偏移时,反应到y轴上的loss(下图绿色)比big box(下图红色)要大。 yolov1损失函数 训练设置 batchsize=64;momentum=0.9(动量因子);decay=0.0005(权重衰减$ L_2$正则化) 第一个迭代周期学习率从$10^{-3}$到 $10^{-2}$;$10^{-2}$训练第2-75轮;$10^{-3}$再训练30轮;$10^{-4}$再训练30轮; 在第一个连接层之后,丢弃层使用=.05的比例,防止层之间的互相适应 数据增强: 引入原始图像$20%$大小的随机缩放和转换 在HSV色彩空间中使用1.5的因子来随机调整图像的曝光和饱和度。 ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:4:2","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"Inference yolo预测阶段细节 $$ Pr(Class_i|Object)*Pr(Object)*IOU^{truth}_{pred}=Pr(Object)*IOU^{truth}_{pred} $$ 等式左边第一项就是每个网格预测的类别信息,第二三项就是每个bounding box预测的confidence。这个乘积即encode了预测的box属于某一类的概率,也有该box准确度的信息。 ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:4:3","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"Limitations of YOLO 速度快:把检测作为回归问题处理,流程简单,仅需要输入一张图 泛化能力强:yolo可以学习到物体的通用特征,泛化能力更好。应用在新领域不会崩掉。 全局推理:对整张图处理,利用全图信息,假阳性错误少(背景当作物体错误率少) 精度与最先进的算法比不高,对小物体不友好 分类正确但定位误差大 ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:4:4","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"Comparison to Other Detection Systems DPM 传统特征:HOG 传统分类器:SVM 滑窗套模板 弹簧模型:子模型+主模型 R-CNN 候选区域生成 提取特征 SVM进行分类 NMS剔除重叠建议框 使用回归器精细修正候选框位置 Deep MultiBox OverFeat 使用全卷积网络进行高效滑窗运算 ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:5:0","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"Experiments R-T Systems on pas VOC 2007结果分析 与实时检测器相比: fast yolo 不仅速度而且map还高 yolo的map比fast yolo高,而且也可以达到实时检测 与速度稍慢的检测器相比:yolo在保证不错的精度同时速度最快。 各类错误比例分析 ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:6:0","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"Real-Time Detection In The Wild yolo可以连接摄像头进行实时检测 ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:7:0","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"Conclusion 结论再次强调yolo的优点:one-stage 快速 鲁棒 ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:8:0","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"拓展阅读 Object Detection in 20 Years: A Survey YOLO发展路线博客 YOLO官网 YOLOv1官网 YOLOv1作者CVPR2016大会汇报 一个不错的slide介绍 Joseph Redmon 推特 谷歌学术主页 Github主页 [简历]([https://pjreddie.com/static/Redmon%20Resume.pdf](https://pjreddie.com/static/Redmon Resume.pdf)) 2017年8月TED演讲:How computers learn to recognize objects instantly | Joseph Redmon 2018年6月TED演讲:Computers can see. Now what? | Joseph Redmon | TEDxGateway ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:9:0","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"YOLO V2 文章标题:YOLO9000: Better, Faster, Stronger 作者:Joseph Redmon, Ali Farhadi 发表时间:(CVPR 2017) YOLOV2是YOLO目标检测系列算法的第二个版本。 第一部分:在YOLOV1基础上进行了若干改进优化,得到YOLOV2,提升算法准确度和速度。特别是增加了Anchor机制,改进了骨干网络。 第二部分:提出分层树状的分类标签结构WordTree,在目标检测和图像分类数据集上联合训练,YOLO9000可以检测超过9000个类别的物体。 CVPR 2017论文:YOLO9000: Better, Faster, Stronger,获得CVPR 2017 Best Paper Honorable Mention ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:10:0","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"Better 其目的是弥补YOLO的两个缺陷: 定位误差 召回率(Recall)较低(和基于候选区域的方法相比) Recall 是被正确识别出来的物体个数与测试集中所有对应物体的个数的比值。 YOLOv2相比YOLOv1的改进策略 ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:11:0","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"Batch Normalization CNN网络通用的方法,不但能够改善网络的收敛性,而且能够抑制过拟合,有正则化的作用。 BN与Dropout通常不一起使用 ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:11:1","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"High Resolution Classifier 在YOLO V2中使用ImageNet数据集,首先使用224×224的分辨率训练160个epochs,然后调整为448×448在训练10个epochs。 ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:11:2","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"Convolutional With Anchor Boxes 在YOLO V2中借鉴 Fast R-CNN中的Anchor的思想。 去掉了YOLO网络的全连接层和最后的池化层,使提取特征的网络能够得到更高分辨率的特征。 使用$416\\times416$代替$448\\times448$作为网络的输入,得到的特征图的尺寸为奇数。 奇数大小的宽和高会使得每个特征图在划分cell的时候就只有一个center cell 网络最终将$416\\times416$的输入变成$13\\times13$大小的feature map输出,也就是缩小比例为32。(5个池化层,每个池化层将输入的尺寸缩小1/2)。 Anchor Boxes( 提高object的定位准确率)在YOLO中,每个grid cell只预测2个bbox,最终只能预测$7\\times7\\times2=98$个bbox。在YOLO V2中引入了Anchor Boxes的思想,,每个grid cell只预测5个anchor box,预测$13\\times13\\times5=845$个bbox。 总性能下降;recall增大;precision降低 ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:11:3","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"Dimension Clusters (聚类) (解决每个Grid Cell生成的bounding box的个数问题) K均值聚类 距离度量指标:$d(box,centroid)=1-IOU(box,centroid)$ 针对同一个grid cell,其将IOU相近的聚到一起 选择k=5 ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:11:4","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"Direct location prediction 模型不稳定,由于预测box的位置(x,y)引起的 Faster RCNN: $x=(t_x\\times w_a)+x_a$ $y=(t_y\\times h_a)+y_a$ $x,y$是预测边框的中心, $x_a,y_a$是先验框(anchor)的中心点坐标, $w_a,h_a$是先验框(anchor)的宽和高, $t_x,t_y$是要学习的参数。输出的偏移量 YOLOV2:将预测边框的中心约束在特定gird网格内 $b_x=\\sigma(t_x)+c_x$ $b_y=\\sigma(t_y)+c_y$ $b_w=p_we^{t_w}$ $b_h=p_he^{t_h}$ $Pr(object)*IOU(b,object)=\\sigma(t_o)$ $b_x,b_y,b_w,b_h$是预测边框的中心和宽高。 $Pr(object)∗IOU(b,object)$是预测边框的置信度,YOLO1是直接预测置信度的值,这里对预测参数$t_o$进行σ变换后作为置信度的值。 $c_x,c_y$是当前网格左上角到图像左上角的距离,要先将网格大小归一化,即令一个网格的宽=1,高=1。 $p_w,p_h$是先验框的宽和高。 $\\sigma$ 是sigmoid函数。 $t_x,t_y,t_w,t_h,t_o$是要学习的参数,分别用于预测边框的中心和宽高,以及置信度。 边框预测 ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:11:5","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"Fine-Grained Features 细粒度特征 提出一种称之为“直通”层(passthrough layer)的操作,也是将具有丰富纹理信息的浅层特征与具有丰富语义信息的深层特征进行融合,实现对目标的“大小通吃”。 据YOLO2的代码,特征图先用$1\\times1$卷积从$ 26\\times26\\times512 $降维到$ 26\\times26\\times64$,再做1拆4并passthrough。 passthrough ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:11:6","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"Multi-Scale Training 通过不同分辨率图片的训练来提高网络的适应性。 采用了{320,352,…,608}等10种输入图像的尺寸,这些尺寸的输入图像对应输出的特征图宽和高是{10,11,…19}。训练时每10个batch就随机更换一种尺寸,使网络能够适应各种大小的对象检测。 ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:11:7","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"Faster ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:12:0","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"Darknet-19 BackBone:Darknet19 ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:12:1","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"Training for detection YOLOV2模型框架 ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:12:2","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"损失函数 $$ \\begin{array}{r} \\operatorname{loss}_{t}=\\sum_{i=0}^{W} \\sum_{j=0}^{H} \\sum_{k=0}^{A} \\mathbb I_{\\text {Max IOU }\u003c\\text { Thresh }} \\lambda_{\\text {noobj }} *\\left(-b_{i j k}^{o}\\right)^{2} \\\\ +\\mathbb I_{t\u003c12800} \\lambda_{\\text {prior }} * \\sum_{r \\in(x, y, w, h)}\\left(\\text { prior }_{k}^{r}-b_{i j k}^{r}\\right)^{2} \\\\ +\\mathbb I_{k}^{\\text {truth }}\\left(\\lambda_{\\text {coord }} * \\sum_{r \\in(x, y, w, h)}\\left(\\text { truth }^{r}-b_{i j k}^{r}\\right)^{2}\\right. \\\\ +\\lambda_{o b j} *\\left(I O U_{\\text {truth }}^{k}-b_{i j k}^{o}\\right)^{2} \\\\ \\left.+\\lambda_{\\text {class }} *\\left(\\sum_{c=1}^{C}\\left(\\operatorname{truth}^{c}-b_{i j k}^{c}\\right)^{2}\\right)\\right) \\end{array} $$ W:输出特征图宽度13;H:输出特征图高度13; A:先验框个数为5 置信度误差(边框内无对象)background的置信度误差 $b_{ijk}^o$预测框置信度 计算各个预测框和所有ground truth的IOU值,并且取最大值Max_IOU,如果该值小于一定的阈值(YOLOv2使用的是0.6),那么这个预测框就标记为background 预测框与Anchor位置误差(前12800次迭代) $prior_k^r$:Anchor位置;$b_{ijk}^r$:预测框位置 $\\mathbb I_k^{truth}$:该Anchor和ground truth的IOU最大对应的预测框负责预测物体(IOU\u003e0.6但非最大的预测框忽略其损失) 定位误差(边框内有对象) $truth^r$:标注框位置;$b_{ijk}^r$:预测框位置 置信度误差(边框内有对象) $I O U_{\\text {truth }}^{k}$ :Anchor与标注框的IOU; $b_{i j k}^{o}$:预测框置信度 分类误差(边框内有对象) $truth^c$:标注框类别;$b_{ijk}^c$:预测框类别 ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:12:3","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"Stronger ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:13:0","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"拓展阅读 可视化YOLOv2网络结构 可视化YOLOv2-tiny YOLO v2 损失函数源码分析 YOLO v2的官方Darknet实现 YOLO v2的Keras实现 知乎:0目标检测那点儿事——更好更快的YOLO-V2 知乎:1目标检测那点儿事——更好更快的YOLO-V2 知乎:\u003c机器爱学习\u003eYOLOv2 / YOLO9000 深入理解 知乎:目标检测|YOLOv2原理与实现(附YOLOv3) 目标检测之YOLO V2 V3 ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:14:0","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"YOLO V3 文章标题:YOLOv3: An Incremental Improvement 作者:Joseph Redmon ,Ali Farhadi 发表时间:(CVPR 2018) YOLOV3是单阶段目标检测算法YOLO系列的第三个版本,由华盛顿大学Joseph Redmon发布于2018年4月,广泛用于工业界。 改进了正负样本选取、损失函数、Darknet-53骨干网络,并引入了特征金字塔多尺度预测,显著提升了速度和精度。 ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:15:0","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"The Deal ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:16:0","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"[Bounding Box Prediction](###Direct location prediction) 正负样本的匹配 预测框(每个GT仅分配一个Anchor负责预测) 正例:与GT IOU最大 负例:IOU\u003c0.5 忽略:IOU\u003e0.5但非最大 ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:16:1","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"Predictions Across Scales多尺度 输入 grid cell Anchor 预测框数 输出张量的数据结构 YOLO V1 $448\\times448$ $7\\times7$ 0 $7\\times7\\times2=98$ $7\\times7\\times(5\\times B+C)$ YOLO V2 $416\\times416$ $13\\times13$ 5 $13\\times13\\times5=845$ $845\\times(5+20)$ YOLO V3 $256\\times256$ $32\\times32$,$16\\times16$,$8\\times8$ 3 $4032$ $4032\\times(5+80)$ $416\\times416$ $52\\times52$,$26\\times26$,$13\\times13$ 3 $10647$ $10647\\times(5+80)$ Yolov3借鉴了FPN特征图思想,小尺寸特征图用于检测大尺寸物体,而大尺寸特征图检测小尺寸物体。特征图的输出维度为 $N\\times N\\times[3\\times(4+1+80)]$,$N\\times N为$输出特征图格点数,一共3个Anchor框,每个框有4维预测框数值$t_x,t_y,t_w,t_h$ ,1维预测框置信度,80维物体类别数。 ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:16:2","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"yolov3网络图 BackBone:Darknet53 YOLOV3模型框架 Yolov3中,只有卷积层,通过调节卷积步长控制输出特征图的尺寸 concat操作与加和操作的区别:加和操作来源于ResNet思想,将输入的特征图,与输出特征图对应维度进行相加,即$y=f(x)+x$ ;而concat操作源于DenseNet网络的设计思路,将特征图按照通道维度直接进行拼接,例如的$8\\times8\\times16$特征图与$8\\times8\\times16$的特征图拼接后生成$8\\times8\\times32$的特征图。 上采样层(upsample):作用是将小尺寸特征图通过插值等方法,生成大尺寸图像。例如使用最近邻插值算法,将$8\\times8$的图像变换为$16\\times16$。上采样层不改变特征图的通道数。 YOLOV3 YOLOV3训练过程 YOLOV3测试过程 ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:16:3","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"损失函数 $$ \\begin{equation} \\begin{split} {loss} \u0026= \\sum_{i=0}^{K^2}\\sum_{j=0}^{M}\\mathbb{I}_{i,j}^{obj}\\cdot (2-w_i\\cdot h_i)(-x_i log(\\hat x_i)-(1-x_i)log(1-\\hat x_i))\\\\ \u0026+\\quad\\sum_{i=0}^{K^2}\\sum_{j=0}^{M}\\mathbb{I}_{i,j}^{obj}\\cdot (2-w_i\\cdot h_i)(-y_i log( \\hat y_i)-(1-y_i)log(1-\\hat y_i))\\\\ \u0026+\\quad\\ \\sum_{i=0}^{K^2}\\sum_{j=0}^{M}\\mathbb{I}_{i,j}^{obj}\\cdot (2-w_i\\cdot h_i)[(w_i-\\hat w_i)^2+(h_i-\\hat h_i)^2]\\\\ \u0026- \\quad \\sum_{i=0}^{K^2}\\sum_{j=0}^{M}\\mathbb{1}_{i,j}^{obj}\\cdot[C_ilog(\\hat C_i)+(1-C_i)log(1-\\hat C_i)]\\\\ \u0026- \\quad \\sum_{i=0}^{K^2}\\sum_{j=0}^{M}\\mathbb{1}_{i,j}^{noobj}\\cdot[C_ilog(\\hat C_i)+(1-C_i)log(1-\\hat C_i)]\\\\ \u0026-\\quad \\sum_{i=0}^{K^2}\\sum_{j=0}^{M}\\mathbb{1}_{i,j}^{obj}\\cdot\\sum_{c\\in classes}[p_i(c)log(\\hat p_i(c))+(1-p_i(c))log(1-\\hat p_i(c))] \\\\ \\end{split}\\end{equation} $$一个是目标框位置$x,y,w,h$(左上角和长宽)带来的误差,又分为$x,y$带来的BCE Loss以及$w,h$带来的MSE Loss。 K:grid size;M:Anchor box;$\\mathbb{I}_{i,j}^{obj}$表示如果在$i,j$处的box有目标,则为1,否则为0;w 和 h 分别是ground truth 的宽和高 带$\\hat x$号代表预测值;不带的表示标签 一个是目标置信度带来的误差,也就是obj带来的loss(BCE Loss) $\\mathbb{I}_{i,j}^{noobj}$:是否为负样本 最后一个是类别带来的误差,也就是class带来的loss(类别数个BCE Loss)。 $BCE=-\\hat c_ilog(c_i)-(1-\\hat c_i)log(1-c_i)$:二元交叉熵损失函数(Binary Cross Entropy);$\\hat c_i$标签值(非0即1);$ c_i$预测值(0-1之间) ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:17:0","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"拓展阅读 YOLOV3目标检测Demo视频 YOLOv3官网 darknet github 代码复现 Ultralytics公司:https://github.com/ultralytics/yolov3 https://github.com/qqwweee/keras-yolo3 https://github.com/bubbliiiing/yolo3-pytorch cvpods:https://github.com/Megvii-BaseDetection/cvpods/blob/master/cvpods/modeling/meta_arch/yolov3.py 博客 知乎:深入浅出Yolo系列之Yolov3\u0026Yolov4\u0026Yolov5\u0026Yolox核心基础知识完整讲解 知乎:近距离观察YOLOv3 知乎:Yolo三部曲解读——Yolov3 Netron可视化YOLOV3网络结构 yolov3实现理论 yolo系列之yolo v3【深度解析】 YOLO v3网络结构分析 B站工程师Algernon鉴黄YOLO 损失函数 官方DarkNet YOLO V3损失函数完结版 What’s new in YOLO v3? 结构解析 ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:18:0","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"YOLO V4 文章标题:YOLOv4: Optimal Speed and Accuracy of Object Detection 作者:Alexey Bochkovskiy,Chien-Yao Wang, Hong-Yuan Mark Liao 发表时间:(CVPR 2020) 原始代码 YoloV4-pytorch代码 ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:19:0","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"Introduction 提出了一种实时、高精度的目标检测模型。 它是可以使用1080Ti 或 2080Ti 等通用 GPU 来训练快速和准确的目标检测器; 在检测器训练阶段,验证了一些最先进的 Bag-of-Freebies 和 Bag-of-Specials 方法的效果; 对 SOTA 方法进行改进,使其效率更高,更适合单 GPU 训练,包括 CBN,PAN 和 SAM 等。 ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:20:0","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"Related work ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:21:0","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"Bag of freebies 只改变训练策略或只增加训练成本,不影响推理成本的方法;白给的提高精度(赠品) Data Augmentation 数据增强 增加输入图片的可变性;更高的鲁棒性。 像素级调整;保留调整区域内的所有原始像素信息。 photometric distortions 光照畸变 brightness, contrast,hue, saturation, and noise of an image亮度、对比度、色调、饱和度和噪声 geometric distortions 几何畸变 random scaling, cropping, flipping, and ro-tating 随机缩放、裁剪、翻转和旋转 模拟对象遮挡 random erase 随机擦除 CutOut :随机屏蔽输入的方形区域的简单正则化技术填充0像素值 hide-and-seek:训练图像中随机隐藏patches,当最具区别性的内容被隐藏时,迫使网络寻找其他相关内容 grid mask:通过生成1个和原图相同分辨率的mask,然后将该mask和原图相乘得到一个GridMask增强后的图像。 正则化 DropOut:随机删除减少神经元的数量,使网络变得更简单 DropConnect DropBlock:将Cutout应用到每一个特征图。并不是用固定的归零比率,而是在训练时以一个小的比率开始,随着训练过程线性的增加这个比率;可应用于网络的每一层;不同组合,灵活 图像融合 MixUp:使用两个图像以不同的系数比率进行乘法和叠加,然后用这些叠加的比率调整标签 CutMix:把Mixup和Cutout结合,切割一块patch并且粘贴上另外一张训练图片相同地方的patch,对应的label也按照patch大小的比例进行混合 风格迁移 类别不平衡 Two stage:RCNN … hard negative example mining:用初始的正负样本(一般是正样本+与正样本同规模的负样本的一个子集)训练分类器, 然后再用训练出的分类器对样本进行分类, 把其中负样本中错误分类的那些样本(hard negative)放入负样本集合, 再继续训练分类器, 如此反复, 直到达到停止条件(比如分类器性能不再提升). online hard example mining:自动地选择难分辨样本来进行训练 One stage:SSD,yolo… Focal Loss One-hot难表达类别之间的关联 label smoothing(Inception V3):将硬标签转化为软标签进行训练,可以使模型更具有鲁棒性 knowledge distillation:引入知识蒸馏的概念并用于设计标签细化网络 BBox Regression 重叠面积 中心点距离 长宽比 发展历程:IOU_Loss(2016)-\u003eGIOU_Loss(2019)-\u003eDIOU_Loss(2020)-\u003eCIOU_Loss(2020) IOU_Loss IOU_Loss A:预测框与真实框的交集;B:预测框与真实框的并集 $IOU=\\frac{A}{B}$ $IOU_{Loss}=1-IOU$ :考虑了预测BBox面积和ground truth BBox面积的重叠面积 IOU_Loss_q Q1:即状态1的情况,当预测框和目标框不相交时,IOU=0,无法反应两个框距离的远近,此时损失函数不可导,IOU_Loss无法优化两个框不相交的情况。 Q2:即状态2和状态3的情况,当两个预测框大小相同,两个IOU也相同,IOU_Loss无法区分两者相交情况的不同。 GIOU_Loss GIOU_Loss $GIOU=IOU-\\frac{|C-B|}{|C|}$;C:两框的最小外接矩形;差集=C-并集B $GIOU_{Loss}=1-GIOU$ :增加了相交尺度的衡量方式 GIOU_Loss_q Q:状态1、2、3都是预测框在目标框内部且预测框大小一致的情况,这时预测框和目标框的差集都是相同的,因此这三种状态的GIOU值也都是相同的,这时GIOU退化成了IOU,无法区分相对位置关系。 DIOU_Loss DIOU_Loss $DIOU=IOU-\\frac{{Distance_2}^2}{Distance_C^2}$;Distance_C:C的对角线距离;Distance_2:两个框的两个中心点的欧氏距离$DIOU_Loss=1-DIOU$ :考虑了重叠面积和中心点距离;当目标框包裹预测框的时候,直接度量2个框的距离,因此DIOU_Loss收敛的更快。 DIOU_Loss_q Q:目标框包裹预测框;预测框的中心点的位置都是一样的 CIOU_Loss $CIOU=IOU-\\frac{{Distance_2}^2}{Distance_C^2}-\\frac{v^2}{(1-IOU)+v}$ $v=\\frac{4}{\\pi^2}(arctan\\frac{w^{gt}}{h^{gt}}-arctan\\frac{w^{p}}{h^{p}})^2$ : gt表示目标框的宽高;p表示预测框的宽高 $CIOU_{Loss}=1-CIOU$:同时考虑到重叠面积和中心点之间的距离以及长宽比 ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:21:1","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"Bag of specials 少量增加了推理成本,却显著提升性能的插件模块和后处理方法;不免费,但很实惠(特价) Enlarging Receptive Field 扩大感受野 SPP :SPP将SPM集成到CNN使用max-pooling操作而不是bag-of-word运算; 源于SPM 将特征图分割成几个d×d相等大小的块,其中d可以是{1,2,3,…},从而形成空间金字塔,然后提取bag-of-word特征。 [YOLOV3](###YOLO V3)改进版SPP模块:将SPP模块修改为融合$k×k$池化核的最大池化输出,其中$k = {1,5,9,13}$,步长等于1。 一个相对较大的$k×k$有效地增加了backbone的感受野 ASPP :和改进版SPP模块区别是主要由原来的步长1、核大小为$k×k$的最大池化到几个$3×3$核,缩放比例为$k$,步长1的空洞卷积。 RFB :几个$k×k$核,缩放比例为$k$,步长1的空洞卷积 Attention Mechanism 注意力机制 channel-wise attention SE point-wise attention SAM SAM 对卷积层的输出特征图应用最大池化和平均池化。将这两个特征做concat操作来,然后在一个卷积层中传递,然后应用 sigmoid 函数,该函数将突出显示最重要的特征所在的位置。 SAM Feature Integration 特征融合模块 skip connection (FCN) hyper-column SFAM :使用SE模块在多尺度串联的特征图上执行channel-wise级别的重新加权 ASFF :使用softmax作为point-wise级别重新加权,然后添加不同尺度的特征图 BiFPN :提出了多输入加权残差连接以执行按 scale-wise级别重新加权,然后添加不同尺度的特征图。 Activation Function 激活函数 让梯度更有效地传播,同时不会造成太多额外的计算成本 ReLU:基本上解决梯度消失问题 traditional:$tanh,sigmoid$ LReLU ,PReLU :解决输出小于零时ReLU的梯度为零的问题。 ReLU6 (MobileNet),hard-Swish (MobileNet V3):专为量化网络设计 Scaled ExponentialLinear Unit (SELU) :self-normalizing 神经网络设计 Swish, Mish:连续可微的激活函数 Mish 的下界和上界为 [≈ -0.31,∞]。由于保留了少量的负面信息,Mish通过设计消除了**Dying ReLU现象**所必需的先决条件。较大的负偏差会导致 ReLu 函数饱和,并导致权重在反向传播阶段无法更新,从而使神经元无法进行预测。 Mish 属性有助于更好的表现力和信息流。由于在上面无界,Mish 避免了饱和,这通常会由于接近零的梯度而导致训练减慢。下界也是有利的,因为它会产生很强的正则化效果。 Post-processing Method 后处理方法 用来过滤对同一物体预测不好的BBoxes,只保留响应较高的候选BBoxes Greedy NMS (R-CNN):增加分类置信度;由高到低顺序 Soft NMS :考虑了对象的遮挡可能导致具有IoU得分的Greedy NMS中的置信度得分下降的问题 DIOU NMS:在soft NMS的基础上,在BBox筛选过程中加入中心点距离信息。 Anchor free里不使用NMS后处理:NMS都没有直接涉及提取特征图 ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:21:2","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"Methodology 目的是在输入网络分辨率、卷积层数目、参数数量和每层输出个数之间找到最佳平衡 Selection of architecture 检测器和分类器不同点 更大的输入网络尺寸(分辨率)——用于检测多个小尺寸目标 更多的层数——获得更大的感受野以便能适应网络输入尺寸的增加 更多参数——获得更大的模型容量以便在单个图像中检测多个大小不同的物体。 不同大小的感受野的影响 最大目标尺寸——允许观察到整个目标 最大网络尺寸——允许观察到目标周围的上下文 超出网络尺寸——增加图像像素点与最终激活值之间的连接数 Selection of BoF and BoS [Activations](####Activation Function 激活函数): ReLU, ==leaky-ReLU==, parametric-ReLU,ReLU6, SELU, ==Swish==, ==Mish== RRelu和SELU难训练;ReLU6是量化网络专用(排除选项) [Bounding box regression loss](####Post-processing Method 后处理方法): MSE, IoU, GIoU,CIoU, DIoU 不用CIOU_nms:影响因子v包含标注框信息;前向推理没有标注框信息 [Data augmentation](####Data Augmentation 数据增强l): CutOut, MixUp, CutMix,Mosaic [Regularization method](####Data Augmentation 数据增强l): DropOut, DropPath ,Spatial DropOut , DropBlock DropBlock最优 Normalization : BN, CGBN or SyncBN) ,FRN,CBN) 一个GPU:排除SyncBN Skip-connections: Residual connections, Weighted residual connections, Multi-input weighted residual connections(MiWRC), Cross stage partial connections (CSP) Additional improvements 引入了一种新的数据增强方法Mosaic和自对抗训练方法(Self-Adversarial Training,SAT) Mosaic:随机裁剪4个训练图片,再拼接到1张图片(COCO数据集目标分布不均衡) 丰富数据集 减少GPU 归一化计算每层的4张不同图片计算激活统计信息 减少large mini-batch size的需求 Augementation for small object dection 2019:界定大中小目标$(0-32;32-96;96-∞)$ Self-Adversarial Training (SAT) 自对抗训练 以2个forward backward stages的方式进行操作。在第一个阶段,神经网络改变的是原始图像而不是的网络权重。这样神经网络对其自身进行对抗性攻击,改变原始图像并创造出图像上没有目标的假象。在第2个阶段中,通过正常方式在修改的图像上进行目标检测对神经网络进行训练。 使用遗传算法选择最优超参数 修改的SAM、修改的PAN和Cross mini-Batch Normalization (CmBN) Modified SAM Modified PAN SAM从spatial-wise attention修改为point-wise attention。 PAN的 shortcut connection改为concatenation。 Cross mnin-Batch Normalization BN是对当前mini-batch进行归一化, CBN是对当前以及当前往前数3个mini-batch的结果进行归一化, CmBN 表示 CBN 修改版本,这仅在单个批次内的mini-batch之间收集统计信息。 当batch size变小时,BN不会执行。标准差和均值的估计值受样本量的影响。样本量越小,就越不可能代表分布的完整性。 YOLO V4 YOLOV4_CSP网络图 Backbone :CSPDarkNet53 每个CSP模块前面的卷积核的大小都是$3\\times3$,stride=2,起到下采样的作用。 因为Backbone有5个CSP模块,输入图像是$608\\times608$,所以特征图变化的规律是:608-\u003e304-\u003e152-\u003e76-\u003e38-\u003e19 Cross Stage Partial Network 跨阶段局部网络:CSPNet CSPNet: A New Backbone that can Enhance Learning Capability of CNN 原始代码 CSP模块,解决网络优化中的梯度信息重复 将基础层的特征映射划分为两部分,然后通过跨阶段层次结构将它们合并 通过截断梯度流来防止过多的重复梯度信息。 增强CNN学习能力,使得在轻量化的同时保持准确性 降低计算瓶颈 降低内存成本 Neck:PAN ,[SPP](####Enlarging Receptive Field 扩大感受野) SPP模块:显著地增加了感受野,分离出最显著的上下文特征,并且几乎没有造成网络运行速度的降低。 《DC-SPP-Yolo》:主干网络采用SPP比单一的使用最大池化方式更加有效地增加主干特征的接收范围;可以显著分离上下文特征。 FPN,自顶向下,将高层的特征信息通过上采样的方式进行传递融合,得到进行预测的特征图。传达强语义特征 PAN,自顶向上,传达强定位特征 Head:YOLOV3 使用技巧 Bag of Freebies (BoF) for backbone CutMix和Mosaic数据增强,DropBlock正则化, 类标签平滑 Bag of Specials (BoS) for backbone Mish激活函数,跨阶段部分连接(CSP),多输入加权残差连接 (MiWRC) Bag of Freebies (BoF) for detector: CIoU损失函数, CmBN, DropBlock正则化,Mosaic数据增强,自对抗训练(SAT),Eliminate grid sensitivity,为每个真实标签使用多个anchor,Cosine annealing scheduler,优化的超参数,随机的训练形状 Bag of Specials (BoS) for detector: Mish激活函数,SPP模块,SAM模块,路径聚合模块(PAN), DIoU-NMS ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:22:0","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"Experiments ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:23:0","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"实验设置 ImageNet图像分类实验 训练步骤:8,000,000 batch size=128;mini-batch size=32 多项式衰减调度策略初始学习率=0.1 warm-up步骤=1,000 动量因子=0.9;衰减权重=0.005 均使用1080 Ti或2080 Ti GPU进行训练 MS COCO目标检测实验 训练步骤:500,500 batch size=64执行多尺度训练;mini-batch size=8或者4 步阶衰减学习率调度策略,初始学习率=0.01,分别在40万步和45万步上乘以系数0.1 动量因子=0.9;衰减权重=0.0005 遗传算法使用YOLOv3-SPP训练GIoU损失,并搜索300个epoch的最小5k集 搜索学习率=0.00261,动量=0.949,IoU阈值= 0.213, loss normalizer 0.07。 ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:23:1","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"不同技巧对分类器和检测器训练的影响 分类器训练的BoF-backbone (Bag of Freebies)包括CutMix和Mosaic数据增强、类别标签smoothing。 Influence of BOF and Mish on the clasffie 检测器消融实验: Ablation Studies of BOF S:消除grid灵敏度,在YOLOv3通过方程$b_x=\\sigma(t_x)+c_x;b_y=\\sigma(t_y)+c_y$计算对象坐标,其中$c_x,c_y$始终为整数,因此,当$b_x$值接近$c_x$或$c_x+1$时需要极高的$t_x$绝对值。我们通过将sigmoid乘以超过1.0的因子来解决此问题,从而消除了没有检测到目标格子的影响。 M:Mosaic数据增强 IT:IoU阈值——如果IoU(ground truth, anchor) \u003e IoU阈值,为一个ground truth使用多个anchor GA:遗传算法 LS:类别标签smoothing CBN:CmBN CA:Cosine annealing scheduler——余弦退火衰减法;上升的时候使用线性上升,下降的时候模拟cos函数下降。执行多次。 DM:Dynamic mini-batch size——采用随机训练形状时,对于小分辨率的输入自动增大mini-batch的大小 OA:最优化Anchors 当使用SPP、PAN和SAM时,检测器获得最佳性能。 Ablation Studies of BOS ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:23:2","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"不同backbone和预训练权重对检测器训练的影响 CSPDarknet53比CSPResNeXt50更适合于做检测器的backbone ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:23:3","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"不同的mini-batch size对检测器训练的影响 训练时加入BoF和BoS后mini-batch大小几乎对检测器性能没有任何影响 不再需要使用昂贵的GPU来进行训练;一个即可 ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:23:4","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"拓展阅读 知乎:YOLOv4 介绍及其模型优化方法 知乎:深入浅出Yolo系列之Yolov3\u0026Yolov4\u0026Yolov5\u0026Yolox核心基础知识完整讲解 YOLOv4网络详解 YOLOv4重磅发布,五大改进,二十多项技巧实验,堪称最强目标检测万花筒 项目实践YOLO V4万字原理详细讲解并训练自己的数据集 激活函数(ReLU, Swish, Maxout) YOLOv4论文详细解读 睿智的目标检测32——TF2搭建YoloV4目标检测平台 YOLO V4 — 损失函数解析 004.YOLO-V4(yolo系列) Explanation of YOLO V4 a one stage detector ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:24:0","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"Scaled-YOLOv4 文章标题:Scaled-YOLOv4: Scaling Cross Stage Partial Network 作者:Chien-Yao Wang, Alexey Bochkovskiy, Hong-Yuan Mark Liao 发表时间:(CVPR 2021) source code - Pytorch (use to reproduce results) scaled YOLOV4_L网络图 ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:25:0","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"拓展阅读 Review — Scaled-YOLOv4: Scaling Cross Stage Partial Network YOLO演進 — 4 — Scaled-YOLOv4 ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:26:0","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"YOLO V5 原始代码 6.1 YOLOV5_L_2网络图 YOLOV5_L网络图 数据增强 data/hyps/hyp.scratch-high.yaml配置 Mosaic copy paste:不同目标复制粘贴拼接 Random affine MixUp Albumentations 数据增强库 Augment HSV Random horizontal flip 训练策略 Multi-scale training (0.5~1.5x) AutoAnchor (For training custom data) Warmup and Cosine LR scheduler EMA (Exponential Moving Average) Mixed precision Evolve hyper-parameters ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:27:0","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"损失计算 Classes loss, 分类损失,采用的是BCE loss, 注意只计算正样本的分类损失。 Objectness loss, obj损失,采用的依然是BCE loss,注意这里的ob指的是网络预测的目标边界框与GT Box的CIoU。这里计算的是所有样本的obj损失。 $$ Loss = \\lambda_1L_{cls} + \\lambda_2L_{obj} + \\lambda_3L_{loc} $$","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:28:0","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"平衡不同尺度损失 $$ L_{obj} = 4.0\\cdot L_{obj}^{small}+1.0 L_{obj}^{medum}+0.4 L_{obj}^{large} $$","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:28:1","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"拓展阅读 YOLOv5网络详解 YOLOv5 教程 训练自定义数据 🚀推荐的 获得最佳训练结果的提示 ☘️ 推荐的 权重和偏差记录 🌟新的 用于数据集、标签和主动学习的 Roboflow 🌟新的 多 GPU 训练 PyTorch 集线器 ⭐新的 TFLite、ONNX、CoreML、TensorRT 导出 🚀 测试时间增强 (TTA) 模型合奏 模型修剪/稀疏 超参数演化 冻结层的迁移学习 ⭐新的 架构总结 ⭐新的 ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:29:0","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"YOLOX 文章标题:YOLOX: Exceeding YOLO Series in 2021 作者:Zheng Ge, Songtao Liu, Feng Wang, Zeming Li, Jian Sun 发表时间:(CVPR 2021) 原始代码 Anchor-Free 和yolov5的v5.0不同的是head部分 YOLOX_L_2网络图 YOLOX_L网络图 ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:30:0","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"拓展阅读 知乎:深入浅出Yolo系列之Yolox核心基础完整讲解 B站:YoloX网络详解 ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:31:0","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"YOLO V7 文章标题:YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors 作者:Chien-Yao Wang, Alexey Bochkovskiy, Hong-Yuan Mark Liao 发表时间:( 2022) 官方源码 ","date":"2023-06-04","objectID":"/posts/deeplearning/object-detection/yolo/:32:0","tags":["Deep Learning","目标检测"],"title":"YOLO","uri":"/posts/deeplearning/object-detection/yolo/"},{"categories":["Deep Learning"],"content":"Distilling knowledge ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/distilling-knowledge-/","tags":["Deep Learning","轻量级网络"],"title":"Distilling knowledge ","uri":"/posts/deeplearning/light-weight/distilling-knowledge-/"},{"categories":["Deep Learning"],"content":"Distilling knowledge 文章标题:Distilling the knowledge in a neural network 作者:Hinton G, Vinyals O, Dean J. 发表时间:(NIPS 2014) ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/distilling-knowledge-/:1:0","tags":["Deep Learning","轻量级网络"],"title":"Distilling knowledge ","uri":"/posts/deeplearning/light-weight/distilling-knowledge-/"},{"categories":["Deep Learning"],"content":"Distillation $$ q_i = \\frac{exp(z_i/T)}{\\sum _j exp(z_j/T)} $$","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/distilling-knowledge-/:2:0","tags":["Deep Learning","轻量级网络"],"title":"Distilling knowledge ","uri":"/posts/deeplearning/light-weight/distilling-knowledge-/"},{"categories":["Deep Learning"],"content":"拓展阅读 论文主页 蒸馏机理论文 Hinton-NeurIPS2019论文:When Does Label Smoothing Help? ICLR2021论文:Is Label Smoothing Truly Incompatible with Knowledge Distillation: An Empirical Study :知乎解读 Does Knowledge Distillation Really Work? :知乎解读 标签平滑和知识蒸馏的关系 发展趋势 多老师多学生 知识的表示、数据集蒸馏、对比学习: Attention Transfer论文:https://arxiv.org/abs/1612.03928 Dataset Distillation 多模态、知识图谱、预训练大模型的知识蒸馏 论文解读 Knowledge Distillation 百科 Knowledge distillation in deep learning and its applications Knowledge Distillation: Principles, Algorithms, Applications Hinton官方PPT 知乎:知识蒸馏是什么?一份入门随笔——公式(2)的详细推导 知乎:【经典简读】知识蒸馏(Knowledge Distillation) 经典之作 知乎:哪位来聊聊深度学习知识蒸馏(knowledge distillation)? 博客解读1 博客解读2 博客解读3 博客解读4 知乎:陀飞轮 代码库 ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/distilling-knowledge-/:3:0","tags":["Deep Learning","轻量级网络"],"title":"Distilling knowledge ","uri":"/posts/deeplearning/light-weight/distilling-knowledge-/"},{"categories":["Deep Learning"],"content":"GhostNet","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/ghostnet/","tags":["Deep Learning","轻量级网络"],"title":"GhostNet","uri":"/posts/deeplearning/light-weight/ghostnet/"},{"categories":["Deep Learning"],"content":"GhostNet 文章标题:GhostNet: More Features from Cheap Operations 作者:Kai Han, Yunhe Wang, Qi Tian, Jianyuan Guo, Chunjing Xu, Chang Xu 发表时间:(CVPR 2020) 官方源码 ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/ghostnet/:1:0","tags":["Deep Learning","轻量级网络"],"title":"GhostNet","uri":"/posts/deeplearning/light-weight/ghostnet/"},{"categories":["Deep Learning"],"content":"GhostNets 文章标题:GhostNets on Heterogeneous Devices via Cheap Operations 作者:Kai Han, Yunhe Wang, Chang Xu, Jianyuan Guo, Chunjing Xu, Enhua Wu, Qi Tian 发表时间:(IJCV 2022) 官方源码 ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/ghostnet/:2:0","tags":["Deep Learning","轻量级网络"],"title":"GhostNet","uri":"/posts/deeplearning/light-weight/ghostnet/"},{"categories":["Deep Learning"],"content":"LCNet","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/lcnet/","tags":["Deep Learning","轻量级网络"],"title":"LCNet","uri":"/posts/deeplearning/light-weight/lcnet/"},{"categories":["Deep Learning"],"content":"PP-LCNet 文章标题:PP-LCNet: A Lightweight CPU Convolutional Neural Network 作者:Cheng Cui, Tingquan Gao, Shengyu Wei, Yuning Du, Ruoyu Guo, Shuilong Dong, Bin Lu, Ying Zhou, Xueying Lv, Qiwen Liu, Xiaoguang Hu, Dianhai Yu, Yanjun Ma 发表时间:(2021) 官方源码 设计流程 全面摸底:将目前所有轻量级模型及组合模型在Intel CPU上做推理(开启MKLDNN),记录推理时间和精度 总结归纳:寻找推理时间较小,精度较高的模型的特点 强强联合:组合这些block构建一个基本的baseline模型——BaseNet 锦上添花:基于BaseNet,实验性地添加学术界和工业界常用的提升模型性能的模块 替换激活函数ReLU为H-Swish 不涉及指数运算,模型推理时间几乎不变,精度提升2%~3% 合适位置添加SE模块 LCNet_SE_location 合适位置添加更大卷积核 kernel Size不是越大越好,有最优值 LCNet_large_kernel_location GAP后添加更大的1x1conv GAP后直接接分类层,特征进一步整合加工会受限,添加一层大的1x1的conv层可以有效避免这个问题 推理时间增加:10% 精度增加:2%~3% LCNet LCNet LCNet_ablation LCNet_ablation ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/lcnet/:1:0","tags":["Deep Learning","轻量级网络"],"title":"LCNet","uri":"/posts/deeplearning/light-weight/lcnet/"},{"categories":["Deep Learning"],"content":"MobileNet","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/mobilenet/","tags":["Deep Learning","轻量级网络"],"title":"MobileNet","uri":"/posts/deeplearning/light-weight/mobilenet/"},{"categories":["Deep Learning"],"content":"MobileNetV1 文章标题:MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications 作者:Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, Hartwig Adam 发表时间:(CVPR 2017) MobileNet V1是谷歌2017年提出的轻量化卷积神经网络,用于在移动端、边缘终端设备上进行实时边缘计算和人工智能推理部署。 使用深度可分离卷积Depthwise Separable Convolution,在保证准确度性能的基础上,将参数量、计算量压缩为标准卷积的八到九分之一。引入网络宽度超参数和输入图像分辨率超参数,进一步控制网络尺寸。 在ImageNet图像分类、Stanford Dog细粒度图像分类、目标检测、人脸属性识别、人脸编码、以图搜地等计算机视觉任务上,结合知识蒸馏进行评估,MobileNet表现出极致的轻量化和速度性能。 ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/mobilenet/:1:0","tags":["Deep Learning","轻量级网络"],"title":"MobileNet","uri":"/posts/deeplearning/light-weight/mobilenet/"},{"categories":["Deep Learning"],"content":"Prior Work 压缩已有模型 知识蒸馏 权值量化 剪枝 权重剪枝 通道剪枝 注意力迁移 直接训练小模型 squeezeNet MobileNet ShuffleNet Xception EfficientNet NasNet DARTS 直接加速卷积运算 im2col+GEMM Winograd 低秩分解 硬件部署 TensorRT Jetson Tensorflow-slim Tensorflow-lite Openvino ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/mobilenet/:2:0","tags":["Deep Learning","轻量级网络"],"title":"MobileNet","uri":"/posts/deeplearning/light-weight/mobilenet/"},{"categories":["Deep Learning"],"content":"MobileNet Architecture ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/mobilenet/:3:0","tags":["Deep Learning","轻量级网络"],"title":"MobileNet","uri":"/posts/deeplearning/light-weight/mobilenet/"},{"categories":["Deep Learning"],"content":"Depthwise Separable Convolution 深度可分离卷积 将标准卷积分为两部分:depthwise convolution,$1\\times1$ pointwise convolution 逐层卷积处理每个特征通道上的空间信息,逐点卷积进行通道间的特征融合。 标准卷积:卷积核channel=输入特征矩阵channel;输出特征矩阵channel=卷积核个数 深度可分离卷积:卷积核channel=1;输出特征矩阵channel=卷积核个数=输入特征矩阵channel; 每个输入通道应用一个卷积核进行逐层卷积 标准卷积 深度可分离卷积 $D_K$:卷积核尺寸;$M$:卷积核通道数(输入通道数);$N$:卷积核个数(输出通道数);$D_F$:特征图大小 标准卷积参数计算:$D_K\\times D_K\\times M\\times N$; 计算量:$D_K\\times D_K\\times M\\times N\\times D_F\\times D_F$ 深度可分离卷积参数计算:$D_K\\times D_K\\times M+M\\times N$; 深度可分离卷积计算量:$D_K\\times D_K\\times M\\times D_F\\times D_F+M\\times N \\times D_F \\times D_F$ ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/mobilenet/:3:1","tags":["Deep Learning","轻量级网络"],"title":"MobileNet","uri":"/posts/deeplearning/light-weight/mobilenet/"},{"categories":["Deep Learning"],"content":"Network Structure and Training MobileNet Body Architecture Filter Shape:卷积核尺寸×输入特征矩阵深度×卷积核个数 第一层是标准卷积 放弃pooling层,而使用stride=2的卷积 所有层后面都有BN层和ReLU6;更多的ReLU6,增加了模型的非线性变化,增强了模型的泛化能力。 这个激活函数在float16/int8的嵌入式设备中效果很好,能较好地保持网络的鲁棒性。 深度可分离卷积 MobileNetV1的大部分计算量和参数量都是$1\\times1$卷积花费的。 MobileNetV1的计算量和参数分布 ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/mobilenet/:3:2","tags":["Deep Learning","轻量级网络"],"title":"MobileNet","uri":"/posts/deeplearning/light-weight/mobilenet/"},{"categories":["Deep Learning"],"content":"Width and Resolution Multiplier 宽度$\\alpha $和分辨率$ \\rho$超参数 宽度超参数$\\alpha $:控制卷积层卷积核个数 $D_K\\times D_K\\times \\alpha M\\times D_F\\times D_F+\\alpha M\\times \\alpha N \\times D_F \\times D_F$ $\\alpha \\in(0,1]$;一般设置为:$1, 0.75,0.5,0.25$ 分辨率超参数$\\rho $:控制输入图像大小 $D_K\\times D_K\\times \\alpha M\\times \\rho D_F\\times \\rho D_F+\\alpha M\\times \\alpha N \\times \\rho D_F \\times \\rho D_F$ $\\rho \\in(0,1]$;一般设置为:$1, \\frac {6}{7},\\frac {5}{7},\\frac {4}{7}$ 对应分辨率为$224,192,160,128$ 计算举例:$D_K=3,M=512,N=512,D_F=14$ MobileNetV1的计算例子 深度卷积在GPU上运行速度还不如一般的标准卷积,因为depthwise 的卷积核复用率比普通卷积要小很多,计算和内存访问的比值比普通卷积更小,因此会花更多时间在内存开销上,而且per-channel的矩阵计算很小不容易并行导致的更慢,但理论上计算量和参数量都是大大减少的,只是底层优化的问题。 ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/mobilenet/:3:3","tags":["Deep Learning","轻量级网络"],"title":"MobileNet","uri":"/posts/deeplearning/light-weight/mobilenet/"},{"categories":["Deep Learning"],"content":"代码 import torch.nn as nn class MobileNet(nn.Module): def __init__(self, num_classes=1000): super(MobileNet, self).__init__() self.num_classes = num_classes def conv_bn(inp, oup, stride): return nn.Sequential( nn.Conv2d(inp, oup, 3, stride, 1, bias=False), nn.BatchNorm2d(oup), nn.ReLU(inplace=True) ) def conv_dw(inp, oup, stride): return nn.Sequential( nn.Conv2d(inp, inp, 3, stride, 1, groups=inp, bias=False), nn.BatchNorm2d(inp), nn.ReLU(inplace=True), nn.Conv2d(inp, oup, 1, 1, 0, bias=False), nn.BatchNorm2d(oup), nn.ReLU(inplace=True), ) self.model = nn.Sequential( conv_bn(3, 32, 2), conv_dw(32, 64, 1), conv_dw(64, 128, 2), conv_dw(128, 128, 1), conv_dw(128, 256, 2), conv_dw(256, 256, 1), conv_dw(256, 512, 2), conv_dw(512, 512, 1), conv_dw(512, 512, 1), conv_dw(512, 512, 1), conv_dw(512, 512, 1), conv_dw(512, 512, 1), conv_dw(512, 1024, 2), conv_dw(1024, 1024, 1), nn.AvgPool2d(7), ) self.fc = nn.Linear(1024, self.num_classes) def forward(self, x): x = self.model(x) x = x.view(-1, 1024) x = self.fc(x) return x ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/mobilenet/:4:0","tags":["Deep Learning","轻量级网络"],"title":"MobileNet","uri":"/posts/deeplearning/light-weight/mobilenet/"},{"categories":["Deep Learning"],"content":"扩展阅读 Keras中的MobileNet预训练模型文档 Keras中的MobileNet预训练模型代码 Laurent Sifre2013博士论文:Rigid-motion scattering for image classification] 贾扬清博士论文 为什么 MobileNet、ShuffleNet 在理论上速度很快,工程上并没有特别大的提升? 轻量级神经网络“巡礼”(二)—— MobileNet,从V1到V3 Why MobileNet and Its Variants (e.g. ShuffleNet) Are Fast 卷积神经网络中的Separable Convolution Google’s MobileNets on the iPhone ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/mobilenet/:5:0","tags":["Deep Learning","轻量级网络"],"title":"MobileNet","uri":"/posts/deeplearning/light-weight/mobilenet/"},{"categories":["Deep Learning"],"content":"MobileNetV2 文章标题:MobileNetV2: Inverted Residuals and Linear Bottlenecks 作者:Mark Sandler, Andrew Howard ,Menglong Zhu ,Andrey Zhmoginov, Liang-Chieh Chen 发表时间:(CVPR 2018) ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/mobilenet/:6:0","tags":["Deep Learning","轻量级网络"],"title":"MobileNet","uri":"/posts/deeplearning/light-weight/mobilenet/"},{"categories":["Deep Learning"],"content":"Preliminaries, discussion and intuition ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/mobilenet/:7:0","tags":["Deep Learning","轻量级网络"],"title":"MobileNet","uri":"/posts/deeplearning/light-weight/mobilenet/"},{"categories":["Deep Learning"],"content":"[Depthwise Separable Convolution](###Depthwise Separable Convolution 深度可分离卷积) ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/mobilenet/:7:1","tags":["Deep Learning","轻量级网络"],"title":"MobileNet","uri":"/posts/deeplearning/light-weight/mobilenet/"},{"categories":["Deep Learning"],"content":"linear bottleneck MobileNetV2_MobileNetV1 MobileNetV2微结构 第一层Pointwise convolution:目的是在数据进入深度卷积之前扩展数据中的通道数 Depthwise convolution的Filter数量取决于之前的Pointwise的通道数。而这个通道数是可以任意指定的,因此解除了3x3卷积核个数的限制 第二次Pointwise则不采用非线性激活,保留线性特征 If the manifold of interest remains non-zero volume after ReLU transformation, it corresponds to a linear transformation. ReLU is capable of preserving complete information about the input manifold, but only if the input manifold lies in a low-dimensional subspace of the input space. ReLU激活函数对低维特征信息造成大量损失。 ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/mobilenet/:7:2","tags":["Deep Learning","轻量级网络"],"title":"MobileNet","uri":"/posts/deeplearning/light-weight/mobilenet/"},{"categories":["Deep Learning"],"content":"Inverted residuals Inverted residuals 灰色为下一个结构的开始;有格子阴影的层:表示不包含非线性的层 The diagonally hatched texture indicates layers that do not contain non-linearities. MobileNetV2_ResNet MobileNetV2_ExpandProject ResNet 先降维 (0.25倍)、卷积、再升维,而 MobileNet V2 则是 先升维 (6倍)、卷积、再降维。 ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/mobilenet/:7:3","tags":["Deep Learning","轻量级网络"],"title":"MobileNet","uri":"/posts/deeplearning/light-weight/mobilenet/"},{"categories":["Deep Learning"],"content":"Information flow interpretation compress 扩展层充当解压缩器(如unzip),首先将数据恢复为完整形式,然后深度层执行网络此阶段重要的任何过滤,最后投影层压缩数据以使其再次变小。 ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/mobilenet/:7:4","tags":["Deep Learning","轻量级网络"],"title":"MobileNet","uri":"/posts/deeplearning/light-weight/mobilenet/"},{"categories":["Deep Learning"],"content":"Model Architecture MobileNetV2 Architecture t:expansion rate;c:卷积核个数;n:重复次数;s:首个模块的步长,其他为1 ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/mobilenet/:8:0","tags":["Deep Learning","轻量级网络"],"title":"MobileNet","uri":"/posts/deeplearning/light-weight/mobilenet/"},{"categories":["Deep Learning"],"content":"Experiments Classifier 基础网络的输出通常是 7×7 像素的图像。分类器首先使用全局池化层将大小从 7×7 减小到 1×1 像素——基本上采用 49 个不同预测器的集合——然后是分类层和 softmax。 Object Detection 获取最后一个基础网络层的输出,还获取前几个层的输出,并将这些输出送到 SSD 层。MobileNet 层的工作是将输入图像中的像素转换为描述图像内容的特征,并将这些特征传递给其他层。因此,此处使用 MobileNet 作为第二个神经网络的特征提取器。 ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/mobilenet/:9:0","tags":["Deep Learning","轻量级网络"],"title":"MobileNet","uri":"/posts/deeplearning/light-weight/mobilenet/"},{"categories":["Deep Learning"],"content":"拓展阅读 Keras预训练MobileNetV2源代码 与MobileNetV2有关的Github高赞开源项目 谷歌AI博客 Figure 2: MobileNetV2 with inverted residuals 图解MobileNetV2中的Bottlenecks 知乎:MobileNet V2 论文初读 MobileNet version 2 ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/mobilenet/:10:0","tags":["Deep Learning","轻量级网络"],"title":"MobileNet","uri":"/posts/deeplearning/light-weight/mobilenet/"},{"categories":["Deep Learning"],"content":"MobileNetV3 文章标题:Searching for MobileNetV3 作者:Andrew Howard,Mark Sandler,Grace Chu,Liang-Chieh Chen,Bo Chen, Mingxing Tan,Weijun Wang,Yukun Zhu, Ruoming Pang, Vijay Vasudevan, Quoc V. Le, Hartwig Adam 发表时间:(CVPR 2019) ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/mobilenet/:11:0","tags":["Deep Learning","轻量级网络"],"title":"MobileNet","uri":"/posts/deeplearning/light-weight/mobilenet/"},{"categories":["Deep Learning"],"content":"Efficient Mobile Building Blocks更新Block MobileNetV3_block 加入SE模块(Squeeze-and-Excite):SE模块是一种轻量级的通道注意力模块。depthwise之后,经过池化层,然后第一个fc层,通道数缩小4倍,再经过第二个fc层,通道数变换回去(扩大4倍),然后与depthwise进行按位相乘。 MobileNetV3_block_SE 更新激活函数 ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/mobilenet/:11:1","tags":["Deep Learning","轻量级网络"],"title":"MobileNet","uri":"/posts/deeplearning/light-weight/mobilenet/"},{"categories":["Deep Learning"],"content":"使用NAS搜索参数 利用NAS(神经结构搜索)和NetAdapt来搜索网络的配置和参数。 ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/mobilenet/:11:2","tags":["Deep Learning","轻量级网络"],"title":"MobileNet","uri":"/posts/deeplearning/light-weight/mobilenet/"},{"categories":["Deep Learning"],"content":"Redesigning Expensive Layers重新设计耗时层结构 MobileNetV3_Last stage 减少第一个卷积层的卷积核个数(32-\u003e16):使用ReLU或者swish激活函数,能将通道数缩减到16维,且准确率保持不变。这又能节省2ms的延时。 $$ ReLU6(x)=min(max(x,0),6)\\\\ h\\_sigmoid[x]=\\frac{ReLU6(x+3)}{6} $$ $swish\\ x = x\\dot \\sigma (x)$ ;$\\sigma = \\frac{1}{1+e^{-x}}$ 计算、求导复杂,对量化过程不友好。将sigmoid函数替换为piece-wise linear hard analog function. $$ h\\_swish[x]=x\\frac{ReLU6(x+3)}{6} $$精简Last Stage :Original Last Stage为v2的最后输出几层,v3版本将平均池化层提前了。在使用1×1卷积进行扩张后,就紧接池化层-激活函数,最后使用1×1的卷积进行输出。通过这一改变,能减少7ms的延迟,提高了11%的运算速度,且几乎没有任何精度损失。 ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/mobilenet/:11:3","tags":["Deep Learning","轻量级网络"],"title":"MobileNet","uri":"/posts/deeplearning/light-weight/mobilenet/"},{"categories":["Deep Learning"],"content":"MobileNetV3 Architecture Specification for MobileNetV3-Large Specification for MobileNetV3-Small exo_size:升维;#out 输出通道数; NL:激活函数 ; s:步距 ;NBN:没有批量归一化 ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/mobilenet/:12:0","tags":["Deep Learning","轻量级网络"],"title":"MobileNet","uri":"/posts/deeplearning/light-weight/mobilenet/"},{"categories":["Deep Learning"],"content":"MobileNeXt 文章标题:Rethinking Bottleneck Structure for Efficient Mobile Network Design 作者:Zhou Daquan, Qibin Hou, Yunpeng Chen, Jiashi Feng, Shuicheng Yan 发表时间:(ECCV 2020) 官方代码 code block 每个块的厚度来表示相应的相对通道数 (a):Classic residual bottleneck blocks 经典残差块 一个 1×1 卷积用于通道缩减,一个 3×3 卷积用于空间特征提取,另一个 1×1 卷积用于通道扩展 (b):Inverted residual blocks 倒残差块 一个 1×1 卷积用于通道扩展,一个 3×3 深度可分离卷积用于空间特征提取,另一个 1×1 卷积用于通道缩减 将低维压缩张量作为输入,并通过逐点卷积将其扩展到更高维。应用深度卷积进行空间上下文编码,另一个逐点卷积以生成低维特征张量作为下一个块的输入。 由于相邻倒置差之间的表示是低维的。bottleneck之间的shortcut可能会阻止来自顶层的梯度在模型训练期间成功传播到底层。 深度可分离卷积用于空间特征提取后进行通道压缩可能无法保留足够的有用信息,造成信息丢失 ShuffleNetV2 在反向残差块之前插入一个通道拆分模块,并在其后添加另一个通道混洗模块 HBONet 中,下采样操作被引入到倒残差块中,用于对更丰富的空间信息进行建模。 MobileNetV3 提出在每个阶段搜索最优激活函数和倒残差块的扩展率 MixNet 提出在倒残差块中搜索深度可分离卷积的最佳内核大小 (c):Sandglass Block 沙漏块 在更高维度上执行恒等映射和空间变换,从而有效地减轻信息丢失和梯度混淆 MobileNeXt_different_variants Performance of different variants (a):直接修改经典的残差块构建的,将标准的 3×3 卷积替换为 3×3 深度卷积。 与(d)相比,性能下降了约 5%:可能由于深度卷积是在具有低维特征空间的bottleneck中进行的,因此无法捕获足够的空间信息。 (b):在a基础上添加加了另一个 3×3 深度卷积 与(a)相比 精度提高了 1% 以上:表明编码更多的空间信息确实有帮助 与(d)相比 (c):基于原始的倒残差块,将深度卷积从高维特征空间移动到特征通道较少的bottleneck位置 与(b)相比更差:表明在高维表示之间建立shortcut更有利于网络性能 (d):沙漏块 设计原则 为了在传输到顶层时保留来自底层的更多信息并促进跨层的梯度传播,应该在高维表示之间建立shortcut 具有小内核大小(例如 3 × 3)的深度卷积是轻量级的,可以适当地将几个深度卷积应用于更高维的特征,以便可以编码更丰富的空间信息。 将bottleneck保持在剩余路径的中间,以节省参数和计算成本。 高维表示之间建立shortcut。 两个深度卷积都是在高维空间中进行的,可以提取更丰富的特征表示。 使用线性瓶颈可以帮助防止特征值被归零,从而减少信息丢失:第一个逐点卷积之后不添加任何激活层。 仅在第一个深度卷积层和最后一个逐点卷积层之后添加激活层:最后一个卷积之后添加一个激活层会对分类性能产生负面影响(经验)。 MobileNeXt_sandglass_block 注意,$M\\neq N$时不添加shortcut ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/mobilenet/:13:0","tags":["Deep Learning","轻量级网络"],"title":"MobileNet","uri":"/posts/deeplearning/light-weight/mobilenet/"},{"categories":["Deep Learning"],"content":"MobileNeXt Architecture MobileNetV2 MobileNeXt b:重复次数。 t:通道扩展比。 k:类别 为了证明模型的好处来自于新颖架构,而不是利用更多的深度卷积或更大的感受野 与 MobileNetV2 的改进版本进行比较,在中间插入了一个深度卷积块。MobileNetV2 的性能提高到了 73%,这仍然比MobileNeXt的 (74%) 差得多。 $$ G_{1:\\alpha M}= φ(F)_{1:\\alpha M} + F_{1:\\alpha M}, \\ G_{\\alpha M:M} = φ(F)_{\\alpha M:M} $$ 首先,在减少乘数之后,可以减少每个构建块中的element-wise additions的数量。逐元素加法非常耗时。可以选择较低的恒等张量乘数以产生更好的延迟,而性能几乎没有下降。 其次,可以减少内存访问次数。减少恒等张量的通道维度可以有效地鼓励处理器将其存储在缓存或处理器附近的其他更快的内存中,从而改善延迟。 ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/mobilenet/:14:0","tags":["Deep Learning","轻量级网络"],"title":"MobileNet","uri":"/posts/deeplearning/light-weight/mobilenet/"},{"categories":["Deep Learning"],"content":"ReXNet 文章标题:Rethinking Channel Dimensions for Efficient Model Design 作者:Dongyoon Han, Sangdoo Yun, Byeongho Heo, YoungJoon Yoo 发表时间:(CVPR 2021) v1版本叫做ReXNet: Diminishing Representational Bottleneck on Convolutional Neural Network 官方代码 ReXNet,ReXNet 是 NAVER 集团 ClovaAI 研发中心基于一种网络架构设计新范式而构建的网络。针对现有网络中存在的 Representational Bottleneck 问题,作者提出了一组新的设计原则。作者认为传统的网络架构设计范式会产生表达瓶颈,进而影响模型的性能。为研究此问题,作者研究了上万个随机网络生成特征的 matric rank,同时进一步研究了网络层中通道配置方案。基于此,作者提出了一组简单而有效的设计原则,以消除表达瓶颈问题。 ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/mobilenet/:15:0","tags":["Deep Learning","轻量级网络"],"title":"MobileNet","uri":"/posts/deeplearning/light-weight/mobilenet/"},{"categories":["Deep Learning"],"content":"Designing an Expansion Layer Visualization of the output rank 在第一个1×1卷积时,需要用6或更小的扩展比来设计一个inverted bottleneck; 在轻量级模型中,每个带有深度卷积的inverted bottleneck都需要更高的通道维度比; 复杂的非线性,如ELU和SiLU,需要放在1×1卷积或3×3卷积之后(不是深度卷积) channel dimension ratio:$d_{in}/d_{out}\\in[0.1,1]$ rank ratio:$rank(f(WX))/d_{out}$ $f(WX)$:输出特征;$W\\in R^{d_{out}\\times d_{in}};X\\in R^{d_{in}\\times N}$;$N$为batchsize;$f$为归一化后的非线性函数 Average Rank Ratio:每个模型取平均 Visualization of the searched models’ channel dimensions vs. block index Detailed searched channel configurations 从200个搜索过的模型中收集前10%、中间10%(即前50%和60%之间的模型)和后10%的模型 Red: top-10%; blue: middle-10%; green: bottom-10% accuracy models 红色的Block Index的线性参数化享有更高的精度,同时保持类似的计算成本。最佳模型的通道配置为线性增加。 绿色的模型大幅减少了输入侧的通道,因此,大部分的权重参数被放置在输出侧,导致精度的损失。 蓝色代表处于中间10%精度的模型,与传统通道配置相似。传统配置是通过限制早期层的通道,并在靠近输出的地方提供更多的通道来达到flop-efficienty的目的。 ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/mobilenet/:16:0","tags":["Deep Learning","轻量级网络"],"title":"MobileNet","uri":"/posts/deeplearning/light-weight/mobilenet/"},{"categories":["Deep Learning"],"content":"Network upgrade MobileNetv2 ReXNet_1.0x MobileNetv1 ReXNet_plain ReXNet 通道数线性增加 在每个倒置瓶颈的第一个1×1卷积后替换ReLU6 观察到维数比较小的层需要更多的处理 第二个深度卷积的通道维数比为1,所以在此不替换ReLU6。 MB1和MB6指的是MobileNetV2的inverted bottleneck,扩展率分别为1和6。 ReXNet|消除表达瓶颈,提升性能指标 ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/mobilenet/:17:0","tags":["Deep Learning","轻量级网络"],"title":"MobileNet","uri":"/posts/deeplearning/light-weight/mobilenet/"},{"categories":["Deep Learning"],"content":"MixNet 文章标题:MixConv: Mixed Depthwise Convolutional Kernels 作者:Mingxing Tan, Quoc V. Le 发表时间:(BMVC 2019) MixConv,MixNet 是谷歌出的一篇关于轻量级网络的文章,主要工作就在于探索不同大小的卷积核的组合。作者发现目前网络有以下两个问题:小的卷积核感受野小,参数少,但是准确率不高;大的卷积核感受野大,准确率相对略高,但是参数也相对增加了很多.为了解决上面两个问题,文中提出一种新的混合深度分离卷积(MDConv)(mixed depthwise convolution),将不同的核大小混合在一个卷积运算中,并且基于 AutoML 的搜索空间,提出了一系列的网络叫做 MixNets,在 ImageNet 上取得了较好的效果。 ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/mobilenet/:18:0","tags":["Deep Learning","轻量级网络"],"title":"MobileNet","uri":"/posts/deeplearning/light-weight/mobilenet/"},{"categories":["Deep Learning"],"content":"NAS","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/nas/","tags":["Deep Learning","轻量级网络"],"title":"NAS","uri":"/posts/deeplearning/light-weight/nas/"},{"categories":["Deep Learning"],"content":"MnasNet 文章标题:MnasNet: Platform-Aware Neural Architecture Search for Mobile 作者:Mingxing Tan, Bo Chen, Ruoming Pang, Vijay Vasudevan, Mark Sandler, Andrew Howard, Quoc V. Le 发表时间:(CVPR 2019) 谷歌轻量化卷积神经网络Mnasnet,介于MobileNet V2和V3之间。 使用多目标优化的目标函数,兼顾速度和精度,其中速度用真实手机推断时间衡量。 提出分层的神经网络架构搜索空间,将卷积神经网络分解为若干block,分别搜索各自的基本模块,保证层结构多样性。 ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/nas/:1:0","tags":["Deep Learning","轻量级网络"],"title":"NAS","uri":"/posts/deeplearning/light-weight/nas/"},{"categories":["Deep Learning"],"content":"拓展阅读 官方代码 Github-pytorch模型代码 知乎:如何评价 Google 最新的模型 MnasNet? 知乎:MnasNet:终端轻量化模型新思路 ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/nas/:2:0","tags":["Deep Learning","轻量级网络"],"title":"NAS","uri":"/posts/deeplearning/light-weight/nas/"},{"categories":["Deep Learning"],"content":"Re-parameterization","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/re-parameterization/","tags":["Deep Learning","轻量级网络"],"title":"Re-parameterization","uri":"/posts/deeplearning/light-weight/re-parameterization/"},{"categories":["Deep Learning"],"content":"ACNet 文章标题:ACNet: Strengthening the Kernel Skeletons for Powerful CNN via Asymmetric Convolution Blocks 作者:Xiaohan Ding, Yuchen Guo, Guiguang Ding, Jungong Han 发表时间:(ICCV 2019) 官方源码 ACNet,提出了一个Asymmetric Convolution Block (ACB),可以在普通的网络中加入一些ACB来代替普通的卷积,这个仅在训练的时候起作用,然后测试的时候可以使得网络恢复之前的结构,所以这种方法是提升了网络的性能但是完全不会破坏网络。 Reparam(KxK) = KxK-BN + 1xK-BN + Kx1-BN。这一记法表示用三个平行分支(KxK,1xK,Kx1)的加和来替换一个KxK卷积。注意三个分支各跟一个BN,三个分支分别过BN之后再相加。这样做可以提升卷积网络的性能 ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/re-parameterization/:1:0","tags":["Deep Learning","轻量级网络"],"title":"Re-parameterization","uri":"/posts/deeplearning/light-weight/re-parameterization/"},{"categories":["Deep Learning"],"content":"拓展阅读 结构重参数化:利用参数转换解耦训练和推理结构 ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/re-parameterization/:2:0","tags":["Deep Learning","轻量级网络"],"title":"Re-parameterization","uri":"/posts/deeplearning/light-weight/re-parameterization/"},{"categories":["Deep Learning"],"content":"ACNetV2 文章标题:Diverse Branch Block: Building a Convolution as an Inception-like Unit 作者:Xiaohan Ding, Xiangyu Zhang, Jungong Han, Guiguang Ding 发表时间:(CVPR 2021) 官方源码 Reparam(KxK) = KxK-BN + 1x1-BN + 1x1-BN-AVG-BN + 1x1-BN-KxK-BN。本届CVPR接收的另一篇文章。跟ACNet的相似点在于都是通用的卷积网络基本模块,都可以用来替换常规卷积层。采用了更为复杂的连续卷积(1x1-BN-KxK-BN表示先过1x1卷积,再过BN,再过KxK卷积,再过另一个BN)和average pooling(记作AVG),效果超过ACNet。在这篇文章里也探索了reparam之所以work的原因,给出了一些解释。 ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/re-parameterization/:3:0","tags":["Deep Learning","轻量级网络"],"title":"Re-parameterization","uri":"/posts/deeplearning/light-weight/re-parameterization/"},{"categories":["Deep Learning"],"content":"RepVGG 文章标题:RepVGG: Making VGG-style ConvNets Great Again 作者:Xiaohan Ding, Xiangyu Zhang, Ningning Ma, Jungong Han, Guiguang Ding, Jian Sun 发表时间:(CVPR 2021) 官方源码 RepVGG,RepVGG(Making VGG-style ConvNets Great Again)系列模型是由清华大学(丁贵广团队)、旷视科技(孙剑等人)、港科大和阿伯里斯特威斯大学在 2021 年提出的一个简单但强大的卷积神经网络架构,该架构具有类似于 VGG 的推理时间主体,该主体仅由 3x3 卷积和 ReLU 的堆栈组成,而训练时间模型具有多分支拓扑。训练时间和推理时间架构的这种解耦是通过结构重新参数化(re-parameterization)技术实现的,因此该模型称为 RepVGG。 Reparam(3x3) = 3x3-BN + 1x1-BN + BN。对每个3x3卷积,在训练时给它构造并行的恒等和1x1卷积分支,并各自过BN后相加。我们简单堆叠这样的结构得到形成了一个VGG式的直筒型架构。推理时的这个架构仅有一路3x3卷积夹ReLU,连分支结构都没有,可以说“一卷到底”,效率很高。这样简单的结构在ImageNet上可以达到超过80%的准确率,比较精度和速度可以超过或打平RegNet等SOTA模型。 ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/re-parameterization/:4:0","tags":["Deep Learning","轻量级网络"],"title":"Re-parameterization","uri":"/posts/deeplearning/light-weight/re-parameterization/"},{"categories":["Deep Learning"],"content":"ResRep 文章标题:ResRep: Lossless CNN Pruning via Decoupling Remembering and Forgetting 作者:Xiaohan Ding, Tianxiang Hao, Jianchao Tan, Ji Liu, Jungong Han, Yuchen Guo, Guiguang Ding 发表时间:(ICCV 2021) 官方源码 ResRep: Reparam(KxK) = KxK-BN-1x1。这是一个剪枝(channel pruning)方法。1x1卷积初始化为单位矩阵,因而不改变模型原本的输出。然后我们通过一套特殊设计的更新规则将这个单位矩阵变得行数少于列数(即output_channels\u003cinput_channels),然后将整个KxK-BN-1x1序列转换为一个KxK卷积,从而将原本的KxK卷积的output_channels减少。这一方法能在ResNet-50上实现超过50%压缩率的情况下精度完全不掉(从76.15%的torchvision标准模型压缩到还是76.15%),据我所知这是第一个实现如此高无损压缩率的传统(结构化,非动态,非NAS)剪枝方法。 ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/re-parameterization/:5:0","tags":["Deep Learning","轻量级网络"],"title":"Re-parameterization","uri":"/posts/deeplearning/light-weight/re-parameterization/"},{"categories":["Deep Learning"],"content":"RepMLP 文章标题:RepMLPNet: Hierarchical Vision MLP with Re-parameterized Locality 作者:Xiaohan Ding, Honghao Chen, Xiangyu Zhang, Jungong Han, Guiguang Ding 发表时间:(CVPR 2022) 官方源码 本文提出了一种将局部性注入 FC 层的重新参数化方法、一种新颖的 MLP 样式块和分层 MLP 架构。 RepMLPNet 在准确性-效率权衡和训练成本方面优于几个同时提出的 MLP。然而,作为 MLP,RepMLPNet 有几个明显的共同弱点。 1) 与 Vision Transformers 类似,MLP 容易过拟合,需要强大的数据增强和正则化技术。 2)在手机等低功耗设备上,MLP 的模型尺寸可能是一个障碍。 3) 虽然我们第一次尝试使用 MLP 骨干进行语义分割的结果很有希望,但我们没有观察到优于传统 CNN 的优势。 ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/re-parameterization/:6:0","tags":["Deep Learning","轻量级网络"],"title":"Re-parameterization","uri":"/posts/deeplearning/light-weight/re-parameterization/"},{"categories":["Deep Learning"],"content":"RepLKNet 文章标题:Scaling Up Your Kernels to 31x31: Revisiting Large Kernel Design in CNNs 作者:Xiaohan Ding, Xiangyu Zhang, Yizhuang Zhou, Jungong Han, Guiguang Ding, Jian Sun 发表时间:(CVPR 2022) 官方源码 这篇论文重新审视了在设计 CNN 架构时长期被忽视的大卷积核。我们证明,使用几个大内核而不是许多小内核可以更有效地产生更大的有效感受野,从而大幅提升 CNN 的性能,尤其是在下游任务上的性能,并在数据和模型扩展时大大缩小 CNN 和 ViT 之间的性能差距.我们希望我们的工作能够推进 CNN 和 ViT 的研究。一方面,对于 CNN 社区,我们的研究结果表明我们应该特别注意 ERF,这可能是高性能的关键。另一方面,对于 ViT 社区,由于大卷积可以替代具有类似行为的多头自注意力,这可能有助于理解自注意力的内在机制 ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/re-parameterization/:7:0","tags":["Deep Learning","轻量级网络"],"title":"Re-parameterization","uri":"/posts/deeplearning/light-weight/re-parameterization/"},{"categories":["Deep Learning"],"content":"RepGhost 文章标题:RepGhost: A Hardware-Efficient Ghost Module via Re-parameterization 作者:Chengpeng Chen, Zichao Guo, Haien Zeng, Pengfei Xiong, Jian Dong 发表时间:( 2022) 官方源码 为了在轻量级 CNN 架构设计中有效地利用特征重用,本文提出了一种新的视角,通过结构重新参数化技术隐式实现特征重用,而不是广泛使用但效率低下的串联操作。通过这种技术,提出了一种用于隐式特征重用的新颖且硬件高效的 RepGhost 模块。所提出的 RepGhost 模块在训练时融合来自不同层的特征,并在推理前在权重空间中执行融合过程,从而产生用于快速推理的简化且硬件高效的架构。基于 RepGhost 模块,我们开发了一个名为 RepGhostNet 的硬件高效轻量级 CNN,它在移动设备的准确性 - 延迟权衡方面展示了多项视觉任务的最新技术水平。 ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/re-parameterization/:8:0","tags":["Deep Learning","轻量级网络"],"title":"Re-parameterization","uri":"/posts/deeplearning/light-weight/re-parameterization/"},{"categories":["Deep Learning"],"content":"ShuffleNet","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/shufflenet/","tags":["Deep Learning","轻量级网络"],"title":"ShuffleNet","uri":"/posts/deeplearning/light-weight/shufflenet/"},{"categories":["Deep Learning"],"content":"ShuffleNetV1 文章标题:Shufflenet: An extremely efficient convolutional neural network for mobile devices 作者:Xiangyu Zhang,Xinyu Zhou,Mengxiao Lin ,Jian Sun ,Megvii Inc (Face++) 发表时间:(CVPR 2018) 旷视官方开源ShuffleNet代码;pytorch代码 ShuffleNet V1,ShuffleNet主要包含两个新型的结构:分组逐点卷积(pointwise group conv)和通道重排(channel shuffle)。作者们通过使用分组的1x1卷积在MobileNet的基础上进一步减少参数, 同时为了保障分组后的Channelwise的信息交换, 作者们引入了ChannelShuffle这一操作, 将channel重新排列, 使得下次分组卷积的每一组特征图都含有来自上次卷积的各组特征图. 再引入残差连接之外, ShuffleNetV1也通过连接下采样的输入在降低分辨率的同时扩张通道数, 同时也没有引入新的参数。 ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/shufflenet/:1:0","tags":["Deep Learning","轻量级网络"],"title":"ShuffleNet","uri":"/posts/deeplearning/light-weight/shufflenet/"},{"categories":["Deep Learning"],"content":"Related work ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/shufflenet/:2:0","tags":["Deep Learning","轻量级网络"],"title":"ShuffleNet","uri":"/posts/deeplearning/light-weight/shufflenet/"},{"categories":["Deep Learning"],"content":"Group Convolution 每个卷积核不再处理所有输入通道,而只是处理一部分通道。 CondenseNet: An Efficient DenseNet using Learned Group Convolutions ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/shufflenet/:2:1","tags":["Deep Learning","轻量级网络"],"title":"ShuffleNet","uri":"/posts/deeplearning/light-weight/shufflenet/"},{"categories":["Deep Learning"],"content":"Approach ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/shufflenet/:3:0","tags":["Deep Learning","轻量级网络"],"title":"ShuffleNet","uri":"/posts/deeplearning/light-weight/shufflenet/"},{"categories":["Deep Learning"],"content":"Channel Shuffle for Group Convolutions Channel Shuffle (a) GConv虽然能够减少参数与计算量,但 GConv中不同组之间信息没有交流。 (b)(c) 通道重排 def shuffle_channels(x, groups): \"\"\" Parameters x: Input tensor of with `channels_last` data format groups: int number of groups per channel Returns channel shuffled output tensor Examples Example for a 1D Array with 3 groups \u003e\u003e\u003e d = np.array([0,1,2,3,4,5,6,7,8]) \u003e\u003e\u003e x = np.reshape(d, (3,3)) \u003e\u003e\u003e x = np.transpose(x, [1,0]) \u003e\u003e\u003e x = np.reshape(x, (9,)) '[0 1 2 3 4 5 6 7 8] --\u003e [0 3 6 1 4 7 2 5 8]' \"\"\" \"\"\"shuffle channels of a 4-D Tensor\"\"\" batch_size, channels, height, width = x.size() assert channels % groups == 0 channels_per_group = channels // groups # split into groups x = x.view(batch_size, groups, channels_per_group, height, width) # transpose 1, 2 axis x = x.transpose(1, 2).contiguous() # reshape into orignal x = x.view(batch_size, channels, height, width) return x ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/shufflenet/:3:1","tags":["Deep Learning","轻量级网络"],"title":"ShuffleNet","uri":"/posts/deeplearning/light-weight/shufflenet/"},{"categories":["Deep Learning"],"content":"ShuffleNet unit ShuffleNetV1-block (a) ResNet网络中$1\\times1$卷积理论计算量占据93.4% $1\\times1$卷积换成GConv,第一个进行通道重排 (b) stride =1 (c) stride =2;进行Concat拼接 Feature Map的尺寸为$w\\times h \\times c$;bottleneck的通道数为$m$。 ResNet:$F_{ResNet}=hw(1\\times1\\times c\\times m)+hw(3\\times3\\times m\\times m)+hw(1\\times1\\times m\\times c)=hw(2cm+9m^2)$ ResNeXt:$F_{ResNeXt}=hw(1\\times1\\times c\\times m)+hw(3\\times3\\times m\\times m)/g+hw(1\\times1\\times m\\times c)=hw(2cm+9m^2/g)$ ShuffleNet:$F_{ShuffleNet}=hw(1\\times1\\times c\\times m)/g+hw(3\\times3\\times m)+hw(1\\times1\\times m\\times c)/g=hw(2cm/g+9m)$ 结构的第一个block的第一个point conv用的普通卷积 shuffle block:如下图所示,整个block首先通过如下图所示的group操作减少网络参数,并对group操作后输出的特征图作shuffle操作,用以消除由于group造成的特征屏蔽现象,紧接着再跟一个group操作。 ![shuffle block1](ShuffleNet.assets/shuffle block1.png) ShuffleNetV1 architecture ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/shufflenet/:3:2","tags":["Deep Learning","轻量级网络"],"title":"ShuffleNet","uri":"/posts/deeplearning/light-weight/shufflenet/"},{"categories":["Deep Learning"],"content":"拓展阅读 旷视科技官网 知乎:如何看待 Face++ 旷视科技出品的轻量高效网络 ShuffleNet ? ShuffleNet V1/V2 | 轻量级深层神经网络 ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/shufflenet/:4:0","tags":["Deep Learning","轻量级网络"],"title":"ShuffleNet","uri":"/posts/deeplearning/light-weight/shufflenet/"},{"categories":["Deep Learning"],"content":"ShuffleNetV2 文章标题:ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design 作者:Ningning Ma, Xiangyu Zhang, Hai-Tao Zheng, Jian Sun 发表时间:(ECCV 2018) ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/shufflenet/:5:0","tags":["Deep Learning","轻量级网络"],"title":"ShuffleNet","uri":"/posts/deeplearning/light-weight/shufflenet/"},{"categories":["Deep Learning"],"content":"高效网络设计的四个指导原则 G1:当输入和输出的通道数相同时,conv计算所需的MAC最小(保持FLOPs不变); Validation experiment for Guideline 1,Input image size is 56 × 56. MobileNetV2瓶颈结构违背了G1。 FLOPS计算量: $B=1\\times 1\\times c_1\\times h \\times w\\times c_2=hwc_1c_2$; $c_1$:输入通道数;$c_2$:输出通道数 $$ \\begin{equation} \\begin{split} MAC \u0026= h\\times w\\times c_1+h \\times w \\times c_2+1\\times 1\\times c_1 \\times c_2\\\\ \u0026=hw(c_1+c_2)+c_1c_2\\\\ \u0026=B(\\frac{1}{c_1}+\\frac{1}{c_2})+\\frac{B}{hw}\\\\ \u0026\\geq 2\\sqrt{hwB}+\\frac{B}{hw} (当且仅当c_1=c_时,等号成立) \\end{split} \\end{equation} $$ $$ \\sqrt{\\frac{{c_1}^2+{c_2}^2}{2}}\\geq \\frac{c_1+c_2}{2}\\geq \\sqrt{c_1c_2}\\geq \\frac{2}{\\frac{1}{c_1}+\\frac{1}{c_2}} $$ G2:大量的分组卷积会增加MAC开销(保持FLOPs不变); Validation experiment for Guideline 2,Input image size is 56 × 56. $$ \\begin{equation} \\begin{split} MAC \u0026=hw(c_1+c_2)+c_1c_2/g\\\\ \u0026=hwc_1+\\frac{Bg}{c_1}+\\frac{B}{hw}\\\\ \\end{split} \\end{equation} $$$$ B = \\frac{hwc_1c_2}{g} $$ 固定输入尺寸和计算量:$g$越大,MAC越大 ShuffleNet V1 严重依赖分组卷积,这违反了 G2 G3:网络结构的碎片化会减少其可并行优化的程度; GoogleNet系列和NASNet中很多分支进行不同的卷积/pool计算非常碎片,对硬件运行很不友好; Validation experiment for Guideline 3,Input image size is 56 × 56 2-fragment-series表示一个block中有2个卷积层串行,也就是简单的叠加; 4-fragment-parallel表示一个block中有4个卷积层并行,类似Inception的整体设计。 可以看出在相同FLOPs的情况下,单卷积层(1-fragment)的速度最快。 在 GPU 上碎片结构会大大降低运算速度,而在 CPU 上则不是那么明显 G4:Element-wise操作不可忽视; 对延时影响很大,包括Add/Relu/short-cut/depthwise convolution等,主要是因为这些操作计算与内存访问的占比太小; Validation experiment for Guideline 4 采用的是Resnet50的瓶颈结构(bottleneck),分别去掉其中的 ReLU 和跳跃连接,然后测试它们各自的运行速度。可以看到无论是去掉其中哪一个操作,运行速度都会加快。 ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/shufflenet/:6:0","tags":["Deep Learning","轻量级网络"],"title":"ShuffleNet","uri":"/posts/deeplearning/light-weight/shufflenet/"},{"categories":["Deep Learning"],"content":"ShuffleNet V2 ShuffleNetV2-block 去掉了分组卷积(G2)的操作,去掉了Add(G4)操作,换成两个分支拼接(Concat)起来,从而通道数量保持不变 (G1),然后进行与ShuffleNetV1相同的Channel Shuffle操作来保证两个分支间能进行信息交流。 ShuffleNetV2 architecture ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/shufflenet/:7:0","tags":["Deep Learning","轻量级网络"],"title":"ShuffleNet","uri":"/posts/deeplearning/light-weight/shufflenet/"},{"categories":["Deep Learning"],"content":"拓展阅读 知乎:如何评价shufflenet V2? 知乎:轻量级神经网络“巡礼”(一)—— ShuffleNetV2 知乎:如何看待 Face++ 旷视科技出品的轻量高效网络 ShuffleNet ? 51CTO博客 ECCV 2018 | 旷视科技提出新型轻量架构ShuffleNet V2:从理论复杂度到实用设计准则 轻量级神经网络:ShuffleNetV2解读 torchstate-计算神经网络各层参数量和计算量 常见pytorch模型的参数量和MAC ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/shufflenet/:8:0","tags":["Deep Learning","轻量级网络"],"title":"ShuffleNet","uri":"/posts/deeplearning/light-weight/shufflenet/"},{"categories":["Deep Learning"],"content":"SqueezeNet","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/squeezenet/","tags":["Deep Learning","轻量级网络"],"title":"SqueezeNet","uri":"/posts/deeplearning/light-weight/squeezenet/"},{"categories":["Deep Learning"],"content":"SqueezeNet 文章标题:SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and \u003c0.5MB model size 作者: 发表时间:(ICLR 2016) ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/squeezenet/:1:0","tags":["Deep Learning","轻量级网络"],"title":"SqueezeNet","uri":"/posts/deeplearning/light-weight/squeezenet/"},{"categories":["Deep Learning"],"content":"SqueezeNext 文章标题:SqueezeNext: Hardware-Aware Neural Network Design 作者: 发表时间:(CVPR 2018) 作者提出了三种策略来实现在保持精度的情况下大大减少当时主流模型(以AlexNet为例)的计算量和参数量: 将模型中一部分的3×3卷积用1×1来代替,1×1卷积是3×3参数量和计算量的1/9,所以可以大大减少参数量和计算量; 减少3×3卷积的输入通道数,这个可以通过在进入3×3卷积之前加一个1×1卷积来实现通道数量的减少; 将下采样层的位置往后推,使得模型可以在更大的feature map上进行更多的学习,这一步虽然会在增加计算量,但是和上面两个策略结合可以在维持模型精度的情况下仍大大减少参数量和计算量; fire module 这个fire module由squeeze部分和expand部分构成,squeeze部分是1×1的卷积层,而expand部分是1×1的卷积和3×3的卷积拼接起来的,每次feature map输入这个fire module会在squeeze层降低通道数,然后在expand通道增加通道数,从而在参数量更少的情况下仍然可以得到充分的学习。 最后结合一些模型压缩的方法可以使得SqueezeNet在达到AlexNet同等精度的情况下,参数量减少到后者的1/50,计算量减少到后者的1/510。 squeezenext 以 squeezenet为baseline。 Low Rank Filters,将3×3卷积分解为:3×1+1×3,实现低秩滤波器从而减少网络参数。 SqueezeNext Block提出了如图所示先利用两个1×1卷积核进行降维减少输入通道数,再通过两个低秩滤波器,最后通过1×1卷积再升维。 采用shortcut connection——ResNet经典结构 在multi-processor embedded system上进行实验,并通过实验结果指导网络的设计,使网络inference时速度更快 ","date":"2023-06-03","objectID":"/posts/deeplearning/light-weight/squeezenet/:2:0","tags":["Deep Learning","轻量级网络"],"title":"SqueezeNet","uri":"/posts/deeplearning/light-weight/squeezenet/"},{"categories":["Deep Learning"],"content":"AlexNet","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/alexnet/","tags":["Deep Learning","图像分类模型"],"title":"AlexNet","uri":"/posts/deeplearning/image-classification/alexnet/"},{"categories":["Deep Learning"],"content":"AlexNet 文章标题:ImageNet Classification with Deep Convolutional Neural Networks 作者:Alex Krizhevsky, Ilya Sutskever, Hinton 发表时间:(NIPS 2012) 论文主页 AlexNet是2012年ImageNet图像分类竞赛冠军,首次将深度学习和卷积神经网络用于大规模图像数据集分类,比之前的模型有巨大的性能飞跃,在ILSVRC-2012图像分类竞赛中获得了top-5误差15.3%的冠军成绩,远远优于第二名(top-5错误率为26.2%),在学术界和工业界引起巨大轰动,自此之后,计算机视觉开始广泛采用深度卷积神经网络,模型性能日新月异,并迁移泛化到目标检测、语义分割等其它计算机视觉任务。 AlexNet的作者之一Hinton因为在神经网络和计算机视觉的贡献,获得2019年图灵奖。 AlexNet采用了ReLU激活函数、双GPU模型并行、LRN局部响应归一化、重叠最大池化、数据增强、Dropout正则化等技巧。 AlexNet包含五个卷积层,池化层,Dropout层和三个全连接层,最终通过1000个输出神经元进行softmax分类。 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/alexnet/:1:0","tags":["Deep Learning","图像分类模型"],"title":"AlexNet","uri":"/posts/deeplearning/image-classification/alexnet/"},{"categories":["Deep Learning"],"content":"The Dataset ImageNet 网络对每张图片给出五个预测l类别结果概率从高到低 Top1:概率最高的预测类别为正确标签 Top5:五个预测类别里包含正确标莶 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/alexnet/:2:0","tags":["Deep Learning","图像分类模型"],"title":"AlexNet","uri":"/posts/deeplearning/image-classification/alexnet/"},{"categories":["Deep Learning"],"content":"The Architecture ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/alexnet/:3:0","tags":["Deep Learning","图像分类模型"],"title":"AlexNet","uri":"/posts/deeplearning/image-classification/alexnet/"},{"categories":["Deep Learning"],"content":"ReLu激活函数 在AlexNet中用的非线性非饱和函数是$f=max(0,x)$,即ReLU。实验结果表明,要将深度网络训练至training error rate达到25%的话,ReLU只需5个epochs的迭代,但tanh单元需要35个epochs的迭代,用ReLU比tanh快6倍。 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/alexnet/:3:1","tags":["Deep Learning","图像分类模型"],"title":"AlexNet","uri":"/posts/deeplearning/image-classification/alexnet/"},{"categories":["Deep Learning"],"content":"多GPU模型并行 为提高运行速度和提高网络运行规模,作者采用双GPU的设计模式。并且规定GPU只能在特定的层进行通信交流。其实就是每一个GPU负责一半的运算处理。作者的实验数据表示,two-GPU方案会比只用one-GPU跑半个上面大小网络的方案,在准确度上提高了1.7%的top-1和1.2%的top-5。值得注意的是,虽然one-GPU网络规模只有two-GPU的一半,但其实这两个网络其实并非等价的。 由反向传播原理,显存中不仅存储模型参数还需存储正向传播时每一层batch的中间结果。batch size越大,占显存越大。 双GPU(全参数)的训练时间比单GPU(半参数)更短;单GPU(半参数)模型中最后一个卷积层和全连接层数量和双GPU(全参数)模型相同,因此“半参数”并非真的只有一半的参数。 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/alexnet/:3:2","tags":["Deep Learning","图像分类模型"],"title":"AlexNet","uri":"/posts/deeplearning/image-classification/alexnet/"},{"categories":["Deep Learning"],"content":"LRN局部响应归一化 $$ b_{x,y}^i=a_{x,y}^i/(k+\\alpha \\sum_{j=max(0,i-n/2)}^{min(N-1,i+n/2)}(a_{x,y}^j)^2)^\\beta $$ $a_{x,y}^i$:代表在feature map中第$i$个通道上$(x,y)$位置上的值;k常数防止分母为0; N:feature map 通道数(本层卷积核个数)n:表示相邻的几个卷积核。 $(k,\\alpha,\\beta,n)=(0,1,1,N)$代表普通沿所有通道归一化; $(k,\\alpha,\\beta,n)=(2,10^{-4},0.75,5)$AlexNet所用参数他们的值是在验证集上实验得到的。 这种归一化操作实现了某种形式的横向抑制(兴奋的神经元对周围神经元有抑制作用)。 卷积核矩阵的排序是随机任意,并且在训练之前就已经决定好顺序。这种LRN形成了一种侧向抑制机制。 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/alexnet/:3:3","tags":["Deep Learning","图像分类模型"],"title":"AlexNet","uri":"/posts/deeplearning/image-classification/alexnet/"},{"categories":["Deep Learning"],"content":"Overlapping Pooling 池层是相同卷积核领域周围神经元的输出。池层被认为是由空间距离s个像素的池单元网格的组成。也可以理解成以大小为步长对前面卷积层的结果进行分块,对块大小为的卷积映射结果做总结。Pooling单元在总结提取特征的时候,其输入会受到相邻pooling单元的输入影响,也就是提取出来的结果可能是有重复的(对max pooling而言)。而且,实验表示使用带交叠的Pooling的效果比的传统要好,在top-1和top-5上分别提高了0.4%和0.3%,在训练阶段有避免过拟合的作用。 后来的paper不采用这种方法 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/alexnet/:3:4","tags":["Deep Learning","图像分类模型"],"title":"AlexNet","uri":"/posts/deeplearning/image-classification/alexnet/"},{"categories":["Deep Learning"],"content":"总体结构 网络结构详解 conv1–\u003eReLu–\u003ePool–\u003eLRN;conv2–\u003eReLu–\u003ePool–\u003eLRN;conv3–\u003eReLu;conv4–\u003eReLu;conv5–\u003eReLu–\u003ePool AlexNet网络结构 输入数据为$227\\times227\\times3$图像,通过Conv1,卷积核为,$11\\times11$,卷积核个数为96个,步长为4,Padding为2,输出特征图$55\\times55\\times96$。$N=(W-F+2P)/s+1=[227-11+(2\\times2)]/4+1=55$ 输入数据为$224\\times224\\times3$图像,通过Conv1,卷积核为$11\\times11$,卷积核个数为96个,步长为4,Padding为[1:2](左上各补1个0,右下各补2个0);输出特征图$55\\times55\\times96$。$N=(W-F+2P)/s+1=[224-11+(1+2)]/4+1=55$ tuple:(1,2) # 1代表上下方各补一行零 # 2代表左右两侧各补两列零 nn.ZeroPad2d((1,2,1,2)) # 左侧补一列,右侧补两列((z)) # 上方补一行,下方补两行 Maxpooling1滑动窗口$3\\times3\\times96$,步长为2,padding:0 ;输出特征图$27\\times27\\times96$。$N=(W-F+2P)/s+1=[55-3+0)]/2+1=27$ 池化操作只改变特征图大小,不改变深度。 通过Conv2,卷积核为$5\\times5$,卷积核个数为256个,步长为1,Padding为2,输出特征图$27\\times27\\times256。$$N=(W-F+2P)/s+1=[27-5+(2\\times2)]/1+1=27$ Maxpooling2滑动窗口$3\\times3\\times256$,步长为2,padding:0 ;输出特征图$13\\times13\\times256$。$N=(W-F+2P)/s+1=[27-3+0)]/2+1=13$ 通过Conv3,卷积核为$3\\times3$,卷积核个数为384个,步长为1,Padding为1,输出特征图$13\\times13\\times384$ 。 $N=(W-F+2P)/s+1=[13-3+(2\\times1)]/1+1=13$ 通过Conv4,卷积核为$3\\times3$,卷积核个数为384个,步长为1,Padding为1,输出特征图$13\\times13\\times384$ 。 $N=(W-F+2P)/s+1=[13-3+(2\\times1)]/1+1=13$ 通过Conv5,卷积核为$3\\times3$,卷积核个数为256个,步长为1,Padding为1,输出特征图$13\\times13\\times256$ 。 $N=(W-F+2P)/s+1=[13-3+(2\\times1)]/1+1=13$ Maxpooling3滑动窗口$3\\times3\\times256$,步长为2,padding:0 ;输出特征图$6\\times6\\times256$。$N=(W-F+2P)/s+1=[13-3+0)]/2+1=6$ FC6:$6\\times6\\times256$进行扁平化处理成为$1\\times 9216$,用一个维度为$9216\\times4096$矩阵完成输入输出的全连接,输出$1\\times 4096$ FC7:用一个维度为$4096\\times4096$矩阵完成输入输出的全连接,输出$1\\times 4096$ FC8:用一个维度为$4096\\times1000$矩阵完成输入输出的全连接,输出$1\\times 1000$ ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/alexnet/:3:5","tags":["Deep Learning","图像分类模型"],"title":"AlexNet","uri":"/posts/deeplearning/image-classification/alexnet/"},{"categories":["Deep Learning"],"content":"Reducing Overfiting ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/alexnet/:4:0","tags":["Deep Learning","图像分类模型"],"title":"AlexNet","uri":"/posts/deeplearning/image-classification/alexnet/"},{"categories":["Deep Learning"],"content":"Data Augmentation数据增强 针对==位置==: 训练阶段:随机地从$256\\times256$的原始图像中截取$224\\times224$大小的区域(水平翻转及镜像),相当于增加了$2*(256-224)^2=2048$倍的数据量。 如果没有数据增强,仅靠原始的数据量,参数众多的CNN会陷入过拟合中,使用了数据增强后可以大大减轻过拟合,提升泛化能力。 测试阶段:取图片的四个角加中间共5个位置,并进行左右翻转,一共获得10张图片,对他们进行预测并对10次结果求均值 针对==颜色==: 对图像的RGB数据进行PCA处理,并对主成分做一个标准差为0.1的高斯扰动,增加些噪声,(修改RGB通道像素值)这个 Trick可以让错误率再下降1%。 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/alexnet/:4:1","tags":["Deep Learning","图像分类模型"],"title":"AlexNet","uri":"/posts/deeplearning/image-classification/alexnet/"},{"categories":["Deep Learning"],"content":"Dropout 随机失活 随机:dropout probability (eg: p=0.5) 失活:weight = 0 训练阶段:每一个batch随机失活一半的神经元(将神经元输出设置为0)阻断该神经元的前向-反向传播。 预测阶段:保留所有神经元,预测结果乘以0.5 。 Dropout减少过拟合的理由 模型集成 p=0.5意味着$2^n$个共享权重的潜在网络 记忆随即抹去 减少神经元之间的联合依赖性 有性繁殖 每个基因片段都要与来自另一个随即个体的基因片段协同工作 数据增强 总可以找到一个图片使神经网络中间层结果与Dropout后相同 相当于增加了这张图片到数据集里 稀疏性 等价于正则项 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/alexnet/:4:2","tags":["Deep Learning","图像分类模型"],"title":"AlexNet","uri":"/posts/deeplearning/image-classification/alexnet/"},{"categories":["Deep Learning"],"content":"拓展阅读 AlexNet – ImageNet Classification with Deep Convolutional Neural Networks LRN与BN的区别 AlexNet的CUDA代码实现 重叠池化为何能防止过拟合 NORB数据集 Caltech-101数据集 Caltech-256数据集 Caltech-256数据集(Kaggle) 标注工具LabelMe(主页) 标注工具LabelMe(Github) ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/alexnet/:5:0","tags":["Deep Learning","图像分类模型"],"title":"AlexNet","uri":"/posts/deeplearning/image-classification/alexnet/"},{"categories":["Deep Learning"],"content":"ZFNet 文章标题:Visualizing and Understanding Convolutional Networks 作者:Matthew D Zeiler,Rob Fergus 发表时间:(CVPR 2013) 纽约大学ZFNet,2013年ImageNet图像分类竞赛冠军模型。提出了一系列可视化卷积神经网络中间层特征的方法,并巧妙设置了对照消融实验,从各个角度分析卷积神经网络各层提取的特征及对变换的敏感性。 使用反卷积deconvnet,将中间层feature map投射重构回原始输入像素空间,便于可视化每个feature map捕获的特征。 改进AlexNet模型,减小卷积核尺寸,减小步长,增加卷积核,提出ZFNet。 训练过程中不同层特征演化可视化。 图像平移、缩放、旋转敏感性分析。 图像局部遮挡敏感性分析(遮挡同一张狗脸图像的不同部位,分析结果变化)。 图像局部遮挡相关性敏感性分析(遮挡不同狗脸的同一部位,分析相关性)。 ZFNet在ImageNet2012图像分类竞赛结果。 模型迁移学习泛化到其它数据集的性能分析:Caltech-101、Caltech-256、PASCAL VOC2012。 去除全连接层和卷积层后模型性能分析。 模型各层特征对分类任务的有效性分析。 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/alexnet/:6:0","tags":["Deep Learning","图像分类模型"],"title":"AlexNet","uri":"/posts/deeplearning/image-classification/alexnet/"},{"categories":["Deep Learning"],"content":"拓展阅读 原作者讲解视频(视频中有几页ppt播放顺序错误) ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/alexnet/:7:0","tags":["Deep Learning","图像分类模型"],"title":"AlexNet","uri":"/posts/deeplearning/image-classification/alexnet/"},{"categories":["Deep Learning"],"content":"AutoML","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/automl/","tags":["Deep Learning","图像分类模型"],"title":"AutoML","uri":"/posts/deeplearning/image-classification/automl/"},{"categories":["Deep Learning"],"content":"NIR 文章标题:Neural Inheritance Relation Guided One-Shot Layer Assignment Search 作者:Rang Meng, Weijie Chen, Di Xie, Yuan Zhang, Shiliang Pu 发表时间:(AAAI 2020) 相同FLOPs里,各个stage里层数的调整;层数搜索单独摘出来,更加存粹的探究神经网络结构之间的关系,并希望网络结构关系的思路,可以给NAS社区带来一些insight,进一步得到更加通用的高效NAS方案。 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/automl/:1:0","tags":["Deep Learning","图像分类模型"],"title":"AutoML","uri":"/posts/deeplearning/image-classification/automl/"},{"categories":["Deep Learning"],"content":"RegNet 文章标题:Designing Network Design Spaces 作者:Ilija Radosavovic, Raj Prateek Kosaraju, Ross Girshick, Kaiming He, Piotr Dollár 发表时间:(CVPR 2020) 官方源码 RegNet 是由 facebook 于 2020 年提出,旨在深化设计空间理念的概念,在 AnyNetX 的基础上逐步改进,通过加入共享瓶颈 ratio、共享组宽度、调整网络深度与宽度等策略,最终实现简化设计空间结构、提高设计空间的可解释性、改善设计空间的质量,并保持设计空间的模型多样性的目的。最终设计出的模型在类似的条件下,性能还要优于 EfficientNet,并且在 GPU 上的速度提高了 5 倍。 我们发现最佳模型的深度在计算机制(~20 个块)中是稳定的,并且最佳模型不使用瓶颈或倒置瓶颈 输入是一个初始设计空间,输出是一个细化的设计空间,其中每个设计步骤的目的是发现能够产生更简单或性能更好的模型群体的设计原 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/automl/:2:0","tags":["Deep Learning","图像分类模型"],"title":"AutoML","uri":"/posts/deeplearning/image-classification/automl/"},{"categories":["Deep Learning"],"content":"AnyNet Design Space stem,body,head 在该设计空间中,网络的主体就是由三部分组成(stem,body,head)。其中stem和head是固定不变的,stem就是一个普通的卷积层(默认包含bn以及relu),卷积核大小为3x3,步距为2,卷积核个数为32,head就是分类网络中常见的分类器,由一个全局平均池化层和全连接层构成。所以网络中最主要的就是body部分,body是由4个stage堆叠组成,而stage是由一系列block堆叠组成。但block的详细结构以及参数并没有做任何限制,这就是AnyNet AnyNetX(A) 由图可知,主分支都是一个1x1的卷积(包括bn和relu)、一个3x3的group卷积(包括bn和relu)、再接一个1x1的卷积(包括bn)。shortcut捷径分支上当stride=1时不做任何处理,当stride=2时通过一个1x1的卷积(包括bn)进行下采样。图中的r代表分辨率简单理解为特征矩阵的高、宽,当步距s等于1时,输入输出的r保持不变,当s等于2时,输出的r为输入的一半。w代表特征矩阵的channel(注意当s=2时,输入的是w i − 1 w_{i-1} AnyNetX(B) ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/automl/:3:0","tags":["Deep Learning","图像分类模型"],"title":"AutoML","uri":"/posts/deeplearning/image-classification/automl/"},{"categories":["Deep Learning"],"content":"拓展阅读 自动驾驶系列论文解读(一):RegNet——颠覆NAS的AutoML文章 RegNet网络结构与搭建 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/automl/:4:0","tags":["Deep Learning","图像分类模型"],"title":"AutoML","uri":"/posts/deeplearning/image-classification/automl/"},{"categories":["Deep Learning"],"content":"ConvNeXt","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/convnext-/","tags":["Deep Learning","图像分类模型"],"title":"ConvNeXt","uri":"/posts/deeplearning/image-classification/convnext-/"},{"categories":["Deep Learning"],"content":"ConvNeXt 文章标题:A ConvNet for the 2020s 作者:Zhuang Liu, Hanzi Mao, Chao-Yuan Wu, Christoph Feichtenhofer, Trevor Darrell, Saining Xie 发表时间:2022 Official Code ResNet的Transformer版 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/convnext-/:1:0","tags":["Deep Learning","图像分类模型"],"title":"ConvNeXt","uri":"/posts/deeplearning/image-classification/convnext-/"},{"categories":["Deep Learning"],"content":"Modernizing a ConvNet: a Roadmap路线图 深色为 ResNet-50/Swin-T;灰色为ResNet-200/Swin-B;阴影为未修改 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/convnext-/:2:0","tags":["Deep Learning","图像分类模型"],"title":"ConvNeXt","uri":"/posts/deeplearning/image-classification/convnext-/"},{"categories":["Deep Learning"],"content":"Detailed Architectures ConvNeXt Detailed results for modernizing a ResNet-50 Detailed results for modernizing a ResNet-200 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/convnext-/:2:1","tags":["Deep Learning","图像分类模型"],"title":"ConvNeXt","uri":"/posts/deeplearning/image-classification/convnext-/"},{"categories":["Deep Learning"],"content":"macro design 宏观设计 Changing stage compute ratio (78.8%—\u003e79.4%) 每个stage的block数量:(3,4,6,3)-\u003e(3,3,9,3) 和为Swin-T的stage(1,1,3,1)一致。 Changing stem to “Patchify” (79.4%—\u003e79.5%) 输入224;经历stem,导致$4\\times$下采样成56;卷积计算: $ (W-F+2P)/s+1$ 传统:stride=2的$7\\times7$卷积(padding为3)—\u003estride=2的$3\\times3$max pooling(padding为1) $(224-7+2\\cdot3)/2+1=112–\u003e(112-3+2)/2=56$(pytorch向下取整) Swin-T:stride=4的$4\\times4$卷积 $(224-4)/4+1=56$ ConvNeXt :stride=4的$4\\times4$卷积 # 标准ResNet stem = nn.Sequential( nn.Conv2d(in_chans, dims[0], kernel_size=7, stride=2,padding=3), nn.MaxPool2d(kernel_size=3, stride=2, padding=1) ) # ConvNeXt stem = nn.Sequential( nn.Conv2d(in_chans, dims[0], kernel_size=4, stride=4), LayerNorm(dims[0], eps=1e-6, data_format=\"channels_first\") ) ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/convnext-/:2:2","tags":["Deep Learning","图像分类模型"],"title":"ConvNeXt","uri":"/posts/deeplearning/image-classification/convnext-/"},{"categories":["Deep Learning"],"content":"ResNeXt (79.4%—\u003e80.5%) Use more groups, expand width 使用更多的组,扩大宽度 bottleneck的$3\\times3$卷积—\u003edepthwise conv(组数等于通道数) 将网络宽度增加到与Swin-T的通道数量相同(从64到96) ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/convnext-/:2:3","tags":["Deep Learning","图像分类模型"],"title":"ConvNeXt","uri":"/posts/deeplearning/image-classification/convnext-/"},{"categories":["Deep Learning"],"content":"Inverted bottleneck (80.5%—\u003e80.6%) Block modifications and resulted specifications (a) ResNeXt block; (b) inverted bottleneck block ; (c) b的深度卷积位置上移 d=4(维度系数) ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/convnext-/:2:4","tags":["Deep Learning","图像分类模型"],"title":"ConvNeXt","uri":"/posts/deeplearning/image-classification/convnext-/"},{"categories":["Deep Learning"],"content":"large kernel size 使用[c图](###Inverted bottleneck (80.5%—\u003e80.6%))深度卷积位置上移后的倒残差结构 (退化到79.9%) 使用$7\\times7$卷积 (79.9% (3×3) —\u003e 80.6%) (7×7) ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/convnext-/:2:5","tags":["Deep Learning","图像分类模型"],"title":"ConvNeXt","uri":"/posts/deeplearning/image-classification/convnext-/"},{"categories":["Deep Learning"],"content":"various layer-wise micro designs各种层级的微观设计 ConvNeXt_Block class Block(nn.Module): r\"\"\" ConvNeXt Block. There are two equivalent implementations: (1) DwConv -\u003e LayerNorm (channels_first) -\u003e 1x1 Conv -\u003e GELU -\u003e 1x1 Conv; all in (N, C, H, W) (2) DwConv -\u003e Permute to (N, H, W, C); LayerNorm (channels_last) -\u003e Linear -\u003e GELU -\u003e Linear; Permute back We use (2) as we find it slightly faster in PyTorch Args: dim (int): Number of input channels. drop_path (float): Stochastic depth rate. Default: 0.0 layer_scale_init_value (float): Init value for Layer Scale. Default: 1e-6. \"\"\" def __init__(self, dim, drop_path=0., layer_scale_init_value=1e-6): super().__init__() self.dwconv = nn.Conv2d(dim, dim, kernel_size=7, padding=3, groups=dim) # depthwise conv self.norm = LayerNorm(dim, eps=1e-6) self.pwconv1 = nn.Linear(dim, 4 * dim) # pointwise/1x1 convs, implemented with linear layers self.act = nn.GELU() self.pwconv2 = nn.Linear(4 * dim, dim) # gamma的作用是用于做layer scale训练策略 self.gamma = nn.Parameter(layer_scale_init_value * torch.ones((dim)), requires_grad=True) if layer_scale_init_value \u003e 0 else None # drop_path是用于stoch. depth训练策略 self.drop_path = DropPath(drop_path) if drop_path \u003e 0. else nn.Identity() def forward(self, x): input = x x = self.dwconv(x) # 由于用FC来做1x1conv,所以需要调换通道顺序 x = x.permute(0, 2, 3, 1) # (N, C, H, W) -\u003e (N, H, W, C) x = self.norm(x) x = self.pwconv1(x) x = self.act(x) x = self.pwconv2(x) if self.gamma is not None: x = self.gamma * x x = x.permute(0, 3, 1, 2) # (N, H, W, C) -\u003e (N, C, H, W) x = input + self.drop_path(x) return x 用GELU代替RELU (80.6%不变) 和Swin-T一样只用一个GELU (80.6%—\u003e81.3%) 只留下一个BN层(比Swin-T还少:在Block开始添加一个额外的BN层并不能提高性能)(81.3%—\u003e81.4%) 用LN代替BN (81.4%—\u003e81.5%) 直接在ResNet基础上替换成LN,效果并不好。 单独的下采样层 (81.5%—\u003e82%) ResNet:stride=2的$3\\times3$卷积,有残差结构的block则在短路连接中使用stride=2的$1\\times1$卷积 Swin-T:单独采样层 ConvNeXt :stride=2的$2\\times2$卷积 #https://github.com/facebookresearch/ConvNeXt/blob/e4e7eb2fbd22d58feae617a8c989408824aa9eda/models/convnext.py#L72 self.downsample_layers = nn.ModuleList() # stem and 3 intermediate downsampling conv layers stem = nn.Sequential( nn.Conv2d(in_chans, dims[0], kernel_size=4, stride=4), LayerNorm(dims[0], eps=1e-6, data_format=\"channels_first\") ) self.downsample_layers.append(stem) for i in range(3): downsample_layer = nn.Sequential( LayerNorm(dims[i], eps=1e-6, data_format=\"channels_first\"), nn.Conv2d(dims[i], dims[i+1], kernel_size=2, stride=2), ) self.downsample_layers.append(downsample_layer) ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/convnext-/:2:6","tags":["Deep Learning","图像分类模型"],"title":"ConvNeXt","uri":"/posts/deeplearning/image-classification/convnext-/"},{"categories":["Deep Learning"],"content":"Empirical Evaluations on ImageNet ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/convnext-/:3:0","tags":["Deep Learning","图像分类模型"],"title":"ConvNeXt","uri":"/posts/deeplearning/image-classification/convnext-/"},{"categories":["Deep Learning"],"content":"ConvNeXt 变体配置 ConvNeXt 系列 C_channels B_stage_blocks IN-1K top-1 acc_input_224 ConvNeXt-T (96,192,384,768) (3,3,9,3) 82.1 ConvNeXt-S (96,192,384,768) (3,3,27,3) 83.1 ConvNeXt-B (128,256,512,1024) (3,3,27,3) 83.8 ConvNeXt-L (192,384,768,1536) (3,3,27,3) 84.3 ConvNeXt-XL (256,512,1024,2048) (3,3,27,3) IN-22K pre-trained-87.0 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/convnext-/:3:1","tags":["Deep Learning","图像分类模型"],"title":"ConvNeXt","uri":"/posts/deeplearning/image-classification/convnext-/"},{"categories":["Deep Learning"],"content":"Training Techniques ImageNet-1K (Pre)-training config ResNet50(standard) ResNet50(timm) ResNet50(torchvision) ConvNeXt-T optimizer SGD LAMB SGD AdamW base learning rate 0.1 5e-3 0.5 4e-3 weight decay 1e-4 0.01 2e-5 0.05 optimizer momentum 0.9 - 0.9 $\\beta_1,\\beta_2=0.9,0.999$ batch size $8\\times32=256$ $4\\times512=2048$ $8\\times128=1024$ $4\\times8\\times128=4096$ training epochs 90 600 600 300 learning rate schedule StepLR (step=30,gamma=0.1) cosine decay cosine decay cosine decay warmup epochs - 5 5 20 warmup schedule - linear linear linear The effective batch size = --nodes * --ngpus * --batch_size * --update_freq. In the example above, the effective batch size is 4*8*128*1 = 4096 数据增强 (Pre)-training config ResNet50(standard) ResNet50(timm) ResNet50(torchvision) ConvNeXt-T Mixup - 0.2 0.2 0.8 Cutmix - 1.0 1.0 1.0 RandAugment - (7,0.5) auto_augment=‘ta_wide’ (9,0.5) 正则化 (Pre)-training config ResNet50(standard) ResNet50(timm) ResNet50(torchvision) ConvNeXt-T Stochastic Depth - 0.05 - 0.1 Label Smoothing - 0.1 0.1 0.1 Layer Scale - - - 1e-6 EMA - - 0.99998 0.9999 Top-1 acc (Pre)-training config ResNet50(standard) ResNet50(timm) ResNet50(torchvision) ConvNeXt-T Top-1 acc 75.3 80.4 80.674 82.1 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/convnext-/:3:2","tags":["Deep Learning","图像分类模型"],"title":"ConvNeXt","uri":"/posts/deeplearning/image-classification/convnext-/"},{"categories":["Deep Learning"],"content":"拓展阅读 ConvNeXt:手把手教你改模型 ResNet strikes back: An improved training procedure in timm How to Train State-Of-The-Art Models Using TorchVision’s Latest Primitives ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/convnext-/:4:0","tags":["Deep Learning","图像分类模型"],"title":"ConvNeXt","uri":"/posts/deeplearning/image-classification/convnext-/"},{"categories":["Deep Learning"],"content":"EfficientNet","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/efficientnet/","tags":["Deep Learning","图像分类模型"],"title":"EfficientNet","uri":"/posts/deeplearning/image-classification/efficientnet/"},{"categories":["Deep Learning"],"content":"EfficientNet 文章标题:EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks 作者:Mingxing Tan, Quoc V. Le 发表时间:(ICML 2019) Official Code EfficientNet PyTorch EfficientNet 是一组针对FLOPs和参数效率进行优化的模型。它利用NAS搜索基线EfficientNet-B0,它在准确性和FLOPs方面有更好的权衡。然后使用复合缩放策略对基线模型进行缩放,以获得一系列模型B1-B7。 Model_Scaling Scaling Up EfficientNet-B0 with Different Methods 增加网络的深度depth能够得到更加丰富、复杂的特征并且能够很好的应用到其它任务中。但网络的深度过深会面临梯度消失,训练困难的问题。(ResNet) 增加网络的width能够获得更高细粒度的特征并且也更容易训练,但对于width很大而深度较浅的网络往往很难学习到更深层次的特征。(Inception) 增加输入网络的图像分辨率能够潜在得获得更高细粒度的特征模板,但对于非常高的输入分辨率,准确率的增益也会减小,并且大分辨率图像会增加计算量。 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/efficientnet/:1:0","tags":["Deep Learning","图像分类模型"],"title":"EfficientNet","uri":"/posts/deeplearning/image-classification/efficientnet/"},{"categories":["Deep Learning"],"content":"Scaling Dimensions Scaling Up a Baseline Model with Different Network Width (w), Depth (d), and Resolution (r) Coefficient 扩大网络中深度、宽度或者分辨率的任一维度能提高模型的准确率,但随着模型的扩大,这种准确率的增益效果会逐步消失; Scaling up any dimension of network width, depth, or resolution improves accuracy, but the accuracy gain diminishes for bigger models. Scaling Network Width for Different Baseline Networks $(d=1.0,r=1.0)$:18个卷积层,分辨率为$224\\times224$ $(d=2.0,r=1.3)$:36个卷积层,分辨率为$299\\times299$ 为了更好的准确率和效率,很有必要去平衡提升网络中深度、宽度和分辨率的所有维度。 In order to pursue better accuracy and efficiency, it is critical to balance all dimensions of network width, depth, and resolution during ConvNet scaling. ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/efficientnet/:2:0","tags":["Deep Learning","图像分类模型"],"title":"EfficientNet","uri":"/posts/deeplearning/image-classification/efficientnet/"},{"categories":["Deep Learning"],"content":"Problem Formulation $N = \\bigodot_{i=1…s} F_i^{L_i}(X_{\u003cH_i,W_i,C_i\u003e})$ $ \\bigodot_{i=1…s}$:连乘运算 $F_i$表示一个运算操作;$F_i^{L_i}$表示在第i个stage中$F_i$运算被重复执行了$L_i$次 $X$表示第i个stage的特征矩阵(输入张量) $\u003cH_i,W_i,C_i\u003e$表示$X$的高宽和通道数 $max_{d,w,r} \\ \\ Accuracy(N(d,w,r))$ 其中 $N(d,w,r)=\\bigodot_{i=1…s} \\hat F_i^{\\hat L_i}(X_{\u003cr\\dot {\\hat H_i},r\\dot {\\hat W_i},w\\dot {\\hat C_i}\u003e})$ $Memory(N)\\leq target_memory$ $ FLOPS(N)\\leq target_flops$ $d$用来缩放深度$\\hat {L_i}$ $r$用来缩放分辨率即影响$\\hat{H_i},\\hat{W_i}$ $w$用来缩放特征矩阵的通道数$\\hat{C_i}$ $$ depth:d=\\alpha^\\phi\\\\ width:w=\\beta^\\phi\\\\ resolution:r=\\gamma^\\phi\\\\ s.t. \\ \\alpha \\cdot \\beta^2\\cdot\\gamma^2 \\approx2\\\\ \\alpha\\geq1,\\beta\\geq1,\\gamma\\geq1 $$ FLOPs(理论计算量)与depth的关系:当depth翻倍,FLOPs也翻倍。 FLOPs与width的关系:当width翻倍(即channal翻倍),FLOPs会翻4倍 当width翻倍,输入特征矩阵的channels和输出特征矩阵的channels或卷积核的个数都会翻倍,所以FLOPs会翻4倍 FLOPs与resolution的关系:当resolution翻倍,FLOPs会翻4倍 总的FLOPs倍率可以用近似用$(\\alpha \\cdot \\beta^{2} \\cdot \\gamma^{2})^{\\phi}$表示 :$\\beta^2:c_i,c_o;\\gamma^2:h,w$ 固定$\\phi=1$,基于上述约束条件进行搜索,EfficientNet_B0的最佳参数为$\\alpha=1.2,\\beta=1.1.\\gamma=1.15$。 固定$\\alpha=1.2,\\beta=1.1.\\gamma=1.15$,在EfficientNetB-0的基础上使用不同的$ \\phi$分别得到EfficientNetB1-B7。 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/efficientnet/:3:0","tags":["Deep Learning","图像分类模型"],"title":"EfficientNet","uri":"/posts/deeplearning/image-classification/efficientnet/"},{"categories":["Deep Learning"],"content":"EfficientNet Architecture EfficientNet-B0 baseline network 其中卷积层后默认都有BN以及Swish激活函数 MBConv 第一个升维的$1\\times1$卷积层,它的卷积核个数是输入特征矩阵channel的n倍(这里的n对应Operator里的MBConvn) 当n=1时,不要第一个升维的$1\\times1$卷积层,即Stage2中的MBConv结构都没有第一个升维的1x1卷积层(这和MobileNetV3网络类似) 关于shortcut连接,仅当输入MBConv结构的特征矩阵与输出的特征矩阵shape相同时才存在 注意:在源码中只有使用到shortcut的MBConv模块才有Dropout层;Dropout层的drop_rate是从0递增到0.2的,是Stochastic Depth,即会随机丢掉整个block的主分支(只剩捷径分支,相当于直接跳过了这个block)也可以理解为减少了网络的深度。 SE 由一个全局平均池化,两个全连接层组成。 第一个全连接层的节点个数是输入该MBConv模块的特征矩阵channels的1/4(MobileNetV3是feature map的channels的1/4),且使用Swish激活函数。 第二个全连接层的节点个数等于Depthwise Conv层输出的特征矩阵channels,且使用Sigmoid激活函数。 model width_coefficient depth_coefficient resolution dropout_rate efficientnet-b0 1.0 1.0 224 0.2 efficientnet-b1 1.0 1.1 240 0.2 efficientnet-b2 1.1 1.2 260 0.3 efficientnet-b3 1.2 1.4 300 0.3 efficientnet-b4 1.4 1.8 380 0.4 efficientnet-b5 1.6 2.2 456 0.4 efficientnet-b6 1.8 2.6 528 0.5 efficientnet-b7 2.0 3.1 600 0.5 efficientnet-b8 2.2 3.6 672 0.5 efficientnet-12 4.3 5.3 800 0.5 width_coefficient代表channel维度上的倍率因子 比如在 EfficientNetB0中Stage1的$3\\times3$卷积层所使用的卷积核个数是32,那么在B6中就是$32 \\times 1.8=57.6$,接着取整到离它最近的8的整数倍即56,其它Stage同理。 depth_coefficient代表depth维度上的倍率因子(仅针对Stage2到Stage8) 比如在EfficientNetB0中Stage7的$ {\\widehat L}_i=4 $那么在B6中就是$4 \\times 2.6=10.4$,接着向上取整即11。 dropout_rate是最后一个全连接层前的dropout层(在stage9的Pooling与FC之间)的dropout_rate。 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/efficientnet/:4:0","tags":["Deep Learning","图像分类模型"],"title":"EfficientNet","uri":"/posts/deeplearning/image-classification/efficientnet/"},{"categories":["Deep Learning"],"content":"拓展阅读 EfficientNet网络详解 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/efficientnet/:5:0","tags":["Deep Learning","图像分类模型"],"title":"EfficientNet","uri":"/posts/deeplearning/image-classification/efficientnet/"},{"categories":["Deep Learning"],"content":"EfficientNetV2 文章标题:EfficientNetV2: Smaller Models and Faster Training 作者:Mingxing Tan, Quoc V. Le 发表时间:(ICML 2021) Official Code PyTorch Image Models EfficientNetV1的训练瓶颈 大图像尺寸导致了大量的内存使用,训练速度非常慢。 解决方法:降低训练图像的尺寸 深度卷积在网络浅层(前期)中速度缓慢,但在后期阶段有效。(无法充分利用现有的一些加速器) 解决方法:引入Fused-MBConv结构 同等的扩大每个stage是次优的。在EfficientNetV1中,每个stage的深度和宽度都是同等放大的。但每个stage对网络的训练速度以及参数数量的贡献并不相同 解决方法:非均匀的缩放策略来缩放模型 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/efficientnet/:6:0","tags":["Deep Learning","图像分类模型"],"title":"EfficientNet","uri":"/posts/deeplearning/image-classification/efficientnet/"},{"categories":["Deep Learning"],"content":"Fused-MBConv Structure of MBConv and Fused-MBConv. 源码没有使用SE模块 Replacing MBConv with Fused-MBConv 只替换stage1-3,用NAS搜索出来的结果 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/efficientnet/:7:0","tags":["Deep Learning","图像分类模型"],"title":"EfficientNet","uri":"/posts/deeplearning/image-classification/efficientnet/"},{"categories":["Deep Learning"],"content":"EfficientNetV2 Architecture EfficientNetV2_S 与EfficientNetV1的不同点 除了使用MBConv模块,还使用Fused-MBConv模块 会使用较小的expansion ratio 偏向使用更小的kernel_size($3\\times3$) 移除了EfficientNetV1中最后一个步距为1的stage(V1中的stage8) #################### EfficientNet V2 configs #################### # r代表当前Stage中Operator重复堆叠的次数 # k代表kernel_size # s代表步距stride # e代表expansion ratio # i代表input channels # o代表output channels # c代表conv_type,1代表Fused-MBConv,0代表MBConv(默认为MBConv) # se代表使用SE模块,以及se_ratio v2_base_block = [ # The baseline config for v2 models. 'r1_k3_s1_e1_i32_o16_c1', 'r2_k3_s2_e4_i16_o32_c1', 'r2_k3_s2_e4_i32_o48_c1', 'r3_k3_s2_e4_i48_o96_se0.25', 'r5_k3_s1_e6_i96_o112_se0.25', 'r8_k3_s2_e6_i112_o192_se0.25', ] v2_s_block = [ # about base * (width1.4, depth1.8) 'r2_k3_s1_e1_i24_o24_c1', 'r4_k3_s2_e4_i24_o48_c1', 'r4_k3_s2_e4_i48_o64_c1', 'r6_k3_s2_e4_i64_o128_se0.25', 'r9_k3_s1_e6_i128_o160_se0.25', 'r15_k3_s2_e6_i160_o256_se0.25', ] v2_m_block = [ # about base * (width1.6, depth2.2) 'r3_k3_s1_e1_i24_o24_c1', 'r5_k3_s2_e4_i24_o48_c1', 'r5_k3_s2_e4_i48_o80_c1', 'r7_k3_s2_e4_i80_o160_se0.25', 'r14_k3_s1_e6_i160_o176_se0.25', 'r18_k3_s2_e6_i176_o304_se0.25', 'r5_k3_s1_e6_i304_o512_se0.25', ] v2_l_block = [ # about base * (width2.0, depth3.1) 'r4_k3_s1_e1_i32_o32_c1', 'r7_k3_s2_e4_i32_o64_c1', 'r7_k3_s2_e4_i64_o96_c1', 'r10_k3_s2_e4_i96_o192_se0.25', 'r19_k3_s1_e6_i192_o224_se0.25', 'r25_k3_s2_e6_i224_o384_se0.25', 'r7_k3_s1_e6_i384_o640_se0.25', ] efficientnetv2_params = { # (block, width, depth, train_size, eval_size, dropout, randaug, mixup, aug) 'efficientnetv2-s': # 83.9% @ 22M (v2_s_block, 1.0, 1.0, 300, 384, 0.2, 10, 0, 'randaug'), 'efficientnetv2-m': # 85.2% @ 54M (v2_m_block, 1.0, 1.0, 384, 480, 0.3, 15, 0.2, 'randaug'), 'efficientnetv2-l': # 85.7% @ 120M (v2_l_block, 1.0, 1.0, 384, 480, 0.4, 20, 0.5, 'randaug'), } ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/efficientnet/:8:0","tags":["Deep Learning","图像分类模型"],"title":"EfficientNet","uri":"/posts/deeplearning/image-classification/efficientnet/"},{"categories":["Deep Learning"],"content":"progressive learning渐进式学习 在训练早期,先对图像尺寸小且正则化程度较弱的网络进行训练(如dropout、data augmentation),然后逐渐增大图像尺寸并加入更强的正则化。 建立在渐进调整大小的基础上,但通过动态(自适应)调整正则化(Dropout, Rand Augment, Mixup) EfficientNetV2_Algorithm_1. ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/efficientnet/:9:0","tags":["Deep Learning","图像分类模型"],"title":"EfficientNet","uri":"/posts/deeplearning/image-classification/efficientnet/"},{"categories":["Deep Learning"],"content":"拓展阅读 EfficientNetV2网络详解 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/efficientnet/:10:0","tags":["Deep Learning","图像分类模型"],"title":"EfficientNet","uri":"/posts/deeplearning/image-classification/efficientnet/"},{"categories":["Deep Learning"],"content":"Inception","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/inception/","tags":["Deep Learning","图像分类模型"],"title":"Inception","uri":"/posts/deeplearning/image-classification/inception/"},{"categories":["Deep Learning"],"content":"策略:split-transform-merge ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/inception/:0:0","tags":["Deep Learning","图像分类模型"],"title":"Inception","uri":"/posts/deeplearning/image-classification/inception/"},{"categories":["Deep Learning"],"content":"InceptionV1(GoogLeNet) 文章标题:Going Deeper with Convolutions 作者:Christian Szegedy,Wei Liu,Yangqing Jia,Pierre Sermanet,Scott Reed,Dragomir Anguelov,Dumitru Erhan,Vincent Vanhoucke,Andrew Rabinovich 发表时间:(CVPR 2015) GoogLeNet深度卷积神经网络结构,及其后续变种Inception-V1、Inception-V2-Inception-V3、Inception-V4。 使用Inception模块,引入并行结构和不同尺寸的卷积核,提取不同尺度的特征,将稀疏矩阵聚合为较为密集的子矩阵,大大提高计算效率,降低参数数量。加入辅助分类器,实现了模型整合、反向传播信号放大。 GoogLeNet在ILSVRC-2014图像分类竞赛中获得了top-5误差6.7%的冠军成绩。 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/inception/:1:0","tags":["Deep Learning","图像分类模型"],"title":"Inception","uri":"/posts/deeplearning/image-classification/inception/"},{"categories":["Deep Learning"],"content":"Introduction 启发文献 Network In Network $1\\times1$卷积降维-升维 Global Average pooling层取代全连接层 Provable Bounds for Learning Some Deep Representations 用稀疏、分散的网络取代以前庞大密集臃肿的网络 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/inception/:2:0","tags":["Deep Learning","图像分类模型"],"title":"Inception","uri":"/posts/deeplearning/image-classification/inception/"},{"categories":["Deep Learning"],"content":"Related work LeNet,AlexNet,ZFNet,NiN,overfeat ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/inception/:3:0","tags":["Deep Learning","图像分类模型"],"title":"Inception","uri":"/posts/deeplearning/image-classification/inception/"},{"categories":["Deep Learning"],"content":"Motivation and High Level Considerations 提高模型性能的传统方法: 增加深度(层数) 增加宽度(卷积核个数)适用于大规模标注好的数据集 产生的问题: 标注成本高 计算效率问题 两个相连卷积层,两层同步增加卷积核个数,计算量将平方增加 如果很多权重训练后接近0,这部分计算就被浪费掉了 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/inception/:4:0","tags":["Deep Learning","图像分类模型"],"title":"Inception","uri":"/posts/deeplearning/image-classification/inception/"},{"categories":["Deep Learning"],"content":"GoogLeNet 原始Inception模块通道数越来越多,计算量爆炸。 原始Inception 每个 Inception 结构有 4 个分支,主要包含 1x1, 3x3, 5x5 卷积核和 max pooling 操作的步长为1,以保持输出特征层的尺寸与卷积核输出尺寸一致。1x1 卷积核的作用是降维,以避免 cancatenation 操作导致特征层过深,并减少网络参数. 优化Inception 太过密集压缩的嵌入向量不便于模型处理;只在$3\\times3$、$5\\times5$卷积层之前用$1\\times1$降维。 GoogLeNet #$3\\times3$reduce:$3\\times3$卷积之前的$1\\times1$卷积,其他reduce同理 pool proj:池化后的$1\\times1$卷积 所有卷积使用relu激活函数 GAP:全局平均池化 一个channel用一个平均值代表取代全连接层,减少参数量。 便于fine-tune迁移学习 提升了0.6%的TOP1准确率 原始输入图像为224x224x3,且都进行了零均值化的预处理操作(图像每个像素减去均值)。 Stem Network:Conv-Pool-2x Conv-Pool(底层先用普通卷积层,后面用9个Inception模块叠加) Classifier output(removed expensive FC layers!) Auxiliary classification outputs to inject additional gradient at lower layers (AvgPool-1x1Conv-FC-FC-Softmax) 在4a和4d后面加辅助分类层 改善梯度消失 正则化 让浅层也能学习到区分特征 其实没太大用处,在v2/v3版本去掉 训练时损失函数:$L=L_{最后}+0.3\\times L_{辅1}+0.3\\times L_{辅2}$ 测试阶段:去掉辅助分类器 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/inception/:5:0","tags":["Deep Learning","图像分类模型"],"title":"Inception","uri":"/posts/deeplearning/image-classification/inception/"},{"categories":["Deep Learning"],"content":"Training Methodlogy 数据并行:一个batch均分k份,让不同节点前向和反向传播,再由中央param sever优化更新权重 asynchronous stochastic gradient descent:异步随机梯度下降 图像增强 裁剪为原图8%-100%之间,宽高比3/4和4/3之间; 等概率使用不同插值方法(双线性,区域,最近邻,三次函数) 裁剪: 将原图缩放为短边长度256,288,320,352的四个尺度 每个尺度裁剪出左中右(或上中下)三张小图 每张小图取四个角和中央的五张$224\\times224$的patch以及每张小图缩放为$224\\times224$,共6个patch同时取镜像 $4\\times3\\times6\\times2=144$个patch ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/inception/:6:0","tags":["Deep Learning","图像分类模型"],"title":"Inception","uri":"/posts/deeplearning/image-classification/inception/"},{"categories":["Deep Learning"],"content":"拓展阅读 ILSVRC2014竞赛汇报 博客 Hebbian原则理解 1、网络更容易过拟合,当数据集不全的时候,过拟合更容易发生,于是我们需要为网络feed大量的数据,但是制作样本集本身就是一件复杂的事情。 2、大量需要更新的参数就会导致需要大量的计算资源,而当下即使硬件快速发展,这样庞大的计算也是很昂贵的 解决以上问题的根本方法就是把全连接的网络变为稀疏连接(卷积层其实就是一个稀疏连接),当某个数据集的分布可以用一个稀疏网络表达的时候就可以通过分析某些激活值的相关性,将相关度高的神经元聚合,来获得一个稀疏的表示。 这种方法也呼应了Hebbian principle,一个很通俗的现象,先摇铃铛,之后给一只狗喂食,久而久之,狗听到铃铛就会口水连连。这也就是狗的“听到”铃铛的神经元与“控制”流口水的神经元之间的链接被加强了,而Hebbian principle的精确表达就是如果两个神经元常常同时产生动作电位,或者说同时激动(fire),这两个神经元之间的连接就会变强,反之则变弱(neurons that fire together, wire together) 可视化GoogLeNet ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/inception/:7:0","tags":["Deep Learning","图像分类模型"],"title":"Inception","uri":"/posts/deeplearning/image-classification/inception/"},{"categories":["Deep Learning"],"content":"InceptionV2(BN-Inception) 文章标题:Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift 作者:Sergey Ioffe, Christian Szegedy 发表时间:(ICML 2015) ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/inception/:8:0","tags":["Deep Learning","图像分类模型"],"title":"Inception","uri":"/posts/deeplearning/image-classification/inception/"},{"categories":["Deep Learning"],"content":"Abstract 训练慢和困难:internal covariate shift (ICS) 低学习率;参数初始化 Batch Normalization 加快模型训练速度;加速收敛 可以使用更高的学习率;参数初始化 和当前最好的分类网络相比训练步骤降低14倍 具有一定正则化作用 在某些情况,减少Dropout的使用 使模型效果更好(并不是所有模型用了BN 就会更好) top-5:4.9%; test error:4.8% 超过了人工评分的准确性。 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/inception/:9:0","tags":["Deep Learning","图像分类模型"],"title":"Inception","uri":"/posts/deeplearning/image-classification/inception/"},{"categories":["Deep Learning"],"content":"Introduction 使用mini-batch 小批量的损失梯度是对训练集上梯度的估计,其质量随着批量大小的增加而提高。 现代计算平台提供的并行性,批处理的计算比单个示例的m次计算效率要高得多。 internal covariate shift:在深度学习网络的训练过程中网络内部结点的分布变化称为内部协变量偏移 每一层数据的微小变化都会随着网络一层一层的传递而被逐渐放大。 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/inception/:10:0","tags":["Deep Learning","图像分类模型"],"title":"Inception","uri":"/posts/deeplearning/image-classification/inception/"},{"categories":["Deep Learning"],"content":"Normalization via Mini-Batch Statistics 白化(Whitening):对输入数据分布进行变换 使得输入特征分布具有相同的均值与方差。其中PCA白化保证了所有特征分布均值为0,方差为1;而ZCA白化则保证了所有特征分布均值为0,方差相同; 去除特征之间的相关性。 白化过程计算成本太高 白化过程由于改变了网络每一层的分布,因而改变了网络层中本身数据的表达能力 Batch Normalization:简化白化 单独对每个特征标准化参数,使其具有零均值和单位方差。 引入了两个可学习的参数$\\gamma$与$\\beta$,这两个参数的引入是为了恢复数据本身的表达能力,对规范化后的数据进行线性变换。 对全连接层,作用在特征维 对卷积层,作用在通道维 如果batch size为$m$,则在前向传播过程中,网络中每个节点都有$m$个输出,Batch Normalization就是对该层每个节点的这$m$个输出进行归一化再输出,具体计算方式如下: BN Standardization:首先对$m$个$x$进行 Standardization,得到 zero mean unit variance的分布$\\hat x$。 scale and shift:然后再$ \\hat x$对进行scale and shift,缩放并平移到新的分布$y$,具有新的均值$\\beta$方差$\\gamma$。 $$ y_i^{(b)}=BN_{(x_i)^{(b)}}=\\gamma \\cdot (\\frac{x_i^{(b)}-\\mu(x_i)}{\\sqrt {\\sigma(x_i)^2+\\epsilon}})+\\beta $$ $\\mu$和$\\sigma$为该行的均值和标准差,$\\epsilon$为防止除零引入的极小量(可忽略) $\\gamma$和$\\beta$为scale和shift参数,以提高表现力 BN反向传播 BN反向传播 Understanding the backward pass through Batch Normalization Layer 训练阶段计算的是每一个batch的均值和方差,但是测试时用的是训练后的(指数加权平均)的均值和方差(吴恩达) ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/inception/:11:0","tags":["Deep Learning","图像分类模型"],"title":"Inception","uri":"/posts/deeplearning/image-classification/inception/"},{"categories":["Deep Learning"],"content":"Inception V2 Architecture InceptionV2 网络结构图 与Inception V1对比 Inception V1的$5\\times 5$卷积被替换成两个$3\\times 3$ 使网络的最大深度增加9层。增加了25%的参数,计算成本增加了约30%。 Inception(3X)(特征图为$28\\times28$)模块从2个变成3个。(Inception3a,b——\u003eInception3a,b,c) 模块内部有时使用平均池化,有时使用最大池化 模块3c, 4e的过滤器连接之前使用了stride-2卷积/池化层。 在第一层卷积层上采用深度乘子8的可分离卷积。 减少了计算成本,同时增加了训练时的内存消耗。 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/inception/:12:0","tags":["Deep Learning","图像分类模型"],"title":"Inception","uri":"/posts/deeplearning/image-classification/inception/"},{"categories":["Deep Learning"],"content":"拓展阅读 Batch Normalization详解以及pytorch实验 Batch Normalization详解 Understanding the backward pass through Batch Normalization Layer 深入解读Inception V2之Batch Normalization(附源码) Batch Normalization原理与实战 理解Batch Normalization系列3——为什么有效及11个问题 Batch-normalized 应该放在非线性激活层的前面还是后面? How Does Batch Normalization Help Optimization? BN层让损失函数更平滑。通过分析训练过程中每步梯度方向上步长变化引起的损失变化范围、梯度幅值的变化范围、光滑度的变化,认为添加BN层后,损失函数的landscape(loss surface)变得更平滑,相比高低不平上下起伏的loss surface,平滑loss surface的梯度预测性更好,可以选取较大的步长。 对比了标准VGG以及加了BN层的VGG每层分布随训练过程的变化,发现两者并无明显差异,认为BatchNorm并没有改善 Internal Covariate Shift。 How Does Batch Normalization Help Optimization? (No, It Is Not About Internal Covariate Shift) An empirical analysis of the optimization of deep network loss surfaces BN更有利于梯度下降。绘制了VGG和NIN网络在有无BN层的情况下,loss surface的差异,包含初始点位置以及不同优化算法最终收敛到的local minima位置。没有BN层的,其loss surface存在较大的高原,有BN层的则没有高原,而是山峰,因此更容易下降。 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/inception/:13:0","tags":["Deep Learning","图像分类模型"],"title":"Inception","uri":"/posts/deeplearning/image-classification/inception/"},{"categories":["Deep Learning"],"content":"InceptionV3 文章标题:Rethinking the Inception Architecture for Computer Vision Pytorch官方Inception-V3 作者:Szegedy, Christian, et al 发表时间:(CVPR 2016) 本论文在GoogLeNet和BN-Inception的基础上,对Inception模块的结构、性能、参数量和计算效率进行了重新思考和重新设计。提出了Inception V2和Inception V3模型,取得了3.5%左右的Top-5错误率。 Inception V3具有强大的图像特征抽取和分类性能,是常用的迁移学习主干网络基模型。 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/inception/:14:0","tags":["Deep Learning","图像分类模型"],"title":"Inception","uri":"/posts/deeplearning/image-classification/inception/"},{"categories":["Deep Learning"],"content":"General Design Principles通用设计原则(建议) 避免过度降维或收缩特征Bottleneck(避免过度的1 x 1卷积,特别是在网络浅层) feature map的长宽大小应该随网络加深缓慢减小 降维会造成各通道间的相关性信息丢失,仅反应了致密的嵌入信息 独立的特征越多收敛越快(尽可能在分类层之前增加通道数) 相互独立特征越多,输入的信息分解的越彻底 Hebbin原理 大卷积核卷积之前可用1x1卷积降维(3x3或5x5卷积之前可先用1x1卷积降维,可保留相邻单元的强相关性) 大尺度卷积:聚合空间信息大感受野 相邻感受野的卷积结果 邻近单元的强相关性在降维过程中信息损失很少 均衡网络的宽度和深度 两者同时提升,既可以提升性能,也能提升计算效率 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/inception/:15:0","tags":["Deep Learning","图像分类模型"],"title":"Inception","uri":"/posts/deeplearning/image-classification/inception/"},{"categories":["Deep Learning"],"content":"Factorizing Convolutions with Large Filter Size卷积分解 $5\\times5$卷积分解成2个$3\\times3$卷积;减少参数数量 分解卷积是否会影响模型表达能力? 是否需保留第一层的非线性激活函数? 增加非线性可学习空间增强了 Inception Module A $3\\times3$卷积分解成$3\\times1$卷积和$1\\times3$卷积非对称(空间可分离卷积) $n\\times n$卷积分解成$n \\times1$卷积和$1\\times n$卷积 n越大,节省的运算量越大 Inception Module B 不对称卷积分解在靠前的层效果不好,适用于feature map尺寸在12-20之间 拓展滤波器组(加宽网络,升维)在最后分类层之前,用该模块拓展特征维度,生成高维稀疏特征。 Inception Module C ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/inception/:16:0","tags":["Deep Learning","图像分类模型"],"title":"Inception","uri":"/posts/deeplearning/image-classification/inception/"},{"categories":["Deep Learning"],"content":"Utility of Auxiliary Classifiers辅助分类器 提出辅助分类器并不能帮助模型更快收敛和更快的特征演化。 增加了BN层和Dropout层的辅助分类器可以起到正则化作用。 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/inception/:17:0","tags":["Deep Learning","图像分类模型"],"title":"Inception","uri":"/posts/deeplearning/image-classification/inception/"},{"categories":["Deep Learning"],"content":"Efficient Grid Size Reduction高效下采样技巧 先卷积再池化(计算量大) 先卷积再池化 步长为2的卷积(大量信息丢失,违反原则1) 步长为2的卷积化 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/inception/:18:0","tags":["Deep Learning","图像分类模型"],"title":"Inception","uri":"/posts/deeplearning/image-classification/inception/"},{"categories":["Deep Learning"],"content":"Inception V3 Inception-v3 的 PyTorch 版本 Google Cloud 上的 Inception-v3 InceptionV3 figure5:$5\\times5$卷积分解成2个$3\\times3$卷积 figure6:空间卷积可分离卷积 figure7:拓展滤波器组 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/inception/:19:0","tags":["Deep Learning","图像分类模型"],"title":"Inception","uri":"/posts/deeplearning/image-classification/inception/"},{"categories":["Deep Learning"],"content":"Label Smoothing标签平滑 When Does Label Smoothing Help? 标签平滑的目的是防止最大的 logit 变得比所有其他 logit 大得多 new_labels = (1 — ε) * one_hot_labels + ε / K 其中 ε 是 0.1,这是一个超参数,K 是 1000,这是类的数量。在分类器层观察到的一种dropout效应。 标签用one-hot独热编码 $$ L=-\\sum_{i=1}^{k}q_ilog{p_i}=-log{p_y}=-z_y+log{(\\sum_{i=1}^k e^{z_i})} $$可能导致过拟合 它鼓励最大的逻辑单元与所有其它逻辑单元之间的差距变大,与有界限的梯度∂ℓ/∂zk相结合,这会降低模型的适应能力。 $$ z^* = \\begin{cases} \\log{\\frac{(k-1)(1-\\varepsilon)}{\\varepsilon}} + \\alpha \u0026 \\text{if } i = y \\\\ \\alpha \u0026 \\text{if } i \\neq y \\end{cases} $$ ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/inception/:20:0","tags":["Deep Learning","图像分类模型"],"title":"Inception","uri":"/posts/deeplearning/image-classification/inception/"},{"categories":["Deep Learning"],"content":"拓展阅读 在 Cloud TPU 上运行 Inception v3 的高级指南 博客 Label Smooth的Pandas实现小例子 知乎:神经网络中的label smooth为什么没有火? ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/inception/:21:0","tags":["Deep Learning","图像分类模型"],"title":"Inception","uri":"/posts/deeplearning/image-classification/inception/"},{"categories":["Deep Learning"],"content":"InceptionV4 文章标题:Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning 作者:Szegedy C , Ioffe S , Vanhoucke V , et al. 发表时间:(AAAI 2017) 提出了Inception-V4、Inception-ResNet-V1、Inception-ResNet-V2三个模型。 Inception-V4在Inception-V3的基础上进一步改进了Inception模块,提升了模型性能和计算效率。 Inception-V4没有使用残差模块, Inception-ResNet将Inception模块和深度残差网络ResNet结合,提出了三种包含残差连接的Inception模块,残差连接显著加快了训练收敛速度。 Inception-ResNet-V2和Inception-V4的早期stem网络结构相同。 Inception-ResNet-V1和Inception-V3准确率相近,Inception-ResNet-V2和Inception-V4准确率相近。 经过模型集成和图像多尺度裁剪处理后,模型Top-5错误率降低至3.1%。 针对卷积核个数大于1000时残差模块早期训练不稳定的问题,提出了对残差分支幅度缩小的解决方案。 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/inception/:22:0","tags":["Deep Learning","图像分类模型"],"title":"Inception","uri":"/posts/deeplearning/image-classification/inception/"},{"categories":["Deep Learning"],"content":"Inception-V4 V:不使用padding 不加V:same padding 如果padding设置为SAME,则说明输入图片大小和输出图片大小是一致的 InceptionV4 Stem主干网络 InceptionV4_Module_A InceptionV4_Module_B InceptionV4_Module_C 模块A输出Grid Size:$35\\times35$ 模块B输出Grid Size:$17\\times17$ 模块C输出Grid Size:$8\\times8$ ReductionA ReductionB ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/inception/:23:0","tags":["Deep Learning","图像分类模型"],"title":"Inception","uri":"/posts/deeplearning/image-classification/inception/"},{"categories":["Deep Learning"],"content":"Inception-ResNet-V1 性能和InceptionV3相近 带残差模块的Inception Inception之后使用不带激活函数的$1\\times1$卷积:升维拓展filter bank ,匹配输入维度 在相加层之后不做BN,减少计算量。 Inception-ResNet-v1 Stem主干网络 Inception-ResNet-v1_Module_A Inception-ResNet-v1_Module_B Inception-ResNet-v1_Module_C 模块A输出Grid Size:$35\\times35$ 模块B输出Grid Size:$17\\times17$ 模块C输出Grid Size:$8\\times8$ ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/inception/:24:0","tags":["Deep Learning","图像分类模型"],"title":"Inception","uri":"/posts/deeplearning/image-classification/inception/"},{"categories":["Deep Learning"],"content":"Inception-ResNet-V2 Inception-ResNet-V1和Inception-ResNet-V2网络总体结构一样 Inception-ResNet-V2和InceptinV4主干网络一样 性能和InceptionV4相近 Inception-ResNet-V2 Stem主干网络 Inception-ResNet-v2_Module_A Inception-ResNet-v2_Module_B Inception-ResNet-v2_Module_C 模块A输出Grid Size:$35\\times35$ 模块B输出Grid Size:$17\\times17$ 模块C输出Grid Size:$8\\times8$ ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/inception/:25:0","tags":["Deep Learning","图像分类模型"],"title":"Inception","uri":"/posts/deeplearning/image-classification/inception/"},{"categories":["Deep Learning"],"content":"Scaling of the Residuals InceptionV4_Scaling of the Residuals 对残差块输出进行幅度减小 在加法融合之前,对残差分支的结果乘以幅度缩小系数 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/inception/:26:0","tags":["Deep Learning","图像分类模型"],"title":"Inception","uri":"/posts/deeplearning/image-classification/inception/"},{"categories":["Deep Learning"],"content":"拓展阅读 论文版本1 论文版本2 论文版本3 Inception-V4和Inception-Resnet论文阅读和代码解析 Inception-V4源代码 Inception-ResNet-V2源代码 Inception英文综述博客 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/inception/:27:0","tags":["Deep Learning","图像分类模型"],"title":"Inception","uri":"/posts/deeplearning/image-classification/inception/"},{"categories":["Deep Learning"],"content":"Xception 文章标题:Xception: Deep Learning with Depthwise Separable Convolutions 作者:Francois Chollet Github主页 推特 谷歌学术 领英 发表时间:(CVPR 2017) 谷歌Xception,将深度可分离卷积引入Inception模块,实现长宽方向的空间信息和跨通道信息的完全解耦。X代表Extreme,极致。 在ImageNet数据集和JFT数据集两个大规模图像分类任务上,收敛速度、最终准确率都超过Inception V3。 Xception作者为深度学习框架Keras作者François Chollet。在Keras中可调用预训练的Xception模型作为迁移学习的骨干网络。 VGG:经典串行堆叠深度 Inception:拓展多分支宽度,分别处理(解耦)再整合汇总 在 DeeplabV3+ 中,作者将 Xception 做了进一步的改进,同时增加了 Xception 的层数,设计出了 Xception65 和 Xception71 的网络。 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/inception/:28:0","tags":["Deep Learning","图像分类模型"],"title":"Inception","uri":"/posts/deeplearning/image-classification/inception/"},{"categories":["Deep Learning"],"content":"The Cxeption architecture 假设:跨通道信息和长宽方向的空间信息可完全分离解耦 Xception与标准可分离卷积的区别 MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications (2017) 顺序不同 Xception与深度可分离卷积的区别 Xception中使用非线性激活函数ReLu 常规卷积:一个卷积核处理所有通道 深度可分离卷积:一个卷积核处理一个通道 SeperableConv包含$1\\times1$卷积+深度可分离卷积+合并 极限版本:每个$3\\times3$卷积单独处理一个通道 extreme version of Inception module ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/inception/:29:0","tags":["Deep Learning","图像分类模型"],"title":"Inception","uri":"/posts/deeplearning/image-classification/inception/"},{"categories":["Deep Learning"],"content":"Effect of an intermediate activation after pointwise convolutions 非线性激活对空间-通道未解耦时有用 对$1\\times1$卷积后的特征图,非线性激活会导致信息丢失,不利于后续的深度可分离卷积。 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/inception/:30:0","tags":["Deep Learning","图像分类模型"],"title":"Inception","uri":"/posts/deeplearning/image-classification/inception/"},{"categories":["Deep Learning"],"content":"拓展阅读 Keras中的Xception预训练模型 博客 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/inception/:31:0","tags":["Deep Learning","图像分类模型"],"title":"Inception","uri":"/posts/deeplearning/image-classification/inception/"},{"categories":["Deep Learning"],"content":"ResNet","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"ResNet 文章标题:Deep Residual Learning for Image Recognition 代码地址 pytorch版 作者:Kaiming He ,Xiangyu Zhang ,Shaoqing Ren ,Jian Sun,Microsoft Research 发表时间:(CVPR 2016) CVPR论文主页 微软亚洲研究院提出的深度残差网络ResNet,获得2015年ImageNet图像分类、定位、检测,MS COCO竞赛检测、分割五条赛道的冠军,通过引入残差连接,有效解决深层网络训练时的退化问题,可以通过加深网络大大提升性能。 ResNet在ILSVRC-2015图像分类竞赛中获得了top-5误差3.57%的冠军成绩,在图像分类任务上首次超过人类能力。ResNet常用于迁移学习和fine-tuning微调的特征提取的基模型。 提出残差学习结构解决深网络的退化问题和训练问题。 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:1:0","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"Introduction Question: 简单叠加神经网络层可以吗? Phenomenon: 明显的梯度消失/爆炸问题,难以收敛——正则化,适当的权重初始化+Batch Normalization可以加快网络收敛 模型退化问题凸显,准确率饱和 网络退化:深层网络在训练集和测试集上的表现都不如浅层网络 模型退化问题并非过拟合导致,增加深度导致训练集错误率提升 深层网络不能比浅层网络错误率更高——identity mapping恒等映射 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:2:0","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"Residual block 残差路径如何设计? shortcut路径如何设计? Residual Block之间怎么连接? Residual block 残差模块 过去: 直接拟合$H(x)$ 现在: 拟合残差$F(x)=H(x)-x$ shortcut connection:短路连接/捷径连接 既没有引入额外参数,也没有增加计算复杂度 打破了网络对称性,提升网络表征能力 identity mapping:恒等映射 $\\bigoplus$为element-wise addition,要求参与运算的$F(x)$和$x$的尺寸要相同 残差网络 易于优化收敛 解决退化问题 可以很深 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:2:1","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"Related Work 残差表示( Residual Representations): 有效的浅层表示方法: VLAD( vector of locally aggregated descriptors) Fisher Vector: Probabilistic version of VLAD 编码残差向量比编码原始向量表现更好 捷径连接( shortcut Connections): MLP——通过线性层将输入连接到输出 从中间层直接连接到辅助分类器 GoogLeNet——Inception Layer Highway Networks——门控函数扮演残差角色,门控参数由学习得到 Residual Learning——提高信息流效率 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:3:0","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"Deep Residual Learning 传统多层网络难以拟合恒等映射 如果恒等映射已经最优,残差模块只需要拟合零映射 后面的网络只拟合前面网络的输出与期望函数的残差。 [残差块](###Residual block):$y=F(x,{W_i})+x$ $F(x,{W_i})$:需要学习的残差映射,维度与$x$一致 $x$:自身输入 $F+x$:跳跃连接,逐一加和,最后输岀经过激活函数ReLU 没有额外参数,不增加复杂度 $F$包含两个或两个以上网络层,否则表现为线性层$y=W_1x+x$ 如果卷积层后加BN层,则不需要偏置项(期望为0) 残差分支出现下采样(虚线表示) shortcut分支第一个卷积层步长都为2 ResNet_optionB 对多出来的通道padding补零填充 用$1\\times1$卷积升维 普通残差模块,用于ResNet-18/34 bottleneck残差模块,用于ResNet-50/101/152 $1\\times1$卷积,先降维后升维 减少参数量与计算量 block和bottleneck Plain Network(普通无残差网络) ResNet-34与34-layer plain net和VGG对比 ResNet ResNet ResNet中,所有的Residual Block都没有pooling层,降采样是通过conv的stride实现的; 分别在conv3_1、conv4_1和conv5_1 Residual Block,降采样1倍,同时feature map数量增加1倍,如图中虚线划定的block; 通过Average Pooling得到最终的特征,而不是通过全连接层; 每个卷积层之后都紧接着BatchNorm layer。 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:4:0","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"拓展阅读 ResNet解决退化问题机理 深层梯度回传顺畅 恒等映射这一路的梯度是1,把深层梯度注入底层,防止梯度消失。 类比其它机器学习模型 集成学习 boosting,每一个弱分类器拟合“前面的模型与GT之差” 长短时记忆神经网络LSTM的遗忘门。 Relu激活函数。 传统线性结构网络难以拟合“恒等映射” skip connection可以让模型自行选择要不要更新 弥补了高度非线性造成的不可逆的信息损失。( MobileNet v2) ResNet反向传播传回的梯度相关性好 The Shattered Gradients Problem: If resnets are the answer, then what is the question? 网络加深,相邻像素回传回来的梯度相关性越来越低,最后接近白噪声但相邻像素之间具有局部相关性,相邻像素的梯度也应该局部相关。相邻像素不相关的白噪声梯度只意味着随机扰动,并无拟合。 ResNet梯度相关性衰减从$\\frac{1}{2^L}$加为$\\frac{1}{\\sqrt L}$。保持了梯度相关性。 ResNet相当于几个浅层网络的集成 Residual Networks Behave Like Ensembles of Relatively Shallow Networks $2^n$个潜在路径(类似 dropout) 测试阶段去掉某几个残差块,几乎不影响性能 skip connection可以实现不同分辨率特征的组合 FPN、 DenseNet 从非线性动力学系统用离散微分方程解释(鄂维南院士) ResNet数学本质是用微分方程的积分曲线去拟合系统的目标函数;构造了一个平滑的解空间流形,在这个平滑的流形上更容易找到解。 残差网络相当于不同长度的神经网络组成的组合函数;残差模块相当于一个差分放大器 CVPR2016何恺明汇报(2016年6月27日) ILSVRC2015竞赛 COCO2015检测与分割竞赛 论文翻译 孙剑首个深度学习博士张祥雨:3年看1800篇论文,28岁掌舵旷视基础模型研究 PreResNet:Identity Mappings in Deep Residual Networks-2016 代码地址 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:5:0","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"ResNeXt 文章标题:Aggregated Residual Transformations for Deep Neural Networks 作者:Saining Xie, Ross Girshick, Piotr Dollár, Zhuowen Tu, Kaiming He 发表时间:(CVPR 2017) pytorch代码 设计block遵循以下两个规则: 如果输出相同 size 的 spatial map, 那么,block 的 hyper-parameters (即 width 和 filter size) 相同 feature map大小缩减一半,通道数增一倍。 这个规则保证了每个block的计算复杂度几乎一致! ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:6:0","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"ResNeXt Residual Block in ResNet (Left), A Block of ResNeXt with Cardinality = 32 (Right) ResNeXt_block:A layer is denoted as (# input channels, filter size, # output channels) 三种等价 b和Inception V3类似,但b是同构 ResNeXt ResNeXt-50(32x4d):32指进入网络的第一个ResNeXt基本结构的分组数量C(即cardinality基数)为32,4d表示depth即每一个分组的通道数为4(所以第一个基本结构输入通道数为128) ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:7:0","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"拓展阅读 Review: ResNeXt — 1st Runner Up in ILSVRC 2016 (Image Classification) 知乎:薰风读论文:ResNeXt 深入解读与模型实现 Exploring the Limits of Weakly Supervised Pretraining-ECCV-2018 在 2019 年,facebook 通过弱监督学习研究了该系列网络在 ImageNet 上的精度上限,为了区别之前的 ResNeXt 网络,该系列网络的后缀为 wsl,其中 wsl 是弱监督学习(weakly-supervised-learning)的简称。为了能有更强的特征提取能力,研究者将其网络宽度进一步放大,其中最大的 ResNeXt101_32x48d_wsl 拥有 8 亿个参数,将其在 9.4 亿的弱标签图片下训练并在 ImageNet-1k 上做 finetune,最终在 ImageNet-1k 的 top-1 达到了 85.4%。Fix-ResNeXt 中,作者使用了更大的图像分辨率,针对训练图片和验证图片数据预处理不一致的情况下做了专门的 Fix 策略,并使得 ResNeXt101_32x48d_wsl 拥有了更高的精度,由于其用到了 Fix 策略,故命名为 Fix-ResNeXt101_32x48d_wsl。 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:8:0","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"ResNeSt 文章标题:ResNeSt: Split-Attention Networks 作者:Hang Zhang, Chongruo Wu, Zhongyue Zhang, Yi Zhu, Haibin Lin, Zhi Zhang, Yue Sun, Tong He, Jonas Mueller, R. Manmatha, Mu Li, Alexander Smola 发表时间:(2020) Official Code ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:9:0","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"Split-Attention Networks ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:10:0","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"Split-Attention Block featuremap group and split attention operations ResNeSt中每个块将特征图沿着channel维度划分为几个组(groups)和更细粒度的子组(splits),每个组的特征表示是由其splits的表示的加权组合来确定的(根据全局上下文信息来确定权重),将得到的这个单元称之为 Split-Attention block ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:10:1","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"featuremap group 借鉴了ResNeXt网络的思想,将输入分为K个,每一个记为Cardinal1-k ,然后又将每个Cardinal拆分成R个,每一个记为Split1-r,所以总共有G=KR个组 ResNeSt Block ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:10:2","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"Split Attention in Cardinal Groups Split-Attention within a cardinal group 一个cardinal group的组合表示可以通过多个splits按元素求和进行融合来得到,第k个cardinal group为$\\hat U^k$; $$ \\hat U^k=\\sum_{j=R(k-1)+1}^{Rk}U_j\\\\ s_c^k=F_{gp}(\\hat U_c^k)=\\frac{1}{H\\times W}\\sum_{i=1}^H\\sum_{j=1}^W\\hat U_c^k(i,j)\\\\ $$ $\\hat U^k\\in R^{H\\times W\\times C/K}$;$s^k\\in R^{C/K}$;$k\\in1,2,…,K$;$H、W $和$ C $是block输出特征图的大小 $$ a_i^k(c) = \\begin{cases} \\frac{\\exp(G_i^c(s^k))}{\\sum_{j=1}^R \\exp(G_i^c(s^k))} \u0026 \\text{if } R \u003e 1 \\\\ \\frac{1}{1 + \\exp(-G_i^c(s^k))} \u0026 \\text{if } R = 1 \\end{cases} $$ G:注意力权重函数G是两个全连接层(Dense)外加relu激活函数;如果R=1的话就是对该Cardinal中的所有通道视为一个整体 $$ V_c^k=\\sum_{i=1}^R a_i^k(c)U_{R(k-1)+1}\\\\ V =Concat\\{V^1,V^2,...,V^K\\} $$ $\\hat V^k\\in R^{H\\times W\\times C/K}$ ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:10:3","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"Radix-major Split-Attention Block 转换成这一形式是为了便于使用标准的CNN进行加速(像group convolution, group fully connectd layer等) 在 channel 维度上被分为 cardinal 个不同的组,每个组叫 cardinal groups。可以把这个 cardinal group 继续分成 radix 个小组。这样每个组都一个 cardinal 的序号,和 radix 的序号。 代码测试 Radix-major implementation of ResNeSt block 展示出从 cardinality-major 到 radix-major 的变化过程 cardinality-major 中间变换 radix-major Channel/group shuffling equivariant: 全局平均池化层:把 channel 维度任意打乱,再通过全局平均池化层,然后再把 channel 顺序还原 批量归一化 (Batch Normalization):打乱 channel,然后再还原,保证 BN 层的 gamma 和 beta 也相应调整顺序 分组卷积 1x1 卷积与全连接层:1x1 的分组卷积来实现多个并行的全连接层 If several consecutive modules are shuffling-equivariant, then the entire block is shuffling-equivariant. import torch import torch.nn as nn from torch.nn import functional as F # https://github.com/zhanghang1989/ResNeSt/issues/66 class rSoftMax(nn.Module): def __init__(self, radix, cardinality): super().__init__() assert radix \u003e 0 self.radix = radix self.cardinality = cardinality def forward(self, x): batch = x.size(0) if self.radix \u003e 1: x = x.view(batch, self.cardinality, self.radix, -1).transpose(1, 2) x = F.softmax(x, dim=1) x = x.reshape(batch, -1) else: x = torch.sigmoid(x) return x class Splat(nn.Module): def __init__(self, channels, radix, cardinality, reduction_factor=4): super(Splat, self).__init__() self.radix = radix self.cardinality = cardinality self.channels = channels inter_channels = max(channels*radix//reduction_factor, 32) self.fc1 = nn.Conv2d(channels//radix, inter_channels, 1, groups=cardinality) self.bn1 = nn.BatchNorm2d(inter_channels) self.relu = nn.ReLU(inplace=True) self.fc2 = nn.Conv2d(inter_channels, channels*radix, 1, groups=cardinality) self.rsoftmax = rSoftMax(radix, cardinality) def forward(self, x): batch, rchannel = x.shape[:2] if self.radix \u003e 1: splited = torch.split(x, rchannel//self.radix, dim=1) gap = sum(splited) else: gap = x gap = F.adaptive_avg_pool2d(gap, 1) gap = self.fc1(gap) gap = self.bn1(gap) gap = self.relu(gap) atten = self.fc2(gap) atten = self.rsoftmax(atten).view(batch, -1, 1, 1) if self.radix \u003e 1: attens = torch.split(atten, rchannel//self.radix, dim=1) out = sum([att*split for (att, split) in zip(attens, splited)]) else: out = atten * x return out.contiguous() ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:10:4","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"Network and Training Network Tweaks 平均下采样: 对于检测和分割任务,下采样过程对于保持空间信息非常重要,ResNeSt采用的是平均池化方法,使用$3\\times3$的kernel来很好的保持空间信息。 从ResNet-D中学到的策略:用3个$3\\times3$卷积替代一个$7\\times7$卷积;加了一个$2\\times2$的平均池化到skip connection里去。 Training Strategy 大型小批量分布式训练:$\\eta =\\frac{B}{256}\\eta_{base}$; B 為为mini-batch size、base learning rate 设定为0.1;在前五個个epoch 使用 warm-up strategy 逐渐增加 learning rate Label Smoothing Auto Augmentation Mixup Training 则化:可以选择dropout、DropBlock、L2正则化方法。 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:11:0","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"Ablation Study breakdown of improvement radix vs. cardinality under ResNeSt-fast setting 补充来源 setting #P GFLOPs PyTorch Gluon ResNeSt-50-fast 1s1x64d 26.3M 4.34 80.33 80.35 ResNeSt-50-fast 2s1x64d 27.5M 4.34 80.53 80.65 ResNeSt-50-fast 4s1x64d 31.9M 4.35 80.76 80.90 ResNeSt-50-fast 1s2x40d 25.9M 4.38 80.59 80.72 ResNeSt-50-fast 2s2x40d 26.9M 4.38 80.61 80.84 ResNeSt-50-fast 4s2x40d 30.4M 4.41 81.14 81.17 ResNeSt-50-fast 1s4x24d 25.7M 4.42 80.99 80.97 2s2x40d :radix=2, cardinality=2 and width=40 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:12:0","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"拓展阅读 张航-ResNeSt:拆分注意力网络 关于ResNeSt的点滴疑惑 ResNeSt 实现有误? 等价图片来源 https://github.com/zhanghang1989/ResNeSt/issues/74 https://github.com/zhanghang1989/ResNeSt/issues/4 https://github.com/zhanghang1989/ResNeSt/issues/41 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:13:0","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"DenseNet 文章标题:Densely Connected Convolutional Networks 作者:Gao Huang, Zhuang Liu, Laurens van der Maaten, Kilian Q. Weinberger 发表时间:(CVPR 2017) Official Code 稠密连接网络(DenseNet)在某种程度上是ResNet的逻辑扩展 相比 ResNet 中的 bottleneck,dense-block 设计了一个更激进的密集连接机制,即互相连接所有的层,每个层都会接受其前面所有层作为其额外的输入。 DenseNet 将所有的 dense-block 堆叠,组合成了一个密集连接型网络。 密集的连接方式使得 DenseNet更容易进行梯度的反向传播,使得网络更容易训练。 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:14:0","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"Motivation动机 DenseNets 不是从极深或极宽的架构中汲取表征能力,而是通过特征重用来利用网络的潜力。 Q:从输入层到输出层的信息路径(以及相反方向的梯度)变得很大,以至于它们可能在到达另一边之前就消失了。 只需将每一层直接相互连接起来:解决了确保最大信息(和梯度)流动的问题。 每一层都可以直接访问损失函数和原始输入图像的梯度。 缺点:反向传播虽然容易,但是计算复杂 Q:DenseNets 比等效的传统 CNN 需要更少的参数 不需要学习冗余特征图:对于旧的特征图(feature-map)是不需要再去重新学习的 特征重用缺点:训练模型时RAM会爆炸 growth-rate不用设很大,所以减少许多参数。 growth-rate:卷积层中卷积核的数量(k),DenseNet:k=12 卷积块的通道数控制了输出通道数相对于输入通道数的增长,因此也被称为增长率 Memory-Efficient Implementation of DenseNets Memory-Efficient Implementation of DenseNets ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:15:0","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"DenseNet 假如我们有$L$层卷积神经网路,那就有$L$个(层与层之间的)连结。但是DenseNet设计成有$\\frac{L(L+1)}{2}$个连结。 DenseNet ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:16:0","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"稠密块(dense block) 一个稠密块由多个卷积块组成,每个卷积块使用相同数量的输出通道。在前向传播中,我们将每个卷积块的输入和输出在通道维上连结。 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:16:1","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"过渡层(transition layer) 过渡层可以用来控制模型复杂度。通过$1\\times1$卷积层来减小通道数,并使用步幅为2的平均汇聚层减半高和宽,从而进一步降低模型复杂度。 为什么在过渡层使用平均汇聚层而不是最大汇聚层? 参考:平均池化的特点是保留背景信息让每一个信息对最后的输出都有帮助,最大池化的特点是提取特征只保留特征最明显的信息,当我们费劲心力把不同层的信息叠在了一起以后用最大池化等于前面都做了无用功 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:16:2","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"Model DenseNet DenseNet-121是指网络总共有121层:(6+12+24+16)*2 + 3(transition layer) + 1(7x7 Conv) + 1(Classification layer) = 121。 DenseNet首先使用同ResNet一样的单卷积层和最大汇聚层。 类似于ResNet使用的4个残差块,DenseNet使用的是4个稠密块。 在每个模块之间,ResNet通过步幅为2的残差块减小高和宽,DenseNet则使用过渡层来减半高和宽,并减半通道数。 与ResNet类似,最后接上全局汇聚层和全连接层来输出结果。 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:16:3","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"拓展阅读 DenseNet Architecture Explained with PyTorch Implementation from TorchVision Understanding and visualizing DenseNets 动手学深度学习:7.7. 稠密连接网络(DenseNet) ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:17:0","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"DPN 文章标题:Dual Path Networks 作者:Yunpeng Chen, Jianan Li, Huaxin Xiao, Xiaojie Jin, Shuicheng Yan, Jiashi Feng 发表时间:(NIPS 2017) DPN,DPN 的全称是 Dual Path Networks,即双通道网络。 该网络是由 DenseNet 和 ResNet 结合的一个网络,利用残差网络的跳跃连接对特征进行复用,又可以利用密集连接路径持续探索新特征。 DenseNet 把每一层的输出都拼接(concatenate)到其后每一层的输入上,从靠前的层级中提取到新的特征。 善于挖掘新特征,冗余度高 ResNet 把输入直接加到(element-wise adding)卷积的输出上是对之前层级中已提取特征的复用。善于复用特征,冗余度低 $[W_1 \\ W_2][X_1;X_2]=W_1X_1+W_2X_2$:如果两组conv,输出的filter个数是一样的,那么在input channel上concat是可以等价于分别两组conv求和的形式 DPN_block (a) 残差网络。 (b) 密集连接的网络,其中每一层都可以访问所有先前微块的输出。在这里,为了与(a)中的微块设计保持一致,添加了一个 1×1 卷积层(下划线)。 (c) 通过在 (b) 中的微块之间共享相同输出的第一个 1×1 连接,密集连接的网络退化为残差网络。(c) 中的虚线矩形突出显示了残差单元。 (d) 双路径架构,DPN。 (e) 从实现的角度看(d)的等价形式,其中符号“~”表示拆分操作,“+”表示逐元素加法 DPN $3\\times3$ 的卷积层采用的是 group convolution $1×1×256(+16) $中的 256 代表的是 ResNet 的通道数,16 代表的是 DenseNet 一层的输出通道数,将结果分成 256 和 16 两部分,256 的 element-wise 的加到 ResNet 通道,16 的 concat 到 DenseNet 通道,然后继续下一个 block,同样输出 256 + 16 个通道,重复操作。 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:18:0","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"拓展阅读 代码 DPNS代码 知乎:解读Dual Path Networks(DPN,原创) 知乎:卷积神经网络学习路线(十五) | NIPS 2017 Dual Path Network 【论文阅读】—— Dual Path Network ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:19:0","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"HarDNet 文章标题:HarDNet: A Low Memory Traffic Network 作者:Ping Chao, Chao-Yang Kao, Yu-Shan Ruan, Chien-Hsiang Huang, Youn-Long Lin 发表时间:(ICCV 2019) 官方代码 HarDNet,HarDNet(Harmonic DenseNet)是 2019 年由国立清华大学提出的一种全新的神经网络,在低 MAC 和内存流量的条件下实现了高效率。与 FC-DenseNet-103,DenseNet-264,ResNet-50,ResNet-152 和 SSD-VGG 相比,新网络的推理时间减少了 35%,36%,30%,32% 和 45%。使用了包括 Nvidia Profiler 和 ARM Scale-Sim 在内的工具来测量内存流量,并验证推理延迟确实与内存流量消耗成正比,并且所提议的网络消耗的内存流量很低。 评价指标 Nvidia profiler获取DRAM读/写的字节数。 ARM Scale Sim获取每个CNN框架的流量数据和推理次数。 Convolutional Input/Output (CIO):每个卷积层的输入和输出尺寸之和。CIO是DRAM流量的近似处理。 MoC(MACs over CIO)。在MoC低于某个值时,CIO才会在推理时间中占主导地位。 对每一层的MoC施加一个软约束,以设计一个低CIO网络模型,并合理增加MACs。 首先减少来自DenseNet的大部分层连接,以降低级联损耗。然后,通过增加层的通道宽度来平衡输入/输出通道比率。 Block 采用的稀疏连接方式:当$k$能被$2^n$整除,让$k$层和$k-2^n$层相连,其中$n$为非负整数;并且还需满足$k-2^{n} \\ge 0$。 $l$层初始化growth-rate k 卷积层中卷积核的数量(k),DenseNet:k=12 m 用作低维压缩因子 通道数:$k\\times m^n$,n是$l$除以$2^n$时的最大数 (a) Inverted transition down module, (b) Depthwise-separable convolution for HarDNet 在HDB后连接一个1x1 conv层,作为trainsition。此外,设置HDB的深度为$L=2^n$,这样一个HDB的最后一层就有最大的通道数,梯度最多能传输$\\text{log}L$层。为了缓解这种梯度消失,将一个HDB的输出设置为第L层和它前面所有奇数层的级联。当完成HDB以后,就可以丢弃从2至L-2的所有偶数层。当m=1.6-1.9时,这些偶数层的内存占用是奇数层的2至3倍。 HarDNet 3x3, 64:64个输出通道的Conv3x3层 8,k=14,t=256:有8层的HDB,增长率k和一个有t个输出通道的trainsition过渡层 conv1x1 m :低维压缩因子 所有层的Conv-BN-ReLU,而不是DenseNet中使用的BN-ReLU-Conv 实现折叠批量标准化 HardNet-68 中每个 HDB 的专用增长率 k 提高了 CIO 效率。 由于深度 HDB 具有更多的输入通道,因此更大的增长率有助于平衡层的输入和输出之间的通道比率,以满足对MoC 约束。 对于层分布,没有集中在大多数 CNN 模型采用的 stride16 上,而是让 stride8 在 HardNet-68 中拥有最多的层, 提高了局部特征学习,有利于小规模目标检测。相比之下,分类任务更多地依赖全局特征学习,因此专注于低分辨率可以获得更高的准确度和更低的计算复杂度 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:20:0","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"拓展阅读 HarDNet简析 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:21:0","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"ResNet_D 文章标题:Bag of Tricks for Image Classification with Convolutional Neural Networks 作者:Tong He, Zhi Zhang, Hang Zhang, Zhongyue Zhang, Junyuan Xie, Mu Li 发表时间:(CVPR 2019) Official Code ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:22:0","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"Basline Implemention 预处理与数据增强 随机sample并且转化为[0,255]之间的32位宽浮点数 随机Crop的高宽比在3/4到4/3;面积占比大小从8~100%;最后被Resize到[224,224] 50%概率水平翻转 缩放色调,饱和度和亮度,取[0.6,1.4] 加上PCA Noise 最后Normalize整个图片 (对测试的时候,不做增强,首先对图片按照短边resize到256,再随机Crop到244,然后Normalize) 采用Xavier初始化 使用加Nesterov加速的SGD(NAG) batch-size: 256 共训练120 epoch lr 0.1(30,60,90 epoch上除以10) ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:23:0","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"Efficient Training ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:24:0","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"Large-batch training 与小批量训练的模型相比,使用大批量训练训练的模型的验证精度降低 如何解决:四种启发式方法,有助于扩大单机训练的批处理规模 Linear scaling learning rate 线性缩放学习率。 在小批量SGD中,梯度下降是一个随机过程,因为每批样本都是随机选取的。增加批处理大小并不会改变随机梯度的期望,但会降低其方差。换句话说,大的batch size会降低gradient中的noise。 随批大小线性增加学习率对ResNet-50训练有效。选取0.1作为批量大小为256的初始学习率,那么当批量大小为b时,我们将初始学习率提高到0.1 × b/256。 learning rate warmup 一开始使用较小的学习率,然后在训练过程稳定时切换回初始学习率。 一种渐进的预热策略,将学习率从0线性增加到初始学习率。 Zero $\\gamma$ BN 的$\\gamma$和$\\beta$一般分别初始化为1和0 Zero $\\gamma$:对位于残差块末端的所有BN层初始化$\\gamma=0$。 因此,所有的残差块都只是返回它们的输入,模拟的网络,它的层数较少,在初始阶段更容易训练。Therefore, all residual blocks just return their inputs, mimics network that has less number of layers and is easier to train at the initial stage. No bias decay 将权值衰减应用于卷积层和全连接层中的权值。其他参数,包括偏置和在BN层的γ和β,保持不正则化 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:24:1","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"Low-precision training 将所有参数和激活存储在FP16中,并使用FP16计算梯度。同时,所有参数在FP32中都有一个副本,用于参数更新。此外,将一个标量乘以损失,以更好地将梯度范围对齐到FP16 结论 Efficient_Training 与基线模型相比,1024批大小和FP16训练的模型甚至略微提高了0.5%的top-1精度。 仅通过线性缩放学习率将批量大小从256增加到1024会导致top-1准确率下降0.9%,而堆叠其余三个启发式方法可以弥补这一差距。训练结束时从FP32切换到FP16不会影响精度。 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:24:2","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"Model Tweaks Resnet50 ResnetB-C-D ResNet-B:前两个卷积的步长进行了切换 Path A中的卷积忽略了四分之三的输入特征映射,因为它使用了一个跨步为2的内核大小1×1。 ResNet-C:Input stem 的$7\\times7$卷积替换成3个$3\\times 3$ ResNet-B:ResNet-B基础上增加一个stride为2的2×2平均池化层,将$1\\times1$卷积stride改为1 Path B中的卷积忽略了四分之三的输入特征映射,因为它使用了一个跨步为2的内核大小1×1。 model_conclusion ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:25:0","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"Training Refinements Cosine learning rate decay $\\eta_{t}=\\frac{1}{2}\\left(1+\\cos \\left(\\frac{t \\pi}{T}\\right)\\right) \\eta $,其中是$\\eta$初始化学习率。 label smoothing 通常分类任务中每张图片的标签是one hot形式的,也就是说一个向量在其对应类别索引上设置为1,其他位置为0,形如[0,0,0,1,0,0]。 label smoothing就是将类别分布变得平滑一点,即 $q_{i}=\\left{\\begin{array}{ll}{1-\\varepsilon} \u0026 {\\text { if } i=y} \\ {\\varepsilon /(K-1)} \u0026 {\\text { otherwise }}\\end{array}\\right. $ 其中$q_{i}$就代表某一类的ground truth,例如如果\\(i==y\\),那么其最终真实值就是$1-\\varepsilon$,其它位置设置为$\\varepsilon /(K-1)$,而不再是。这里的$\\varepsilon$=0.1 Knowledge Distillation T=20 Mixup 每次随机抽取两个样本进行加权求和得到新的样本,标签同样做加权操作。公式中的$\\lambda\\in[0,1]$是一个随机数,服从$\\text{Beta}(\\alpha,\\alpha)$分布。$\\alpha=0.2$ $\\begin{aligned} \\hat{x} \u0026=\\lambda x_{i}+(1-\\lambda) x_{j} \\ \\hat{y} \u0026=\\lambda y_{i}+(1-\\lambda) y_{j} \\end{aligned} $ training_refinements 蒸馏在ResNet上工作得很好,然而,它在Inception-V3和MobileNet上不太好。 可能解释是:教师模型不是来自于学生的同一家庭,因此在预测中分布不同,给模型带来了负面影响 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:26:0","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"拓展阅读 Bag of tricks for image classification with convolutional neural networks review [cdm] ResNet strikes back: An improved training procedure in timm Top-1:80.4% ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:27:0","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"Res2Net 文章标题:Res2Net: A New Multi-scale Backbone Architecture 作者:Shang-Hua Gao, Ming-Ming Cheng, Kai Zhao, Xin-Yu Zhang, Ming-Hsuan Yang, Philip Torr 发表时间:(TPAMI 2020) 官方源码 通过在一个残差块中构筑类似残差分层的方式进行连接。Res2Net 可以在更细粒度级别表达多尺度特征,并且可以增加每层网络的感受野大小。 Res2Net 揭示了一个新的提升模型精度的维度,即 scale,其是除了深度、宽度和基数的现有维度之外另外一个必不可少的更有效的因素。 Res2Net_Module:s=4 Res2Net_Module + group_conv + SE_block集成 在$1\\times1$卷积层后面,将特征图分为s个子集($s$为尺度(scale)维度) 原有的$n$通道$3\\times3$滤波器替换为一系列有$w$通道的更小的滤波器组(避免损失,令$n = s × w$);子集有着和原始特征图集相同的空间大小。 每一组滤波器先从一组输入特征图中进行特征提取,然后与先前组生成的特征图和另一组输入的特征图一起被送到下一组卷积核进行处理。 小滤波器组以类似于残差的模式被逐层连接,这样可以增加输出特征能表达的不同尺度的数量。 最终,所有特征图将被拼接在一起并被送到一组$1\\times1$的卷积核处进行信息融合。 忽略了第一个分组的卷积层:这也是一种特征复用的形式,减少参数并增加$s $的数量 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:28:0","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"拓展阅读 作者博客 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:29:0","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"RedNet 文章标题:Involution: Inverting the Inherence of Convolution for Visual Recognition 作者:Duo Li, Jie Hu, Changhu Wang, Xiangtai Li, Qi She, Lei Zhu, Tong Zhang, Qifeng Chen 发表时间:(CVPR 2021) 官方源码 普通convolution 空间不变性(spatial-agnostic) 平移等价性 大小一般3x3,偏小 通道特异性(channel-specific) 不同通道包含不同语义信息 不同通道的卷积核存在冗余 希望具有:自适应长距离关系建模 involution 通道不变性(channel-agnostic): kernel privatized for different positions 空间特异性(spatial-specific): kernel shared across different channels kernel:$H\\in R^{H\\times W \\times K\\times K \\times G}$ #groups: G RedNet_involution 针对输入feature map的一个坐标点上的特征向量,先通过 $\\phi$ (FC-BN-ReLU-FC)和reshape (channel-to-space)变换展开成kernel的形状,从而得到这个坐标点上对应的involution kernel,再和输入feature map上这个坐标点邻域的特征向量进行Multiply-Add得到最终输出的feature map。 RedNet_involution 在 ResNet的stem中(使用$3\\times 3$或$7\\times7$ involution进行分类或密集预测)和trunk(对所有任务使用$7\\times7$ involution)位置中的所有bottleneck位置上替换掉了$3\\times 3$卷积,但保留了所有的$1\\times 1$卷积用于通道映射和融合。这些精心重新设计的实体联合起来,形成了一种新的高效 Backbone 网络,称为 RedNet。 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:30:0","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"拓展阅读 超越卷积、自注意力机制:强大的神经网络新算子involution 论文笔记 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:31:0","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"DCDC 文章标题:Dual Complementary Dynamic Convolution for Image Recognition 作者:Longbin Yan, Yunxiao Qin, Shumin Liu, Jie Chen 发表时间:( 2022) 在本文中,我们新颖地将特征建模为局部空间自适应**(LSA)和全局位移不变[GSI]**部分的组合,然后提出了一个双分支双互补动态卷积算子来正确处理这两类特征,显着增强了代表能力。基于所提出的算子构建的 DCDC-ResNets 的性能明显优于 ResNet 基线和大多数最先进的动态卷积网络,同时具有更少的参数和 FLOP。我们还对目标检测、实例和全景分割等下游视觉任务进行了迁移实验,以评估模型的泛化能力,实验结果显示出显着的性能提升 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/resnet/:32:0","tags":["Deep Learning","图像分类模型"],"title":"ResNet","uri":"/posts/deeplearning/image-classification/resnet/"},{"categories":["Deep Learning"],"content":"SENet","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/senet/","tags":["Deep Learning","图像分类模型"],"title":"SENet","uri":"/posts/deeplearning/image-classification/senet/"},{"categories":["Deep Learning"],"content":"SENet 文章标题:Squeeze-and-Excitation Networks 作者:Jie Hu, Li Shen, Samuel Albanie, Gang Sun, Enhua Wu 发表时间:(CVPR 2018) Official Code External-Attention-pytorch senet.pytorch ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/senet/:1:0","tags":["Deep Learning","图像分类模型"],"title":"SENet","uri":"/posts/deeplearning/image-classification/senet/"},{"categories":["Deep Learning"],"content":"Squeeze-and-Excitation blocks 关注channel之间的关系,希望模型可以自动学习到不同channel特征的重要程度 Diagram of a Squeeze-and-Excitation building block. $$ u_c=v_c*X=\\sum_{s=1}^{C'}v_c^s*x^s $$ 输入:$X=[x^1,x^2,…,x^{C’}]$ 输出:$U=[u_1,u_2,…,u_C]$ $v_c=[v_c^1,v_c^2,…,v_c^{C’}] $;$v_c^s$是一个二维空间内核,表示作用于 $X $的相应通道的 $v_c$的单个通道。 卷积核的集合:$V=[v_1,v_2,…,v_C]$ 压缩(Squeeze):经过(全局平均池化)压缩操作后特征图被压缩为1×1×C向量;也可以采用更复杂的策略 卷积计算:参数量比较大 最大池化:可能用于检测等其他任务,输入的特征图是变化的,能量无法保持 $$ z_c=F_{sq}(u_c)=\\frac{1}{H\\times W}\\sum_{i=1}^H\\sum_{j=1}^Wu_c(i,j) $$ $$ s=F_{ex}(z,W)=\\sigma(g(z,W)) =\\sigma(W_2\\delta(W_1z)) $$ $\\delta$:ReLU;$\\sigma$:sigmoid激活,$W_1\\in R^{\\frac{C}{r}\\times C}$:降维层;$W_2\\in R^{C \\times\\frac{C}{r}}$:升维层 比直接用一个 Fully Connected 层的好处在于 1)具有更多的非线性,可以更好地拟合通道间复杂的相关性; 2)极大地减少了参数量和计算量 c可能很大,所以需要降维 scale操作:最后通过一个 Scale 的操作来将归一化后的权重加权到每个通道的特征上 $$ \\tilde x_c = F_{scale}(u_c,s_c)=s_cu_c $$ from torch import nn class SELayer(nn.Module): def __init__(self, channel, reduction=16): super(SELayer, self).__init__() self.avg_pool = nn.AdaptiveAvgPool2d(1) self.fc = nn.Sequential( nn.Linear(channel, channel // reduction, bias=False), nn.ReLU(inplace=True), nn.Linear(channel // reduction, channel, bias=False), nn.Sigmoid() ) def forward(self, x): b, c, _, _ = x.size() y = self.avg_pool(x).view(b, c) y = self.fc(y).view(b, c, 1, 1) return x * y.expand_as(x) SE-Inception-module SE-ResNet-module ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/senet/:2:0","tags":["Deep Learning","图像分类模型"],"title":"SENet","uri":"/posts/deeplearning/image-classification/senet/"},{"categories":["Deep Learning"],"content":"Model and computational complexity SENet reduction为16. $fc,[16,256]$:16为第一个全连接层的输出通道数;256为第二个全连接层的输出通道数; $$ \\frac{2}{r}\\sum_{s=1}^SN_s\\cdot C_s^2 $$$r$表示降维比;$S$:第几个stage;$C_s$ 表示输出通道的维度;$N_s $表示第$s$个stage的重复块的数量 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/senet/:3:0","tags":["Deep Learning","图像分类模型"],"title":"SENet","uri":"/posts/deeplearning/image-classification/senet/"},{"categories":["Deep Learning"],"content":"Ablation study 不同Reduction ratio也进行了消融实验。 Squeeze Operator不同操作(如Max,Avg)也进行了消融实验。 Excitation Operator不同激活函数操作(如ReLU,Tanh,Sigmoid)也进行了消融实验。 SE block在不同stage也进行了消融实验。 Integration strategy进行消融实验。 SE block integration designs explored in the ablation study Effect of different SE block integration strategies with ResNet-50 on ImageNet SE的三种变体:SE 单元在分支聚合之前应用产生的性能改进对其位置相当稳健 如果对 Addition 后主支上的特征进行重标定,由于在主干上存在 0~1 的 scale 操作,在网络较深 BP 优化时就会在靠近输入层容易出现梯度消散的情况,导致模型难以优化。 Effect of integrating SE blocks at the 3x3 convolutional layer of each residual branch in ResNet-50 on ImageNet 另一种设计变体:将 SE 块移动到残差单元内,将其直接放在 3×3 卷积层之后。 以更少的参数实现了可比的分类精度 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/senet/:4:0","tags":["Deep Learning","图像分类模型"],"title":"SENet","uri":"/posts/deeplearning/image-classification/senet/"},{"categories":["Deep Learning"],"content":"Role of SE blocks Effect of Squeeze:强调了挤压操作的重要,作为对比,它添加了相同数量的参数,删除了池化操作,用具有相同通道维度的相应 $1\\times1 $卷积替换了两个 FC 层,即 NoSqueeze,其中激励输出保持空间维度作为输入。 Effect of Squeezet Role of Excitation Role of Excitation $$ SE\\_5\\_2:SE\\_{stageID}\\_{blockID} $$不同类别的分布在网络的早期层非常相似,表明特征通道的重要性很可能在早期由不同的类别共享 后面的层特征表现出更高水平的特异性 SE_5_2表现出一种有趣的趋向于饱和状态的趋势,大多数激活都接近于 1 SE_5_3 的网络末端(紧随其后的是分类器之前的全局池),在不同的类中出现了类似的模式 为网络提供重新校准方面不如之前的块重要,通过移除最后阶段的 SE 块,可以显着减少额外的参数计数,而性能只有边际损失 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/senet/:5:0","tags":["Deep Learning","图像分类模型"],"title":"SENet","uri":"/posts/deeplearning/image-classification/senet/"},{"categories":["Deep Learning"],"content":"训练细节 每个瓶颈构建块的前$ 1\\times1 $卷积通道的数量减半以降低模型的计算成本性能下降最小。 第一个$ 7 \\times 7 $卷积层被三个连续的 $3 \\times3 $卷积层替换。(Inception) 具有步长为2 的 $1 \\times 1 $下采样卷积被替换为$ 3 \\times 3$ 步长为2的 卷积以保留信息。 在分类层之前插入一个 dropout 层(dropout 比为 0.2)以减少过度拟合。 在训练期间使用了标签平滑正则化(。 在最后几个训练时期,所有 BN 层的参数都被冻结,以确保训练和测试之间的一致性。 使用 8 个服务器(64 个 GPU)并行进行训练,以实现大批量(2048 个)。初始学习率设置为 1.0 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/senet/:6:0","tags":["Deep Learning","图像分类模型"],"title":"SENet","uri":"/posts/deeplearning/image-classification/senet/"},{"categories":["Deep Learning"],"content":"拓展阅读 CV27 Momenta研发总监 孙刚 Squeeze and Excitation Networks上 CV27 Momenta研发总监 孙刚 Squeeze and Excitation Networks下 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/senet/:7:0","tags":["Deep Learning","图像分类模型"],"title":"SENet","uri":"/posts/deeplearning/image-classification/senet/"},{"categories":["Deep Learning"],"content":"SKNet 文章标题:Selective Kernel Networks 作者:Xiang Li, Wenhai Wang, Xiaolin Hu, Jian Yang 发表时间:(CVPR 2019) Official Code ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/senet/:8:0","tags":["Deep Learning","图像分类模型"],"title":"SENet","uri":"/posts/deeplearning/image-classification/senet/"},{"categories":["Deep Learning"],"content":"Selective Kernel Convolution 用multiple scale feature汇总的information来channel-wise地指导如何分配侧重使用哪个kernel的表征 一种非线性方法来聚合来自多个内核的信息,以实现神经元的自适应感受野大小 Selective Kernel Convolution Split:生成具有不同内核大小的多条路径,这些路径对应于不同感受野(RF,receptive field) 大小的神经元 $X\\in R^{H’\\times W’\\times C’} $ $\\tilde F:X\\to \\tilde U \\in R^{H\\times W\\times C} $ kernel size $3\\times3$ $\\hat F:X\\to \\hat U \\in R^{H\\times W\\times C}$ kernel size $5\\times5$:使用空洞卷积$3\\times3$,空洞系数为2。 Fuse:聚合来自多个路径的信息,以获得选择权重的全局和综合表示。 $$ U=\\tilde U+\\hat U\\\\ s_c=F_{gp}(U_c)=\\frac{1}{H\\times W}\\sum_{i=1}^H\\sum_{j=1}^WU_c(i,j)\\\\ z=F_{fc}(s)=\\delta(B(Ws)) 降维处理\\\\ $$ $s\\in R^c$;$\\delta$:ReLU;$z\\in R^{d\\times1}$;$W\\in R^{d\\times C}$:批量归一化; $d=max(C/r,L)$ L:d的最小值,本文设置32 Select:根据选择权重聚合不同大小内核的特征图 $$ a_c=\\frac{e^{A_cz}}{e^{A_cz}+e^{B_cz}}\\\\ b_c=\\frac{e^{B_cz}}{e^{A_cz}+e^{B_cz}}\\\\ $$$$ V_c=a_c\\cdot\\tilde U_c + b_c\\cdot \\hat U_c\\\\\\ a_c+b_c=1\\\\ V_c\\in R^{H\\times W} $$ Selective Kernel Convolution三分支 $SK[M,G,r]\\to SK[2,32,16]$ M:确定要聚合的不同内核的选择数量 G:控制每条路径的基数的组号 r:reduction ratio import numpy as np import torch from torch import nn from torch.nn import init from collections import OrderedDict class SKAttention(nn.Module): def __init__(self, channel=512,kernels=[1,3,5,7],reduction=16,group=1,L=32): super().__init__() self.d=max(L,channel//reduction) self.convs=nn.ModuleList([]) for k in kernels: self.convs.append( nn.Sequential(OrderedDict([ ('conv',nn.Conv2d(channel,channel,kernel_size=k,padding=k//2,groups=group)), ('bn',nn.BatchNorm2d(channel)), ('relu',nn.ReLU()) ])) ) self.fc=nn.Linear(channel,self.d) self.fcs=nn.ModuleList([]) for i in range(len(kernels)): self.fcs.append(nn.Linear(self.d,channel)) self.softmax=nn.Softmax(dim=0) def forward(self, x): bs, c, _, _ = x.size() conv_outs=[] ### split for conv in self.convs: conv_outs.append(conv(x)) feats=torch.stack(conv_outs,0)#k,bs,channel,h,w ### fuse U=sum(conv_outs) #bs,c,h,w ### reduction channel S=U.mean(-1).mean(-1) #bs,c Z=self.fc(S) #bs,d ### calculate attention weight weights=[] for fc in self.fcs: weight=fc(Z) weights.append(weight.view(bs,c,1,1)) #bs,channel attention_weights=torch.stack(weights,0)#k,bs,channel,1,1 attention_weights=self.softmax(attention_weights)#k,bs,channel,1,1 ### fuse V=(attention_weights*feats).sum(0) return V ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/senet/:9:0","tags":["Deep Learning","图像分类模型"],"title":"SENet","uri":"/posts/deeplearning/image-classification/senet/"},{"categories":["Deep Learning"],"content":"Network Architecture SKNet ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/senet/:10:0","tags":["Deep Learning","图像分类模型"],"title":"SENet","uri":"/posts/deeplearning/image-classification/senet/"},{"categories":["Deep Learning"],"content":"Ablation Studies The dilation D and group number The dilation D and group number Combination of different kernels The dilation D and group number k3 表示 3x3 conv,k5 表示 3x3 conv with 2 dilated,k7 表示 3x3 conv with 3 dilated。 Dilated 是一种在不改变参数数量的情况下扩大感受区域的方法,主要用于分割。 (1) 当路径 M 的数量增加时,识别误差通常会减小。 (2) 无论 M = 2 还是 3,基于 SK 注意力的多路径聚合总是比简单聚合方法(朴素基线模型)实现更低的 top-1 误差。 (3) 使用 SK attention,模型从 M = 2 到 M = 3 的性能增益是微不足道的(top-1 error 从 20.79% 下降到 20.76%)。为了更好地权衡性能和效率,M = 2 是首选 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/senet/:11:0","tags":["Deep Learning","图像分类模型"],"title":"SENet","uri":"/posts/deeplearning/image-classification/senet/"},{"categories":["Deep Learning"],"content":"拓展阅读 SKNet——SENet孪生兄弟篇 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/senet/:12:0","tags":["Deep Learning","图像分类模型"],"title":"SENet","uri":"/posts/deeplearning/image-classification/senet/"},{"categories":["Deep Learning"],"content":"VGGNet","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/vggnet/","tags":["Deep Learning","图像分类模型"],"title":"VGGNet","uri":"/posts/deeplearning/image-classification/vggnet/"},{"categories":["Deep Learning"],"content":"VGGNet 文章标题:Very Deep Convolutional Networks for Large-Scale Visual Recognition 作者:Simonyan K, Zisserman A. V 发表时间:(ICLR 2015) 论文主页 VGG是2014年ImageNet图像分类竞赛亚军,定位竞赛冠军,由牛津大学视觉组提出。 VGG16和VGG19经常作为各类计算机视觉任务的迁移学习骨干网络。 VGG将LeNet和AlexNet奠定的经典串行卷积神经网络结构的深度和性能发挥到极致。 将所有卷积核设置为3 x 3,减少参数量和计算量,共设置5个block,每进入新的block,卷积核个数翻倍。 VGG模型结构虽然简单,但臃肿复杂,参数过多(超过一亿个),速度慢,第一个全连接层占据了大量参数。 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/vggnet/:1:0","tags":["Deep Learning","图像分类模型"],"title":"VGGNet","uri":"/posts/deeplearning/image-classification/vggnet/"},{"categories":["Deep Learning"],"content":"Architecture VGGNet16 VGGNet 2层$3\\times3$卷积的感受野相当于$5\\times5$;3层$3\\times3$卷积的感受野相当于$7\\times7$。 VGGNet16参数计算 前两层卷积占据绝大部分内存;第一层全连接层占据绝大部分参数。 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/vggnet/:2:0","tags":["Deep Learning","图像分类模型"],"title":"VGGNet","uri":"/posts/deeplearning/image-classification/vggnet/"},{"categories":["Deep Learning"],"content":"Training 更快收敛:小的卷积核和深的网络起到隐式的正则化;对某些层进行权重初始化策略 训练图片尺寸:$S$为经过各向同性缩放的训练图像的短边,作为训练图片尺度。(AlexNet数据增强部分缩放尺度$S$为$256\\times256$) 固定$S$ $S=256$或$S=384$ 多尺度$S$ 随机从[256,512]选取 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/vggnet/:3:0","tags":["Deep Learning","图像分类模型"],"title":"VGGNet","uri":"/posts/deeplearning/image-classification/vggnet/"},{"categories":["Deep Learning"],"content":"Tseting 下面设缩放后图片短边为$Q$ 全卷积:先将网络中的FC层全转换成卷积层(第一个FC -\u003e 7 x 7的卷积层,后面两个FC -\u003e 1 x 1的卷积层),因此预测时无需裁剪成224 x 224了(因为现在是全卷积网络),可以将整个图片喂入。Softmax层之前的输出(class score map):feature map个数 = 类别数,为了能经过Softmax层,这里对每个feature map求全局平均池化GPA(global average pooling)。 全卷积 裁剪:还是AlexNet的思路,作者每个尺度裁剪50张图片,三个尺度一共150张图片(注意这里还是需要裁剪出224 x 224的) ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/vggnet/:4:0","tags":["Deep Learning","图像分类模型"],"title":"VGGNet","uri":"/posts/deeplearning/image-classification/vggnet/"},{"categories":["Deep Learning"],"content":"Experiments $Q$为固定值 LRN在这里不起作用 训练多尺度很有效。 train多尺度,test单尺度(全卷积) $Q$为范围值 若$S$为固定值:$Q={S-32,S,S+32}$ 若$S$为范围值:$Q={S_{min},\\frac{S_{min}+S_{max}}{2},S_{max}}$ train,test多尺度(全卷积) 全卷积和裁剪 裁剪的效果更好,当然两者集成之后更好 train,test多尺度(全卷积和裁剪) * 模型集成 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/vggnet/:5:0","tags":["Deep Learning","图像分类模型"],"title":"VGGNet","uri":"/posts/deeplearning/image-classification/vggnet/"},{"categories":["Deep Learning"],"content":"Localisation Localisation定位问题(可看成目标检测的特例)(模型预测是bbox与Ground Truth的IoU大于0.5即可):VGGNet改成预测bounding box(下面都简称为bbox)的模型,一个bbox用中心坐标、长、宽四个数确定,最后一个FC层换成4维(single-class regression,SCR,对所有类别不区分对待,即训练1个bbox)或4000维的向量(per-class regression,PCR,每个类别区分对待,即训练1000个bbox)。Softmax损失换成L2损失,训练单尺度模型,模型初始化使用之前的分类的模型,最后一层FC层随机初始化。 预测时:第一种方法是仅裁剪出图片中间的一块;第二种方法是用前面的全卷积,这种情况下最后会输出一堆bbox,于是可以对它们进行合并(基于前面分类的结果合并)。这里没有使用可以进一步提高结果的multiple pooling offsets和resolution enhancement technique(有待研究)。 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/vggnet/:6:0","tags":["Deep Learning","图像分类模型"],"title":"VGGNet","uri":"/posts/deeplearning/image-classification/vggnet/"},{"categories":["Deep Learning"],"content":"拓展阅读 牛津大学视觉组(VGG)官方网站 VGG原版Slides VGG详解 可视化VGG-16网络结构 经典神经网络结构可视化 Convolutional neural networks on the iPhone with VGGNet 翻译博客 ","date":"2023-06-02","objectID":"/posts/deeplearning/image-classification/vggnet/:7:0","tags":["Deep Learning","图像分类模型"],"title":"VGGNet","uri":"/posts/deeplearning/image-classification/vggnet/"},{"categories":["Deep Learning"],"content":"Deep Learning Paper","date":"2023-06-01","objectID":"/posts/deeplearning/paper/","tags":["Deep Learning"],"title":"Deep Learning Paper","uri":"/posts/deeplearning/paper/"},{"categories":["Deep Learning"],"content":"Paper ","date":"2023-06-01","objectID":"/posts/deeplearning/paper/:1:0","tags":["Deep Learning"],"title":"Deep Learning Paper","uri":"/posts/deeplearning/paper/"},{"categories":["Deep Learning"],"content":"Image Classification ALexNet:ImageNet Classification with Deep Convolutional Neural Networks (NIPS 2012) ZFNet:Visualizing and Understanding Convolutional Networks (ECCV 2014) GoogLeNet:Going Deeper with Convolutions (CVPR 2015) Network In Network $1\\times1$卷积 Provable Bounds for Learning Some Deep Representations 用稀疏、分散的网络取代以前庞大密集臃肿的网络 InceptionV2:Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift (ICML 2015) InceptionV3:Rethinking the Inception Architecture for Computer Vision (CVPR 2016) InceptionV4:Inception-ResNet and the Impact of Residual Connections on Learning (AAAI 2017) Xception:Xception: Deep Learning with Depthwise Separable Convolutions (CVPR 2017) VGGNet:Very Deep Convolutional Networks for Large-Scale Visual Recognition (ICLR 2015) ResNet:Deep Residual Learning for Image Recognition(CVPR 2016) ResNeXt:ggregated Residual Transformations for Deep Neural Networks-2017 DenseNet:Densely Connected Convolutional Networks ","date":"2023-06-01","objectID":"/posts/deeplearning/paper/:2:0","tags":["Deep Learning"],"title":"Deep Learning Paper","uri":"/posts/deeplearning/paper/"},{"categories":["Deep Learning"],"content":"Object Detection ","date":"2023-06-01","objectID":"/posts/deeplearning/paper/:3:0","tags":["Deep Learning"],"title":"Deep Learning Paper","uri":"/posts/deeplearning/paper/"},{"categories":["Deep Learning"],"content":"Dense Prediction (one-stage) anchor based SSD:SSD: Single Shot MultiBox Detector (ECCV 2016) YOLO:You Only Look Once:Unified, Real-Time Object Detection (CVPR 2016) YOLOV2:YOLO9000: Better, Faster, Stronger (CVPR 2017) YOLOV3:YOLOv3: An Incremental Improvement (CVPR 2018) YOLOV4:YOLOv4: Optimal Speed and Accuracy of Object Detection (CVPR 2020) Scaled-YOLOv4:Scaled-YOLOv4: Scaling Cross Stage Partial Network (CVPR 2021) IOU_Loss(2016)-\u003eGIOU_Loss(2019)-\u003eDIOU_Loss(2020)-\u003eCIOU_Loss(2020) YOLOX:YOLOX: Exceeding YOLO Series in 2021 YOLOV5: Alpha-IoU:A Family of Power Intersection over Union Losses for Bounding Box Regression (NIPS 2021) RetinaNet:Focal Loss for Dense Object Detection (ICCV 2017) anchor free CornerNet:CornerNet: Detecting Objects as Paired Keypoints](https://arxiv.org/abs/1808.01244) (ECCV 2018) CornerNet-Lite: Efficient Keypoint Based Object Detection (BMVC 2020) CenterNet:CenterNet: Keypoint Triplets for Object Detection (ICCV 2019) MatrixNet:Matrix Nets: A New Deep Architecture for Object Detection(ICCV 2019) FCOS:FCOS: Fully Convolutional One-Stage Object Detection (ICCV 2019) ","date":"2023-06-01","objectID":"/posts/deeplearning/paper/:3:1","tags":["Deep Learning"],"title":"Deep Learning Paper","uri":"/posts/deeplearning/paper/"},{"categories":["Deep Learning"],"content":"Sparse Prediction (two-stage) anchor based R-CNN:[Rich feature hierarchies for accurate object detection and semantic segmentation (CVPR 2014) Selective Search for Object Recognition(IJCV 2012) [Path-aggregation blocks-FPN](####Path-aggregation blocks) [Additional blocks-SPP](####Additional blocks) Fast R-CNN:Fast R-CNN (ICCV 2015) Faster R-CNN:Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks (NIPS 2015) R-FCN:R-FCN: Object Detection via Region-based Fully Convolutional Networks (NIPS 2016) Mask R-CNN:Mask R-CNN (ICCV 2017) Libra R-CNN: Libra R-CNN: Towards Balanced Learning for Object Detection (CVPR 2019) Sparse R-CNN:Sparse R-CNN: End-to-End Object Detection with Learnable Proposals (CVPR 2021) anchor free RepPoints:RepPoints: Point Set Representation for Object Detection (ICCV 2019) ","date":"2023-06-01","objectID":"/posts/deeplearning/paper/:3:2","tags":["Deep Learning"],"title":"Deep Learning Paper","uri":"/posts/deeplearning/paper/"},{"categories":["Deep Learning"],"content":"Neck Additional blocks SPP:Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition (TPAMI 2015) ASPP:DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs (TPAMI 2017) RFB:Receptive Field Block Net for Accurate and Fast Object Detection (ECCV 2018) SAM:CBAM: Convolutional Block Attention Module (ECCV 2018) Path-aggregation blocks FPN:Feature Pyramid Networks for Object Detection (CVPR 2017) PAN:Path Aggregation Network for Instance Segmentation (CVPR 2018) NAS-FPN:NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection (CVPR 2019) BiFPN:EfficientDet: Scalable and Efficient Object Detection (CVPR 2020) ASFF:Learning Spatial Fusion for Single-Shot Object Detection (2019) SFAM: M2Det: A Single-Shot Object Detector based on Multi-Level Feature Pyramid Network (AAAI 2019) ","date":"2023-06-01","objectID":"/posts/deeplearning/paper/:3:3","tags":["Deep Learning"],"title":"Deep Learning Paper","uri":"/posts/deeplearning/paper/"},{"categories":["Deep Learning"],"content":"Image Segmentation ","date":"2023-06-01","objectID":"/posts/deeplearning/paper/:4:0","tags":["Deep Learning"],"title":"Deep Learning Paper","uri":"/posts/deeplearning/paper/"},{"categories":["Deep Learning"],"content":"轻量化CNN SqueezeNet:SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and \u003c0.5MB model size (2016) SqueezeNext: Hardware-Aware Neural Network Design (2018) MobileNet:MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications (2017) MobileNetV2:MobileNetV2: Inverted Residuals and Linear Bottlenecks (2018) MobileNetV3:Searching for MobileNetV3 (2019) MnasNet: Platform-Aware Neural Architecture Search for Mobile (CVPR 2019) ShuffleNet:ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices (2017) ShuffleNetV2:ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design (2018) PeleeNet:Pelee: A Real-Time Object Detection System on Mobile Devices (2018) Shift-A:Shift: A Zero FLOP, Zero Parameter Alternative to Spatial Convolutions (2018) GhostNet: GhostNet: More Features from Cheap Operations (2020) ","date":"2023-06-01","objectID":"/posts/deeplearning/paper/:5:0","tags":["Deep Learning"],"title":"Deep Learning Paper","uri":"/posts/deeplearning/paper/"},{"categories":["Deep Learning"],"content":"GAN GAN:Generative Adversarial Networks ","date":"2023-06-01","objectID":"/posts/deeplearning/paper/:6:0","tags":["Deep Learning"],"title":"Deep Learning Paper","uri":"/posts/deeplearning/paper/"},{"categories":["Deep Learning"],"content":"如何读论文 李沐 第一遍:关注标题和摘要;结论。实验部分和方法的图表;看看适不适合。海选 第二遍:全过一遍,图表、流程图具体到每个部分;相关文献圈出来。精选 第三遍:知道每句话,每段话在说什么,换位思考。脑补过程。重点研读 ","date":"2023-06-01","objectID":"/posts/deeplearning/paper/:7:0","tags":["Deep Learning"],"title":"Deep Learning Paper","uri":"/posts/deeplearning/paper/"}]