2018 03 07

wangkuiyi

abhinavarora

CSP
- Adding more unit tests for ChannelHolder class https://github.com/PaddlePaddle/Paddle/pull/8668
- Redesign channel implementation for Select Op https://github.com/PaddlePaddle/Paddle/pull/8814
- [WIP] Implement non blocking CanSend and CanReceive in Channels for Select OP https://github.com/PaddlePaddle/Paddle/issues/8815
- Issues Opened to Support Select
  - Fluid Channel should match Go Channel in Semantics https://github.com/PaddlePaddle/Paddle/issues/8813
  - Expose methods to create and add QueueMessage to Send or Receive Queue https://github.com/PaddlePaddle/Paddle/issues/8863
  - Add ability to provide callbacks in QueueMessage Notify https://github.com/PaddlePaddle/Paddle/issues/8864
Update Authors.md https://github.com/PaddlePaddle/Paddle/pull/8692
PR Reviews

cs2be(thuan)

Paddle
- Fix Documentation and API links on paddlepaddle.org (https://github.com/PaddlePaddle/Paddle/pull/8760)
- CSP
  - Select Operator, Select Operator Unit tests (https://github.com/PaddlePaddle/Paddle/pull/8867)

qijun

Memory optimization on Fluid:

Model | no optimize | reuse memory | release memory | forward memory
---- | --- | --- | --- | ---
Resnet | 170590208 | 92995584 (reduce 45.5%) | 78004224 (reduce 54.3%) | 77488128
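
The "reduce" percentages in the table can be rechecked from the raw numbers. The snippet below is only a sketch of that arithmetic; the helper name `reduction_percent` is made up for illustration and is not part of Paddle, and the table does not state the unit of the memory figures.

```python
# Sanity check for the "reduce N%" figures in the table above.
# The unit of these numbers is not stated in the notes.
BASELINE = 170590208  # "no optimize" column for Resnet


def reduction_percent(optimized, baseline=BASELINE):
    """Percentage saved relative to the baseline, rounded to one decimal."""
    return round((1 - optimized / baseline) * 100, 1)


reuse = reduction_percent(92995584)    # "reuse memory" column
release = reduction_percent(78004224)  # "release memory" column
print(reuse, release)
```

Both values agree with the percentages reported in the table.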

https://github.com/PaddlePaddle/Paddle/pull/8690
Try to combine scheduler and memory optimization together

Fix and Enhance:

Docs:

https://github.com/PaddlePaddle/Paddle/pull/8833

caoying

PR review:
- Inception-V4 for Fluid: https://github.com/PaddlePaddle/models/pull/672#pullrequestreview-100702951
- squeeze/unsqueeze operator: https://github.com/PaddlePaddle/Paddle/pull/8541#pullrequestreview-100686151
PR:
- reorganized NMT directory: https://github.com/PaddlePaddle/models/pull/690
- [WIP, no PR created yet]: add multi-card/multi-thread implementation for Transformer
  - need to enhance reshape operator first: https://github.com/PaddlePaddle/Paddle/issues/8781
  - basic implementation finished; still need to test and clean up the code.

wanghaoshuang

OCR CTC
- Refine training net; final sequence error is 0.286 (Fluid) vs. 0.278 (V1) without 'model average'
  - https://github.com/PaddlePaddle/models/pull/596
- Add model average option for fluid API [WIP]
Add guide for howto documentation:
- https://github.com/PaddlePaddle/Paddle/pull/8849

fengjiayi

Remove Accuracy from Evaluator:
- https://github.com/PaddlePaddle/Paddle/pull/8643
A new design of model save/load:
- save/load layer: https://github.com/PaddlePaddle/Paddle/pull/8711
- related issue: https://github.com/PaddlePaddle/Paddle/issues/8521
[WIP] Double buffer for C++ reader:
- https://github.com/PaddlePaddle/Paddle/pull/8841
Doc update:
- https://github.com/PaddlePaddle/Paddle/pull/8765

luotao

compile:
- fix "only shared variables could be declared as static in the device code" error on CUDA 7.5: https://github.com/PaddlePaddle/Paddle/pull/8716
- fix warning: statement is unreachable: https://github.com/PaddlePaddle/Paddle/pull/8740
- refine operators/math/CMakeLists.txt: https://github.com/PaddlePaddle/Paddle/pull/8682
inference:
- compile and install the static library of fluid inference: https://github.com/PaddlePaddle/Paddle/pull/7827
doc:
- remove doc_theme: https://github.com/PaddlePaddle/Paddle/pull/8752
- Create new structure for V2 and fluid doc: https://github.com/PaddlePaddle/Paddle/pull/8761
- fix document deployment: https://github.com/PaddlePaddle/Paddle/pull/8772
code review:
- Fix building error of missing end-group for Android: https://github.com/PaddlePaddle/Paddle/pull/8680
- Fix fluid distribute build: https://github.com/PaddlePaddle/Paddle/pull/8704
- [Intel] MKLDNN conv2d kernels added: https://github.com/PaddlePaddle/Paddle/pull/8451
- 5 doc guides: https://github.com/PaddlePaddle/Paddle/issues?q=is%3Aissue+label%3Adocumentation+author%3Ashanyi15+is%3Aopen

wuyi

distributed training performance boost:
- https://github.com/PaddlePaddle/Paddle/pull/8839
- https://github.com/typhoonzero/grpc_zerocopy_async_example
EDL reviews:
- https://github.com/PaddlePaddle/cloud/pull/634

Dang Qingqing

SSD on Fluid:
- Verify the backward of SSD loss.
  - https://github.com/PaddlePaddle/Paddle/issues/8626#issuecomment-369518496
- Verify the correctness of detection_output API in SSD.
  - https://github.com/PaddlePaddle/Paddle/issues/8687
- Implement detection mAP evaluator wrapper and unify label format between SSD loss and mAP evaluator.
  - https://github.com/PaddlePaddle/Paddle/pull/8736
- Enable device automatically switching in mine_hard_examples_op.
  - https://github.com/PaddlePaddle/Paddle/pull/8706
- Refine the doc in detection_output API.
  - https://github.com/PaddlePaddle/Paddle/pull/8689
- Fix bug in detection mAP evaluator.
  - https://github.com/PaddlePaddle/Paddle/pull/8778
- Refine MobileNet SSD model and add mAP evaluator.
  - https://github.com/PaddlePaddle/models/pull/694
- Fix the test reader for SSD.
  - https://github.com/PaddlePaddle/models/pull/695
- Converted a TensorFlow MobileNet-SSD model based on the COCO dataset for pre-training, and trained MobileNet-SSD.

Yan Xu

[WIP] Fix sparse update bug, https://github.com/PaddlePaddle/Paddle/issues/8678
document fix: https://github.com/PaddlePaddle/Paddle/pull/8853
fix nccl version in manylinux Dockerfile, https://github.com/PaddlePaddle/Paddle/pull/8708

qiaolongfei

fluid

Profile
- se_resnet_152 multi-gpu profile https://github.com/PaddlePaddle/Paddle/issues/8661
- [Speed]speed up python executor in fluid https://github.com/PaddlePaddle/Paddle/issues/8729
- add se resnet 152 profile script https://github.com/dzhwinter/benchmark/pull/84
- Add program cache for executor.py https://github.com/PaddlePaddle/Paddle/pull/8744
- add image resnet profile https://github.com/dzhwinter/benchmark/pull/83
- fluid vs pytorch profile https://github.com/PaddlePaddle/Paddle/issues/8677
```
`time_per_batch` | memory
---- | ---
1.95/1.15 = 1.696 | 18341 / 13359.0 = 1.373
```
- DeepASR profile https://github.com/PaddlePaddle/Paddle/issues/8750
- add timeline profile howto https://github.com/PaddlePaddle/Paddle/pull/8844
- DataReader profile https://github.com/PaddlePaddle/Paddle/issues/8857
Bug fixes and code optimization
- fix snappy build https://github.com/PaddlePaddle/Paddle/pull/8804
- add print_log to memory_optimize https://github.com/PaddlePaddle/Paddle/pull/8831
Review
- Add CPU time and MemCopy to the timeline. https://github.com/PaddlePaddle/Paddle/pull/8679
- Enable device switching automatically for some operators in detection. https://github.com/PaddlePaddle/Paddle/pull/8684
- Enable device automatically switching in mine_hard_examples_op. https://github.com/PaddlePaddle/Paddle/pull/8706
- add inplace to reshape https://github.com/PaddlePaddle/Paddle/pull/8747
- [Speed]Avoid init_nccl for every steps. https://github.com/PaddlePaddle/Paddle/pull/8758
- Improve the timeline profiler https://github.com/PaddlePaddle/Paddle/pull/8775
- [Speed]Refine elementwise_mul_op gradient functor https://github.com/PaddlePaddle/Paddle/pull/8810
- [Speed] Refine elementwise sub,div,min,max gradient functor https://github.com/PaddlePaddle/Paddle/pull/8820
- fix mac build error https://github.com/PaddlePaddle/Paddle/pull/8856

ranqiu

Refine doc

https://github.com/PaddlePaddle/Paddle/pull/8801
Survey and discussion about API documentation standard

https://github.com/PaddlePaddle/Paddle/issues/8834

https://github.com/PaddlePaddle/Paddle/issues/8866

zhangchao

Fix the error in Readme.md of mt_with_external_memory.
- https://github.com/PaddlePaddle/models/pull/685
Replace paddle.v2.fluid by paddle.fluid in benchmark.
- https://github.com/dzhwinter/benchmark/pull/85
[WIP]Operators profiling in Transformer model.
- https://github.com/PaddlePaddle/models/issues/697
[WIP] Implement rnn_search model with multi-gpu.
- https://github.com/peterzhang2029/models/tree/add_rnn_search/fluid/neural_machine_translation/rnn_search

Yang Yang (tonyyang-svail)

ParallelDo Profiling: https://github.com/PaddlePaddle/Paddle/issues/8719
VGG net: https://github.com/PaddlePaddle/Paddle/issues/8718
NCCL: https://github.com/PaddlePaddle/Paddle/pull/8575
Parallel Executor: https://github.com/helinwang/Paddle/commit/07709351157f30260bff2da12c72a565b021bb70

tangwei

Document: https://github.com/PaddlePaddle/Paddle/pull/8656
ISSUE: https://github.com/PaddlePaddle/cloud/issues/638
EDL: Deploy EDL on Sys Kubernetes cluster

guosheng

NMT:
- [WIP] Add inference and external beam search implementation for Transformer.
- Tuned the Transformer model to reach a training cost comparable to PyTorch.

yangyaming

Refine document
https://github.com/PaddlePaddle/Paddle/pull/8737
Refine api design for beam search
https://github.com/PaddlePaddle/models/pull/675
Enhance data reader for DeepASR [WIP]

Liu Yiqun

Inference Framework
- [Merged] Add profiling information for inference example
  - https://github.com/PaddlePaddle/Paddle/pull/8748
- [Reviewing] Add test for nested RecordEvent
  - https://github.com/PaddlePaddle/Paddle/pull/8773
- Review
  - Fix nullptr when doing nested profiling: https://github.com/PaddlePaddle/Paddle/pull/8782
  - refine operators/math/CMakeLists.txt: https://github.com/PaddlePaddle/Paddle/pull/8682
  - Enable is_test attr of batch_norm and drop out op for test program: https://github.com/PaddlePaddle/Paddle/pull/8642
  - compile and install the static library of fluid inference: https://github.com/PaddlePaddle/Paddle/pull/7827
Mobile
- [Merged] Fix building error of missing end-group for Android
  - https://github.com/PaddlePaddle/Paddle/pull/8680
- [Merged] Try to add build_android task back to travis
  - https://github.com/PaddlePaddle/Paddle/pull/8699
- [Doing] Enable the building of Fluid for Android
  - https://github.com/PaddlePaddle/Paddle/pull/8709

Yibing Liu

Add guideline for the doc of cmd parameters
- https://github.com/PaddlePaddle/Paddle/pull/8732
Setup DeepASR training on k8s cluster (thanks to @wuyi @weibao @yanxu), 2x speedup (4 GPUs, P40 vs. K40).
Model training: found poor convergence with the momentum optimizer
- https://github.com/PaddlePaddle/models/issues/696
- https://github.com/PaddlePaddle/models/pull/692
Some small fixes
- https://github.com/PaddlePaddle/models/pull/677
Code Review:
- https://github.com/PaddlePaddle/Paddle/pull/8848
- https://github.com/PaddlePaddle/Paddle/pull/8773

Yu Yang

[Reviewing] RecordIO Reader
- https://github.com/PaddlePaddle/Paddle/pull/8830
[Merged] RecordIO File Utilities
- https://github.com/PaddlePaddle/Paddle/pull/8780
Profiling GPU/CPU code
- Several internal web page links.
Several enhancements
- https://github.com/PaddlePaddle/Paddle/pull/8657
- https://github.com/PaddlePaddle/Paddle/pull/8791

zhaochengduo

Performance optimization on a single card
- Refine elementwise_mul_op gradient functor
  - https://github.com/PaddlePaddle/Paddle/pull/8810
- Refine concat_op
  - https://github.com/PaddlePaddle/Paddle/pull/8669
- Refine elementwise sub,div,min,max gradient functor
  - https://github.com/PaddlePaddle/Paddle/pull/8820
Enhancement
- Add log before op Run
  - https://github.com/PaddlePaddle/Paddle/pull/8859
Debug
- For test_machine_translation.py, the program hangs when regularization is enabled. https://github.com/PaddlePaddle/Paddle/issues/8697
Review

dongzhihong

Fluid
- add development standard
  - https://github.com/PaddlePaddle/Paddle/pull/8803
- add feature/recordio io format
  - https://github.com/PaddlePaddle/Paddle/issues/8763
  - https://github.com/PaddlePaddle/Paddle/pull/8759
- refine gpu common
  - https://github.com/PaddlePaddle/Paddle/pull/8742

Xin Pan

paddle performance profiling
paddle performance optimization
- https://github.com/PaddlePaddle/Paddle/pull/8758
paddle multi-threaded scheduling
- https://github.com/PaddlePaddle/Paddle/issues/8818

sidgoyal78

Inference:

TensorRT results: https://github.com/PaddlePaddle/Paddle/issues/8790 (Code has been shared with Kexin and Xreki in another repo)
Discussion with Nvidia folks for TensorRT integration

PR review:

Profiling info for inference example: https://github.com/PaddlePaddle/Paddle/pull/8748
float16 add context wait: https://github.com/PaddlePaddle/Paddle/pull/8850
float16 data type transform: https://github.com/PaddlePaddle/Paddle/pull/8619

helinwang

Performance tuning:
Move EDL from PaddlePaddle/cloud to PaddlePaddle/edl with Yi: https://github.com/PaddlePaddle/edl/pull/5

jetfuel

Integrate PyBind documentation into Sphinx documentation generation: https://github.com/PaddlePaddle/VisualDL/pull/292
Update Travis-CI setting: https://github.com/PaddlePaddle/VisualDL/pull/295, https://github.com/PaddlePaddle/VisualDL/pull/297
Fix incorrect documentation link: https://github.com/PaddlePaddle/VisualDL/pull/288
Add English examples of how to use VisualDL with PyTorch, MXNet, Keras: https://github.com/PaddlePaddle/VisualDL/pull/299
Publish the new examples to the websites: https://github.com/PaddlePaddle/VisualDL/pull/300

varunarora

WIP of CSP Select op and Channels redesign: https://github.com/PaddlePaddle/Paddle/pull/8867
Collaborate with Jeff on VDL issues
