Skip to content

Commit

Permalink
Merge pull request #11 from rgrr/feature/pio-games
Browse files Browse the repository at this point in the history
PIO optimizations
  • Loading branch information
rgrr authored Jan 6, 2023
2 parents 75c2f25 + 65a8dc0 commit 8d690a5
Show file tree
Hide file tree
Showing 11 changed files with 352 additions and 313 deletions.
34 changes: 31 additions & 3 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -62,6 +62,9 @@ with WAIT and the host needs to retry.
Effects of cabling should be clear: the longer the cables plus some more effects, the worse the signals. Which effectively means
slowing down clock frequency is required to get the data transported.

**Note:** SWCLK speed for MSC and RTT (below) is set according to the latest used tool setup. E.g. `pyocd gdb -f 5000000` sets
SWCLK to 5MHz.


## MSC - Mass Storage Device Class
Via MSC the so called "drag-n-drop" supported is implemented. Actually this also helps in copying a UF2 image directly into the target via command line.
Expand Down Expand Up @@ -119,7 +122,9 @@ ACTION=="remove", SUBSYSTEMS=="usb", SUBSYSTEM=="block", ENV{ID_FS_USAGE}=="file



# Benchmarking
# Optimizations

## Benchmarking
Benchmarking is done with an image with a size around 400KByte. Command lines are as follows:

* **cp**: `time cp firmware.uf2 /media/picoprobe/`
Expand All @@ -139,16 +144,39 @@ DAPv2 is always used, because DAPv1 does not run under Linux(?).
| git-bd8c41f | 5.7s | 28.6s | 7.7s | 19.9s | there was a python update :-/ |
| git-0d6c6a8 | 5.7s | 28.5s | 6.8s | 20.2s | |
| - same but optimized for openocd | 5.7s | 28.5s | 6.1s | - | pyocd crashes |
| git-0eba8bf | 4.9s | 28.6s | 6.5s | 13.8s | cp show sometimes 5.4s |
| git-0eba8bf | 4.9s | 28.6s | 6.5s | 13.8s | cp shows sometimes 5.4s |
| - same but optimized for openocd | 4.9s | 28.6s | 5.8s | - | pyocd crashes |
| git-e38fa52 | 4.8s | 28.6s | 6.6s | 14.0s | cp shows sometimes 5.4s |
| - same but optimized for openocd | 4.8s | 28.6s | 5.9s | - | pyocd crashes |
| git-28fd8db | 4.1s | 28.6s | 6.2s | 13.9s | cp shows sometimes 4.6s, SWCLK tuned to 25MHz |
| - same but optimized for openocd | 4.1s | 28.6s | 5.7s | - | pyocd crashes |


## PIO
Several PIO optimizations has been implemented. Main idea of PIO control has been taken from [pico_debug](https://github.com/essele/pico_debug/blob/main/swd.pio).

To monitor the progress between the several versions, [PulseView](https://sigrok.org/wiki/PulseView) has been used. LA probe was [sigrok-pico](https://github.com/pico-coder/sigrok-pico).

### First Version (03.01.2023 - e2b4a67)
![First Version](doc/png/Screenshot_20230103_074404.png "First Version (03.01.2023)")

### (Currently) Final Version (06.01.2023 - 28fd8db)
![Final Version (06.01.2023)](doc/png/Screenshot_20230106_153629.png "Final Version (06.01.2023)")

### Explanation / Conclusion
The plots above are taken with SWCLK = 15MHz. Absolute time of the four commands shrunk from ~25us to 18us. Not bad.

But there are still gaps which may give more optimization opportunities. Switching times between read / write and the
gap between two commands are candidates. Note that moving code into RAM did not really help (and optimization is still
a non/slow-working mystery).



# TODO / Known Bugs
* Bugs
* check the benchmark "miracle" with the NDEBUG version
* if `configTICK_RATE_HZ` is around 100, the SWD IF no longer works
* TODO
* TX host->target via RTT
* pyocd auto-detect?
* support of Nordic nRF52xxx (my other target)
* DAP_PACKET_SIZE: how to increase?
Expand Down
Binary file added doc/png/Screenshot_20230103_074404.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added doc/png/Screenshot_20230106_153629.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
12 changes: 6 additions & 6 deletions include/DAP_config.h
Original file line number Diff line number Diff line change
Expand Up @@ -54,7 +54,7 @@ This information includes:
/// Processor Clock of the Cortex-M MCU used in the Debug Unit.
/// This value is used to calculate the SWD/JTAG clock speed.
/* Picoprobe actually uses kHz rather than Hz, so just lie about it here */
#define CPU_CLOCK 125000000U ///< Specifies the CPU Clock in Hz.
#define CPU_CLOCK (PROBE_CPU_CLOCK_KHZ * 1000) ///< Specifies the CPU Clock in Hz.

/// Number of processor cycles for I/O Port write operations.
/// This value is used to calculate the SWD/JTAG clock speed that is generated with I/O
Expand Down Expand Up @@ -84,7 +84,7 @@ This information includes:
/// Default communication speed on the Debug Access Port for SWD and JTAG mode.
/// Used to initialize the default SWD/JTAG clock frequency.
/// The command \ref DAP_SWJ_Clock can be used to overwrite this default setting.
#define DAP_DEFAULT_SWJ_CLOCK (PROBE_DEFAULT_KHZ*1000U) ///< Default SWD/JTAG clock frequency in Hz. (10MHz)
#define DAP_DEFAULT_SWJ_CLOCK (probe_freq_khz * 1000U) ///< Default SWD/JTAG clock frequency in Hz. (10MHz)

/// Maximum Package Size for Command and Response data.
/// This configuration settings is used to optimize the communication performance with the
Expand Down Expand Up @@ -403,16 +403,16 @@ __STATIC_FORCEINLINE void PIN_SWDIO_OUT (uint32_t bit) {
Configure the SWDIO DAP hardware I/O pin to output mode. This function is
called prior \ref PIN_SWDIO_OUT function calls.
*/
__STATIC_FORCEINLINE void PIN_SWDIO_OUT_ENABLE (void) {
probe_write_mode();
__STATIC_FORCEINLINE void PIN_SWDIO_OUT_ENABLE (void)
{
}

/** SWDIO I/O pin: Switch to Input mode (used in SWD mode only).
Configure the SWDIO DAP hardware I/O pin to input mode. This function is
called prior \ref PIN_SWDIO_IN function calls.
*/
__STATIC_FORCEINLINE void PIN_SWDIO_OUT_DISABLE (void) {
probe_read_mode();
__STATIC_FORCEINLINE void PIN_SWDIO_OUT_DISABLE (void)
{
}


Expand Down
30 changes: 15 additions & 15 deletions src/daplink/interface/swd_host.c
Original file line number Diff line number Diff line change
Expand Up @@ -90,7 +90,7 @@ void int2array(uint8_t *res, uint32_t data, uint8_t len)
}
}

uint8_t swd_transfer_retry(uint32_t req, uint32_t *data)
uint8_t __not_in_flash_func(swd_transfer_retry)(uint32_t req, uint32_t *data)
{
uint8_t i, ack;

Expand Down Expand Up @@ -136,7 +136,7 @@ uint8_t swd_clear_errors(void)
}

// Read debug port register.
uint8_t swd_read_dp(uint8_t adr, uint32_t *val)
uint8_t __not_in_flash_func(swd_read_dp)(uint8_t adr, uint32_t *val)
{
uint32_t tmp_in;
uint8_t tmp_out[4];
Expand All @@ -157,7 +157,7 @@ uint8_t swd_read_dp(uint8_t adr, uint32_t *val)
}

// Write debug port register
uint8_t swd_write_dp(uint8_t adr, uint32_t val)
uint8_t __not_in_flash_func(swd_write_dp)(uint8_t adr, uint32_t val)
{
uint32_t req;
uint8_t data[4];
Expand All @@ -178,7 +178,7 @@ uint8_t swd_write_dp(uint8_t adr, uint32_t val)
}

// Read access port register.
uint8_t swd_read_ap(uint32_t adr, uint32_t *val)
uint8_t __not_in_flash_func(swd_read_ap)(uint32_t adr, uint32_t *val)
{
uint8_t tmp_in, ack;
uint8_t tmp_out[4];
Expand Down Expand Up @@ -207,7 +207,7 @@ uint8_t swd_read_ap(uint32_t adr, uint32_t *val)
}

// Write access port register
uint8_t swd_write_ap(uint32_t adr, uint32_t val)
uint8_t __not_in_flash_func(swd_write_ap)(uint32_t adr, uint32_t val)
{
uint8_t data[4];
uint8_t req, ack;
Expand Down Expand Up @@ -246,7 +246,7 @@ uint8_t swd_write_ap(uint32_t adr, uint32_t val)

// Write 32-bit word aligned values to target memory using address auto-increment.
// size is in bytes.
static uint8_t swd_write_block(uint32_t address, uint8_t *data, uint32_t size)
static uint8_t __not_in_flash_func(swd_write_block)(uint32_t address, uint8_t *data, uint32_t size)
{
uint8_t tmp_in[4], req;
uint32_t size_in_words;
Expand Down Expand Up @@ -302,7 +302,7 @@ static uint8_t swd_write_block(uint32_t address, uint8_t *data, uint32_t size)

// Read 32-bit word aligned values from target memory using address auto-increment.
// size is in bytes.
static uint8_t swd_read_block(uint32_t address, uint8_t *data, uint32_t size)
static uint8_t __not_in_flash_func(swd_read_block)(uint32_t address, uint8_t *data, uint32_t size)
{
uint8_t tmp_in[4], req, ack;
uint32_t size_in_words;
Expand Down Expand Up @@ -366,7 +366,7 @@ static uint8_t swd_read_block(uint32_t address, uint8_t *data, uint32_t size)
}

// Read target memory.
static uint8_t swd_read_data(uint32_t addr, uint32_t *val)
static uint8_t __not_in_flash_func(swd_read_data)(uint32_t addr, uint32_t *val)
{
uint8_t tmp_in[4];
uint8_t tmp_out[4];
Expand Down Expand Up @@ -403,7 +403,7 @@ static uint8_t swd_read_data(uint32_t addr, uint32_t *val)
}

// Write target memory.
static uint8_t swd_write_data(uint32_t address, uint32_t data)
static uint8_t __not_in_flash_func(swd_write_data)(uint32_t address, uint32_t data)
{
uint8_t tmp_in[4];
uint8_t req, ack;
Expand All @@ -430,7 +430,7 @@ static uint8_t swd_write_data(uint32_t address, uint32_t data)
}

// Read 32-bit word from target memory.
uint8_t swd_read_word(uint32_t addr, uint32_t *val)
uint8_t __not_in_flash_func(swd_read_word)(uint32_t addr, uint32_t *val)
{
if (!swd_write_ap(AP_CSW, CSW_VALUE | CSW_SIZE32)) {
return 0;
Expand All @@ -444,7 +444,7 @@ uint8_t swd_read_word(uint32_t addr, uint32_t *val)
}

// Write 32-bit word to target memory.
uint8_t swd_write_word(uint32_t addr, uint32_t val)
uint8_t __not_in_flash_func(swd_write_word)(uint32_t addr, uint32_t val)
{
if (!swd_write_ap(AP_CSW, CSW_VALUE | CSW_SIZE32)) {
return 0;
Expand All @@ -458,7 +458,7 @@ uint8_t swd_write_word(uint32_t addr, uint32_t val)
}

// Read 8-bit byte from target memory.
uint8_t swd_read_byte(uint32_t addr, uint8_t *val)
uint8_t __not_in_flash_func(swd_read_byte)(uint32_t addr, uint8_t *val)
{
uint32_t tmp;

Expand All @@ -475,7 +475,7 @@ uint8_t swd_read_byte(uint32_t addr, uint8_t *val)
}

// Write 8-bit byte to target memory.
uint8_t swd_write_byte(uint32_t addr, uint8_t val)
uint8_t __not_in_flash_func(swd_write_byte)(uint32_t addr, uint8_t val)
{
uint32_t tmp;

Expand All @@ -494,7 +494,7 @@ uint8_t swd_write_byte(uint32_t addr, uint8_t val)

// Read unaligned data from target memory.
// size is in bytes.
uint8_t swd_read_memory(uint32_t address, uint8_t *data, uint32_t size)
uint8_t __not_in_flash_func(swd_read_memory)(uint32_t address, uint8_t *data, uint32_t size)
{
uint32_t n;

Expand Down Expand Up @@ -543,7 +543,7 @@ uint8_t swd_read_memory(uint32_t address, uint8_t *data, uint32_t size)

// Write unaligned data to target memory.
// size is in bytes.
uint8_t swd_write_memory(uint32_t address, uint8_t *data, uint32_t size)
uint8_t __not_in_flash_func(swd_write_memory)(uint32_t address, uint8_t *data, uint32_t size)
{
uint32_t n = 0;

Expand Down
2 changes: 1 addition & 1 deletion src/main.c
Original file line number Diff line number Diff line change
Expand Up @@ -181,7 +181,7 @@ void usb_thread(void *ptr)
int main(void)
{
board_init();
set_sys_clock_khz(CPU_CLOCK / 1000, true);
set_sys_clock_khz(PROBE_CPU_CLOCK_KHZ, true);

usb_serial_init();
tusb_init();
Expand Down
18 changes: 11 additions & 7 deletions src/picoprobe_config.h
Original file line number Diff line number Diff line change
Expand Up @@ -74,15 +74,16 @@
#define picoprobe_error(format,...) ((void)0)
#endif

#define PROBE_CPU_CLOCK_KHZ (150*1000) // overclocked to 150MHz, even 200MHz seems to be no problem

// PIO config
#define PROBE_SM 0
#define PROBE_PIN_OFFSET 2
#define PROBE_PIN_SWCLK (PROBE_PIN_OFFSET + 0) // 2
#define PROBE_PIN_SWDIO (PROBE_PIN_OFFSET + 1) // 3
#define PROBE_PIN_RESET 6 // Target reset config
#define PROBE_MAX_KHZ 24000U // according to RP2040 datasheet 24MHz
#define PROBE_DEFAULT_KHZ 12000
#define PROBE_SM 0
#define PROBE_PIN_OFFSET 2
#define PROBE_PIN_SWCLK (PROBE_PIN_OFFSET + 0) // 2
#define PROBE_PIN_SWDIO (PROBE_PIN_OFFSET + 1) // 3
#define PROBE_PIN_RESET 6 // Target reset config
#define PROBE_MAX_KHZ 25000U // overclocked: according to RP2040 datasheet 24MHz
#define PROBE_DEFAULT_KHZ 15000

// UART config (UART target -> probe)
#define PICOPROBE_UART_TX 4
Expand All @@ -101,4 +102,7 @@
#endif
#endif

extern uint32_t probe_freq_khz;


#endif
Loading

0 comments on commit 8d690a5

Please sign in to comment.