Telink Swire on the wire
Telink SoCs like the TLSR8258 are widely used in cheap IoT devices. Wouldn’t it be neat to have a customizable, power-efficient Zigbee device for a few euros? I certainly thought so. To program and debug these chips, Telink uses a proprietary single-wire protocol called Swire.
The official Telink programmer that works with their tooling is not that expensive, and the Telink IoT Studio for creating applications for the chips is free to download. So if that is all you want to do, this is likely the most painless way to go.
Because I apparently do not value my time, I did not go with the official programmer. Instead I decided on https://github.com/pvvx/TLSRPGM which is an open-source implementation based on a cheap TB-03F-Kit development board. The board uses a TLSR8258, which has the necessary hardware to natively talk the Swire protocol. There is also the possibility of just using a serial adapter, but that was supposedly slow and unreliable, and I did not want to waste time with unreliable interfaces.
Easy peasy then. Set it up and you are off coding? Well. I wasn’t so lucky. I could not even read the firmware from my cheap Tuya Zigbee ZTH03-Pro thermometer, which uses a branded TLSR8258 variant (Z2), and debugging those failing attempts made me dig much deeper into what’s actually happening on the wire. While not earth-shattering, I found it kind of neat, so here we are.
Telink Swire basics#
Pretty much the only Telink Swire documentation available, at least that I could find, is at https://github.com/pvvx/TLSRPGM/tree/main/TelinkSWire . It originally comes from an old Telink Preliminary datasheet release that was later removed and lays out the basics of the protocol:

SWire is a single-wire half-duplex master/slave bus protocol. The bus idles high through the use of pull-up resistors.
Time is split into units in which the bus is either high or low.
Bits are encoded with 5 units per bit allowing self-clocking:
- A zero bit is encoded as 1 unit LOW followed by 4 units HIGH
- A one bit is encoded as 4 units LOW followed by 1 unit HIGH
Bytes are assembled from bits in one of two ways:
- Master bytes: CMD (1 bit) + DATA (8 bits) + END unit
- Slave bytes: DATA (8 bits) + END unit (no CMD bit)
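To make the encoding concrete, here is a minimal C sketch that expands one master byte into its sequence of bus levels, one array entry per time unit. The function names are mine. Two details are assumptions that the description above does not spell out: data bits are sent MSB first, and the END unit is a single LOW unit.

```c
#include <stddef.h>
#include <stdint.h>

/* One Swire bit takes five time units; each unit is 0 (LOW) or 1 (HIGH). */
enum { SWIRE_UNITS_PER_BIT = 5, SWIRE_UNITS_PER_MASTER_BYTE = 46 };

/* Append one bit: zero = 1 LOW + 4 HIGH, one = 4 LOW + 1 HIGH. */
static size_t swire_put_bit(uint8_t *units, size_t pos, int bit)
{
    int low_units = bit ? 4 : 1;
    for (int i = 0; i < SWIRE_UNITS_PER_BIT; i++)
        units[pos++] = (i < low_units) ? 0 : 1;
    return pos;
}

/*
 * Encode a master byte: CMD bit, 8 data bits, END unit.
 * Assumptions: data goes out MSB first and the END unit is LOW.
 * `units` needs room for SWIRE_UNITS_PER_MASTER_BYTE entries.
 */
size_t swire_encode_master_byte(uint8_t *units, int cmd, uint8_t data)
{
    size_t pos = swire_put_bit(units, 0, cmd);
    for (int i = 7; i >= 0; i--)
        pos = swire_put_bit(units, pos, (data >> i) & 1);
    units[pos++] = 0; /* END unit (assumed LOW) */
    return pos;
}
```

A slave byte would be built the same way, just without the leading CMD bit.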
Protocol read/write both follow the same basic structure:
START → ADDR (2-3 bytes) → RW_ID → DATA... → END
- START: CMD=1, DATA=0x5A
- END: CMD=1, DATA=0xFF
- RW_ID bit 7: Read (1) / Write (0)
- RW_ID bits 6-0: Slave ID
CMD is always 0 except for START and END bytes. Depending on the chip type you are talking to addresses are two (e.g. TLSR826x) or three bytes (e.g. TLSR825x). For a write operation the master just keeps sending bytes with the END byte indicating the end of the transaction.
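As a rough illustration, a write transaction can be laid out as a list of (CMD, DATA) pairs before any bit-level encoding happens. This is only a sketch: the struct and function names are hypothetical, and sending the address most significant byte first is my assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* A master byte before bit-level encoding: the CMD flag plus 8 data bits. */
struct swire_byte {
    uint8_t cmd;
    uint8_t data;
};

/*
 * Fill `out` with the master bytes of a write transaction:
 * START, address (MSB first, an assumption), RW_ID, payload, END.
 * Returns the number of entries written, or 0 on bad arguments.
 */
size_t swire_build_write(struct swire_byte *out, size_t cap,
                         uint32_t addr, unsigned addr_len, uint8_t slave_id,
                         const uint8_t *payload, size_t len)
{
    size_t need = 1 + addr_len + 1 + len + 1;
    size_t n = 0;

    if (addr_len < 2 || addr_len > 3 || cap < need)
        return 0;

    out[n++] = (struct swire_byte){1, 0x5A};                       /* START */
    for (unsigned i = addr_len; i-- > 0;)
        out[n++] = (struct swire_byte){0, (uint8_t)(addr >> (8 * i))};
    out[n++] = (struct swire_byte){0, (uint8_t)(slave_id & 0x7F)}; /* bit 7 clear = write */
    for (size_t i = 0; i < len; i++)
        out[n++] = (struct swire_byte){0, payload[i]};
    out[n++] = (struct swire_byte){1, 0xFF};                       /* END */
    return n;
}
```

Writing the single byte 0x05 to 0x602 on slave 0 with a three-byte address yields seven master bytes: START, 0x00, 0x06, 0x02, RW_ID 0x00, 0x05, END.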
For a read operation, after starting a transaction with the RW_ID read bit set, the master sends a 1-unit pulse indicating that it is releasing the bus to the addressed slave for one response byte. This way the master stays in control: it decides when the next byte is received and when the transaction ends.
This is all there is to it at the lower protocol level. The protocol is documented as being able to achieve up to a 2Mbit/s bit rate. Apparently the speed can be auto negotiated by the hardware. In my setup the programmer used a 0.96Mbit/s rate resulting in each unit being approximately 208ns.
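The unit length follows directly from the bit rate, since every bit is five units long. A tiny sketch of that arithmetic (the function name is mine):

```c
/* Duration of one Swire time unit in nanoseconds for a given bit rate. */
double swire_unit_ns(double bits_per_second)
{
    const double units_per_bit = 5.0;
    return 1e9 / (bits_per_second * units_per_bit);
}
```

For 0.96 Mbit/s this gives 1e9 / 4.8e6, or roughly 208 ns per unit. This is handy for picking a logic analyzer sample rate that resolves a single unit.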
Based on this description of the protocol I created a protocol decoder for Pulseview. With that and a cheap compatible logic analyzer I bought for 5€ years back I could then actually see this protocol on the wire:

What is depicted here is a write operation. The magic START byte with CMD set is followed by the three-byte address 0x602. In the next byte the master specifies that it wants this write to be executed on slave ID 0. Then it sends one actual data byte, 0x05, and ends the transaction with an END byte.
Telink Swire debug protocol#
So how does Telink use this protocol to enable programming and debugging of their chips? Actually, you already saw all there is to it. The address sent is a memory address on the slave. As with many controllers, all registers and memory are part of one address space. Peripherals are set up and controlled through registers too.
For example take the command from above that wrote 0x05 into 0x602. Looking at the TLSR8258 datasheet we see that 0x602 is part of the registers controlling the MCU itself:

Unfortunately these are undocumented. But looking at the official Telink BDT software used with their programmer kind of gives it away:

The register controls the MCU mode. Writing 0x06 there will stall CPU execution. The author of TLSRPGM, who was super helpful and responsive, sent me the following values:
| 0x602 value | Effect |
|---|---|
| 0x88 | CPU Reboot |
| 0x08 | CPU Go |
| 0x06 | CPU Stall |
| 0x05 | CPU Stop |
Many other registers are actually documented in the datasheet though and more can be seen from SDK code. All of this is of course highly SoC dependent.
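If you want to annotate these writes in your own tooling, the table above maps naturally onto an enum. This is a sketch only: the identifiers are made up by me and do not come from the Telink SDK; the values are the ones from the table.

```c
#include <stdint.h>

/* Address of the MCU control register; the macro name is hypothetical. */
#define REG_MCU_CTRL 0x0602u

/* Values for the 0x602 register as listed in the table. */
enum mcu_ctrl_value {
    MCU_CTRL_STOP   = 0x05,
    MCU_CTRL_STALL  = 0x06,
    MCU_CTRL_GO     = 0x08,
    MCU_CTRL_REBOOT = 0x88,
};

/* Human-readable label for a value written to REG_MCU_CTRL. */
const char *mcu_ctrl_name(uint8_t value)
{
    switch (value) {
    case MCU_CTRL_STOP:   return "CPU Stop";
    case MCU_CTRL_STALL:  return "CPU Stall";
    case MCU_CTRL_GO:     return "CPU Go";
    case MCU_CTRL_REBOOT: return "CPU Reboot";
    default:              return "unknown";
    }
}
```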
Now let’s get a bit more complex. What I wanted to do in the beginning was dump the flash. So what does that look like on Swire?
Reading from flash#

The very first action is writing 0x00 to the 0x0D register. Looking at the register_8258.h header we see:
#define reg_mspi_data REG_ADDR8(0x0c)
#define reg_mspi_ctrl REG_ADDR8(0x0d)
enum{
    FLD_MSPI_CS = BIT(0),
    FLD_MSPI_SDO = BIT(1),
    FLD_MSPI_CONT = BIT(2),
    FLD_MSPI_RD = BIT(3),
    FLD_MSPI_BUSY = BIT(4),
};
So 0x0d is the control register for the memory SPI bus. Turns out our SoC talks to its own flash using SPI. What do we do by setting reg_mspi_ctrl=0? We set all bits to 0 including FLD_MSPI_CS which is the internal chip-select. It is active-low, so writing 0 selects the memory SPI flash chip.
The next four transactions all write to 0xC:

0x0C is reg_mspi_data and all data we write there is clocked out over the SPI bus to our flash. 0x03 is a read data operation which expects a 24-bit address to follow. Since I want to dump all of flash the start address is 0x000000 sent as three 0x00 bytes.
The following write is our first two byte write:

This will write 0x00 to reg_mspi_data (0x0C) and 0x0A to reg_mspi_ctrl (0x0D). The data write drives the SPI clock to initiate the first read cycle. 0x0A is setting FLD_MSPI_RD and FLD_MSPI_SDO to configure the SPI for auto read mode.
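To double-check the bit arithmetic, here is a small standalone sketch. The flag values are copied from the header excerpt above, while the `BIT` macro definition and the two helper functions are my own additions for illustration.

```c
#include <stdint.h>

#define BIT(n) (1u << (n))

/* reg_mspi_ctrl flags, values as in register_8258.h */
enum {
    FLD_MSPI_CS   = BIT(0),
    FLD_MSPI_SDO  = BIT(1),
    FLD_MSPI_CONT = BIT(2),
    FLD_MSPI_RD   = BIT(3),
    FLD_MSPI_BUSY = BIT(4),
};

/* CS is active-low: the flash is selected while the CS bit is clear. */
int mspi_flash_selected(uint8_t ctrl)
{
    return (ctrl & FLD_MSPI_CS) == 0;
}

/* Control value seen on the wire for auto read: RD and SDO set, CS still low. */
uint8_t mspi_auto_read_ctrl(void)
{
    return (uint8_t)(FLD_MSPI_RD | FLD_MSPI_SDO);
}
```

`mspi_auto_read_ctrl()` evaluates to 0x0A, and both 0x00 and 0x0A leave the flash selected because bit 0 stays clear.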

Last but not least we write 0x80 to 0xB3. What is at 0xB3?
#define reg_swire_id REG_ADDR8(0xb3)
enum
{
    FLD_SWIRE_ID_SLAVE_ID = BIT_RNG(0,6),
    FLD_SWIRE_ID_SLAVE_FIFO_EN = BIT(7),
};
It is reg_swire_id, which is part of the Swire configuration. 0x80 sets FLD_SWIRE_ID_SLAVE_FIFO_EN, which switches Swire into FIFO mode so that it can do repeated reads/writes from the same address.
Now we are actually ready to read.

For this the master starts a read transaction from reg_mspi_data (0x0C). For each byte the master wants to receive it sends a trigger unit and with that hands bus control to the slave. The slave reads a byte from reg_mspi_data and sends it back over Swire implicitly returning control back to the master. Since the flash is in read mode it will automatically increment the address so the next read will return the next byte in flash. In this specific case the first two bytes were both 0x00.
Since the master stays in full control, at any time it can simply send the END byte once it has received all the bytes it is interested in:

All that is left now is some cleanup:

This clears FLD_SWIRE_ID_SLAVE_FIFO_EN in reg_swire_id (0xB3), switching Swire back to normal mode.

The very last action is deselecting our built-in flash chip again by setting FLD_MSPI_CS on reg_mspi_ctrl (0x0D).
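Put together, the whole flash read is just ten Swire transactions. The sketch below lists them as plain data so they can be fed into whatever transport you have; the types and function names are hypothetical and not part of TLSRPGM. The exact cleanup values (0x00 for reg_swire_id with slave ID 0, 0x01 for reg_mspi_ctrl) are my assumption from the flag definitions.

```c
#include <stddef.h>
#include <stdint.h>

enum swire_op_kind { SWIRE_OP_WRITE, SWIRE_OP_READ };

/* One Swire transaction against a register address. */
struct swire_op {
    enum swire_op_kind kind;
    uint8_t reg;      /* start register address */
    uint8_t data[2];  /* payload for writes, unused for reads */
    size_t len;       /* number of bytes written or read */
};

enum { FLASH_READ_OPS = 10 };

static struct swire_op op_write(uint8_t reg, uint8_t b0, uint8_t b1, size_t len)
{
    struct swire_op op = { SWIRE_OP_WRITE, reg, { b0, b1 }, len };
    return op;
}

/* Fill `out` (room for FLASH_READ_OPS entries) with the read sequence. */
size_t flash_read_plan(struct swire_op *out, uint32_t flash_addr, size_t read_len)
{
    size_t n = 0;

    out[n++] = op_write(0x0d, 0x00, 0, 1);  /* reg_mspi_ctrl: CS low, select flash */
    out[n++] = op_write(0x0c, 0x03, 0, 1);  /* reg_mspi_data: SPI read data command */
    out[n++] = op_write(0x0c, (uint8_t)(flash_addr >> 16), 0, 1);
    out[n++] = op_write(0x0c, (uint8_t)(flash_addr >> 8), 0, 1);
    out[n++] = op_write(0x0c, (uint8_t)flash_addr, 0, 1);
    out[n++] = op_write(0x0c, 0x00, 0x0A, 2); /* 0x00 to data, 0x0A to ctrl: auto read */
    out[n++] = op_write(0xb3, 0x80, 0, 1);  /* reg_swire_id: FIFO mode on */
    out[n++] = (struct swire_op){ SWIRE_OP_READ, 0x0c, { 0, 0 }, read_len };
    out[n++] = op_write(0xb3, 0x00, 0, 1);  /* FIFO mode off (assumed value) */
    out[n++] = op_write(0x0d, 0x01, 0, 1);  /* CS high, deselect (assumed value) */
    return n;
}
```

Keeping the sequence as data rather than as direct hardware calls makes it easy to compare against a logic analyzer capture step by step.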
Congratulations. You now know how to read the flash of a TLSR8258 over Swire ;) It is not hard to see how other normal debug operations can be executed using the same primitives.
What went wrong?#
So what went wrong with my simple initial goal of just dumping the flash of my device? Well. The problem wasn’t Swire. At least not the protocol. Everything sent on the wire was perfectly valid but the contents made no sense and sometimes the slave just stopped mid byte ending the read.
TLSRPGM will aggressively attempt to halt the MCU after reset to prevent code execution. For this it uses the CPU Stop command mentioned before. Naively copying from TLSRPGM documentation I also asked the tool to issue a CPU Stall command before initiating the flash read.
Turns out that for some unfathomable (or undocumented by Telink, whichever you prefer) reason this makes the MCU partially wake up. When running, the MCU reads its program code from the same flash we are trying to dump over Swire at the same time, which messes up the addressing and creates other errors down the line.
Simply removing the “-c” parameter that caused the superfluous stall to be sent by TLSRPGM resolved all my problems. No more errors. Repeatable reads.
What did we learn?#
Stick to official tooling so you don’t waste your… nah, who am I kidding… be prepared for some interesting excursions when (mis)using unofficial tooling.
Due to this happy little accident I got to explore Telink Swire which is a somewhat obscure protocol without many features. But even with a minimal feature set it allows complex debug and programming operations on supported chips.
With my problem fixed and curiosity satisfied I can now return to the thing I was trying to do in the first place: writing software for my soon-hopefully-no-longer thermometer ;)