DAC with DMA and buffer on a Teensy 3.2

A project log for Solid state flow sensing using EIS

Investigations into a novel flow sensing technique to create tiny cheap flow sensors, all using electrical impedance spectroscopy.

Arthur Admiraal, 07/10/2016 at 14:12

Although I haven't started programming the main program for RISA yet, I have done some tests. For one of them, I tried to get the DAC working. Since the main application will have to both read out two ADCs and write data to the DAC, I want to offload as much as I can from the processor. Hence, I tried to get the DAC working with its internal buffer and Direct Memory Access (DMA). I couldn't really find a tutorial on how to do this, so it took a while. Because of that, I figured I would write up one of my own here.

If you don't know about DMA, it is a pretty cool feature that lets microcontroller peripherals read from and write to the RAM on their own, so they keep functioning while the processor does other stuff. In addition, the DAC has a 16-word (16 times 2 bytes) buffer for storing values. With this buffer, the DMA is needed less frequently, which leaves more time for other peripherals to access it.

I have found really good examples of using the DAC with DMA but without its buffer, and of using the DAC with its buffer but without DMA, but nothing that combined the two. After some reading of the datasheet, I now know how to do that. Still, most of this tutorial was built on these examples, so a big thanks to pjrc forum users ferdinandkeil and the_pman for putting them up there.

I have made all register names hyperlinks to their place in the datasheet, so if you want to see what they do, just click on them to read all about them. All example code is of the Teensyduino flavor, but if you need to adapt it to the Kinetis SDK, most of what you need to do is adjust for minor differences in register names.

This tutorial assumes basic familiarity with Arduino, C, bitwise logic and pointers.

The code

Initialising the DAC

We start by initialising the DAC itself. First, the clock to the DAC module will need to be enabled. This can be done by setting bit 12 of the System Clock Gating Control Register 2:

SIM_SCGC2 |= SIM_SCGC2_DAC0; // enable DAC clock

Then we need to configure the DAC how we want it, that is to say: enabled and using VDDA as the voltage reference. To do this, we need to set the DAC enable bit and the DAC reference select bit in the DAC Control Register. The datasheet is a bit cryptic about the function of the DAC reference select bit, saying that it selects between DACREF_1 and DACREF_2 as the voltage reference. Luckily, section 3.7.3.3 defines what both of those are: DACREF_1 is VREF_OUT and DACREF_2 is VDDA. Anyhow, to flip the bits we need to execute:

DAC0_C0 = DAC_C0_DACEN | DAC_C0_DACRFS; // enable the DAC module, 3.3V reference

The DAC is now ready to use! Any 12-bit value you write to the address of the DAC Data Low Register will appear as an analog voltage between 0V and 3.3V on the DAC pin of the Teensy. You may want to slowly ramp up the voltage of the DAC to avoid sharp edges on your signal:

  // slowly ramp up to DC voltage
  for (int16_t i=0; i<2048; i+=1) {
    *(int16_t *)&(DAC0_DAT0L) = i;
    delayMicroseconds(125);
  }
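
As a rough check on the numbers in this ramp, the sketch below works out how long the loop takes and which voltage a given code should produce. It is plain C meant to run on a PC, not on the Teensy. The function names are made up for illustration, and the transfer function Vout = Vref * (code + 1) / 4096 is an assumption based on how the Kinetis reference manual describes this DAC, so check it against your own copy.

```c
#include <stdint.h>

/* Hypothetical helper: duration of a ramp in microseconds,
 * ignoring the time spent on the loop itself. */
uint32_t ramp_duration_us(uint32_t steps, uint32_t delay_us) {
    return steps * delay_us;
}

/* Hypothetical helper: expected DAC output in millivolts for a 12-bit code,
 * assuming Vout = Vref * (code + 1) / 4096 (integer division rounds down). */
uint32_t dac_code_to_millivolts(uint16_t code, uint32_t vref_mv) {
    uint32_t code12 = (uint32_t)(code & 0x0FFFu);  /* only 12 bits are used */
    return (vref_mv * (code12 + 1u)) / 4096u;
}
```

With 2048 steps of 125 microseconds each, the ramp takes about a quarter of a second, and it ends at half of the reference voltage.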

Now we fill the buffer with some constant values, so that there are no jumps in the output when we init the DAC buffer and the DAC first sweeps through it:

// fill up the buffer with 2048
for (int16_t i=0; i<16; i+=1) {
  *(int16_t *)(&DAC0_DAT0L + 2*i) = 2048;
}
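
The `2*i` in the loop above is a byte offset: it only lands on consecutive 16-bit words because `&DAC0_DAT0L` is a pointer to a single byte (this assumes Teensyduino declares the register as an 8-bit type, which is worth checking in your headers). If it were a pointer to 16-bit values, the same expression would skip every other word. A minimal PC-side sketch of the layout, with a plain array standing in for the DAC data registers and hypothetical function names:

```c
#include <stdint.h>

#define DAC_WORDS 16u

/* Stand-in for the DAC data registers: 16 words of 2 bytes each. */
uint8_t fake_dac_regs[DAC_WORDS * 2u];

/* Write a 12-bit value into word `index`, low byte first. */
void fake_dac_write(unsigned index, uint16_t value) {
    if (index >= DAC_WORDS) return;              /* ignore out-of-range words */
    uint8_t *word = fake_dac_regs + 2u * index;  /* byte pointer: 2 bytes per word */
    word[0] = (uint8_t)(value & 0xFFu);          /* low register: bits 7..0 */
    word[1] = (uint8_t)((value >> 8) & 0x0Fu);   /* high register: bits 11..8 */
}

/* Read word `index` back as a 16-bit value. */
uint16_t fake_dac_read(unsigned index) {
    if (index >= DAC_WORDS) return 0;
    const uint8_t *word = fake_dac_regs + 2u * index;
    return (uint16_t)(word[0] | ((uint16_t)word[1] << 8));
}

/* Fill all 16 words with the same value, like the loop in the tutorial. */
void fake_dac_fill(uint16_t value) {
    for (unsigned i = 0; i < DAC_WORDS; i++) {
        fake_dac_write(i, value);
    }
}
```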

That's the setup of the DAC. We will do the DMA part of the DAC setup later, but for now, you should have:

void setup() {
  SIM_SCGC2 |= SIM_SCGC2_DAC0; // enable DAC clock
  DAC0_C0 = DAC_C0_DACEN | DAC_C0_DACRFS; // enable the DAC module, 3.3V reference
  // slowly ramp up to DC voltage
  for (int16_t i=0; i<2048; i+=1) {
    *(int16_t *)&(DAC0_DAT0L) = i;
    delayMicroseconds(125); // this function may be broken
  }
  // fill up the buffer with 2048
  for (int16_t i=0; i<16; i+=1) {
    *(int16_t *)(&DAC0_DAT0L + 2*i) = 2048;//256*(16-i) - 1;
  }
}

void loop() {
  // do nothing
}

Initialising the DMA

The DMA has quite an elaborate initialisation. It may be possible to rewrite the code to use the DMAChannel library, which could make it a lot cleaner. For now, I'm quite happy with this implementation. I mostly got the details from this manual of the manufacturer.

We first need to initialise the DMA peripheral and the DMA Multiplexer, which controls which peripheral has access to the RAM. As with the DAC, we first need to enable the clocks to both modules, by setting the appropriate bits in System Clock Gating Control Register 6 and 7:

SIM_SCGC6 |= SIM_SCGC6_DMAMUX; // enable DMA MUX clock
SIM_SCGC7 |= SIM_SCGC7_DMA;    // enable DMA clock

Now we need to pick a channel for DMA access. There are a multitude of channels, so that DMA can be configured for multiple peripherals. We're only using the DAC for now, so let's use channel 0. We need to enable it and set which DMA requests it will listen to, which will be those of the DAC. We can set these properties in the Channel Configuration register of this channel:

DMAMUX0_CHCFG0 |= DMAMUX_SOURCE_DAC0; //Select DAC as request source
DMAMUX0_CHCFG0 |= DMAMUX_ENABLE;      //Enable DMA channel 0
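
OR-ing into the register works here because it reads as zero after reset. If the channel had been configured for another source earlier, the old source bits would mix with the new ones. The sketch below shows the difference on a PC. The bit layout (enable in bit 7, source in bits 5..0) follows the datasheet as I read it, and the source number 45 for the DAC is an assumption taken from the Teensyduino headers; in real code, use `DMAMUX_SOURCE_DAC0` and `DMAMUX_ENABLE` rather than these illustrative constants.

```c
#include <stdint.h>

#define CHCFG_ENBL        0x80u  /* bit 7: channel enable */
#define CHCFG_SOURCE_MASK 0x3Fu  /* bits 5..0: request source */

/* Build the whole register value at once, independent of its old contents. */
uint8_t chcfg_value(uint8_t source) {
    return (uint8_t)(CHCFG_ENBL | (source & CHCFG_SOURCE_MASK));
}

/* Result of `reg |= source; reg |= enable;` when reg held `previous`. */
uint8_t chcfg_or_into(uint8_t previous, uint8_t source) {
    return (uint8_t)(previous | (source & CHCFG_SOURCE_MASK) | CHCFG_ENBL);
}
```

Starting from zero, both give the same value. Starting from a channel that was enabled with source 2, the OR-ed version ends up with source 47 instead of 45, so writing the register with a plain assignment is the safer habit when reconfiguring.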

Next, we need to make our DMA channel able to listen to requests. This is done by enabling it in the Enable Request Register:

DMA_ERQ = DMA_ERQ_ERQ0; // Enable requests on DMA channel 0

For the next part, we're going to need a buffer in the RAM which we'll use as the source of the values for the DAC. To initialise one, we have to add the following to the beginning of our code:

#define BUFFER_SIZE 480
static volatile uint16_t sinetable[BUFFER_SIZE];

Please note that BUFFER_SIZE can be any integer value larger than 16, but the higher it is the more samples you will have in your generated waveforms. It also helps to keep it as a proper divisor of your clock frequency, so that the frequency of your waveforms is a nice round number. For this example code to work, the buffer size needs to be an integer multiple of 8, for reasons that will become apparent later. I have named my buffer 'sinetable', but you can name it anything you want, it is just an array.

Let's also put something interesting in our buffer. I like to do this at the beginning of the setup() routine. I'm going to put a sine wave in it:

  // fill up the sine table
  for(int i=0; i<BUFFER_SIZE; i++) {
    sinetable[i] = 2048+(int)(2047*sin(((float)i)*20.0*6.28318530717958647692/((float)BUFFER_SIZE))); // amplitude 2047 keeps the peak at 4095, within 12 bits
  }

Your code should now look something like this:

#define BUFFER_SIZE 480
static volatile uint16_t sinetable[BUFFER_SIZE];
void setup() {
  // fill up the sine table
  for(int i=0; i<BUFFER_SIZE; i++) {
    sinetable[i] = 2048+(int)(2047*sin(((float)i)*20.0*6.28318530717958647692/((float)BUFFER_SIZE))); // amplitude 2047 keeps the peak at 4095, within 12 bits
  }
  // initialise the DAC
  SIM_SCGC2 |= SIM_SCGC2_DAC0; // enable DAC clock
  DAC0_C0 = DAC_C0_DACEN | DAC_C0_DACRFS; // enable the DAC module, 3.3V reference
  // slowly ramp up to DC voltage
  for (int16_t i=0; i<2048; i+=1) {
    *(int16_t *)&(DAC0_DAT0L) = i;
    delayMicroseconds(125); // this function may be broken
  }
  // fill up the buffer with 2048
  for (int16_t i=0; i<16; i+=1) {
    *(int16_t *)(&DAC0_DAT0L + 2*i) = 2048;//256*(16-i) - 1;
  }
  // initialise the DMA
  // first, we need to init the dma and dma mux
  // to do this, we enable the clock to DMA and DMA MUX using the system timing registers
  SIM_SCGC6 |= SIM_SCGC6_DMAMUX; // enable DMA MUX clock
  SIM_SCGC7 |= SIM_SCGC7_DMA;    // enable DMA clock
  // next up, the channel in the DMA MUX needs to be configured
  DMAMUX0_CHCFG0 |= DMAMUX_SOURCE_DAC0; //Select DAC as request source
  DMAMUX0_CHCFG0 |= DMAMUX_ENABLE;      //Enable DMA channel 0
  // then, we enable requests on our channel
  DMA_ERQ = DMA_ERQ_ERQ0; // Enable requests on DMA channel 0
}
void loop() {
  // do nothing
}

Setting the Transfer Control Descriptor (TCD)

Back to where we were, we are now going to set the details of our DMA transfers. First, we choose where we want our data to come from, and where we want it to go, using the TCD Source and Destination Address registers. Obviously, we want our data to come from our buffer, and go to the DAC buffer:

DMA_TCD0_SADDR = sinetable;   // set the address of the first byte in our LUT as the source address
DMA_TCD0_DADDR = &DAC0_DAT0L; // set the first data register as the destination address

You can transfer a maximum of 16 bytes in one cycle with the DMA, according to the datasheet. This would be great, since that is exactly 8 words, or half the buffer. We can only fill half the buffer at a time, since we can't write to and read from the buffer at the same time: the DAC is always reading from one address in the buffer, so we can't fill the full buffer in one go and have to fill it in two writes. If we could do a write in one cycle, by having a transfer size of 16 bytes, we would have achieved the smallest possible tax on the DMA resources. Unfortunately, I haven't been able to get this transfer size to work, so I will configure everything for the next-largest size, which is 32 bits, or 4 bytes.

First, we are going to set the read and write offsets. These are the numbers of bytes by which the read and write pointers advance after each read and write, respectively. Since we don't want to read or write data twice, we set both to our transfer size of 4 bytes by configuring the TCD Signed Source and Destination Address Offset:

  DMA_TCD0_SOFF = 4; // advance 32 bits, or 4 bytes per read
  DMA_TCD0_DOFF = 4; // advance 32 bits, or 4 bytes per write

Now we get to set the actual transfer size. We can also set up a modulo, which controls how many of the least significant bits of the address may change, so that the address wraps around within a power-of-two block instead of running on. We are going to set this up for the DAC buffer, since it starts at an address that ends in zeroes and its size is a power of two. We do both things by setting up the TCD Transfer Attributes register:

  DMA_TCD0_ATTR  = DMA_TCD_ATTR_SSIZE(DMA_TCD_ATTR_SIZE_32BIT);
  DMA_TCD0_ATTR |= DMA_TCD_ATTR_DSIZE(DMA_TCD_ATTR_SIZE_32BIT) | DMA_TCD_ATTR_DMOD(31 - __builtin_clz(32)); // set the data transfer size to 32 bit for both the source and the destination
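
To make the modulo less abstract: `31 - __builtin_clz(32)` is just log2(32) = 5, so only the lowest 5 address bits are allowed to change, which is exactly the 32 bytes of the DAC buffer. The PC-side sketch below imitates that wrap-around. The function names are hypothetical, and the DAC base address 0x400CC000 used in the example is an assumption taken from the Teensyduino headers.

```c
#include <stdint.h>

/* log2 of a power of two, the same result as 31 - __builtin_clz(n). */
unsigned log2_pow2(uint32_t n) {
    unsigned bits = 0;
    while (n > 1u) {
        n >>= 1;
        bits++;
    }
    return bits;
}

/* Advance an address by `offset` bytes while letting only the lowest
 * `mod_bits` bits change (mod_bits must be below 32). */
uint32_t advance_with_modulo(uint32_t addr, uint32_t offset, unsigned mod_bits) {
    uint32_t mask = (1u << mod_bits) - 1u;
    return (addr & ~mask) | ((addr + offset) & mask);
}
```

Stepping 4 bytes from 0x400CC01C, the last 32-bit slot of the buffer, gives 0x400CC000 again rather than 0x400CC020.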

The datasheet has some terminology that I found a bit confusing on my first readthrough. Every request is considered a 'minor loop'. It is a loop, since we can set multiple transfers to be executed, which the hardware will presumably loop through. 'Major loops' can also be configured. You can set the number of minor loops that fit in a major loop. Basically, every time this number of minor loops, or requests, has completed, the major loop starts over. The DMA can be configured so that at the end of the major loop, the source and destination addresses are changed. I think it helps to think of the minor loops as sort of a request counter, and the major loop as an opportunity to change the source and destination address.

We will use this opportunity to reset the source address back to the start of the buffer, since we can't use the modulo function for our buffer, as it is not aligned to an address ending in zeroes and its size isn't a power of two. This does mean that an integer number of minor loops has to fit in our buffer, which is why its size had to be an integer multiple of 8.

Anyhow, we want to fill half our buffer every minor loop, which is 8 words, or 16 bytes. We can set the number of bytes we want to transfer per minor loop in the TCD Minor Byte Count (Minor Loop Disabled) register[1]:

DMA_TCD0_NBYTES_MLNO = 16; // we want to fill half of the DAC buffer, which is 16 words in total, so we need 8 words - or 16 bytes - per transfer

Now we need to set the number of minor loops in a major loop, or the number of requests before we reset to the original source address. To do this, we must set the current major iteration counter in the TCD Current Minor Loop Link, Major Loop Count (Channel Linking Disabled) register, and the value to which it will be reset when the major loop completes in the TCD Beginning Minor Loop Link, Major Loop Count (Channel Linking Disabled) register:

// set the number of minor loops (requests) in a major loop
DMA_TCD0_CITER_ELINKNO = DMA_TCD_CITER_ELINKYES_CITER(BUFFER_SIZE*2/16);
DMA_TCD0_BITER_ELINKNO = DMA_TCD_BITER_ELINKYES_BITER(BUFFER_SIZE*2/16);
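
The expression `BUFFER_SIZE*2/16` is the size of the table in bytes divided by the 16 bytes moved per request. A small PC-side sketch of that arithmetic, with hypothetical helper names, also shows why the table length has to be a multiple of 8:

```c
#include <stdint.h>

#define MINOR_LOOP_BYTES 16u  /* bytes moved per request: half the DAC buffer */
#define BYTES_PER_SAMPLE 2u   /* each table entry is a 16-bit value */

/* Number of minor loops (DMA requests) needed to walk the table once. */
uint32_t major_loop_count(uint32_t table_entries) {
    return (table_entries * BYTES_PER_SAMPLE) / MINOR_LOOP_BYTES;
}

/* The table divides cleanly into minor loops only if its length is a
 * multiple of 8 entries; the tutorial also asks for more than 16 entries. */
int table_size_ok(uint32_t table_entries) {
    return table_entries > 16u &&
           (table_entries * BYTES_PER_SAMPLE) % MINOR_LOOP_BYTES == 0u;
}
```

For the 480-entry table used here this gives 60 requests per pass. A 484-entry table would leave 8 bytes over, so the last request of each pass would read past the end of the table.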

Then we need to set the amount, in bytes, by which the source and destination addresses change when the major loop completes. We only need to adjust the source address, since the destination address is handled by the modulo we set up. We need to set back the source address by double the buffer size, since our buffer is made up of 16-bit integers, which are 2 bytes each. We can do this by setting adjustments in the TCD Last Source Address Adjustment register and the TCD Last Destination Address Adjustment/Scatter Gather Address register:

  DMA_TCD0_SLAST    = -BUFFER_SIZE*2;
  DMA_TCD0_DLASTSGA = 0;
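
To see how the offsets, the modulo, the loop counts and the last-address adjustment work together, here is a small simulation that runs on a PC. It is only a model of the behaviour described above, not of the real hardware: the struct, the function names and the 24-entry table are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

#define TABLE_ENTRIES 24u  /* small illustrative table, a multiple of 8 */
#define DAC_BUF_BYTES 32u  /* 16 words of 2 bytes */

typedef struct {
    uint32_t src_off;       /* source offset in bytes from the table start */
    uint32_t dst_off;       /* destination offset in bytes from the DAC buffer start */
    uint32_t citer, biter;  /* current and beginning major loop counts */
    int32_t  slast;         /* source adjustment when the major loop completes */
} FakeTcd;

uint16_t sim_table[TABLE_ENTRIES];
uint8_t  sim_dac_buf[DAC_BUF_BYTES];

void sim_init(FakeTcd *t) {
    for (uint32_t i = 0; i < TABLE_ENTRIES; i++) {
        sim_table[i] = (uint16_t)i;  /* entry i holds the value i */
    }
    memset(sim_dac_buf, 0, sizeof sim_dac_buf);
    t->src_off = 0;
    t->dst_off = 0;
    t->biter = t->citer = TABLE_ENTRIES * 2u / 16u;
    t->slast = -(int32_t)(TABLE_ENTRIES * 2u);
}

/* One DMA request is one minor loop: 16 bytes moved as four 32-bit transfers. */
void sim_request(FakeTcd *t) {
    const uint8_t *src = (const uint8_t *)sim_table;
    for (int i = 0; i < 4; i++) {
        memcpy(sim_dac_buf + t->dst_off, src + t->src_off, 4);
        t->src_off += 4;                                /* source offset of 4 */
        t->dst_off = (t->dst_off + 4u) % DAC_BUF_BYTES; /* offset of 4 with a 32-byte modulo */
    }
    if (--t->citer == 0) {  /* major loop complete */
        t->src_off = (uint32_t)((int32_t)t->src_off + t->slast);  /* rewind the source */
        t->citer = t->biter;
    }
}

/* Read word `index` (0..15) of the simulated DAC buffer. */
uint16_t sim_dac_word(unsigned index) {
    uint16_t w;
    memcpy(&w, sim_dac_buf + 2u * index, sizeof w);
    return w;
}
```

The first request fills the lower half of the simulated DAC buffer with table entries 0 to 7, the second fills the upper half with entries 8 to 15, and the third wraps back to the lower half with entries 16 to 23. At that point the major loop is done, the source offset is rewound to zero, and the fourth request starts again at entry 0.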

The only thing left for us to do is to initialise the TCD Control and Status register:

DMA_TCD0_CSR = 0;

Your code should now look like this:

#define BUFFER_SIZE 480
static volatile uint16_t sinetable[BUFFER_SIZE];
void setup() {
  // fill up the sine table
  for(int i=0; i<BUFFER_SIZE; i++) {
    sinetable[i] = 2048+(int)(2047*sin(((float)i)*20.0*6.28318530717958647692/((float)BUFFER_SIZE))); // amplitude 2047 keeps the peak at 4095, within 12 bits
  }
  // initialise the DAC
  SIM_SCGC2 |= SIM_SCGC2_DAC0; // enable DAC clock
  DAC0_C0 = DAC_C0_DACEN | DAC_C0_DACRFS; // enable the DAC module, 3.3V reference
  // slowly ramp up to DC voltage
  for (int16_t i=0; i<2048; i+=1) {
    *(int16_t *)&(DAC0_DAT0L) = i;
    delayMicroseconds(125); // this function may be broken
  }
  // fill up the buffer with 2048
  for (int16_t i=0; i<16; i+=1) {
    *(int16_t *)(&DAC0_DAT0L + 2*i) = 2048;//256*(16-i) - 1;
  }
  // initialise the DMA
  // first, we need to init the dma and dma mux
  // to do this, we enable the clock to DMA and DMA MUX using the system timing registers
  SIM_SCGC6 |= SIM_SCGC6_DMAMUX; // enable DMA MUX clock
  SIM_SCGC7 |= SIM_SCGC7_DMA;    // enable DMA clock
  // next up, the channel in the DMA MUX needs to be configured
  DMAMUX0_CHCFG0 |= DMAMUX_SOURCE_DAC0; //Select DAC as request source
  DMAMUX0_CHCFG0 |= DMAMUX_ENABLE;      //Enable DMA channel 0
  // then, we enable requests on our channel
  DMA_ERQ = DMA_ERQ_ERQ0; // Enable requests on DMA channel 0
  // Here we choose where our data is coming from, and where it is going
  DMA_TCD0_SADDR = sinetable;   // set the address of the first byte in our LUT as the source address
  DMA_TCD0_DADDR = &DAC0_DAT0L; // set the first data register as the destination address
  // now we need to set the read and write offsets - kind of boring
  DMA_TCD0_SOFF = 4; // advance 32 bits, or 4 bytes per read
  DMA_TCD0_DOFF = 4; // advance 32 bits, or 4 bytes per write
  // this is the fun part! Now we get to set the data transfer size...
  DMA_TCD0_ATTR  = DMA_TCD_ATTR_SSIZE(DMA_TCD_ATTR_SIZE_32BIT);
  DMA_TCD0_ATTR |= DMA_TCD_ATTR_DSIZE(DMA_TCD_ATTR_SIZE_32BIT) | DMA_TCD_ATTR_DMOD(31 - __builtin_clz(32)); // set the data transfer size to 32 bit for both the source and the destination
  // ...and the number of bytes to be transferred per request (or 'minor loop')...
  DMA_TCD0_NBYTES_MLNO = 16; // we want to fill half of the DAC buffer, which is 16 words in total, so we need 8 words - or 16 bytes - per transfer
  // set the number of minor loops (requests) in a major loop
  // the circularity of the buffer is handled by the modulus functionality in the TCD attributes
  DMA_TCD0_CITER_ELINKNO = DMA_TCD_CITER_ELINKYES_CITER(BUFFER_SIZE*2/16);
  DMA_TCD0_BITER_ELINKNO = DMA_TCD_BITER_ELINKYES_BITER(BUFFER_SIZE*2/16);
  // the address is adjusted by these values when a major loop completes
  // we don't need this for the destination, because the circularity of the buffer is already handled
  DMA_TCD0_SLAST    = -BUFFER_SIZE*2;
  DMA_TCD0_DLASTSGA = 0;
  // do the final init of the channel
  DMA_TCD0_CSR = 0;
}
void loop() {
  // do nothing
}

Setting up the DAC for buffered DMA

Now it's time to set up the DMA side of the DAC. There are three interrupts that the DAC can generate, which are used in the DAC buffer example: the buffer read pointer top flag, the buffer read pointer bottom flag and the buffer watermark flag.

Instead of interrupts, the DAC can be set up to generate DMA requests. We will be doing exactly this. First, we must enable two interrupts, the watermark and another one. I found that sometimes, I could fix glitches in the generated signal by choosing another interrupt. For this example, the read pointer bottom flag interrupt works great. We can enable the interrupts using the DAC Control Register:

DAC0_C0 |= DAC_C0_DACBBIEN | DAC_C0_DACBWIEN; // enable read pointer bottom and watermark interrupt

Then, we can generate DMA requests instead of interrupts, enable the DAC buffer and set the correct watermark offset by writing to the DAC Control Register 1:

DAC0_C1 |= DAC_C1_DMAEN | DAC_C1_DACBFEN | DAC_C1_DACBFWM(3); // enable dma and buffer

Lastly, we can set the initial position of the read pointer. Again, I found that I could sometimes fix glitches in the signal by setting the read pointer to a position beyond a certain interrupt, which probably worked because in this alternate interrupt order the DMA had more time to catch up with the DAC requests. We can change the pointer position by changing the DAC Control Register 2:

DAC0_C2 |= DAC_C2_DACBFRP(0);

By this point, your code should look like this:

#define BUFFER_SIZE 480
static volatile uint16_t sinetable[BUFFER_SIZE];
void setup() {
  // fill up the sine table
  for(int i=0; i<BUFFER_SIZE; i++) {
    sinetable[i] = 2048+(int)(2047*sin(((float)i)*20.0*6.28318530717958647692/((float)BUFFER_SIZE))); // amplitude 2047 keeps the peak at 4095, within 12 bits
  }
  // initialise the DAC
  SIM_SCGC2 |= SIM_SCGC2_DAC0; // enable DAC clock
  DAC0_C0 = DAC_C0_DACEN | DAC_C0_DACRFS; // enable the DAC module, 3.3V reference
  // slowly ramp up to DC voltage
  for (int16_t i=0; i<2048; i+=1) {
    *(int16_t *)&(DAC0_DAT0L) = i;
    delayMicroseconds(125); // this function may be broken
  }
  // fill up the buffer with 2048
  for (int16_t i=0; i<16; i+=1) {
    *(int16_t *)(&DAC0_DAT0L + 2*i) = 2048;//256*(16-i) - 1;
  }
  // initialise the DMA
  // first, we need to init the dma and dma mux
  // to do this, we enable the clock to DMA and DMA MUX using the system timing registers
  SIM_SCGC6 |= SIM_SCGC6_DMAMUX; // enable DMA MUX clock
  SIM_SCGC7 |= SIM_SCGC7_DMA;    // enable DMA clock
  // next up, the channel in the DMA MUX needs to be configured
  DMAMUX0_CHCFG0 |= DMAMUX_SOURCE_DAC0; //Select DAC as request source
  DMAMUX0_CHCFG0 |= DMAMUX_ENABLE;      //Enable DMA channel 0
  // then, we enable requests on our channel
  DMA_ERQ = DMA_ERQ_ERQ0; // Enable requests on DMA channel 0
  // Here we choose where our data is coming from, and where it is going
  DMA_TCD0_SADDR = sinetable;   // set the address of the first byte in our LUT as the source address
  DMA_TCD0_DADDR = &DAC0_DAT0L; // set the first data register as the destination address
  // now we need to set the read and write offsets - kind of boring
  DMA_TCD0_SOFF = 4; // advance 32 bits, or 4 bytes per read
  DMA_TCD0_DOFF = 4; // advance 32 bits, or 4 bytes per write
  // this is the fun part! Now we get to set the data transfer size...
  DMA_TCD0_ATTR  = DMA_TCD_ATTR_SSIZE(DMA_TCD_ATTR_SIZE_32BIT);
  DMA_TCD0_ATTR |= DMA_TCD_ATTR_DSIZE(DMA_TCD_ATTR_SIZE_32BIT) | DMA_TCD_ATTR_DMOD(31 - __builtin_clz(32)); // set the data transfer size to 32 bit for both the source and the destination
  // ...and the number of bytes to be transferred per request (or 'minor loop')...
  DMA_TCD0_NBYTES_MLNO = 16; // we want to fill half of the DAC buffer, which is 16 words in total, so we need 8 words - or 16 bytes - per transfer
  // set the number of minor loops (requests) in a major loop
  // the circularity of the buffer is handled by the modulus functionality in the TCD attributes
  DMA_TCD0_CITER_ELINKNO = DMA_TCD_CITER_ELINKYES_CITER(BUFFER_SIZE*2/16);
  DMA_TCD0_BITER_ELINKNO = DMA_TCD_BITER_ELINKYES_BITER(BUFFER_SIZE*2/16);
  // the address is adjusted by these values when a major loop completes
  // we don't need this for the destination, because the circularity of the buffer is already handled
  DMA_TCD0_SLAST    = -BUFFER_SIZE*2;
  DMA_TCD0_DLASTSGA = 0;
  // do the final init of the channel
  DMA_TCD0_CSR = 0;
  // enable DAC DMA
  DAC0_C0 |= DAC_C0_DACBBIEN | DAC_C0_DACBWIEN; // enable read pointer bottom and watermark interrupt
  DAC0_C1 |= DAC_C1_DMAEN | DAC_C1_DACBFEN | DAC_C1_DACBFWM(3); // enable dma and buffer
  DAC0_C2 |= DAC_C2_DACBFRP(0);
}
void loop() {
  // do nothing
}

Setting up the DAC interval

While the DAC itself has a clock, that doesn't advance the DAC buffer pointer. To do that, we need to set up a Programmable Delay Block to generate the DAC intervals. As before, we first need to enable the clock to the PDB using the System Clock Gating Control Register 6:

SIM_SCGC6 |= SIM_SCGC6_PDB; // turn on the PDB clock  

Now, we need to enable the PDB, select the software trigger as trigger source and select the continuous run mode. This can all be set up in the PDB Status and Control register:

PDB0_SC |= PDB_SC_PDBEN; // enable the PDB  
PDB0_SC |= PDB_SC_TRGSEL(15); // trigger the PDB on software start (SWTRIG)  
PDB0_SC |= PDB_SC_CONT; // run in continuous mode  

Now we need to set the number of cycles in a PDB period, the modulus time. This is set in the PDB Modulus register:

PDB0_MOD = 20-1; // modulus time for the PDB

You shouldn't go lower than this value. I have set it at its lowest, to achieve the maximum sample rate possible. If you go any lower, the DMA can't keep up with the DAC, and you get a glitchy signal. A way to attain an even higher sample rate would be to make the transfer take less time, which you can accomplish by having a transfer size of 16 bytes. This way, the sample rate could probably be increased by a factor of 4. If you want lower frequencies, you should reduce the number of sine periods in the buffer, or enlarge the buffer.
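
To get a feel for the resulting numbers, the sample rate is the PDB clock divided by the modulus plus one, and the waveform frequency is that rate times the number of periods stored in the table, divided by the table length. The PC-side sketch below does this arithmetic with hypothetical helper names. The 48 MHz PDB clock in the example is an assumption (the bus clock of a Teensy 3.2 running at 96 MHz, with the PDB prescaler left at 1), so substitute your own value.

```c
#include <stdint.h>

/* Sample rate in Hz for a PDB that counts from 0 up to `mod`. */
uint32_t sample_rate_hz(uint32_t pdb_clock_hz, uint16_t mod) {
    return pdb_clock_hz / ((uint32_t)mod + 1u);
}

/* Waveform frequency in Hz (rounded down) when `periods` full periods
 * are stored in a table of `table_entries` samples. */
uint32_t waveform_freq_hz(uint32_t pdb_clock_hz, uint16_t mod,
                          uint32_t table_entries, uint32_t periods) {
    uint64_t numerator = (uint64_t)pdb_clock_hz * periods;
    uint64_t denominator = ((uint64_t)mod + 1u) * table_entries;
    return (uint32_t)(numerator / denominator);
}
```

Under that assumption, a modulus of 20-1 with 20 sine periods in a 480-entry table gives 2.4 million samples per second and a 100 kHz sine wave. Raising the modulus to 0xFFFF brings the same table down to roughly 30 Hz.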

Next, we set the amount of cycles in the actual DAC interval. By having this be equal to the modulus time, the period will be the same as the PDB period. It can be set in the DAC Interval Register:

PDB0_DACINT0 = (uint16_t)(20-1); // we won't subdivide the clock

Then, we need to enable the DAC interval trigger, by setting the DAC Interval Trigger Control Register:

PDB0_DACINTC0 |= 0x01; // enable the DAC interval trigger

To update all the PDB registers, the Load OK bit needs to be set to 1 in the PDB Status and Control register:

PDB0_SC |= PDB_SC_LDOK; // update pdb registers

Finally, we can start this complicated beast by one simple software trigger, written to the PDB Status and Control register:

PDB0_SC |= PDB_SC_SWTRIG;

At this point, your completed code should resemble this:

#define BUFFER_SIZE 480
static volatile uint16_t sinetable[BUFFER_SIZE];
void setup() {
  // fill up the sine table
  for(int i=0; i<BUFFER_SIZE; i++) {
    sinetable[i] = 2048+(int)(2047*sin(((float)i)*20.0*6.28318530717958647692/((float)BUFFER_SIZE))); // amplitude 2047 keeps the peak at 4095, within 12 bits
  }
  // initialise the DAC
  SIM_SCGC2 |= SIM_SCGC2_DAC0; // enable DAC clock
  DAC0_C0 = DAC_C0_DACEN | DAC_C0_DACRFS; // enable the DAC module, 3.3V reference
  // slowly ramp up to DC voltage
  for (int16_t i=0; i<2048; i+=1) {
    *(int16_t *)&(DAC0_DAT0L) = i;
    delayMicroseconds(125); // this function may be broken
  }
  // fill up the buffer with 2048
  for (int16_t i=0; i<16; i+=1) {
    *(int16_t *)(&DAC0_DAT0L + 2*i) = 2048;//256*(16-i) - 1;
  }
  // initialise the DMA
  // first, we need to init the dma and dma mux
  // to do this, we enable the clock to DMA and DMA MUX using the system timing registers
  SIM_SCGC6 |= SIM_SCGC6_DMAMUX; // enable DMA MUX clock
  SIM_SCGC7 |= SIM_SCGC7_DMA;    // enable DMA clock
  // next up, the channel in the DMA MUX needs to be configured
  DMAMUX0_CHCFG0 |= DMAMUX_SOURCE_DAC0; //Select DAC as request source
  DMAMUX0_CHCFG0 |= DMAMUX_ENABLE;      //Enable DMA channel 0
  // then, we enable requests on our channel
  DMA_ERQ = DMA_ERQ_ERQ0; // Enable requests on DMA channel 0
  // Here we choose where our data is coming from, and where it is going
  DMA_TCD0_SADDR = sinetable;   // set the address of the first byte in our LUT as the source address
  DMA_TCD0_DADDR = &DAC0_DAT0L; // set the first data register as the destination address
  // now we need to set the read and write offsets - kind of boring
  DMA_TCD0_SOFF = 4; // advance 32 bits, or 4 bytes per read
  DMA_TCD0_DOFF = 4; // advance 32 bits, or 4 bytes per write
  // this is the fun part! Now we get to set the data transfer size...
  DMA_TCD0_ATTR  = DMA_TCD_ATTR_SSIZE(DMA_TCD_ATTR_SIZE_32BIT);
  DMA_TCD0_ATTR |= DMA_TCD_ATTR_DSIZE(DMA_TCD_ATTR_SIZE_32BIT) | DMA_TCD_ATTR_DMOD(31 - __builtin_clz(32)); // set the data transfer size to 32 bit for both the source and the destination
  // ...and the number of bytes to be transferred per request (or 'minor loop')...
  DMA_TCD0_NBYTES_MLNO = 16; // we want to fill half of the DAC buffer, which is 16 words in total, so we need 8 words - or 16 bytes - per transfer
  // set the number of minor loops (requests) in a major loop
  // the circularity of the buffer is handled by the modulus functionality in the TCD attributes
  DMA_TCD0_CITER_ELINKNO = DMA_TCD_CITER_ELINKYES_CITER(BUFFER_SIZE*2/16);
  DMA_TCD0_BITER_ELINKNO = DMA_TCD_BITER_ELINKYES_BITER(BUFFER_SIZE*2/16);
  // the address is adjusted by these values when a major loop completes
  // we don't need this for the destination, because the circularity of the buffer is already handled
  DMA_TCD0_SLAST    = -BUFFER_SIZE*2;
  DMA_TCD0_DLASTSGA = 0;
  // do the final init of the channel
  DMA_TCD0_CSR = 0;
  // enable DAC DMA
  DAC0_C0 |= DAC_C0_DACBBIEN | DAC_C0_DACBWIEN; // enable read pointer bottom and watermark interrupt
  DAC0_C1 |= DAC_C1_DMAEN | DAC_C1_DACBFEN | DAC_C1_DACBFWM(3); // enable dma and buffer
  DAC0_C2 |= DAC_C2_DACBFRP(0);
  // init the PDB for DAC interval generation
  SIM_SCGC6 |= SIM_SCGC6_PDB; // turn on the PDB clock  
  PDB0_SC |= PDB_SC_PDBEN; // enable the PDB  
  PDB0_SC |= PDB_SC_TRGSEL(15); // trigger the PDB on software start (SWTRIG)  
  PDB0_SC |= PDB_SC_CONT; // run in continuous mode  
  PDB0_MOD = 20-1; // modulus time for the PDB  
  PDB0_DACINT0 = (uint16_t)(20-1); // we won't subdivide the clock... 
  PDB0_DACINTC0 |= 0x01; // enable the DAC interval trigger  
  PDB0_SC |= PDB_SC_LDOK; // update pdb registers  
  PDB0_SC |= PDB_SC_SWTRIG; // ...and start the PDB
}
void loop() {
  // do nothing
}

Conclusion


So, now you know how to get the DAC up and running with both DMA and the DAC buffer. Hopefully you also learnt something about programming or about the architecture of the Kinetis K20 (the family of ICs used in the Teensy).

If you have any suggestions for improvements, or errors to point out, I would be happy to hear them.

Endnotes

[1] I guess I am not even using the minor loops. However, there does not seem to be much of a difference in operation, except that you can also use the minor loops to do some offsetting. My problems with getting the 16-byte transfer to work may originate in my misunderstanding of this mechanism. Of course, any suggestions are welcome, as always.
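
A commenter in the discussion below suggests that the 16-byte transfers fail because the source table is not aligned to a 16-byte boundary, and that a GCC alignment attribute fixes this. That has not been verified here, but checking the alignment itself is easy. A minimal sketch that runs on a PC with GCC or Clang, using hypothetical names:

```c
#include <stdint.h>

#define BUFFER_SIZE 480

/* Same kind of table as in the tutorial, with the alignment attribute
 * from the comment added. */
volatile uint16_t __attribute__((aligned(16))) aligned_table[BUFFER_SIZE];

/* Returns 1 if `ptr` sits on a multiple of `boundary` bytes.
 * `boundary` must be a power of two. */
int is_aligned(const volatile void *ptr, uintptr_t boundary) {
    return ((uintptr_t)ptr & (boundary - 1u)) == 0u;
}

/* Convenience check for the table above. */
int aligned_table_ok(void) {
    return is_aligned(aligned_table, 16u);
}
```

On the Teensy, the same check could be run once in setup() and reported over the serial port before enabling the DMA channel.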

Discussions

Will Robinson wrote 02/27/2018 at 21:20

Hi Arthur. I just tried your post on a Teensy 3.6.1 and noticed some output distortion from the DAC. I slowed down the output rate by changing the values of PDB0_MOD and PDB0_DACINT0 to 0xffff, which set the sine wave frequency to about 36 Hz. It looks like the first word out of the DAC is the last word written into the 16-word DAC buffer. That is, the values written into the DAC buffer are (val1, val2, val3, ..., val15, val16). The output of the DAC is (val16, val1, val2, val3, ..., val15).

Has anyone else identified this as a problem? Is there a fix? I would insert a picture from my scope if I could. 

dale.roberts wrote 02/28/2017 at 15:19

Hi Arthur, I came across your helpful post as I was trying to figure out DMA on an Arduino Teensy 3.6. I don't know if you've worked through this already, but I found that I was getting a Source Address Error when I tried to use the 16-byte "burst" transfer type. I assume this was also happening when you had tried it.

I finally figured out that the "fix" is to force the alignment of your sine table to a 16-byte boundary using a GCC "attribute", like this:

static volatile uint16_t   __attribute__((aligned(16)))  sinetable[BUFFER_SIZE];

That should make the 16-byte transfers work (assuming your destination is also 16-byte aligned, which it should be if it is the DAC buffer registers).

dale