
Copying Rows

A project log for Compute In Memory in Ancient DRAM

Massively parallel operations in a 64kx1 DRAM from the 1980s

Tim, 05/17/2025 at 23:08

Now that we can set/erase entire rows, let's move on to another exploit: copying rows. This approach was outlined in the 2013 paper "RowClone: Fast and Energy-Efficient In-DRAM Bulk Data Copy and Initialization".

We are using the access pattern that I called "RAS glitch high" for this, as shown in the scope image below.

Copying rows works as follows:

  1. The source row is opened by pulling RAS low.
  2. The row's content is now loaded onto the bitlines, and the read amplifiers restore the charge from the array capacitors to full logic levels.
  3. We set the address of the target row on the address lines.
  4. RAS is pulsed high for 40 ns and then pulled low again. The falling edge activates the target row and connects it to the bitlines. Since the short RAS high time was not long enough to precharge the bitlines, they still hold the information from the source row. The small charge in the target row's cell capacitors cannot override this, so the information from the source row is retained.
  5. The read amplifiers amplify the voltage levels on the bitlines, which are dominated by the source row, and write the information to the target row.
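
To illustrate why the target cells lose out in step 4, here is a toy charge-sharing model of the steps above. It is only a sketch: the 10:1 bitline-to-cell capacitance ratio, the 5 V supply, and all function names are assumptions made for illustration, not values measured on this chip.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical values, chosen only for illustration. */
static const double VDD = 5.0;        /* supply voltage in volts */
static const double C_BITLINE = 10.0; /* bitline capacitance, relative units */
static const double C_CELL = 1.0;     /* cell capacitance, relative units */

/* Voltage on a bitline after a cell capacitor is connected to it. */
static double share_charge(double v_bitline, double v_cell) {
    return (C_BITLINE * v_bitline + C_CELL * v_cell) / (C_BITLINE + C_CELL);
}

/* Read amplifier: resolve an intermediate voltage to a full logic level. */
static int sense(double v) {
    return v > VDD / 2.0;
}

/* Toy model of the row copy: src and dst hold one bit (0 or 1) per entry. */
static void model_copy_row(const uint8_t *src, uint8_t *dst, int nbits) {
    assert(nbits > 0);
    for (int i = 0; i < nbits; i++) {
        /* Steps 1-2: the bitline sits at the full level of the source bit. */
        double v_bitline = src[i] ? VDD : 0.0;
        /* Step 4: the target cell is connected without a precharge. */
        double v_cell = dst[i] ? VDD : 0.0;
        double v_shared = share_charge(v_bitline, v_cell);
        /* Step 5: the amplifier restores the level and rewrites the cell. */
        dst[i] = (uint8_t)sense(v_shared);
    }
}
```

With these assumed values, a bitline at 5 V that meets an empty cell only drops to about 4.5 V, and a bitline at 0 V that meets a full cell only rises to about 0.45 V. Both stay on the source row's side of the threshold, so the target row always ends up with the source data.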

The code for this is shown below:

// Copy a row to another row
void dram_copyrow(uint8_t row1, uint8_t row2) {
    // Ensure read mode
    GPIOD->BSHR = DRAM_WR_PIN;  // W/R high (read mode)
    
    // Set row address
    DRAM_ADDR_PORT->OUTDR = row1;
    DELAY_RP_CYCLES();         // RAS precharge time
    GPIOD->BCR = DRAM_RAS_PIN;  // RAS low (active)  
    DELAY_RCD_CYCLES();         // Wait so the read amplifiers can latch row1
    DRAM_ADDR_PORT->OUTDR = row2;
    DELAY_2_CYCLES();           // Address setup time before RAS falls again

    // Open row2 while the bitlines still hold the row1 content
    GPIOD->BSHR = DRAM_RAS_PIN; // RAS high (inactive)
    // Deliberately violate the RAS precharge time (RAS high for only ~40ns)
    GPIOD->BCR = DRAM_RAS_PIN;  // RAS low (active)

    DELAY_RAS_CYCLES();         // RAS pulse width: row1 data is written to row2
        
    // End cycle
    GPIOD->BSHR = DRAM_RAS_PIN; // RAS high (inactive)
    DELAY_RP_CYCLES();          // RAS precharge time
}
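
As a usage sketch, several rows can be moved by calling the row copy in a loop. The helper below is hypothetical and not part of the project code: it takes the row copy as a function pointer with the same signature as dram_copyrow(), assumes 256 rows (row addresses fit in a uint8_t), and picks the copy direction like memmove so overlapping ranges survive. sim_copyrow() is a software stand-in so the sketch can be tried without hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Same signature as dram_copyrow() in the listing above. */
typedef void (*copyrow_fn)(uint8_t src_row, uint8_t dst_row);

/* Hypothetical helper: copy `count` consecutive rows from src to dst. */
static void copy_row_block(copyrow_fn copyrow, uint8_t src, uint8_t dst,
                           uint16_t count) {
    assert(copyrow != 0);
    assert(src + count <= 256 && dst + count <= 256);
    if (dst < src) {
        /* Destination below source: copy upwards. */
        for (uint16_t i = 0; i < count; i++)
            copyrow((uint8_t)(src + i), (uint8_t)(dst + i));
    } else if (dst > src) {
        /* Destination above source: copy downwards so overlapping
           source rows are read before they are overwritten. */
        for (uint16_t i = count; i-- > 0;)
            copyrow((uint8_t)(src + i), (uint8_t)(dst + i));
    }
    /* dst == src: nothing to do. */
}

/* Software stand-in for the DRAM: one tag value per row. */
static uint8_t sim_rows[256];

static void sim_copyrow(uint8_t src_row, uint8_t dst_row) {
    sim_rows[dst_row] = sim_rows[src_row];
}
```

On the real hardware, the call would be copy_row_block(dram_copyrow, src, dst, count). Each row still needs its own RAS sequence, so the total time grows linearly with the number of rows.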

Test results are shown below. It is possible to copy rows between inverting and non-inverting areas of the DRAM array - the read amplifiers will ensure the correct polarity of source and target rows.

We are now able to copy a complete row of 256 bits, or 256 bytes with 8 chips in parallel, in less than 1µs (~600ns actually)! A 6502 at 1 MHz would take around 15 cycles per byte, ~3800 cycles or about 3.8 ms in total, to copy the same amount of data. That is around 4000 times slower than the 1µs bound, and roughly 6400 times slower than the measured ~600ns.
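
The comparison can be checked with a few lines of arithmetic. The sketch below only uses the figures quoted above (256 bytes per row with 8 chips, about 15 cycles per byte on a 1 MHz 6502, ~600 ns per row copy); the function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Total CPU cycles for a byte-by-byte copy loop. */
static uint32_t cpu_copy_cycles(uint32_t bytes, uint32_t cycles_per_byte) {
    return bytes * cycles_per_byte;
}

/* Duration of that copy loop in nanoseconds at the given clock. */
static double cpu_copy_ns(uint32_t bytes, uint32_t cycles_per_byte,
                          double clock_mhz) {
    assert(clock_mhz > 0.0);
    /* One cycle at f MHz lasts 1000 / f nanoseconds. */
    return cpu_copy_cycles(bytes, cycles_per_byte) * 1000.0 / clock_mhz;
}

/* How many times slower the CPU copy is than the in-DRAM copy. */
static double slowdown(double cpu_ns, double dram_ns) {
    assert(dram_ns > 0.0);
    return cpu_ns / dram_ns;
}
```

With the numbers from the text, cpu_copy_cycles(256, 15) gives 3840 cycles, which is 3.84 ms at 1 MHz. Dividing by the 1 µs upper bound gives a factor of 3840, and dividing by the measured ~600 ns gives a factor of 6400.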

Isn't it curious that this mode was first reported in 2013, when it already worked in DRAM from 1983? What would have been possible if the home computers of that time had had an ultra-fast block copy ability?
