This page is an attempt to explain how DRAM memory and 72 pin SIMMs work. This material is intended for students and engineers who may want to use a 72 pin SIMM in their own project designs. It may also provide a bit of insight into how this MP3 player's DRAM controller works. But first, a bit of history...

Sometime in November 2000, Matthew Burnham posted a question to a message board associated with Peter Kovacs's excellent mp3projects.com site. His question, "Has anyone tried using a SIMM as a buffer. I can't find any datasheets on the timing", provoked two lengthy replies from me, one explaining in general how this MP3 player's DRAM controller works, and another with more general information about using DRAM memory, specifically 72 pin SIMMs. Jesper Hansen offered quite a bit of additional information, based on his work using a DRAM chip (not a SIMM) with his yampp project.

On January 4th, 2001, Peter moved his message board from Amazingforums to Bravenet, which had the unfortunate effect of scrambling the message order and losing much of the html formatting. Those old messages are now buried deep in a long list. On either forum, the messages aren't editable by their authors, which is unfortunate, because my original postings had several minor wording errors. This page is an attempt to clean up this material and present it on a page where it may be useful long term. As time goes on, this page will probably be edited and improved, bearing less and less resemblance to the original message forum posts. If you're trying to use DRAM memory in your own project, I hope you'll find this information useful.

Recent User Comments:

Marcelo deCastro commented:

Hi Paul.
Well, I was reading an old document talking about how to interface an 8051 to SIMM memory from your website and something came to mind. A few months ago I sampled a few DRAM interface chips from National Semiconductor. They are quite cool. The DP8422AV-25 does all the refresh pulses, ras and cas calls for you. They interface straight to 8 bit micro and can handle up to 64 megabytes of dram. Everything in a plcc package. And they have smaller chips too.. this one was the top of the line.
I don't know the price of a fpga but this was a very good alternative for the hobbyist.

I'm not so sure about the "interface straight to 8 bit micro" part, having only looked at the DP8422A datasheet briefly. It appears to require wait state capability from the microprocessor, and the list of app notes does not include an interface example for small 8-bit microcontroller chips. It is also more expensive than a 10k gate FPGA which can implement this sort of circuitry.

Incompatible SIMM Formats (RAS lines, array size)

Here's the text of a message I wrote on April 17, 2002, which explains the SIMM compatibility issues I've run into and what the code in the mp3 player does about them:

Short summary: The hardware is designed for the SIMM configurations described in the Micron datasheets, and SIMMs with other wiring can only be partially accessed. When the new SIMM detection code finds even _less_ memory, it is actually doing exactly what it should, since it is carefully checking the memory instead of making the assumption that your SIMM is one of the four described in the Micron datasheets.

>however no matter what SIMM I add it's incorrectly detected as 2mb instead of 32mb. 0.6.8 and below detect 8mb instead of 2mb. It seems to be the new improved SIMM detection that has made it recognise even less.

The board hardware, FPGA and firmware were designed around the Micron datasheets, which are archived on the web site. So far, those are the only datasheets I have regarding 72 pin simms. I have seen datasheets for individual chips, and several people have emailed me datasheets that turned out to be for 168 pin DIMMs, not 72 pin SIMMs.

Sadly, someone changed the format of SIMMs and some SIMMs, particularly very new ones, tend to be different than the original "standard" documented by the Micron datasheet. I have done quite a bit of reverse engineering on several of these problematic SIMMs. It looks like two things tend to differ. First, many of the newest SIMMs only connect to RAS0. RAS1, RAS2, and RAS3 are not connected to any of the chips. Also, some SIMMs use a rectangular matrix instead of a square as defined by the Micron datasheets (and followed by the majority of used simms I have found). Micron described two matrix sizes, 1024x1024 (4 and 8 megs) and 2048x2048 (16 and 32 megs).

First, the RAS lines. 4 and 16 meg simms are supposed to use RAS0 and RAS2 (according to Micron), and 8 and 32 meg simms use all four RAS lines. 4 megs of memory appears as blocks 0 to 1023, and the FPGA asserts RAS0 for 0 to 511, and it asserts RAS2 for 512 to 1023. If you have a 4 meg simm that has only RAS0 connected, only the memory from 0 to 511 will work. Of course, the RAS lines are only half of the story.
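The block-to-RAS mapping for a 4 meg SIMM can be sketched in a few lines. This is only an illustration of the rule described above; the function names are made up for the example, and the 4 kB block size is an assumption derived from 4 megs divided into 1024 blocks.

```python
# Sketch only: which RAS line serves a given block on a 4 meg SIMM,
# following the rule above (blocks 0-511 use RAS0, blocks 512-1023 use RAS2).
# Function names are hypothetical; they do not come from the firmware.
BLOCKS_PER_RAS = 512  # assumed: 512 blocks of 4 kB = 2 MB behind each RAS line


def ras_for_block(block):
    """Return the RAS line number asserted for a block on a 4 meg SIMM."""
    if not 0 <= block < 2 * BLOCKS_PER_RAS:
        raise ValueError("a 4 meg SIMM only has blocks 0..1023")
    return 0 if block < BLOCKS_PER_RAS else 2


def usable_blocks(connected_ras):
    """List the blocks that work when only some RAS lines are wired on the SIMM."""
    return [b for b in range(2 * BLOCKS_PER_RAS) if ras_for_block(b) in connected_ras]
```

Calling `usable_blocks({0})` models a SIMM with only RAS0 connected, where just the first half of the blocks respond.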

The other ugliness is in the size and shape of the matrix inside the memory. Micron describes only square matrix arrangements, but some simms use a different number of bits for the rows and columns. The first major issue is that the FPGA absolutely will not work with any SIMM that uses fewer than 10 column address bits. I have only 1 SIMM that uses 9 bits for column, and the new simm detection code does a check for this. The other problem arises if the SIMM wants to use 12 or even 13 bits for either row or column address. The circuit board (and FPGA) only has 11 address bits to connect to the SIMM.

So suppose you have a SIMM with only RAS0 connected to a 1024x8192 size matrix. That would be labeled as a 32 meg SIMM. Because the SIMM implements only one of the four RAS lines, only 8 megs would be visible to the firmware. On top of that, it has 8192 rows, but only 2048 of those rows will be accessible on the board, because the board only has 11 address bits (again, motivated by the Micron datasheet). The SIMM would initially appear as 8 megs of memory, but it would actually be the same 2 megs appearing in 4 places.

The original SIMM detection code assumed there were only 5 possible scenarios... the four described in the Micron docs, or no simm at all. Other SIMM types would typically be mis-detected and the firmware would run for a short time and ultimately crash very badly.

The new simm detection code tests each 1 megabyte section of memory, which makes very few assumptions about the SIMM. There is an assumption that there are at least 512 rows and at least 1024 columns, so that memory will appear in chunks no smaller than 1 megabyte (the FPGA allocates the address space bits where the lower 20 bits are always mapped into the low 10 bits of rows and columns, and the upper 5 bits map into RAS selection and that 11th address signal for both row and column, and the 10th bit of the row). I've considered switching this in the FPGA so the requirement is a minimum of 512 columns and 1024 rows. At the end of the test is a check that a 4k block really is a solid block of memory (writes to one part don't make changes anywhere else in the block).

The 1 megabyte tests are done by writing known and distinct data into all of the previously located good sections, and then doing a write and readback of the next untested section, and a readback of all other good sections to see if writing to that new section corrupted any of the other good ones. I hope that complicated sentence made some sense.... you could always look at the code, I suppose.
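For readers who would rather see the idea than parse that sentence, here is a small simulation of the section test. It is a sketch under stated assumptions, not the firmware's code: `AliasedMemory` is a made-up stand-in for a SIMM whose address slots wrap onto fewer real sections, each "section" is shrunk to 16 bytes so the example runs instantly, and the final 4k block check is not modelled.

```python
class AliasedMemory:
    """Toy SIMM with `sections` address slots but only `real` sections of storage.
    Slot s lands on physical section s % real (an assumed aliasing pattern)."""

    def __init__(self, sections, real, size=16):
        self.sections, self.real, self.size = sections, real, size
        self.cells = {}

    def write(self, section, offset, value):
        self.cells[(section % self.real, offset)] = value

    def read(self, section, offset):
        return self.cells.get((section % self.real, offset), 0xFF)


def pattern(section, offset):
    """Known data that differs between sections at every offset."""
    return (section * 31 + offset * 7 + 1) & 0xFF


def fill(mem, section):
    for off in range(mem.size):
        mem.write(section, off, pattern(section, off))


def intact(mem, section):
    return all(mem.read(section, off) == pattern(section, off)
               for off in range(mem.size))


def detect(mem):
    """Return the list of sections that hold data without disturbing each other."""
    good = []
    for candidate in range(mem.sections):
        for s in good:
            fill(mem, s)          # known, distinct data in every good section
        fill(mem, candidate)      # write the untested section
        # Read back the new section, then re-check all the old ones.
        if intact(mem, candidate) and all(intact(mem, s) for s in good):
            good.append(candidate)
    return good
```

With 8 address slots backed by only 2 real sections, `detect(AliasedMemory(8, 2))` keeps sections 0 and 1 and rejects the rest, because writing to any later slot corrupts one of the sections already accepted.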

The whole idea of this test is to determine what portions of the memory are truly usable, making as few assumptions about the SIMM's configuration as possible (since there are so many very different types that are otherwise labeled the same). A 32 meg SIMM that was previously mis-detected as 8 megs of memory and now appears as 2 megs is probably an extreme case of using only one RAS line and a rectangular array where one dimension is only 1024 cells wide. Sadly, there are some SIMMs like this and due to the way the player was designed (for the Micron docs), only 2 megabytes of the SIMM is usable.

If I could go back in time and change the board design, I'd probably connect only two of the RAS lines (tying pairs together), and use the two extra FPGA pins to control two pairs of the CAS lines (for 16 bit access instead of using RAS's) and use the one leftover pin for a twelfth address bit. Maybe someday I'll make some mods and a special FPGA config for them, but the effort would be much better spent on making the firmware play files larger than the available memory!

Anyway, this message got long, but I hope it clears up a bit what is actually happening with the simm detection code.

Messages Archived

Name: Matthew Burnham
Subject: SIMM access

Has anyone tried using a SIMM as a buffer. I can't find any datasheets on the timing

Name: Richard Houghton
Subject: Re: SIMM access

Hi Matthew, How's it going ?
I've spent some time looking at using SIMM's as buffer memory, and concluded that it's a waste of effort with the 8051 type MCU. It's messy with the timings since it was not designed with dynamic memories in mind. I'm not convinced that such a large buffer is necessary anyway. I did find a circuit/article about interfacing DRAM, but with the 8051 it required the software to control the memory (I think wait until it was ready). I feel that the buffer memory should only require the normal memory access instructions of the MCU and in the case of the 8051 it's not possible. If you're still considering a FPGA solution then have a look at PJRC again.

Name: Paul
Subject: Re: SIMM access

It's not easy to interface a DRAM SIMM with a 8051, or other simple 8 bit microcontroller, but it can be done. My little MP3 player project does it (well, the soon to be released firmware does).

Here's the link: http://www.pjrc.com/tech/mp3/

The DRAM interface is done in a Xilinx FPGA chip. It's really pretty complex. I actually made two DRAM controllers... My first design didn't really allow 8051 bus cycles to access the DRAM. It used a small shared memory (using some Xilinx CLBs as SRAM). You'd write the 32 bit address and 16 bit data, and then write to a particular location that would cause a state machine to assert a sequence of signals to write that data to the DRAM. There was a similar sequence for reading into the shared memory. I intended to make a DMA capability, but it became obvious that this cumbersome access method would make it very difficult to use the DRAM memory for manipulating playlists, long filenames, and other generic program data.

So I did a major overhaul... The second design also has the shared memory, but it's used only for DMA transfer parameters. I added a second shared memory that serves as the upper bits of a transfer, and the top four bits of the 8051 address bus drive the address pins of this memory, so that the 8051 "sees" 16 4k blocks. Actually, I reserved the top block to serve as registers to control things, and the two SRAM blocks are mapped into this memory, since there needs to be a way to read and write them. The state machine is a bit more complex... it waits in an idle state, and when it sees an 8051 bus cycle, it traverses through a group of states that sequence a read or write to the shared memory, to the DRAM, or to the IDE interface. The FPGA clocks at twice the speed of the 8051, so there are plenty of clocks available to finish an access to any of these places while the 8051 waits (the IDE is the slowest to conform with PIO-0). To do DMA, the circuitry looks at the PSEN signal, and when the 8051 asserts it, the chip executes a group of states that does one bus cycle of a DMA transfer (or does a NOP sequence if no DMA transfers are in progress). If this all sounds complicated, well, it is, and it's taken me a few months to get it all working.
I'm still working on writing 8051 firmware to make good use of all this. If you come visit my site, the good news is that the firmware is open source (GPL'd). The bad news is that, at least at this time, I'm not distributing the FPGA source. The board is designed with flash rom, so the firmware and the FPGA config can be upgraded via a serial port. I've had quite a bit of interest in the project, and we're selling boards, kits and parts (currently out of kits and bare circuit boards, though we have assembled and tested boards in stock). Even if you're designing your own (and don't need a board), come take a look and maybe you'll find something useful for your own project.
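The 16-block window scheme described above can be sketched as a simple address translation. This is one reading of the description, not the FPGA's actual logic: the `bank` list stands in for the second shared memory, and the names `translate`, `REGISTER_BLOCK` and the tuple return format are invented for the example.

```python
BLOCK_BITS = 12        # each block the 8051 "sees" is 4 kB
REGISTER_BLOCK = 0xF   # assumed: the reserved top block is block 15


def translate(addr16, bank):
    """Map a 16-bit 8051 address to ('regs', offset) or ('dram', byte address).

    `bank` is a 16-entry list; entry i holds the upper DRAM address bits
    that the 8051's block i currently points at (hypothetical layout).
    """
    block = (addr16 >> BLOCK_BITS) & 0xF           # top four address bits
    offset = addr16 & ((1 << BLOCK_BITS) - 1)      # low twelve address bits
    if block == REGISTER_BLOCK:
        return ("regs", offset)
    return ("dram", (bank[block] << BLOCK_BITS) | offset)
```

For example, with `bank[2] = 0x123`, the 8051 address `0x2ABC` would land on DRAM byte address `0x123ABC`, while anything in `0xF000` to `0xFFFF` goes to the control registers instead.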

Name: Paul
Subject: Re: Re: SIMM access

I was reflecting back on the message I wrote just a couple hours ago... aside from not knowing how line breaks/wraps would be handled, it occurred to me that Matthew was looking for datasheets for SIMMs. Well, as nearly as I can tell, there's only one place left. The nice folks at Micron have kept them on-line, even though they're labeled as obsolete.

Here's some links directly to the PDF files:

4 and 8 Meg SIMM dm53.pdf (archived, no longer available at Micron.com)

16 and 32 Meg dm45.pdf (archived, no longer available at Micron.com)

Now that you've got the datasheet with all the detailed timing info, you'll probably be quite confused about how to actually read/write to the SIMM. I know I was for some time. I'll try to explain briefly (and over-simplified), so that the timing diagrams will make a bit more sense.

The SIMM has 4 RAS lines, four CAS lines, and a WE signal. These 9 lines are the ones that control the SIMM. There are 36 data lines (non-parity SIMMs will only implement 32), and 11 address pins (4 and 8 meg will only connect 10 of these). There's also 4 pins intended to identify what type of SIMM is installed, but they're worthless because most SIMMs didn't connect them properly.

I mentioned that there are 9 lines that control the SIMM, but conceptually, you'll create a RAS signal and use it to drive one or more of the RAS lines, and a CAS signal that you use to drive one or more of the CAS lines. So for now, let's imagine that there's only 1 RAS and 1 CAS, and later (if I don't get tired or Robin doesn't come distract me) we'll worry about which ones to drive. For now we only have to think about RAS, CAS, and WE.

The smaller SIMMs have 10 address pins, and the bigger ones have 11 (I suppose a 64 meg SIMM has 12, but I don't have any datasheets). That's not very many. To access the SIMM, you need to give it the address in two parts, so each address pin will actually communicate two bits of the address for each access. You'll also end up using some of your desired address to decide which combination of RASs and CASs to drive, but for now we're pretending like there's only one RAS and one CAS. In any case, the point behind the address pins is that two bits travel on each pin. If you're trying to connect a SIMM to the 8051, there are a few things you can do. You can make a bunch of 2:1 MUXs to feed either half of the whole address to these pins. That's what my DRAM controller does inside the FPGA. If you're using discrete logic chips or low density PLDs, you could take an approach like using the 8051's bus during half the time, and perhaps the output of a couple 74HC374 chips for the other half. If you're really tricky, you might try to wire a 74HC374 (writable at some special address) and the normal 74HC373 used on the 8051 bus, and use their tri-state enable pins somehow. The point is that to communicate with the SIMM, you need a way to selectively put two different address bits on each address pin.
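In software terms, the address multiplexing is just a split of one address into two halves plus a 2:1 selector. The sketch below assumes the upper half is sent first (with RAS) and the lower half second (with CAS); which address bits go to which half is a design choice, and the function names are only for illustration.

```python
def split_address(addr, bits=10):
    """Split a word address into (first_half, second_half) of `bits` bits each.

    Assumed convention: upper bits go out first (row), lower bits second (column).
    """
    if addr < 0 or addr >> (2 * bits):
        raise ValueError("address does not fit in two halves of %d bits" % bits)
    mask = (1 << bits) - 1
    return (addr >> bits) & mask, addr & mask


def mux(row, col, select_column):
    """2:1 mux: the value presented on the address pins at a given moment."""
    return col if select_column else row
```

With the default 10 pins, `split_address(0x12345)` gives a first half of 72 and a second half of 0x345; a SIMM with 11 address pins would use `bits=11`.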

To read from the SIMM, you do the following:

Keep WE de-asserted for the entire time
Apply the first half of the address
Assert the RAS line
Apply the other half of the address
Assert the CAS line (this causes the SIMM to drive the data pins which correspond to the combination of RAS and CAS lines you chose, so make sure nothing else is driving the data bus when you assert CAS)
Wait. If it's a 60 ns SIMM, it takes 60 ns from the time you asserted RAS, or about half that from when you asserted CAS, whichever is worse (my controller does these steps 34 ns apart from each other, but waits 102 ns so that even a very slow SIMM would work)
Latch the data or otherwise do whatever you wanted to do with it
De-assert CAS and RAS
Wait! You can not assert RAS again for a while (approx the spec'd access time). This is because the DRAM is doing a thing called "pre-charge". What's really happening here is that the read was destructive... the tiny charge in the memory cells was wiped out to drive the lines in the memory array leading to the sense amplifiers, and during this time the chip is writing the entire row (that's been read to a buffer) back into the memory array. Circuits inside the DRAM chips do all this for you automatically when you de-assert RAS... all you really need to do is make sure you don't assert RAS or CAS again until the precharge time is finished.

The data sheet is a bit confusing because there are lots of other things you can do, but for an MP3 player, there's just no need to go to the trouble... DRAM is just so much faster than a disk drive, cdrom, or even flash memory that it's not worth the effort. Having said that, the basic idea behind all those funny modes is that you can leave RAS asserted, and de-assert and re-assert CAS (each time giving a new second half address), and for each successive CAS, you only need to wait for half the access time. FPM SIMMs will tri-state the data pins as soon as CAS is de-asserted, but EDO will leave the buffers driven until RAS is deasserted. My controller always deasserts RAS and CAS at the same time, and if you design one for a MP3 player, yours probably will too. These fast CAS-only accesses are particularly useful in a PC, where virtually all access to the DRAM is loading or flushing chunks of the L1 or L2 cache, instead of normal reads like you'd expect from a microprocessor without a cache.

Reading from the SIMM is nice, but it'll only have garbage until you write something (hopefully MP3 bitstreams....) Here's how to write:

Drive the address pins with the first half of the address
Assert RAS
Drive the address pins with the other half
Assert WE
Drive the data pins with whatever you want to write. You only need to drive those pins that correspond to the combination of RAS and CAS lines you decided to drive. You can actually do this step and the previous two at the same time if you like
Assert CAS
Wait (not long, check the datasheet). My controller waits 68 ns.
Deassert CAS and RAS
Wait (again, check the data sheet, but I think it's the usual precharge time) My controller always allows at least 136 ns for precharge.
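The read and write sequences above can be tried out against a toy model. This is a behavioural sketch only: `ToyDram` is an invented class that latches the address on the RAS and CAS assertions, timing and precharge are not modelled at all, and unwritten cells read back as 0 here even though a real DRAM would hold garbage.

```python
class ToyDram:
    """Toy model of one DRAM bank with a single RAS, CAS and WE."""

    def __init__(self):
        self.cells = {}
        self.row = None        # latched when RAS is asserted
        self.addr_pins = 0     # whatever is currently on the address pins
        self.data_pins = None  # None means nothing is driving the data bus
        self.we = False

    def assert_ras(self):
        self.row = self.addr_pins              # first half of the address

    def assert_cas(self):
        key = (self.row, self.addr_pins)       # second half of the address
        if self.we:
            self.cells[key] = self.data_pins   # write cycle
        else:
            self.data_pins = self.cells.get(key, 0)  # read: chip drives the bus

    def release(self):
        """De-assert CAS, RAS and WE; a real chip now needs its precharge time."""
        self.row, self.we, self.data_pins = None, False, None


def write_word(dram, row, col, value):
    dram.addr_pins = row     # drive the first half of the address
    dram.assert_ras()
    dram.addr_pins = col     # drive the other half
    dram.we = True           # assert WE
    dram.data_pins = value   # drive the data pins
    dram.assert_cas()
    dram.release()


def read_word(dram, row, col):
    dram.we = False          # keep WE de-asserted the whole time
    dram.addr_pins = row
    dram.assert_ras()
    dram.addr_pins = col
    dram.assert_cas()
    value = dram.data_pins   # latch the data
    dram.release()
    return value
```

Writing a value with `write_word` and reading the same row and column back with `read_word` returns that value, which is about all this model is good for; the datasheet timing diagrams remain the real reference.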

If you like, you can actually mix reads and writes within the same assertion of RAS, if for example you want to read, change some of the bits, and write back. My controller does that to write a byte, because I connected the RAS and CAS lines in a way where my controller cannot write a single byte at a time.... I ran out of pins on the FPGA, and I was determined to use the 84 pin chip, so that the project would be buildable by hobbyists. The 144 pin TQFP package costs the same, but it's virtually impossible to hand solder.

Now that you've got the steps to write data and read it back, the only other thing you need to do is make sure the data stays around. DRAM needs "refresh".... it won't retain its memory if you just leave it sitting still. You're supposed to do refresh operations at an average of no more than 15.26 µs between them. You can do them at 15.26 µs intervals (like mine does), or you can do, say, 1024 at once and do these groups at 15.6 ms intervals. My controller does the refresh while the 8051 is asserting ALE (update: it now schedules refresh together with everything else and snoops the 8051's opcode fetches from the flash rom to predict when it can hog 100% usage on the DRAM bus vs when it needs to be ready to respond to the 8051's MOVX instructions).

To do a refresh is easy:

Make sure WE isn't asserted
Assert CAS
Assert RAS
Deassert both of them
Wait for the precharge time before using the SIMM.

It's that easy. Well, there needs to be some minimal delay between the steps, but not much.... check the datasheet. I believe my controller asserts CAS, then RAS 68 ns later, keeps them both on for 136 ns, and allows 136 ns for precharge.
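The refresh numbers quoted above fit together with a little arithmetic. The sketch below only re-derives figures already mentioned (15.26 µs spacing, bursts of 1024, and the 68/136/136 ns controller timings, which are given from memory); the "share of bus time" is a rough estimate added for illustration, not a measured value.

```python
REFRESH_INTERVAL_US = 15.26   # average spacing between refresh operations
REFRESHES_PER_PASS = 1024     # size of one burst, as quoted above


def full_pass_ms(interval_us=REFRESH_INTERVAL_US, count=REFRESHES_PER_PASS):
    """Time between bursts if `count` refreshes are done at once."""
    return interval_us * count / 1000.0


def refresh_bus_share(cas_to_ras_ns=68, both_on_ns=136, precharge_ns=136,
                      interval_us=REFRESH_INTERVAL_US):
    """Rough fraction of DRAM time spent on one refresh per interval."""
    busy_ns = cas_to_ras_ns + both_on_ns + precharge_ns
    return busy_ns / (interval_us * 1000.0)


if __name__ == "__main__":
    print("burst spacing: %.1f ms" % full_pass_ms())
    print("refresh share: %.1f %%" % (100 * refresh_bus_share()))
```

With these inputs, each refresh occupies 340 ns out of every 15.26 µs, so refresh costs only a couple of percent of the DRAM's time when done in hardware.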

Now the last major issue is to figure out which combination of RAS and CAS to assert... I'm getting tired, and the Micron datasheet has some nice diagrams, so I'll finish up quickly.

One thing that they don't really explain in the datasheet is that it's ok to drive a CAS line without driving a corresponding RAS... whichever chip(s) that is on the SIMM will ignore it. Some of the ways you might want to connect the signals will end up driving the CAS on some of the chips without their RAS being driven, and that's ok. Driving RAS without CAS is probably ok too, but the chip will consume its active current (approx 1000 times the idle). I've avoided driving RAS without CAS.

The 8 and 32 meg SIMMs use all four RAS lines, and you can think of them as being two 4 or two 16 meg SIMMs, where the pair of RAS lines selects which SIMM you're using. Each single-sided SIMM (4 and 16) is really two 16 bit memories, with each RAS line controlling one of them. Each of these 16 bit memories has two CAS lines, which lets you read or write either half, or both halves at once if you drive both. Update: there are other simm arrangements in common use... see the June 26, 2001 entry on the recent news page (or history page linked at the bottom).

If you want a 32 bit wide memory, you'll probably always drive the RAS lines in pairs, and then you'll drive 1 CAS line if you only need 1 byte, 2 CASs for a 16 bit word, and all four for all 32 bits. For the 4 and 16 meg SIMMs, you'll only have one RAS pair to drive, and for the 8 and 32, you'll have to drive one pair or the other depending on one of your address bits.

It's also possible to connect the SIMM with a 16 bit data bus, which is what my controller does. In this way, you have either 2 or 4 banks of RAM, so you use either one or two address bits, and always drive only a single RAS line at a time. For the 16 bit config, you'll connect the CAS lines in pairs, and for reading/writing a byte you'll drive the pair depending on MSB/LSB, or both if you want all 16 bits. My controller actually ties all 4 together on the board, and always reads all 16 bits, and when it needs to write 8 bits, it does a read-modify-write cycle.
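The read-modify-write trick for byte writes on a 16 bit wide memory looks like this in software. It is a sketch, not the controller's logic: the memory is a plain list of 16-bit words, and the choice that odd byte addresses live in the upper half of the word is an assumption made for the example.

```python
def merge_byte(word, byte, high):
    """Return the 16-bit word with its upper or lower byte replaced."""
    if high:
        return ((byte & 0xFF) << 8) | (word & 0x00FF)
    return (word & 0xFF00) | (byte & 0xFF)


def write_byte(memory, byte_addr, value):
    """Write one byte into a 16-bit-wide memory using read-modify-write."""
    index = byte_addr >> 1            # which 16-bit word
    high = bool(byte_addr & 1)        # assumed: odd addresses use the upper byte
    word = memory[index]              # read all 16 bits
    memory[index] = merge_byte(word, value, high)  # modify, write all 16 back
```

Starting from a word of `0x1234`, writing `0xAB` to byte address 0 gives `0x12AB`, and then writing `0xCD` to byte address 1 gives `0xCDAB`. On the real hardware both halves of this happen within one assertion of RAS.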

Whew, this has been a long post... Robin went to bed just a while ago, and I should get some sleep too. I hope this has helped a bit. I am speaking from the recent experience of designing a custom DRAM controller for the 8051 (that is now working), and I know it can be difficult to make sense out of the datasheets at first.

Name: Matthew Burnham
Subject: Re: Re: Re: SIMM access

Thanks Paul, I've had quick skim through that while printing it off and that's answered a lot of my questions (particularly how the waveforms work out over multiple transfers, the datasheet I'd got glossed over that a little).

One thing I'm interested about is how you connected all 32-pins to the SIMM as well as the IDE and 8051... I understand from your website you've shared the IDE data ports for 16-bits and now understand how you've made use of all 32-bits. Great. Right, I'm off to do some soldering to see if I can get my FPGA clock wired in.

Sorry for not getting round to replying to your email Richard. Its been sat in my Draft box for a while. It's funny where you meet people!

Name: Jesper
Subject: Re: Re: Re: Re: SIMM access

Let me add a few thoughts to the above discussion....

First of all - buffering is not needed for playing MP3's off a HD. Maybe that wasn't the original intent, but well, the message was posted here, so... I've tried playing with just one sector buffer (and using FAT32), and its working perfectly. But if the player should be battery powered, it may be of benefit to cache ahead several Megabytes. Unfortunately, with this, you'll have to wait for the disk to spin up, if you want to change the preloaded songs. I don't know the dynamic power requirements of a xxMB SIMM, but compare it to a low power 2 1/2" disk which uses about 450 mA.

Secondly, it's no big deal to interface a DRAM to an 8-bitter like the 8051/AVR's, IF you don't need the full speed ! You can use I/O pins to handle the RAS/CAS sequencing and a timer interrupt can handle the refresh. Yes, it will introduce overhead, but that may not be critical for your application.

/Jesper

http://www.myplace.nu/mp3
yampp - Yet Another MP3 Player

Name: Jesper Hansen
Subject: Re: SIMM access

Hi all,
I'd expected someone to make me prove my theories, but nothing happened. But I thought I saw it coming, so I prepared and hooked up a DRAM to my STK200, and here's the results :

1. It does NOT draw much power. You're right here Paul. As the specified current draw of 120 mA is while operating, it's obviously only while OE is low and the chip is cycling. I measured about 2-3 mA for my chip, but it wasn't very active. Just refreshing and reading out a relatively slow stream of data.

2. I connected the chip with a 14 wire interface. The address A0-A7 and data D0-D7 is connected together to PORTA, and A8,A9R,WR,OE,RAS and CAS is connected to PORTC. A burst refresh takes 1.2 mS (1024 cycles) using CAS before RAS mode. So far so good.

The slo-mo came when I had to do the Read and Write. With my current routines (in C-code), a read takes 2.6 uS and a write takes 2.2 uS (on a 11.0592 MHz 8515). These routines can be optimized in assembler to about 20 cycles each.

The interface is pretty nice with just two functions using long pointers so the full 512kB (or more) is accessible without worrying about paging and such nasty stuff.

So ... to conclude this :
It's possible to make a glueless interface to DRAM's without ANY extra hardware. The penalty is as always - speed. But if you can live with RD/WR speeds of 300-500 kB/sec, it's doable.
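As a quick check on those figures, the per-access times quoted above convert to throughput as follows. This is plain arithmetic on the numbers in the message; it assumes one byte per access, ignores refresh overhead, and for the optimized case assumes one clock per cycle on the 11.0592 MHz AVR.

```python
def bytes_per_second(access_time_us):
    """Throughput if every byte costs one access of the given duration."""
    return 1.0 / (access_time_us * 1e-6)


def cycles_to_us(cycles, clock_hz=11_059_200):
    """Duration of a routine, assuming one clock per cycle (an assumption)."""
    return cycles / clock_hz * 1e6


if __name__ == "__main__":
    print("read : %.0f kB/s" % (bytes_per_second(2.6) / 1000))
    print("write: %.0f kB/s" % (bytes_per_second(2.2) / 1000))
    print("20 cycles: %.2f us" % cycles_to_us(20))
```

The C routines work out to roughly 385 kB/s for reads and 455 kB/s for writes, which sits inside the 300-500 kB/sec range stated above.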

/Jesper

PS. A funny thought just struck me.... With my AVR at 11.059 MHz, it's quicker (or as quick as) a standard 8051 at 12 MHz in reading/writing memory. When was the last time you saw a 8051 with several MB of linearly addressable memory ??? DS.

http://www.myplace.nu/mp3
yampp - Yet Another MP3 Player

Name: Paul
Subject: Oops :)

When I wrote that very lengthy message above, I made a number of small mistakes (aside from the places where my html formatting didn't turn out quite right). For example, the last line about the fast page modes ought to read "microprocessor without a cache". There are a few other with vs without screw ups (it was late and it's a bitch to proofread a long message in such a tiny text entry window). I may turn the text into a web page and edit to clean up those little confusing mistakes. Update: they've been fixed now. If you see anything that looks wrong, please email.

(Jesper) My experience with playing from a hard drive with no buffering is not quite as good as yours. All the standard 3.5 inch drives (that I've tried) can play without a buffer, but if you shake the drive, it will often take a moment to respond and you'll get a drop-out. That's standard 3.5 inch drives. Laptop drives are not nearly as nice. Most of them do thermal recal and other not-so-nice things and stop responding for a while. You really do need to buffer, but perhaps 64k would be enough for steady playback. For 320 and maybe even 256kbps, 64k is a bit small. My Hitachi drive (DK23AA-12) sometimes spends what seems like 1-2 seconds doing a recal.

The reason I wanted a large buffer is to be able to run from batteries. The DRAM consumes very little power when sitting idle (refresh only). I haven't done any accurate measurements, but I do keep an eye on the needle of the benchtop power supply, and it doesn't make a noticeable change when I insert the SIMM and config the FPGA (starts the refresh). Sadly, the FPGA uses more power than I wanted.

Regarding doing all the interface in software and using a timer interrupt for the refresh, consider an 8051 running at 24 MHz. A "cycle" is actually 12 crystal cycles. Dallas makes some expensive chips where it's only 4, but it's only about 2X the speed because many instructions take longer. Let's assume a standard 8051, since 24 MHz is pretty fast. The timer interrupt takes 4 cycles to interrupt and vector to the code (2 for vector, 2 for the call). Let's assume you don't save any context, and do roughly this:

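CLR CAS_PIN    ; assert CAS first (active low): 1 cycle
CLR RAS_PIN    ; then assert RAS, giving a CAS-before-RAS refresh: 1 cycle
SETB CAS_PIN   ; de-assert CAS: 1 cycle
SETB RAS_PIN   ; de-assert RAS: 1 cycle
RETI           ; return from the timer interrupt: 2 cycles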

That's 6 cycles, plus 4 for the vectoring, for a total of 10 cycles, or 5 µs. You need to do this every 15.26 µs, which means you've burned 33% of the CPU, and you've still got to move all those MP3 bits around... and every read/write to the buffer is lots of bit-bashing.

Now, I'll admit that you'd probably choose to do refresh in blocks. The decoder chips have small buffers (the datasheets I have don't say exactly how much). Maybe you'd do 256 refreshes in one timer interrupt, consuming 1034 cycles every 3.9 ms, which only burns 13.2% of the CPU time. An AVR or Dallas fast 8051 could do better, but it's still a considerable burden on the CPU. Also, every time you do an access to the DRAM, the code would have to disable the timer interrupt, and the buffer is worthless unless you can fill it (while playing) faster than you're draining it. My point is that the overhead of a software-only approach is considerable.
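The two overhead estimates above can be reproduced with a few lines of arithmetic. The function names are just for the example; the inputs (24 MHz crystal, 12 clocks per cycle, 10 cycles per single refresh, 1034 cycles per burst of 256) are the ones used in the text.

```python
def machine_cycle_us(crystal_hz=24_000_000, clocks_per_cycle=12):
    """Length of one 8051 machine cycle in microseconds."""
    return clocks_per_cycle / crystal_hz * 1e6


def cpu_share(cycles_per_interrupt, period_us, cycle_us):
    """Fraction of CPU time spent inside the refresh interrupt."""
    return cycles_per_interrupt * cycle_us / period_us


if __name__ == "__main__":
    cycle = machine_cycle_us()
    single = cpu_share(10, 15.26, cycle)           # one refresh per interrupt
    burst = cpu_share(1034, 256 * 15.26, cycle)    # 256 refreshes per interrupt
    print("single: %.1f %%, burst: %.1f %%" % (100 * single, 100 * burst))
```

The single-refresh interrupt comes out near 33% and the 256-refresh burst near 13.2%, matching the figures above; a faster core could be modelled by changing `clocks_per_cycle`.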

Name: Jesper
Subject: Re: Oops :)

Hi Paul,

I haven't had any problems with the drive recalibrating or anything. It's a 2 1/2" IBM Travelstar. But obviously, if you shake the drive, there will be dropouts without buffering. But so far, I haven't had any major problems. I usually have a few 4 kB buffers, but was just testing the actual need for buffering, and found that it worked okay with a single sector buffer.

I don't say that there isn't any overhead with DRAMs in my refresh scheme. But depending on your application, it may not be important. In my MP3 player, only 9 % of the CPU capacity is used ( 8 MHz AVR 8515 ), so using DRAM would be no big deal.

And I'm even using a very timeconsuming way of interfacing to the IDE. But without the need for any extra interface hardware.

The Refresh code would look as your example, but note that on a 8 MHz AVR, it will take only about 1 - 1.2 uS to execute a single CAS/RAS refresh. A 128 kbps song will cause the MAS to request data in 1.5 mS bursts every 10 mS. A 1024 row refresh sequence could easily be put in between as it only would take about 1.3 mS.

The refresh normally has to be repeated every 16 mS, making the load only a little more than 8 %. So together with the other code, I still have about 80 % CPU power left :)
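The load estimate in this message is easy to reproduce. The sketch uses only the numbers quoted here (a 1.3 mS burst every 16 mS, and 9 % for the rest of the player code); the function names are invented for the example.

```python
def refresh_load(burst_ms=1.3, period_ms=16.0):
    """Fraction of CPU time taken by one refresh burst per period."""
    return burst_ms / period_ms


def cpu_left(player_load=0.09, burst_ms=1.3, period_ms=16.0):
    """CPU fraction remaining after the player code and the refresh bursts."""
    return 1.0 - player_load - refresh_load(burst_ms, period_ms)


if __name__ == "__main__":
    print("refresh: %.1f %%" % (100 * refresh_load()))
    print("left   : %.1f %%" % (100 * cpu_left()))
```

That gives a refresh load of a little over 8 % and roughly 83 % of the CPU left over, in line with the "about 80 %" above.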

No problemo, in other words.

It sounds strange that the SIMM doesn't use more power. According to my datasheet on a single DRAM chip, the MSM514800, 512kB*8, the average power consumption operating is stated as 650 mW (about the same as a SRAM). The standby consumption is 5 mW.

I can't see how a 8 chip (or more) SIMM could use much less ?
/Jesper

Name: Paul
Subject: Re: Re: SIMM access

Jesper,

Sounds like you've done some really good work with the AVR and the DRAM. With A0-A7 and D0-D7 tied together, do you get a logic contention when you read the chip? When you assert CAS, you need to be driving the address pins with the column address bits, and when CAS goes low, the chip turns on its output drivers. Hmm... just read your message again, looks like you're using the OE pin, which will solve that problem. The OE pin is available on individual chips, but it's not one of the signals available on the 72 pin SIMM, which is strange, since there's a few unused pins, but that's the way they are.

I'm pretty impressed that you got such good speed. 300-500 kB/sec is damn good. My hardware-based DMA gets about 1.2 to 2.5 MB/sec, depending on a few factors. Of course, the CPU can execute other code, as well as do normal DRAM access during DMA transfers... and that's a good thing, since the 8051 is so slow (I'm only clocking at 7.3728 MHz), so doing 4k DMA bursts only gives me 2 ms to get the next one set up: figuring out which sectors to read, where to put them in memory, and updating tables and lists to organize everything. It's turning out to be more work than I had originally thought to manage so much memory. I'm still not sure how to decide when to un-cache FAT sectors. Clusters are easy, free them when the file or directory is closed.

It's true that the AVR kicks the 8051's butt, at least in processing power at the same clock speed. I wanted to use the AVR for my project, but there were two things that held me back. #1, it can't execute from external memory. I had already made a homebrew player for myself, and the major goal of the new design was to make something for others. I really needed to have the firmware be downloadable with only a standard PC port, and this was quite easy with the 8051. The second major reason for not using the AVR in my little project was that I couldn't find vendors who'd sell me even 100 parts. Apparently Atmel is having a really hard time meeting the demand (partly Atmel's fault, mostly 'cause the AVR is such a kick-ass chip). (The AVR shortage ended somewhere in the winter of 2001 and they became readily available again). The 8051 is cheaper than the AVR, but even in performance/dollar, the AVR is the clear winner compared to the 8051. Where I work, I started a design in the summer of '96, and I really wanted to use the AVR (vapor at the time). Atmel claimed it would be "any week now", but it was about another year until it finally appeared. I ended up using a Microchip PIC. At the risk of getting flamed, I'm going to say that the PIC is about the worst instruction set I've ever used (well, besides one I designed myself as a student, OSU8 project). All through that project, I kept looking over at that vaporous AVR databook. Woulda been nice. I'm saying the same thing again while writing code these days for my MP3 project... if only Atmel made one that could run from a large external memory, and that was readily available!

(from an email on Nov 6, 2002)

>I have found your page to be VERY useful. I thank you for putting it up.

I'm glad it has helped.

>I would like to refresh 72 pin DRAM without an FPGA or DRAM controller. I am
>thinking of either software refresh with the 8051 doing CAS before RAS
>refresh which would take, on a 24 MHz 8051, either 2 or 4mSec out of a
>16mSec refresh cycle.

The simplest approach, if your application can tolerate it, is to do a block refresh. Every 8 ms (or 4 ms) you do a whole bunch of refreshes very rapidly (likely from a timer interrupt). If you build hardware to do a cas before ras cycle on a MOVX at a certain address, then you can just put 262 movx's in a row (262 bytes of code space), plus the interrupt routine's overhead. At 24 MHz, a movx takes 1 us, and figure you'll spend 40 us entering and exiting the interrupt routine. That's just .3 ms spent every 4 ms, or 7.5% of your cpu time.
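
Here's a minimal sketch of that block refresh in portable C, just to show the shape of it. Everything named here is hypothetical: refresh_trigger stands in for whatever address your hardware decodes as a refresh cycle, and block_refresh and block_refresh_us are illustrative names. On a real 8051 you'd place the trigger in MOVX space using your compiler's own syntax and call the routine from the timer interrupt, and a straight run of movx instructions would avoid the loop overhead this version has.

```c
#include <stdint.h>

/* Hypothetical stand-in for the address that the external hardware
   decodes as "do one CAS-before-RAS cycle". On a real 8051 this would
   be a location in MOVX (external data) space; here it is an ordinary
   byte so the sketch compiles on any C compiler. */
static volatile uint8_t refresh_trigger;

/* Issue `count` refresh cycles back to back. Each volatile read stands
   in for one MOVX. Returns the number of cycles issued. */
unsigned block_refresh(unsigned count)
{
    unsigned issued = 0;
    while (issued < count) {
        (void)refresh_trigger;   /* one bus read = one refresh cycle */
        issued++;
    }
    return issued;
}

/* Time spent per block refresh, in microseconds: the MOVX reads plus
   the interrupt entry and exit overhead. */
unsigned long block_refresh_us(unsigned count, unsigned us_per_movx,
                               unsigned isr_overhead_us)
{
    return (unsigned long)count * us_per_movx + isr_overhead_us;
}
```

With the figures above, block_refresh_us(262, 1, 40) comes to 302 us, which is the roughly .3 ms out of every 4 ms mentioned earlier.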

>Or a hardware assisted refresh using a GAL 16V8 or
>22V10 to build a state machine to toggle the RAS and CAS lines the needed
>amount of times while the 8051 requests permission to access the DRAM for
>short to medium periods. What do you think of the 2 methods?

There are lots of more efficient ways to get the refresh done if you are willing to throw hardware at it. My FPGA is on the hardware heavy side.

>It seems to me
>I would be able to gain a lot by using a GAL state machine to refresh the
>DRAM.

Well, if your app can't work with the latency of having the CPU consumed for .3 ms every 4 ms (or some other ratio per second, and all the rows every 4 or 8 ms depending on the chip on the simm), then you'll probably need dedicated hardware of some sort.

If your app runs always from external code memory and NOT the simm (you only store MOVX-accessed data in the simm) then maybe you can do the refreshes during the PSEN strobe. It's also possible to try and sneak them in during ALE (which continues pulsing even when running from the internal code memory of a 87c5x chip).

Whatever you do, I hope you'll take a bit of time to write about it if it works out well. If you put it on a website somewhere, I'll make a link to it, or if you don't have a site I can post it to that SIMM page.

MP3 Player, Using The STA013 Chip, Paul Stoffregen.
http://www.pjrc.com/tech/mp3/simm/simm.html
Last updated: February 23, 2005
Questions, Comments?? <paul@pjrc.com>