From: Jaap van Ganswijk
To: mc68hc11@bobcat.etsu.edu
Subject: Re: Memory testing algorithms

At 08:38 96-10-29 -0600, you wrote:
>I have a student looking for references on memory testing algorithms,
>i.e. for bad cells, stuck on zero, one, bad addressing etc. Can anybody
>point out anything published in this area?

At the University of Delft they did this kind of thing when I was there.
You can find them at: http://www.et.tudelft.nl/
Or more precisely: http://einstein.et.tudelft.nl/~linden/testing/
It was handled by Prof. A.J. van de Goor.

My conclusion was that it's best to use a march test:
- fill all bits with zero
- walk through the memory and see for each bit if it is still zero,
  then change it to one
- walk through the memory and see for each bit if it is still one,
  then change it to zero

It can be proved (by writing out all combinations) that all the kinds of
errors you mention will be detected when they involve only two cells.
It can also be proved (by mathematical induction) that when errors
involving two cells are found, errors involving any number of cells will
be found as well.

Date: 19961102
From: Fred E. Stuebner
To: Multiple recipients of list
Subject: Re: Memory Test

When we test memories, the criteria for the test must first be defined.
Some of the questions which must be addressed include:

What type of memory are we testing? Is it static or dynamic?
Is it embedded within a micro?
What is the temperature range at which it has to operate?
What is the voltage range of the power to the memory?
What is the timing margin of both the memory and the using system?
What are the noise tolerances and the signal level tolerance of both the
memory and the system?
Are there any known interference mechanisms within the memory itself?

Let me comment on each of the above, a little bit.

If we are testing a dynamic memory, it requires a periodic refresh.
To prove retention of data, we must refresh at the slowest rate, with
temperatures and voltages at the most critical setting. Every time we
either read or write a location, we effectively insert an additional
refresh cycle to either a portion or the entire array.

If you test the memory on the bench, you have no assurance it will operate
at either temperature extreme outdoors. You need to check with the
manufacturer of the chip to determine critical conditions.

If your design permits Vcc to vary from 4.5 to 5.5 V, you had better test
at both 4.3 and 5.7 V.

You need to verify that your memory writes in the allotted time (fastest
CPU and slowest memory). You need to work against worst-case numbers, not
typical numbers. The same applies to read timings.

In each system there is (electrical) noise. Each signal line must also be
securely up or down. A marginal level either in or out can give failures.

Depending on the design of the memory chip itself, there may be particular
patterns which aggravate problems. Walking ones and zeros may not be
enough to provide a good test. A better test is one where each address and
each data line is switched each cycle, and this is done while operating
the memory at a clock rate 5% higher than normal. For example, read
location 0000, then read location FFFF. Next read 0001 followed by a read
of FFFE, etc. The data should also change each cycle. The object here is
to generate the most noise within the chip. Remember, this *could* be the
pattern your application generates.

Now that I have either scared or bored you, let me add: Look at your
application. If it is a toy, who cares. If it is a navigation device for
an airplane, concern yourself with all of the above. In this case, you
need to talk to the application engineers from the manufacturer of the
device you are testing. You do not need to do "all corners".

Further information on this type of testing can be obtained from the
manufacturers of memory testers such as Teradyne, Megatest, and others.
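
The access order described in the message above (0000, FFFF, 0001, FFFE,
and so on, with the data changing each cycle) could be generated roughly
as follows. This is only a sketch: the function names, the 0x55 starting
pattern and the 8-bit data width are assumptions made for illustration,
and the 5% faster clock is a hardware matter that is not modeled here.

```python
def complement_address_sequence(address_bits):
    """Yield addresses in the order 0, max, 1, max-1, ...

    Within each pair (low, complement of low) every address line flips.
    """
    top = (1 << address_bits) - 1
    for low in range(1 << (address_bits - 1)):
        yield low
        yield low ^ top  # bitwise complement within address_bits


def toggling_accesses(address_bits, pattern=0x55, data_bits=8):
    """Pair each address with a data word that is inverted on every cycle.

    `pattern` and `data_bits` are illustrative assumptions, not values
    taken from the message.
    """
    mask = (1 << data_bits) - 1
    for cycle, address in enumerate(complement_address_sequence(address_bits)):
        data = pattern if cycle % 2 == 0 else ~pattern & mask
        yield address, data
```

For a 16-bit address bus, list(complement_address_sequence(16)) starts
with 0x0000, 0xFFFF, 0x0001, 0xFFFE and visits every address exactly once.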
From: Jaap van Ganswijk
To: Komala Vengadasalam
Subject: Re: SRAM tester

Hi Komala,

At 03:23 19991003 GMT, Komala Vengadasalam wrote:
>Dearest Jaap,
>
>I need to ask you few questions on memory testing. If lets say i want to use
>2 types of testing 1. walking bits and 2. checkerboard pattern. A question
>might arise why you chose these 2 types of testing.
>
>I know that checkerboard testing is a faster one but still in certain memory
>testers in the market today, they use few types of testing ; march
>algorithm, moving inversion etc.
>
>I hope to get some advice from you about this. Thank you.

Efficient forms of testing always require knowledge about the product and
what possible flaws to test for. For example, when you need to test DRAM
that consists of 8 separate chips that each hold one bit of a byte, then
it's very unlikely that two bits in a byte 'will stick together', so using
the running bit test is a waste of time.

Using a checkerboard pattern is usually a good idea (unless of course
every other bit is inverted somewhere on its way to the actual RAM, but
that's unlikely).

For the march algorithm I once proved (using mathematical induction) that
two neighbouring bits that are connected in any possible way will be
found, and that this is also the case with 3 or more bits that are
connected in any way.

Thus I personally think that the march test is a very good (and minimal)
way to test most types of memory, especially one-bit DRAM. When testing
multibit memory, a march test with the checkerboard patterns AA and 55
should do the trick.

By the way, when testing DRAM it must also be considered that the testing
process itself will cause the needed refresh, so testing the refresh
circuits will need another approach, like filling the memory with a
pattern, waiting a minute and checking the contents again. This may be
combined with a single-run march test, I think.
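
A march test over byte-wide memory with the AA/55 background mentioned
above might look like the sketch below. The memory classes are
hypothetical simulation models, and the fault (data bits 2 and 3 shorted
so that both store the AND of what was written) is an assumed example of
two bits that 'stick together'; neither comes from the message itself.

```python
class ByteMemory:
    """Hypothetical fault-free byte-wide RAM model."""

    def __init__(self, size):
        self.cells = [0] * size

    def write(self, address, value):
        self.cells[address] = value & 0xFF

    def read(self, address):
        return self.cells[address]


class ShortedBitsMemory(ByteMemory):
    """Assumed fault model: data bits 2 and 3 are shorted (wired-AND)."""

    def write(self, address, value):
        both = (value >> 2) & (value >> 3) & 1
        value = (value & ~0x0C) | (both << 2) | (both << 3)
        super().write(address, value)


def march(memory, size, background):
    """Three-pass march using `background` and its inverse; True if no error."""
    inverse = ~background & 0xFF
    for address in range(size):            # pass 1: fill with background
        memory.write(address, background)
    for address in range(size):            # pass 2: check, write inverse
        if memory.read(address) != background:
            return False
        memory.write(address, inverse)
    for address in range(size):            # pass 3: check inverse, restore
        if memory.read(address) != inverse:
            return False
        memory.write(address, background)
    return True
```

Under this assumed fault, march(ShortedBitsMemory(64), 64, 0x00) passes
because 00 and FF never put different values on the two shorted bits,
while the same call with background 0xAA reports an error.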
From: Jaap van Ganswijk
To: Komala Vengadasalam
Subject: Re: SRAM tester

At 06:08 19991011 GMT, Komala Vengadasalam wrote:
>Hi Mr.Jaap,
>
>I still have some questions on SRAM testing. static Memory testing? Recently
>in one of the articles, they said before i start to do memory testing i must
>do walking bit test to test the access to the address and data line of the
>memory. And continue with the memory cells, using checkerboard.
>
>I didn't really get the idea about it. please give some of your comments
>about it. Is it good to use these algorithms.

I have proved that a march test will find all errors of given categories
and it will certainly find all addressing errors. The method was aimed at
finding all errors in 1-bit wide memory, but when you do the march test
with all values 0..255 it can also find all errors in 8-bit wide memory.

When you can exclude that non-adjacent bits are ever interconnected, then
testing with all patterns of 00, 01, 10, 11 for every two adjacent bits
should also suffice. You can for example use 00000000, 01010101, 10101010
and 11111111.

When it's known that bits that are interconnected will always give the
same value back, you only need to test with AAh and 55h and not 00h and
FFh.

By the way, the article that you are referring to, was that in a
scientific magazine or in a popular magazine?

You can perhaps find more on memory testing when you do a search for
'van de Goor' and/or look at sites like:
http://www.et.tudelft.nl/ or http://www.ict.tudelft.nl/
(Or look for Technical University Delft or Technische Universiteit Delft).
(Van de Goor was the professor that I did the project for.)

Let me know in detail what you don't understand yet and I'll try to
explain.

Ah, another thing to consider is that most algorithms will not find all
errors of a non-permanent nature, such as a connection that is present
only part of the time. These often depend on temperature or, on a PCB, on
the tension on the board.
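
The pattern argument earlier in this message (that 00000000, 01010101,
10101010 and 11111111 put all of 00, 01, 10 and 11 on every two adjacent
bits) can be checked mechanically. A small sketch, with function names
chosen here purely for illustration:

```python
ALL_COMBINATIONS = {(0, 0), (0, 1), (1, 0), (1, 1)}


def adjacent_pair_coverage(patterns, width=8):
    """For each adjacent bit pair (i, i+1), collect the value pairs applied."""
    coverage = {}
    for i in range(width - 1):
        coverage[i] = {((p >> i) & 1, (p >> (i + 1)) & 1) for p in patterns}
    return coverage


def covers_all_pairs(patterns, width=8):
    """True if every adjacent bit pair sees 00, 01, 10 and 11."""
    coverage = adjacent_pair_coverage(patterns, width)
    return all(seen == ALL_COMBINATIONS for seen in coverage.values())
```

With the four patterns 0x00, 0x55, 0xAA and 0xFF, covers_all_pairs
returns True. With only 0xAA and 0x55 each adjacent pair sees just the two
combinations where the bits differ, which is the case that matters when
interconnected bits always read back the same value.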
I think it's really important to define in advance what kinds of errors
you want to find and with what kind of certainty. For example, when the
value of the bit at a certain address is always the opposite of that of an
earlier bit in the address range, then it will not be found by the normal
march test and you will need to expand the march test from two marches to
four. (In the two added marches the bit is not inverted after checking.)

Normally you would do:
- for each cell: init to zero
- for each cell: check that it's zero, change it to one
- for each cell: check that it's one, change it to zero

With the added marches it would become:
- for each cell: init to zero
- for each cell: check that it's zero
- for each cell: check that it's zero, change it to one
- for each cell: check that it's one
- for each cell: check that it's one, change it to zero

Greetings,
Jaap

From: Jaap van Ganswijk
To: ws2wong at engmail.uwaterloo.ca
Subject: Re: Memory testing

At 15:51 20000425 Canada/Eastern, ws2wong at engmail.uwaterloo.ca wrote:
>Hi Jaap,
>
>I am an electrical engineering student at the University of Waterloo in Canada,
>and I have an assignment to research RAM testing algorithms. I read a series
>of your emails regarding memory testing at http://www.hitex.com/ and I was
>wondering if you would be willing to answer a few questions I have about RAM
>testing. Unfortunately, I have very little background in this area. I have
>been reading many articles that refer to various fault models and algorithms,
>but not many of these articles define all the terms that they use. If you have
>a bit of time, can you please tell me:
>
>1) What exactly is a March test?

It consists of three passes (going from address 0 to n):
- First you fill every (bit) location with 0 (or 1).
- Then you check if the 0 (or 1) is still there and you change it to the
  opposite.
- Then you check if the 1 (or 0) is still there and change it to the
  opposite.
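
The three passes above, and the five-step variant with two added
check-only marches from the earlier message to Komala, can be sketched
against a simulated 1-bit-wide memory. The memory classes and both fault
models are assumptions made for illustration: a cell stuck at a fixed
value, and a cell whose reads return the inverse of an earlier cell while
writes to it are lost.

```python
class BitMemory:
    """Hypothetical fault-free 1-bit-wide RAM model."""

    def __init__(self, size):
        self.size = size
        self.cells = [0] * size

    def write(self, address, bit):
        self.cells[address] = bit

    def read(self, address):
        return self.cells[address]


class StuckAtMemory(BitMemory):
    """Assumed fault: one cell always reads back `value`."""

    def __init__(self, size, address, value):
        super().__init__(size)
        self.stuck_address, self.stuck_value = address, value

    def read(self, address):
        if address == self.stuck_address:
            return self.stuck_value
        return super().read(address)


class InvertedAliasMemory(BitMemory):
    """Assumed fault: `victim` reads as the inverse of the earlier cell
    `source`, and writes to `victim` are lost."""

    def __init__(self, size, source, victim):
        super().__init__(size)
        self.source, self.victim = source, victim

    def write(self, address, bit):
        if address != self.victim:
            super().write(address, bit)

    def read(self, address):
        if address == self.victim:
            return 1 - self.cells[self.source]
        return super().read(address)


def march(memory, extra_read_passes=False):
    """Return True if no error is found. With extra_read_passes=True a
    check-only march is added before each check-and-invert march."""
    for address in range(memory.size):          # init to zero
        memory.write(address, 0)
    for expected in (0, 1):
        if extra_read_passes:                   # added march: check only
            for address in range(memory.size):
                if memory.read(address) != expected:
                    return False
        for address in range(memory.size):      # check, then invert
            if memory.read(address) != expected:
                return False
            memory.write(address, 1 - expected)
    return True
```

Under these assumed models, the plain march catches both stuck-at cases
but passes InvertedAliasMemory(8, 2, 5), whereas the same memory fails
once extra_read_passes=True is used.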
A march test like this will detect all stuck-at errors and any location
whose value changes depending on one of its direct neighbors (even on any
of its neighbors, as I have proven using mathematical induction).

>2) I have read about stuck-at faults, transition faults, coupling faults, and
>neighbourhood pattern sensitive faults. Other than these four fault types, are
>there other common faults that I should be concerned about?

What I did in this field was very theoretical, namely assuming a single
row of bit locations where only neighboring bits could influence each
other. But even all bits might be stuck to each other of course...

>3) What are static neighbourhood pattern sensitive faults (NPSF), passive NPSF,
>and active NPSF? What is the difference between these three types of NPSF?

Patterns suggest that the bits are located in a matrix and not in a single
line, which makes it more difficult. When you allow XOR-relations it
becomes even more difficult. When testing DRAM, the time element also gets
involved and the testing itself will even influence the refreshing of the
memory...

>4) Do you know of an algorithm that will provide high fault coverage with a
>reasonably short testing time?

I believe that march testing will get most of the errors when you know
what you're doing... But in the case of DRAM, you may have to wait a
minute and do another march test....

As usual, you can only test things quickly when you know something about
them, and then the test is no longer impartial... It becomes a question of
statistics...

From: Jaap van Ganswijk
To: Bruce Nepple
Subject: Re: SRAM Testing

Hi,

At 16:27 20000523 -0700, Bruce Nepple wrote:
>oops, hit the send key by mistake
>
>I read the following in an email by you regarding ram testing...
>
>> Normally you would do: - for each cell: init to zero - for each cell:
>> check that it's zero, change it to one - for each cell: check that
>> it's one, change it to zero
>
>I don't understand something.
>If writing a cell to 1 causes a previously
>checked cell to become 1, how does this find it (since it is already set to
>one)?

It's assumed that the cells are completely linked by mistake, so writing a
1 to the cell or cells (one of them may be absent) via the first address
will change the value of the cell(s) to 1, which will be detected further
on in the march.

>I also have questions about bist strategy. Perhaps you care to comment?
>
>I am implementing an "already architected" bist strategy for embedded sram on
>.18u CMOS. Rams are small, word oriented (34 deep by 140 wide, for example).
>Currently we plan to implement testing with an address generator per ram, 4
>bit lfsr data generators within the ram and a signature generator looking at
>data out. We will run repeated passes and then check the signature through a
>scan chain. Current plan is,
>1. write memory to all zeros after reset.
>2. 1st pass "dummy read, then Write" all memory with lfsr pattern
>3. "read a location(into signature) then write it with new lfsr data" through
>all memory 13 times
>4. read and verify signature
>
>(as an aside, nonexistent locations (34 through 63) return address 0's or
>address 1's data (depending on lsb) when read, and have no effect when written)
>
>My question is, how can I maximize the fault finding capability of such an
>approach. (best address sequence, best 4 bit lfsr patterns, number of passes).
>As always, the tradeoff is maximum return for minimum area.

Can you refresh my memory? What is a BIST strategy? And what is LFSR data?
I'm afraid it's hard to help much in detail.

I'd try a checkerboard pattern, unless the memory has some inversion
mechanism on every other bit.

Generally I think that one needs to know a lot about the architecture that
is to be tested to determine a good (quick but error-finding) testing
method. It's for example also very important to know what kind of errors
can occur.
For example, can two cells that are one apart be connected without the
intermediate cell being connected? Can there be capacitive connections? Is
there a time effect (like in DRAM)? Will the errors increase or diminish
over time?

>This IC is going into a consumer oriented data communications device.

Also consider whether it's useful to build a memory test into the product
or to only factory-test the device. It's also dangerous to do the testing
in a subroutine (which uses the stack to remember the return address), and
what do you do when an error is found? In that case you can't use the
stack to give the user any kind of (text) error message. The programmer
will have to work with macros to solve this matter.

But why bother? It only helps when there is an SRAM error and not when
there is a ROM error or any other wiring error that keeps the device from
giving a decent error message. I normally didn't build in a memory test at
boot-up time, but offered it to the service technician in the monitor
program.

>I also plan to go to the http://www.et.tudelft.nl/ site and search for
>information there.

Large parts of the site may be in Dutch...