diff -u -r -N linux-2.2.16.orig/CREDITS linux/CREDITS
--- linux-2.2.16.orig/CREDITS	Tue Jan 4 19:12:10 2000
+++ linux/CREDITS	Mon Mar 20 23:10:13 2000
@@ -1679,6 +1679,14 @@
 S: Malvern, Pennsylvania 19355
 S: USA
 
+N: Rick van Rein
+E: vanrein@cs.utwente.nl
+W: http://www.cs.utwente.nl/~vanrein
+D: Memory, the BadRAM subsystem dealing with statically challenged RAM modules.
+S: Binnenes 67
+S: 9407 CX Assen
+S: The Netherlands
+
 N: Stefan Reinauer
 E: stepan@linux.de
 W: http://www.freiburg.linux.de/~stepan/
diff -u -r -N linux-2.2.16.orig/Documentation/Configure.help linux/Documentation/Configure.help
--- linux-2.2.16.orig/Documentation/Configure.help	Tue Jan 4 19:12:10 2000
+++ linux/Documentation/Configure.help	Sat Apr 8 01:19:32 2000
@@ -12287,6 +12287,21 @@
   If you do not have any Quicknet telephony cards, you can safely
   ignore this option.
+
+Work around bad spots in RAM
+CONFIG_BADRAM
+  This small kernel extension makes it possible to use memory chips
+  which are not entirely correct. It works by never allocating the
+  places that are wrong. Those places are specified with the badram
+  boot option to LILO. Read /usr/src/linux/Documentation/badram.txt
+  and/or visit http://home.zonnet.nl/vanrein/badram for information.
+
+  This option co-operates well with a second boot option from LILO
+  that starts memtest86, which is able to automatically produce the
+  patterns for the commandline in case of memory trouble.
+
+  It is safe to say 'Y' here, and it is advised because there is no
+  performance impact.
 CPiA Video For Linux
 CONFIG_VIDEO_CPIA
@@ -12541,5 +12556,5 @@
 # LocalWords: BVME BVME WRITETHROUGH copyback writethrough fwmark syncookie tu
 # LocalWords: alphalinux GOBIOS csn chemnitz nat ACARD AMI MegaRAID megaraid
 # LocalWords: QNX4FS ISI isicom xterms Apollos VPN RCPCI rcpci sgi visws pcmcia
-# LocalWords: IrLPT UIRCC Tecra
+# LocalWords: IrLPT UIRCC Tecra BadRAM badram vanrein zonnet
diff -u -r -N linux-2.2.16.orig/Documentation/badram.txt linux/Documentation/badram.txt
--- linux-2.2.16.orig/Documentation/badram.txt	Thu Jan 1 01:00:00 1970
+++ linux/Documentation/badram.txt	Sat Apr 8 01:32:14 2000
@@ -0,0 +1,212 @@
+INFORMATION ON USING BAD RAM MODULES
+====================================
+
+Introduction
+	RAM is getting smaller and smaller, and as a result, also more and more
+	vulnerable. This makes the manufacturing of hardware more expensive,
+	since an excessive number of RAM chips must be discarded on account of
+	a single cell that is wrong. Similarly, static discharge may damage a
+	RAM module forever, which is usually remedied by replacing it
+	entirely.
+
+	This is not necessary, as the BadRAM code shows: By informing the Linux
+	kernel which addresses in a RAM are damaged, the kernel simply avoids
+	ever allocating such addresses but makes all the rest available.
+
+Reasons for this feature
+	There are many reasons why this kernel feature is useful:
+	- Chip manufacture is resource intensive; waste less and sleep better
+	- It's another chance to promote Linux as "the flexible OS"
+	- Some laptops have their RAM soldered in... and then it fails!
+	- It's plain cool ;-)
+
+Running example
+	To run this project, I was given two DIMMs, 32 MB each. One, which we
+	shall use as a running example in this text, contained 512 faulty bits,
+	spread over 1/4 of the address range in a regular pattern. Some tricks
+	with a RAM tester and a few binary calculations were sufficient to
+	write these faults down in 2 longword numbers.
+
+	The kernel recognised the correct number of pages with faults and did
+	not give them out for allocation. The allocation routines could
+	therefore proceed as normal, without any adaptation.
+	So, I gained 30 MB of DIMM which would otherwise have been thrown
+	away. After booting the kernel, the kernel behaved exactly as it
+	always had.
+
+Initial checks
+	If you experience RAM trouble, first read
+	/usr/src/linux/Documentation/memory.txt and try out the mem=4M trick
+	to see if at least some initial parts of your RAM work well. The
+	BadRAM routines halt the kernel in panic if the reserved area of
+	memory (containing kernel stuff) contains a faulty address.
+
+Running a RAM checker
+	The memory checker is not built into the kernel, to avoid delays at
+	runtime. If you experience problems that may be caused by RAM, run
+	a good RAM checker, such as
+		http://reality.sgi.com/cbrady_denver/memtest86
+	The output of a RAM checker provides addresses that went wrong. In
+	the 32 MB chip with 512 faulty bits mentioned above, the errors were
+	found in the 8MB-16MB range (the DIMM was in slot #0) at addresses
+		xxx42f4
+		xxx62f4
+		xxxc2f4
+		xxxe2f4
+	and the error was a "sticky 1 bit", a memory bit that stayed "1" no
+	matter what was written to it. The regularity of this pattern
+	suggests the death of a buffer at the output stages of a row on one of
+	the chips. I expect such regularity to be commonplace. Finding this
+	regularity currently is human effort, but it should not be hard to
+	alter a RAM checker to capture it in some sort of pattern, possibly
+	the BadRAM patterns described below.
+
+	By the way, if you manage to get hold of memtest86 version 2.3 or
+	beyond, you can configure the printing mode to produce BadRAM patterns,
+	which show exactly what you must enter on the LILO: commandline,
+	except that you should leave out the added spacing. That means that
+	you can skip the following step, which saves you a *lot* of work.
+
+Capturing errors in a pattern
+	Instead of manually providing all 512 errors to the kernel, it's nicer
+	to generate a pattern. Since the regularity is based on address-decoding
+	hardware, which generally takes certain bits into account and ignores
+	others, we shall provide a faulty address F, together with a bit mask M
+	that specifies which bits must be equal to F. In C code, an address A
+	is faulty if and only if
+		(F & M) == (A & M)
+	or alternatively (closer to a hardware implementation):
+		!((F ^ A) & M)
+	In the example 32 MB chip, we had the faulty addresses in 8MB-16MB:
+		xxx42f4		....0100....
+		xxx62f4		....0110....
+		xxxc2f4		....1100....
+		xxxe2f4		....1110....
+	The second column represents the alternating hex digit in binary form.
+	Apparently, the first and last-but-one binary digit can be anything,
+	so the binary mask for that part is 0101. The mask for the part after
+	this is 0xfff, and the part before should select anything in the range
+	8MB-16MB, or 0x00800000-0x01000000; this is done with a bitmask
+	0xff80xxxx. Combining these partial masks, we get:
+		F=0x008042f4	M=0xff805fff
+	That covers everything for this DIMM; for more complicated failing
+	DIMMs, or for a combination of multiple failing DIMMs, it can be
+	necessary to set up a number of such F/M pairs.
+
+Rebooting Linux
+	Now that these patterns are known (and double-checked, the calculations
+	are highly error-prone... it would be neat to test them in the RAM
+	checker...) we simply restart Linux with these F/M pairs as a parameter.
+	If you normally boot as follows:
+		LILO: linux
+	you should now boot with
+		LILO: linux badram=0x008042f4,0xff805fff
+	or perhaps by mentioning more F/M pairs in an order F0,M0,F1,M1,...
+	When you provide an odd number of arguments to badram, the default mask
+	0xffffffff (only one address matched) is applied to the last pattern.
+
+	Beware of the commandline length. At least up to LILO version 0.21,
+	the commandline is cut off after the 78th character; later versions
+	may go as far as the kernel goes, namely 255 characters. In no way is
+	it possible to enter more than 10 numbers to the badram boot option.
+
+	When the kernel now boots, it should not give any trouble with RAM.
+	Mind you, this is under the assumption that the kernel and its data
+	storage do not overlap an erroneous part. If this happens, and the
+	kernel does not choke on it right away, it will stop with a panic.
+	You will need to provide a RAM where the initial part, say 2MB, is
+	faultless.
+
+	Now look up your memory status with
+		dmesg | grep ^Memory:
+	which prints a line much like
+		Memory: 158524k/163840k available
+			(940k kernel code,
+			412k reserved,
+			1856k data,
+			60k init,
+			2048k badram)
+	The last entry, the badram, is 2048k to represent the loss of 2MB
+	of general purpose RAM due to the errors. Or, positively rephrased,
+	instead of throwing out 32MB as useless, you only throw out 2MB.
+
+	If the system is stable (try compiling a few kernels, and do a few
+	finds in / or so) you may add the boot parameter to /etc/lilo.conf
+	for _all_ the kernels that handle this trouble, using a line
+		append="badram=0x008042f4,0xff805fff"
+	after which you run "lilo".
+	Warning: Don't experiment with these settings on your only boot image.
+	If the BadRAM overlays kernel code, data, init, or other reserved
+	memory, the kernel will halt in panic. Try settings on a test boot
+	image first, and if you get a panic you should change the order of
+	your DIMMs [which may involve buying a new one just to be able to
+	change the order].
+
+BadRAM classification
+	This technique may start a lively market for "dead" RAM. It is important
+	to realise that some RAMs are more dead than others. So, instead of
+	just providing a RAM size, it is also important to know the BadRAM
+	class, which is defined as follows:
+
+		A BadRAM class N means that at most 2^N bytes have a problem,
+		and that all problems with the RAMs are persistent: They
+		are predictable and always show up.
+
+	The DIMM that serves as an example here was of class 9, since 512=2^9
+	errors were found. Higher classes are worse, "correct" RAM is of class
+	-1 (or even less, at your choice).
+	Class N also means that the bitmask for your chip (if there's just one,
+	that is) counts N bits "0" and it means that (if no faults fall in the
+	same page) an amount of 2^N*PAGESIZE memory is lost, in the example on
+	an i386 architecture that would be 2^9*4k=2MB, which accounts for the
+	initial claim of 30MB RAM gained with this DIMM.
+
+Known Bugs
+	LILO is known to cut off commandlines which are too long. For the
+	lilo-0.21 distribution, a commandline may not exceed 78 characters,
+	while actually, 255 would be possible [on i386, kernel 2.2.16].
+	LILO does _not_ report too-long commandlines, but the error will
+	show up as either a panic at boot time, stating
+		panic: BadRAM page in initial area
+	or the dmesg line starting with Memory: will mention an unpredicted
+	number of kilobytes. (Note that the latter number only includes
+	errors in accessed memory.)
+
+Future Possibilities
+	It would be possible to use even more of the faulty RAMs by employing
+	them for slabs. The smaller allocation granularity of slabs makes it
+	possible to throw out just, say, 32 bytes surrounding an error. This
+	would mean that the example DIMM only loses 16kB instead of 2MB.
+	It might even be possible to allocate the slabs in such a way that,
+	where possible, the remaining bytes in a slab structure are allocated
+	around the error, reducing the RAM loss to 0 in the optimal situation!
+
+	However, this yield is somewhat faked: It is possible to provide 512
+	pages of 32-byte slabs, but it is not certain that anyone would use
+	that many 32-byte slabs at any time.
+
+	A better solution might be to alter the page allocation for a slab to
+	have a preference for BadRAM pages, and give those special treatment.
+	This way, the BadRAM would be spread over all the slabs, which seems
+	more likely to be a `true' pay-off. This would yield more overhead at
+	slab allocation time, but on the other hand, by the nature of slabs,
+	such allocations are made as rare as possible, so it might not matter
+	that much. I am uncertain where to go.
+
+Names and Places
+	The home page of this project is on
+		http://home.zonnet.nl/vanrein/badram
+	In case you have experiences with the BadRAM software which differ from
+	the test reports on that site, I hope you will mail me with that
+	new information.
+
+	The BadRAM project is an idea and implementation by
+		Rick van Rein
+		Binnenes 67
+		9407 CX Assen
+		The Netherlands
+		vanrein@cs.utwente.nl
+	If you like it, a postcard would be much appreciated ;-)
+
+
+	Enjoy,
+	 -Rick.
diff -u -r -N linux-2.2.16.orig/Documentation/kernel-parameters.txt linux/Documentation/kernel-parameters.txt
--- linux-2.2.16.orig/Documentation/kernel-parameters.txt	Tue Jan 4 19:12:10 2000
+++ linux/Documentation/kernel-parameters.txt	Sat Apr 8 00:48:00 2000
@@ -13,6 +13,7 @@
 	APIC	APIC support is enabled.
 	APM	Advanced Power Management support is enabled.
 	AX25	Appropriate AX.25 support is enabled.
+	BADRAM	Support for faulty RAM chips is enabled.
 	CD	Appropriate CD support is enabled.
 	EIDE	EIDE/ATAPI support is enabled.
 	FB	The frame buffer device is enabled.
@@ -75,6 +76,8 @@
 	atascsi=	[HW,SCSI] Atari SCSI.
 
 	aztcd=		[HW,CD] Aztec CD driver.
+
+	badram=		[BADRAM] Avoid allocating faulty RAM addresses.
 
 	baycom_par=	[HW,AX25] BayCom Parallel Port AX.25 Modem.
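A note on the badram= pattern semantics that badram.txt documents above.
What follows is a minimal user-space sketch, not part of the patch, of the
F/M test and of the page count it implies. The names badram_matches,
badram_count_pages, DEMO_PAGE_SHIFT and BADRAM_DEMO_MAIN are made up for
illustration, and 4 kB pages (as on i386) are assumed. It scans every page
by brute force, which is simpler than, and not the same as, the mask
stepping that the patch does in mm/page_alloc.c.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: 4 kB pages, as on i386. Not taken from the kernel headers. */
#define DEMO_PAGE_SHIFT 12u

/* Address a is faulty iff it agrees with f on every bit selected by m. */
int badram_matches(uint32_t a, uint32_t f, uint32_t m)
{
	return ((f ^ a) & m) == 0;
}

/*
 * Count the pages below mem_bytes that hold at least one faulty address.
 * The offset within a page can always be chosen to match f, so only the
 * mask bits above the page offset decide whether a page is hit.
 */
unsigned long badram_count_pages(uint32_t f, uint32_t m, uint32_t mem_bytes)
{
	uint32_t page_bits = ~((UINT32_C(1) << DEMO_PAGE_SHIFT) - 1u);
	uint32_t npages = mem_bytes >> DEMO_PAGE_SHIFT;
	unsigned long bad = 0;
	uint32_t page;

	for (page = 0; page < npages; page++) {
		uint32_t addr = page << DEMO_PAGE_SHIFT;

		if (((f ^ addr) & m & page_bits) == 0)
			bad++;
	}
	return bad;
}

#ifdef BADRAM_DEMO_MAIN	/* hypothetical switch: build with -DBADRAM_DEMO_MAIN */
int main(void)
{
	/* The F/M pair from the running example in badram.txt. */
	const uint32_t f = 0x008042f4u, m = 0xff805fffu;
	unsigned long bad = badram_count_pages(f, m, UINT32_C(32) << 20);

	assert(badram_matches(0x0080e2f4u, f, m));	/* a listed faulty address */
	assert(!badram_matches(0x008052f4u, f, m));	/* bit 12 differs from F */
	printf("%lu bad pages, %luk lost\n", bad, bad << (DEMO_PAGE_SHIFT - 10));
	return 0;
}
#endif
```

With the example pair and 32 MB of RAM this reports 512 bad pages, that is
2048k, which agrees with the badram figure in the Memory: line quoted in
badram.txt and with the 2^9*4k=2MB calculation for a class 9 DIMM.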
diff -u -r -N linux-2.2.16.orig/Documentation/memory.txt linux/Documentation/memory.txt
--- linux-2.2.16.orig/Documentation/memory.txt	Wed Oct 27 02:53:39 1999
+++ linux/Documentation/memory.txt	Mon Mar 20 23:33:00 2000
@@ -18,6 +18,13 @@
 	   as you add more memory. Consider exchanging your
 	   motherboard.
 
+	4) A static discharge or production fault causes a RAM module
+	   to have (predictable) errors, usually meaning that certain
+	   bits cannot be set or reset. Instead of throwing away your
+	   RAM module, you may read /usr/src/linux/Documentation/badram.txt
+	   to learn how to detect, locate and circumvent such errors
+	   in your RAM module.
+
 All of these problems can be addressed with the "mem=XXXM" boot option
 (where XXX is the size of RAM to use in megabytes).
 It can also tell Linux to use less memory than is actually installed.
@@ -45,6 +52,8 @@
 
 	* Try passing the "mem=4M" option to the kernel to limit
 	  Linux to using a very small amount of memory.
+	  If this helps, read /usr/src/linux/Documentation/badram.txt
+	  to learn how to find and circumvent memory errors.
 Other tricks:
 
diff -u -r -N linux-2.2.16.orig/arch/i386/config.in linux/arch/i386/config.in
--- linux-2.2.16.orig/arch/i386/config.in	Tue Jan 4 19:12:11 2000
+++ linux/arch/i386/config.in	Sat Apr 8 01:18:05 2000
@@ -115,6 +115,8 @@
   bool '    Use real mode APM BIOS call to power off' CONFIG_APM_REAL_MODE_POWER_OFF
 fi
 
+bool 'Work around bad spots in RAM' CONFIG_BADRAM
+
 endmenu
 
 source drivers/pnp/Config.in
diff -u -r -N linux-2.2.16.orig/arch/i386/defconfig linux/arch/i386/defconfig
--- linux-2.2.16.orig/arch/i386/defconfig	Thu Aug 26 02:29:46 1999
+++ linux/arch/i386/defconfig	Sat Apr 8 01:04:38 2000
@@ -58,6 +58,7 @@
 CONFIG_BINFMT_MISC=y
 # CONFIG_PARPORT is not set
 # CONFIG_APM is not set
+CONFIG_BADRAM=y
 
 #
 # Plug and Play support
diff -u -r -N linux-2.2.16.orig/arch/i386/mm/init.c linux/arch/i386/mm/init.c
--- linux-2.2.16.orig/arch/i386/mm/init.c	Wed Oct 27 02:53:39 1999
+++ linux/arch/i386/mm/init.c	Sat Apr 8 01:25:35 2000
@@ -393,6 +393,7 @@
 	int reservedpages = 0;
 	int datapages = 0;
 	int initpages = 0;
+	int badpages = 0;
 	unsigned long tmp;
 
 	end_mem &= PAGE_MASK;
@@ -426,10 +427,20 @@
 		clear_bit(PG_reserved, &mem_map[MAP_NR(start_mem)].flags);
 		start_mem += PAGE_SIZE;
 	}
+
+#ifdef CONFIG_BADRAM
+	badram_markpages ();
+#endif
+
 	for (tmp = PAGE_OFFSET ; tmp < end_mem ; tmp += PAGE_SIZE) {
 		if (tmp >= MAX_DMA_ADDRESS)
 			clear_bit(PG_DMA, &mem_map[MAP_NR(tmp)].flags);
 		if (PageReserved(mem_map+MAP_NR(tmp))) {
+#ifdef CONFIG_BADRAM
+			if (PageBad(mem_map+MAP_NR(tmp))) {
+				panic ("BadRAM page in initial area");
+			}
+#endif
 			if (tmp >= (unsigned long) &_text && tmp < (unsigned long) &_edata) {
 				if (tmp < (unsigned long) &_etext)
 					codepages++;
@@ -445,6 +456,12 @@
 				reservedpages++;
 			continue;
 		}
+#ifdef CONFIG_BADRAM
+		if (PageBad(mem_map+MAP_NR(tmp))) {
+			badpages++;
+			continue;
+		}
+#endif
 		atomic_set(&mem_map[MAP_NR(tmp)].count, 1);
 #ifdef CONFIG_BLK_DEV_INITRD
 		if (!initrd_start || (tmp < initrd_start || tmp >=
@@ -452,13 +469,14 @@
 #endif
 			free_page(tmp);
 	}
-	printk("Memory: %luk/%luk available (%dk kernel code, %dk reserved, %dk data, %dk init)\n",
+	printk("Memory: %luk/%luk available (%dk kernel code, %dk reserved, %dk data, %dk init, %dk badram)\n",
 		(unsigned long) nr_free_pages << (PAGE_SHIFT-10),
 		max_mapnr << (PAGE_SHIFT-10),
 		codepages << (PAGE_SHIFT-10),
 		reservedpages << (PAGE_SHIFT-10),
 		datapages << (PAGE_SHIFT-10),
-		initpages << (PAGE_SHIFT-10));
+		initpages << (PAGE_SHIFT-10),
+		badpages << (PAGE_SHIFT-10));
 
 	if (boot_cpu_data.wp_works_ok < 0)
 		test_wp_bit();
diff -u -r -N linux-2.2.16.orig/include/linux/mm.h linux/include/linux/mm.h
--- linux-2.2.16.orig/include/linux/mm.h	Tue Jan 4 19:12:25 2000
+++ linux/include/linux/mm.h	Fri Apr 7 23:27:21 2000
@@ -13,6 +13,12 @@
 extern void * high_memory;
 extern int page_cluster;
 
+// The maximum number of values (address, mask pairs) covering bad RAM.
+// This value is hardcoded (!) in init/main.c, get_options()
+#define BADRAM_MAXPARAMS 10
+extern void badram_setup (char *str, int *ints);
+void badram_markpages (void);
+
 #include
 #include
 
@@ -144,6 +150,7 @@
 #define PG_Slab			 9
 #define PG_swap_cache		10
 #define PG_skip			11
+#define PG_badram		13
 #define PG_reserved		31
 
 /* Make it prettier to test the above... */
@@ -159,6 +166,7 @@
 #define PageSlab(page)		(test_bit(PG_Slab, &(page)->flags))
 #define PageSwapCache(page)	(test_bit(PG_swap_cache, &(page)->flags))
 #define PageReserved(page)	(test_bit(PG_reserved, &(page)->flags))
+#define PageBad(page)		(test_bit(PG_badram, &(page)->flags))
 
 #define PageSetSlab(page)	(set_bit(PG_Slab, &(page)->flags))
 #define PageSetSwapCache(page)	(set_bit(PG_swap_cache, &(page)->flags))
diff -u -r -N linux-2.2.16.orig/init/main.c linux/init/main.c
--- linux-2.2.16.orig/init/main.c	Tue Jan 4 19:12:25 2000
+++ linux/init/main.c	Sat Apr 8 01:26:49 2000
@@ -22,6 +22,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -1050,6 +1051,9 @@
 #endif
 #ifdef CONFIG_BLK_CPQ_DA
 	{ "smart2=", cpqarray_setup },
+#endif
+#ifdef CONFIG_BADRAM
+	{ "badram=", badram_setup },
 #endif
 	{ 0, 0 }
 };
diff -u -r -N linux-2.2.16.orig/mm/page_alloc.c linux/mm/page_alloc.c
--- linux-2.2.16.orig/mm/page_alloc.c	Tue Jan 4 19:12:26 2000
+++ linux/mm/page_alloc.c	Sat Apr 8 01:27:52 2000
@@ -346,6 +346,72 @@
 	return start_mem;
 }
 
+
+#ifdef CONFIG_BADRAM
+
+
+/* Given a pointed-at address and a mask, increment the page so that the
+ * mask hides the increment. Return 0 if no increment is possible.
+ */
+static int next_masked_address (ulong *addrp, ulong mask)
+{
+	ulong inc=1;
+	ulong newval = *addrp;
+	while (inc & mask)
+		inc += inc;
+	while (inc != 0) {
+		newval += inc;
+		newval &= ~mask;
+		newval |= ((*addrp) & mask);
+		if (newval > *addrp) {
+			*addrp = newval;
+			return 1;
+		}
+		do {
+			inc += inc;
+		} while (inc & ~mask);
+		while (inc & mask)
+			inc += inc;
+	}
+	return 0;
+}
+
+
+static ulong badram_params [BADRAM_MAXPARAMS+1];
+
+void badram_setup (char *str, int *argv)
+{
+	int i;
+	if (argv[0] > BADRAM_MAXPARAMS)
+		panic ("Too bad: badram=... length exceeds BADRAM_MAXPARAMS. "
+			"Try to combine patterns");
+	for (i=0; i<=argv[0]; i++) {
+		badram_params [i]=argv [i];
+	}
+}
+
+void badram_markpages (void) {
+	ulong *argv=badram_params;
+	int argc=*argv++;
+	ulong addr, mask;
+	while (argc-- > 0) {
+		addr = *argv++;
+		mask = (argc-- > 0) ? *argv++ : ~0L;
+		mask |= ~PAGE_MASK;	// Optimisation
+		addr &= mask;		// Normalisation
+		do {
+			if ( ( addr >> PAGE_SHIFT ) >= max_mapnr)
+				break;
+			set_bit (PG_badram,
+				&mem_map [ addr >> PAGE_SHIFT ].flags);
+		} while (next_masked_address (&addr,mask));
+	}
+}
+
+
+#endif /* CONFIG_BADRAM */
+
+
 /*
  * Primitive swap readahead code. We simply read an aligned block of
  * (1 << page_cluster) entries in the swap area. This method is chosen