memdiag(4) — FILE FORMATS
NAME
memdiag − control boot-time memory diagnostics
DESCRIPTION
The file /etc/master.d/memdiag contains information about memory diagnostics that are run when the system is booted. It defines the MemDiagTable table, which controls which diagnostics will be run. This table is defined as follows:
| #Function | Name | Control |
| &f_ram_quik, | "ram_quik", | DM_QUICK, |
| &f_ram_adr, | "ram_adr", | DM_QUICK, |
| &f_ram_alts, | "ram_alts", | DM_COMPR, |
| &f_ram_btog, | "ram_btog", | DM_COMPR, |
| &f_ram_march, | "ram_march", | DM_QUICK, |
| &f_ram_pats, | "ram_pats", | DM_COMPR, |
| &f_ram_perm, | "ram_perm", | DM_QUICK, |
| &f_ram_ref, | "ram_ref", | DM_COMPR, |
| &f_ram_rndm, | "ram_rndm", | DM_QUICK, |
The three columns in this table are as follows:
Function The address of the function that runs this particular diagnostic.
Name A name for the diagnostic, used in logging messages.
Control If this column contains DM_QUICK, then this diagnostic will be run in either “quick” or “comprehensive” memory diagnostic mode. If it contains DM_COMPR, then this diagnostic will be run only in “comprehensive” mode. If it contains DM_NEVER, then the diagnostic will never be run.
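The selection rule for the Control column can be sketched in C as follows. This is only an illustration: the numeric values of DM_NEVER, DM_QUICK and DM_COMPR, the diag_mode type, and the function name diag_should_run are assumptions made for the example, not the definitions used by the kernel.

```c
#include <stdbool.h>

/* Hypothetical values; the real DM_* definitions come from the system. */
enum diag_control { DM_NEVER, DM_QUICK, DM_COMPR };

/* Hypothetical representation of the memory diagnostic mode. */
enum diag_mode { MODE_QUICK, MODE_COMPR };

/* Returns true if a diagnostic with this Control value runs in `mode`. */
bool diag_should_run(enum diag_control control, enum diag_mode mode)
{
    switch (control) {
    case DM_QUICK:
        return true;                /* quick and comprehensive modes */
    case DM_COMPR:
        return mode == MODE_COMPR;  /* comprehensive mode only */
    case DM_NEVER:
    default:
        return false;               /* never run */
    }
}
```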
The following table describes the tests that can be run, and gives the approximate time, in seconds per megabyte, for each. The times are approximate values for an MVME197 CPU, uncached; other configurations will vary, but will be approximately proportional.
| Test | Time (s/MB) | Description |
| ram_quik | 0.5 | Quick write/read test. This should always be run, and should always be first. |
| ram_adr | 0.6 | Fill memory with its address. |
| ram_alts | 0.3 | Fill memory with alternating ones and zeroes. |
| ram_btog | 1.0 | Fill memory with random bit patterns and their complements. |
| ram_march | 0.6 | Fill memory with a pattern and its inverse. |
| ram_pats | 5.2 | Fill memory with selected patterns. |
| ram_perm | 0.4 | Test byte/short/long permutations. |
| ram_ref | 0+ | RAM refresh test. (See RAM_REFRESH below.) |
| ram_rndm | 0.5 | Random data test. |
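As a rough guide, the per-megabyte times can be summed to estimate how long a full pass takes. The sketch below uses the times from the table and the Control values from the default MemDiagTable shown earlier. It works in tenths of a second to avoid floating-point rounding. The structure and function names are hypothetical, ram_ref is counted as zero because its cost depends on RAM_REFRESH and MAXBLOCK, and the result applies only to the uncached MVME197 figures quoted above.

```c
#include <stddef.h>

/* One row of the timing table; `quick` is 1 for DM_QUICK entries. */
struct diag_cost {
    const char *name;
    int tenths_per_mb;   /* approximate time in tenths of a second per MB */
    int quick;
};

static const struct diag_cost diag_costs[] = {
    { "ram_quik",   5, 1 },
    { "ram_adr",    6, 1 },
    { "ram_alts",   3, 0 },
    { "ram_btog",  10, 0 },
    { "ram_march",  6, 1 },
    { "ram_pats",  52, 0 },
    { "ram_perm",   4, 1 },
    { "ram_ref",    0, 0 },  /* sleep time not included */
    { "ram_rndm",   5, 1 },
};

/* Estimated time, in tenths of a second, to test `mbytes` megabytes. */
long memdiag_estimate_tenths(long mbytes, int comprehensive)
{
    long per_mb = 0;
    size_t i;

    for (i = 0; i < sizeof diag_costs / sizeof diag_costs[0]; i++) {
        if (comprehensive || diag_costs[i].quick)
            per_mb += diag_costs[i].tenths_per_mb;
    }
    return per_mb * mbytes;
}
```

With these figures, quick mode costs about 2.6 seconds per megabyte and comprehensive mode about 9.1 seconds per megabyte, excluding the ram_ref sleep time.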
In addition to the MemDiagTable, the following tunable parameters are available:
MAXBLOCK
The maximum size of a block of memory, expressed in pages, that will be tested in a single pass. Smaller values consume somewhat fewer system resources while diagnostics run, and reduce the time taken before the first memory is freed. Larger values give slightly better diagnostic coverage and reduce the overall overhead for running diagnostics. Larger values also significantly reduce the total time taken by the ram_ref test.
RAM_REFRESH
This is the time, in hundredths of a second, for the ram_ref test to wait for a memory refresh cycle. ram_ref sleeps this long for each block of memory being tested, regardless of the size of the block. Therefore, the MAXBLOCK and RAM_REFRESH parameters together have a large effect on how long ram_ref takes to run.
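The interaction between MAXBLOCK and RAM_REFRESH can be estimated with a short calculation. The sketch below assumes that memory is split into blocks of at most MAXBLOCK pages, with a final partial block counted as a full block, and it counts only the sleep time. The function name is hypothetical.

```c
/*
 * Total time ram_ref spends sleeping, in hundredths of a second.
 * Assumes ceil(pages / maxblock) blocks, each costing one sleep of
 * ram_refresh hundredths. Returns -1 if maxblock is not positive.
 */
long ram_ref_sleep_hundredths(long pages, long maxblock, long ram_refresh)
{
    long blocks;

    if (maxblock <= 0)
        return -1;
    blocks = (pages + maxblock - 1) / maxblock;  /* round up */
    return blocks * ram_refresh;
}
```

For example, with 16384 pages to test and RAM_REFRESH set to 100, a MAXBLOCK of 256 gives 64 blocks and about 64 seconds of sleep, while a MAXBLOCK of 1024 gives 16 blocks and about 16 seconds.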
NICE This is a “nice” value for the diagnostic daemon, equivalent to the value given to the nice command. Larger values (up to 20) cause the daemon to run at lower priority.
INITMIN After kernel initial memory configuration, but before driver initialization and before starting any processes, the system must have at least this many pages available. If the BUG memory diagnostics have not verified enough memory to ensure this, a warning will be printed and the memory will be used without passing diagnostics. The default value should be large enough for most systems. On small configurations, it could be reduced, permitting the BUG to be reconfigured to verify less memory. On large memory configurations with many devices (or with custom device drivers that allocate lots of memory), this value may need to be increased to prevent a hang when booting.
MAXREPORT
This limits the number of errors that will be logged in any given memory block. If more than this many errors occur, the additional errors will not be individually logged. The total failures in each block will still be logged, and the memory will still be taken out of use.
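The effect of MAXREPORT on logging can be pictured with the following sketch. The block_report structure and the report_failure function are hypothetical names used only for illustration. The point is that every failure is counted, but only the first MAXREPORT failures in a block are logged individually.

```c
/* Per-block bookkeeping; hypothetical structure for illustration. */
struct block_report {
    int maxreport;  /* value of the MAXREPORT tunable */
    int failures;   /* total failures seen in this block */
    int logged;     /* failures logged individually so far */
};

/*
 * Record one failure. Returns 1 if this failure should be logged
 * individually, 0 if it should only be counted in the block total.
 */
int report_failure(struct block_report *r)
{
    r->failures++;
    if (r->logged < r->maxreport) {
        r->logged++;
        return 1;
    }
    return 0;
}
```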
TESTCACHED
If set, memory to be tested will be mapped with data cache enabled.
TESTUNCACHED
If set, memory to be tested will be mapped with data cache disabled. If both TESTCACHED and TESTUNCACHED are set, then tests will be run both with and without cache. If neither is set, tests will be run uncached.
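The combined effect of TESTCACHED and TESTUNCACHED can be summarized as a small decision function. The PASS_* flags and the function name below are assumptions made for the example.

```c
#include <stdbool.h>

/* Hypothetical flags describing which passes will be run. */
#define PASS_CACHED   0x1u
#define PASS_UNCACHED 0x2u

/* Returns the set of passes implied by the two tunables. */
unsigned memdiag_cache_passes(bool testcached, bool testuncached)
{
    unsigned passes = 0;

    if (testcached)
        passes |= PASS_CACHED;
    if (testuncached)
        passes |= PASS_UNCACHED;
    if (passes == 0)
        passes = PASS_UNCACHED;  /* neither set: run uncached */
    return passes;
}
```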
MESSAGES
The following is a list of messages that the MEMDIAG module can print while it runs, with a brief explanation of each. Each message is shown with the logging level at which it prints. For example, a message marked [DV_OFF] is always printed, and a message marked [DV_LEVEL1] is printed only if the logging level is at or above DV_LEVEL1.
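The level check described above amounts to a simple comparison. In the sketch below, the numeric ordering of the DV_* constants and the function name are assumptions. Only the rule itself (DV_OFF always prints, other messages print at or above their level) comes from the description.

```c
#include <stdbool.h>

/* Hypothetical ordering; the real DV_* values are defined by the system. */
enum dv_level { DV_OFF = 0, DV_LEVEL1, DV_LEVEL2, DV_LEVEL3 };

/* Returns true if a message tagged `msg_level` prints at level `current`. */
bool memdiag_should_print(enum dv_level msg_level, enum dv_level current)
{
    if (msg_level == DV_OFF)
        return true;              /* always printed */
    return current >= msg_level;  /* printed at or above its level */
}
```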
WARNING: memdiag: Memory below xxx used w/o diagnostic [DV_OFF]
This is printed if the BUG has not tested the low range of memory. Anything below what was tested by BUG will be used without further diagnostics.
WARNING: memdiag: Memory xxx-xxx used w/o diagnostic [DV_OFF]
BUG did not test enough memory for the kernel to boot successfully without using untested memory. The memory in the indicated range will be used without passing tests.
NOTICE: Diagnostics needed for memory xxx-xxx [DV_LEVEL1]
Memory in the indicated range was not tested by BUG before boot, so the kernel must test it before it can be used. This testing will be done in the background while the system runs.
WARNING: memdiag: Can’t test dd pages at once. [DV_OFF]
Increase KVSIZE or decrease MAXBLOCK
The kernel couldn’t allocate virtual space to map in the memory to be tested. Either decrease MAXBLOCK or increase KVSIZE.
memdiag: Starting memory diags, dd pages to check [DV_LEVEL1]
Memory diagnostics have started running.
memdiag: Done memory diags. dd bad pages found [DV_LEVEL1]
Memory diagnostics have been run on all memory. This is reported after the initial scan or after a rescan. This reports how many pages failed diagnostics.
memdiag: Starting memory rescan, dd pages to check [DV_LEVEL1]
This reports that the ramtest daemon has started a rescan of memory. A rescan can happen if some diagnostics in a previous pass were incomplete, or if memory was marked for recheck via the memregion(7) interface.
memdiag: xx-xx: starting memory diags. [DV_LEVEL2]
Diagnostics on the indicated range of physical memory have started.
memdiag: xx-xx: dd bad page(s) found [DV_LEVEL2]
The given number of pages in this range of physical memory failed diagnostics in this pass. That memory will be taken out of circulation, and won’t be used by the system. The remaining memory in this range passed diagnostics, and will be released for use by the system.
memdiag: xx-xx: partial coverage, will retest. [DV_LEVEL2]
Some diagnostic could not run to completion on this range of memory. The memory will remain marked as “untested,” and will be tested later in a subsequent pass.
memdiag: xx-xx: passed memory diags [DV_LEVEL2]
The indicated physical memory has passed diagnostics, and will be released for use by the system.
memdiag: <test> [un]cached completed [DV_LEVEL3]
This announces that the named diagnostic completed successfully.
memdiag: <test> [un]cached incomplete [DV_LEVEL3]
This announces that the named diagnostic could not complete. This is probably due to a parity error or other fault that occurred while testing.
memdiag: <test> not runnable [DV_LEVEL3]
The named diagnostic does not run on this processor. The test is not run.
WARNING: memdiag: <test> failure at <addr>: <msg> [DV_LEVEL1]
The named diagnostic detected a memory error at the given physical address. <msg> is specific to the particular diagnostic.