10. Errors


10.1 When rebooting it says, "WARNING: cannot set battery backed clock...". Should I be concerned?

Not really, but it means that you are using a rather old kernel. Newer kernels print the message discussed in 10.2 instead.

10.2 When rebooting it says, "NetBSD/mac68k does not trust itself to update the RTC on shutdown." Should I be concerned?

No. This is simply an indication that NetBSD/mac68k would rather not set your Real Time Clock since NetBSD loses more time the longer it runs (this would lead to your clock being off in MacOS as well). The code to update the RTC on most machines does exist, though, and you can turn it on in the kernel if you truly wish to do so.

10.3 When I boot, I get the following error message:

init: not found
panic: no init
Pausing forever.
What's wrong?

/sbin/init was not installed. Use the Installer to "ls /sbin/init" and make sure it is there. Make sure that you have installed a base distribution set, such as the one from the NetBSD 1.3 release or a -current snapshot.

10.4 When I boot, I get the following unusual lines:

  audio at mainbus0 not configured
  floppy at mainbus0 not configured
What's wrong?

Audio support has been available since before the 1.2 release. You should now see a line like:

asc0 at obio0: Apple Sound Chip
Upgrade to a newer release or try running -current if you want minimal sound support for ASC-based machines. EASC-based machines currently don't work quite right, but support is in the works.

However, floppy support has not been implemented yet. It is in the works, though.

10.5 My keyboard locks up after I type a few commands. What's wrong?

If you were at the bottom of the screen when this occurred, then this is the infamous "scrolling while ADB input is being received" problem. There are a few ways to avoid it:

  1. always type "clear" before the screen has a chance to scroll
  2. edit /etc/ttys and use an external terminal on one of your serial ports
  3. use "dt", Lawrence Kesteloot's desktop program (this is a good solution)
  4. since this problem has been fixed on all machines that I know of since the 1.1 release, the best solution is to upgrade to a much newer kernel.

10.6 I just switched the SCSI IDs of my drives, and now my machine freezes after:

changing root device to sd1a.
Swapping 409 and 401.
Swapdev = 409, dumpdev=ffffffff.
What's wrong?

You need to edit the /etc/fstab file to mount the new root partition. You can also simply use the Installer utility to delete the current /etc/fstab and then do a "Build Devices" to create a new one with the correct SCSI IDs for your drives.
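As a rough sketch of the manual edit, assume the disk that used to be sd0 now appears as sd1 (the boot messages above suggest the root moved to sd1a). The fstab entries below are made-up samples, and the commands work on a scratch copy rather than the real /etc/fstab:

```shell
# Scratch copy standing in for /etc/fstab; the entries are made-up samples.
fstab=$(mktemp)
cat > "$fstab" <<'EOF'
/dev/sd0a / ffs rw 1 1
/dev/sd0b none swap sw 0 0
/dev/sd0g /usr ffs rw 1 2
EOF

# Assumption: the disk that used to be sd0 now shows up as sd1.
sed 's|^/dev/sd0|/dev/sd1|' "$fstab" > "$fstab.new"

# Review the result before copying it over the real /etc/fstab.
cat "$fstab.new"
```

If more than one drive changed IDs, it is probably safer to edit each line by hand than to rely on a blanket substitution like this one.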

10.7 The NetBSD 1.1 distribution kernel will not boot on my IIci. What's wrong?

Your symptoms probably look like:

If I boot with extra-debugger info on, using multi-user, I get:
Changing root device to sd1a.
PRAM: 0x30d482dc, macos_boottime: 0x30d482d1.
vm_fault(10c000, 8bb3000, 1, 0) -> 1vm_fault(10c000, 8bb3000, 1, 0) -> 
1vm_fault(10c000, 8bb3000, 1, 0) ->
then it hangs.

Unfortunately, the 1.1 distribution kernel has quite a few bugs in it and does not run very well on many machines, especially the IIci. Try the 1.3 distribution or get a newer version of the kernel from NetBSD.ORG, puma, or ftp.eskimo.com.

10.8 When I try netstat -r I get: netstat: kvm_read kvm_read: Bad address. What's wrong?

This is nothing to worry about. Basically, the file /netbsd is not your current kernel. A number of programs (such as ps, who, and systat), as well as libkvm, read /netbsd to learn what's going on in the kernel. So you can simply rename your current kernel to /netbsd to make this kind of error go away. Make sure that you are not overwriting a working kernel when you do this, though, unless that is exactly what you intend to do.

10.9 When I try netstat -r I get an endless stream of question marks. What's wrong?

That's a mismatch between libkvm/netstat and /netbsd, or else you're running a kernel that's not named /netbsd. Other likely symptoms of this problem are that who, ps, ifconfig, and systat will not work either. If you update your binaries and your kernel at the same time, then you should be OK.

10.10 I just upgraded to a new kernel and now w, ps, and netstat, among others, don't work. What happened?

One of two things. Either your currently booted kernel isn't named /netbsd or else you have a mismatch between your kernel and the binaries you are using. In the first case, simply making a link from your currently booted kernel to /netbsd will solve the problem.
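For the first case, here is a minimal sketch of keeping the old kernel around while linking the booted one to /netbsd. The names netbsd.new and netbsd.old are hypothetical, and a scratch directory stands in for the root of the filesystem so nothing real is touched:

```shell
# Scratch directory standing in for / ; the kernel file names are hypothetical.
root=$(mktemp -d)
echo "old kernel" > "$root/netbsd"
echo "new kernel" > "$root/netbsd.new"   # the kernel that was actually booted

# Keep the previous kernel instead of overwriting it.
mv "$root/netbsd" "$root/netbsd.old"

# Hard link the booted kernel to the name that ps, w, netstat etc. look for.
ln "$root/netbsd.new" "$root/netbsd"

# Both names should now show the same inode number.
ls -li "$root"
```

Renaming the old kernel first means you can still boot it if the new one turns out to be a lemon.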

In the second case, dynamically linked binaries can usually be fixed by upgrading libkvm to match your new kernel. Statically linked binaries need to be replaced with more recent versions. Since they are statically linked, if you are going to recompile them yourself, you need to rebuild libkvm.a before you rebuild the program in question.

Thanks to John Wittkowski (jpw@netscape.com), a list of most of the programs (besides /bin/ps) that depend on libkvm appears in step 5 of the answer to 10.11 (all of these are in /usr/bin).

10.11 After changing kernels, I get a "proc size mismatch" error when I try to use ps. What's wrong?

Like the previous three questions on this subject, the answer is most likely that your libkvm is out of sync with your kernel or binaries. To solve this problem, you can either get a binary distribution which matches your kernel, or you can build your own by following the instructions below:

If you get the "proc size mismatch" error and you determine
that you need to update your libs, here's what to do:

1. Get all the source code. If you're not willing to do this
   and recompile things then you'll have to find someone who
   will do it for you and you can try installing everything
   by hand.

2. Make sure that your include files are up to date. Do 
   this by:
      cd /usr/src
      make includes
   This will take a while. I had some trouble with this
   because some of the Makefiles didn't have the INSTALL
   variable defined. Whenever the "make includes" failed,
   I went to the last directory listed and added the
   following line to the Makefile:
      INSTALL=/usr/bin/install
   I had to do this several times before the make finished
   without any errors.

   (If you make sure that /usr/bin/make and all the files in
   /usr/share/mk are up to date first, the above difficulties
   can probably be avoided --Colin)

3. Rebuild the libkvm and install it:
      cd /usr/src/lib/libkvm
      make
      make install
   Note that in order to get the libkvm to compile on
   my system I had to add the following link:
      cd /usr/include/machine
      ln -s ../m68k/kcore.h kcore.h
   This may have been a quirk of my system so try 
   compiling without it first.

4. Then rebuild the binaries that are STATICALLY linked to
   libkvm. The only statically linked program that I am 
   aware of is "/bin/ps". To rebuild ps, simply:
      cd /usr/src/bin/ps
      make
      make install

5. You may or may not need to rebuild the binaries that
   are dynamically linked to libkvm. This is because (I
   think) if the major version number of the lib changes
   then the old binaries will expect the old version
   number and not work with the newer version of the lib.
   For example, my old libkvm was libkvm.so.4.0. The new
   one was libkvm.so.5.0. Without recompiling the 
   dynamically linked binaries, it would still complain
   about "proc size mismatch" (if the 4.0 lib was still
   there) or some lib missing error (if the 4.0 lib
   had been removed from /usr/lib). If the minor version
   number changes (4.0 to 4.1, for example), I think it
   will run with a warning and so you may not need to
   recompile all of these things.

   The dynamically linked binaries that I am aware of
   are:
      /usr/bin/fstat
      /usr/bin/gdb
      /usr/bin/ipcs
      /usr/bin/netstat
      /usr/bin/nfsstat
      /usr/bin/systat
      /usr/bin/uptime (linked to /usr/bin/w)
      /usr/bin/vmstat
      /usr/bin/w
   Note that /usr/bin/uptime is a link to /usr/bin/w and
   will be set up properly when you do the "make install"
   for w.

   To recompile these, do the following:
      cd /usr/src/usr.bin/<cmd>
      make 
      make install
   For example, to recompile /usr/bin/vmstat:
      cd /usr/src/usr.bin/vmstat
      make
      make install

Much thanks to John Wittkowski (jpw@netscape.com) for providing such a detailed answer for this one.

10.12 Whenever I make extensive use of the serial ports, I get a lot of fifo overruns. Is there a fix?

This error was due to a bug in the serial drivers. A fix has been made, but it did not make it into the 1.1 distribution kernel. Kernels since that time should have the fix incorporated.

(NOTE: If you are currently seeing this error, please read the paragraph at the bottom of this answer)

Here is the latest from Bill Studenmund (wrstuden@loki.stanford.edu):

I'm pleased to announce that we seem to have fixed the serial port
problems. The current source on ftp.netbsd.org should be correct.

The problem turns out that we misunderstood how to set up the interrupt
levels (spltty specifically). We were turning off interrupts to the
chip when we should only have been turning off the passing of data to
the kernel. We were supposed to stop characters coming out of a
temporary buffer (the ring buffer) when in fact we were stopping characters
entering it. Thus it never did much, and we got lots of errors. :-(

I've gotten ppp to work very solidly. I've transferred a couple of meg
over the line w/o a single fifo error. :-)

The changed file is sys/arch/mac68k/include/param.h. Note that changing
it will mean a lot of the kernel needs re-compiling.

Noud de Brouwer, who is running a neighborhood ISP, has used it, and gotten
some very impressive numbers. Like over 6k bytes/sec. :-)

WARNING about high speeds: If, while transferring data at over say 28800
(non-scientifically-picked number), you want to do something else, like
compile, log in elsewhere, or use ethernet, you might have problems.
You might not. Every time a byte gets received, the CPU has to drop
what it's doing, save what it's doing, get the byte, store it, and
resume what you're doing. It's life w/o DMA. :-( It works, but takes
a bit of CPU. Just be aware.

Recently, a change in the interrupt handling has caused this error to start cropping up again. Although a fix has been committed, newer serial drivers should permanently fix the problem. When the new serial drivers are committed, a note will be made of it.

10.13 When I boot, I get the following error message:

Bootstrapping the pmap system.
Failure in BSD boot.  nextpa=0x106000, high[0]=0x100000.
panic: You're hosed!
What's wrong?

You forgot to put your Mac into 32-bit mode in the Memory control panel. Do this, and, barring other problems, it should boot fine.

The above is true unless you have a 24-bit video board installed. For some reason, the 1.2 kernels seem to have trouble detecting that the machine is in 32-bit mode when one of these boards (e.g. an Apple 8.24 video card or a Radius Precision Color Pro 24XP board) is installed.

This problem was not present in the 1.1 kernels, so hopefully getting a more recent kernel will fix it (assuming it has been fixed by now). Otherwise, you can simply remove the board for now.

10.14 When going multi-user, I get the following error message:

Mar  2 13:03:04 myname init: kernel security level changed from 0 to 1
mrg: no trace functionality enabled
panic: kernel jump to zero

This is normal if you're using the serial console boot and have not fixed your /etc/ttys to disable the getty on /dev/ttye0 and enable the getty on /dev/tty00 like so:

Change /etc/ttys from this:

# Define console that we actually run getty on
ttye0 "/usr/libexec/getty Pc" vt220 on secure
.
.
.
#Hardwired lines are marked off ...
tty00 "/usr/libexec/getty std.9600" unknown off secure

to this:

# Define console that we actually run getty on
ttye0 "/usr/libexec/getty Pc" vt220 off secure
.
.
.
#Hardwired lines are marked off ...
tty00 "/usr/libexec/getty std.9600" unknown on secure
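The same change can be sketched with sed. The example below works on a scratch copy containing just the two relevant sample lines; on a real system you would point it at a copy of /etc/ttys and look over the result before installing it:

```shell
# Scratch copy standing in for /etc/ttys, reduced to the two relevant lines.
ttys=$(mktemp)
cat > "$ttys" <<'EOF'
# Define console that we actually run getty on
ttye0 "/usr/libexec/getty Pc" vt220 on secure
#Hardwired lines are marked off ...
tty00 "/usr/libexec/getty std.9600" unknown off secure
EOF

# Turn the getty off on ttye0 and on for tty00; other lines pass through untouched.
sed -e '/^ttye0[[:space:]]/s/ on secure$/ off secure/' \
    -e '/^tty00[[:space:]]/s/ off secure$/ on secure/' \
    "$ttys" > "$ttys.new"

cat "$ttys.new"
```

The patterns are anchored to the device name at the start of the line so that the commented lines and any other terminals are left alone.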

Thanks to Brian Wimberly (brianw@scripps.edu) for the above fix.

10.15 Now that I've got ethernet working, my machine hangs a lot with the message:

/netbsd: Panic switch: PC is 0x7f734.
/netbsd: ae0: warning - receiver ring buffer overrun
Isn't ethernet working?

From Steve Weiss (srw@izzy.net):

This bug is known and fixed.
A little over a month ago I found a bug in dev/if_ae.c, the apple ethernet card driver, in which it failed to lower the interrupt that the card raises when its tally count registers overflow. This results in a perpetual interrupt hang. The mean time to fail is a function of network traffic.
Allen fixed this and checked the edits back into the source tree on February 2. There are several kernels in his outgoing directory on puma that have been built with this problem corrected. The earliest being netbsd.please.test (4feb), and the latest being netbsd.GENERIC-5 (7mar). Check the README in
ftp://ftp.macbsd.com/pub/outgoing/briggs/
My IIci used to hang every 3-5 minutes before this fix (we have very high traffic) and now it has been up for about 2 weeks running GENERIC-3.

10.16 Now that I've installed an ethernet card, I get the following message:

/netbsd: ae0: warning - receiver ring buffer overrun
/netbsd: ae0: device timeout, recovered
Is this a problem?

If these messages continue after boot, you probably have a problem. However, if you only see them during boot, they are most likely caused by the card having already been initialized by the MacOS before you booted NetBSD.

Thanks to Henry B. Hotz (h.b.hotz@jpl.nasa.gov) for answering this one.

10.17 Whenever I telnet into my Mac running NetBSD from a machine running MacOS, SunOS, or Solaris, I have severe terminal emulation problems. What's wrong?

From Scott Reynolds (scottr@og.org):

This is a pretty well-known problem, and stems from the fact that Sun's telnet (and HP's, at least up to HP-UX 9.x) is based on very old, and broken, code. There are two workarounds for this: (a) get new telnet source from ftp.cray.com, compile, and install it on the Sun, or (b) add "-k" to the entry in inetd.conf for telnetd. The former solution is preferred; the latter is a kludge that will help the situation, but you will probably find other strangeness coming from "good" telnets.

Presumably, the Mac version of telnet suffers from a similar problem. Using rlogin instead usually works for me, but I'd recommend using ssh if it's available.

10.18 Why does Booter 1.9.2 always crash with an Unimplemented Trap system error?

From Bill Studenmund (wrstuden@loki.stanford.edu):

I bet you have a "minimal" installed MacOS, no? There is a call that was part of an experiment for supporting booting w/ VM on. The project wasn't completed, and this call was left in. It is not implemented on minimal systems, but is on full installs.

Upgrading to Booter 1.9.4 or later should fix the problem.

10.19 Why do I get the message savecore: can't find device 0/0 when I go multi-user?

From Allen Briggs (briggs@puma.macbsd.com):

That's because the file /netbsd and the current kernel are not the same, and savecore needs to get the default dump device from the kernel, but if /netbsd and the current running kernel don't match, it gets confused. So do a few other programs.

10.20 Now that I've added an ethernet card right next to my video card, why does MacBSD refuse to boot?

If it hangs right after recognizing either the video card or the ethernet card, then you have probably been bitten by the interrupt conflict bug. For some time now, certain combinations of video and ethernet cards have appeared to have an interrupt conflict, usually causing the machine to hang on boot.

Although removing one of the cards will solve the problem, you might want to try a newer kernel as this kind of bug has been solved in several instances.

Allen Briggs (briggs@puma.macbsd.com) has provided a HOWTO on locating the information necessary to help fix this kind of problem: http://www.macbsd.com/macbsd/howto/video.html

10.21 Why do I get the line: ae0: NIC memory corrupt - invalid packet length 65280 when I boot?

The full text of the error message is probably something more like:

ae0: length does not match next packet pointer
ae0: len 0000 nlen ff00 start 0c first 00 curr 20 next 00 stop 40
ae0: NIC memory corrupt - invalid packet length 65280

This is most likely because you had MacOS networking enabled at the time you booted (most likely EtherTalk or MacTCP). Try booting with extensions off (well, at least those extensions) and it should go away. Occasionally, the boot process will hang at this point if you don't turn them off.

Thanks to Isaac Salpeter (isaac@ticalc.org) for the above answer.

According to Allen Briggs (briggs@puma.macbsd.com), these messages are fairly harmless, and you can eliminate them by compiling a kernel without the DIAGNOSTIC option enabled in its configuration file.

10.22 When attempting to mount my filesystems, why do I get an error like /dev/sd0a on /: specified device does not match mounted device?

This indicates that your /etc/fstab doesn't match the actual layout of the partitions on your disk. A likely cause is that versions 1.1a through 1.1c of the Installer utility create a possibly incorrect version of /etc/fstab, putting all of your partitions on sd0. If you're not using sd0, there are a couple of solutions:

  1. Use the Installer to copyout /etc/fstab, edit it to reflect reality in a text editor, and then use the Installer to copy it back in.
  2. Upgrade to version 1.1e of the Installer: ftp://ftp.best.com/pub.s/sbrown/NetBSD/installer/Installer_1.1e.hqx and do another "Build Devices" from the Installer menu.

Another possibility is that NetBSD 1.3 and later ignore Apple Driver type partitions when filling in the disklabel, so some of your partitions might move around. Running "disklabel sdX", where "X" is the device number of the drive in question, should give you a listing of the current layout of your partitions so you can fix your /etc/fstab.

10.23 I added a non-standard ADB device and now my machine gives the following error:

panic:  ResHndls table too small!

From Bob Nestor (rnestor@metronet.com):

The error comes from running out of ResHandles while making calls for ROM Resources. This is part of the MRG code solution for the ADB, so it probably won't be a problem at all with the Hardware Direct ADB solution. I'm not sure why the ResHandles Table gets exhausted with certain ADB devices, but it does. I recall working with someone on this some time ago, and we increased the table size to a value larger than the total number of Resources in the ROM and still couldn't eliminate the error. It's probably a case where there's an error in the use of the ResHandles Table *and* a loop of some kind in the ADB ROM routines trying to handle these devices. There are some ROM patches for the ADB routines in MacOS that we're not currently using in the MRG code in NetBSD, and they may be related to this problem.
So, unless someone wants to dig back into the MRG code to solve this, real solutions are to either wait for the Hardware Direct code or run without the offending ADB device.

10.24 Whenever I install something using the Installer, I get an error saying: Warning: Battery clock earlier than filesystem date. What's wrong?

From Nico van Eikema Hommes (hommes@ccc.uni-erlangen.de):

I think I have figured out what happens: it is a fug (that is a bug that would, in a Microsoft program, be called a "feature" :-> ) in the Installer. The timestamp is set to the local MacOS time (a cpin followed by ls shows this), but it should be set to GMT. When the filesystem is mounted during the boot process, the timestamp is interpreted as GMT, what it should be, and the offset, which for MET is positive, is added to it. That of course gives a later time than that of the battery clock. The irregularity of course came from the time elapsed between installing and booting. The negative offset for US-based systems probably caused this to remain unnoticed.

The latest version of the installer (i.e. 1.1e) fixes this problem by using GMT for reads and writes instead of the local timezone (which is what the MacOS uses).
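To make the arithmetic in that explanation concrete, here is a small sketch that models it as described: the Installer stamps files with local time, the kernel treats the stamp as GMT and adds the timezone offset, and the result is compared with the battery clock. The function name and the sample numbers are made up for illustration, and the model is only as good as the explanation above:

```shell
# clock_check OFFSET_SECONDS ELAPSED_SECONDS
# OFFSET is the timezone offset from GMT; ELAPSED is the time between
# installing and booting. Prints "warning" when the battery clock looks
# earlier than the filesystem date, "ok" otherwise.
clock_check() {
    offset=$1
    elapsed=$2
    install_local=1000000                    # arbitrary local time at install
    fs_seen=$((install_local + offset))      # stamp taken as GMT, offset added
    battery=$((install_local + elapsed))     # battery clock at boot time
    if [ "$battery" -lt "$fs_seen" ]; then
        echo warning
    else
        echo ok
    fi
}

clock_check 3600 600      # MET (+1h), booted 10 minutes after installing
clock_check 3600 7200     # MET (+1h), booted 2 hours after installing
clock_check -18000 600    # a US zone at -5h, booted 10 minutes after installing
```

Under this model the warning shows up only when the offset is positive and you boot sooner after installing than the offset, which would account for both the irregularity and the lack of reports from US-based systems.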

10.25 When I boot, I get the error: "kernal is not in a format the booter can understand". What's wrong?

You are probably trying to boot from a MacOS partition and have forgotten to extract the raw kernel file from the archive that it is in (many kernel files are uploaded as tar archives, gzipped archives, or both). You can get a copy of the enhanced Stuffit Expander or the MacGzip and Tar utilities to extract the raw kernel file. Make sure that you do not do any sort of end-of-line (i.e. text) translation in the process, or you will corrupt the kernel binary.
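If you have a Unix machine handy, one way to check whether a downloaded kernel file is still wrapped in an archive is to look at its first bytes. The sketch below relies on two well-known signatures: gzip files start with the bytes 1f 8b, and POSIX tar archives carry the string "ustar" at offset 257. The function name and the sample files are made up, and very old tar archives without the "ustar" marker will show up as "other":

```shell
# kind FILE: print "gzip", "tar", or "other" based on the file's signature.
kind() {
    magic=$(od -An -tx1 -N2 "$1" | tr -d ' \n')
    if [ "$magic" = 1f8b ]; then
        echo gzip
    elif [ "$(dd if="$1" bs=1 skip=257 count=5 2>/dev/null | tr -d '\000')" = ustar ]; then
        echo tar
    else
        echo other
    fi
}

# Made-up sample files in a scratch directory to try it on.
tmp=$(mktemp -d)
printf 'pretend kernel' > "$tmp/netbsd"
gzip -c "$tmp/netbsd" > "$tmp/netbsd.gz"
(cd "$tmp" && tar cf netbsd.tar netbsd)

kind "$tmp/netbsd.gz"
kind "$tmp/netbsd.tar"
kind "$tmp/netbsd"
```

A file that reports gzip or tar still needs to be unpacked before the Booter can use it.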

10.26 Why am I getting the message: "sn0: receive descriptors exhausted"?

From Denny Gentry (denny1@home.com):

The sonic DMAs a received packet into a data buffer. It then needs a descriptor to write the address of the buffer and some information about the packet (like its length) to hand it to the driver. If it runs out, it interrupts.
Usually this means the sn driver didn't get to run for a long time to clean up its descriptors. This can happen if there is a lot of SCSI activity, since SCSI interrupts will block network interrupts. Early in the morning the /etc/daily (etc) scripts run, causing lots of disk activity. If someone tried to reach your machine while the disk grinding was going on, it could easily cause it to run out of descriptors.
A few weeks ago I submitted changes to increase the number of receive descriptors by using a separate 4K page just for RX descs. Scott checked those changes in, so make sure you have a "recent" kernel. I note from another message you sent that you're running 1.2D; a 1.2G kernel will have more descriptors.
Since the Sonic requires its descriptors to be physically contiguous, giving it even more descriptors would require having a way to allocate physically contiguous memory.

10.27 My boot hangs shortly after printing "warning: no /dev/console". What's wrong?

This and the similar message:

warning: lookup /dev/console: error #
both indicate that the kernel can't find the console device. This means that you probably forgot to build devices while in the Installer. Reboot the machine, mount any non-root partitions you have using the Installer's Mini-Shell, and choose the "Build Devices" menu item to set up the proper devices.

Thanks to Bill Studenmund (wrstuden@loki.stanford.edu) for the above information.

10.28 I just upgraded my kernel, and now when I boot, I get the following message:

You booted with booter version 1.8.
Booter version 1.11 is necessary to fully support this kernel.

From Scott Reynolds (scottr@netbsd.org):

Harmless message, unless you are using a miniroot. (Well, if you're using a miniroot, the manner in which you are currently doing it is deprecated, anyway.) You can safely use new kernels with booter versions 1.9.5 or later, regardless of the warning message.
The booter version passed in has been out of date for quite some time. That's one of the reasons the next version of the Booter is 1.11, not 1.10.3; I wanted to circumvent any potential trouble with a Booter that actually had the version number fixed.

10.29 My machine hangs when I try to shut it down or reboot it. How can I get around this?

This is a problem that seems to most often plague IIcx users, but it has been seen in other older II-series Macs as well. It appears to be a result of some strange low-memory interaction that depends on the size of the kernel and the location of certain data structures within it.

One workaround for this problem is to compile a custom kernel with unnecessary device drivers removed. This will often fix the problem (but not always). Another possible workaround is to try to reboot the machine and shut down from MacOS instead. If the reboot hangs as well (as is often the case), the best you can do is shut down to single-user mode, sync the disks, and then manually power off or reboot the machine.

