Saturday, 22 December 2007

Removing CHS-based access from Windows boot loaders

Recently, I had trouble migrating my Windows installation from VMware to VirtualBox. When booting the VMware-created partition in VirtualBox, I got "NTLDR not found". So I sharpened the knives and got down to business with VMware's gdb interface and VirtualBox's internal debugger. Tracing the execution showed that the BIOSes of the two products report different geometries on the INT 13h interface. The boot loader's generic routine for reading a sector from disk is "clever": it checks whether the sector lies below the maximum sector index reachable with the CHS geometry reported by the BIOS. If not, it uses the LBA interface of the BIOS. If it does, the cleverness of the boot loader suddenly vanishes. Instead of using the BIOS-reported geometry to break the absolute sector number down into its CHS components, the boot loader uses a geometry stored in the so-called BIOS parameter block. That is a section of the first sector, embedded in the boot loader, that hard-codes values such as heads per cylinder and sectors per track. If the hard-coded values differ from the ones used by the BIOS, the calculation produces wrong values. So if you move your partition to a BIOS that exposes a different geometry than the one hard-coded in the boot loader, the whole thing blows up. Brilliant Microsoft design, as ever.
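To make the failure concrete, here is a minimal sketch of the arithmetic involved. It is not the boot loader's actual code: the function names are made up for illustration, and the 63 sectors per track is an assumption (the head counts 15 and 255 are the ones VMware and VirtualBox report in my case). It splits an absolute sector number into CHS using one geometry and then shows which sector a BIOS with a different geometry would actually fetch.

```c
#include <assert.h>
#include <stdint.h>

/* A cylinder/head/sector address. Sectors are 1-based in CHS addressing. */
struct chs {
    uint32_t cylinder;
    uint32_t head;
    uint32_t sector;
};

/* Split an absolute (LBA) sector number into CHS for a given geometry. */
struct chs lba_to_chs(uint32_t lba, uint32_t heads, uint32_t sectors_per_track)
{
    struct chs r;
    assert(heads > 0 && sectors_per_track > 0);
    r.sector   = lba % sectors_per_track + 1;
    r.head     = (lba / sectors_per_track) % heads;
    r.cylinder = lba / (sectors_per_track * heads);
    return r;
}

/* The inverse mapping, as the BIOS applies it with ITS OWN geometry. */
uint32_t chs_to_lba(struct chs a, uint32_t heads, uint32_t sectors_per_track)
{
    return (a.cylinder * heads + a.head) * sectors_per_track + (a.sector - 1);
}

/*
 * Sector that really gets read when the loader computes CHS from the
 * hard-coded head count (bpb_heads) but the BIOS interprets the result
 * with a different head count (bios_heads).
 */
uint32_t sector_actually_read(uint32_t lba, uint32_t bpb_heads,
                              uint32_t bios_heads, uint32_t sectors_per_track)
{
    struct chs a = lba_to_chs(lba, bpb_heads, sectors_per_track);
    return chs_to_lba(a, bios_heads, sectors_per_track);
}
```

With these example numbers, a request for sector 2000 computed with 15 heads becomes C/H/S 2/1/48, which a BIOS assuming 255 heads resolves to sector 32240. When both sides agree on the head count, the round trip returns 2000 again.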

My solution is to override the check in the boot loader so that LBA-based access is always used and the CHS code is never touched. This way I'm able to use my partition under both VMware (which uses heads=15) and VirtualBox (which uses heads=255). Here is my boot loader patcher for FAT32- and NTFS-based boot loaders: killchs.c. Use at your own risk. If it breaks your boot loader, chances are good that you can restore it with mbrfix.
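Before patching anything, it can help to look at the geometry a boot sector has hard-coded. The following is a rough sketch, not an excerpt from killchs.c: the struct and function names are hypothetical. It relies on the standard BIOS parameter block layout shared by FAT and NTFS boot sectors, with the OEM name at offset 3, sectors per track at offset 0x18, heads at offset 0x1A (both little-endian 16-bit), and the 0x55 0xAA signature at the end of the sector.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the BPB fields of interest. */
struct bpb_geometry {
    char     oem_name[9];        /* 8 characters plus terminating NUL */
    uint16_t sectors_per_track;
    uint16_t heads;
};

/* Read a little-endian 16-bit value from a byte buffer. */
static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Extract the hard-coded geometry from a 512-byte boot sector.
 * Returns 0 on success, -1 if the boot signature is missing.
 */
int parse_bpb_geometry(const uint8_t sector[512], struct bpb_geometry *out)
{
    if (sector[510] != 0x55 || sector[511] != 0xAA)
        return -1;

    memcpy(out->oem_name, sector + 3, 8);
    out->oem_name[8] = '\0';
    out->sectors_per_track = read_le16(sector + 0x18);
    out->heads             = read_le16(sector + 0x1A);
    return 0;
}
```

Feed it the first 512 bytes of the partition (for example, read from a raw image file) and compare the reported head count with what the BIOS of the target machine or VM reports.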

Btw: VirtualBox is available under the GPL. Not only does this make it much more sexy to work with, it is also much faster than VMware, at least that's my impression. There is also a commercial distribution of VirtualBox.

3 comments:

Vituko said...

Great, man!!

I thought I'd never get my vmdk XP system to boot directly outside Virtualbox-ose. I've spent lots of hours: installing inside, broken outside and vice versa, fixmbr, fixboot, install-mbr, dd,... Nothing worked. I knew this had to be a BIOS issue but... what to do? I'd never thought to try hex-editing a boot sector (your C code doesn't accept the "MSDOS5.0" system name of my boot partition, but it is very clear).

* Debian Lenny package : virtualbox-ose 1.6.6-dfsg-2

Thanks

Let-it-be-genocide said...

Unfortunately, this does not work for a slightly different situation I have. I had a Thinkpad R40, which is old now. I bought a new HP Pavilion dv6t. Would love to keep the disk image from the Thinkpad. Cloned the Thinkpad disk to a USB drive (identical size) using EaseUS Disk Copy 2.3 -- disk copy, not partition copy. Then cloned the USB drive to my HP disk. There was a warning that the geometries don't match -- 240 heads on the USB drive vs 255 heads on the HP drive -- but that EaseUS Disk Copy would handle this correctly. Checked various low-level details in the MBR and boot loader, and all known fixes reported on the web were indeed correctly set in the cloned version on the HP. However, the HP machine does not boot -- it starts the boot loader and then hangs with a blank screen and a blinking cursor in the top left corner. Making the changes in the boot loader per killchs.c (manually editing the disk content for these 4 bytes) causes the boot loader to proceed a shade further, and then it flashes what looks like the MS blue screen of death; but instead of dying, the machine goes back into reboot, and I cannot capture the error message.

Any further suggestions? Would really appreciate some help here.

Unknown said...

Same situation here, when cloning a physical ThinkPad R60 installation (240 heads) into VirtualBox. Your patch fixed the booting so that it could get into the boot menu, where I could enable kernel debugging and fix the remaining two bluescreens with the help of the windbg analyze plugin (the first was a missing IO APIC, the second was a missing DLL for the AHCI implementation of VirtualBox, which seemed to be different from the one used in my ThinkPad). So, if you can get into the F8 menu, especially on a VM where attaching a virtual serial cable is easy, try kernel debugging :-)