Sunday, June 7, 2009

The g4u

1. What is it?

    g4u ("ghosting for unix") is a NetBSD-based bootfloppy/CD-ROM that allows easy cloning of PC harddisks to deploy a common setup on a number of PCs using FTP. The floppy/CD offers two functions. The first is to upload the compressed image of a local harddisk to an FTP server; the other is to restore that image via FTP, uncompress it and write it back to disk. Network configuration is fetched via DHCP. As the harddisk is processed as an image, any filesystem and operating system can be deployed using g4u. Easy cloning of local disks as well as partitions is also supported.

    For the curious, I've added a few screenshots:

    1. Booting g4u in bochs
    2. Device detection
    3. Welcome to g4u!
    4. Some random g4u commands
    5. Uploading a disk image with uploaddisk
    6. Restoring with slurpdisk

2. Why not one of the alternatives?

    • Server-part often runs (only) under DOS/Windows. I wanted to use a Unix based server.
    • Supported filesystems include everything from Microsoft, but others are not handled properly (Solaris/x86, NetBSD, ...)
    • I don't want to deal (ever again) with making a DOS-based bootfloppy that gets its IP number via DHCP.
    • I've played with doing multicast-based deployment based on imm, but the result was so slow I decided not to pursue it.

3. Requirements & Download

    • Two empty 1.44MB floppy disks, or an empty CD-R/RW or DVD
    • An FTP server with a few GB of free space
    • A DHCP-server

    In addition to that, you may want:

    • The g4u 2.3 floppy images (zipped/ uncompressed floppy one and floppy two)
    • The g4u 2.3 ISO CD image (zipped/uncompressed)
    • The g4u 2.3 source
    • Some md5 checksums:
      MD5 (g4u-2.3-1.fs) = 2f430b3cf983d314ee377381feaa678a
      MD5 (g4u-2.3-2.fs) = 2148d7ca70d8c469ead31a41c0b6fe56
      MD5 (g4u-2.3.iso) = 3f50b5b9aebc50acbad8d63fba3853e2
      MD5 (g4u-2.3.tgz) = 2f770037461f79389c7167829fe2c7cb
      MD5 (g4u-2.3.fs.zip) = 4cddc7faebdf383f391a6c159094711a
      MD5 (g4u-2.3.iso.zip) = b64167a06fa19c6f646ff41657d53294
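
    The checksums above can be compared against a downloaded file with a few lines of shell. This is only a sketch: the helper names md5_of and check_md5 are made up for this example, and it assumes that either md5sum (common on Linux) or md5 (common on the BSDs) is installed.

```shell
#!/bin/sh
# md5_of FILE: print the MD5 hex digest of FILE, using whichever tool exists.
# (md5_of and check_md5 are illustrative helper names, not part of g4u.)
md5_of() {
    if command -v md5sum >/dev/null 2>&1; then
        # md5sum prints "<digest>  -" when reading stdin; keep the digest only
        md5sum < "$1" | cut -d' ' -f1
    elif command -v md5 >/dev/null 2>&1; then
        # BSD md5 prints just the digest when reading stdin
        md5 < "$1"
    else
        echo "no md5 tool found" >&2
        return 1
    fi
}

# check_md5 FILE EXPECTED: succeed only if the digest of FILE equals EXPECTED.
check_md5() {
    [ "$(md5_of "$1")" = "$2" ]
}

# Usage, with the digest taken from the list above:
#   check_md5 g4u-2.3.iso 3f50b5b9aebc50acbad8d63fba3853e2 && echo "checksum OK"
```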

    Older versions of g4u are available as well:

    You can also download from one of these mirrors:

4. Using it

    4.1 Preparations

    • Using the g4u floppy images:
      1. Download the floppy images, g4u-2.3-1.fs and g4u-2.3-2.fs, or g4u-2.3.fs.zip, which contains these files.
      2. If you downloaded the g4u-2.3.fs.zip file, unpack it to get g4u-2.3-1.fs and g4u-2.3-2.fs.
      3. Write the two images to two separate floppy disks. Under Unix, a simple "cat g4u-2.3-1.fs >/dev/diskette" (and the same for -2.fs) will do. Make yourself familiar with the name of your floppy device; some common ones are:

        • NetBSD: /dev/fd0a
        • Solaris: /dev/diskette
        • Linux: /dev/fd0

        There are also similar devices for USB sticks, but you need to grab the g4u.fs from the ISO to put there:

        • NetBSD: /dev/sd0d
        • Linux: /dev/sda

        If you're using Microsoft Windows or DOS, use rawrite.exe. There's also a Windows-based program available called rawr32.zip.

    • Using the g4u CDROM ISO image:
      1. Download the CDROM ISO image, g4u-2.3.iso or g4u-2.3.iso.zip
      2. If you downloaded the g4u-2.3.iso.zip file, unpack it to get g4u-2.3.iso
      3. Please consult the manual of your CD writing software (Nero, DiskJuggler, WinOnCD, cdrecord, ...) on how to write the g4u-2.3.iso file to a CD. Note that the image is bootable.

    • On an FTP server of your choice, create a user account called "install", and protect it with a password. Make sure the "install" user can log in via FTP (/etc/shells...).

      If you want to use a different account, you can specify "login@server" for slurpdisk, uploaddisk etc.

    • Make sure you have a working DHCP server that hands out IP addresses and the other data needed to access the FTP server from your workstation (name server, netmask, default gateway). Otherwise you will have to set the IP number manually (see 5.11).
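
    For the floppy step above, a slightly more careful variant of the "cat" approach is to write the image with dd and then read it back for comparison. The following is a sketch only: write_image is a made-up helper name, and the device you pass must be the floppy device of your own system (see the list above).

```shell
#!/bin/sh
# write_image IMAGE DEVICE
# Copy IMAGE onto DEVICE with dd, then read the same number of bytes back
# from DEVICE and compare them with IMAGE. Succeeds only if both match.
# (write_image is an illustrative helper name, not part of g4u.)
write_image() {
    img=$1
    dev=$2
    # $(( ... )) strips the padding some wc implementations add
    size=$(( $(wc -c < "$img") )) || return 1
    # 18 KiB blocks; any multiple of 512 works. conv=notrunc only matters
    # when DEVICE is a regular file used for a dry run.
    dd if="$img" of="$dev" bs=18k conv=notrunc 2>/dev/null || return 1
    sync
    head -c "$size" "$dev" | cmp -s - "$img"
}

# Usage (the device name is system specific, /dev/fd0 is the Linux one):
#   write_image g4u-2.3-1.fs /dev/fd0 && echo "floppy one written and verified"
```

    Trying the function on a scratch file before pointing it at a real device is a cheap way to avoid overwriting the wrong disk.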

    4.2 Image creation

    • Boot the CD or floppy on the machine you want to clone. See it read the kernel from disk, then print out all the devices found in the machine. It will do DHCP next, asking for an IP number - be sure you have DHCP configured properly! At the end you'll get a text description of possible commands, and a shell prompt.

    • Whole harddisk:
      Type "uploaddisk your.ftp.server.com filename.gz" to read out the machine's harddisk (rwd0d), and put it into the "install" account of your FTP server under the given filename. The disk image is compressed (with gzip -9), so maybe use a ".gz" file suffix. You don't have to, though. Before putting the file on the FTP server, the "install" account's password is requested.

      If you want to clone your second IDE disk, add its name on the uploaddisk command line: "uploaddisk your.ftp.server.com filename.gz wd1". Similarly, if you use SCSI instead of IDE disks, use "uploaddisk your.ftp.server.com filename.gz sd0".

      If you want to use a different account name than "install", use "account@your.ftp.server.com" for both uploaddisk and slurpdisk.

    • One partition only:
      Get an overview of the disks recognized by g4u by typing "disks". A list of partitions on a certain disk is available via "parts disk", where disk is one of the disks printed by "disks", e.g. wd0, wd1, sd0, etc. Partitions are numbered with letters starting from 'a', where partitions a-d are usually predefined, with your partitions starting at 'e'. Partitions here are BSD partitions, which have little in common with DOS MBR partitions. To specify a partition, use something like "wd0e" or "sd0f": "uploadpart your.ftp.server.com filename.gz wd0e". Run "uploadpart" without arguments for more examples.

    • Wait until you're back at the shell prompt (ignore the errors :-). Depending on your network, CPU, harddisk hardware and contents, image creation can take several hours!
    • You can switch off the machine now. Type "halt" or simply press reset/power button - there are no filesystems mounted so no harm will result.
    • Check that your FTP server's "install" account now has the image file.

    4.3 Image deployment

    • Boot the CD or floppy to the shell prompt again, see above.

    • Whole harddisk:
      Type "slurpdisk your.ftp.server.com filename.gz". This will log into the FTP server's "install" account, verify the password, then retrieve the image, uncompress it and write it back to /dev/rwd0d.

      If you want to restore to a SCSI disk, add the disk's name to the slurpdisk command line, e.g. "slurpdisk your.ftp.server.com filename.gz sd0".

      See above if you want to use an account name other than "install".

    • One partition:
      Use "slurppart your.ftp.server.com filename.gz wd0e" or whatever values you passed to uploadpart. Please note that the partition information is taken from your MBR, which is expected to be the same as before image creation - expect surprises if you change something between image creation and deployment. In case of inevitable changes, check the start sector and size values given by "parts". For an image that includes the MBR, do a full backup with "uploaddisk".

    • Reboot the machine (type "reboot" or press reset button), and see if your machine comes up as expected - it should!
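
    Conceptually, the upload/restore cycle of 4.2 and 4.3 is "read the raw bytes, compress them, store them somewhere" and the reverse. The sketch below replays that cycle locally, with plain files standing in for the disk devices and for the FTP server. It is not g4u's actual implementation, and the function names image_create and image_restore are made up for this example.

```shell
#!/bin/sh
# Local stand-in for the upload/restore cycle. Plain files play the role of
# the disk devices, and a local file replaces the image on the FTP server.
# (image_create and image_restore are illustrative names, not g4u commands.)

# image_create SRC IMAGE.gz: read SRC from the first byte to the last and
# store it compressed with the same level the text mentions (gzip -9).
image_create() {
    dd if="$1" bs=64k 2>/dev/null | gzip -9 > "$2"
}

# image_restore IMAGE.gz DST: uncompress the image and write it to DST.
image_restore() {
    gzip -dc "$1" | dd of="$2" bs=64k 2>/dev/null
}

# Usage with scratch files:
#   image_create disk0.img backup.gz
#   image_restore backup.gz disk1.img
#   cmp disk0.img disk1.img && echo "identical"
```

    Because nothing in this pipeline looks at the content of the stream, it also shows why the filesystem on the disk does not matter.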

    4.4 Copying a disk locally

      If you just want to copy one local disk to another one with no network & server involved, the "copydisk" command is what you want. E.g. to copy the first IDE disk to the second IDE disk, use "copydisk wd0 wd1"; to do the same for SCSI disks, run "copydisk sd0 sd1". Beware! All data on the target disk will be erased!

      A list of disks as found during system startup can be found using the "disks" command.

    4.5 Copying a partition locally

      If you want to only copy one local partition to another local partition (similar to what 'uploadpart' and 'slurppart' do, just without the network and FTP in between), this can be done with the 'copypart' command. It takes two partition names as arguments, and copies the contents of one partition to the other. As an example if you found you want to copy your first local partition 'wd0e' to the second one 'wd0f', run:
      copypart wd0e wd0f
      A list of disks can be found using the 'disks' command; to list all the partitions on a disk, use the 'parts' command. Partitions have the form of "wd0d", "wd1e", "sd1f".

      Be aware that the partitions to copy should have identical size (down to the sector), else funny things will happen. When copying a 'big' partition into a 'small' one, g4u won't overwrite the data behind the 'small' partition, but of course the copy is not complete either. Take special note that this case can occur when you restore a copy made that way, even though everything went fine when you first copied your small working partition to your big backup partition!

5. FAQs and hints on disk cloning

    5.1 Supported filesystems

      One of the questions arising a lot is "what filesystems does g4u support". The answer is: "all of them". g4u reads the disk bit by bit, starting from byte #0 to the end. It includes any MBR, boot record, partition table and the partitions themselves without further investigating the structure of the data stored in these partitions.

    5.2 Supported Operating Systems

      The question on operating systems that can be deployed with g4u is the same as for the filesystems: any. Given the image-approach again, g4u is able to handle any operating system. Systems that were cloned successfully include NetBSD, Linux, Novell Netware 4.11 and 5.1, Solaris/x86, Windows NT, 2000 and XP.

      By moving the harddisks to a PC, g4u can even be used to deploy operating systems for non-PC based SCSI machines running HP-UX, Irix, Solaris, AIX etc.

    5.3 Supported Hardware

      The system running g4u itself can have IDE, SATA, SCSI or RAID disks with various controllers (Adaptec, ...) as well as a wide range of PCMCIA, Cardbus, ISA and PCI network cards. Please see the g4u kernel config for the full list of supported hardware.

      If you're unsure if your hardware is supported, simply boot g4u and see if your network card gets listed by "ifconfig -a" and if your disks get listed by the "disks" command. If not, adding relevant parts of "dmesg" output (from g4u; press space bar to scroll down) is required for analysis if you ask for help. See "Reporting problems" for more information.

    5.4 A word on disk sizes

      The question how g4u deals with different disk sizes arises a lot too. The general answer is, g4u works best with identical disk sizes & geometry. Putting an image from a small disk on a big disk works, putting an image from a big disk to a small disk is likely to cause problems.

      If you cannot avoid preparing an image on a big disk that'll get deployed to a small disk later, make sure the "extra" space is not occupied by an active partition or filesystem, else data loss is very likely to occur!

      If you intend to deploy a "small" image to a "big" disk, the extra space that's not covered by g4u can be used for creating a partition and a filesystem. You will have to do that on your own, e.g. using your operating system's post installation steps.

    5.5 Changing compression level

      By default, images uploaded to the FTP server are compressed with "gzip -9". This saves as much disk space as possible, but also takes a long time - several hours are not uncommon. You can reduce the gzip level for "uploaddisk" by setting the GZIP environment variable:

      # GZIP=1 uploaddisk your.ftp.server.com filename.gz
      You can change compression levels between 1 (fast, little compression) and 9 (slow, maximum compression). Of course you can specify all the usual options to uploaddisk.
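
      To get a feeling for the trade-off before imaging a whole disk, you can compress a sample file at different levels on any Unix machine. This is a rough sketch that calls gzip directly; gz_size is a made-up helper name, and how much the levels differ depends entirely on your data.

```shell
#!/bin/sh
# gz_size LEVEL FILE: print the size in bytes of FILE compressed at LEVEL (1-9).
# (gz_size is an illustrative helper name, not part of g4u.)
gz_size() {
    gzip -"$1" -c "$2" | wc -c | tr -d ' '
}

# Usage: compare the fastest and the strongest level on a sample file.
#   echo "level 1: $(gz_size 1 sample.bin) bytes"
#   echo "level 9: $(gz_size 9 sample.bin) bytes"
# Prefixing the calls with "time" shows the CPU cost of each level.
```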

    5.6 List of recognized disks

      During startup of g4u, all devices recognized are listed, but very fast. To get a list of recognized disks, use the 'disks' command:

      # disks
      wd0 at pciide0 channel 0 drive 0:
      wd0: drive supports 16-sector pio transfers, lba addressing
      wd0: 6149 MB, 13328 cyl, 15 head, 63 sec, 512 bytes/sect x 12594960 sectors
      wd0: 32-bit data port
      wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 2
      wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 2 (using DMA data transfers)
      The above example shows a 6GB IDE harddisk.
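
      When comparing a source and a target disk (see 5.4), the exact size is what counts, not the rounded MB figure. A minimal sketch that computes it from the "bytes/sect x sectors" part of the output shown above; disk_bytes is a made-up helper name, it assumes lines formatted exactly like the wd0 example, and it needs a shell with 64-bit arithmetic.

```shell
#!/bin/sh
# disk_bytes: read "disks" output on stdin and print "<disk> <size in bytes>"
# for every line of the form "... N bytes/sect x M sectors".
# (disk_bytes is an illustrative helper name, not part of g4u.)
disk_bytes() {
    sed -n 's/^ *\([a-z]*[0-9]*\): .* \([0-9][0-9]*\) bytes\/sect x \([0-9][0-9]*\) sectors.*$/\1 \2 \3/p' |
    while read -r name bps sectors; do
        # bytes per sector times number of sectors
        echo "$name $((bps * sectors))"
    done
}

# Usage with the line from the example above:
#   echo 'wd0: 6149 MB, 13328 cyl, 15 head, 63 sec, 512 bytes/sect x 12594960 sectors' | disk_bytes
#   prints: wd0 6448619520
```

      Two disks are safe to treat as "identical size" only if these byte counts match.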

    5.7 Problems with images at 2GB

      Do you experience g4u aborting file transfers after the image has grown to 2 GB on the FTP server? The problem here is not g4u, but most likely your FTP server. Some older Linux distributions are known to only allow files of up to 2GB filesize, and even if there is a Linux 2.4 kernel running, that's no guarantee for a properly working server. Make sure that your ftp daemon is up to date, or install a decent operating system.

      So far, whatever FTP server comes with NetBSD, Solaris and Windows 2000 has been used without problems.

    5.8 Can you add feature XXX?

      I got requests for adding many features to g4u:

      • using TFTP
      • using SSH/scp
      • using NFS
      • adding an X or curses based GUI
      • writing images to CDROM / deployment from CDROM
      • bzip2 compression

      After moving to a two-floppy set for g4u, some of these features may be added in the future, while others (X...) are not likely. Stay tuned!

    5.9 Problems with network performance

      If upload performance is weak (less than 5MBytes/sec on a 100BaseT Ethernet switch) even with a low compression level or a fast CPU, and the harddisk is idle, this means the network sucks. A common problem in switched Ethernet is a duplex mismatch between the NIC and the switch. In NetBSD, the default is to negotiate speed and duplex automatically. Other settings can be set manually.

      Enforcing 100BaseTX/Full-duplex:

      # ifconfig fxp0 media 100BaseTX mediaopt Full-duplex
      # ifconfig -a
      fxp0: flags=[...]
      media: Ethernet 100baseTX full-duplex

      Using autonegotiation (default):

      # ifconfig fxp0 media auto
      # ifconfig -a
      fxp0: flags=[...]
      media: Ethernet autoselect (100baseTX)

      For more information, please see the ifconfig(8) manpage as well as the Auto-Negotiation Valid Configuration Table featuring "Why Can't the Speed and Duplex Be Hardcoded On Only One Link Partner?".

    5.10 Reducing the image size

      People complain that the image resulting from g4u is very big. This is normal, as g4u clones the whole disk with all its blocks, regardless of whether they contain valid data or are empty/unused. To find empty/unused blocks (and not clone them), g4u would need intimate understanding of the contained filesystem, which is different again for each filesystem - Windows FAT, Linux Ext2/3/ReiserFS/..., BSD FFS, Solaris UFS, etc. Given both tight space limitations on the floppy as well as a shortage of filesystem documentation and implementations available, teaching g4u to ignore empty blocks is not likely to happen.

      But there is an easy way to circumvent the problem: use the native operating system's understanding (and implementation) of the filesystem, and make sure it prepares empty/unused blocks in a way so they don't contain random garbage data but values which can be compressed easily by g4u, thus resulting in small image sizes.

      Effectively, you just fill up the disk's unused blocks with zero-bytes: open a file for writing, stuff in 0-bytes until the disk is full, then close the file and remove it. The result is that all unused blocks were used by the file and filled with data that g4u can then compress easily. When the file is removed, the operating system will usually just mark the blocks as unused, without changing the actual data content.

      Using this technique on a 20GB disk that had 6GB Solaris 8/x86 and the rest Windows 2000 Workstation shrunk the image from ~6GB compressed to ~2GB compressed. You can probably imagine the effect of this on deployment time too. :)
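
      The effect is easy to reproduce on a small scale: compress the same amount of zero-bytes and of random data and compare the results. A minimal sketch; gz_bytes is a made-up helper name, and /dev/urandom merely stands in for leftover garbage in unused blocks.

```shell
#!/bin/sh
# gz_bytes SOURCE COUNT: print the size in bytes of the first COUNT bytes
# of SOURCE after compression with "gzip -9".
# (gz_bytes is an illustrative helper name, not part of g4u.)
gz_bytes() {
    head -c "$2" "$1" | gzip -9 | wc -c | tr -d ' '
}

# Usage: one megabyte of zero-bytes versus one megabyte of random data.
#   gz_bytes /dev/zero 1048576
#   gz_bytes /dev/urandom 1048576
```

      The zero-filled megabyte shrinks to a tiny fraction of its size, while the random one does not shrink at all.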

      To perform the filling of unused data blocks with zero-bytes, there are several ways, depending on what operating system you use on your computer, and what software you have available:

      Standard Unix:
      This works on any Unix variant - Linux, NetBSD, Solaris, etc.:
      dd if=/dev/zero of=/0bits bs=20971520   # bs=20m
      rm /0bits

      Windows Perl solution:
      This one needs perl installed. In a command shell, type:
      cd /d c:\
      c:\win-preclone.pl c:
      The win-preclone.pl perl script is available for download.

      Windows Pascal solution:
      This Pascal program was contributed by Matthias Jordan [mjordan at code-fu dot de]. The programs are provided here without warranty.

      64bit Windows binary:
      Dominic Leelodharry [dominic at authorsoftware d0t com] also sent me a binary for 64bit Windows. This program is provided here without warranty.

      Windows "Eraser":
      This freeware program can erase your disk in a safe way, but it can also be told to just write a pattern of all-0-bits to the disk. Grab it at www.heidi.ie/eraser. Thanks to Stephen Krans [s040 at krans dot org] for the hint!

      Windows "onboard" solution:
      Apparently Windows XP comes with a tool to do some harddisk encryption that can also be used to write 0-bytes to the disk. To do so, run the following command: cipher /W:C: for drive C:. You will need to abort (Control+C) after the first round, else it will write random data after filling the disk nicely with 0-bytes.

      Windows "sdelete":
      Microsoft provides a tool "sdelete" that offers a switch -c in recent versions (only!) to zero free space on drives.

      Run any of these right before shutting down the operating system to create an image with g4u, and see the size difference.

    5.11 Setting IP-number manually

      Sometimes you may not want or be able to use DHCP. In that case doing the network configuration manually is possible with g4u, too:

      1. Find out if your network device is recognized, and by what name, using the command

             ifconfig -a
        Your network device is something like "ex0", "tlp0", etc. (Note that unlike in Linux, NetBSD doesn't call all ethernet cards "eth0"!)

      2. Next configure the network device's IP number and netmask. It is assumed that your network device is xx0 here, and that the machine should run with IP number 1.2.3.4 and netmask 255.255.255.0:

             ifconfig xx0 1.2.3.4 netmask 255.255.255.0
      3. Last, you may want to make the default router known unless your FTP server is in the same IP subnet as the machine you want to use g4u on. Let's say the default router's IP address is 2.3.4.5, then the command to enter is:

             route add default 2.3.4.5 
      That's all - simple, huh? Just remember that g4u is still Unix! After these steps, you should be able to use g4u just as if it used DHCP.
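
      Whether step 3 is needed depends on whether the FTP server sits in the same subnet as the machine. That can be checked with a little arithmetic on any Unix shell. The sketch below is an illustration only: ip_to_int and same_subnet are made-up helper names, and the server address 1.2.3.77 in the usage lines is a hypothetical example.

```shell
#!/bin/sh
# ip_to_int A.B.C.D: print the dotted-quad address as one 32-bit number.
# The body runs in a subshell, so changing IFS does not affect the caller.
# (ip_to_int and same_subnet are illustrative names, not g4u commands.)
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# same_subnet IP1 IP2 NETMASK: succeed if both addresses, masked with
# NETMASK, give the same network number.
same_subnet() {
    a=$(ip_to_int "$1")
    b=$(ip_to_int "$2")
    m=$(ip_to_int "$3")
    [ $(( $a & $m )) -eq $(( $b & $m )) ]
}

# Usage, with the machine address and netmask from step 2:
#   if same_subnet 1.2.3.4 1.2.3.77 255.255.255.0; then
#       echo "same subnet, no default route needed"
#   else
#       echo "different subnet, add a default route"
#   fi
```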

    5.12 Extracting the g4u kernel

      I've been asked how to boot g4u from harddisk (using e.g. grub). The idea is to extract the kernel from the boot floppy, and hand that to grub (or whatever bootloader you want - maybe use PXE to netboot g4u). Here's how to extract the kernel, named "netbsd":

      % ( cat g4u-2.3-1.fs | dd bs=512 skip=16 ; \
      ? cat g4u-2.3-2.fs | dd bs=512 skip=16 ; \
      ? cat g4u-2.3-3.fs | dd bs=512 skip=16 \
      ? ) | tar vxf -
      -r--r--r-- 1 feyrer netbsd 53948 Nov 3 23:08 boot
      -rw-rw-r-- 1 feyrer netbsd 1479905 Nov 3 23:08 netbsd
      Note that the kernel ("netbsd") is actually still compressed, which is fine for the NetBSD bootloader and probably GRUB, but just in case, you may want to uncompress it:
      % file netbsd
      netbsd: gzip compressed data, was "netbsd-INSTALL_G4U", from Unix
      % mv netbsd netbsd.gz
      % gunzip netbsd.gz
      % ls -la netbsd
      -rw-rw-r-- 1 feyrer wheel 5523084 Dec 7 18:08 netbsd
      % file netbsd
      netbsd: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), statically linked, stripped

5.13 Netbooting g4u via PXE

5.14 What FTP server software to use?

    When you're uploading or downloading images to or from your FTP server, and you see a line like
     553 Cannot send file larger than 4 gigabytes 
    scroll by, you can assume that that line (and any others with a number at the start) originates from your FTP server, and it is thus not g4u that's buggy but your FTP server software that has a problem.

    Some known working FTP server programs are:

    • NetBSD's ftpd(8) that comes in the default installation
    • GuildFTPd on Windows XP
    • The FTP server that comes with Microsoft Windows 2000 and 2003 Server (there seem to be problems with any FTP servers that come with non-"Server"-Versions of MS Windows...)
    • vsftpd on your favourite Linux distribution (reported working: Fedora 2, Debian Sarge)
    • Cerberus FTP
    • Novell Netware's FTP server (reported working: Netware 6)
    • TYPSoft FTP Server on a USB memory stick with Windows XP
    • Filezilla Server on Windows 98SE and XP
    • Mac OS X and Mac OS X Server's FTP daemon
    • Bullet Proof FTP Server v. 2.3.1 (Build 26)
    • Pure-FTPd
    • MOVEit DMZ
    • ProFTPD
    • SlimFTPd
    • (More? Let me know!)

5.15 Non-standard applications

    g4u was originally made to set up a cluster of PCs. Since then, it has been used for several other types of hardware and application areas. I'd like to collect some of them here:
    • Copied a dual drive Tivo
    • Saved Novell NetWare server disks
    • Copied Nokia IP330 Checkpoint Firewall 1 boxes
    • Installed several clusters of firewalls, compute machines, school workstations, etc.
    • Cloned Symbol WS5000 and WS5100 wireless switches
    • ...
    Please send me mail if you've used g4u to clone something funny, cool, unusual, geeky, etc.!