Archive for the ‘Software’ Category

Reconstructing heavily damaged hard drives - July 3rd, 2008

[EDIT: Hey guys, thanks for the feedback! Someone over at virtuallyhyper.com has an awesome write-up that deals with SD cards specifically (but is highly relevant to hard drives too), with a set of much improved and updated scripts. I’d strongly recommend taking a look: Recover files from an SD card using Linux utilities]

Recover data even when an NTFS (or other) filesystem won’t mount due to partial hard drive failure.

This was born when someone brought me a near-dead hard drive (a serious number of read errors, so bad that nothing could mount or fix the filesystem), asking if I could recover any data.

Now obviously (as almost any geek would know), the answer is a very likely yes. There are many ways of recovering data. One such way (which I performed) is using Foremost to find files based on their headers and structures. While this technique works really quite well, it does miss a lot of files, leaves others fragmented or incomplete, and generally doesn’t retrieve any metadata (such as filenames).

This makes Matt mad. No filenames == days of renaming files.

So I booted up Helix, created a quick image of the drive to a 500GB external drive, and tried running Autopsy (the GUI of Sleuthkit). This is where things got interesting.

I say interesting, because Sleuthkit couldn’t read the filesystem. But it could retrieve the inodes, and the metadata along with them. And it could accordingly retrieve the data content of (some) files.

Observing this, I realized there was a high probability that I could somehow use Sleuthkit’s command line tools to retrieve the files which were not on bad clusters and recover the filenames from the inode. As it turns out, this wasn’t such a bad idea!

There are 3 tools which proved useful:

  • ils
  • ffind
  • icat

ils “lists inode information” from the image, ffind “finds the name of the file or directory using the given inode” and icat “outputs the content of the file based on its inode number”. Using these three tools and a bit of shell, we can grab a list of inodes, get the filename from the metadata, create the directory structure beneath it, extract the file content, and move on to the next.

So for this task I knocked up the following (really ugly, potentially unsafe) script:

#!/bin/sh
# Rebuild files from a list of inode numbers (one per line in /tmp/inodes).
# ffind gives the original path, icat dumps the content under /mnt/out.
TSK=/KNOPPIX/usr/local/sleuthkit-2.09/bin
IMAGE=/dev/hda1

while read -r inode ; do
	# Ask ffind once for the path; skip inodes it cannot name.
	INODEDIR=$("$TSK/ffind" "$IMAGE" "$inode") || continue
	echo "INODE: $inode"

	REALDIR=/mnt/out$(dirname "$INODEDIR")
	FILENAME="/mnt/out$INODEDIR"
	mkdir -p "$REALDIR"

	echo "FILENAME: $FILENAME"
	"$TSK/icat" "$IMAGE" "$inode" > "$FILENAME"

	# Crude directory guess: anything du reports as a single block is
	# assumed to be a directory and is replaced with one.
	if [ "$(du "$FILENAME" | awk '{print $1}')" = 1 ]
	then
		rm "$FILENAME"
		mkdir -p "$FILENAME"
	fi
	echo ""
done < /tmp/inodes

Really, I do warn you, take serious care running this!

It needs a lot of work, but enough is there for it to function. It reads a file of inode numbers (one per line) and uses ffind to get the filename. We extract the path, attempt to create it, write out the file content and (this is important) take a wild guess at whether the inode was a directory. Please note this is wildly inaccurate and needs serious rethinking! Currently we look at the size du reports, and assume that only directories come out as 1. Since du reports disk usage in blocks rather than bytes, a small file can easily be mistaken for a directory.
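One possible way to rethink the directory guess is to use the mode that ils already reports instead of the size on disk. The following is only a sketch: the function name classify_inodes is made up for illustration, and it assumes that the ils header row names a st_mode column and that the mode is printed in octal (so directories look like 40755). Check both against the output of your own Sleuthkit version before trusting it.

```sh
#!/bin/sh
# classify_inodes (hypothetical helper): read ils-style, pipe-delimited
# output on stdin and print "<inode> dir" or "<inode> file" per data row.
classify_inodes() {
    awk -F '|' '
        # The header row names the columns; remember where each one sits.
        $1 == "st_ino" {
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        # Data rows start with a numeric inode; everything else is skipped.
        ("st_mode" in col) && $1 ~ /^[0-9]+$/ {
            mode = $(col["st_mode"])
            # ASSUMPTION: octal mode. S_IFDIR is 040000, so a directory
            # is a 4 followed by four octal digits (e.g. 40755).
            kind = (mode ~ /^0?4[0-7][0-7][0-7][0-7]$/) ? "dir" : "file"
            print $1, kind
        }
    '
}
```

Something like `ils -a /dev/hda1 | classify_inodes` would then give a list the extraction loop could consult before deciding between mkdir and icat.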

We can populate a file with inode numbers like so:

ils -a /dev/hda1 | awk -F '|' '{print $1}' > /tmp/inodes

(Users of Helix will need to use the full pathname to ils as in the above script).
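As far as I can tell, ils prints a few header rows before the inode rows, so the list can end up with some non-numeric entries at the top. The script simply skips them when ffind fails, but a slightly stricter filter keeps the list clean. This is a sketch only; inode_numbers is a name invented here.

```sh
#!/bin/sh
# inode_numbers (hypothetical helper): from pipe-delimited input, print the
# first field of every row whose first field is purely numeric.
inode_numbers() {
    awk -F '|' '$1 ~ /^[0-9]+$/ { print $1 }'
}
```

It drops into the same pipeline: `ils -a /dev/hda1 | inode_numbers > /tmp/inodes`.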

At some point (no guarantees when) I’ll tidy up the script and make it more bulletproof. In the meantime, I hope this saves some data!

Remember: no matter how much data you have, it’s always better to have two mirrored hard drives of half the size than one large, expensive drive. They will die unexpectedly! When you next buy a bigger hard drive, consider this: 1x500GB drive will lose you 500GB of data. 2x250GB, mirrored, will with 99.9% probability lose you nothing. So if you’re on a tight budget, buy two smaller drives. If you’ve a lot of money, buy two big ones.

Oh, and always make regular backups. Cheap USB drives are good for this!

Full, recoverable, DVD server backups - June 20th, 2008

If you’re responsible for one or more Linux servers, you may be interested in the lesser-known tool MondoRescue. (They provide packages for Linux distributions only, but it may be easily ported to other Unixes.)

It’s essentially nothing more (but it is a lot!) than a front end to several other GPL’ed tools, including bzip2, growisofs and busybox. To cut a long story short, it builds a set of CDs, DVDs or USB sticks with a tiny bootable Linux distribution based on your server’s kernel, which can restore parts or all of your filesystem from tarballs on the media. It can also back up to tape drives, NFS mounts or a local filesystem.

It has (imo) a nice, clean ncurses interface; and it’s quick to use.  On an AMD K6 clocked at 350MHz, with 60MB free memory, it took just under 6 hours to compress 8.5GB of data down to a single DVD (4.7GB).  Doing the backup with an “average” compression took far less time (around an hour), but would have eaten up several DVDs.

You’re presented with a wide variety of options when you boot up the produced rescue media. You can “nuke” the system and start from scratch, or just restore the parts you want. It can handle new disk geometry, and because the entire rescue system is burnt onto the media (assuming you used CDs, DVDs or USB storage), it’ll work even if the target machine is clean.

Smart mailboxes in Mail - June 14th, 2008

Today I discovered an amazingly useful feature in Mail (for Mac). You can create so-called “smart” mailboxes, based on any or all of several criteria including the from address, subject and message body.

The immediate use which occurred to me for this is to keep a handle on how many of those infuriating Facebook notifications I receive. By creating a smart mailbox for all such messages, I can easily clean all the outdated junk out of my inbox.

gpart - February 16th, 2008

So we’ve all done it at some point. Bye bye beautiful partition table. Bye bye 20 users’ worth of mail. Bye bye rest of my weekend doing anything enjoyable.

I just had to blog this, I was that impressed! gpart scans your hard drive for likely partitions, and offers to write the guessed MBR back to the HD. For me it worked first time without any hackery. To say I’m relieved would be a massive understatement!

So if you’ve just lost a partition, give gpart a go. Remember, always make a backup of the HD before doing it (dd if=/dev/xx of=/somewheresafe), and always have a backup system in place for lost data. The most I would have lost would have been emails received since 3.00am this morning.
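For the backup step, here is a minimal sketch that wraps the dd call and checks that the copy matches before you let anything write to the disk. The function name backup_image and the block size are my own choices for illustration, not anything gpart asks for.

```sh
#!/bin/sh
# backup_image (hypothetical helper): copy SRC to DST with dd, then compare
# the two byte for byte. Returns non-zero if the copy fails or differs.
backup_image() {
    src=$1
    dst=$2
    # bs is given in plain bytes so the same line works with GNU and BSD dd.
    dd if="$src" of="$dst" bs=65536 2>/dev/null || return 1
    cmp -s "$src" "$dst"
}
```

Usage would look like `backup_image /dev/xx /somewheresafe && echo "image verified"`, with the destination on a different disk that has enough free space.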

IE7 glitches galore - August 22nd, 2007

For anyone who claims that IE7 is “the best browser there is” (Davey…), this might just put it in a different light.

Go to http://www.plfc.org.uk/ and look at the first link on the page, “FIEC”. Notice (in IE7) that there’s a little gap after it, like there’s a space there. Well, look at it in Firefox, Opera, Konqueror or Safari and you’ll see a (correctly rendered) image indicating an external link (as used on Wikipedia). Now look (in IE7) on the Missionary Support page and you’ll see what it should look like on the home page. The CSS is completely valid (I set a padding-right and a background-image, it’s the same CSS behind both pages), yet IE7 insists on not displaying the image on the home page. Why? Because it’s full of bugs. Is Firefox? Yes. But they get fixed quickly and regularly. IE7? Every webmaster has to just hack their way around them.

Go, go now and download Firefox.

Update: I just had a look in IE5, and the links render completely wrong. However, the home page one does work. Interesting. I then had a look at Wikipedia to see how they get around these problems, and I noticed that in IE5 the links don’t render the external icon at all. A quick look at the source, and I notice they have a bunch of IF statements (conditional comments) in the HTML, each one pointing a particular IE version at alternative CSS “fixes”. Seems a good idea, so I’ll ditto it.

If you’re too lazy to go find the HTML yourself, Wikipedia uses this:

 <!--[if lt IE 5.5000]><style type="text/css">@import "/skins-1.5/monobook/IE50Fixes.css?90";</style><![endif]-->
 <!--[if IE 5.5000]><style type="text/css">@import "/skins-1.5/monobook/IE55Fixes.css?90";</style><![endif]-->
 <!--[if IE 6]><style type="text/css">@import "/skins-1.5/monobook/IE60Fixes.css?90";</style><![endif]-->
 <!--[if IE 7]><style type="text/css">@import "/skins-1.5/monobook/IE70Fixes.css?90";</style><![endif]-->
 <!--[if lt IE 7]><script type="text/javascript" src="/skins-1.5/common/IEFixes.js?90"></script>
 <meta http-equiv="imagetoolbar" content="no" /><![endif]-->

Reverse proxying with FreeBSD - July 10th, 2007

Being the geek I am, I have a nasty habit of setting something up and then thinking “OK, so it runs, but what would happen if…”, and I then go on to think about the most unlikely, impossible, worst-case scenario and try to figure a way around it.

The latest case of this occurred 2 days ago, when I sat there looking at hedwig’s memory usage (our web/mail/everything-else-you-can-think-of server) and it occurred to me “ouch, she’s pretty maxed out”. More RAM of course would be a solution, but it’d only be a solution up to a given point, beyond which I’d once again be stuffed.

So my train of thought was “how can I lessen the load on hedwig without performing any hefty modifications to her”. The solution was simpler than I first imagined: just throw a transparent reverse proxy in front!

So I’ve just finished configuring doriath (LOTR location, home to many Elves?) and I’m just awaiting the completion of squid’s compilation. doriath has no less than two network cards, one to join itself to the gateway (mordor) and one to pass traffic onto hedwig. So it’s effectively a very big, very powerful network switch. Well more precisely, a managed network switch.

My aim is to get squid running on doriath and set up some routing rules to redirect traffic destined for hedwig (which will be coming in on rl0). Rather than passing it straight on to rl1 (as is the default now), doriath will instead pump it into lo on port 3128 (or whatever it is squid uses, I forget).
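To make the idea concrete, here is roughly what such a redirect could look like if the firewall on doriath were ipfw. That is an assumption on my part, as are the rule number 100 and the example address 192.0.2.10 standing in for hedwig; 3128 is squid’s default http_port. The helper only prints the command so it can be read over before being run as root, and ipfw’s fwd action has historically needed forwarding support built into the kernel.

```sh
#!/bin/sh
# squid_redirect_rule (hypothetical helper): print, but do not run, an ipfw
# rule that hands inbound web traffic for one host to a local squid.
#   $1 = web server address (hedwig), $2 = inbound interface, $3 = squid port
squid_redirect_rule() {
    web_ip=$1
    in_if=${2:-rl0}
    squid_port=${3:-3128}
    # ASSUMPTION: ipfw is the firewall in use; rule number 100 is arbitrary.
    echo "ipfw add 100 fwd 127.0.0.1,${squid_port} tcp from any to ${web_ip} 80 in via ${in_if}"
}
```

Calling `squid_redirect_rule 192.0.2.10` prints the rule for the defaults above. Squid itself would still need configuring to accept intercepted traffic and to know where hedwig lives.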

My hope is that doriath will be a drop-in addition to the network, silently proxying all HTTP traffic for hedwig. It’s also running an FTP server to serve up any common static content (perhaps sermons, for example) to again lessen hedwig’s load. The solution is in my opinion elegant because I can at any point simply remove doriath and place a cable directly between hedwig and mordor, and everything will continue as per normal, just minus FTP and transparent proxying.

Might even hold up to a small slashdotting 😉