<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>wally's blog &#187; Uncategorized</title>
	<atom:link href="http://matt.matzi.org.uk/category/uncategorized/feed/" rel="self" type="application/rss+xml" />
	<link>http://matt.matzi.org.uk</link>
	<description>Delving deep into the mind of me</description>
	<lastBuildDate>Mon, 30 Jan 2012 21:20:07 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.1</generator>
		<item>
		<title>Argos</title>
		<link>http://matt.matzi.org.uk/2012/01/30/argos/</link>
		<comments>http://matt.matzi.org.uk/2012/01/30/argos/#comments</comments>
		<pubDate>Mon, 30 Jan 2012 21:20:07 +0000</pubDate>
		<dc:creator>wally</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://matt.matzi.org.uk/?p=210</guid>
		<description><![CDATA[&#8220;It&#8217;s like playing Bingo, only you win what you&#8217;ve already paid for!&#8221; &#8211; Michael McIntyre]]></description>
			<content:encoded><![CDATA[<p>&#8220;It&#8217;s like playing Bingo, only you win what you&#8217;ve already paid for!&#8221;</p>
<p>&#8211; Michael McIntyre</p>
]]></content:encoded>
			<wfw:commentRss>http://matt.matzi.org.uk/2012/01/30/argos/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>MySQL migrations and generating Zabbix Cisco templates</title>
		<link>http://matt.matzi.org.uk/2011/12/07/mysql-migrations-and-zabbix-cisco-templates/</link>
		<comments>http://matt.matzi.org.uk/2011/12/07/mysql-migrations-and-zabbix-cisco-templates/#comments</comments>
		<pubDate>Wed, 07 Dec 2011 10:03:05 +0000</pubDate>
		<dc:creator>wally</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[cisco]]></category>
		<category><![CDATA[scripting]]></category>
		<category><![CDATA[Work]]></category>
		<category><![CDATA[zabbix]]></category>

		<guid isPermaLink="false">http://matt.matzi.org.uk/?p=207</guid>
		<description><![CDATA[What a week. What a month. What a year. So much has happened, far more than I could or should ever write about. Recently we&#8217;ve been migrating MySQL servers from one hardware to another &#8211; for anyone (me!) that finds this useful in the future, firstly &#8220;Hello, and I hope the future is better than [...]]]></description>
			<content:encoded><![CDATA[<p>What a week.  What a month.  What a year.</p>
<p>So much has happened, far more than I could or should ever write about.</p>
<p>Recently we&#8217;ve been migrating MySQL servers from one hardware to another &#8211; for anyone (me!) that finds this useful in the future, firstly &#8220;Hello, and I hope the future is better than the current &#8216;past&#8217;!&#8221;, and secondly &#8220;I hope this does what you need&#8221;:</p>
<p><code>#!/bin/sh<br />
HOST="$1"<br />
USER="$2"<br />
PASS="$3"<br />
mysql -h "$HOST" -u "$USER" -p"$PASS" -e \<br />
  "SELECT CONCAT(\"SHOW GRANTS FOR '\", \<br />
user,\"'@'\",host,\"'\;\") FROM mysql.user;" -B -N | \<br />
mysql -h "$HOST" -u "$USER" -p"$PASS" | egrep -v "^Grants"</code></p>
<p>That small shell script accepts three arguments: HOSTNAME, USERNAME and PASSWORD.  Executing it will output the &#8220;GRANT&#8221; statements (the result of &#8220;SHOW GRANTS&#8221;) for every user on the MySQL instance.</p>
<p>While migrating the MySQL servers, we took the opportunity to set up a new monitoring solution &#8211; Zabbix.  I&#8217;ve used it a couple of times in the past, and in my most humble opinion its failing is its flexibility.  It&#8217;s just so flexible!</p>
<p>We have a fair number of Cisco switches and routers at our disposal here in the office, and setting up Zabbix templates for each is not high on my bucket list.  Why do something by hand that can be done automagically by a computer?</p>
<p><a href="http://matzi.org.uk/mygiftstotheworld/generate-cisco-template">Thus this (very hacky, PHP) script which automatically generates Cisco (SNMP) templates for Zabbix.</a></p>
<p>Simply run &#8220;show snmp mib ifmib ifindex&#8221; on your Cisco device of choice, and shove the output into the script&#8217;s STDIN.  Hey-presto, out comes a Zabbix template!</p>
<p>Modifying the default options in the template is very easy; just edit what you like after line 34.</p>
]]></content:encoded>
			<wfw:commentRss>http://matt.matzi.org.uk/2011/12/07/mysql-migrations-and-zabbix-cisco-templates/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Samba 3, Windows 7 and Domain Admins</title>
		<link>http://matt.matzi.org.uk/2011/07/07/samba-3-windows-7-and-domain-admins/</link>
		<comments>http://matt.matzi.org.uk/2011/07/07/samba-3-windows-7-and-domain-admins/#comments</comments>
		<pubDate>Thu, 07 Jul 2011 14:56:13 +0000</pubDate>
		<dc:creator>wally</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://matt.matzi.org.uk/?p=205</guid>
		<description><![CDATA[*sigh* For several weeks I&#8217;ve been trying (and failing) to acquire Domain Admin status on our Samba domain. With Windows 2000 and XP, administrative access was easy.  A local admin account called &#8220;administrator&#8221;, and a local policy stating who&#8217;s an administrator &#8211; all peachy. You can tell I&#8217;m not a Windows administrator [...]]]></description>
			<content:encoded><![CDATA[<p>*sigh*</p>
<p>For several weeks I&#8217;ve been trying (and failing) to acquire Domain Admin status on our Samba domain.</p>
<p>With Windows 2000 and XP, administrative access was easy.  A local admin account called &#8220;administrator&#8221;, and a local policy stating who&#8217;s an administrator &#8211; all peachy.</p>
<p>You can tell I&#8217;m not a Windows administrator by job!  <em>Obviously</em> domain admins is a far cleaner way to go.</p>
<p>So I dip my toe into Windows 7 on a Samba 3 domain.</p>
<p>It works well.  <em>Too</em> well.  There&#8217;s a small issue with the workstations claiming &#8220;there&#8217;s no logon servers available to handle the request&#8221; for 60+ seconds from a cold-start, but I think that&#8217;s solvable (I suspect WINS, NetBIOS or DNS is failing to warm up in a timely manner).  The domain admins however eluded me.</p>
<p>&#8220;So it&#8217;s OK&#8221; I think to myself.  &#8220;Login as the local administrator &#8211; set some local policies up.  &#8230;  Oh wait, Windows 7 disabled those because I joined the domain and didn&#8217;t first give them passwords.  Sheesh!  Joining the domain admin group it is then.&#8221;</p>
<p>Well I searched the Internet and ripped my hair out.  Week after week the issue prevailed.  Windows 7 wouldn&#8217;t obey the &#8220;Domain Admin&#8221;.  &#8220;Elevated access required.&#8221;</p>
<p>Finally I trip across this: <a href="http://fixunix.com/smb/64004-domain-admin-group-samba-3-a.html">http://fixunix.com/smb/64004-domain-admin-group-samba-3-a.html</a></p>
<p>The second guy mentions that &#8220;Domain Admins&#8221; must be group ID 512 (in the Windows mapping) &#8211; and blam.</p>
<p>One Samba restart later &#8230; all works.</p>
<p>Check the documentation out at <a href="http://www.samba.org/samba/docs/man/manpages-3/net.8.html">http://www.samba.org/samba/docs/man/manpages-3/net.8.html</a> &#8230; specifically you&#8217;re looking for something like this:</p>
<p><code>net groupmap add rid=512 unixgroup=MYUNIXGROUPHERE type=domain</code></p>
]]></content:encoded>
			<wfw:commentRss>http://matt.matzi.org.uk/2011/07/07/samba-3-windows-7-and-domain-admins/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Playstation Network nearly back online &#8211;</title>
		<link>http://matt.matzi.org.uk/2011/05/15/playstation-network-nearly-back-online/</link>
		<comments>http://matt.matzi.org.uk/2011/05/15/playstation-network-nearly-back-online/#comments</comments>
		<pubDate>Sun, 15 May 2011 20:48:46 +0000</pubDate>
		<dc:creator>wally</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://matt.matzi.org.uk/?p=195</guid>
		<description><![CDATA[As I sit here typing, the very Playstation Network (PSN) service which has received so much flak over the last month is finally approaching usability once more. I would like to say a huge thank-you to the guys at PSN, there&#8217;s no question over how hard you&#8217;ve been working to get things this far. Despite [...]]]></description>
			<content:encoded><![CDATA[<p>As I sit here typing, the very Playstation Network (PSN) service which has received so much flak over the last month is finally approaching usability once more.</p>
<p>I would like to say a huge thank-you to the guys at PSN, there&#8217;s no question over how hard you&#8217;ve been working to get things this far.  Despite how many will disagree with me, I&#8217;d like to thank the non-tech guys at Sony too, for your honesty and speed with dealing with things.</p>
<p>I work at a PCI-DSS regulated company myself, as a programmer, and don&#8217;t have 1% of the customers PSN does.  We also don&#8217;t provide a service that millions of people rely on &#8220;real-time&#8221; for entertainment purposes, nor which they expect to find working 24/7.</p>
<p>You guys have done a good job, please keep it up.</p>
<p>This could have happened to anyone, although it&#8217;s always easy to point the finger when you&#8217;re the ignorant customer.</p>
]]></content:encoded>
			<wfw:commentRss>http://matt.matzi.org.uk/2011/05/15/playstation-network-nearly-back-online/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Kate and Will &#8211; they made it!</title>
		<link>http://matt.matzi.org.uk/2011/04/29/kate-and-will-they-made-it/</link>
		<comments>http://matt.matzi.org.uk/2011/04/29/kate-and-will-they-made-it/#comments</comments>
		<pubDate>Fri, 29 Apr 2011 10:41:32 +0000</pubDate>
		<dc:creator>wally</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://matt.matzi.org.uk/?p=192</guid>
		<description><![CDATA[Yeah yeah, wasn&#8217;t going to watch it and all that &#8211; but I finally broke. Don&#8217;t they just look great]]></description>
			<content:encoded><![CDATA[<p>Yeah yeah, wasn&#8217;t going to watch it and all that &#8211; but I finally broke.</p>
<p>Don&#8217;t they just look great <img src='http://matt.matzi.org.uk/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p><a href="http://matt.matzi.org.uk/wp-content/uploads/2011/04/katewill.jpg"><img class="aligncenter size-full wp-image-193" title="katewill" src="http://matt.matzi.org.uk/wp-content/uploads/2011/04/katewill.jpg" alt="" width="500" height="280" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://matt.matzi.org.uk/2011/04/29/kate-and-will-they-made-it/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Validating email addresses by SMTP in realtime</title>
		<link>http://matt.matzi.org.uk/2011/04/06/validating-email-addresses-by-smtp-in-realtime/</link>
		<comments>http://matt.matzi.org.uk/2011/04/06/validating-email-addresses-by-smtp-in-realtime/#comments</comments>
		<pubDate>Wed, 06 Apr 2011 08:31:33 +0000</pubDate>
		<dc:creator>wally</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://matt.matzi.org.uk/?p=181</guid>
		<description><![CDATA[Validation of email addresses.  What a painful subject. About a year ago I looked into the possibility of using SMTP to validate email addresses.  As it turns out, it&#8217;s not perfect (a lot of false-positives) but it&#8217;s very reliable (I&#8217;ve failed to get any false-negatives). The following code outlines how this works (written in PHP, [...]]]></description>
			<content:encoded><![CDATA[<p>Validation of email addresses.  What a painful subject.</p>
<p>About a year ago I looked into the possibility of using SMTP to validate email addresses.  As it turns out, it&#8217;s not perfect (a lot of false-<em>positives</em>) but it&#8217;s very reliable (I&#8217;ve failed to get any false-<em>negatives</em>).</p>
<p>The following code outlines how this works (written in PHP, because I was thinking about it in the context of web-apps):</p>
<pre lang="PHP" style="font-size: 0.8em;">&lt;?php
function validate_email_regex ($email) {
   return preg_match ('/\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b/i', $email);
}

function validate_email_smtp ($email) {
   if (!validate_email_regex ($email)) return false;
   list ($user, $domain) = explode ('@', $email);  // split() was removed in PHP 7

   if (!getmxrr ($domain, $mxhosts)) return false;
   $i = 0;
   $fp = false;
   $thishost = isset ($_SERVER['SERVER_NAME']) ? $_SERVER['SERVER_NAME'] : exec ("hostname");
   foreach ($mxhosts as $mxhost) {
      if (++$i &gt; 3) break;    // don't check more than 3 MX records
      if ($fp) fclose ($fp);  // cleanup from last iteration

      $fp = fsockopen ("tcp://$mxhost", 25, $errno, $errstr, 10);
      if (!$fp) continue;
      if (!$str = fgets ($fp)) continue;
      if (substr ($str, 0, 3) != '220') continue;

      fwrite ($fp, "HELO $thishost\r\n");
      $str = fgets ($fp);
      if (substr ($str, 0, 3) != '250') continue;

      fwrite ($fp, "MAIL FROM:&lt;mail@$thishost&gt;\r\n");
      $str = fgets ($fp);
      if (substr ($str, 0, 3) != '250') continue;

      fwrite ($fp, "RCPT TO:&lt;$email&gt;\r\n");
      $str = fgets ($fp);
      if (substr ($str, 0, 3) != '250') {
         fclose ($fp);
         return false;  /* don't proceed with other MX records because of a bad user */
      }

      fclose ($fp);
      break;
   }
   return true;  // also reached if no MX host could be queried, so "unknown" counts as valid
}
?&gt;</pre>
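<p>A minimal usage sketch, assuming the functions above are already loaded and that the machine running it can reach remote mail servers on port 25 (many hosting providers block this).  The address is a made-up placeholder, not one from any real test:</p>
<pre lang="PHP" style="font-size: 0.8em;">&lt;?php
// Hypothetical address, used purely for illustration.
$email = 'someone@example.com';

if (validate_email_smtp ($email)) {
   // Either a mail server accepted the recipient, or none could be asked.
   echo "$email looks deliverable\n";
} else {
   // Bad syntax, no MX records, or a mail server rejected the recipient.
   echo "$email was rejected\n";
}
?&gt;</pre>
<p>Because a &#8220;true&#8221; can also mean &#8220;could not check&#8221;, this is probably best treated as a filter for obviously bad addresses rather than proof that a mailbox exists.</p>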
]]></content:encoded>
			<wfw:commentRss>http://matt.matzi.org.uk/2011/04/06/validating-email-addresses-by-smtp-in-realtime/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>SVN hooks &#8211; receiving HTML formatted emails when SVN commits are made</title>
		<link>http://matt.matzi.org.uk/2011/04/05/svn-hooks-receiving-html-formatted-emails-when-svn-commits-are-made/</link>
		<comments>http://matt.matzi.org.uk/2011/04/05/svn-hooks-receiving-html-formatted-emails-when-svn-commits-are-made/#comments</comments>
		<pubDate>Tue, 05 Apr 2011 08:06:30 +0000</pubDate>
		<dc:creator>wally</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://matt.matzi.org.uk/?p=175</guid>
		<description><![CDATA[While setting up Trac as our company source-code repository yesterday, the question arose of &#8220;What about cute HTML emails when SVN commits are made?&#8221;. Hmmmm.  Thank goodness for SVN hooks &#8211; a wonderfully understated feature of subversion! Go to your subversion directory (that created by &#8220;svnadmin create&#8221;), and enter the &#8220;hooks&#8221; directory. Create a file, executable [...]]]></description>
			<content:encoded><![CDATA[<p>While setting up Trac as our company source-code repository yesterday, the question arose of &#8220;What about cute HTML emails when SVN commits are made?&#8221;.</p>
<p>Hmmmm.  Thank goodness for SVN hooks &#8211; a wonderfully understated feature of subversion!</p>
<p>Go to your subversion directory (that created by &#8220;svnadmin create&#8221;), and enter the &#8220;hooks&#8221; directory.</p>
<p>Create a file, executable and owned by your webserver, called &#8220;post-commit&#8221;:</p>
<pre>touch post-commit
sudo chown www-data:www-data post-commit     # RH and other distros will want apache:apache here
chmod +x post-commit</pre>
<p>Enter the following as its content (this is a fairly self-explanatory shell script):</p>
<pre>
<div id="_mcePaste">#!/bin/sh
set -e
REPOS="$1"
REV="$2"
TO="email@address.com"
FROM="email@address.com"
BY=`svnlook author $REPOS -r$REV`
MSG=`svnlook log $REPOS -r$REV`
CHANGED=`svnlook changed $REPOS -r$REV`
DIFF=`svnlook diff $REPOS -r$REV | pygmentize -l diff -f html`
STYLE="borland"
CSS=`pygmentize -S $STYLE -f html`
echo "To: $TO\nFrom: $FROM\nSubject: SVN commit r$REV - ($BY) \"$MSG\"\nContent-Type: text/html; charset=us-ascii\n\n&lt;html&gt;\n&lt;style type='text/css'&gt;\n$CSS&lt;/style&gt;\n&lt;body&gt;\n&lt;p&gt;&lt;a href='https://your.trac.url.com/trac/changeset/$REV/'&gt;View changeset r$REV on Trac&lt;/a&gt;&lt;/p&gt;\n&lt;h3&gt;Files changed:&lt;/h3&gt;&lt;pre&gt;$CHANGED&lt;/pre&gt;\n&lt;h3&gt;Changes made:&lt;/h3&gt;&lt;p&gt;$DIFF&lt;/p&gt;\n&lt;/body&gt;\n&lt;/html&gt;" | /usr/sbin/sendmail -t</div>
</pre>
<p>You&#8217;ll need to apt-get or yum install Pygments (the package is named python-pygments on Debian/Ubuntu), which provides the pygmentize command, to get the cool diff colouring.</p>
<p>And that&#8217;s it!  (The &#8220;sendmail -t&#8221; reference works with Exim on Debian Squeeze; your mileage, as always, may vary.)</p>
]]></content:encoded>
			<wfw:commentRss>http://matt.matzi.org.uk/2011/04/05/svn-hooks-receiving-html-formatted-emails-when-svn-commits-are-made/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Improving CoovaAP portal pages for iPhone and Android</title>
		<link>http://matt.matzi.org.uk/2011/03/19/improving-coovaap-portal-pages-for-iphone-and-android/</link>
		<comments>http://matt.matzi.org.uk/2011/03/19/improving-coovaap-portal-pages-for-iphone-and-android/#comments</comments>
		<pubDate>Sat, 19 Mar 2011 10:49:07 +0000</pubDate>
		<dc:creator>wally</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://matt.matzi.org.uk/?p=171</guid>
		<description><![CDATA[Over the last couple of weeks, I&#8217;ve been spending my spare time playing with the Coova AP (ChilliSpot) wireless hotspot service on a WRT54GL. The main problem I&#8217;ve had with it is the portal pages not looking so good on mobile devices (particularly iPhone and Androids). A little bit of Googling revealed a solution (at [...]]]></description>
			<content:encoded><![CDATA[<p>Over the last couple of weeks, I&#8217;ve been spending my spare time playing with the Coova AP (ChilliSpot) wireless hotspot service on a WRT54GL.</p>
<p>The main problem I&#8217;ve had with it is the portal pages not looking so good on mobile devices (particularly iPhone and Androids).</p>
<p>A little bit of Googling revealed a solution (at least, a solution for websites in general): insert the following meta tag into the HTML header.</p>
<pre>&lt;meta name="viewport" content="width=320, initial-scale=1.0, maximum-scale=1.0, user-scalable=0" /&gt;</pre>
<p>In the CoovaAP configuration interface, you can apply this for every portal page by adding it to the &#8220;HotSpot&#8221;/&#8220;Portal&#8221;/&#8220;HTML Title&#8221;.</p>
]]></content:encoded>
			<wfw:commentRss>http://matt.matzi.org.uk/2011/03/19/improving-coovaap-portal-pages-for-iphone-and-android/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Ebuyer &#8211; maxing out the result limit</title>
		<link>http://matt.matzi.org.uk/2011/02/22/ebuyer-maxing-out-the-result-limit/</link>
		<comments>http://matt.matzi.org.uk/2011/02/22/ebuyer-maxing-out-the-result-limit/#comments</comments>
		<pubDate>Tue, 22 Feb 2011 12:02:51 +0000</pubDate>
		<dc:creator>wally</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://matt.matzi.org.uk/?p=168</guid>
		<description><![CDATA[This might be useful to someone, somewhere, out there. Those who&#8217;re into tinkering with things they shouldn&#8217;t will notice that Ebuyer&#8217;s search pages include a &#8220;limit&#8221; GET variable in the results page URL. The upper limit appears to be 30, but going negatively causes a different kettle of fish: http://www.ebuyer.com/search?q=a%25&#038;x=0&#038;y=0&#038;limit=-3000 (That URL is not for [...]]]></description>
			<content:encoded><![CDATA[<p>This might be useful to someone, somewhere, out there.</p>
<p>Those who&#8217;re into tinkering with things they shouldn&#8217;t will notice that Ebuyer&#8217;s search pages include a &#8220;limit&#8221; GET variable in the results page URL.  The upper limit appears to be 30, but going negatively causes a different kettle of fish:<br />
   <a href="http://www.ebuyer.com/search?q=a%25&#038;x=0&#038;y=0&#038;limit=-3000">http://www.ebuyer.com/search?q=a%25&#038;x=0&#038;y=0&#038;limit=-3000</a><br />
(That URL is not for the faint-hearted!)</p>
<p>It makes one wonder, is there a little bit of sanitization fail?</p>
]]></content:encoded>
			<wfw:commentRss>http://matt.matzi.org.uk/2011/02/22/ebuyer-maxing-out-the-result-limit/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Cloud-Bible progress</title>
		<link>http://matt.matzi.org.uk/2011/02/06/cloud-bible-progress/</link>
		<comments>http://matt.matzi.org.uk/2011/02/06/cloud-bible-progress/#comments</comments>
		<pubDate>Sun, 06 Feb 2011 16:56:33 +0000</pubDate>
		<dc:creator>wally</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://matt.matzi.org.uk/?p=164</guid>
		<description><![CDATA[Two blogs in one weekend. Impressive for me! I&#8217;ve made a lot of progress with the Cloud-Bible project (the name is growing on me, slowly). Switching from the simpler-to-follow (but very assuming) big &#8220;switch&#8221; clause which was previously handling reverse-engineered tags (oh, I didn&#8217;t mention the XML I have is not precise OSIS format did [...]]]></description>
			<content:encoded><![CDATA[<p>Two blogs in one weekend.  Impressive for me!</p>
<p>I&#8217;ve made a lot of progress with the Cloud-Bible project (the name is growing on me, slowly).</p>
<p>Switching from the simpler-to-follow (but very assuming) big &#8220;switch&#8221; statement which was previously handling reverse-engineered tags (oh, I didn&#8217;t mention the XML I have is not precise OSIS format did I? <img src='http://matt.matzi.org.uk/wp-includes/images/smilies/icon_sad.gif' alt=':(' class='wp-smiley' /> ) to the more complex (but elegant?) recursive-function method brought more robust operation (a good thing &#8211; variances in the XML layout are handled now) but also greatly decreased performance (a bad thing &#8211; the average execution time went from 0.008s up to 0.014s).</p>
<p>I&#8217;ve tried a lot of ideas to increase the performance again, including passing everything by reference (so the stack isn&#8217;t filled with copies of SimpleXML objects) &#8211; all to no avail.</p>
<p>I did however have (some) victory by reworking this:</p>
<p>This is slower by 0.000006s per iteration than the code in use below.<br />
<code><br />
function process_xml ($xml, $children=false) {<br />
      if ($children) foreach ($xml->children () as $x) process_xml ($x);<br />
      else {<br />
          // ... processing here<br />
</code><br />
I wasn&#8217;t sure from the start that the additional recursion (in the foreach()) was a good thing, and my trial-and-error agreed:<br />
<code><br />
function process_xml ($xml, $children=false) {<br />
      $to_process = ($children)?$xml->children ():array ($xml);<br />
      foreach ($to_process as $x) {<br />
          // ... processing here<br />
</code></p>
<p>Surprisingly (to me) the latter performed faster. ~0.000006s per iteration to be precise.  Not much, but it adds up (larger XML files can reach 1000 or more iterations, recovering around half the performance lost earlier!)</p>
<p>The jury (in my mind) is therefore still out on whether to handle the XML on each request (allowing users to turn features on and off, and include user-specific markup like highlighting etc) or whether to build a cache of HTML files alongside the XML counterparts.  (Certainly, the latter would be faster.  And given the Bible isn&#8217;t expected to be changing any time soon&#8230;)</p>
<p>Some of the changes introduced include:</p>
<ul>
<li>Red-letter (Jesus&#8217; words in red)</li>
<li>Handling of quotation marks (IE doesn&#8217;t conform to the standards, but beside that the XML I use is UTF-8 encoded and provides sets of quotation marks as needed)</li>
<li>(Corrected) handling of references, variances etc.  Before the recursive code was introduced, I was missing any &#8220;notes&#8221; which bundled more than one reference up in them.</li>
</ul>
<p>So it&#8217;s progress.</p>
<p>While on the subject of performance, I&#8217;ve been trying to work out some targets.  How much memory will I allow this script to consume before it&#8217;s &#8220;too much&#8221;?  How fast must it execute to remain acceptable?</p>
<p>Currently the figures for Mark chapter 1 come in at:</p>
<ul>
<li>0.013s execution time</li>
<li>0.18MB peak memory usage</li>
</ul>
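<p>For anyone wanting to take similar measurements, here is a rough sketch using PHP&#8217;s built-in <code>microtime ()</code> and <code>memory_get_peak_usage ()</code>.  The <code>render_chapter ()</code> function and the tiny XML string are hypothetical stand-ins for the real work; they are not the project&#8217;s actual code:</p>
<pre lang="PHP">&lt;?php
// Hypothetical stand-in for the real XML-to-HTML processing.
function render_chapter ($xml_string) {
   $xml = simplexml_load_string ($xml_string);
   return count ($xml-&gt;children ());
}

$start = microtime (true);                      // wall-clock time, in seconds
render_chapter ('&lt;chapter&gt;&lt;verse/&gt;&lt;verse/&gt;&lt;/chapter&gt;');
$elapsed = microtime (true) - $start;
$peak_mb = memory_get_peak_usage () / 1048576;  // bytes to MB

printf ("%.3fs execution time, %.2fMB peak memory usage\n", $elapsed, $peak_mb);
?&gt;</pre>
<p>Calling <code>memory_get_usage ()</code> inside the processing loop instead would give a per-iteration figure rather than just the peak.</p>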
<p>Interestingly, memory usage jumped up hugely after a tiny change in the recursive function (I can&#8217;t even remember what it was now).  I thought about this for a moment at the time, and hypothesised that this really makes a lot of sense.  Each iteration is shoving data onto the stack, eating memory and generally doing a whole lot of &#8220;weight throwing&#8221;.  I went as far as jotting down my expectations of memory usage per iteration.  I expect it to look like a very wide but short kind of bell-curve, initially shooting up as XML depth is reached, then bobbing up and down, finally tailing rapidly off as the end is reached.</p>
<p>Due to my geeky nature, I recorded the memory usage on Mark 1, and went to graph it.  But I never got that far.  Just looking at the results was enough: an immediate shoot upward (9KB across the first 9 iterations) then a very slow increase (about 1.5KB every 20 iterations) thereafter.  And it never decreased.</p>
<p>PHP I believe uses a garbage collector, and presumably it considers my usage <em>so low</em> that it would have a negative effect to execute mid-script.  Fair play!</p>
<p>My concern really boils down to this: let&#8217;s assume that this project really takes off.  If we aim to handle 100 requests per second (that&#8217;s around 40 times what Wikipedia&#8217;s English wiki received on average in Q1 2003) &#8230; I want to service each request within 0.1s and not exceed 512MB memory on a dual-core Intel (comparable to what I&#8217;m running on right now).</p>
<p>I&#8217;m working on the grossly oversimplified (and wildly inaccurate) calculation:</p>
<p>MaxHits = Min((CpuCores/ExecutionTime),(MemoryLimit/PeakMemoryUsage))</p>
<p>Currently that stands at:</p>
<p>MaxHits = Min((2/0.013),(512/0.26)) = 154</p>
<p>So we&#8217;re in.  We&#8217;re limited hugely by the execution time.  I can afford to go up as far as 0.02s per request, but that&#8217;s it.</p>
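<p>As a sketch, the same back-of-the-envelope sum in PHP (the function name <code>max_hits ()</code> is made up for illustration; the inputs are the ones plugged into the calculation above):</p>
<pre lang="PHP">&lt;?php
// The rough estimate from above: the smaller of the CPU-bound and memory-bound terms.
function max_hits ($cpu_cores, $execution_time, $memory_limit, $peak_memory_usage) {
   return min ($cpu_cores / $execution_time, $memory_limit / $peak_memory_usage);
}

echo round (max_hits (2, 0.013, 512, 0.26)), "\n";  // current figures: the CPU term is the smaller one
echo round (max_hits (2, 0.02, 512, 0.26)), "\n";   // the same sum at 0.02s per request
?&gt;</pre>
<p>With these inputs the memory term is more than ten times larger than the CPU term, which is why execution time is the limit that matters here.</p>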
<p>We&#8217;ll see what happens!</p>
]]></content:encoded>
			<wfw:commentRss>http://matt.matzi.org.uk/2011/02/06/cloud-bible-progress/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

