<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Si Dawson . Com &#187; Code</title>
	<atom:link href="http://sidawson.com/category/code/feed" rel="self" type="application/rss+xml" />
	<link>http://sidawson.com</link>
	<description>Self Improving Software. Evolutionary Algorithms. Weak AI.</description>
	<lastBuildDate>Wed, 25 Jan 2012 05:02:22 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Find non-commented Python lines in Komodo</title>
		<link>http://sidawson.com/2012/01/find-non-commented-python-lines-in-komodo.html</link>
		<comments>http://sidawson.com/2012/01/find-non-commented-python-lines-in-komodo.html#comments</comments>
		<pubDate>Wed, 25 Jan 2012 05:00:45 +0000</pubDate>
		<dc:creator>Si</dc:creator>
				<category><![CDATA[Code]]></category>

		<guid isPermaLink="false">http://sidawson.com/?p=57</guid>
		<description><![CDATA[I&#8217;ve been doing a lot of large scale refactoring recently. This entails a lot of &#8220;find all instances of this and replace it with that&#8221; &#8211; in non-trivial ways (of course &#8211; any monkey can do a search &#38; replace). Obviously I also want to only bother with non-commented lines of code. I use Komodo [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been doing a lot of large scale refactoring recently.</p>
<p>This entails a lot of &#8220;find all instances of this and replace it with that&#8221; &#8211; in non-trivial ways (of course &#8211; any monkey can do a search &amp; replace).</p>
<p>Obviously I also want to only bother with non-commented lines of code.</p>
<p>I use <a href="http://www.activestate.com/komodo-ide" target="_blank">Komodo</a> for my Python coding, and while it&#8217;s a great IDE in a lot of ways, it would appear I&#8217;m the first coder that&#8217;s ever wanted to search only active lines of code (/sarcasm). Komodo does have a great regex search feature though, so I put that to use.</p>
<p>After much head scratching (every regex engine has its own delightful little quirks) I found this incantation:</p>
<blockquote><p>^\s*[^#\n]*find.*$</p></blockquote>
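<p>Which will find every line where &#8216;find&#8217; appears before any &#8216;#&#8217;, ie all single-line non-commented instances of &#8216;find&#8217;. (It can&#8217;t tell a &#8216;#&#8217; inside a string literal from the start of a comment, so those lines get skipped too.)</p>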
<p>Now, bugger typing that mess in every time I want to find something, so here&#8217;s a better way.</p>
<p>Go <em>View | Toolbox</em> (so the toolbox appears on the right hand side). Then right-click &amp; <em>&#8220;Add New Macro&#8221;</em>. Give it a sensible name and enter this into the main text area:</p>
<blockquote><p>if (komodo.view) {komodo.view.setFocus();}<br />
var search = ko.interpolate.interpolateStrings('%ask:SearchFor:');<br />
Find_FindAllInMacro(window, 0, '^\\s*[^#\\n]*' + search + '.*$', 2, 0, false, false);</p></blockquote>
<p>It has to be JavaScript &#8211; Komodo doesn&#8217;t offer the %ask functionality in Python macro scripting (nice one, guys).</p>
<p>Next give it a decent key-binding on the second tab. Click in the <em>&#8220;New Key Sequence&#8221;</em> box and hit a vulcan key combo that works for you &#8211; I&#8217;ve used Ctrl-Alt-F &#8211; followed by clicking <em>Add</em>.</p>
<p>Hit <em>OK</em> &amp; you&#8217;re ready to roll. Anytime you want to find non-commented lines of code, hit your key combo, type your search string and voila!</p>
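<p>For trying the pattern outside Komodo, here&#8217;s a minimal Python sketch of the same regex applied one line at a time. The function name <code>find_uncommented</code> and the sample text are made up for illustration, and <code>re.escape</code> is an addition (the macro above passes the search string through as raw regex).</p>

```python
import re

def find_uncommented(source, term):
    """Return (line_number, line) pairs where `term` appears before any '#'."""
    # Same shape as the Komodo pattern. re.escape is an assumption added here
    # so the term is matched literally rather than as a regex fragment.
    pattern = re.compile(r'^\s*[^#\n]*' + re.escape(term) + r'.*$')
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        if pattern.match(line):
            hits.append((number, line))
    return hits

sample = """\
result = find(items)
# find(items) is disabled
    total = count(items)  # find me later
    if find(x):  # trailing comment
"""

for number, line in find_uncommented(sample, "find"):
    print(number, line)
```

<p>Lines 1 and 4 of the sample are reported. Line 2 is a comment, and on line 3 the only &#8216;find&#8217; sits after the &#8216;#&#8217;.</p>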
]]></content:encoded>
			<wfw:commentRss>http://sidawson.com/2012/01/find-non-commented-python-lines-in-komodo.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>New Twitter Minimaliser</title>
		<link>http://sidawson.com/2011/09/new-twitter-minimaliser.html</link>
		<comments>http://sidawson.com/2011/09/new-twitter-minimaliser.html#comments</comments>
		<pubDate>Sat, 17 Sep 2011 05:37:31 +0000</pubDate>
		<dc:creator>Si</dc:creator>
				<category><![CDATA[Code]]></category>
		<category><![CDATA[Web]]></category>

		<guid isPermaLink="false">http://sidawson.com/?p=46</guid>
		<description><![CDATA[Twitter recently forced everybody over onto what they&#8217;ve dubbed &#8220;New Twitter.&#8221; It&#8217;s got more functionality than the old version &#8211; which translates to &#8220;a lot more visual clutter.&#8221; I&#8217;d been avoiding it for the most part, simply because I like clean, simple, straightforward. When I&#8217;m using Twitter on the web, I want to read tweets [...]]]></description>
			<content:encoded><![CDATA[<p>Twitter recently forced everybody over onto what they&#8217;ve dubbed &#8220;New Twitter.&#8221;</p>
<p>It&#8217;s got more functionality than the old version &#8211; which translates to &#8220;a lot more visual clutter.&#8221;</p>
<p>I&#8217;d been avoiding it for the most part, simply because I like clean, simple, straightforward. When I&#8217;m using Twitter on the web, I want to read tweets and send tweets. Nothing else.</p>
<p>Now that I have no choice (if I&#8217;m using web-based Twitter), I thought I&#8217;d do something about it.</p>
<p>Thus, I present to you! <a href="http://userscripts.org/scripts/show/109381">The New Twitter Minimaliser</a>.</p>
<p>This is a GreaseMonkey script, which means it works if you have the <a href="https://addons.mozilla.org/en-US/firefox/addon/greasemonkey/">GreaseMonkey Add-on</a> (follow that to get it) for Firefox, or if you run Chrome (where a lot of GreaseMonkey scripts run natively).</p>
<p>The <a href="http://userscripts.org/scripts/show/109381">New Twitter Minimaliser</a> does the following:</p>
<p>Removes:</p>
<ul>
<li>Recommended Users</li>
<li>Trends</li>
<li>User Recommendations</li>
<li>The &#8220;Witty Definition&#8221;</li>
<li>Ability to do new style RTs (one click &amp; all done)</li>
</ul>
<p>Adds:</p>
<ul>
<li>Old Style RT button (where you quote the user &amp; add your comment)</li>
</ul>
<p>It also shrinks the dashboard on the side, and makes the main text area much larger, ie it focuses the screen real estate where it&#8217;s most useful.</p>
<p>It doesn&#8217;t screw with any of the code on the page (just the css) so it can&#8217;t add any new bugs. It&#8217;s also carefully optimised so it works very well on 1024&#215;768 screens.</p>
<p>Oddly, now I&#8217;ve been running this script for a while, I actually prefer New Twitter to the old version. It&#8217;s much cleaner &amp; snappier. Functionality wise it&#8217;s a bit of a wash &#8211; some things are easier, some things are harder.</p>
<p>Now, if I could just figure out how to get New Twitter to show me incoming DMs only (like old Twitter did, rather than one mushed up list), I&#8217;d be a super happy camper.</p>
]]></content:encoded>
			<wfw:commentRss>http://sidawson.com/2011/09/new-twitter-minimaliser.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Firefox 3.5.5 screwy characters appearing</title>
		<link>http://sidawson.com/2009/11/firefox-355-screwy-characters-appearing.html</link>
		<comments>http://sidawson.com/2009/11/firefox-355-screwy-characters-appearing.html#comments</comments>
		<pubDate>Sun, 08 Nov 2009 05:42:00 +0000</pubDate>
		<dc:creator>Si</dc:creator>
				<category><![CDATA[bugs]]></category>
		<category><![CDATA[Code]]></category>
		<category><![CDATA[Web]]></category>

		<guid isPermaLink="false">http://sidawson.com/?p=16</guid>
		<description><![CDATA[There&#8217;s something that&#8217;s bugged me ever since I upgraded to Firefox 3. Certain pages that used to work perfectly in Firefox 2 suddenly didn&#8217;t. Instead there would be a mess on the page &#8211; lots of square boxes the size of characters with text inside them. Like this or maybe this Typically this would be [...]]]></description>
			<content:encoded><![CDATA[<p>There&#8217;s something that&#8217;s bugged me ever since I upgraded to Firefox 3. Certain pages that used to work perfectly in Firefox 2 suddenly didn&#8217;t.</p>
<p>Instead there would be a mess on the page &#8211; lots of square boxes the size of characters with text inside them. Like this <img src="http://sidawson.com/images/2009/10/comp_1.jpg" alt="comp_1.jpg" height="33" width="64"/> or maybe this <img src="http://sidawson.com/images/2009/10/comp_2.jpg" alt="comp_2.jpg" height="18" width="62"/></p>
<p>Typically this would be some kind of character encoding issue (the server/browser specifying/requesting UTF-8 instead of ISO-8859-1 etc), or having Auto-Detect universal set off in Firefox &#8211; and most sites around the net propose this as a solution (oh, &amp; also recommend partial reinstalls of your O/S).</p>
<p>Uhh, no.</p>
<p>It&#8217;s actually a compression issue.</p>
<p>If you&#8217;re having this problem, the resolution is this:</p>
<p>Enter into the address bar</p>
<blockquote style="MARGIN-RIGHT: 0px" dir="ltr"><p>about:config</p>
</blockquote>
<p>in the Filter textbox below, type</p>
<blockquote style="MARGIN-RIGHT: 0px" dir="ltr"><p>network.http.accept-encoding</p>
</blockquote>
<p>You can also just start typing &#8220;accept-encoding&#8221; until it appears on the screen.</p>
<p>Double click the network.http.accept-encoding entry.</p>
<p>Now, on my browser, it was set to</p>
<blockquote><p>gzip,deflate;q=0.9,compress;q=0.7</p>
</blockquote>
<p>but should have been</p>
<blockquote style="MARGIN-RIGHT: 0px" dir="ltr"><p>gzip,deflate</p>
</blockquote>
<p>So, type that into the box &amp; hit OK, then restart your browser (just make sure you close all your windows too)</p>
<p>Voila, you can now surf the web without having to constantly switch back to IE.</p>
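<p>For reference, here&#8217;s a small Python sketch of how those two header values read under the usual HTTP rules, where a <code>q</code> value belongs only to the coding it is attached to and defaults to 1.0. The helper name <code>parse_accept_encoding</code> is made up for illustration, and this only shows how the values differ, not which part of the old value was upsetting servers.</p>

```python
def parse_accept_encoding(value):
    """Split an Accept-Encoding value into (coding, q) pairs; q defaults to 1.0."""
    result = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        coding, _, params = item.partition(";")
        q = 1.0
        for param in params.split(";"):
            name, _, number = param.strip().partition("=")
            if name.strip().lower() == "q" and number:
                q = float(number)
        result.append((coding.strip(), q))
    return result

before = parse_accept_encoding("gzip,deflate;q=0.9,compress;q=0.7")
after = parse_accept_encoding("gzip,deflate")
print(before)
print(after)
```

<p>The old value ranks deflate below gzip and also offers compress. The new one just offers gzip and deflate with equal weight.</p>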
]]></content:encoded>
			<wfw:commentRss>http://sidawson.com/2009/11/firefox-355-screwy-characters-appearing.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Twitter OAuth Invalid Signature on friendships/create</title>
		<link>http://sidawson.com/2009/10/twitter-oauth-invalid-signature-on.html</link>
		<comments>http://sidawson.com/2009/10/twitter-oauth-invalid-signature-on.html#comments</comments>
		<pubDate>Fri, 23 Oct 2009 04:42:00 +0000</pubDate>
		<dc:creator>Si</dc:creator>
				<category><![CDATA[Code]]></category>
		<category><![CDATA[Web]]></category>

		<guid isPermaLink="false">http://sidawson.com/?p=15</guid>
		<description><![CDATA[This is a public service announcement. I&#8217;ve been doing a bunch of work with Twitter recently &#38; came across this problem. When trying to do a friendships/create, I get back &#8220;OAuth Invalid Signature.&#8221; I&#8217;m using Tweetsharp v0.15 preview (an excellent product, btw), but I don&#8217;t think this is a Tweetsharp issue, it&#8217;s a Twitter issue. [...]]]></description>
			<content:encoded><![CDATA[<p>This is a public service announcement.</p>
<p>I&#8217;ve been doing a <a href="http://twitcleaner.com/" title="TwitCleaner!">bunch of work</a> with Twitter recently &amp; came across this problem.</p>
<p>When trying to do a friendships/create, I get back &#8220;OAuth Invalid Signature.&#8221;</p>
<p>I&#8217;m using <a href="http://tweetsharp.com/">Tweetsharp</a> v0.15 preview (an excellent product, btw), but I don&#8217;t think this is a Tweetsharp issue, it&#8217;s a Twitter issue. People are really <a href="http://groups.google.com/group/twitter-development-talk/browse_thread/thread/0f3fdb9127d0df96">scratching their heads</a> about it.</p>
<p>The Tweetsharp guys <a href="http://code.google.com/p/tweetsharp/issues/detail?id=89#c10">proposed a solution here</a>, but that didn&#8217;t help me. In fact, the more I googled, the more erroneous solutions I found.</p>
<p>Here&#8217;s my setup. TwitCleaner (the app) has a consumer key &amp; secret. It would then get an access token/secret for the user, &amp; use that token/secret to make the user follow <a href="http://twitter.com/TheTwitCleaner">@TheTwitCleaner</a>. This is done so we can DM the user when their report is done. We encourage people to unfollow again (if they want to) once they get their report DM.</p>
<p>Anyway, pretty simple. We have valid OAuth token/secret from the user, so that&#8217;s not a problem.</p>
<p>We&#8217;re just trying to make the user follow <a href="http://twitter.com/TheTwitCleaner">@TheTwitCleaner</a>, should be simple, right? No.</p>
<p>I wasted several hours on this. Among the solutions proposed (&amp; wrong) were:</p>
<ul>
<li>You can&#8217;t use a consumer key/secret to follow the user those keys are associated with (ie, TwitCleaner the app has key/secret, but it&#8217;s associated with @TheTwitCleaner the Twitter account)</li>
<li>The OAuth information is incorrect</li>
<li>The request had to be made over https, not http (not something I have control over with TweetSharp, as far as I can tell)</li>
<li>Passing in Client information when making the request was gumming things up.</li>
</ul>
<p>Well guess what? It was none of those.</p>
<p>Know what fixed it?</p>
<p><strong>Passing in the username to follow in lower case.</strong></p>
<p>I kid you not.</p>
<p>Now, <a href="http://twitter.com/TheTwitCleaner">@TheTwitCleaner</a> is in Twitter with that combination of upper/lower case, so I was passing it exactly as stored. But no, apparently befriend (<a href="http://apiwiki.twitter.com/Twitter-REST-API-Method:-friendships create">Twitter API friendships/create</a>) needs lower case in order to work reliably.</p>
<p>So now you know. Hope that saves you some pain.</p>
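<p>In code terms the workaround is a one-liner. Here&#8217;s a hedged Python sketch of it (the post&#8217;s real code used Tweetsharp, so the helper name <code>normalise_screen_name</code> and the <code>params</code> dict are purely illustrative and don&#8217;t come from any library):</p>

```python
def normalise_screen_name(name):
    """Strip whitespace and a leading '@', then lower-case the name."""
    return name.strip().lstrip("@").lower()

# Hypothetical request parameters, built before the request is signed and sent.
params = {"screen_name": normalise_screen_name("@TheTwitCleaner")}
print(params)
```

<p>Doing the lower-casing in one place, before anything gets signed, means every call sees the same form of the name.</p>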
]]></content:encoded>
			<wfw:commentRss>http://sidawson.com/2009/10/twitter-oauth-invalid-signature-on.html/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>A Nifty Non-Replacing Selection Algorithm</title>
		<link>http://sidawson.com/2008/12/nifty-non-replacing-selection-algorithm.html</link>
		<comments>http://sidawson.com/2008/12/nifty-non-replacing-selection-algorithm.html#comments</comments>
		<pubDate>Tue, 16 Dec 2008 02:40:00 +0000</pubDate>
		<dc:creator>Si</dc:creator>
				<category><![CDATA[Algorithms]]></category>
		<category><![CDATA[Code]]></category>
		<category><![CDATA[Software-Engineering]]></category>

		<guid isPermaLink="false">http://sidawson.com/?p=13</guid>
		<description><![CDATA[Algorithms are awesome fun, so I was super pleased when my little bro asked me to help him with a toy problem he had. The description is this: It&#8217;s a secret santa chooser. A group of people, where each person has to be matched up with one other person, but not themselves. He&#8217;s setup an [...]]]></description>
			<content:encoded><![CDATA[<p>Algorithms are awesome fun, so I was super pleased when my little bro asked me to help him with a toy problem he had.</p>
<p>The description is this: It&#8217;s a secret santa chooser. A group of people, where each person has to be matched up with one other person, but not themselves.</p>
<p>He&#8217;s setup an array that has an id for each person.</p>
<p>His initial shot was something like this (pseudo, obviously):</p>
<blockquote>
<pre style="FONT-SIZE: 12px">
foreach $array as $key =&gt; $subarr {
  do {
      // $count is set to count($array); rand includes both ends
      $var = rand(0, $count - 1)
  } while $var == $key or $var is already assigned
  $array[$key][$assign] = $var
}
</pre>
</blockquote>
<p>Initially he was mostly concerned that rand would get called a lot of times (it&#8217;s inefficient in the language he&#8217;s using).</p>
<p>However, there&#8217;s a ton of neat (non-obvious) problems with this algorithm:</p>
<ol>
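<li>By the time we&#8217;re trying to match the last person, we&#8217;ll be calling rand (on average) about N times, since only one of the N possible values is still free</li>
<li>As a result, it&#8217;s inefficient as hell (the expected total number of rand calls grows roughly as N log N rather than N)</li>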
<li>There is a small chance that on the last call we&#8217;ll actually lock &#8211; since we won&#8217;t have a non-dupe to match with</li>
<li>Not obvious above, but he also considered recreating the array on every iteration of the loop *wince*</li>
</ol>
<p>Add to this some interesting aspects of the language &#8211; immutable arrays (ie, there&#8217;s no inbuilt linked lists, so you can&#8217;t del from the middle of an array/list) &amp; it becomes an interesting problem.</p>
<p>The key trick was to have two arrays:</p>
<p>One, 2-dimensional array (first dim holding keys, second the matches) <br/>and one 1-dimensional array (which will only hold keys, in order).</p>
<p>Let&#8217;s call the first one &#8220;$list&#8221; and the second &#8220;$valid&#8221;.</p>
<p>The trick is this &#8211; $valid holds a list of all remaining valid keys, in the first N positions of the array, where initially N = $valid length. Both $list &amp; $valid are initially loaded with all keys, in order.</p>
<p>So, to pick a valid key, we just select $valid[rand(N)] and make sure it&#8217;s not equal to the key we&#8217;re assigning to. <br/>Then, we do two things:</p>
<ol>
<li>Swap the item at position rand(N) (which we just selected) with the Nth item in the $valid array, &amp;</li>
<li>Decrement N ($key_to_process).</li>
</ol>
<p>This has the neat effect of ensuring that the item we just selected is always at position N+1. So, next time we rand(N), since N is now one smaller, we can be sure it&#8217;s impossible to re-select the just selected item.</p>
<p>Put another way, by the time we finish, $valid will still hold all the keys, just in the reverse of the order we selected them.</p>
<p>It also means we don&#8217;t have to do any array creation. There&#8217;s still a 1/N chance that we&#8217;ll self-select of course, but there&#8217;s no simple way of avoiding that.</p>
<p>Note that below we don&#8217;t do the swap (since really, why bother with two extra lines of code?) we simply ensure that position rand(N) (ie, $key_no) now holds the key we <strong>didn&#8217;t</strong> select &#8211; ie, the one that is just off the top of the selectable area.</p>
<p>Oh, and in this rand implementation rand(0, N) includes both 0 AND N (most only go 0-&gt;N-1 inclusive).</p>
<blockquote>
<pre style="FONT-SIZE: 12px">
$valid = array_keys($list);
$key_to_process = count($valid) - 1;
do {
  $key_no = rand(0, $key_to_process);
  if ($key_to_process != $valid[$key_no]) {
    $list[$key_to_process][2] = $valid[$key_no];
    $valid[$key_no] = $valid[$key_to_process];
    $key_to_process--;
  }
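  # deal with the horrid edge case where the last
  # $list key is equal to the last available
  # $valid key: hand key 0 the match of a random
  # other key, and point that key at 0, so everyone
  # still ends up matched exactly once
  if ($key_to_process == 0 and $valid[0] == 0) {
    $key_no = rand(1, count($list) - 1);
    $list[0][2] = $list[$key_no][2];
    $list[$key_no][2] = 0;
    $key_to_process--;
  }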
} while ($key_to_process &gt;= 0);
</pre>
</blockquote>
<p><br/>Without the edge-case code, this results in a super fast, nice slick little 10 or so line algorithm (depending on how/if you count {}&#8217;s <img src='http://sidawson.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />)</p>
<p>Elegant, I dig it.</p>
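<p>For anyone who wants to play with the idea, here&#8217;s a minimal Python sketch of the same shrinking-selectable-area trick. The names <code>secret_santa</code>, <code>valid</code>, <code>match</code> and <code>top</code> are mine, people are assumed to be numbered 0 to N-1, and the edge case is handled by giving person 0 the match previously held by a random other person, which is one way of keeping everybody matched exactly once.</p>

```python
import random

def secret_santa(n, rng=random):
    """Return a list where person i gives to match[i], and match[i] != i."""
    if n < 2:
        raise ValueError("need at least two people")
    valid = list(range(n))   # slots 0..top hold the keys nobody has been given yet
    match = [None] * n
    top = n - 1              # person being processed, and last selectable slot
    while top >= 0:
        if top == 0 and valid[0] == 0:
            # Edge case: only person 0 is left and the only free key is 0.
            # Take over another person's match and point them at 0 instead.
            other = rng.randint(1, n - 1)
            match[0] = match[other]
            match[other] = 0
            break
        slot = rng.randint(0, top)       # randint includes both ends
        if valid[slot] != top:           # retry on a self-selection
            match[top] = valid[slot]
            valid[slot] = valid[top]     # pull the top key down into the gap
            top -= 1
    return match

print(secret_santa(6))
```

<p>The result is always a permutation with nobody matched to themselves. The retry on self-selection is still there, but nothing is ever searched for or rebuilt.</p>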
]]></content:encoded>
			<wfw:commentRss>http://sidawson.com/2008/12/nifty-non-replacing-selection-algorithm.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Trouble With Ratios</title>
		<link>http://sidawson.com/2008/09/trouble-with-ratios.html</link>
		<comments>http://sidawson.com/2008/09/trouble-with-ratios.html#comments</comments>
		<pubDate>Tue, 16 Sep 2008 14:18:00 +0000</pubDate>
		<dc:creator>Si</dc:creator>
				<category><![CDATA[Code]]></category>
		<category><![CDATA[Software-Engineering]]></category>

		<guid isPermaLink="false">http://sidawson.com/?p=9</guid>
		<description><![CDATA[Ratios are used all over the place. No huge surprise there &#8211; they are, after all, just one number divided by another. The well known problem case is when the denominator (the bottom bit) is zero, or very near zero. However, there are other subtler issues to consider. Here&#8217;s a chart that has a ratio [...]]]></description>
			<content:encoded><![CDATA[<p>Ratios are used all over the place. No huge surprise there &#8211; they are, after all, just one number divided by another.</p>
<p>The well known problem case is when the denominator (the bottom bit) is zero, or very near zero. However, there are other subtler issues to consider.</p>
<p>Here&#8217;s a chart that has a ratio as the X axis:</p>
<p><img src="http://sidawson.com/images/2008/09/ratio_pre.gif" alt="ratio_pre.gif" height="394" width="500"/></p>
<p>Don&#8217;t sweat the details, they&#8217;re not terribly important &#8211; just the rough distribution.</p>
<p>The X axis in this case is what&#8217;s called a Calmar &#8211; ie, the total dollar return of a system divided by its maximum drawdown. Or, in English &#8211; how much you make proportional to how big your pockets need to be. This gives a non-dollar based (ie, &#8220;pure&#8221;) number that can then be compared across markets, systems, products, whatever.</p>
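<p>As a rough Python sketch of that calculation, using the definition given here (total dollar return over maximum drawdown). The function names and the three sample equity curves are invented for illustration, and returning infinity for a zero drawdown is just one possible convention.</p>

```python
def max_drawdown(equity):
    """Largest peak-to-trough fall in a sequence of cumulative dollar values."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, peak - value)
    return worst

def calmar(equity):
    """Total dollar return divided by maximum drawdown."""
    total_return = equity[-1] - equity[0]
    drawdown = max_drawdown(equity)
    if drawdown == 0:
        # Assumed convention: a system that never falls has an unbounded ratio.
        return float("inf") if total_return > 0 else 0.0
    return total_return / drawdown

steady = [0, 100, 50, 150, 120, 200]   # makes 200, worst fall is 50
one_win = [0, 0, 0, 500, 500]          # one winning trade, never falls
tiny_dd = [0, 500, 499.99, 1000]       # worst fall is a single cent

for name, curve in [("steady", steady), ("one_win", one_win), ("tiny_dd", tiny_dd)]:
    print(name, calmar(curve))
```

<p>The steady system scores 4, while the other two come out infinite and somewhere around a hundred thousand, purely because the bottom of the fraction is zero or tiny.</p>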
<p>This graph is actually a bit trickier than that, since there&#8217;s actually 3 dimensions of data there &#8211; it&#8217;s just the third dimension isn&#8217;t plotted &#8211; but we&#8217;ll get back to that.</p>
<p>Where this gets ugly is when, in the case of the Calmar above, the drawdown drops to, or near to, zero. For example, if you have a system that only trades once &#8211; and it&#8217;s a winning trade &#8211; the Calmar will be very, very large. Even if you chuck out systems that are obviously a bit nutty like that, you can still end up with situations where the ratio has just blown out of all proportion.</p>
<p>Which results in this:</p>
<p><img src="http://sidawson.com/images/2008/09/ratio_post.gif" alt="ratio_post.gif" height="400" width="500"/></p>
<p>See how everything is in a vertical line on the left?</p>
<p>Well, it&#8217;s not. Those points are actually quite well spread out &#8211; it&#8217;s just that instead of the X axis going from 0-&gt;50 as in the first case, it now goes from 0-&gt;22 million &#8211; of which only a small number are greater than a hundred (you can see them spread out on the right, very close to the Y axis)</p>
<p>In this example, we can see the problem, so we&#8217;re aware of it. However, what if the ratio had been the unplotted third dimension? We might never have known.</p>
<p>Now, the way that I&#8217;m using these ratios internally, I&#8217;m protected from these sorts of blowouts &#8211; I simply compare sets of ratios. If one is bigger, it doesn&#8217;t matter if it&#8217;s bigger by 2 or by 2 billion.</p>
<p>However, there are many situations where you might want proportional representation. If one value is twice as big, say, it should occur twice as often. In this case, ratios that explode out by orders of magnitude quickly swamp results, and drive the whole thing into the ground.</p>
<p>You swiftly end up with a monoculture. One result eats all the others, and instead of a room full of happy spiders doing their thing, you end up with one fat angry spider in the middle of the room. Umm, so to speak.</p>
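<p>A small Python sketch of the difference, with made-up scores. <code>proportional_shares</code> gives each result a share proportional to its ratio, and <code>rank_shares</code> is a hypothetical rank-only weighting included for contrast (it is an assumption for illustration, not a description of the systems discussed here).</p>

```python
def proportional_shares(scores):
    """Share of representation proportional to each score."""
    total = sum(scores)
    return [s / total for s in scores]

def rank_shares(scores):
    """Share based only on rank order: worst gets weight 1, next 2, and so on."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0] * len(scores)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    total = sum(ranks)
    return [r / total for r in ranks]

scores = [2.0, 3.5, 5.0, 22_000_000.0]   # one ratio has blown out
print(proportional_shares(scores))
print(rank_shares(scores))
```

<p>With proportional shares the blown-out ratio takes essentially everything. With rank-only weights it is still the favourite, but the other three keep their seats.</p>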
<p>Ratios can be dangerous, kids. Watch out!</p>
]]></content:encoded>
			<wfw:commentRss>http://sidawson.com/2008/09/trouble-with-ratios.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Unit Testing &#8211; Necessary, but Not Enough</title>
		<link>http://sidawson.com/2008/07/unit-testing-necessary-but-not-enough.html</link>
		<comments>http://sidawson.com/2008/07/unit-testing-necessary-but-not-enough.html#comments</comments>
		<pubDate>Wed, 02 Jul 2008 11:59:00 +0000</pubDate>
		<dc:creator>Si</dc:creator>
				<category><![CDATA[Code]]></category>
		<category><![CDATA[Software-Engineering]]></category>

		<guid isPermaLink="false">http://sidawson.com/?p=6</guid>
		<description><![CDATA[I realised recently that I&#8217;d hit a point of diminishing returns. My overall code base was now so complex that any change I introduced in certain areas was taking exponentially longer to debug &#38; ensure accuracy. Of course, I had a test rig &#8211; otherwise how would I know what I was doing was correct [...]]]></description>
			<content:encoded><![CDATA[<p>I realised recently that I&#8217;d hit a point of diminishing returns. My overall code base was now so complex that any change I introduced in certain areas was taking exponentially longer to debug &amp; ensure accuracy.</p>
<p>Of course, I had a test rig &#8211; otherwise how would I know what I was doing was correct in the first place?</p>
<p>The central core of all my systems is a rebuild of a now antiquated black box trading platform. I don&#8217;t have the source, but I need to duplicate the behaviour.</p>
<p>The test rig is pretty sophisticated &#8211; it didn&#8217;t start that way, and it shouldn&#8217;t really have needed to be, buuuuut</p>
<p>The old system:</p>
<p><strong>1. Calculates using single precision floating point math. <br/></strong> If I need to explain why this is painful, <a href="http://www.office-excel.com/articles/microsoft-excel-fails-simple-math-multiplication.html">check this out</a> &#8211; if even the guys running Excel get occasionally tripped up by floating point math, what hope is there for the rest of us? Single precision means there&#8217;s only half as many bits (32) to do maths in vs the default double (64 bits). Rough shorthand, single precision gives you about 7 significant digits in total (not 7 decimal places). A number like '12000.123' gets stored as roughly '12000.12305', and once you&#8217;re past a million the gap between neighbouring values is 0.0625 or more. This means lots of rounding errors, and the more calculations you do, the more errors. The systems I&#8217;m working with do a LOT of calculations.</p>
<p><strong>2. Rounds incoming numbers non deterministically</strong> <br/>Mostly you can guess correctly what it&#8217;s going to decide a market price becomes, but particularly with markets that move in 1/32s or 1/64s (ie, not simple decimals), this rounding becomes arbitrary if not damn ornery (rounded? no. up? no. down? no. truncated? no. based on equivalent string length? maybe)</p>
<p><strong>3. Makes &#8216;interesting&#8217; assumptions</strong> <br/>Things like the order that prices get hit, how numbers are calculated internally (eg X = function(A/B) often returns a different result from Y = A/B; X = function(Y)), that slippage only occurs in some situations and not others, and so on. Some make sense, in a way, many we don&#8217;t want. So now we have two modes of operation &#8220;old, broken, compatible, testable&#8221; and &#8220;new, not-broken, different numbers, untestable&#8221;</p>
<p><strong>4. Has &#8216;chains&#8217; of internal dependencies. <br/></strong>So, unsurprisingly, any of the above errors will then cascade through the output, fundamentally changing large chunks of the results.</p>
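<p>Point 1 is easy to see for yourself. Here&#8217;s a minimal Python sketch that squeezes ordinary (double precision) floats through a 32-bit single using the standard <code>struct</code> module. The helper names <code>to_single</code> and <code>single_sum</code> are made up, and this only imitates single precision arithmetic one step at a time, it is not the old platform&#8217;s actual maths.</p>

```python
import struct

def to_single(x):
    """Round a Python float to the nearest single precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def single_sum(values):
    """Add values up, rounding to single precision after every step."""
    total = 0.0
    for v in values:
        total = to_single(total + to_single(v))
    return total

print(to_single(12000.123))        # nearest single to 12000.123
print(to_single(16777217.0))       # whole numbers stop being exact past 2**24
print(single_sum([0.1] * 10000))   # rounding error piling up
print(sum([0.1] * 10000))          # the same sum in double precision
```

<p>The double precision sum lands within a hair of 1000, while the single precision version drifts visibly, and that is after only ten thousand additions.</p>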
<p><br/>So, the test rig allows for all this. Understands where common rounding problems occur, and how they cascade. Sorts by seriousness of the discrepancies, and so forth. Oh, and it does this by automatically tracking 60 or 70 internal variables for each calculation set across 7000 days on 60 markets. Ie, filtering &amp; matching its way through 20-30 million data points.</p>
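<p>To give a flavour of the &#8220;sort by seriousness&#8221; part, here&#8217;s a heavily simplified Python sketch. Everything in it is hypothetical (the function name <code>rank_discrepancies</code>, the dict-of-values layout, the relative tolerance), and the real rig does a great deal more than this.</p>

```python
def rank_discrepancies(reference, rebuilt, rel_tol=1e-6):
    """Return (key, ref, new, relative_error) tuples beyond tolerance, worst first."""
    problems = []
    for key, ref in reference.items():
        new = rebuilt.get(key)
        if new is None:
            # A value the rebuild never produced counts as the worst kind of miss.
            problems.append((key, ref, None, float("inf")))
            continue
        scale = max(abs(ref), abs(new), 1e-12)   # avoid dividing by zero
        rel_err = abs(ref - new) / scale
        if rel_err > rel_tol:
            problems.append((key, ref, new, rel_err))
    problems.sort(key=lambda p: p[3], reverse=True)
    return problems

reference = {"a": 100.0, "b": 50.0, "c": 1.0, "d": 7.0}
rebuilt = {"a": 100.00000001, "b": 50.5, "c": 2.0}

for problem in rank_discrepancies(reference, rebuilt):
    print(problem)
```

<p>Tiny rounding-level differences (like &#8220;a&#8221;) drop out, and whatever is left comes back with the nastiest mismatch at the top.</p>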
<p>But this still isn&#8217;t enough.</p>
<p>And this is where I see the light, and realise that this unit testing stuff that people have been raving about might actually be useful. So far, it has been. It&#8217;s enabled me to auto-scan a ton of possible problems, keep things in alignment as the system adjusts to changing requirements &#8211; all the palaver you&#8217;ve read about.</p>
<p>But I&#8217;ve been thinking. No amount of unit testing would catch the errors my test rig will. Not that the rig is that amazing &#8211; just that they&#8217;re operating at fundamentally different levels. Unit testing won&#8217;t tell me:</p>
<p><strong>a)</strong> If I&#8217;ve made a mistake in my logic <br/><strong>b)</strong> If I understand the problem space correctly <br/><strong>c)</strong> If my implementation is correct (in the &#8220;are these answers right?&#8221; sense) <br/><strong>d)</strong> If I understand the problem space <strong>thoroughly</strong> (obscure, hard-to-find &amp; subtle edge cases are very common) <br/><strong>e)</strong> If my unit tests are reliable &amp; complete &#8211; have they caught everything?</p>
<p>Unfortunately, thinking about this more, I&#8217;m not convinced that even unit testing PLUS my test rigs (yes, rigs. I lied before. I actually have two, no three, that grill the system from subtly different angles) are going to catch everything.</p>
<p>Of course, it&#8217;s a game of diminishing returns. How much time do I spend testing vs actually delivering results?</p>
<p>Shifting to a higher level language helps &#8211; fewer lines of code = fewer bugs. It&#8217;s still a stop gap though. Programs are only getting larger &amp; more complex.</p>
<p>Better architecture always helps of course &#8211; lower coupling = fewer cascading problems across sub-domains, but when we&#8217;re juggling tens, hundreds, or thousands of subsystems in a larger overall system?</p>
<p>I&#8217;m not convinced there&#8217;s an easy answer. And as software gets more complex, I only see the overall problem spiralling at some high power of that complexity. No matter how clever our test rigs, how well covered in tests our code is, how do we move forward efficiently without getting bogged down in &#8220;Can we trust the results?&#8221;?</p>
<p>Right now, I just don&#8217;t know.</p>
]]></content:encoded>
			<wfw:commentRss>http://sidawson.com/2008/07/unit-testing-necessary-but-not-enough.html/feed</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>

