<?xml version='1.0' encoding='UTF-8'?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/'><id>tag:blogger.com,1999:blog-4122965975758737600</id><updated>2008-06-16T15:00:33.386-07:00</updated><title type='text'>Transcendental Technical Travails</title><link rel='alternate' type='text/html' href='http://www.canonware.com/~ttt/'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4122965975758737600/posts/default'/><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://www.canonware.com/~ttt/atom.xml'/><author><name>Jason</name><uri>http://www.blogger.com/profile/02753358958139234827</uri><email>noreply@blogger.com</email></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>17</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-4122965975758737600.post-7430452110505027967</id><published>2008-04-21T17:29:00.000-07:00</published><updated>2008-04-26T00:52:23.021-07:00</updated><title type='text'>Left-leaning red-black trees are hard to implement</title><content type='html'>Back in 2002, I needed balanced trees for a project I was working on, so I used  the description and pseudo-code in &lt;u&gt;&lt;a href="http://mitpress.mit.edu/catalog/item/default.asp?ttype=2&amp;amp;tid=8570"&gt;Introduction to Algorithms&lt;/a&gt;&lt;/u&gt; to implement red-black trees.  I vaguely recall spending perhaps two days on implementation and testing.  
That &lt;a href="http://www.canonware.com/%7Ettt/rb_2002.h"&gt;implementation&lt;/a&gt; uses C preprocessor macros in order to make it possible to link data structures into one or more red-black trees without requiring container objects.&lt;br /&gt;&lt;br /&gt;About the same time, Niels Provos added a similar implementation to &lt;a href="http://www.openbsd.org/"&gt;OpenBSD&lt;/a&gt;, which was &lt;a href="http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/sys/tree.h"&gt;imported&lt;/a&gt; into &lt;a href="http://www.freebsd.org/"&gt;FreeBSD&lt;/a&gt;, so when I imported &lt;a href="http://www.freebsd.org/cgi/cvsweb.cgi/src/lib/libc/stdlib/malloc.c"&gt;jemalloc&lt;/a&gt; into FreeBSD, I switched from my own red-black tree implementation to the standard one.  Unfortunately, both implementations use nodes that include four pieces of information: parent, left child, right child, and color (red or black).  That typically adds up to 16 or 32 bytes on 32- and 64-bit systems, respectively.  A few months ago I fixed some scalability issues &lt;a href="http://blog.pavlov.net/"&gt;Stuart Parmenter&lt;/a&gt; found in jemalloc by replacing linear searches with tree searches, but that meant adding more tree links.  These trees now take up ~2% of all mapped memory, so I have been contemplating ways to reduce the overhead.&lt;br /&gt;&lt;br /&gt;A couple of weeks ago, I came across some &lt;a href="http://www.cs.princeton.edu/%7Ers/talks/LLRB/RedBlack.pdf"&gt;slides&lt;/a&gt; for a talk that &lt;a href="http://www.cs.princeton.edu/%7Ers/"&gt;Robert Sedgewick&lt;/a&gt; recently gave on left-leaning red-black trees.  His slides pointedly disparage the use of parent pointers, and they also make left-leaning red-black trees look simple to implement.  
Left-leaning red-black trees maintain a logical 1:1 correspondence with 2-3-4 B-trees, which is a huge help in understanding seemingly complex tree transformations.&lt;br /&gt;&lt;br /&gt;Last Monday, I started implementing left-leaning red-black trees, expecting to spend perhaps 15 hours on the project.  I'm here more than 60 hours of work later to tell you that left-leaning red-black trees are &lt;span style="font-style: italic;"&gt;hard&lt;/span&gt; to implement, and contrary to Sedgewick's claims, their implementation appears to require approximately the same amount of code and complexity as standard red-black trees.   Part of the catch is that although standard red-black trees have additional cases to deal with due to 3-nodes that can lean left or right, left-leaning red-black trees have a universal asymmetry between the left and right versions of the algorithms.&lt;br /&gt;&lt;br /&gt;If memory overhead weren't my primary concern for this project, I would have dropped red-black trees in favor of treaps.  Unfortunately, treaps require either recursive implementation or parent pointers, and they also require an extra "priority" field, whereas red-black trees can be implemented without recursion or parent pointers, and it is possible to stuff the red-black bit in the least significant bit of one of the left/right pointers.&lt;br /&gt;&lt;br /&gt;For the curious or those in need of such a beast, here is my &lt;a href="http://www.freebsd.org/cgi/cvsweb.cgi/src/lib/libc/stdlib/rb.h"&gt;left-leaning red-black tree implementation&lt;/a&gt;.  &lt;strike&gt;One point of interest is that my benchmarks show it to be ~25% slower than my standard red-black tree implementation.  The red-black bit twiddling overhead only accounts for about 1/5 of the slowdown.  
I attribute the other 4/5 to the overhead of transforming the tree on the down pass, rather than lazily fixing up tree structure violations afterward.&lt;/strike&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;[26 April 2008]&lt;/span&gt; I did some further experimentation to understand the performance disparity between implementations.  The benchmarks mentioned above were flawed, in that they always searched for the most recently inserted item.  Since top-down insertion/deletion is more disruptive than lazy fixup, the searches significantly favored the old implementation.  I fixed the benchmarks to compute the times for random searches, random insertions/deletions, and in-order tree traversal.&lt;br /&gt;&lt;br /&gt;The old rb.h and sys/tree.h perform essentially the same for all operations.  The new rb.h takes almost twice as long for insertion/deletion, is the same speed for searches, and is slightly faster for iteration.  Red/black bit twiddling overhead accounts for ~6% of insertion/deletion time, and &lt;3% of search time.&lt;br /&gt;&lt;br /&gt;I am actually quite pleased with these benchmark results, because they show that for random inputs, left-leaning red-black trees do not noticeably suffer from the fact that tree height is O(3h) rather than O(2h), where h is the height of an equivalent fully balanced tree.</content><link rel='alternate' type='text/html' href='http://www.canonware.com/~ttt/2008/04/left-leaning-red-black-trees-are-hard.html' title='Left-leaning red-black trees are hard to implement'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4122965975758737600&amp;postID=7430452110505027967&amp;isPopup=true' title='17 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.canonware.com/~ttt/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4122965975758737600/posts/default/7430452110505027967'/><link rel='edit' 
type='application/atom+xml' href='http://www.blogger.com/feeds/4122965975758737600/posts/default/7430452110505027967'/><author><name>Jason</name><uri>http://www.blogger.com/profile/02753358958139234827</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-4122965975758737600.post-1102535079239636104</id><published>2008-04-03T14:55:00.000-07:00</published><updated>2008-04-03T16:30:23.977-07:00</updated><title type='text'>Using Mercurial patch queues for daily development</title><content type='html'>I recently watched a &lt;a href="http://video.google.com/videoplay?docid=-7724296011317502612"&gt;video&lt;/a&gt; (&lt;a href="http://www.selenic.com/mercurial/wiki/index.cgi/Presentations?action=AttachFile&amp;amp;do=get&amp;amp;target=google.pdf"&gt;slides&lt;/a&gt;) of Bryan O'Sullivan speaking about Mercurial.  The presentation was mainly a (great) introduction to Mercurial, but I was surprised to learn that &lt;a href="http://www.selenic.com/mercurial/wiki/index.cgi/MqExtension"&gt;Mercurial patch queues&lt;/a&gt; could be useful even when using a repository that I have full commit access to.  In a nutshell, Bryan described how he uses patch queues to checkpoint his work without cluttering the permanent revision history.  Checkpointing is mainly useful to me when I am about to try a risky programming solution on top of reasonable code that only partially implements a feature.  Historically, I have archived my entire sandbox at such critical points, but patch queues are a much cleaner solution; they make it possible to separate work into distinct patches and checkpoint regularly without performing heavyweight archiving operations.  Note that reverting to an earlier state is much easier with patch queues, which makes failed experiments much less costly.  
This all sounds great, but it took me several hours and a lot of mistakes to actually figure out how to use patch queues in this fashion, so I'm recording the solution here with the hope that it will be useful to others.&lt;br /&gt;&lt;br /&gt;The first step is to enable the mq extension (see &lt;a href="http://www.selenic.com/mercurial/wiki/index.cgi/MqExtension"&gt;Configuration directions&lt;/a&gt;), though it is enabled by default on my Ubuntu 7.10 systems, and in fact following the standard configuration directions blindly causes some strange warnings.&lt;br /&gt;&lt;br /&gt;Following is a terse example of how to perform every operation that I find useful when using patch queues for daily development:&lt;pre&gt;&lt;span style="color:gray;"&gt;~&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;hg init pizza&lt;/span&gt; &lt;span style="color:green;"&gt;# Create repository.&lt;/span&gt;&lt;br /&gt;&lt;span style="color:gray;"&gt;~&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;cd pizza&lt;/span&gt;&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;echo "pepperoni" &gt; ingredients&lt;/span&gt;&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;echo "black olives" &gt;&gt; ingredients&lt;/span&gt;&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;echo "hand-tossed" &gt; crust&lt;/span&gt;&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;hg add ingredients crust&lt;/span&gt;&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;hg commit -m "Initial pizza recipe."&lt;/span&gt;&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;hg qinit&lt;/span&gt; &lt;span style="color:green;"&gt;# Initialize unversioned patch queue repository.&lt;/span&gt;&lt;br 
/&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;hg qnew more-ingredients.patch&lt;/span&gt; &lt;span style="color:green;"&gt;# Create new working patch.&lt;/span&gt;&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;echo "mushrooms" &gt;&gt; ingredients&lt;/span&gt;&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;hg qrefresh&lt;/span&gt; &lt;span style="color:green;"&gt;# Checkpoint before creating a new patch.&lt;/span&gt;&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;hg qnew specify-sauce.patch&lt;/span&gt;&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;hg qseries&lt;/span&gt; &lt;span style="color:green;"&gt;# Look at the patch queue.&lt;/span&gt;&lt;br /&gt;more-ingredients.patch&lt;br /&gt;specify-sauce.patch&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;hg qapplied&lt;/span&gt; &lt;span style="color:green;"&gt;# See which patches are applied.&lt;/span&gt;&lt;br /&gt;more-ingredients.patch&lt;br /&gt;specify-sauce.patch&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;echo "tomato" &gt; sauce&lt;/span&gt;&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;hg add sauce&lt;/span&gt;&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;hg qrefresh&lt;/span&gt;&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;hg qpop&lt;/span&gt; &lt;span style="color:green;"&gt;# Pop patch, in order to work on more-ingredients.patch again.&lt;/span&gt;&lt;br /&gt;Now at: more-ingredients.patch&lt;br /&gt;&lt;span 
style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;hg qapplied&lt;/span&gt;&lt;br /&gt;more-ingredients.patch&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red"&gt;hg qunapplied&lt;/span&gt;&lt;br /&gt;specify-sauce.patch&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;echo "green peppers" &gt;&gt; ingredients&lt;/span&gt;&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;hg diff&lt;/span&gt;&lt;br /&gt;diff -r 4f3f2d833e6f ingredients&lt;br /&gt;--- a/ingredients       Thu Apr 03 15:38:01 2008 -0700&lt;br /&gt;+++ b/ingredients       Thu Apr 03 15:40:55 2008 -0700&lt;br /&gt;@@ -1,3 +1,4 @@ pepperoni&lt;br /&gt;pepperoni&lt;br /&gt;black olives&lt;br /&gt;mushrooms&lt;br /&gt;+green peppers&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;hg qrefresh&lt;/span&gt;&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;hg qpush&lt;/span&gt; &lt;span style="color:green;"&gt;# Go back to specify-sauce.patch.&lt;/span&gt;&lt;br /&gt;applying specify-sauce.patch&lt;br /&gt;Now at: specify-sauce.patch&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;echo "marinara" &gt; sauce&lt;/span&gt;&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;hg qdiff&lt;/span&gt;&lt;br /&gt;diff -r aadd3ecd3c8e sauce&lt;br /&gt;--- /dev/null   Thu Jan 01 00:00:00 1970 +0000&lt;br /&gt;+++ b/sauce     Thu Apr 03 15:43:00 2008 -0700&lt;br /&gt;@@ -0,0 +1,1 @@&lt;br /&gt;+marinara&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;hg qrefresh&lt;/span&gt;&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span 
style="color:red;"&gt;hg qpop&lt;/span&gt;&lt;br /&gt;Now at: more-ingredients.patch&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;hg qrefresh -e&lt;/span&gt; &lt;span style="color:green;"&gt;# Edit commit message.&lt;/span&gt;&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;hg qpush&lt;/span&gt;&lt;br /&gt;applying specify-sauce.patch&lt;br /&gt;Now at: specify-sauce.patch&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;hg qrefresh -e&lt;/span&gt;&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;hg log&lt;/span&gt;&lt;br /&gt;changeset:   2:6714598e1ccc&lt;br /&gt;tag:         qtip&lt;br /&gt;tag:         tip&lt;br /&gt;tag:         specify-sauce.patch&lt;br /&gt;user:        Jason Evans &amp;lt;jasone@canonware.com&amp;gt;&lt;br /&gt;date:        Thu Apr 03 15:44:32 2008 -0700&lt;br /&gt;summary:     Specify which sauce to use.&lt;br /&gt;&lt;br /&gt;changeset:   1:cc9c1fdf1038&lt;br /&gt;tag:         qbase&lt;br /&gt;tag:         more-ingredients.patch&lt;br /&gt;user:        Jason Evans &amp;lt;jasone@canonware.com&amp;gt;&lt;br /&gt;date:        Thu Apr 03 15:44:01 2008 -0700&lt;br /&gt;summary:     Specify more ingredients.&lt;br /&gt;&lt;br /&gt;changeset:   0:d3ee82132d36&lt;br /&gt;tag:         qparent&lt;br /&gt;user:        Jason Evans &amp;lt;jasone@canonware.com&amp;gt;&lt;br /&gt;date:        Thu Apr 03 15:34:29 2008 -0700&lt;br /&gt;summary:     Initial pizza recipe.&lt;br /&gt;&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;hg qseries&lt;/span&gt;&lt;br /&gt;more-ingredients.patch&lt;br /&gt;specify-sauce.patch&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;hg qdelete -r more-ingredients.patch&lt;/span&gt; &lt;span style="color:green;"&gt;# Commit patch.&lt;/span&gt;&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;hg qdelete -r specify-sauce.patch&lt;/span&gt;&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;hg log&lt;/span&gt;&lt;br /&gt;changeset:   2:6714598e1ccc&lt;br /&gt;tag:         tip&lt;br /&gt;user:        Jason Evans &amp;lt;jasone@canonware.com&amp;gt;&lt;br /&gt;date:        Thu Apr 03 15:44:32 2008 -0700&lt;br /&gt;summary:     Specify which sauce to use.&lt;br /&gt;&lt;br /&gt;changeset:   1:cc9c1fdf1038&lt;br /&gt;user:        Jason Evans &amp;lt;jasone@canonware.com&amp;gt;&lt;br /&gt;date:        Thu Apr 03 15:44:01 2008 -0700&lt;br /&gt;summary:     Specify more ingredients.&lt;br /&gt;&lt;br /&gt;changeset:   0:d3ee82132d36&lt;br /&gt;user:        Jason Evans &amp;lt;jasone@canonware.com&amp;gt;&lt;br /&gt;date:        Thu Apr 03 15:34:29 2008 -0700&lt;br /&gt;summary:     Initial pizza recipe.&lt;br /&gt;&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;hg qnew modify-crust.patch&lt;/span&gt;&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;echo "deep dish" &gt; crust&lt;/span&gt;&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;hg qrefresh&lt;/span&gt;&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;hg qnew experiment.patch&lt;/span&gt;&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;echo "pesto" &gt; sauce&lt;/span&gt;&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;hg qrefresh&lt;/span&gt;&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;hg log&lt;/span&gt;&lt;br /&gt;changeset:   4:173178d3d17d&lt;br /&gt;tag:         qtip&lt;br /&gt;tag:         tip&lt;br /&gt;tag:         experiment.patch&lt;br /&gt;user:        Jason Evans &amp;lt;jasone@canonware.com&amp;gt;&lt;br /&gt;date:        Thu Apr 03 16:16:10 2008 -0700&lt;br /&gt;summary:     [mq]: experiment.patch&lt;br /&gt;&lt;br /&gt;changeset:   3:4961f95336c5&lt;br /&gt;tag:         modify-crust.patch&lt;br /&gt;tag:         qbase&lt;br /&gt;user:        Jason Evans &amp;lt;jasone@canonware.com&amp;gt;&lt;br /&gt;date:        Thu Apr 03 16:14:55 2008 -0700&lt;br /&gt;summary:     [mq]: modify-crust.patch&lt;br /&gt;&lt;br /&gt;changeset:   2:6714598e1ccc&lt;br /&gt;tag:         qparent&lt;br /&gt;user:        Jason Evans &amp;lt;jasone@canonware.com&amp;gt;&lt;br /&gt;date:        Thu Apr 03 15:44:32 2008 -0700&lt;br /&gt;summary:     Specify which sauce to use.&lt;br /&gt;&lt;br /&gt;changeset:   1:cc9c1fdf1038&lt;br /&gt;user:        Jason Evans &amp;lt;jasone@canonware.com&amp;gt;&lt;br /&gt;date:        Thu Apr 03 15:44:01 2008 -0700&lt;br /&gt;summary:     Specify more ingredients.&lt;br /&gt;&lt;br /&gt;changeset:   0:d3ee82132d36&lt;br /&gt;user:        Jason Evans &amp;lt;jasone@canonware.com&amp;gt;&lt;br /&gt;date:        Thu Apr 03 15:34:29 2008 -0700&lt;br /&gt;summary:     Initial pizza recipe.&lt;br /&gt;&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;hg qpop&lt;/span&gt;&lt;br /&gt;Now at: modify-crust.patch&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;hg qdelete experiment.patch&lt;/span&gt; &lt;span style="color:green;"&gt;# Discard horrible experiment.&lt;/span&gt;&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;hg log&lt;/span&gt;&lt;br /&gt;changeset:   3:4961f95336c5&lt;br /&gt;tag:         qtip&lt;br /&gt;tag:         tip&lt;br /&gt;tag:         modify-crust.patch&lt;br /&gt;tag:         qbase&lt;br /&gt;user:        Jason Evans &amp;lt;jasone@canonware.com&amp;gt;&lt;br /&gt;date:        Thu Apr 03 16:14:55 2008 -0700&lt;br /&gt;summary:     [mq]: modify-crust.patch&lt;br /&gt;&lt;br /&gt;changeset:   2:6714598e1ccc&lt;br /&gt;tag:         qparent&lt;br /&gt;user:        Jason Evans &amp;lt;jasone@canonware.com&amp;gt;&lt;br /&gt;date:        Thu Apr 03 15:44:32 2008 -0700&lt;br /&gt;summary:     Specify which sauce to use.&lt;br /&gt;&lt;br /&gt;changeset:   1:cc9c1fdf1038&lt;br /&gt;user:        Jason Evans &amp;lt;jasone@canonware.com&amp;gt;&lt;br /&gt;date:        Thu Apr 03 15:44:01 2008 -0700&lt;br /&gt;summary:     Specify more ingredients.&lt;br /&gt;&lt;br /&gt;changeset:   0:d3ee82132d36&lt;br /&gt;user:        Jason Evans &amp;lt;jasone@canonware.com&amp;gt;&lt;br /&gt;date:        Thu Apr 03 15:34:29 2008 -0700&lt;br /&gt;summary:     Initial pizza recipe.&lt;br /&gt;&lt;br /&gt;&lt;span style="color:gray;"&gt;~/pizza&amp;gt;&lt;/span&gt; &lt;span style="color:red;"&gt;cat sauce&lt;/span&gt;&lt;br /&gt;marinara&lt;/pre&gt;The trickiest parts of the above are committing/deleting with the qdelete command, and editing the commit message with qrefresh.  
I omitted the many ways of messing up the order of operations, so tread lightly and experiment with a toy repository before you use this mode of operation for real.</content><link rel='alternate' type='text/html' href='http://www.canonware.com/~ttt/2008/04/using-mercurial-patch-queues-for-daily.html' title='Using Mercurial patch queues for daily development'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4122965975758737600&amp;postID=1102535079239636104&amp;isPopup=true' title='3 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.canonware.com/~ttt/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4122965975758737600/posts/default/1102535079239636104'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4122965975758737600/posts/default/1102535079239636104'/><author><name>Jason</name><uri>http://www.blogger.com/profile/02753358958139234827</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-4122965975758737600.post-7871702711924122947</id><published>2008-03-11T21:01:00.000-07:00</published><updated>2008-03-12T00:39:17.761-07:00</updated><title type='text'>Migrating from Subversion to Mercurial</title><content type='html'>jemalloc has settled into &lt;a href="http://developer.mozilla.org/"&gt;Firefox&lt;/a&gt; pretty nicely at this point, so after having mostly worked on &lt;a href="http://lyken.net/"&gt;Lyken&lt;/a&gt; for a few weeks while waiting for the dust to settle, I'm planning to start working on adding the necessary functionality to allow the &lt;a href="http://www.mozilla.org/projects/tamarin/"&gt;Tamarin JavaScript engine&lt;/a&gt; to integrate without requiring a separately managed heap for garbage collection.  
One of the first things I ran into was that the Tamarin source code is available as a &lt;a href="http://www.selenic.com/mercurial/"&gt;Mercurial&lt;/a&gt; repository, so it seemed like a good time to become familiar with yet another version control system (VCS).&lt;br /&gt;&lt;br /&gt;Over the past ten years, there has been a proliferation of VCS's, especially those supporting distributed development models (&lt;a href="http://www.gnuarch.org/gnuarchwiki/"&gt;Arch&lt;/a&gt;, &lt;a href="http://darcs.net/"&gt;darcs&lt;/a&gt;, &lt;a href="http://www.bitkeeper.com/"&gt;BitKeeper&lt;/a&gt;, &lt;a href="http://git.or.cz/"&gt;git&lt;/a&gt;, &lt;a href="http://svk.bestpractical.com/"&gt;svk&lt;/a&gt;, Mercurial, &lt;a href="http://bazaar-vcs.org/"&gt;Bazaar&lt;/a&gt;, etc.), but for some reason I've found it difficult to get excited about them.  The biggest barrier for me has been perceived complexity, but that is perhaps attributable in part to lack of exposure.  Well, I've been exposed to Mercurial now, and I &lt;span style="font-style: italic;"&gt;really&lt;/span&gt; like it so far.&lt;br /&gt;&lt;br /&gt;I've primarily been using &lt;a href="http://subversion.tigris.org/"&gt;Subversion&lt;/a&gt; for the past several years, and much to my surprise, Mercurial felt completely natural almost right away.  In fact, it was immediately easier to deal with branching and merging than it has ever been  for me with any other VCS.  I have historically avoided branched development when at all possible, because it has been hard to make sure that the VCS was doing what I intended.&lt;br /&gt;&lt;br /&gt;While &lt;a href="http://blog.pavlov.net/"&gt;Stuart&lt;/a&gt; and I were getting jemalloc working in Firefox, we were tossing patches back and forth constantly.  I spent a total of ~2 days just dealing with patch merges, and changes were dropped on the floor on multiple occasions.  
It occurs to me now that I could have avoided the majority of this work if we had been using something like Mercurial.  We wouldn't have lost changes, we wouldn't have had mystery failures due to subtle patch conflicts, and so on.&lt;br /&gt;&lt;br /&gt;Mercurial is so cool that I spent almost two full days trying to migrate my Subversion repositories.  In particular, I was initially trying to convert the Lyken repository, which consisted of 1023 revisions and perhaps 1000 files, with a couple of vendor code imports and one temporary branch (all pretty straightforward as repositories go).  I tried all of the following:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://cheeseshop.python.org/pypi/hgsvn"&gt;hgsvn&lt;/a&gt; silently failed to commit 233 files, which made the resulting repository almost completely useless.  I poked around in the code a bit and determined that fixing the problem myself would be a major undertaking.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://hg.rosdahl.net/yasvn2hg"&gt;yasvn2hg&lt;/a&gt; could only handle 'trunk', 'branches', and 'tags' at the top level, and I had 'vendor' as well.  I hacked on the code a bit and probably could have gotten it to work eventually, but I moved on in pursuit of easier solutions.&lt;/li&gt;&lt;li&gt;&lt;a href="http://www.selenic.com/mercurial/wiki/index.cgi/ConvertExtension"&gt;hg convert&lt;/a&gt;, which is an extension that comes with Mercurial, failed to do more than throw exceptions due to pickling failures.&lt;/li&gt;&lt;li&gt;&lt;a href="http://progetti.arstecnica.it/tailor"&gt;Tailor&lt;/a&gt; mostly worked, though it was completely broken as installed on my Ubuntu/amd64 7.10 system, so I had to install it manually.  
It got confused by a handful of revisions, but it merely left them as unmerged branches, and the fallout was minimal.&lt;/li&gt;&lt;/ul&gt;I never did find a complete example of how to use Tailor to convert a Subversion repository to Mercurial format, so here's a bit more detail, in the hope that it will be of use to someone.&lt;br /&gt;&lt;br /&gt;The command line I used was:&lt;pre&gt;tailor -D -v -F "" --configfile lyken.tailor&lt;/pre&gt;The hard part though was coming up with the configuration file.  Of course, the &lt;a href="http://progetti.arstecnica.it/tailor/browser/README.rst"&gt;manual&lt;/a&gt; might have helped, had I found it before writing this blog post.&lt;pre&gt;[DEFAULT]&lt;br /&gt;verbose = True&lt;br /&gt;&lt;br /&gt;[lyken]&lt;br /&gt;target = hg:target&lt;br /&gt;start-revision = INITIAL&lt;br /&gt;root-directory = /home/jasone/tmp&lt;br /&gt;state-file = tailor.state&lt;br /&gt;source = svn:source&lt;br /&gt;subdir = hg_lyken&lt;br /&gt;&lt;br /&gt;[hg:target]&lt;br /&gt;&lt;br /&gt;[svn:source]&lt;br /&gt;module = /&lt;br /&gt;repository = file:///home/jasone/tmp/svn_lyken&lt;/pre&gt;You can peruse the &lt;a href="http://lyken.net/cgi-bin/hg_lyken"&gt;resulting repository&lt;/a&gt; to see what sorts of warts I had to clean up after the conversion.  I have successfully converted several other repositories using the same method.  The &lt;a href="http://www.canonware.com/onyx/"&gt;Onyx&lt;/a&gt; repository is giving Tailor a real workout though, since it consists of 3475 revisions and (this is the killer) due to how &lt;a href="http://cvs2svn.tigris.org/"&gt;cvs2svn&lt;/a&gt; did things back when I switched to Subversion from &lt;a href="http://www.nongnu.org/cvs/"&gt;CVS&lt;/a&gt;, there are 180 extant branches, 47 extant tags, and [&lt;span style="font-style: italic;"&gt;gasp&lt;/span&gt;] 89087 extant files in the latest revision.  
It will probably take most of a day for Tailor to complete the conversion, and I can see in the log output that there are going to be a lot of problems in all the spontaneous branches cvs2svn generated.</content><link rel='alternate' type='text/html' href='http://www.canonware.com/~ttt/2008/03/migrating-from-subversion-to-mercurial.html' title='Migrating from Subversion to Mercurial'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4122965975758737600&amp;postID=7871702711924122947&amp;isPopup=true' title='2 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.canonware.com/~ttt/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4122965975758737600/posts/default/7871702711924122947'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4122965975758737600/posts/default/7871702711924122947'/><author><name>Jason</name><uri>http://www.blogger.com/profile/02753358958139234827</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-4122965975758737600.post-1428591766235810804</id><published>2008-01-26T20:53:00.000-08:00</published><updated>2008-01-26T21:26:12.937-08:00</updated><title type='text'>Perceived jemalloc memory footprint</title><content type='html'>For the past couple of months I have been working with the &lt;a href="http://www.mozilla.com/en-US/about/"&gt;Mozilla&lt;/a&gt; folks to integrate jemalloc into &lt;a href="http://www.mozilla.org/projects/firefox/"&gt;Firefox&lt;/a&gt;.  This past week, &lt;a href="http://blog.pavlov.net/"&gt;Stuart&lt;/a&gt; has been doing lots of performance testing to make sure jemalloc is actually an improvement, and he ran into an interesting problem on Windows: jemalloc appears to use more memory than the default allocator, because Windows' task manager reports mapped memory rather than actual working set.  
As near as we could tell, jemalloc was actually reducing the working set a bit, but the perception from looking at the task manager statistics was that jemalloc was a huge pessimization.  This is because jemalloc manages memory in chunks, and leaves each chunk mapped until it is completely empty.  Unfortunately, even though there is a way to tell Windows that unused pages can be discarded if memory becomes tight, appearances make it seem as if jemalloc is hogging memory.  Well, appearances do matter, so I have been working frantically the past few days to come up with a solution.  The upshot is that I may have ended up with a solution to related problems for jemalloc in &lt;a href="http://www.freebsd.org/"&gt;FreeBSD&lt;/a&gt;, its native setting.&lt;br /&gt;&lt;br /&gt;In FreeBSD, there is an optional runtime flag that tells malloc to call &lt;a href="http://www.freebsd.org/cgi/man.cgi?query=madvise&amp;amp;apropos=0&amp;amp;sektion=0&amp;amp;manpath=FreeBSD+7.0-RELEASE&amp;amp;format=html"&gt;madvise(2)&lt;/a&gt; for pages that are still mapped, but for which the data are no longer needed.  This would be great, but madvise() is quite expensive to call, which leaves us with little choice but to disable those calls by default.  What that means is that when memory becomes tight and the kernel needs to free up some RAM, it has to swap out the junk in those pages, just as if the junk were critical data.  The repercussions are system-wide, since pretty much every application has those madvise() calls disabled.&lt;br /&gt;&lt;br /&gt;The solution is pretty straightforward: rather than calling madvise() as soon as pages of memory can be discarded, simply make a note that those pages are dirty.  Then, if the amount of dirty discardable memory exceeds some threshold, march down through memory and call madvise() until the amount of dirty memory has been brought under control.  
This tends to vastly reduce the number of madvise() calls, but without ever leaving very much dirty memory lying around.&lt;br /&gt;&lt;br /&gt;I still need to do a bunch of performance analysis before integrating this change into FreeBSD, but my expectation is that as an indirect result of trying to make jemalloc look good on Windows, FreeBSD is going to benefit.</content><link rel='alternate' type='text/html' href='http://www.canonware.com/~ttt/2008/01/perceived-jemalloc-memory-footprint.html' title='Perceived jemalloc memory footprint'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4122965975758737600&amp;postID=1428591766235810804&amp;isPopup=true' title='3 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.canonware.com/~ttt/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4122965975758737600/posts/default/1428591766235810804'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4122965975758737600/posts/default/1428591766235810804'/><author><name>Jason</name><uri>http://www.blogger.com/profile/02753358958139234827</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-4122965975758737600.post-7506238022158910259</id><published>2007-11-28T01:02:00.000-08:00</published><updated>2007-11-28T01:47:19.839-08:00</updated><title type='text'>Firefox fragmentation?</title><content type='html'>As &lt;a href="http://www.mozilla.org/projects/firefox/"&gt;Firefox&lt;/a&gt; 3 nears release, some of its developers are taking a close look at memory fragmentation issues. There is good information over at &lt;a href="http://blog.pavlov.net/2007/11/10/memory-fragmentation/"&gt;pavlov.net&lt;/a&gt; that I won't repeat here. One recurring theme though is that memory usage in version 2 was reported by some users to be problematic, and fragmentation is a suspected culprit.  
This has motivated an investigation of memory fragmentation before version 3 is released.&lt;br /&gt;&lt;br /&gt;As the author of &lt;a href="http://people.freebsd.org/%7Ejasone/jemalloc/bsdcan2006/jemalloc.pdf"&gt;jemalloc&lt;/a&gt;, I have a deep (read: obsessive) interest in memory fragmentation issues, so I spent some time brushing off my &lt;a href="http://people.freebsd.org/%7Ejasone/jemalloc/progs/"&gt;malloc plotting tools&lt;/a&gt; today. Here is a plot from a run of firefox 2.0.0.9 running on FreeBSD-current. In order to generate the allocation trace, I launched firefox, then went through several cycles of opening lots of tabs/windows and then closing most of them.&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.canonware.com/%7Ettt/firefox_20071128b.png"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="http://www.canonware.com/%7Ettt/uploaded_images/firefox2-725374.png" alt="" border="0" /&gt;&lt;/a&gt;Time starts at the left, and execution ends at the right.  Each vertical column of pixels represents a snapshot of memory usage at a particular moment during program execution (time is measured by allocation events).  Since there are millions of allocation events, most snapshots are left out to make the plot size manageable.  Similarly, there are many bytes of memory that must be represented by each vertical column of pixels, so each pixel represents a bucket of 256kB.  Low addresses are at the bottom of the plot.&lt;br /&gt;&lt;br /&gt;Note the peaks that are mostly green.  Those occur during peak memory usage periods, and overall, the plot shows that fragmentation isn't bad.  
Take this with a grain of salt though, since the plot only represents perhaps 15 minutes of heavy web browsing.&lt;br /&gt;&lt;br /&gt;If you want to see much more detail (each bucket is 4kB -- one page), take a look at this &lt;a href="http://www.canonware.com/%7Ettt/firefox_20071128a.png"&gt;image&lt;/a&gt;.  It is big enough to cause most image viewers to choke, so beware.</content><link rel='alternate' type='text/html' href='http://www.canonware.com/~ttt/2007/11/firefox-fragmentation.html' title='Firefox fragmentation?'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4122965975758737600&amp;postID=7506238022158910259&amp;isPopup=true' title='1 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.canonware.com/~ttt/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4122965975758737600/posts/default/7506238022158910259'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4122965975758737600/posts/default/7506238022158910259'/><author><name>Jason</name><uri>http://www.blogger.com/profile/02753358958139234827</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-4122965975758737600.post-5013906933242226132</id><published>2007-11-10T08:40:00.000-08:00</published><updated>2007-11-24T10:41:56.577-08:00</updated><title type='text'>Fixed-precision (n choose k) and overflow</title><content type='html'>I recently found myself needing to compute (n choose k) with 64-bit integers.  Recall that (n choose k) is equal to n!/[k!(n-k)!].  Mathematically, this is not a difficult computation, but when considered in the context of integer overflow, the problem becomes much harder.&lt;br /&gt;&lt;br /&gt;To illustrate the problem, consider the computation of (9 choose 4) using 8-bit signed integers.  
We can start off by doing some straightforward cancellation, which leaves us with [9*8*7*6]/[4*3*2].  Where do we go from here though?  If we multiply all of the terms in the numerator first, we get an intermediate result of [3024]/[4*3*2], which clearly does not fit in the [-128..127] range.  The method that we are taught on paper is to cancel factors until none are left in the denominator, then multiply the remaining factors in the numerator to get the answer.  We can write a program that effectively does the same thing, but do we really have to create vectors of terms and duplicate the hand method?&lt;br /&gt;&lt;br /&gt;I searched high and low for information about how best to implement (n choose k) with fixed-precision integers, without success.  While considering the mechanics of coding the hand method, I realized that computing greatest common divisors (GCDs) would be a critical component.  I then began to wonder if there might be an iterative algorithm that does not require manipulating vectors of integers.  Here is what I came up with.  (n choose k) is [n*(n-1)*...*(n-k+1)] / [k*(k-1)*...*1].  Let us call the vectors of terms in the numerator and denominator [C] and [D], respectively, so (n choose k) is [C]/[D].&lt;br /&gt;&lt;ol&gt;&lt;li&gt;If (k &gt; n/2), set k &lt;-- n-k.  This does not change the result, but it reduces the computational overhead for later steps.&lt;/li&gt;&lt;li&gt;Initialize accumulators A and B for the numerator and denominator of the result to 1, so that A/B is 1/1.  Note that upon completion, B will always be 1, thus leaving the result in A, but during computation, B may be greater than 1.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;While possible without overflowing A (and while [C] is non-empty), repeatedly merge the first term of [C] (call it c) into A and remove the term from [C].  This is achieved via the following steps:&lt;/li&gt;&lt;ol&gt;&lt;li&gt;Compute g &lt;-- GCD(c, B).&lt;/li&gt;&lt;li&gt;B &lt;-- B/g and c &lt;-- c/g.  
This removes common factors.&lt;/li&gt;&lt;li&gt;A &lt;-- A*c.&lt;/li&gt;&lt;/ol&gt;&lt;li&gt;While possible without overflowing B (and while [D] is non-empty), repeatedly merge the first term of [D] into B and remove the term from [D].  This is achieved using the same algorithm as for step 3.&lt;/li&gt;&lt;li&gt;If no progress was made in steps 3 or 4, fail due to unavoidable overflow.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;If [C] or [D] is non-empty, go back to step 3.&lt;/li&gt;&lt;/ol&gt;Here is a &lt;a href="http://www.canonware.com/%7Ettt/nk.c"&gt;reference implementation in C&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Since implementing the algorithm, I have been troubled by a seemingly simple question: does this algorithm ever fail even though the final result can be expressed without overflow?  My intuition is that the algorithm always succeeds, but a proof has thus far eluded me.  I have exhaustively tested the algorithm for 32-bit integers, and the algorithm never fails.  Unfortunately, I really need to move on to other work, since the algorithm certainly works well enough for my needs.</content><link rel='alternate' type='text/html' href='http://www.canonware.com/~ttt/2007/11/fixed-precision-n-choose-k-and-overflow.html' title='Fixed-precision (n choose k) and overflow'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4122965975758737600&amp;postID=5013906933242226132&amp;isPopup=true' title='12 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.canonware.com/~ttt/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4122965975758737600/posts/default/5013906933242226132'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/4122965975758737600/posts/default/5013906933242226132'/><author><name>Jason</name><uri>http://www.blogger.com/profile/02753358958139234827</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-4122965975758737600.post-4277080200976186681</id><published>2007-07-09T12:37:00.000-07:00</published><updated>2007-07-09T14:35:56.556-07:00</updated><title type='text'>Tagged unboxed floating point numbers</title><content type='html'>Several modern programming language implementations employ a representation of object reference slots that is self-describing, in order to facilitate run-time type checks and automatic garbage collection.  By reserving one or more bits to indicate the type of data stored within the slot, it is possible to differentiate a pointer (also known as a reference to a "boxed" object) from, say, an integer (also known as an "unboxed" integer).&lt;br /&gt;&lt;br /&gt;Suppose that reference slots are 64-bits wide, and that 61 bits can be used to store an unboxed integer.  Assuming signed integers, we can store an integer in [-2^60..2^60), but outside that range, we are forced to create a boxed integer object and store a reference to that object.  This is in fact how &lt;a href="http://lyken.net/"&gt;Lyken&lt;/a&gt; implements integers (though it preserves 62 bits of accuracy for integers).&lt;br /&gt;&lt;br /&gt;Now, suppose that we want to support double-precision floating point numbers.  The fundamental approaches taken by every implementation I have found in the literature are to either 1) box &lt;span style="font-style: italic;"&gt;all&lt;/span&gt; floating point numbers, or 2) to use a combination of boxed floating point numbers and untagged floating point numbers.  As one might imagine, (1) can cause serious performance degradation for numerically intensive programs, due to the need to create new boxed objects to store the result of each floating point computation.  
As for (2), there are numerous papers that discuss various compilation strategies for finding opportunities to use &lt;span style="font-style: italic;"&gt;untagged&lt;/span&gt; unboxed floating point numbers, but these techniques appear to be limited to particular problem domains, since they mainly try to convert vectors of floating point numbers to be untagged and unboxed.  Nowhere have I found any mention whatsoever of using &lt;span style="font-style: italic;"&gt;tagged&lt;/span&gt; unboxed floating point numbers.&lt;br /&gt;&lt;br /&gt;Let us consider the IEEE 754 floating point number format to see what challenges there are to tagged unboxed double-precision floating point numbers.   (If you are unfamiliar with the format, I suggest taking a look at the &lt;a href="http://en.wikipedia.org/wiki/IEEE_754"&gt;Wikipedia&lt;/a&gt; page for an overview.)  There are three fields: 1) sign, 2) exponent, and 3) fraction.&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(51, 51, 255);"&gt;s&lt;/span&gt;&lt;span style="color: rgb(255, 0, 0);"&gt;eeeeeee eeee&lt;/span&gt;&lt;span style="color: rgb(51, 204, 0);"&gt;ffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Suppose we were to steal 3 bits from the fraction field.  In order to avoid losing precision, we would have to box all numbers that did not have a particular bit pattern for those 3 stolen bits, thus allowing us to unbox perhaps 12.5% of the time.  This is not compelling.&lt;br /&gt;&lt;br /&gt;What if we were to instead steal bits from the exponent?  This is much more useful, because it allows us to accurately store all values except those with the most extreme exponent values.  
Of course, there are programs that actually need the full range of exponent values, but they are by no means the common case.&lt;br /&gt;&lt;br /&gt;There are some details that make such unboxing more work than for integers:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;The exponent must be re-biased.&lt;/li&gt;&lt;li&gt;It is harder to remove the exponent bits, since they are internal.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;There are special values that require special handling (+-0.0, +-Inf, NaN).&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;Explaining the nuances in words is rather tedious, so actual code follows instead.&lt;pre&gt;typedef union {&lt;br /&gt;  uint64_t u;&lt;br /&gt;  int64_t i;&lt;br /&gt;  double r;&lt;br /&gt;} LktRealUnion;&lt;br /&gt;&lt;br /&gt;void&lt;br /&gt;LkRealNew(LktSlot *aReal, double aVal) {&lt;br /&gt;    LktRealUnion val;&lt;br /&gt;    val.r = aVal;&lt;br /&gt;&lt;br /&gt;    // Check whether +-0.0.&lt;br /&gt;    if (val.u &amp; 0x7fffffffffffffffLLU) {&lt;br /&gt;        LktRealUnion unboxed;&lt;br /&gt;&lt;br /&gt;        // Re-bias the exponent by subtracting 896.  
This makes the useful&lt;br /&gt;        // exponent range for unboxed reals [-127..128].&lt;br /&gt;        unboxed.u = val.u - 0x3800000000000000LLU;&lt;br /&gt;        // Check that the most significant 3 exponent bits are 0.&lt;br /&gt;        if (unboxed.u &amp; 0x7000000000000000LLU) {&lt;br /&gt;            if ((val.u &amp;amp; 0x7ff0000000000000LLU) == 0x7ff0000000000000LLU) {&lt;br /&gt;                // Special value (Inf or NaN).&lt;br /&gt;                uint64_t sign = (val.u &amp; 0x8000000000000000LLU);&lt;br /&gt;                unboxed.u = val.u &lt;&lt; 3; // Shift the original bits, so that LkRealGet() sees all-ones low exponent bits.&lt;br /&gt;                unboxed.u &amp;= 0x7fffffffffffffffLLU; // Clear sign bit.&lt;br /&gt;                unboxed.u |= sign;&lt;br /&gt;                unboxed.u |= 0x3; // Tag.&lt;br /&gt;                aReal-&gt;u.b = unboxed.u;&lt;br /&gt;            } else {&lt;br /&gt;                // Overflow; box.&lt;br /&gt;&lt;br /&gt;                // [...]&lt;br /&gt;            }&lt;br /&gt;        } else {&lt;br /&gt;            uint64_t sign = (val.u &amp; 0x8000000000000000LLU);&lt;br /&gt;            unboxed.u &lt;&lt;= 3;&lt;br /&gt;            // Sign bit is already cleared as a result of exponent re-biasing.&lt;br /&gt;            unboxed.u |= sign;&lt;br /&gt;            unboxed.u |= 0x3; // Tag.&lt;br /&gt;            aReal-&gt;u.b = unboxed.u;&lt;br /&gt;        }&lt;br /&gt;    } else {&lt;br /&gt;        // +-0.0.&lt;br /&gt;        aReal-&gt;u.b = val.u | 0x3; // Tag.&lt;br /&gt;    }&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;double&lt;br /&gt;LkRealGet(LktSlot *aReal) {&lt;br /&gt;    LktRealUnion val;&lt;br /&gt;&lt;br /&gt;    LkmAssert(LkSlotTypeGet(aReal) == LkRealType());&lt;br /&gt;&lt;br /&gt;    // Check whether boxed.&lt;br /&gt;    val.u = aReal-&gt;u.b;&lt;br /&gt;    if ((val.u &amp; 0x7) == 0x3) {&lt;br /&gt;        // Check whether +-0.0.&lt;br /&gt;        if (val.u &amp;amp; 0x7ffffffffffffff8LLU) {&lt;br /&gt;            val.i &gt;&gt;= 3; // Sign-extended shift 
preserves the sign bit.&lt;br /&gt;            val.u &amp;= 0x8fffffffffffffffLLU; // Clear upper exponent bits.&lt;br /&gt;            // Check whether a special value (Inf or NaN).&lt;br /&gt;            if ((val.u &amp; 0x0ff0000000000000LLU) != 0x0ff0000000000000LLU) {&lt;br /&gt;                // Re-bias the exponent by adding 896.&lt;br /&gt;                val.u += 0x3800000000000000LLU;&lt;br /&gt;            } else {&lt;br /&gt;                // Special value.  Set all exponent bits.&lt;br /&gt;                val.u |= 0x7ff0000000000000LLU;&lt;br /&gt;            }&lt;br /&gt;        } else {&lt;br /&gt;            // +-0.0.&lt;br /&gt;            val.u &amp;= 0x8000000000000000LLU;&lt;br /&gt;        }&lt;br /&gt;        return val.r;&lt;br /&gt;    } else {&lt;br /&gt;        // Boxed.&lt;br /&gt;        LktReal *r = (LktReal *) aReal-&gt;u.p;&lt;br /&gt;        return r-&gt;val;&lt;br /&gt;    }&lt;br /&gt;}&lt;br /&gt;&lt;/pre&gt;As you can see, unboxed floating point numbers do incur some overhead, but for typical applications, they appear to me to be a big improvement over uniformly boxed floating point numbers.</content><link rel='alternate' type='text/html' href='http://www.canonware.com/~ttt/2007/07/tagged-unboxed-floating-point-numbers.html' title='Tagged unboxed floating point numbers'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4122965975758737600&amp;postID=4277080200976186681&amp;isPopup=true' title='5 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.canonware.com/~ttt/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4122965975758737600/posts/default/4277080200976186681'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/4122965975758737600/posts/default/4277080200976186681'/><author><name>Jason</name><uri>http://www.blogger.com/profile/02753358958139234827</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-4122965975758737600.post-8669289762429936226</id><published>2007-06-19T13:51:00.000-07:00</published><updated>2007-06-19T14:37:24.893-07:00</updated><title type='text'>Unicode strings for Lyken</title><content type='html'>&lt;a href="http://lyken.net/"&gt;Lyken&lt;/a&gt;, a programming language I am currently developing, uses &lt;a href="http://unicode.org/"&gt;Unicode&lt;/a&gt; for all strings.  Lyken is just one of many languages that have to overcome a set of design challenges associated with Unicode, though at least Lyken itself has no legacy support requirements.  However, since Lyken's runtime library is written in C, I still have to devise a way to provide pure Unicode string support in Lyken, without making runtime library development overly cumbersome.&lt;br /&gt;&lt;br /&gt;A couple of years ago I decided to use a simplistic internal representation for strings in Lyken.  The idea was to maintain an ASCII representation of each purely ASCII string, but to also maintain a UCS-4 representation of every string (lazily created for pure ASCII strings).  This had a critical problem though:  C library interfaces use (char *) strings, thus making it impossible to use non-ASCII strings for many purposes.  This problem made it clear that I needed to somehow support UTF-8 in Lyken's runtime library.&lt;br /&gt;&lt;br /&gt;One possible approach would be to internally store each string both as UTF-8 and UCS-4, but that is a tremendous waste of memory both for ASCII and non-ASCII strings.  
Instead, I have decided to just store strings as UTF-8, but that has performance issues for indexed access.&lt;br /&gt;&lt;br /&gt;In order to mitigate the indexed access performance issue for UTF-8, I store a lazily initialized table that records the location of every n&lt;sup&gt;th&lt;/sup&gt; character (n=32 for now).  Immutable strings make lazy table initialization safe for multi-threaded programs, with no need for synchronization.  The table is only needed for non-ASCII strings and is known to be present just past the end of the string itself iff the string's byte/character lengths differ.&lt;br /&gt;&lt;br /&gt;I have searched for information on better approaches to solving the indexed access problem for UTF-8 strings, but have found nothing.  If you know of anything better, please let me know.</content><link rel='alternate' type='text/html' href='http://www.canonware.com/~ttt/2007/06/unicode-strings-for-lyken.html' title='Unicode strings for Lyken'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4122965975758737600&amp;postID=8669289762429936226&amp;isPopup=true' title='2 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.canonware.com/~ttt/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4122965975758737600/posts/default/8669289762429936226'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4122965975758737600/posts/default/8669289762429936226'/><author><name>Jason</name><uri>http://www.blogger.com/profile/02753358958139234827</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-4122965975758737600.post-7386317068589373107</id><published>2007-05-06T19:53:00.000-07:00</published><updated>2007-05-06T20:08:46.239-07:00</updated><title type='text'>Why the overwhelming silence?</title><content type='html'>I uploaded an updated version of the &lt;a 
href="http://www.canonware.com/download/Parsing/Parsing.py"&gt;Parsing module&lt;/a&gt;.  The changes are minor, which is a good indicator of the code's maturity when you consider that I continue to use it heavily to create new parsers.  This is a really solid parser generator, yet the public reception has been overwhelming silence.&lt;br /&gt;&lt;br /&gt;I can guess why this might be, but given the generally sad state of Python-based parser generator software, I expected that at least &lt;i&gt;someone&lt;/i&gt; would find Parsing useful.  I would really appreciate hearing from people who evaluate the software and decide not to use it, and in particular why, so that I can potentially do a better job of meeting the programming community's needs.</content><link rel='alternate' type='text/html' href='http://www.canonware.com/~ttt/2007/05/why-overwhelming-silence.html' title='Why the overwhelming silence?'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4122965975758737600&amp;postID=7386317068589373107&amp;isPopup=true' title='6 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.canonware.com/~ttt/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4122965975758737600/posts/default/7386317068589373107'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4122965975758737600/posts/default/7386317068589373107'/><author><name>Jason</name><uri>http://www.blogger.com/profile/02753358958139234827</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-4122965975758737600.post-4409980734471037674</id><published>2007-03-22T22:16:00.000-07:00</published><updated>2007-03-22T22:52:45.458-07:00</updated><title type='text'>A simple Parsing-based parser example</title><content type='html'>As requested by several people, I have uploaded a simple &lt;a 
href="http://www.canonware.com/download/Parsing/examples/example1.py"&gt;example parser&lt;/a&gt; that uses the &lt;a href="http://www.canonware.com/download/Parsing/Parsing.py"&gt;Parsing&lt;/a&gt; module.  It is pretty self-explanatory, so I encourage you to take a look at it, run it, and experiment with changes.&lt;br /&gt;&lt;br /&gt;That done, I should say a bit more about how I actually use the Parsing module.  Well, the first thing I did with it was to write a parser for a parser generator input language similar to what &lt;a href="http://www.cs.berkeley.edu/~smcpeak/elkhound/sources/elkhound/index.html"&gt;Elkhound&lt;/a&gt; supports.  The parser translates the input to two output files, Token.py and Ast.py, which contain code that the Parsing module can generate a parser from.  Here are a few example productions from Lyken's grammar specification:&lt;br /&gt;&lt;pre&gt;token comment=xclass:"comment" {&lt;br /&gt;    inline Init ${&lt;br /&gt;        self.val = raw&lt;br /&gt;    }$&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;fail pStmtList &lt;{pExpr};&lt;br /&gt;nonterm StmtList=[e] {&lt;br /&gt;    -&gt; Stmt(Stmt, semicolon);&lt;br /&gt;    -&gt; DelimitedExpr(DelimitedExpr) [pStmtList];&lt;br /&gt;&lt;br /&gt;    -&gt; ExtendStmt(StmtList, Stmt, semicolon);&lt;br /&gt;    -&gt; ExtendDelimitedExpr(StmtList, DelimitedExpr) [pStmtList];&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;nonterm Stmts {&lt;br /&gt;    -&gt; Empty;&lt;br /&gt;    -&gt; Stmt(Stmt);&lt;br /&gt;    -&gt; StmtList(StmtList);&lt;br /&gt;}&lt;br /&gt;&lt;br /&gt;start Module=[S] {&lt;br /&gt;    -&gt; Module(boi, DocStr, ModuleDecl, Version, InitialCBlock, Stmts);&lt;br /&gt;}&lt;/pre&gt; There are numerous features not represented above, but the general idea should be apparent.  Note that embedded code can be associated with productions.  
This allows me to do some pretty highly stylized code generation, yet still embed custom code where necessary.&lt;br /&gt;&lt;br /&gt;One of the non-obvious clauses above is the "=[S]" that follows "start Module".  This is extra annotation that says the Module production provides an outer lexical scope.  By supporting such custom annotations, I am able to automatically generate code that deals with many aspects of semantic analysis.  This is also one of the main reasons I haven't seen fit to release the grammar specification parser -- it is not obvious to me how to generalize such features in a way that everyone can benefit.  At the moment I am of the opinion that the low level docstring-based interface to the Parsing module is good for small- to medium-size parsers, and that for large parsers, you need to write a custom translator that I can't hope to guess the needs of.</content><link rel='alternate' type='text/html' href='http://www.canonware.com/~ttt/2007/03/simple-parsing-based-parser-example.html' title='A simple Parsing-based parser example'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4122965975758737600&amp;postID=4409980734471037674&amp;isPopup=true' title='0 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.canonware.com/~ttt/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4122965975758737600/posts/default/4409980734471037674'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4122965975758737600/posts/default/4409980734471037674'/><author><name>Jason</name><uri>http://www.blogger.com/profile/02753358958139234827</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-4122965975758737600.post-8217957316746017705</id><published>2007-03-19T21:08:00.000-07:00</published><updated>2007-03-19T21:44:42.575-07:00</updated><title type='text'>Parsing.py parser generator 
is now available</title><content type='html'>The parser generator I implemented has been quite stable for over a month now.  It has the potential to be of use to others, so I am making it publicly available.  Parsing.py is a stand-alone pure Python module.  This makes it easy to maintain and use, but as a result it is substantially slower than C-based parser generators and parsers.  That is the only negative thing I can think of to say though.  In my obviously biased opinion, the Parsing module is extremely cool.  If you need to implement a parser in Python, you should give it a serious look.&lt;br /&gt;&lt;br /&gt;Here is a quick summary of what the Parsing module is:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;True LR(1) parser generator.  Python slowness aside, the algorithms used are extremely scalable; I am currently using it for a grammar with well over 500 productions.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Both standard LR (aka CFSM) and GLR parser drivers.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Tight Python integration.  Parser generator directives are specified via docstrings.  Rather than running a parser generator as a separate step, it is done on the fly, and the results are cached in a pickle.  For subsequent runs, as long as the pickle is still compatible with the parser specification, the pickle is used directly.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Extensive error checking and logging.  You can get a very clear idea of what is going wrong, during both parser generation and parsing, by enabling logfile output.&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;The module is heavily documented via docstrings.  The easiest way to view the documentation in a reasonable format is via the interactive python command line (import Parsing; help(Parsing)).  It is worth mentioning here that you need Python 2.5.  As far as I know, the (... if ... else ...) 
expression syntax is the only reason for this dependency, so if you want to use Parsing.py with an older Python interpreter, porting it should not cause you much trouble.&lt;br /&gt;&lt;br /&gt;Okay, without further delay, here it is: &lt;a href="http://www.canonware.com/download/Parsing/Parsing.py"&gt;http://www.canonware.com/download/Parsing/Parsing.py&lt;/a&gt;</content><link rel='alternate' type='text/html' href='http://www.canonware.com/~ttt/2007/03/parsingpy-parser-generator-is-now.html' title='Parsing.py parser generator is now available'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4122965975758737600&amp;postID=8217957316746017705&amp;isPopup=true' title='3 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.canonware.com/~ttt/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4122965975758737600/posts/default/8217957316746017705'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4122965975758737600/posts/default/8217957316746017705'/><author><name>Jason</name><uri>http://www.blogger.com/profile/02753358958139234827</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-4122965975758737600.post-7519567402426707415</id><published>2007-02-13T10:48:00.000-08:00</published><updated>2007-02-13T11:43:09.215-08:00</updated><title type='text'>More forgotten parser generation algorithms</title><content type='html'>When writing a grammar specification that is input to a parser generator, the most natural way of describing the grammar is often ambiguous.  There are two solutions: 1) rewrite the grammar to be less obvious, or 2) use precedence rules to disambiguate conflicting actions.  
In practice, I find myself using both approaches, according to which is least distasteful for the ambiguity at hand.&lt;br /&gt;&lt;br /&gt;However, I have lately run into a series of bugs in the &lt;a href="http://www.lyken.net/"&gt;Lyken&lt;/a&gt; parser that are a result of the following steps: 1) disambiguate the grammar using precedences, 2) continue grammar development.  What happens is that the precedence specifications added in step (1) end up being inadvertently employed to disambiguate additions made in step (2), but in many cases not as I would have chosen, were I presented with an ambiguity to resolve.  The result is obscured bugs that only show up when parsing code that exercises the appropriate broken portions of the parser.&lt;br /&gt;&lt;br /&gt;I started thinking about how to avoid these masked ambiguities, and realized that in many cases it is impossible, due to the precedence machinery provided by virtually every parser generator in existence (if there is any precedence support at all).  Here is a typical set of precedence specifications as supported by YACC.&lt;br /&gt;&lt;blockquote&gt;%left '+' '-'&lt;br /&gt;%left '*' '/'&lt;/blockquote&gt;The operator sets are listed from lowest to highest precedence so that multiplication/division has higher precedence than addition/subtraction.&lt;br /&gt;&lt;br /&gt;For simple examples, it is hard to see what is wrong with this scheme, but for more complex grammars, there is a problem:  It is impossible to declare a precedence relationship between a production and, say, addition/subtraction, without incidentally declaring a precedence relationship with &lt;span style="font-style:italic;"&gt;every other precedence&lt;/span&gt;.  In essence, we are stuck with a linearization of what should really be a directed acyclic graph (DAG) of precedence relationships.&lt;br /&gt;&lt;br /&gt;Apparently, before LR parsing became the norm, there was a more limited method called &amp;ldquo;precedence parsing&amp;rdquo;.  
The precedence support we have in LR-family parser generators apparently was added to subsume precedence parsing, making pure precedence parsing completely obsolete.  The problem is, we are stuck with a special case of precedence parsing, even though the general case was worked out and published (&lt;a href="#gray_1973"&gt;Gray 1973&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;I have implemented DAG-based precedence specification support in my parser generator, and indeed it solves the problem described earlier.  Since the DAG can be disjoint, it is possible to, for example, disambiguate a reduce/reduce conflict without any possibility of masking conflicts due to later grammar additions.  Naturally, much care is still required when using precedence specifications for disambiguation, but with the DAG-based approach, at least I am no longer hobbled by an unnecessary limitation.&lt;br /&gt;&lt;hr /&gt;&lt;span style="font-weight:bold;"&gt;References&lt;/span&gt;&lt;br /&gt;&lt;a name="gray_1973"&gt;Gray, James N. and Michael A. Harrison (1973) Canonical Precedence Schemes.  
JACM 20(2), 214-234.&lt;/a&gt;</content><link rel='alternate' type='text/html' href='http://www.canonware.com/~ttt/2007/02/more-forgotten-parser-generation.html' title='More forgotten parser generation algorithms'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4122965975758737600&amp;postID=7519567402426707415&amp;isPopup=true' title='0 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.canonware.com/~ttt/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4122965975758737600/posts/default/7519567402426707415'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4122965975758737600/posts/default/7519567402426707415'/><author><name>Jason</name><uri>http://www.blogger.com/profile/02753358958139234827</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-4122965975758737600.post-745378922682407320</id><published>2007-01-11T16:34:00.000-08:00</published><updated>2007-02-13T11:47:35.984-08:00</updated><title type='text'>Parser generation algorithms of a bygone era</title><content type='html'>In my &lt;a href="http://www.canonware.com/%7Ettt/2006/12/deprogramming-lalr1-bias.html"&gt;most recent post&lt;/a&gt;, I talked about the disadvantages of LALR parser generation as compared to the more general LR method.  Although I now believe even more fervently that LR parser generation is the clear winner over LALR, I need to retract part of what I said, and expound on what really works.  Indeed, I have since then transcended a technical travail, thanks in part to some most helpful comments from readers.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;History&lt;/span&gt;&lt;br /&gt;First, let us informally go over a bit of history.  
The earliest high-level languages were a real challenge to create, partly because the designers did not at that time have adequate language parsing theory at their disposal.  This gave rise to a flurry of research, and in 1965, Donald Knuth published the seminal LR paper &lt;a href="#knuth_1965"&gt;(Knuth, 1965)&lt;/a&gt;.  This provided a foundation upon which to base general understanding of parsing, but the method for generating LR parsers was extremely computationally intensive, both in terms of space and time.  (This method can be found in the &lt;a href="#aho_1986"&gt;dragon book&lt;/a&gt;.)  Eventually, researchers began to publish derivative methods that were more practical.  Of particular note are the SLR (&lt;a href="#deremer_1971"&gt;DeRemer, 1971&lt;/a&gt;) and LALR (&lt;a href="#lalonde_1971"&gt;LaLonde, 1971&lt;/a&gt;) methods.  The original &lt;a href="http://en.wikipedia.org/wiki/Yacc"&gt;yacc&lt;/a&gt; LALR parser generator came into use in about 1973.  In about ten years, we went from having little understanding of parsing theory to having powerful tools.  Perhaps this is why later work apparently went unnoticed.  In particular, Pager (&lt;a href="#pager_1977"&gt;1977&lt;/a&gt;) seems to have been practically forgotten.&lt;br /&gt;&lt;br /&gt;LALR was devised as a compromise method to get around the extreme cost of the standard LR method.  However, Pager's method delivers full LR power without any downside.  Pager's method merges states during construction wherever possible, and the generated parser is approximately the same size as that generated by LALR.  Computationally, the algorithm is on par with LALR.  Finally, the algorithm is no more challenging to implement than LALR [&lt;a href="#footnote_1"&gt;1&lt;/a&gt;].&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Travails&lt;/span&gt;&lt;br /&gt;Now, back to my travails.  
I have been developing a parser generator as supporting software for &lt;a href="http://www.lyken.net/"&gt;Lyken&lt;/a&gt;, which is a language I am currently developing.  Parser generation is a means to an end for the purposes of Lyken, so I was hoping to spend as little time as possible on this part of the project.  In fact, I started out using &lt;a href="http://www.hwaci.com/sw/lemon/"&gt;lemon&lt;/a&gt;, which is an excellent LALR parser generator.  However, as the Lyken grammar neared its final form, I found that the convolutions necessary to wedge it into a form lemon could handle had left me with an unmaintainable mess.  Even small modifications were often requiring far-reaching changes, which was going to force me to do some substantial work to normalize the AST (abstract syntax tree) in order to make the guts of the compiler immune to superfluous parser changes [&lt;a href="#footnote_2"&gt;2&lt;/a&gt;].&lt;br /&gt;&lt;br /&gt;Eventually, I decided to go back and look more carefully at what GLR parsing could do to alleviate the problems I was having.  There is an excellent paper by Scott McPeak that describes in detail how to implement a practical GLR parser (&lt;a href="#mcpeak_2002"&gt;McPeak 2002&lt;/a&gt;), and I was sufficiently convinced of GLR's ability to help me clean up the Lyken parser mess that I took a step back and implemented a parser generator from scratch [&lt;a href="#footnote_3"&gt;3&lt;/a&gt;].  The parser generator ended up taking about 200 hours to implement, not because it requires lots of coding, but because I had to learn a lot along the way.&lt;br /&gt;&lt;br /&gt;I ran into a myriad of stumbling blocks, among which the most serious are listed below:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;The only reference I initially had for SLR/LALR/LR parser generation was the dragon book.  Opinions surely differ, but I found it to be rather inscrutable; full comprehension took many, many readings.  
I was not confident that I understood what I was reading, so I had to implement the algorithms as I read.  Since the book builds up, from LR(0) to SLR(1) to LR(1) to LALR(1), I ended up implementing the first three variants (before realizing that LALR was not what I wanted).&lt;br /&gt;&lt;/li&gt;&lt;li&gt;The naive LR(1) algorithm requires exponential time and space.  This was an issue 40 years ago, and to my great surprise (and mild shame, truth be told), it is still an issue now.  Of course, during testing, I was using relatively small grammars, so I did not recognize the critical nature of this issue until just after I thought I was completely done.  Since this was all Python code, there was some question as to whether C code would be fast enough.  25 hours of effort demonstrated that Python could only be blamed for 2-3 orders of magnitude slowdown, and I was looking at a 5+ order of magnitude problem.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;The literature on LR parser generation is quite old at this point, so when I did start trying to find information on better approaches, the web was of surprisingly little help.  I finally went to the trouble of re-joining the &lt;a href="http://acm.org/"&gt;ACM&lt;/a&gt; so that I could access its &lt;a href="http://portal.acm.org/dl.cfm"&gt;digital library&lt;/a&gt;, where I found several papers of interest.  At about the same time, &lt;a href="http://www.blogger.com/profile/18278436935541738068"&gt;Chris K&lt;/a&gt; pointed me at Pager's paper via &lt;a href="http://cristal.inria.fr/~fpottier/menhir/"&gt;Menhir&lt;/a&gt;, and from there the solution to LR parser generation unfolded rapidly.&lt;/li&gt;&lt;li&gt;There was an &lt;a href="http://www.cs.berkeley.edu/~smcpeak/elkhound/reduceViaPath_bug.html"&gt;erratum&lt;/a&gt; for the algorithms presented in the Elkhound technical report, and I did not find the notice online until after I had discovered and fixed the problem myself.  
This may not sound like a big deal, but my first attempt at solving the problem was very wrong, and it masked other problems with my implementation that did not surface until I gained a sufficient understanding of the erratum to fix the problem correctly.&lt;/li&gt;&lt;li&gt;The Elkhound technical report does not discuss how to deal with &amp;epsilon;-grammars, and I was unable to obtain any references on the subject.  Thus, I had to reinvent that wheel.  This, in combination with the erratum just mentioned, provided the setting for some difficult debugging.&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;Overall, the time requirements for this project were about what I estimated beforehand, but what actually took up that time surprised me in the extreme.&lt;br /&gt;&lt;hr /&gt;&lt;span style="font-weight: bold;"&gt;Footnotes&lt;br /&gt;&lt;/span&gt;&lt;a name="footnote_1"&gt;&lt;span style="font-weight: bold;"&gt;[1]&lt;/span&gt;&lt;/a&gt;  In the interest of full disclosure, I mention here that Pager presents two methods for determining state compatibility during state merging.  "Weak compatibility" is in practice perfectly adequate, and simple to implement, though there are some edge cases, rarely encountered in practice, in which a limited number of states are duplicated.  
"Strong compatibility" guarantees that the state machine is minimal, though it requires a bit more sophisticated programming, and appears to be included in the paper mainly for completeness, rather than due to practical need.&lt;br /&gt;&lt;br /&gt;&lt;a name="footnote_2"&gt;&lt;span style="font-weight: bold;"&gt;[2]&lt;/span&gt;&lt;/a&gt;  Also of note, Lyken requires two tokens of lookahead in one case, but this was not a critical parser issue, since I was able to convert the lookahead within the scanner.&lt;br /&gt;&lt;br /&gt;&lt;a name="footnote_3"&gt;&lt;span style="font-weight: bold;"&gt;[3]&lt;/span&gt;&lt;/a&gt;  &lt;a href="http://www.cs.berkeley.edu/~smcpeak/elkhound/"&gt;Elkhound&lt;/a&gt; appears to be an excellent parser generator, but I am bootstrapping Lyken with Python.  I had already gone through enough trouble gluing Lyken's C-based scanner/parser and the Python portions of the compiler that I felt it worthwhile to move to writing the bootstrap compiler purely in Python if I was going to be rewriting the parser anyway.  Unfortunately, there were no parser generation tools for Python that met my needs, which is why I wrote one from scratch.&lt;br /&gt;&lt;hr /&gt;&lt;span style="font-weight: bold;"&gt;References&lt;/span&gt;&lt;br /&gt;&lt;a name="aho_1986"&gt;Aho, Alfred V., Ravi Sethi, and Jeffrey D. Ullman (1986) Compilers: Principles, Techniques, and Tools.  Addison-Wesley Publishing Company, Reading, MA.&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;a name="deremer_1971"&gt;DeRemer, F.L. (1971) Simple LR(k) grammars.  Comm. ACM 14, 453-460.&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;a name="knuth_1965"&gt;Knuth, Donald E. (1965) On the translation of languages from left to right.  Information and Control 8, 607-639.&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;a name="lalonde_1971"&gt;LaLonde, W.R. (1971) An efficient LALR parser generator.  Computer Systems Research Group, University of Toronto, Tech. 
Report CSRG-2.&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;a name="mcpeak_2002"&gt;McPeak, Scott (2002) Elkhound: A Fast, Practical GLR Parser Generator.  University of California, Berkeley, Tech. Report UCB/CSD-2-1214.&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;a name="pager_1977"&gt;Pager, David (1977) A Practical General Method for Constructing LR(k) Parsers.  Acta Informatica 7, 249-268.&lt;/a&gt;</content><link rel='alternate' type='text/html' href='http://www.canonware.com/~ttt/2007/01/parser-generation-algorithms-of-bygone.html' title='Parser generation algorithms of a bygone era'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4122965975758737600&amp;postID=745378922682407320&amp;isPopup=true' title='0 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.canonware.com/~ttt/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4122965975758737600/posts/default/745378922682407320'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4122965975758737600/posts/default/745378922682407320'/><author><name>Jason</name><uri>http://www.blogger.com/profile/02753358958139234827</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-4122965975758737600.post-4338303907846884102</id><published>2006-12-29T10:18:00.000-08:00</published><updated>2007-02-13T11:46:28.297-08:00</updated><title type='text'>Deprogramming the LALR(1) Bias</title><content type='html'>Look around for LR(1) parser generators, and you will primarily find LALR(1) parser generators.  There seems to be an unspoken assumption that LALR(1) is somehow better than LR(1), but look at the following pertinent facts:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;In terms of grammars that can be handled by various parser generation techniques, SLR(1) &amp;sub; LALR(1) &amp;sub; LR(1). 
&lt;/li&gt;&lt;li&gt;SLR(1) and LALR(1) tables are always the same size, but LR(1) tables are potentially larger.&lt;/li&gt;&lt;/ul&gt;The important thing to notice here is that LALR(1) is &lt;span style="font-style: italic;"&gt;not&lt;/span&gt; the most powerful of the three parser generation techniques listed above.  LALR(1) may introduce reduce/reduce conflicts that do not exist when using LR(1).&lt;br /&gt;&lt;br /&gt;So then, why is there a bias toward LALR(1)?  I suspect that it has to do with the well known and widely cited &lt;a href="http://en.wikipedia.org/wiki/Compilers:_Principles%2C_Techniques%2C_and_Tools"&gt;dragon book&lt;/a&gt;, which treats LALR(1) as the culmination of the parser generation algorithms it presents.  There is one little detail that is not mentioned at all though: it is possible to compress LR(1) tables to be the same as LALR(1) tables, with the exception of avoiding table compression that would introduce LALR(1)'s reduce/reduce conflicts.  Granted, the book is over 20 years old now, but I would not be surprised if the &lt;a href="http://en.wikipedia.org/wiki/Compilers:_Principles%2C_Techniques%2C_and_Tools_%282nd_Edition%29"&gt;new edition&lt;/a&gt; preserves this omission.&lt;br /&gt;&lt;br /&gt;I recently finished implementing a parser generator (as the basis for a GLR parser, as described by the &lt;a href="http://www.cs.berkeley.edu/%7Esmcpeak/elkhound/"&gt;Elkhound&lt;/a&gt; &lt;a href="http://www.cs.berkeley.edu/%7Esmcpeak/elkhound/elkhound.ps"&gt;technical report&lt;/a&gt;).  I was initially unable to wrap my head around the "efficient" method that the dragon book provides for LALR(1) parser generation, so I took the incremental approach of implementing SLR(1), converting that to LR(1), then finally converting that to LALR(1) -- the "easy, inefficient" method according to the dragon book.  
Well, when I got to the point of compressing LR(1) tables to LALR(1) tables, I questioned the necessity of compressing states when reduce/reduce conflicts would result.  As near as I could tell, there is no fundamental requirement to do so, which means that compressed LR(1) tables that avoid such conflicts should be the gold standard, rather than LALR(1).&lt;br /&gt;&lt;br /&gt;This seemed too obvious to have been overlooked by the compiler community, so I looked high and low for prior art.  The best I found was a &lt;a href="http://groups-beta.google.com/group/comp.compilers/browse_thread/thread/c02b5fff15f1a2ca/e19dadb035501123?&amp;amp;hl=en"&gt;comp.compilers post&lt;/a&gt; by Chris F. Clark on 1 July 2003, which says, in part:&lt;blockquote&gt;  That brings us back to "full" or canonical LR(k), where k is &gt;= 1.&lt;br /&gt;LR(k) parsing preserves the needed left context to disambiguate&lt;br /&gt;certain situations where LALR(k) finds the rules to conflict.  The&lt;br /&gt;canonical way of doing this is to not merge states with different&lt;br /&gt;lookaheads in the first place. That causes an explosion in the&lt;br /&gt;table size as many states are kept distinct simply because they&lt;br /&gt;have different lookaheads, when in reality the lookahead for those&lt;br /&gt;states will never be consulted.  A more modern method for solving&lt;br /&gt;the problem involves splitting the states only when a conflict is&lt;br /&gt;detected.  Then, if the grammar is LALR, no splitting will occur&lt;br /&gt;and one has the same size machine as the LR(0), but as conflicts&lt;br /&gt;arise the require left context to resolve, the tables slowly grow.&lt;br /&gt;&lt;/blockquote&gt;There is likely some discussion of this optimization somewhere in the primary literature, but several hours of searching failed to turn it up.  
The closest I found is &lt;a href="http://edocs.tu-berlin.de/diss/2001/kannapinn_soenke.pdf"&gt;S&amp;ouml;nke Kannapinn's PhD thesis&lt;/a&gt; (in German, but the abstract is also in English), which comes to the same conclusion via a very different route.&lt;br /&gt;&lt;br /&gt;In summary, if you need to write a parser generator, make it LR(1) and use careful table compression, rather than using the less general LALR(1) approach.</content><link rel='alternate' type='text/html' href='http://www.canonware.com/~ttt/2006/12/deprogramming-lalr1-bias.html' title='Deprogramming the LALR(1) Bias'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4122965975758737600&amp;postID=4338303907846884102&amp;isPopup=true' title='12 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.canonware.com/~ttt/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4122965975758737600/posts/default/4338303907846884102'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4122965975758737600/posts/default/4338303907846884102'/><author><name>Jason</name><uri>http://www.blogger.com/profile/02753358958139234827</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-4122965975758737600.post-2684554762264857707</id><published>2006-12-29T09:48:00.000-08:00</published><updated>2006-12-29T10:15:46.558-08:00</updated><title type='text'>Transactional Memory: Panacea or Confounder?</title><content type='html'>There is a very nice review article on the subject of transactional memory (TM) as applied to programming languages in &lt;a href="http://www.acmqueue.org/modules.php?name=Content&amp;amp;pa=showpage&amp;amp;pid=444"&gt;ACM Queue, Vol. 4 No. 10&lt;/a&gt;.  
The article does a reasonable job of describing the basics of transaction processing, and explains how these techniques could be used to directly expose transaction semantics as a built-in programming language feature.&lt;br /&gt;&lt;br /&gt;Transaction processing has been actively researched by the database community since well before my time, probably for 30+ years.  This is important, because it means that the really new ideas embodied by TM are limited to approximately the following:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Transactions can be exposed as a general-purpose language construct (not just in database languages).&lt;/li&gt;&lt;li&gt;Hardware can be modified to reduce TM overhead.&lt;/li&gt;&lt;/ul&gt;These ideas are interesting enough, and certainly warrant research.  However, there are some basic compromises that transactions require, and nothing about TM changes that.&lt;br /&gt;&lt;br /&gt;There is a fundamental difference of approach when writing code that uses synchronization primitives (mutexes, condition variables, etc.) for deadlock/livelock-free algorithms, versus transaction processing.  For the former, we specifically write code that will always make forward progress, while trying to limit the synchronization overhead.  
With transactions, we are working at a much higher level, and leave it to the transaction manager to detect when conflicts arise, then (hopefully) make progress even when transactions conflict.&lt;br /&gt;&lt;br /&gt;So, I think TM is both panacea and confounder:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;Panacea:&lt;/span&gt; Programmers can work at a higher level that doesn't require thinking as hard about parallel execution.&lt;/li&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;Confounder:&lt;/span&gt; When working with transactions instead of low level synchronization, there is a lack of detailed information that prevents achieving maximum performance for many applications.&lt;/li&gt;&lt;/ul&gt;TM may be the right tool for some programming jobs, but it is almost certainly not the right tool for all programming jobs.</content><link rel='alternate' type='text/html' href='http://www.canonware.com/~ttt/2006/12/transactional-memory-panacea-or.html' title='Transactional Memory: Panacea or Confounder?'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4122965975758737600&amp;postID=2684554762264857707&amp;isPopup=true' title='0 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.canonware.com/~ttt/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4122965975758737600/posts/default/2684554762264857707'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4122965975758737600/posts/default/2684554762264857707'/><author><name>Jason</name><uri>http://www.blogger.com/profile/02753358958139234827</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-4122965975758737600.post-7762599443283022681</id><published>2006-12-03T11:50:00.000-08:00</published><updated>2006-12-03T13:01:18.366-08:00</updated><title type='text'>Impacts of multi-core processing on 
programming language design</title><content type='html'>Within the next couple of years, all modern desktop computers will have multiple CPUs, thanks to multi-core packaging.  This doesn't matter very much in the context of programming languages, though, until the number of CPUs crosses a certain threshold, say, 4-8 CPUs.  There are a few reasons for this, including:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;2 CPUs aren't a very compelling motivation for the significant extra development effort required for multi-threaded programming.&lt;/li&gt;&lt;li&gt;There are numerous hackish programming solutions that work reasonably well when targeting sub-4-CPU systems, which reduces the pressure to develop general solutions.&lt;/li&gt;&lt;li&gt;General solutions have some inherent overhead that substantially eats into the performance gains.  Depending on the programming model, general solutions need on the order of 4-8 CPUs before the overhead is an acceptable tradeoff for the improved scalability.&lt;/li&gt;&lt;/ul&gt;I have been waiting for multi-threaded programming to provide truly scalable performance gains for general computing ever since I started out developing software on IBM's OS/2 operating system in the early 1990s.  The crossover point has been "a few years" away for 15 years now, but now it's so close I can smell it.&lt;br /&gt;&lt;br /&gt;Researchers have long since developed the raw technology necessary to see us through this transition, but current tools are woefully lacking.  Let me briefly describe the shortcomings of some of the tools we currently have available.&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;C/C++ pthreads:&lt;/span&gt; C/C++ are dangerous, difficult languages when it comes to writing reliable software, let alone reliable multi-threaded software.  The single-image programming model that pthreads provides adds insult to injury.  Even experts have a very difficult time writing reliable multi-threaded C/C++ programs.  
On top of this, development aids such as debuggers are of limited use here, because of the &lt;a href="http://www.catb.org/%7Eesr/jargon/html/H/heisenbug.html"&gt;Heisenbug Principle&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;Java, C#:&lt;/span&gt; These languages improve on the C/C++ situation by providing language-level threading support, and the runtime environments improve analysis and debugging prospects.  For the next few years, this is probably the best we're going to do, but the single-image programming model is rather difficult to deal with, even under the best of circumstances.  I think there will always be a place for languages like these, but as the number of CPUs in computers increases, they will increasingly be seen as low-level languages.&lt;/li&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;Erlang:&lt;/span&gt; Erlang relies entirely on message passing for communication among threads (let's ignore for now that Erlang's terminology differs).  Threads do not explicitly share memory.  There are two problems with Erlang's approach:&lt;ul&gt;&lt;li&gt;Erlang actually runs all threads inside a single process that uses only one CPU.  This is an implementation detail, but in practice it limits flexibility, and the workarounds are less than ideal.&lt;/li&gt;&lt;li&gt;Passing all data as messages between threads can be very expensive, depending on the application.  This is a general concern, but at some threshold number of CPUs, I expect it to become an acceptable cost of developing highly scalable multi-threaded software.&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;Perl, Python, Ruby, etc.:&lt;/span&gt; These languages vary in their approaches to multi-threading, but I think it fair to say that none of them provide scalable, useful multi-threaded development support.  
I find this noteworthy because this class of languages is of increasing importance both for scripting and for larger-scale systems programming.  These languages will have to adapt if they are to maintain their value as systems programming languages.&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;I alluded above to a division between two approaches to multi-threaded programming: 1) single-image and 2) message-passing.  Right now, (1) is of primary importance (and the time to worry about it really is &lt;span style="font-style: italic;"&gt;now&lt;/span&gt;).  (2) is creeping up on us quickly, though; I predict that it will really start mattering when we start dealing with anything more than about 8 CPUs.  Why?  Because in my experience, the only reliable approach to writing software that scales beyond 8 CPUs is to rely mostly on message passing.&lt;br /&gt;&lt;br /&gt;Okay, here's where I pull out my crystal ball.  I predict that five years from now, the languages that provide the highest productivity with regard to multi-threaded programming will make message passing easy (i.e., it will be the primary mode of multi-threaded development), and single-image threading possible.  Right now, none of the available languages I'm aware of provide this focus.  What really concerns me, though, is that of the primary "scripting" languages, none are even &lt;span style="font-style: italic;"&gt;close&lt;/span&gt; to providing the necessary programming infrastructure, let alone the appropriate focus on methodology.  
This is where my attentions are currently focused, and I expect many of my future ramblings will relate to the topic.</content><link rel='alternate' type='text/html' href='http://www.canonware.com/~ttt/2006/12/impacts-of-multi-core-processing-on.html' title='Impacts of multi-core processing on programming language design'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4122965975758737600&amp;postID=7762599443283022681&amp;isPopup=true' title='4 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.canonware.com/~ttt/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4122965975758737600/posts/default/7762599443283022681'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4122965975758737600/posts/default/7762599443283022681'/><author><name>Jason</name><uri>http://www.blogger.com/profile/02753358958139234827</uri><email>noreply@blogger.com</email></author></entry><entry><id>tag:blogger.com,1999:blog-4122965975758737600.post-4521638830851834071</id><published>2006-12-02T12:03:00.000-08:00</published><updated>2007-02-13T11:44:45.572-08:00</updated><title type='text'>Things to come</title><content type='html'>For some time, I've been writing notes on what I would blog about, were I to have a blog.  Well, I have a blog now.  Given the blog name, it should come as no surprise that forthcoming posts will bore family members within an inch of death.  
Maybe others will derive some value though.&lt;br /&gt;&lt;br /&gt;Without further delay, here's an incomplete list of topics to come:&lt;ul&gt;&lt;li&gt;Garbage collection: Fast suspend/resume with critical section support&lt;/li&gt;  &lt;li&gt;Effects of multi-core computing on programming language design&lt;/li&gt;  &lt;li&gt;Why virtual machines matter&lt;/li&gt;  &lt;li&gt;Unicode's impact on programming language design&lt;/li&gt;  &lt;li&gt;Phylogenetic inference: Star decomposition is biased!&lt;/li&gt;&lt;/ul&gt;This is the sort of stuff that keeps me up at night.  If nothing else, it can put others to sleep at night.</content><link rel='alternate' type='text/html' href='http://www.canonware.com/~ttt/2006/12/things-to-come.html' title='Things to come'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=4122965975758737600&amp;postID=4521638830851834071&amp;isPopup=true' title='0 Comments'/><link rel='replies' type='application/atom+xml' href='http://www.canonware.com/~ttt/atom.xml' title='Post Comments'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4122965975758737600/posts/default/4521638830851834071'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4122965975758737600/posts/default/4521638830851834071'/><author><name>Jason</name><uri>http://www.blogger.com/profile/02753358958139234827</uri><email>noreply@blogger.com</email></author></entry></feed>