12-22-12 - Data Considered Harmful

I believe that the modern trend of doing some very superficial data analysis to prove a point or support an argument is extremely harmful. It gives arguments a false impression of scientific basis that is in fact spurious.

I've been thinking about this for a while, but this Washington Post blog about the correlation of video games and gun violence recently popped into my feed, so I'll use it as an example.

The Washington Post blog leads you to believe that the data shows an unequivocal lack of correlation between video games and gun violence. That's nonsense. It only takes one glance at the chart to see that the data is completely dominated by other factors, probably most strongly the gun ownership rate. You can't possibly find the effect of a minor contributing factor without normalizing for the other factors, which most of these "analyses" fail to do, which makes them totally bogus. Furthermore, as usual, you would need a much larger sample size to have any confidence in the data, and you'd have to question how the data was selected. The quantity being charted is wrong too : it shouldn't be video game spending per capita, it should be video games played per capita (especially with China on there), and it shouldn't be gun-related murders, it should be all murders (because the fraction of murders that is gun-related varies strongly with gun control laws, while the overall murder rate varies more directly with a country's level of economic and social development).

(Using data and charts and graphs has been a very popular way to respond to the recent shootings. Every single one that I've seen is complete nonsense. People just want to make a point that they've previously decided on, so they trot out some data to "prove it" or make it "non-partisan", as if their bogus charts somehow make it "factual". It's pathetic. Here's a good example of using tons of data to show absolutely nothing. If you want to make an editorial point, just write your opinion; don't trot out bogus charts to "back it up".)

It's extremely popular these days to "prove" that some intuition is wrong by finding some data that shows a reverse correlation (blame Freakonomics, among other things). You get lots of this in the smarmy TED talks - "you may expect that stabbing yourself in the eye with a pencil is harmful, but in fact these studies show that stabbing yourself in the eye is correlated with longer life expectancy!" (and then everyone claps). The problem with all this cute semi-intellectualism is that it's very often just wrong.

Aside from just poor data analysis, one of the major flaws with this kind of reasoning is the assumption that you are measuring all the inputs and all the outputs.

An obvious case is education, where you get all kinds of bogus studies that show such-and-such program "improves learning". Well, how did you actually measure learning? Obviously something like cutting music programs out of schools "improves learning" if you measure "learning" in a myopic way that doesn't include the benefits of music. And of course you must also ask what else was changed between the measured kids and the control (selection bias, novelty effect, etc; essentially all the studies on charter schools are total nonsense since any selection of students and new environment will produce a short term improvement).

I believe that choosing the wrong inputs and outputs is even worse than the poor data analysis, because it can be so hidden. Quite often there are huge (bogus) logical leaps where the article measures some narrow output and then proceeds to talk about it as if it were just "better". Even when the data analysis is correct, you did not show it was better; you showed that one specific narrow output that you chose to measure improved, and you have to be very careful not to start using more general words.

(one of the great classic "wrong output" mistakes is measuring GDP to decide if a government financial policy was successful; this is one of those cases where economists have in fact done very sophisticated data analysis, but with a misleadingly narrow output)

Being repetitive : it's okay if you are actually very specific and careful not to extrapolate. eg. if you say "lowering interest rates increased GDP" and you are careful not to ever imply that "increased GDP" necessarily means "was good for the economy" (or that "was good for the economy" meant "was good for the population"); the problem is that people are sloppy, in their analysis and their implications and their reading, so it becomes "lowering interest rates improved the well-being of the population" and that becomes accepted wisdom.

Of course you can transparently see the vapidity of most of these analyses because they don't propagate error bars. If they actually took the errors of the measurement, corrected for the error of the sample size, propagated it through the correlation calculation and gave a confidence at the end, you would see things like "we measured a 5% improvement (+- 50%)" , which is no data at all.
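To make the error-bar point concrete, here's a minimal sketch (my own illustration, not from any of the articles in question) of the very first step they skip: a measured mean is meaningless without its standard error, and that error only shrinks as 1/sqrt(n), so tiny samples give huge error bars.

```cpp
#include <cmath>
#include <vector>

// An estimate is a value *and* an uncertainty; reporting one without the
// other is how you get "5% improvement" claims that are really 5% +- 50%.
struct Estimate { double value; double error; };

Estimate mean_with_error(const std::vector<double> & samples)
{
    double n = (double) samples.size();
    double sum = 0;
    for (double s : samples) sum += s;
    double mean = sum / n;
    double var = 0;
    for (double s : samples) var += (s - mean) * (s - mean);
    var /= (n - 1);                           // sample variance
    double err = std::sqrt(var / n);          // standard error of the mean
    return { mean, err };
}
```

Run that on a handful of noisy samples and the error term is often comparable to the value itself, which is exactly the "no data at all" situation.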

I saw Brian Cox on QI recently, and there was some point about the US government testing whether heavy doses of LSD helped schizophrenics or not. Everyone was aghast, but Brian popped up with "actually I support data-based medicine; if it had been shown to help then I would be for that therapy". Now obviously this was a jokey context so I'll cut Cox some slack, but it does in fact reflect a very commonly held belief these days (that we should trust the data more than our common sense telling us it's a terrible idea). And it's just obviously absurd on the face of it. If the study had shown it to help, then obviously something was wrong with the study. Medical studies are almost always so flawed that it's hard to believe any of them. Selection bias is huge, novelty and placebo effects are huge; but even if you really have controlled for all that, the other big failure is that they are too short term, and the "output" is much too narrow. You may have improved the thing you were measuring for, but done lots of other harm that you didn't see. Perhaps they did measure a decrease in certain schizophrenia symptoms (but psychotic lapses and suicides were way up; oops, that wasn't part of the output we measured).

Exercise/dieting and child-rearing are two major topics where you are just bombarded with nonsense pseudo-science "correlations" all the time.

Of course political/economic charts are useless and misleading. A classic falsehood that gets trotted out regularly is the chart showing "the economy does better under Democrats" ; for one thing the sample size is just so small that it could be totally random ; for another the economy is more affected by the previous president than the current one ; and in almost every case huge external factors are massively more important (what's the Fed rate, did Al Gore recently invent the internet, are we in a war or an oil crisis, etc.). People love to show that chart but it is *pure garbage* , it contains zero information. Similarly the charts about how the economy does right after a tax raise or cut; again there are so many confounding factors and the sample size is so tiny, but more importantly tax raises tend to happen when government receipts are low (eg. economic growth is already slowing), while tax cuts tend to happen in flush times, so saying "tax cuts lead to growth" is really saying "growth leads to growth".

What I'm trying to get at in this post is not the ridiculous lack of science in all these studies and "facts", but the way that the popular press (and the semi-intellectual world of blogs and talks and magazines) use charts and graphs to present "data" to legitimize the bogus point.

I believe that any time you see a chart or graph in the popular press you should look away.

I know they are seductive and fun, and they give you a vapid conversation piece ("did you know that christmas lights are correlated with impotence?") but they in fact poison the brain with falsehoods.

Finally, back to the issue of video games and violence. I believe it is obvious on the face of it that video games contribute to violence. Of course they do. Especially at a young age, if a kid grows up shooting virtual men in the face it has to have some effect (especially on people who are already mentally unstable). Is it a big factor? Probably not; by far the biggest factor in violence is poverty, then government instability and human rights, then the gun ownership rate, the ease of gun purchasing, etc. I suspect that the general gun glorification in America is a much bigger effect, as is growing up in a home where your parents had guns, going to the shooting range as a child, rappers glorifying violence, movies and TV. Somewhere after all that, I'm sure video games contribute. The only thing we can actually say scientifically is that the effect is very small and almost impossible to measure due to the presence of much larger and highly uncertain factors.

(of course we should also recognize that these kind of crazy school shooting events are completely different than ordinary violence, and statistically are a drop in the bucket. I suspect the rare mass-murder psycho killer things are more related to a country's mental health system than anything else. Pulling out the total murder numbers as a response to these rare psychotic events is another example of using the wrong data and then glossing over the illogical jump.)

I think in almost all cases if you don't play pretend with data and just go and sit quietly and think about the problem and tap into your own brain, you will come to better conclusions.


12-21-12 - File Name Namespaces on Windows

A little bit fast and loose but trying to summarize some insanity from a practical point of view.

Windows has various "namespaces" or classes of file names :

1. DOS Names :

"c:\blah" and such.

Max path of 260 including drive and trailing null. Different cases refer to the same file, *however* different unicode encodings of the same character do *NOT* refer to the same file (eg. things like "accented e" and "e + accent previous char" are different files). See previous posts about code pages and general unicode disaster on Windows.
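A quick illustration of the unicode point (my own example; the file names are made up): the precomposed "é" (U+00E9) and "e" plus combining acute (U+0301) render identically, but as raw character sequences they differ, and the file system compares names by raw characters, so they would name two different files.

```cpp
#include <cwchar>

// Two spellings of "café.txt" that look the same on screen but are
// different raw wchar_t sequences - and therefore different files :
const wchar_t * precomposed = L"caf\u00E9.txt";   // U+00E9
const wchar_t * decomposed  = L"cafe\u0301.txt";  // "e" + U+0301

// file-name comparison is by raw characters (plus case folding, which
// doesn't help here) :
inline bool same_raw_name(const wchar_t * a, const wchar_t * b)
{
    return wcscmp(a,b) == 0;
}
```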

I'm going to ignore the 8.3 legacy junk, though it still has some funny lingering effects on even "long" DOS names. (for example, the longest directory path allowed is 248 characters rather than 260, because room is reserved for an 8.3 file name after the longest path).

2. Win32 Names :

This includes all DOS names plus all network paths like "\\server\blah".

The Win32 APIs can also take the "\\?\" names, which are sort of a way of peeking into the lower-level NT names.

Many people incorrectly think the big difference with the "\\?\" names is that the length can be much longer (32768 instead of 260), but IMO the bigger difference is that the name that follows is treated as raw characters. That is, you can have "/" or "." or ".." or whatever in the name - they do not get any processing. Very scary. I've seen lots of code that blindly assumes it can add or remove "\\?\" with impunity - that is not true!
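To see why blindly adding the prefix is unsafe, here's a hypothetical helper (not any Windows API; the name and policy are mine) that checks whether a path uses anything that relies on Win32 name processing - forward slashes or "." / ".." components - which would silently change meaning under "\\?\":

```cpp
#include <string>

// A "\\?\" prefix passes characters through raw, so it is only safe to add
// when the path contains no forward slashes and no "." or ".." components.
bool safe_to_add_raw_prefix(const std::wstring & path)
{
    if ( path.find(L'/') != std::wstring::npos )
        return false; // "/" is not a separator in the raw namespace

    size_t pos = 0;
    while ( pos <= path.size() )
    {
        size_t next = path.find(L'\\', pos);
        if ( next == std::wstring::npos ) next = path.size();
        std::wstring comp = path.substr(pos, next - pos);
        if ( comp == L"." || comp == L".." )
            return false; // relative components are not collapsed under "\\?\"
        pos = next + 1;
    }
    return true;
}
```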

"\\?\c:\" is a local path

"\\?\UNC\server\blah" is a network name like "\\server\blah"

Assuming you have your drives shared, you can get to yourself as "\\localhost\c$\"

I think the "\\?\" namespace is totally insane and using it is a Very Bad Idea. The vast majority of apps will do the wrong thing when given it, and many will crash.

3. NT names :

Win32 is built on "ntdll" which internally uses another style of name. They start with "\" and then refer to the drivers used to access them, like :

\Device\HarddiskVolume1\devel\projects\oodle\z.bat
In the NT namespace network shares are named :

Pre-Vista :

\Device\LanmanRedirector\<some per-user stuff>\server\share

Vista+ : the Lanman way and also :

\Device\Mup\server\share
And the NT namespace has a symbolic link to the entire Win32 namespace under "\Global??\" , so

\Global??\c:\blah

is also a valid NT name, (and "\??\" is sometimes valid as a short version of "\Global??\").

What fun.

12-21-12 - File Handle to File Name on Windows

There are a lot of posts about this floating around, most not quite right. Trying to sort it out once and for all. Note that in all cases I want to resolve back to a "final" name (that is, remove symlinks, substs, net uses, etc.) I do not believe that the methods I present here guarantee a "canonical" name, eg. a name that's always the same if it refers to the same file - that would be a nice extra step to have.

This post will be code-heavy and the code will be ugly. This code is all sloppy about buffer sizes and string over-runs and such, so DO NOT copy-paste it into production unless you want security holes. (a particular nasty point to be wary of is that many of the APIs differ in whether they take a buffer size in bytes or chars, which with unicode is different)

We're gonna use these helpers to call into windows dlls :

template <typename t_func_type>
t_func_type GetWindowsImport( t_func_type * pFunc , const char * funcName, const char * libName , bool dothrow )
{
    if ( *pFunc == 0 )
    {
        HMODULE m = GetModuleHandle(libName);
        if ( m == 0 ) m = LoadLibrary(libName); // adds extension for you
        ASSERT_RELEASE( m != 0 );
        t_func_type f = (t_func_type) GetProcAddress( m, funcName );
        if ( f == 0 && dothrow )
            throw funcName;
        *pFunc = f;
    }
    return (*pFunc); 
}

// GET_IMPORT can return NULL
#define GET_IMPORT(lib,name) (GetWindowsImport(&STRING_JOIN(fp_,name),STRINGIZE(name),lib,false))

// CALL_IMPORT throws if not found
#define CALL_IMPORT(lib,name) (*GetWindowsImport(&STRING_JOIN(fp_,name),STRINGIZE(name),lib,true))
#define CALL_KERNEL32(name) CALL_IMPORT("kernel32",name)
#define CALL_NT(name) CALL_IMPORT("ntdll",name)

I also make use of the cblib strlen, strcpy, etc. on wchars. Their implementation is obvious.

Also, for reference, to open a file handle just to read its attributes (to map its name) you use something like this (the access/share flags are the standard MSDN pattern; FILE_FLAG_BACKUP_SEMANTICS is what lets you open directories) :

    HANDLE f = CreateFile(from,
        FILE_READ_ATTRIBUTES,
        FILE_SHARE_READ|FILE_SHARE_WRITE|FILE_SHARE_DELETE,
        NULL,
        OPEN_EXISTING,
        FILE_FLAG_BACKUP_SEMANTICS, // needed to open directories
        NULL);

(also works on directories).

Okay now : How to get a final path name from a file handle :

1. On Vista+ , just use GetFinalPathNameByHandle.

GetFinalPathNameByHandle gives you back a "\\?\" prefixed path, or "\\?\UNC\" for network shares.

2. Pre-Vista, lots of people recommend mem-mapping the file and then using GetMappedFileName.

This is a bad suggestion. It doesn't work on directories. It requires that you actually have the file open for read, which is of course impossible in some scenarios. It's just generally a non-robust way to get a file name from a handle.

For the record, here is the code from MSDN to get a file name from handle using GetMappedFileName. Note that GetMappedFileName gives you back an NT-namespace name, and I have factored out the bit to convert that to Win32 into MapNtDriveName, which we'll come back to later.

BOOL GetFileNameFromHandleW_Map(HANDLE hFile,wchar_t * pszFilename,int pszFilenameSize)
{
    BOOL bSuccess = FALSE;
    HANDLE hFileMap;

    pszFilename[0] = 0;

    // Get the file size.
    DWORD dwFileSizeHi = 0;
    DWORD dwFileSizeLo = GetFileSize(hFile, &dwFileSizeHi); 

    if( dwFileSizeLo == 0 && dwFileSizeHi == 0 )
    {
        lprintf(("Cannot map a file with a length of zero.\n"));
        return FALSE;
    }

    // Create a file mapping object.
    hFileMap = CreateFileMapping(hFile, NULL, PAGE_READONLY, 0, 1, NULL);

    if (hFileMap) 
    {
        // Map one byte of the file to get the file name.
        void* pMem = MapViewOfFile(hFileMap, FILE_MAP_READ, 0, 0, 1);

        if (pMem) 
        {
            wchar_t temp[2048];

            if (GetMappedFileNameW(GetCurrentProcess(), pMem, temp, ARRAY_SIZE(temp)))
            {
                //temp is an NT-space name :
                //temp = "\Device\HarddiskVolume4\devel\projects\oodle\z.bat"

                // split off the drive part and map it to a Win32 drive :
                wchar_t * pPath = wcschr( wcschr(temp+1,L'\\') + 1, L'\\' );
                if ( pPath )
                {
                    *pPath++ = 0;
                    if ( MapNtDriveName(temp,pszFilename) )
                    {
                        strcat(pszFilename,L"\\");
                        strcat(pszFilename,pPath);
                        bSuccess = TRUE;
                    }
                }
            }
            UnmapViewOfFile(pMem);
        }
        CloseHandle(hFileMap);
    }
    return bSuccess;
}

3. There's a more direct way to get the name from file handle : NtQueryObject.

NtQueryObject gives you the name of any handle. If it's a file handle, you get the file name. This name is an NT namespace name, so you have to map it down of course.

The core code is :

typedef enum _OBJECT_INFORMATION_CLASS {
    ObjectBasicInformation, ObjectNameInformation, ObjectTypeInformation, ObjectAllInformation, ObjectDataInformation
} OBJECT_INFORMATION_CLASS;

typedef struct _UNICODE_STRING {
  USHORT Length;
  USHORT MaximumLength;
  PWSTR  Buffer;
} UNICODE_STRING;

typedef struct _OBJECT_NAME_INFORMATION {
    UNICODE_STRING Name;
    WCHAR NameBuffer[1];
} OBJECT_NAME_INFORMATION;

typedef NTSTATUS (NTAPI * t_NtQueryObject)(
    IN HANDLE ObjectHandle, IN OBJECT_INFORMATION_CLASS ObjectInformationClass, OUT PVOID ObjectInformation, IN ULONG Length, OUT PULONG ResultLength );
static t_NtQueryObject fp_NtQueryObject = 0;

    char infobuf[4096];
    ULONG ResultLength = 0;

    CALL_NT(NtQueryObject)( handle, ObjectNameInformation, infobuf, sizeof(infobuf), &ResultLength );

    OBJECT_NAME_INFORMATION * pinfo = (OBJECT_NAME_INFORMATION *) infobuf;
    wchar_t * ps = pinfo->NameBuffer;
    // pinfo->Name.Length is in BYTES , not wchars
    ps[ pinfo->Name.Length / 2 ] = 0;

    lprintf("OBJECT_NAME_INFORMATION: (%S)\n",ps);

which will give you a name like :

    OBJECT_NAME_INFORMATION: (\Device\HarddiskVolume1\devel\projects\oodle\examples\oodle_future.h)

and then you just have to pull off the drive part and call MapNtDriveName (mentioned previously but not yet detailed).
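The "Length is in bytes, not wchars" pitfall above is worth isolating. This is a sketch with a stand-in struct shaped like UNICODE_STRING (the struct and helper names are mine; real code would use the NT struct directly):

```cpp
#include <cstring>
#include <cwchar>

// Stand-in for the UNICODE_STRING shape : Length counts BYTES and the
// buffer is not null-terminated. (On Windows wchar_t is 2 bytes; here we
// just use the platform wchar_t and keep the byte convention.)
struct NtString {
    unsigned short LengthBytes; // plays the role of UNICODE_STRING.Length
    const wchar_t * Buffer;
};

// copy out a null-terminated string; note the /sizeof(wchar_t), not /1 :
void nt_string_to_cstr(const NtString & s, wchar_t * out)
{
    size_t nchars = s.LengthBytes / sizeof(wchar_t);
    memcpy(out, s.Buffer, nchars * sizeof(wchar_t));
    out[nchars] = 0;
}
```

Dividing by 1 instead of sizeof(wchar_t) reads past the string; forgetting the terminator gives you garbage tails - both are common bugs with these APIs.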

Note that there's another call that looks appealing :

    IO_STATUS_BLOCK iosb;
    NtQueryInformationFile( handle, &iosb, infobuf, sizeof(infobuf), FileNameInformation );
but NtQueryInformationFile seems to always give you just the file name without the drive. In fact it seems possible to use NtQueryInformationFile and NtQueryObject to separate the drive part and path part.

That is, you get something like :

t: is substed to c:\trans

LogDosDrives prints :

T: : \??\C:\trans

we ask about :

fmName : t:\prefs.js

we get :

NtQueryInformationFile: "\trans\prefs.js"
NtQueryObject: "\Device\HarddiskVolume4\trans\prefs.js"

If there was a way to get the drive letter, then you could just use NtQueryInformationFile , but so far as I know there is no simple way, so we have to go through all this mess.

On network shares, it's similar but a little different :

y: is net used to \\charlesbpc\C$

LogDosDrives prints :

Y: : \Device\LanmanRedirector\;Y:0000000000034569\charlesbpc\C$

we ask about :

fmName : y:\xfer\path.txt

we get :

NtQueryInformationFile: "\charlesbpc\C$\xfer\path.txt"
NtQueryObject: "\Device\Mup\charlesbpc\C$\xfer\path.txt"

so in that case you could just prepend a "\" to the NtQueryInformationFile name, but again I'm not sure how to know that what you got was a network share and not just a directory, so we'll go through all this mess to figure it out.

4. MapNtDriveName is needed to map an NT-namespace drive name to a Win32/DOS-namespace name.

I've found two different ways of doing this, and they seem to produce the same results in all the tests I've run, so it's unclear if one is better than the other.

4.A. MapNtDriveName by QueryDosDevice

QueryDosDevice gives you the NT name of a dos drive. This is the opposite of what we want, so we have to reverse the mapping. The way is to use GetLogicalDriveStrings which gives you all the dos drive letters, then you can look them up to get all the NT names, and thus create the reverse mapping.

Here's LogDosDrives :

void LogDosDrives()
{
    #define BUFSIZE 2048
    // Translate path with device name to drive letters.
    wchar_t szTemp[BUFSIZE];
    szTemp[0] = '\0';

    // GetLogicalDriveStrings
    //  gives you the DOS drives on the system
    //  including substs and network drives
    if (GetLogicalDriveStringsW(BUFSIZE-1, szTemp)) 
    {
      wchar_t szName[MAX_PATH];
      wchar_t szDrive[3] = L" :";

      wchar_t * p = szTemp;

      do {
        // Copy the drive letter to the template string
        *szDrive = *p;

        // Look up each device name
        if (QueryDosDeviceW(szDrive, szName, MAX_PATH))
            lprintf("%S : %S\n",szDrive,szName);

        // Go to the next NULL character.
        while (*p++);
      } while ( *p); // double-null is end of drives list
    }
}



LogDosDrives prints stuff like :

A: : \Device\Floppy0
C: : \Device\HarddiskVolume1
D: : \Device\HarddiskVolume2
E: : \Device\CdRom0
H: : \Device\CdRom1
I: : \Device\CdRom2
M: : \??\D:\misc
R: : \??\D:\ramdisk
S: : \??\D:\ramdisk
T: : \??\D:\trans
V: : \??\C:
W: : \Device\LanmanRedirector\;W:0000000000024326\radnet\raddevel
Y: : \Device\LanmanRedirector\;Y:0000000000024326\radnet\radmedia
Z: : \Device\LanmanRedirector\;Z:0000000000024326\charlesb-pc\c


Recall from the last post that "\??\" is the NT-namespace way of mapping back to the win32 namespace. Those are substed drives. The "net use" drives get the "Lanman" prefix.

MapNtDriveName using QueryDosDevice is :

bool MapNtDriveName_QueryDosDevice(const wchar_t * from,wchar_t * to)
{
    #define BUFSIZE 2048
    wchar_t allDosDrives[BUFSIZE];
    allDosDrives[0] = '\0';

    // GetLogicalDriveStrings
    //  gives you the DOS drives on the system
    //  including substs and network drives
    if (GetLogicalDriveStringsW(BUFSIZE-1, allDosDrives)) 
    {
        wchar_t * pDosDrives = allDosDrives;

        do {
            // Copy the drive letter to the template string
            wchar_t dosDrive[3] = L" :";
            *dosDrive = *pDosDrives;

            // Look up each device name
            wchar_t ntDriveName[BUFSIZE];
            if ( QueryDosDeviceW(dosDrive, ntDriveName, ARRAY_SIZE(ntDriveName)) )
            {
                size_t ntDriveNameLen = strlen(ntDriveName);

                // if "from" starts with this NT drive name,
                //  replace that prefix with the dos drive letter :
                if ( strnicmp(from, ntDriveName, ntDriveNameLen) == 0
                         && ( from[ntDriveNameLen] == '\\' || from[ntDriveNameLen] == 0 ) )
                {
                    strcpy(to, dosDrive);
                    strcat(to, from + ntDriveNameLen);
                    return true;
                }
            }

            // Go to the next NULL character.
            while (*pDosDrives++);

        } while ( *pDosDrives); // double-null is end of drives list
    }
    return false;
}

4.B. MapNtDriveName by IOControl :

There's a more direct way using DeviceIoControl. You just send a message to the "MountPointManager" which is the guy who controls these mappings. (this is from "Mehrdad" on Stackoverflow) :

struct MOUNTMGR_TARGET_NAME { USHORT DeviceNameLength; WCHAR DeviceName[1]; };
struct MOUNTMGR_VOLUME_PATHS { ULONG MultiSzLength; WCHAR MultiSz[1]; };

// from mountmgr.h :
#define MOUNTMGRCONTROLTYPE ((ULONG)'m')
#define IOCTL_MOUNTMGR_QUERY_DOS_VOLUME_PATH \
    CTL_CODE(MOUNTMGRCONTROLTYPE, 12, METHOD_BUFFERED, FILE_ANY_ACCESS)

union ANY_BUFFER {
    MOUNTMGR_TARGET_NAME TargetName;
    MOUNTMGR_VOLUME_PATHS TargetPaths;
    char Buffer[4096];
};

bool MapNtDriveName_IoControl(const wchar_t * from,wchar_t * to)
{
    ANY_BUFFER nameMnt;
    int fromLen = strlen(from);
    // DeviceNameLength is in *bytes*
    nameMnt.TargetName.DeviceNameLength = (USHORT) ( 2 * fromLen );
    strcpy(nameMnt.TargetName.DeviceName, from );
    HANDLE hMountPointMgr = CreateFile( ("\\\\.\\MountPointManager"),
        0, FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
        NULL, OPEN_EXISTING, 0, NULL);
    ASSERT_RELEASE( hMountPointMgr != 0 );
    DWORD bytesReturned;
    BOOL success = DeviceIoControl(hMountPointMgr,
        IOCTL_MOUNTMGR_QUERY_DOS_VOLUME_PATH, &nameMnt,
        sizeof(nameMnt), &nameMnt, sizeof(nameMnt),
        &bytesReturned, NULL);
    CloseHandle(hMountPointMgr);

    if ( success && nameMnt.TargetPaths.MultiSzLength > 0 )
    {
        strcpy(to,nameMnt.TargetPaths.MultiSz);
        return true;    
    }
    return false;
}

5. Fix MapNtDriveName for network names.

I said that MapNtDriveName_IoControl and MapNtDriveName_QueryDosDevice produced the same results and both worked. Well, that's only true for local drives. For network drives they both fail, but in different ways. MapNtDriveName_QueryDosDevice just won't find network drives, while MapNtDriveName_IoControl will hang for a long time and eventually time out with a failure.

We can fix it easily though because the NT path for a network share contains the valid win32 path as a suffix, so all we have to do is grab that suffix.

bool MapNtDriveName(const wchar_t * from,wchar_t * to)
{
    // hard-code network drives :
    //  the Win32 name is a suffix of the NT name, so the "drive" part
    //  just maps to "\" ; prepending that to the rest of the path
    //  gives the "\\server\share" form :
    if ( strisame(from,L"\\Device\\Mup") || strisame(from,L"\\Device\\LanmanRedirector") )
    {
        strcpy(to,L"\\");
        return true;
    }

    // either one works for local drives :
    //return MapNtDriveName_IoControl(from,to);
    return MapNtDriveName_QueryDosDevice(from,to);
}

This just takes the NT-namespace network paths, like :

\Device\Mup\charlesbpc\C$\xfer\path.txt -> \\charlesbpc\C$\xfer\path.txt
And we're done.

12-21-12 - Coroutine-centric Architecture

I've been talking about this for a while but maybe haven't written it all clearly in one place. So here goes. My proposal for a coroutine-centric architecture (for games).

1. Run one thread locked to each core.

(NOTE : this is only appropriate on something like a game console where you are in control of all the threads! Do not do this on an OS like Windows where other apps may also be locking to cores, and you have the thread affinity scheduler problems, and so on).

The one-thread-per-core set of threads is your thread pool. All code runs as "tasks" (or jobs or whatever) on the thread pool.

The threads never actually do ANY OS Waits. They never switch. They're not really threads, you're not using any of the OS threading any more. (I suppose you still are using the OS to handle signals and such, and there are probably some OS threads that are running which will grab some of your time, and you want that; but you are not using the OS threading in your code).

2. All functions are coroutines. A function with no yields in it is just a very simple coroutine. There's no special syntax to be a coroutine or call a coroutine.

All functions can take futures or return futures. (a future is just a value that's not yet ready). Whether you want this to be totally implicit or not is up to your taste about how much of the operations behind the scenes are visible in the code.

For example if you have a function like :

int func(int x);

and you call it with a future<int> :

future<int> y;
future<int> z = func(y);

it is promoted automatically to :

future<int> func( future<int> x )
{
    yield x;
    return func( x.value );
}

When you call a function, it is not a "branch", it's just a normal function call. If that function yields, it yields the whole current coroutine. That is, it's just like threading and waits, but rather with coroutines and yields.

To branch I would use a new keyword, like "start" :

future<int> some_async_func(int x);

int current_func(int y)
{
    // execution will step directly into this function;
    // when it yields, current_func will yield
    future<int> f1 = some_async_func(y);

    // with "start" a new coroutine is made and enqueued to the thread pool
    // my coroutine immediately continues to the f1.wait
    future<int> f2 = start some_async_func(y);

    return f1.wait();
}

"start" should really be an abbreviation for a two-phase launch, which allows a lot more flexibility. That is, "start" should be a shorthand for something like :

start some_async_func(y);

// should be shorthand for :

coro * c = new coro( some_async_func(y) );
start( c );

because that allows batch-starting, and things like setting dependencies after creating the coro, which I have found to be very useful in practice. eg :

coro * c[32];

for(i in 32)
{
    c[i] = new coro( );
    if ( i > 0 )
        c[i-1]->depends( c[i] );
}

start_all( c, 32 );

Batch starting is one of those things that people often leave out. Starting tasks one by one is just like waiting for them one by one (instead of using a wait_all), it causes bad thread-thrashing (waking up and going back to sleep over and over, or switching back and forth).

3. Full stack-saving is crucial.

For this to be efficient you need a very small minimum stack size (4k is probably good) and you need stack-extension on demand.

You may have lots of pending coroutines sitting around and you don't want them gobbling all your memory with 64k stacks.

Full stack saving means you can do full variable capture for free, even in a language like C where tracking references is hard.

4. You stop using the OS mutex, semaphore, event, etc. and instead use coroutine variants.

Instead of a thread owning a lock, a coroutine owns a lock. When you block on a lock it's a yield of the coroutine instead of a full OS wait.

Getting access to a mutex or semaphore is an event that can trigger coroutines being run or resumed. eg. it's a future just like the return from an async procedural call. So you can do things like :

future<int> y = some_async_func();

yield( y , my_mutex.when_lock() );

which yields your coroutine until the joint condition is met that the async func is done AND you can get the lock on "my_mutex".

Joint yields are very important because they prevent unnecessary coroutine wakeup. Coroutine thrashing is not nearly as bad as thread thrashing (avoiding thread thrashing is one of the big advantages of coroutine-centric architecture, in fact perhaps the biggest), but it's still worth avoiding.

You must have coroutine versions of all the ops that have delays (file IO, networking, GPU, etc) so that you can yield on them instead of doing thread-waits.

5. You must have some kind of GC.

Because coroutines will constantly be capturing values, you must ensure their lifetime is >= the life of the coroutine. GC is the only reasonable way to do this.

I would also go ahead and put an RW-lock in every object as well since that will be necessary.

6. Dependencies and side effects should be expressed through args and return values.

You really need to get away from funcs like

void DoSomeStuff(void);

that have various un-knowable inputs and outputs. All inputs & outputs need to be values so that they can be used to create dependency chains.

When that's not directly possible, you must use a convention to express it. eg. for file manipulation I recommend using a string containing the file name to express the side effects that go through the file system (eg. for Rename, Delete, Copy, etc.).
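A minimal sketch of that convention (names and structure are mine, purely illustrative): file-touching tasks declare the file name as a value, and the scheduler serializes tasks that share a name, so "Rename after Write" becomes an ordinary dependency.

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

// Tasks that have a side effect through the file system declare the file
// name as a key; the scheduler runs same-key tasks in submission order.
// (Nothing here touches a real file system - the names are just keys.)
struct FileTaskQueue {
    std::map<std::string, std::vector<std::function<void()>>> byName;

    void add(const std::string & name, std::function<void()> task)
    {
        byName[name].push_back(task); // append preserves per-file order
    }

    void run_all()
    {
        for (auto & kv : byName)
            for (auto & t : kv.second)
                t(); // ops on the same file run in the order added
    }
};
```

A real scheduler would run unrelated keys concurrently; the point is only that the string is the dependency handle.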

7. Note that coroutines do not fundamentally alter the difficulties of threading.

You still have races, deadlocks, etc. Basic async ops are much easier to write with coroutines, but they are no panacea and do not try to be anything other than a nicer way of writing threading. (eg. they are not transactional memory or any other auto-magic).

to be continued (perhaps) ....

Add 3/15/13 : 8. No static size anything. No resources you can run out of. This is another "best practice" that goes with modern thread design that I forgot to list.

Don't use fixed-size queues for thread communication; they seem like an optimization or simplification at first, but if you can ever hit the limit (and you will) they cause big problems. Don't assume a fixed number of workers or a maximum number of async ops in flight, this can cause deadlocks and be a big problem.

The thing is that a "coroutine centric" program is no longer so much like a normal imperative C program. It's moving towards a functional program where the path of execution is all nonlinear. You're setting a big graph to evaluate, and then you just need to be able to hit "go" and wait for the graph to close. If you run into some limit at some point during the graph evaluation, it's a big mess figuring out how to deal with that.

Of course the OS can impose limits on you (eg. running out of memory) and that is a hassle you have to deal with.

12-21-12 - Coroutines From Lambdas

Being pedantic while I'm on the topic. We've covered this before.

Any language with lambdas (that can be fired when an async completes) can simulate coroutines.

Assume we have some async function call :

future<int> AsyncFunc( int x );

which sends the integer off over the net (or whatever) and eventually gets a result back. Assume that future<> has an "AndThen" which schedules a function to run when it's done.

Then you can write a sequence of operations like :

future<int> MySequenceOfOps( int x1 )
{
    future<int> f1 = AsyncFunc(x1);

    return f1.AndThen( [](int x2){

        x2 *= 2;

        future<int> f2 = AsyncFunc(x2);

        return f2.AndThen( [](int x3){

            x3 --;

            return x3;

        } );
    } );
}


with a little munging we can make it look more like a standard coroutine :

#define YIELD(future,args)  return future.AndThen( [](args){

future<int> MySequenceOfOps( int x1 )
{
    future<int> f1 = AsyncFunc(x1);

    YIELD(f1,int x2)

    x2 *= 2;

    future<int> f2 = AsyncFunc(x2);

    YIELD(f2,int x3)

    x3 --;

    return x3;

    } );
    } );
}


the only really ugly bit is that you have to put a bunch of scope-closers at the end to match the number of yields.

This is really what any coroutine is doing under the hood. When you hit a "yield", what it does is take the remainder of the function and package that up as a functor to get called after the async op that you're yielding on is done.
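Here's a tiny, deliberately synchronous stand-in (my sketch, not the post's implementation) that makes the mechanics visible: the "future" is always already complete, so AndThen just applies the continuation immediately and flattens nested futures. A real future would store the functor and fire it when the async op lands.

```cpp
// Forward declarations so AndThen can flatten future<future<T>> :
template <typename T> struct future;
template <typename U> future<U> make_future(U v);
template <typename U> future<U> make_future(future<U> v);

template <typename T>
struct future {
    T value; // always "ready" in this toy version

    // AndThen applies f to the value; if f itself returns a future,
    // make_future flattens it so chains compose.
    template <typename F>
    auto AndThen(F f) { return make_future(f(value)); }
};

template <typename U> future<U> make_future(U v)         { return future<U>{ v }; }
template <typename U> future<U> make_future(future<U> v) { return v; }

// pretend this went over the net :
future<int> AsyncFunc(int x) { return { x + 10 }; }

future<int> MySequenceOfOps(int x1)
{
    future<int> f1 = AsyncFunc(x1);
    return f1.AndThen([](int x2){
        x2 *= 2;
        future<int> f2 = AsyncFunc(x2);
        return f2.AndThen([](int x3){
            x3--;
            return x3;
        });
    });
}
```

With x1 = 1 the chain computes ((1+10)*2+10)-1 = 31; the nesting of lambdas is exactly the "remainder of the function packaged as a functor" described above.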

Coroutines from lambdas have a few disadvantages, aside from the scope-closers annoyance. It's ugly to do anything but simple linear control flow. The above example is the very simple case of "imperative, yield, imperative, yield" , but in real code you want to have things like :

if ( bool )
{
    ...
}

while ( some condition )
{
    ...
}

which, while probably possible with lambda-coroutines, gets ugly.

An advantage of lambda-coroutines is that if you're in a language with variable-capturing lambdas, you get that capture in your coroutines as well.


12-18-12 - Async-Await ; Microsoft's Coroutines

As usual I'm way behind in knowing what's going on in the world. Lo and behold, MS have done a coroutine system very similar to mine, which they are now pushing as a fundamental technology of WinRT. Dear lord, help us all. (I guess this stuff has been in .NET since 2008 or so, but with WinRT it's now being pushed on C++/CX as well)

I'm just catching up on this, so I'm going to make some notes about things that took a minute to figure out. Correct me where I'm wrong.

For the most part I'll be talking in C# lingo, because this stuff comes from C# and is much more natural there. There are C++/CX versions of all this, but they're rather more ugly. Occasionally I'll dip into what it looks like in CX, which is where we start :

1. "hat" (eg. String^)

Hat is a pointer to a ref-counted object. The ^ means inc and dec ref in scope. In cbloom code String^ is StringPtr.

The main note : "hat" is a thread-safe ref count, *however* it implies no other thread safety. That is, the ref-counting and object destruction is thread safe / atomic , but derefs are not :

Thingy^ t = Get(); // thread safe ref increment here
t->var1 = t->var2; // non-thread safe var accesses!

There is no built-in mutex or anything like that for hat-objects.

2. "async" func keyword

Async is a new keyword that indicates a function might be a coroutine. It does not make the function into an asynchronous call. What it really is is a "structify" or "functor" keyword (plus a "switch") . Like a C++ lambda, the main thing the language does for you is package up all the local variables and function arguments and put them all in a struct. That is (playing rather loosely with the translation for brevity) :

async void MyFunc( int x )
{
    string y;
    stuff();
}

[ is transformed to : ]

struct MyFunc_functor
{
    int x;
    string y;

    void Do() { stuff(); }
};

void MyFunc( int x )
{
    // allocate functor object :
    MyFunc_functor * f = new MyFunc_functor();
    // copy in args :
    f->x = x;
    // run it :
    f->Do();
}
So obviously this functor that captures the function's state is the key to making this into an async coroutine.

It is *not* stack saving. However for simple usages it is the same. Obviously crucial to this is using a language like C# which has GC so all the references can be traced, and everything is on the heap (perhaps lazily). That is, in C++ you could have pointers and references that refer to things on the stack, so just packaging up the args like this doesn't work.

Note that in the above you didn't see any task creation or asynchronous func launching, because it's not. The "async" keyword does not make a function async, all it does is "functorify" it so that it *could* become async. (this is in contrast to C++11 where "async" is an imperative to "run this asynchronously").

3. No more threads.

WinRT is pushing very hard to remove manual control of threads from the developer. Instead you have an OS thread pool that can run your tasks.

Now, I actually am a fan of this model in a limited way. It's the model I've been advocating for games for a while. To be clear, what I think is good for games is : run 1 thread per core. All game code consists of tasks for the thread pool. There are no special purpose threads, any thread can run any type of task. All the threads are equal priority (there's only 1 per core so this is irrelevant as long as you don't add extra threads).

So, when a coroutine becomes async, it just enqueues to a thread pool.

There is this funny stuff about execution "context", because they couldn't make it actually clean (so that any task can run any thread in the pool); a "context" is a set of one or more threads with certain properties; the main one is the special UI context, which only gets one thread, which therefore can deadlock. This looks like a big mess to me, but as long as you aren't actually doing C# UI stuff you can ignore it.

See ConfigureAwait etc. There seems to be lots of control you might want that's intentionally missing. Things like how many real threads are in your thread pool; also things like "run this task on this particular thread" is forbidden (or even just "stay on the same thread"; you can only stay on the same context, which may be several threads).

4. "await" is a coroutine yield.

You can only use "await" inside an "async" func because it relies on the structification.

It's very much like the old C-coroutines using switch trick. await is given an Awaitable (an interface to an async op). At that point your struct is enqueued on the thread pool to run again when the Awaitable is ready.
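The old C switch trick being referred to looks roughly like this (a minimal protothreads-style sketch; all the names and macros here are made up for illustration, this is not what the C# compiler actually emits):

```cpp
// Classic C "switch trick" coroutine sketch (protothreads style).
// The struct is the "structification" : it holds the resume point and
// the locals that must survive across yields.
struct counter_coro {
    int state = 0;  // resume point : 0 = not started yet
    int i = 0;      // a "local" that survives across yields
};

// __LINE__ expands at the invocation site, so both uses in CO_YIELD
// get the same value : the state records where to jump back in.
#define CO_BEGIN(c)    switch ((c)->state) { case 0:
#define CO_YIELD(c, v) do { (c)->state = __LINE__; return (v); \
                            case __LINE__: ; } while (0)
#define CO_END(c)      } (c)->state = -1; return -1

// yields 10, 11, 12, then -1 forever
int count_from_ten(counter_coro * c) {
    CO_BEGIN(c);
    for (c->i = 10; c->i < 13; c->i++)
        CO_YIELD(c, c->i);
    CO_END(c);
}
```

Each call re-enters through the switch and jumps to the case label recorded at the last yield, which is why the "locals" have to live in the struct rather than on the stack.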

"await" is a yield, so you may return to your caller immediately at the point that you await.

Note that because of this, "async/await" functions cannot have return values (* except for Task which we'll see next).

Note that "await" is the point at which an "async" function actually becomes async. That is, when you call an async function, it is *not* initially launched to the thread pool, instead it initially runs synchronously on the calling thread. (this is part of a general effort in the WinRT design to make the async functions not actually async whenever possible, minimizing thread switches and heap allocations). It only actually becomes an APC when you await something.

(aside : there is a hacky "await Task.Yield()" mechanism which kicks off your synchronous invocation of a coroutine to the thread pool without anything explicit to await)

I really don't like the name "await" because it's not a "wait" , it's a "yield". The current thread does not stop running, but the current function might be enqueued to continue later. If it is enqueued, then the current thread returns out of the function and continues in the calling context.

One major flaw I see is that you can only await one async; there's no yield_all or yield_any. Because of this you see people writing atrocious code like :

await x;
await y;
await z;

Now they do provide a Task.WhenAll and Task.WhenAny , which create proxy tasks that complete when the desired condition is met, so it is possible to do it right (but much easier not to).
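In C++ terms, a WhenAll-style proxy can be sketched over std::future like this (a hypothetical helper, not a standard function; note it burns a pool thread waiting, which is exactly the kind of cost the real Task.WhenAll is designed to avoid):

```cpp
#include <future>
#include <memory>
#include <vector>

// Sketch of a WhenAll-style helper : returns a single proxy future
// that completes when all the input futures have completed, gathering
// their results in order. (Made-up helper; a when_all only exists in
// the C++ Concurrency TS, not the base standard library.)
std::future<std::vector<int>> when_all(std::vector<std::future<int>> fs) {
    // futures are move-only; park them in a shared_ptr so the lambda
    // can own them without a move-capture
    auto shared = std::make_shared<std::vector<std::future<int>>>(std::move(fs));
    return std::async(std::launch::async, [shared] {
        std::vector<int> results;
        for (auto & f : *shared)
            results.push_back(f.get()); // blocks this worker, not the caller
        return results;
    });
}
```

The caller then awaits one proxy instead of serializing "await x; await y; await z;".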

Of course "await" might not actually yield the coroutine; if the thing you are awaiting is already done, your coroutine may continue immediately. If you await a task that's not done (and also not already running), it might be run immediately on your thread. They intentionally don't want you to rely on any certain flow control, they leave it up to the "scheduler".

5. "Task" is a future.

The Task< > template is a future (or "promise" if you like) that provides a handle to get the result of a coroutine when it eventually completes. Because of the previously noted problem that "await" returns to the caller immediately, before your final return, you need a way to give the caller a handle to that result.

IAsyncOperation< > is the lower level C++/COM version of Task< > ; it's the same thing without the helper methods of Task.

IAsyncOperation.Status can be polled for completion. IAsyncOperation.GetResults can only be called after completed. IAsyncOperation.Completed is a callback function you can set to be run on completion. (*)

So far as I can tell there is no simple way to just Wait on an IAsyncOperation. (you can "await"). Obviously they are trying hard to prevent you from blocking threads in the pool. The method I've seen is to wrap it in a Task and then use Task.Wait()

(* = the .Completed member is a good example of a big annoyance : they play very fast-and-loose with documenting the thread safety semantics of the whole API. Now, I presume that for .Completed to make any sense it must be a thread-safe accessor, and it must be atomic with Status. Otherwise there would be a race where my completion handler would not get called. Presumably your completion handler is called once and only once. None of this is documented, and the same goes across the whole API. They just expect it all to magically work without you knowing how or why.)

(it seems that .NET used to have a Future< > as well, but that's gone since Task< > is just a future and having both is pointless (?))

So, in general if I read it as :

"async" = "coroutine"  (hacky C switch + functor encapsulation)

"await" = yield

"Task" = future

then it's pretty intuitive.

What's missing?

Well there are some places that are syntactically very ugly, but possible. (eg. working with IAsyncOperation/IAsyncInfo in general is super ugly; also the lack of simple "await x,y,z" is a mistake IMO).

There seems to be no way to easily automatically promote a synchronous function to async. That is, if you have something like :

int func1(int x) { return x+1; }

and you want to run it on a future of an int (Task< int >) , what you really want is just a simple syntax like :

future<int> x = some async func that returns an int

future<int> y = start func1( x );

which makes a coroutine that waits for its args to be ready and then runs the synchronous function. (maybe it's possible to write a helper template that does this?)
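Such a helper template is indeed writable with std::future; a sketch (the name "start" and the whole helper are made up, and it spends a worker thread waiting on the argument rather than truly chaining, so it's a shape demo, not the efficient version):

```cpp
#include <future>
#include <memory>

int func1(int x) { return x + 1; }

// Sketch of "start" : lift a synchronous function to run on a future
// of its argument. The returned future completes once the input future
// is ready and f has been applied to its value. (Hypothetical helper.)
template <typename F, typename T>
auto start(F f, std::future<T> x) -> std::future<decltype(f(x.get()))> {
    // future is move-only; park it in a shared_ptr for the lambda
    auto shared = std::make_shared<std::future<T>>(std::move(x));
    return std::async(std::launch::async, [f, shared] {
        return f(shared->get()); // wait for the arg, then run the sync func
    });
}
```

Usage is then exactly the desired "future<int> y = start(func1, x);" : y is a dependency-chained future, and nothing blocks at the call site.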

Now it's tempting to do something like :

future<int> x = some async func that returns an int

int y = func1( await x );

and you see that all the time in example code, but of course that is not the same thing at all and has many drawbacks (it waits immediately even though "y" might not be needed for a while, it doesn't allow you to create async dependency chains, it requires you are already running as a coroutine, etc).

The bigger issue is that it's not a real stackful coroutine system, which means it's not "composable", something I've written about before :
cbloom rants 06-21-12 - Two Alternative Oodles
cbloom rants 10-26-12 - Oodle Rewrite Thoughts

Specifically, a coroutine cannot call another function that does the await. This makes sense if you think of the "await" as being the hacky C-switch-#define thing, not a real language construct. The "async" on the func is the "switch {" and the "await" is a "case ". You cannot write utility functions that are usable in coroutines and may await.

To call functions that might await, they must be run as their own separate coroutine. When they await, they block their own coroutine, not your calling function. That is :

int helper( bool b , AsyncStream s )
{
    if ( b )
        return 0;
    int x = await s.Get<int>();
    return x + 10;
}

async Task<int> myfunc1()
{
    AsyncStream s = open it;
    int x = helper( true, s );
    return x;
}

The idea here is that "myfunc1" is a coroutine, it calls a function ("helper") which does a yield; that yields out of the parent coroutine (myfunc1). That does not work and is not allowed. It is what I would like to see in a good coroutine-centric language. Instead you have to do something like :

async Task<int> helper( bool b , AsyncStream s )
{
    if ( b )
        return 0;
    int x = await s.Get<int>();
    return x + 10;
}

async Task<int> myfunc1()
{
    AsyncStream s = open it;
    int x = await helper( true, s );
    return x;
}

Here "helper" is its own coroutine, and we have to block on it. Now it is worth noting that because WinRT is aggressive about delaying heap-allocation of coroutines and about running coroutines immediately, the actual flow of the two cases is not very different.

To be extra clear : lack of composability means you can't just have something like "cofread" which acts like synchronous fread , but instead of blocking the thread when it doesn't have enough data, it yields the coroutine.

You also can't write your own "cosemaphore" or "comutex" that yield instead of waiting the thread. (does WinRT provide cosemaphore and comutex? To have a fully functional coroutine-centric language you need all that kind of stuff. What does the normal C# Mutex do when used in a coroutine? Block the whole thread?)

There are a few places in the syntax that I find very dangerous due to their (false) apparent simplicity.

1. Args to coroutines are often references. When the coroutine is packaged into a struct and its execution delayed, what you get is a non-thread-safe pointer to some shared object. It's incredibly easy to write code like :

async void func1( SomeStruct^ s )
{
    MoreStuff( s );
}

where in fact every touch of 's' is potentially a race and bug.

2. There is no syntax required to start a coroutine. This means you have no idea if functions are async or not at the call site!

void func2()
{
    DeleteFile( name );
    CopyFile( source, name );
}

Does this code work? No idea! They might be coroutines, in which case DeleteFile might return before it's done, and then I would be calling CopyFile before the delete. (if it is a coroutine, the fix is to call "await", assuming it returned a Task).

Obviously the problem arises from side effects. In this case the file system is the medium for communicating side effects. To use coroutine/future code cleanly, you need to try to make all functions take all their inputs as arguments, and return all their effects as return values. Even if the return is not necessary, you must return some kind of proxy to the change as a way of expressing the dependency.

"async void" functions are probably bad practice in general; you should at least return a Task with no data (future< void >) so that the caller has something to wait on if they want to. async functions with side effects are very dangerous but also very common. The fantasy that we'll all write pure functions that only read their args (by value) and put all output in their return values is absurd.
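A sketch of that discipline in C++ terms : even a pure side-effect op returns a future<void> proxy so the dependency can be expressed. DeleteFileAsync/CopyFileAsync here are made-up stand-ins that just flip flags in place of touching the file system:

```cpp
#include <future>
#include <memory>

// Flags stand in for the real file system so the dependency ordering
// is observable. (Entire example is a hypothetical sketch.)
static bool g_deleted = false;
static bool g_copied_after_delete = false;

// Returns a proxy to the side effect instead of being "async void" :
// the caller gets something to chain on or wait for.
std::future<void> DeleteFileAsync() {
    return std::async(std::launch::async, [] { g_deleted = true; });
}

// Takes the prior effect's proxy as an argument, making the
// dependency explicit instead of implicit through the file system.
std::future<void> CopyFileAsync(std::future<void> dep) {
    auto shared = std::make_shared<std::future<void>>(std::move(dep));
    return std::async(std::launch::async, [shared] {
        shared->get();   // don't copy until the delete has completed
        g_copied_after_delete = g_deleted;
    });
}
```

With this shape the "does this code work?" question from func2 has an answer visible at the call site : the copy explicitly depends on the delete's proxy.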

It's pretty bold of them to make this the official way to write new code for Windows. As an experimental C# language feature, I think it's pretty decent. But good lord man. Race city, here we come. The days of software having repeatable outcomes are over!

As a software design point, the whole idea that "async improves responsiveness" is rather disturbing. We're gonna get a lot more trickle-in GUIs, which is fucking awful. Yes, async is great for making tasks that the user *expects* to be slow to run in the background. What it should not be used for is hiding the slowness of tasks that should in fact be instant. Like when you open a new window, it should immediately appear fully populated with all its buttons and graphics - if there are widgets in the window that take a long time to appear, they should be fixed or deleted, not made to appear asynchronously.

The way web pages give you an initial view and then gradually trickle in updates? That is fucking awful and should be used as a last resort. It does not belong in applications where you have control over your content. But that is exactly what is being heavily pushed by MS for all WinRT apps.

Having buttons move around after they first appeared, or having buttons appear after the window first opened - that is *terrible* software.

(Windows 8 is of course itself an example; part of their trick for speeding up startup is to defer more things until after startup. You now have to boot up, and then sit there and twiddle your thumbs for a few minutes while it actually finishes starting up. (there are some tricks to reduce this, such as using Task Scheduler to force things to run immediately at the user login event))

Some links :

Jerry Nixon @work Windows 8 The right way to Read & Write Files in WinRT
Task.Wait and "Inlining" - .NET Parallel Programming - Site Home - MSDN Blogs
CreateThread for Windows 8 Metro - Shawn Hargreaves Blog - Site Home - MSDN Blogs
Diving deep with WinRT and await - Windows 8 app developer blog - Site Home - MSDN Blogs
Exposing .NET tasks as WinRT asynchronous operations - Windows 8 app developer blog - Site Home - MSDN Blogs
Windows 8 File access sample in C#, VB.NET, C++, JavaScript for Visual Studio 2012
Futures and promises - Wikipedia, the free encyclopedia
Effective Go - The Go Programming Language
Deceptive simplicity of async and await
async (C# Reference)
Asynchronous Programming with Async and Await (C# and Visual Basic)
Creating Asynchronous Operations in C++ for Windows Store Apps
Asynchronous Programming - Easier Asynchronous Programming with the New Visual Studio Async CTP
Asynchronous Programming - Async Performance Understanding the Costs of Async and Await
Asynchronous Programming - Pause and Play with Await
Asynchronous programming in C++ (Windows Store apps) (Windows)
AsyncAwait Could Be Better - CodeProject
File Manipulation in Windows 8 Store Apps
SharpGIS Reading and Writing text files in Windows 8 Metro


12-15-12 - How to Lose Game Developer's Love

How to Lose Game Developer's Love ... using only Hello World.

MS has gone from by far the most beloved sweet simple console API to develop for to this :

main( some complicated args that don't matter because they don't work ^ hat )

    IPrintf^ p = System::GoodLord::Deprecated::stdio::COM::AreYouKiddingMe( IPrintfToken );
    p->OnReady( [this]{ return CharStreamer( StreamBufferBuilder( StringStreamer( StringBuffer( CharConcatenator('h') +
        CharConcatenator('e') + IQuitSoftware("llo world\n") )))) } ); 


(this example fails because it didn't request privilege elevation with the security token to access the console)

(and then still fails because it didn't list its imports correctly in its manifest xml)


12-13-12 - Windows 8

With each version of Windows it takes progressively longer to install and set up into a cbloom-usable state. Windows 8 now takes 3-4 days to find all the crud they're trying to shove down my throat and disable it. I've gotten it mostly sorted out but there are a few little things I haven't figured out how to disable :

1. The Win-X key. Win-X is mine; I've been using it to mean "maximize window" for the last 10 years. You can't have it. I've figured out how to disable their Win-X menu, but they still seem to be eating that key press somewhere very early, before I can see it. (they also stole my Win-1 and various other keys, but those went away with the NoWinKeys registry setting; Win-X seems unaffected by that setting).

2. Win 8 seems to have even more UAC than previous versions. As usual you can kill most of it by turning UAC down to min, setting Everyone to Full Control, and Taking Ownership of c:\ recursively. But apparently in Win 8 when you turn the UAC slider down to min, it no longer actually goes to off. Before Win 8, with UAC set to min all processes were "high integrity"; now processes have to request elevation from code. One annoyance I haven't figured out how to fix is that net-used and subst'ed drives are now per-user. eg. if you open an admin cmd and do a subst, the drive doesn't show up in your normal explorer (and vice-versa).

3. There seems to be no way to tweak the colors, and the default colors are really annoying. Somebody thought it was a good idea to make every color white or light gray so all the windows and frames just run together and you can't easily spot the edges. You *can* tweak individual colors if you choose a "high contrast" theme (it's pretty standard on modern Windows that you only get the options you deserve by pretending to be disabled (reasonable things like "no animations" are all hidden in "accessibility")) - however, the "high contrast" theme seems to confuse every app (devenv, firefox) such that they use white text on white backgrounds. Doh.

Once you get Win 8 set up, it's basically Win 7. I don't understand what they were thinking putting the tablet UI as the default on the desktop. Mouse/keyboard user interface is so completely different from jamming your big fat clumsy fingers into a screen that it makes no sense to try to use one on the other. You wouldn't put tiny little buttons on a tablet, so why are you putting giant ham-finger tablet buttons on my desktop? Oh well, easy to disable.

So far the only improvement I've noticed (over Win 7) is that Windows Networking seems massively improved (finally, thank god). It might actually be faster to copy files across a local network than to re-download them from the internet now.

Some general OS ranting :

An OS should be a blank piece of paper. It is a tool for *me* to create what *I* want. It should not have a visual style. It should not be "designed" any more than a good quality blank piece of paper is designed.

(that said I prefer the Win 8 look to anything MS has done since Win 2k (which was the pinnacle of Windows, good lord how sweet it would be if I could still use Win 2k); Aero was an abortion, you don't base your OS GUI design on WinAmp for fuck's sake, though at least with the Aero-OS'es you could easily get a "classic" look, which is not so easy any more)

It's almost impossible to find an OS that actually respects its users any more. I want control of *everything*. If you add some new feature, fine, let me turn it off. If you change my key mappings, fine, let me put them back the way I'm used to.

I despise multi-user OS'es. In my experience they never actually work for security, and they are a constant huge pain in the ass. If you all want to make multi-user OS'es, please just give me a way to get a no-users install with a flat file system and just one set of configs. Nobody but me will ever touch my computer, I don't need this extra layer of shit that adds friction every single day that I use a computer (is that config under "cbloom" or is it under "all users"? Fuck, why do I have to think about this, give me a god damn flat OS. Oh wait the config was under "administrators" or "local machine". ARG). I know this is not gonna happen. Urg.

While we're at it can we talk about how ridiculously broken Windows is in general now?

One of the most basic thing you might want to do with an OS is to take your apps and data and config from one machine and put it on another. LOL, good luck. I know, let's just take all the system hardware config and the user settings and the app installs and let's just shuffle them all up in a big haystack.

Any serious developer needs to be able to clone their dev software from one machine to another, or completely back up their code + dev tools (but without backing up the whole machine, and be able to restore to a different machine).

Obviously the whole DLL mess is a disaster (and now manifests, packages, SxS assemblies, .net frameworks, WTF WTF). It's particularly insane to me that MS can't even get it right with their own damn runtimes. How in hell is it that I can run an exe on Windows and get an "msvcrxx not found" error? WTF, just ship all your damn runtimes with Windows, it can't be that big. And even if you don't want to ship them all, how can you not just have a server that gives me the missing runtimes? It's so insane.

God help you if you are trying to write software that can build on various installs of windows. Oh you have Win Vista SP1 with Platform SDK N, that's a totally different header which is missing some functions and causes some other weird warning, and you need .net framework X on this machine and blah blah it's such a total mess.

12-13-12 - vcproj nightmare

Ridiculous. WTF were they thinking.

Ok, so XML suxors and all, but if you're going to use XML then *use XML*. When you rev the god damn devstudio you don't break the old file format, you just add new tags for whatever new crap you feel you need to add. You don't put the devstudio version in the header of the file, you put on the individual tags that are specific to that version.

If you need to do per-version settings files, put them in a different file than my basic list of what my source code is and how to build it. And of course don't mix up your GUI cache with my project data.

The thing that really boggles my mind is how they can make such a huge mistake, and then stick with it year after year. It's sort of understandable to make a mistake once (though I think this one was entirely avoidable), but then you go "whoah what a fuckup, let's change that". Nope.

(of course they've done the same thing with their flagship (Office). It's crazy broken that I can't at least load the text and basic formatting from any type of document into any version)


12-8-12 - Sandy Cars

I'm still keeping half an eye open for an E46 M3.

Something I've noticed is that in the last month or so a lot of cars with histories like this are popping up :

09/06/2012      70,668      Inspection Co.  New Jersey      Inspection performed
11/20/2012      74,471      Covert Ford Austin, TX          Car offered for sale

You're not fooling me, bub. I know what happened in NJ between those two dates! Beware!


12-6-12 - Theoretical Oodle Rewrite Continued

So, continuing on the theme of a very C++-ish API with "futures" , ref-counted buffers, etc. :

cbloom rants 07-19-12 - Experimental Futures in Oodle
cbloom rants 10-26-12 - Oodle Rewrite Thoughts

It occurs to me that this could massively simplify the giant API.

What you do is treat "array data" as a special type of object that can be linearly broken up. (I noted previously about having RW locks in every object and special-casing arrays by letting them be RW-locked in portions instead of always locking the whole buffer).

Then arrays could have two special ways of running async :

1. Stream. A straightforward futures sequence to do something like read-compress-write would wait for the whole file read to be done before starting the compress. What you could do instead is have the read op immediately return a "stream future" which would be able to dole out portions of the read as it completed. Any call that processes data linearly can be a streamer, so "compress" could also return a stream future, and "write" would then be able to write out compressed bits as they are made, rather than waiting on the whole op.

2. Branch-merge. This is less of an architectural thing than just a helper (you can easily write it client-side with normal futures); it takes an array and runs the future on portions of it, rather than running on the whole thing. But having this helper in from the beginning means you don't have to write lots of special case branch-merges to do things like compress large files in several pieces.
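A branch-merge helper of that sort might be sketched like this (names made up; std::async stands in for the worker pool, and the "merge" here is just gathering per-chunk results in order):

```cpp
#include <future>
#include <vector>
#include <cstddef>

// Sketch of a branch-merge helper : run "op" on fixed-size chunks of
// an array in parallel (the branch), then gather the per-chunk results
// in order (the merge). Hypothetical helper, not a real Oodle API.
template <typename Op>
std::vector<int> branch_merge(const std::vector<int> & data,
                              size_t chunk_size, Op op) {
    std::vector<std::future<int>> branches;
    for (size_t i = 0; i < data.size(); i += chunk_size) {
        size_t end = i + chunk_size < data.size() ? i + chunk_size
                                                  : data.size();
        // each branch works on one chunk [i, end)
        branches.push_back(std::async(std::launch::async,
            [&data, i, end, op] { return op(&data[i], end - i); }));
    }
    std::vector<int> merged;
    for (auto & b : branches)
        merged.push_back(b.get()); // merge : collect results in order
    return merged;
}
```

A "compress large file in several pieces" op slots straight into this : op is compress-one-chunk, and the merge step concatenates (here, just collects) the outputs.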

So you basically just have a bunch of simple APIs that don't look particularly Async. Read just returns a buffer (future). ReadStream returns a buffer stream future. They look like simple buffer->buffer APIs and you don't have to write special cases for all the various async chains, because it's easy for the client to chain things together as they please.

To be redundant, the win is that you can write a function like Compress() just like a synchronous buffer-to-buffer function, but its arguments can be futures and its return value can be a future.

Compress() should actually be a stackful coroutine, so that if the input buffer is a Stream buffer, then when you try to access bytes that aren't yet available in that buffer, you Yield the coroutine (pending on the stream filling).

Functions take futures as arguments and return futures.

Every function is actually run as a stackful coroutine on the worker threadpool.

Functions just look like synchronous code, but things like file IO cause a coroutine Yield rather than a thread Wait.

All objects are ref-counted and create automatic dependency chains.

All objects have built-in RW locks, arrays have RW locks on regions.

Parallelism is achieved through generic Stream and Branch/Merge facilities.

While this all sounds very nice in theory, I'm sure in practice it wouldn't work. What I've found is that every parallel routine I write requires new hacky special-casing to make it really run at full efficiency.

12-6-12 - The Oodle Dilemma

Top two Oodle (potential) customer complaints :

1. The API is too big, it's too complicated. There are too many ways of doing basically the same thing, and too many arguments and options on the functions.

2. I really want to be able to do X but I can't do it exactly the way I want with the API, can you add another interface?

Neither one is wrong.


12-5-12 - The Fiscal Cliff

First of all, let's talk about the general structure of the laws that are causing this problem. Our federal government, like a large number of states, now has a sort of one-way ratchet towards smaller government. These laws have been slipped in by Republicans with little attention paid, but they are very powerful and will drastically change American government in the future. Basically, the Republicans have already won and we can't stop them.

The structure of all these laws is basically the same : 1. allow tax cuts to pass by simple majority. 2. make tax raises difficult to pass (many states now require a 2/3 super-majority for tax increases (and they were already politically nearly impossible)). 3. set a debt limit and force mandatory cuts to balance the budget (many states actually have a debt limit of 0, every year's budget must be balanced, which is manifestly absurd given the variance of economies and the resulting receipts and costs).

They claim that this produces "fiscal responsibility" but of course they know that's a lie; the goal is small government and that's all it produces. If you wanted actual fiscal responsibility, you wouldn't cut taxes in flush times, instead you would require that governments save during surplus times to provide a cushion for recessionary times; you would also require that any tax *cut* is matched by spending cuts in order to pass. If you were actually fiscally responsible, you would allow deficits during recessions, but require them to be matched by tax raises in boom periods.

The result of these laws is obvious : a simple Republican majority can pass tax cuts when they are in power (and the dumb voters will love it), then it's almost impossible to put the taxes back where they were, and then you inevitably run out of money and have to cut spending. Particularly if you hit a recession and have to keep the budget balanced, you will have to slash government drastically.

(whether or not government should be minimal is open for debate, but the duplicitous method of achieving it is incontrovertibly scummy)

So first of all, let's recall the source of the fiscal cliff. It is not the growth of entitlements, which is sort of an unrelated long term issue that people love to mix in to any financial discussion. The primary causes of the short term deficit are the Bush tax cuts and the recession. (the other major factors are the war spending and TARP spending (etc)). This is not a fundamental problem of the way the US government is run, it's the combination of cutting taxes and increasing spending that happened under GWB.

The other major issue that we must keep in mind is that we are still currently deep in a recession. Tiny amounts of GDP growth may hide this, and the unemployment numbers look better, but I believe the reality is that the American economy is still deeply sick, with no real growth of industry and no prospects. Essentially we are propping it up with the free money from the Fed and the super-low taxes. Any attempt to return to a sustainable Fed interest rate and tax rate would show the economy for what it really is. Trying to tighten the belt now would certainly look bad; I don't say that it would "lead to a recession", I believe we are in a recession and are just hiding it with a candy coating.

Now, briefly about entitlements. The Republicans love to make entitlement growth seem really scary, but it's not true. Social Security can be made solvent very easily : simply make the SS tax non-regressive. The current SS tax is regressive because it's a flat percentage but has a maximum. If you simply remove the maximum, Social Security is solvent for the next 100 years (CBO numbers).

Medicare is a bigger problem, but not because of the increase of the number of elderly - rather due to the corruption of doctors and the medical establishment. With increased productivity and technology, the cost of health care should be going *down*, instead it rises at an obscene rate, because the insurance complex has cooked up a system where we have no control over the cost of our care. Unfortunately Obamacare has perhaps made this worse than ever, locking the corrupt health insurance system into law without taking any steps to limit private profits.

How do you actually fix the American economy and get some real growth that's not just an illusion propped up by free Fed money?

1. Legally require open systems. Make net neutrality law. Open up the cable-TV lines. Perhaps the best option is a national open broadband system on some new super-fast fiber (unrealistic). Make the Apple Store type of computer lockdown illegal. Openness and free competition for small business is what will really save this country.

2. Make it easier to start small businesses. Remove the favoritism for big business. Tax loopholes and breaks massively favor big business - eliminate them all. Eliminate all development and "green" subsidies, which again massively favor big business. Simplify the tax code (see below) and then perhaps even simplify it more for small businesses, like provide a super-basic flat tax option for businesses that make less than $1M a year.

3. Make it cheaper to hire Americans in America. Eliminate payroll taxes. Eliminate employer-run health care (or provide a national group option for small businesses). Increase taxes on corporate profits and aggressively go after offshoring of money.

4. Long term we're fucked no matter what. What would you say are the prospects for a country where the education system sucks (the cost of education continues to rise way faster than inflation, and most of the "educated" can't actually do anything useful), the IT infrastructure sucks, and the cost of labor is sky-high? You would say that country is doomed to poverty, and that it is.

A few proposals for real government taxing & spending reform :

O. Get corporate money out of politics or everything else is hopeless.

O. Stop the revolving door between government and private industry. eg. if you work on the Texas Railroad Commission, then you are not allowed to go work for the oil/gas industry (and vice-versa). Treasury secretaries shouldn't be allowed to rotate in and out of Wall Street. It's totally absurd, and as with corporate money in politics, everything else is hopeless until it's stopped.

O. Return defense spending to 1999 levels ($300 billion from the current $700+ billion). And then cut it even further. Never going to happen since defense is the biggest pork item in government (by far).

O. Stop all farm subsidies and tax breaks. They're a sick farce. The small family farmers that are trotted out for political purposes don't exist in the real world; farm subsidies go to large agribusiness and to rich people with vineyards. They're actually very bad for small farmers that are trying to legitimately compete because they massively favor big business. Not only does it make the American farm economy sick, we're destroying the entire world food economy with our export subsidies.

O. Stop all direct aid for ethanol, electric cars, etc. They're a sick distortion of the market that isn't helping anything except corrupt profit. Let the market find solutions to problems.

O. Stop sending federal money to leech states. (need to get rid of the ridiculous over-powering of small states caused by the Senate)

O. Eliminate all payroll taxes. Fund medicare, SS, unemployment, etc. from the general tax revenue. This massively simplifies the tax code, removes the regressive SS tax, and reduces the cost of employment.

O. Don't cut Medicare spending or fake out the inflation rate for COLA. Instead go after the reason why medical costs are rising out of control. Don't reimburse doctors for unnecessary procedures or scans. Don't reimburse for unnecessary MRI's. Don't allow any medical practitioner to pass on the fee in excess of the negotiated rates to the client. Require up-front pricing for all medical treatment. Force the AMA to stop its corrupt limiting of the number of doctors. etc. etc.

O. Eliminate capital gains tax. I don't mean reduce it, I mean treat all profit as profit - tax it as normal income. Eliminate the dividend loophole. Stop letting the super rich pay 10-15% tax rates.

O. Eliminate all tax deductions. Nothing is deductible (but raise the standard deduction so that a majority of Americans actually get to deduct more). Alternatively : raise the AMT and remove exceptions from the AMT.

O. Remove foreign residency as a way to avoid US taxes. If you do business in America, you pay US taxes. Same for corporate taxes. Remove non-income benefits as a way to avoid taxes; eg. company cars, apartments, dinners, etc. all count as income. Pass new laws so we can be more aggressive about going after holding companies or "consulting firms" as ways to hide personal income.

O. Make US companies pay for our foreign spending on their behalf. eg. if you're Chevron and want to run a pipeline through Afghanistan, fine go ahead, but then you pay for the Afghanistan war. etc. Almost all of our defense and foreign aid spending should be paid by the companies that do business in unstable countries.


12-1-12 - Hawk's return

The hawk returned (perhaps a different one). He missed his kill this time and I couldn't get a shot of him before he fled, but it did give me a chance to snap a photo of what the chickens do in response :

(I'm lifting the roof on their house) (there are five there, they're all sitting on top of the one broody hen that never leaves that box)

We've got hawthorn trees at the house which are unusual for Seattle (they don't belong here). In the fall/winter the leaves drop and they are covered in berries which are inedible to humans but are apparently like ambrosia to birds and squirrels. We get incursions from neighboring squirrels that the resident ones have to fend off with much shrieking, and of course lots of little birds come through in packs, which seems to be attracting the predator.

I was a bit worried that our cats would take advantage of this bountiful hunting ground (it's really perverse when people with cats set up bird feeders, and having super-delicious trees is not much removed from that), but so far that hasn't really happened.


11-29-12 - Unnatural

I hate having neighbors so much. It's just not a natural way to live, this modern human way, where we're all crammed together with people who are not our tribe.

I believe that human beings are only comfortable living with people they are intimate with. In ancient days this was your whole tribe, now it's usually just family. You essentially have no privacy from these people, and not even separate property. Trying to keep your own stuff is an exercise in frustration. You must trust these people and work together and open up to them to be happy. Certainly there is always friction in this, but it's a natural human existence, and even though it may give you lots to complain about, there will also be joy. (foolishly moving away from this way of life is the root of much unhappiness)

Everyone else is an enemy. If you aren't in my intimate tribal group, WTF are you doing near my home? This is my land where I have my family, I will fucking jab you in the eye with a pointy stick.

I'm not really comfortable with "friends". So-called "friends" are not your friends; they will make fun of you behind your back, they will let you down when you need help. You can't ever really open up and admit things to them, you can't show your weaknesses, they will mock you for it or use your weaknesses against you. It's so awful to me the way normal people talk to each other; everyone is pretending to be strong and happy all the time, nobody ever talks about anything serious, some people put on a big show to be entertaining, it's all just so tense and shallow and unpleasant. The reason is that these people are not in my tribe, hence they are my enemies, and this whole "friends" thing is a very modern societal invention that doesn't really work.

I realized a while ago that this is one of the things I hate about going into the office. The best office experiences I've had have been the ones where it was a small team, we were all young and spent lots of time at work, and we actually bonded and had fun together and were almost like a family after several years of crunching (going through tough shit together is a classic way to brainwash people into acting like a tribe); at that point, it feels comfortable to go in to work, you can rip a fart and people laugh instead of tensely pretending that nothing happened. But most of the time an office never reaches that level of intimacy, the people in the office are just acquaintances, so you're in this space where you're supposed to be relaxed, and there are people walking around all the time looking over your shoulder, but they are enemies! Of course I can't relax, they're not my tribe, why are they in my space? It's terrible.

Going away from home to work is really unnatural.

At first when people start working from home it feels weird because they're so used to leaving, but really this whole going to a factory/office thing is a super-modern invention (last few hundred years).

Of course you should stay home and work your farm. You should build your house, till your field, and be there to protect your women and children (eg. in the modern world : open jars for them). Of course you should have your children beside you so that you can talk to them and teach them your trade as you work.

Of course when you're hungry you should go in to your own kitchen and eat some braised pork shoulder that's real simple hearty food cooked by your own hands, not the poisonous filth that restaurants purvey.

You shouldn't leave your family for 8 hours every day, that's bizarre and horrible. You should see your livestock running around, be there to shoo away the neighbors' cats, see the trees changing color, and put your time and your love into what is really yours.


11-28-12 - SSD Buying Guide

I've done a bunch of reading over the past 24 hours and updated my old posts on SSD's :

cbloom's definitive SSD buying guide :

recommended :

Intel's whatever (currently the 520, but actually the old X25-M is still just fine; the S3700 stuff looks promising for the future)

not recommended :

Everything else.

The whole issue of flash degradation and moving blocks and such is a total red herring. SSD's are not failing because of the theoretical lifetime of flash memory, they are failing because the non-Intel drives are just broken. It's pretty simple, don't buy them.

The other issue I really don't care about is speed. They're all super fast. If they all actually worked then maybe I would care which was fastest, but since the non-Intel ones are just broken, the question of speed is irrelevant. The hardware review sites are all pretty awful with their endless benchmarking and complete missing of the actual issues. And even my ancient X25-M is plenty fast enough.

I think it's tempting to just go for the enterprise-grade stuff (Intel 710 at the moment). Saving money on storage doesn't make any sense to me, and all the speed measurement stuff just makes me yawn. (Intel 720 looks promising for the future). It's not quite as clear cut as ECC RAM (which is obviously worth it), but I suspect that spending another few $hundred to not worry about drive failure is worth it.

Oh, also brief googling indicates various versions of Mac OS don't support various SSD's correctly. I would just avoid SSD's on Mac unless you are very confident about getting this right. (best practice is probably just avoiding Mac altogether, but YMMV and various other acronyms)


11-26-12 - VAG sucks

(VAG = Volkswagen Auto Group)

Going through old notes I found this (originally from Road and Track) :

"For instance, just about every Audi, Porsche and Volkswagen model that I've driven in the U.S. doesn't allow throttle/brake overlap. Our long-term Nissan 370Z doesn't, either, which is a big reason why I'm not particularly keen on taking it out for a good flog; overlap its throttle and brake just a little bit and the Z cuts power for what seems an eternity (probably about two seconds)."

VAG makes fucked up cars. I certainly won't ever buy a modern one again. They have extremely intrusive computers that take the power for LOLs out of the driver's hands. (apparently the 370Z has some of this stupidity as well; this shit does not belong in cars that are sold as "driver's cars").

(in case it wasn't clear from the above : you cannot left-foot-brake a modern Porsche with throttle overlap. Furthermore, you also can't trail-brake oversteer a modern Porsche because ESC is always on under braking. You have to be careful going fully off throttle and then back on due to the off-throttle-timing-advance. etc. etc. probably more stupid shit I'm not aware of. This stuff may be okay for most drivers in their comfort saloons, but is inexcusable in a sports car)

Anyway, I'm posting cuz this reminded me that I found another good little mod for the 997 :

Stupid VAG computer has clutch-release-assist. What this does is change the engine map in the first few seconds after you let the clutch out. The reason they do this is so that incompetent old fart owners don't stall the car when pulling away from a light, and also to help you not burn the clutch so much. (the change to the engine map increases the minimum throttle and also reduces the max).

If you actually want to drive your car and do hard launches and clutch-kicks and generally have fun, it sucks. (the worst part is when you do a hard launch and turn, like when you're trying to join fast traffic, and you get into a slight slide, which is fine and fun, but then in the middle of your maneuver the throttle map suddenly changes back as the clutch-assist phase ends, and the car sort of lurches and surges weirdly, it's horrible). Fortunately disabling it is very easy :

There's a sensor that detects clutch depression. It's directly above the clutch in the underside of the dash. You should be able to see the plastic piston for the sensor near the hinge of the clutch pedal. All you have to do is unplug the sensor (it's a plastic clip fitting).

With the sensor unplugged you get no more clutch-release-assist and the car feels much better. You will probably stall it a few times as you get used to the different throttle map, but once you're used to it smooth fast starts are actually easier. (oh, and pressing the clutch will no longer disable cruise control, so be aware of that). I like it.

(aside : it's a shame that all the car magazines are such total garbage. If they weren't, I would be able to find out if any modern cars are not so fucked. And you also want to know if they're easy to fix; problems that are easy to fix are not problems)

(other aside : the new 991-gen Cayman looks really sweet to me, but there are some problems. I was hoping they would use the longer wheelbase to make the cabin a bit bigger, which apparently they didn't really do. They also lowered the seat and raised the door sills which ruin one of the great advantages of the 997-gen Porsches (that they had not adopted that horrible trend of excessively high doors and poor visibility). But the really big drawback is that I'm sure it's all VAG-ed up in stupid ways that make it annoying for a driver. And of course all the standard Cayman problems remain, like the fact that they down-grade all the parts from the 911 in shitty ways (put the damn multi-link rear suspension on the Cayman you assholes, put an adjustable sway on it and adjustable front control arms))

(final aside : car user interface design is generally getting worse in the last 10-20 years. Something that user interface designers used to understand but seem to have forgotten is that the most potent man-machine bond develops when you can build muscle memory for the device, so that you can use it effectively with your twitch reflexes that don't involve rational thought. In order for that to work, the device must behave exactly the same way at all times. You can't have context-sensitive knobs. You can't have the map of the throttle or brake pedal changing based on something the car computer detected. You must have the same outcome from the same body motion every time. This must be an inviolable principle of good user interfaces.)

11-26-12 - Chickens and Hawks

Hawk with kill in our yard :

And in context :

The chickens were out free ranging at the time; they all ran inside the coop and climbed back into the farthest corner nesting box and sat on top of each other in a writhing pile of terrified chickens.

Watching animals is pretty entertaining. I remember when I was younger, I used to think it was a pathetic waste of time. Old people would sit around and watch the cats play, or get all excited about going on safari or whatever, and I would think "ppfft, boring, whatever, I've seen it on TV, what a sad vapid way to get entertainment, you oldsters are all so brain-dead, doing nothing with your time, you could be learning quantum field theory, but you've all just given up on life and want to sit around smiling at animals". Well, that's me now.


11-24-12 - The Adapted Eye

Buying a house for a view is a big mistake. (*)

Seattle is a somewhat beautiful place (I'm not more enthusiastic because it is depressing to me how easily it could have been much better (and it continues to get worse as the modern development of Cap Hill and South Lake Union turn the city into a generic condo/mall dystopia)) but I just don't see it any more. When we got back from CA I realized that I just don't see the lake and the trees anymore, all I see is "home".

There are some aspects that still move me, like clear views of the Olympics, because they are a rare treat. But after 4 years, the beauty all around is just background.

We have pretty great views from our house, and I sort of notice them, but really the effect on happiness of the view is minimal.

(* = there are benefits to houses with a view other than the beauty of the view. Usually a good view is associated with being on a hill top, or above other people, or up high in a condo tower, and those have the advantages of being quieter, better air, more privacy, etc. Also having a view of nature is an advantage just in the fact that it is *not* a view of other people, which is generally stressful to look at because they are doing fucked up things that you can't control. I certainly appreciate the fact that our house is above everyone else; it's nice to look down on the world and be separate from it).

I was driving along Lake Wash with my brother this summer and he made some comment about how beautiful it was, and for a second there I just couldn't figure out what he was talking about. I was looking around to see if there was some new art installation, or if Mount Rainier was showing itself that day, and then I realized that he just meant the tree lined avenue on the lake and the islands and all that which I just didn't see at all any more.

Of course marrying for beauty is a similar mistake. Even ignoring the fact that beauty fades, if we imagine that it lasted forever it would still be a mistake because you would stop seeing it.

I've always thought that couples could keep the aesthetic interest in each other alive by completely changing their style every few years. Like, dress as a hipster for a while, dress as a punk rocker or a goth, dress as a preppy business person. Or get drastically different hair cuts, like for men grow out your hair like an 80's rocker, or get a big Morrissey pompadour, something different. Most people over 30 tend to settle into one boring low-maintenance style for the rest of their lives, and it becomes invisible to the adapted eyes in their lives.

I suppose there are various tricks you can use; like rather than have your favorite paintings on the wall all the time, rotate them like a museum, put some in storage for a while and hang up some others. It might even help to roll some dice to forcibly randomize your selection.

I guess the standard married custom of wearing sweats around the house and generally looking like hell is actually a smart way of providing intermittent reward. It's the standard sitcom-man refrain to complain that your wife doesn't fancy herself up any more, but that's dumb; if she did dress up every day, then that would just become the norm and you would stop seeing it. Better to set the baseline low so that you can occasionally have something exceptional.

(add : hmm, the generalized point that you should save your best for just a few moments and be shitty the rest of the time is questionable. Think about behavior. Should you intentionally be kind of dicky most of the time and occasionally really nice? If you're just nice all the time, that becomes the baseline and people take it for granted. I'm not sure about that. But certainly morons do love the "dicky dad" character on TV and in movies; your typical fictional football coach is a great example; dicky dad is stern and tough, scowly and hard on you, but then takes you aside and is somewhat kind and generous, and all the morons in the audience melt and just eat that shit up.)

One of the traps of life is optimizing things. You paint your walls your favorite color for walls, you think you're making things better, but that gets you stuck in a local maximum, which you then stop seeing, and you don't feel motivated to change it because any change is "worse".

I realized the other day that quite a few ancient societies actually have pretty clever customs to provide randomized rewards. For example lots of societies have something like "numbers" , which ignoring the vig, is just a way of taking a steady small income and turning it into randomized big rewards.

Say you got a raise and make $1 more a day. At first you're happy because your life got better, but soon that happiness is gone because you just get used to the new slightly better life and don't perceive it any more. If instead of getting that $1 a day, you instead get $365 randomly on average once a year, your happiness baseline is the same, but once in a while you get a really happy day. This is probably actually better for happiness.

I think the big expensive parties that lots of ancient societies throw for special events might be a similar thing. Growing up in LA we would see our poor Latino neighbors spend ridiculous amounts on a quinceañera or a wedding and think how foolish it was, surely it's more rational to save that money and use it for health care or education or a nicer house. But maybe they had it right? Human happiness is highly resistant to rational optimization.


11-23-12 - Global State Considered Harmful

In code design, a frequent pattern is that of singleton state machines. eg. a module like "the log" or "memory allocation" which has various attributes you set up that affect its operation, and then subsequent calls are affected by those attributes. eg. things like :

Log_SetOutputFile( FILE * f );


Log_Printf( const char * fmt, ... );

or :

malloc_setminimumalignment( 16 );


malloc( size_t size );

The goal of this kind of design is to make the common use API minimal, and have a place to store the settings (in the singleton) so they don't have to be passed in all the time. So, eg. Log_Printf() doesn't have to pass in all the options associated with logging, they are stored in global state.

I propose that global state like this is the classic mistake of improving the easy case. For small code bases with only one programmer, they are mostly okay. But in large code bases, with multi-threading, with chunks of code written independently and then combined, they are a disaster.

Let's look at the problems :

1. Multi-threading.

This is an obvious disaster and pretty much a nail in the coffin for global state. Say you have some code like :

pcb * previous_callback = malloc_setfailcallback( my_malloc_fail_callback );

void * ptr = malloc( big_size ); 

malloc_setfailcallback( previous_callback );

this is okay single threaded, but if other threads are using malloc, you just set the "failcallback" for them as well during that span. You've created a nasty race. And of course you have no idea whether the failcallback that you wanted is actually set when you call malloc because someone else might change it on another thread.

Now, an obvious solution is to make the state thread-local. That fixes the above snippet, but sometimes you want to change the state so that other threads are affected. So now you have to have thread-local versions and global versions of everything. This is a viable, but messy, solution. The full solution is :

There's a global version of all state variables. There are also thread-local copies of all the global state. The thread-local copies have a special value that means "inherit from global state". The initial value of all the thread-local state should be "inherit". All state-setting APIs must have a flag for whether they should set the global state or the thread-local state. Scoped thread-local state changes (such as the above example) need to restore the thread-local state to "inherit".

This can be made to work (I'm using it for the Log system in Oodle at the moment) but it really is a very large conceptual burden on the client code and I don't recommend it.
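To make the inherit-from-global scheme concrete, here's a minimal sketch in C11. The names (Log_SetVerbosityGlobal etc.) are hypothetical, not any real Oodle API; the point is just the mechanism: each thread-local slot starts at a sentinel meaning "inherit", and reads fall through to the global value.

```c
#include <stddef.h>

/* sentinel meaning "inherit the global value" */
#define LOG_VERBOSITY_INHERIT  (-1)

/* the global setting, shared by all threads */
static int g_log_verbosity = 1;

/* per-thread override; starts as "inherit" */
static _Thread_local int t_log_verbosity = LOG_VERBOSITY_INHERIT;

/* set the global value (affects every thread that hasn't overridden it) */
void Log_SetVerbosityGlobal( int v ) { g_log_verbosity = v; }

/* override for the current thread only */
void Log_SetVerbosityLocal( int v ) { t_log_verbosity = v; }

/* reads resolve the thread-local slot first, falling through to the global */
int Log_GetVerbosity( void )
{
    return ( t_log_verbosity == LOG_VERBOSITY_INHERIT )
        ? g_log_verbosity
        : t_log_verbosity;
}
```

A scoped change then saves the old thread-local value and restores it when done (usually restoring it to LOG_VERBOSITY_INHERIT), never touching the global, so it can't race with other threads.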

There's another way that these global-state singletons are horrible for multi-threading: they create dependencies between threads that are not obvious or intentional. A little utility function that just calls some simple functions silently picks up these ties to shared variables, and now needs synchronization protection against the global state. This is related to :

2. Non-local effects.

The global state makes the functions that use it non-"pure" in a very hidden way. It means that innocuous functions can break code that's very far away from it in hidden ways.

One of the classic disasters of global state is the x87 (FPU) control word. Say you have a function like :

void func1()

    set x87 CW

    do a bunch of math that relies on that CW

    call func2()

    do more math that relies on CW

    restore CW
Even without threading problems (the x87 CW is thread-local under any normal OS), this code has nasty non-local effects.

Some branch of code way out in func2() might rely on the CW being in a certain state, or it might change the CW and that breaks func1().

You don't want to be able to break code very far away from you in a hidden way, which is what all global state does. Particularly in the multi-threaded world, you want to be able to detect pure functions at a glance, or if a function is not pure, you need to be able to see what it depends on.

3. Undocumented and un-asserted requirements.

Any code base with global state is just full of bugs waiting to happen.

Any 3d graphics programmer knows about the nightmare of the GPU state machine. To actually write robust GPU code, you have to check every single render state at the start of the function to ensure that it is set up the way you expect. Good code always expresses (and checks) its requirements, and global state makes that very hard.

This is a big problem even in a single-source code base, but even worse with multiple programmers, and a total disaster when trying to copy-paste code between different products.

Even something like taking a function that's called in one spot in the code and calling it in another spot can be a hidden bug, if it relied on some global state that was set up in just the right way in that original spot. That's terrible; as much as possible, functions should be self-contained and work the same no matter where they are called. It's sort of like "movement of call site invariance symmetry" ; the action of a function should be determined only by its arguments (as much as possible), and any memory locations that it reads should be as clearly documented as possible.

4. Code sharing.

I believe that global state is part of what makes C code so hard to share.

If you take a code snippet that relies on some specific global state out of its context and paste it somewhere else, it no longer works. Part of the problem is that nobody documents or checks that the global state they need is set. But a bigger issue is :

If you take two chunks of code that work independently and just link them together, they might no longer work. If they share some global state, either intentionally or accidentally, and set it up differently, suddenly they are stomping on each other and breaking each other.

Obviously this occurs with anything in stdlib, or on the processor, or in the OS (for example there are lots of per-process settings in Windows; eg. if you take some libraries that want a different timer period, or process priority class, or privilege level, etc. etc. you can break them just by putting them together).

Ideally this really should not be so. You should be able to link together separate libs and they should not break each other. Global state is very bad.

Okay, so we hate global state and want to avoid it. What can we do? I don't really have the answer to this because I've only recently come to this conclusion and don't have years of experience, which is what it takes to really make a good decision.

One option is the thread-local global state with inheritance and overrides as sketched above. There are some nice things about the thread-local-inherits-global method. One is that you do still have global state, so you can change the options somewhere and it affects all users. (eg. if you hit 'L' to toggle logging, that can change the global state, and any thread or scope that hasn't explicitly set it picks up the global option immediately).

Other solutions :

1. Pass in everything :

When it's reasonable to do so, try to pass in the options rather than setting them on a singleton. This may make the client code uglier and longer to type at first, but is better down the road.

eg. rather than

malloc_set_alignment( 16 );

malloc( size );

you would do :

malloc_aligned( size , 16 );

One change I've made to Oodle is taking state out of the async systems and putting it in the args for each launch. It used to be like :

OodleWork_SetKickImmediate( OodleKickImmediate_No );
OodleWork_SetPriority( OodlePriority_High );
OodleWork_Run( job );

and now it's :

OodleWork_Run( job , OodleKickImmediate_No, OodlePriority_High );

2. An options struct rather than lots of args.

I distinguish this from #3 because it's sort of a bridge between the two. In particular I think of an "options struct" as just plain values - it doesn't have to be cleaned up, it could be const or made with an initializer list. You just use this when the number of options is too large and if you frequently set up the options once and then use it many times.

So eg. the above would be :

OodleWorkOptions wopts = { OodleKickImmediate_No, OodlePriority_High  };
OodleWork_Run( job , &wopts );

Now I should emphasize that we have already given ourselves great power and flexibility. The options struct could just be global, and then you have the standard mess with that. You could have it in the TLS so you have per-thread options. And then you could locally override even the thread-local options in some scope. Subroutines should take OodleWorkOptions as a parameter so the caller can control how things inside are run; otherwise you lose the ability to affect child code, which a global state system does give you.

Note also that options structs are dangerous for maintenance because of the C default initializer value of 0 and the fact that there's no warning for partially assigned structs. You can fix this by either making 0 mean "default" for every value, or making 0 mean "invalid" (and assert) - do not have 0 be a valid value which is anything but default. Another option is to require a magic number in the last value of the struct; unfortunately this is only caught at runtime, not compile time, which makes it ugly for a library. Because of that it may be best to only expose Set() functions for the struct and make the initializer list inaccessible.

The options struct can inherit values when it's created; eg. it might fill any non-explicitly-given values (eg. the 0 default) by inheriting from global options. As long as you never store options (you just make them on the stack), and each frame tick you get back to a root for all threads that has no options on the stack, then global options percolate out at least once a frame. (so for example the 'L' key to toggle logging will affect all threads on the next frame).
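Here's a minimal sketch of the 0-means-default convention with inheritance from globals. These enums and names are hypothetical (not the real Oodle values); the key property is that a zeroed or partially-initialized struct is still meaningful, because every zero field resolves to the global default at the use site :

```c
#include <stddef.h>

/* hypothetical option values; 0 is reserved to mean "use the default" */
enum { WorkPriority_Default = 0, WorkPriority_Low = 1, WorkPriority_High = 2 };
enum { WorkKick_Default = 0, WorkKick_No = 1, WorkKick_Yes = 2 };

typedef struct WorkOptions
{
    int priority;  /* 0 = inherit from global */
    int kick;      /* 0 = inherit from global */
} WorkOptions;

/* global defaults that zeroed fields inherit from */
static WorkOptions g_work_defaults = { WorkPriority_Low, WorkKick_Yes };

/* fill any zero (default) field from the globals; done at the use site,
   so options are never stored and global changes percolate out */
WorkOptions WorkOptions_Resolve( const WorkOptions * opts )
{
    WorkOptions r = g_work_defaults;
    if ( opts )
    {
        if ( opts->priority != WorkPriority_Default ) r.priority = opts->priority;
        if ( opts->kick     != WorkKick_Default     ) r.kick     = opts->kick;
    }
    return r;
}
```

With this convention, `WorkOptions w = { WorkPriority_High };` (kick left at 0) is safe: the unassigned field picks up the global default rather than silently becoming a bogus valid value.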

3. An initialized state object that you pass around.

Rather than a global singleton for things like The Log or The Allocator, this idea is to completely remove the concept that there is only one of those.

Instead, Log or Allocator is a struct that is passed in, and must be used to do those options. eg. like :

void FunctionThatMightLogOrAllocate( Log * l, Allocator * a, int x, int y )
{
    if ( x )
        Log_Printf( l, "some stuff" );

    if ( y )
    {
        void * p = malloc( a, 32 );

        free( a, p );
    }
}

now you can set options on your object, which may be a per-thread object or it might be global, or it might even be unique to the scope.

This is very powerful, it lets you do things like make an "arena" allocator in a scope ; the arena is allocated from the parent allocator and passed to the child functions. eg :

void MakeSuffixTrie( Allocator * a, U8 * buf, int bufSize )
{
    Allocator_Arena arena( a, bufSize * 4 );

    MakeSuffixTrie_Sub( &arena, buf, bufSize );
}

The idea is there's no global state, everything is passed down.

At first the fact that you have to pass down a state pointer to use malloc seems like an excessive pain in the ass, but it has advantages. It makes it super clear in the signature of a function which subsystems it might use. You get no more surprises because you forgot that your Mat3::Invert function logs about degeneracy.

It's unclear to me whether this would be too much of a burden in real world large code bases like games.
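To flesh out the arena idea in plain C (the names here are hypothetical, loosely following the MakeSuffixTrie sketch above, not any real library API) : an Allocator is just a struct of function pointers that gets passed down explicitly, and an arena is an Allocator built on top of a parent Allocator :

```c
#include <stddef.h>
#include <stdlib.h>

/* a minimal allocator "object" : passed down explicitly, no global state */
typedef struct Allocator Allocator;
struct Allocator
{
    void * (*alloc)( Allocator * a, size_t size );
    void   (*free_)( Allocator * a, void * p );
};

/* the default allocator just forwards to the CRT */
static void * heap_alloc( Allocator * a, size_t size ) { (void)a; return malloc( size ); }
static void   heap_free ( Allocator * a, void * p )    { (void)a; free( p ); }
static Allocator g_heap = { heap_alloc, heap_free };

/* an arena : grabs one block from the parent, then bumps a pointer;
   individual frees are no-ops, the whole arena is released at once */
typedef struct Arena
{
    Allocator base;     /* must be first so Arena* casts to Allocator* */
    Allocator * parent;
    char * buf;
    size_t used, cap;
} Arena;

static void * arena_alloc( Allocator * a, size_t size )
{
    Arena * ar = (Arena *)a;
    if ( ar->used + size > ar->cap ) return NULL;  /* out of arena space */
    void * p = ar->buf + ar->used;
    ar->used += size;
    return p;
}
static void arena_free( Allocator * a, void * p ) { (void)a; (void)p; }

Arena Arena_Create( Allocator * parent, size_t cap )
{
    Arena ar;
    ar.base.alloc = arena_alloc;
    ar.base.free_ = arena_free;
    ar.parent = parent;
    ar.buf = (char *) parent->alloc( parent, cap );
    ar.used = 0;
    ar.cap = cap;
    return ar;
}

void Arena_Destroy( Arena * ar )
{
    /* release the whole arena back to the parent in one call */
    ar->parent->free_( ar->parent, ar->buf );
}
```

Child functions just take an `Allocator *` and neither know nor care whether they were handed the heap or a scoped arena; that's the whole point of passing the state down instead of having one global allocator.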

old rants