7/28/2012

07-28-12 - DeUnicode 1.1

Made a few revisions to DeUnicode. Mainly because I got sick of continuing to run into stupid file name problems, you now have the option to even further reduce the set of allowable characters.

To make file names that are (almost) guaranteed to not cause any problems, you can use

DeUnicode -a -c -s *
where -a makes the output 7-bit ascii (instead of 8-bit windows "A" code page (not ANSI), which is the default), -c removes all special characters that are used in my console ("*^;[]()" etc., as well as anything under 32), and -s removes spaces.

Once you do all that, hey you actually get a fucking file name that you can use in the real world. While *some* apps work with some of those characters and *some* command shells work with some of those characters, if you have any of them, it's quite possible you will run somebody else's wild card traverse and it will do unexpected things, which is very very very (very very very) bad. Particularly when deleting or renaming.

usage :

c:\src\deunicode>deunicode -?
DeUnicode v1.1 by cbloom
usage:
 DeUnicode [-opts] <dir>
-r : recurse
-t : special the ("the blah" -> "blah, the")
-c : remove chsh special characters
-s : remove spaces
-a : ascii output [else 'A' code page]
-d : display only (don't do)
-q : quiet
-v : verbose
DeUnicode.zip at cbloom.com

7/23/2012

07-23-12 - Structs are not what you want

I was looking at some stupid struct in my code which was something like
struct Stuff
{
    U16     m_val1;
    U32     m_val2;
    U16     m_val3;
};
because of the order of variables and the standard packing, it wound up taking 12 bytes instead of 8. In this particular case, the types were actually templates or #defined types so I don't know statically that the ordering is fucked up and causing that problem.

It occured to me that the whole C "struct" thing is really not what we want most of the time. 99% of the time I want to say "put all these values in this thing, and I don't care what order". In particular, Mr. Compiler, you can optimize the order to minimize size, or for speed, or whatever.

Now that's not a big deal with just a simple struct, but more generally it's massive.

What if I could say "bag" is some values with certain types and names. Then when I combine bags, you don't duplicate things that are in both. And you can reinterpret bags to other bags as long as they have the needed values.


bag Particle
{
    Vec3    m_pos;
    ColorDW m_color;
};

bag GameObject
{
    Vec3    m_pos;
    String  m_name;
};

bag GameObject_AndColor : GameObject
{
    ColorDW m_color;
};

now a GameObject_AndColor can be passed to any function that wants a Particle because it provides all the needed fields.
alternatively :

bag GameObjectParticle : GameObject, Particle
{
};

only has one m_pos.

Obviously this doesn't work with C functions which are compiled once and get to know the offset to the data in that one compile. You want a compiler that can compile all the logic but leave the variable references unspecified until it is bound to a concrete type.

Now obviously templates sort of address this, but in a much uglier way.

Templates are worse in 2 ways. 1. they don't do any shared compilation, they leave all compilation to the last minute when they know the concrete type they get to work on; conversely bags can do 99% of the compilation up front and just need to specialize the variable addressing per usage. 2. they don't do any static type checking; that is, when you write the template you can't specify in any kind of clean way "you can use this template on anything that provides m_pos and m_color" (C++ concepts were supposed to sort of address this) ; bags provide this very nicely and let you write a template and error-check it before you even apply it to anything.

Obviously templates are more powerful, but not useable in reality because of these problems; they delay compilation too much for widespread use, it makes compilation too slow, and puts the errors in the wrong place (at the use site, not the template itself). Compilation up front is great (I think weakly typed languages are ridiculous disasters).

But in theory bags can do even more. One annoying problem we have at RAD is that factored out functions can be massively slower than macros. There are various reasons for this, but one is that if you have some variables in registers and want to call a function on them, the compiler will almost always do a ton of work to move those variables around that it doesn't need to do (either writing them back to memory or pushing them on the stack, or just moving them to other registers).

With bags in theory you could do something like :


Vec3 pos1;
int i;
ColorDW c;

... some code that sets up pos1 and c ...

bag Particle p = { pos &= pos1, color &= c } ;  

// this is not a copy
// it says that the variables I already have can be treated as a Particle bag

Particle_Move(p);

// Particle_Move generates code that acts directly on my local variables "pos1" and "c" , no copying

The win is that Particle_Move basically acts like a macro, but I get the type-safety and separate compilation error check benefits of an inline function.

Similarly, I shouldn't have to write new code every time I have a bunch of data that I want to use as SOA (structure of arrays) instead of AOS (array of structures). eg. if I have


Vec3 positions[100];
ColorDW colors[100];

I should be able to use Particle_ functions on those, because they provide the values needed to make a valid bag.

Being a bit redundant : the classic way that simple C inheritance fails happens a lot with vertex types in graphics. It goes something like this :


struct Vertex_Pos { Vec3 pos; }

// functs that only need a "pos" member act on a Vertex_Pos

struct Vertex_PosNormal : Vertex_Pos { Vec3 normal; }

// functs that need a pos and normal act on Vertex_PosNormal

struct Vertex_PosColor : Vertex_Pos { Color color; }

// functs that need a pos and color act on Vertex_PosColor

struct Vertex_PosNormalColor : ??

// oh crap, which do I do ?

struct Vertex_PosNormalColor : Vertex_PosNormal { Color color; }
struct Vertex_PosNormalColor : Vertex_PosColor { Vec3 normal; }

// either way is busted
// with bags I should be able to just do :

bag Vertex_PosNormalColor { Vec3 pos; Vec3 normal; Color color; }

// (or either of the above inheritance ways)

and then the bag Vertex_PosNormalColor can be used as a Vertex_PosNormal or Vertex_PosColor without even explicitly specifying any inheritance. (an ideal compiler would also let you specify that as a compile-time constraint on the type).

Now obviously inheritance and virtual functions solve a slightly different problem than bags. They let you act on part of a type without knowing the concrete type. Bags have to be used on the concrete type, just like templates. I guess the fundamental problem with structs (and why they're wrong for some uses) is that they are actually solving this slightly different problem.

Anyhoo.

7/19/2012

07-19-12 - Experimental Futures in Oodle

I don't know if this will ever see the light of day, but it's fucking sexy as hell so here's a sneak peek.

"futures" implemented in C++98 with Oodle :


void example_future_comp_decomp()
{
    future<OodleBufferRC> rawBuf = oodle_readfile("r:\\oodle_example_future_input");
    
    // call :
    // oodle_compress_sync( rawBuf, OodleLZ_Compressor_LZH, OodleLZ_CompressSelect_Fast );
    // but not until rawBuf is done :
    future<OodleBufferRC> compBuf = start_future<OodleBufferRC>( oodle_compress_sync, rawBuf, OodleLZ_Compressor_LZH, OodleLZ_CompressSelect_Fast );
    
    future<const char *> write = start_future<const char*>(oodle_writefile,"r:\\oodle_example_future_comp",compBuf);
        
    future<OodleBufferRC> read_compBuf = start_future<OodleBufferRC>( oodle_readfile, write ); 
    
    future<OodleBufferRC> read_decompBuf = start_future<OodleBufferRC>( oodle_decompress_sync, read_compBuf );
    
    start_future<const char *>( oodle_writefile, "r:\\oodle_example_future_decomp",read_decompBuf);
}

This creates an async chain to read a file, compress it, write it, then read it back in, decompress it, and write out the decompressed bits.

Futures can take either immediates as arguments or other futures. If they take futures as arguments, they enqueue themself to run when their argument is ready (using the forward dependency system). Dependencies are all automatic based on function arguments; it occurs to me that this rather like the way CPU's do scheduling for out-of-order-processing.

(in contrast to idea #1 in Two Alternative Oodles , here we do not get the full async graph in advance, it's given to us as we get commands, that is, we're expected to start running things immediately when we get the command, and we don't get to know what comes next; but, just like in CPU's, the command submission normally runs slightly ahead of execution (unless our pipeline is totally empty), in which case we have a little bit of time gap)

Functions called by the future can either return values or return futures. (eg, in the code above, "oodle_readfile" could just return an OodleBufferRC directly, or it could return a future to one). If a function in a future returns a future, then the returned future replaces the original, and doesn't return to the outer scope until the chain of futures returns a non-future value. That is, this is a way of doing coroutine yields basically; when you want to yield, you instead return a future to the remaining work. (this is like the lambda-style coroutine yield that we touched on earlier). (* - see example at end)

future of course has a wait() method that blocks and returns its value. As long as you are passing futures to other futures, you never have to wait.

You can implement your own wait_all thusly :


int nop(...)
{
    return 0;
}

template <typename t_arg1,typename t_arg2,typename t_arg3,typename t_arg4>
future<int> done_when_all( t_arg1 a, t_arg2 b, t_arg3 c, t_arg4 d )
{
    return start_future<int>( nop, a,b,c,d );
}

then call

done_when_all( various futures )->wait();

A few annoying niggles due to use of old C++ :

1. I don't have lambdas so you actually have to define a function body every time you want to run something as a future.

2. I can't induce the return type of a function, so you have to explicitly specify it when you call start_future.

3. I don't have variadic templates so I have to specifically make versions of start_future<> for 0 args, 1 arg, etc. bleh. (though variadic templates are so ugly that I might choose to do it this way anyway).

Otherwise not bad. (well, that is, the client usage is not bad; like most C++ the implementation is scary as shit; also doing this type of stuff in C++ is very heavy on the mallocs (because you have to convert things into different types, and the way you do that is by new'ing something of the desired type), if you are a sane and reasonable person that should not bother you, but I know a lot of people are in fact still bothered by mallocs).

In order to automatically depend on a previous future, you need to take its return value as one of your input arguments. There's also a method to manually add dependencies on things that aren't input args. Another option is to carry over a dependency through a binding function which depends on one type and returns another, but that kind of C++ is not to my liking. (**)

To really use this kind of system nicely, you should make functions whose return value is a compound type (eg. a struct) that contains all its effects. So, for example oodle_writefile returns the name of the file written, because it modifies that object; if you had a function that modified a game object, like say an Actor *, then its return value should include that Actor *, so that you can use that to set up dependency chains. (in real code, oodle_writefile should really return a struct containing the file name and also an error code).

* : example of returning futures to continue the async job :


float test_func_5_1(int x)
{
    Sleep(1);
    return x * (2.0/3.0);
}

future<float> test_func_5_2(int x)
{
    Sleep(1);

    if ( x == 1 )
    {
        // hey in this case I can return my value immediately
        return make_future(0.6f);
    }
    else
    {
        // I need to run another async job to compute my value
        x *= 3;
        return start_future<float>(test_func_5_1,x);
    }
}


then use as :


future<float> f = start_future<float>(test_func_5_2,7);

... do other work ...

float x = f.wait();

does what it does.

This is a necessary building block, it lets you compose operations, but it's an ugly way to write coroutine-style code.

What it is good for is creating more complex functions from simpler functions, like :


future<const char *> oodle_compress_then_writefile(const char *filename, OodleBufferRC rawBuf, OodleLZ_Compressor compressor, OodleLZ_CompressSelect select )
{
    OodleBufferRC compBuf = oodle_compress_sync( rawBuf, compressor, select );

    return start_future<const char*>( oodle_writefile, filename, compBuf );
}

I believe this "future" is much better than the real C++0x std::future, which seems to be missing a lot of features.

** : example of using a binding function to carry over dependencies :


// say I want to run two asyncs :

future<int> f1 = start_future<int>( func1 );

future<float> f2 = start_future<float>( func2 , 7.5f );

// but I want to run func2 after func1
//  due to some dependency that isn't through the return value

// what I can use is a return-to-arg adapter like :

template<typename t1,typename t2>
t1 return1(t1 a,t2 b)
{
    b;
    return a;
}

template<typename t1,typename t2>
future<t1> return1_after2(t1 a,future<t2> b)
{
    return start_future<t1>( return1<t1,t2>, a, b );
}


// then run :


future<int> f1 = start_future<int>( func1 );

future<float> f2 = start_future<float>( func2 , return1_after2(7.5f,f1) );

but like I said previously I hate that kind of crap for the most part. Much better is to use the explicit dependency mechanism, like :

future<int> f1 = start_future<int>( func1 );

future<float> f2 = make_future<float>( func2 , 7.5f );
f2->add_dep(f1);
f2->start();

There is one case where the funny binding mechanism can be used elegantly; that's when you can associate the binding with the actual reason for the dependency. That is, if we require func2 to run after func1, there must be some shared variable that is causing that ordering requirement. Using a binder to associate func1 with that shared variable is a clean way of saying "you can read this var after func1 is done".

7/15/2012

07-15-12 - VideoPlayer and Commonopoly

I wrote this little videoplayer for RAD a while ago. The main purpose of it is not for consumer-style use but for developer diagnosing of video compressors.

In particular, you can give it a list of several videos on the command line, and it plays all of them. It plays them frame-exact by stepping through their frames one by one. Then you can pause and single step forward and back, and you can do split-screen or frame-toggles.

(I've always found that comparing image or video compressors is very misleading if you just look at the output; you have to look at them side-by-side with the originals, or even better do full-frame ping-ponging. For example, x264 generally looks great, but when you do full-frame ping-pongs you can see that they really fuck up the overall luma level and completely change the colors quite a bit (this is largely due to the very shitty YUV space that is standard for video, not anything in x264)).

Anyhoo, videoplayer does some nice things like prefetch and cache lots of frames (it's retarded that the mainstream video players don't do this; I have gigs of ram you fuckers, prefetch some god damn video so that I can play over a network without hitching; also keep some previous frames around so I can pause and step backwards a little without seeking!).

download : videoplayer.zip (463k at cbloom.com)

(videoplayer needs radutil.dll which is in RAD Video Tools ; I put it on delay-load so you only need if you want to load a real video (not just images))

Anyhoo, I haven't touched it in a while cuz I'm off video for now. But "videoplayer" can also be pointed at a dir and it treats the images in the dir as frames of a video. I did this originally to load frame-dumps of videos that I couldn't play. For example I would use MPlayer to spit out frames with :

mplayer -benchmark -nosound -vo png:z=6 %1
md %1_frames
call mov *.png %1_frames\
and then point my videoplayer at the dir, and since it can prefetch and all that it can play a video of pngs (which load too slow to play without preloading a bunch).

But I realized the other day that I could use dir-loading to just show image slideshows too, if I use the option to manually set the frame rate to something really low like 0.1 fps. Which brings us around to the title of this post :

I discovered COMMONOPOLY a while ago; I think it's super beautiful ; the guy chooses images well and there's something about that horizontal reflection that really does some magic on the brain. If you full-screen it and stare at the middle and let your eyes defocus a bit, you can get the same image going in each eye and it's like mmm yeah good.

But obviously viewing it on the web sucks balls. So what you do is download all the images with DownloadThemAll, put them in a dir and then point videoplayer at the dir thusly :

videoplayerR.exe -f0.125 -w1 -q -0 -2 -4 -s1 -r t:\commonopoly
and then enjoy.

07-15-12 - libpng DLL Delay Load

cblib for a long time has used libpng , which creates a damn DLL dependency. Annoyingly, that happens at app startup, which means even if you are trying to load a .BMP it will fail to start the app if it can't find the PNG DLL (or, you know, using cblib on totally non-image related stuff). What I've wanted for a long time is to put the PNG DLL load into the PNG code so that it's only done if you actually try to load a PNG.

MS could have very easily put DLL load-on-first use into the thunk libs they make. That would have been nice. (ADDENDUM : I guess they did!)

Anyhoo, you can do it yourself, and it looks like this :

We need CALL_IMPORT from before :


template <typename t_func_type>
t_func_type GetImport( t_func_type * pFunc , const char * funcName, HMODULE hm )
{
    if ( *pFunc == 0 )
    {
        ASSERT( hm != 0 );
        t_func_type fp = (t_func_type) GetProcAddress( hm, funcName );
        // not optional :
        ASSERT_RELEASE_THROW( fp != 0 );
        *pFunc = fp;
    }
    return (*pFunc); 
}

#define CALL_IMPORT(name,hm) (*GetImport(&STRING_JOIN(fp_,name),STRINGIZE(name),hm))

and I'm going to just explicitly load the PNG DLL when I get to LoadPNG :

static HMODULE hm_png = 0;
#define HMODULE_FAILED  (HMODULE) 1

static bool my_png_init()
{
    if ( hm_png ) 
    {
        return ( hm_png == HMODULE_FAILED ) ? false : true;
    }
    
    HMODULE hm_zl = LoadLibrary(ZLIB_NAME);
    if ( hm_zl == 0 )
    {
        lprintf("Couldn't load Zlib (%s)\n",ZLIB_NAME);
        hm_png = HMODULE_FAILED;
        return false;
    }
    
    hm_png = LoadLibrary(PNG_NAME);
    if ( hm_png == 0 )
    {
        lprintf("Couldn't load PNG lib (%s)\n",PNG_NAME);
        hm_png = HMODULE_FAILED;
        return false;
    }
    
    lprintf_v2("Using libpng.\n");
                
    return true;
}

#define CALL_PNG(name)      CALL_IMPORT(name,hm_png)

so now we just have to replace all our png calls with CALL_PNG() calls, like :

    png_ptr = png_create_read_struct(PNG_LIBPNG_VER_STRING, NULL,NULL,NULL);

    ->

    png_ptr = CALL_PNG(png_create_read_struct)(PNG_LIBPNG_VER_STRING, NULL,NULL,NULL);

(obviously you could #define png_create_read_struct CALL_PNG(png_create_read_struct) to make it totally transparent)

Lastly we need to define all the fp_ func pointers with the right signature. This is particularly easy because png.h wraps its function externs with macros (nice, thanks guys), so we can just abuse those macros. png.h contains things like :


PNG_EXPORT(png_voidp,png_get_io_ptr) PNGARG((png_structp png_ptr));

So we can just do :

#define PNG_EXPORT(ret,func)    static ret (PNGAPI * STRING_JOIN(fp_,func))
#define PNGARG(args)            args = 0

and then the png proto is defining our fp for us. (unfortunately png.h sticks "extern" in front of all the protos, so you have to copy out the protos and take off the extern).

So anyhoo there you go, libpng use with delay-load.

BTW this also suggests a way that you can make your DLL very easy to use with delay-load and manual imports. What you should do is provide a header with only your function protos in it, and wrap them with a macro like :


DECLFUNC( ret, name , args );

eg.

DECLFUNC( png_voidp, png_get_io_ptr, (png_structp png_ptr) );

Then a client could just include that header multiple times and change DECLFUNC to various things. For example if you had a header like that, you can look up all the func names at LoadLibrary time, instead of doing each one on first use (this removes a branch from each function call site). eg :

#define DECLFUNC( ret, name , args )    ret (CALLBACK * STRING_JOIN(fp_,name)) args = 0
#include "allfuncs.inc"
#undef DECLFUNC

void ImportAllFuncs()
{
HMODULE hm = LoadLibrary;
#define DECLFUNC( ret, name , args )    STRING_JOIN(fp_,name) = ImportFunc(name,hm)
#include "allfuncs.inc"
#undef DECLFUNC
}

Unfortunately you can't do a #define inside a #define or this could be used to alias the names transparently with something like


#define DECLFUNC(ret,name,args) #define name (* STRING_JOIN(fp_,name) )
#include "allfuncs.inc"
#undef DECLFUNC

(ADDENDUM : or, you know, just use the /DELAYLOAD option)

07-15-12 - Internet Toggle

For the most part I'm not that prone to the time-wasting allure of the internet when I'm supposed to be working. But recently I've been writing the docs for Oodle and it's just excruciating boring work. Once in a while I have something important to say in the docs, but the vast majority is stuff like :

DOCGEN int Oodle_ComputeAPlusB(int A, int B);
/* Computes A + B

    $:A   the value of A in A+B
    $:B   the value of B in A+B
    $:return    the sum of and A and B

  Oh god please kill me now.

*/

Every time I wrote one of those docs I would involuntarily pop up my web browser and see if Kids on Crack had any updates. So I started turning off my internet access to break that habit.

I was using ipconfig /release and /renew to do it, but that has a few problems : 1. it's very slow (not a big deal since the toggle is rare), and 2. it also kills my VPN to work.

Jeff suggested a better way is to turn DNS off and on. I found netsh can do that pretty easily. I'd never used netsh before, it's pretty nice. The commands are :


dns_on.bat :

netsh interface ip set dns name="Wireless Network Connection" dhcp

dns_off.bat :

netsh interface ip set dns name="Wireless Network Connection" static 0.0.0.0
ipconfig /flushdns

(or, you know, slightly different as is appropriate for your net config).
(note the flushdns also, otherwise you'll have things like Google in cache).

For machines you want to still access in "internet off" mode, you can of course just use their explicit IP, or slightly nicer you can add them to your "hosts" file.

(aside : another cool feature of netsh is that you can save & restore your entire network config if you want to temporarily mess things up; just use "netsh dump" to save it and "netsh exec" to restore it).

07-15-12 - cbvector

I tried to use std::vector in one of the Oodle examples, and the freaking MSVC STL generates link dependencies by default. WTF. So I got annoyed and wrote my own single file standalone vector.

Caveats : this is ripped out of cblib (with dependencies removed), so it's rather ugly. This is not totally STL compatible. It works well enough for my purposes. Feel free to improve it.

Download : cbvector.h at cbloom.com

cbvector.h intentionally doesn't include anything. You may need to include stddef.h and/or new.h before cbvector.h.

WARNING : cbvector.h uses #defines to configure various aspects of the implementation. If you set those #defines to different things in different code files, then you must ensure that one of the following is true : 1. the CB_VECTOR class name is different in each case, or 2. the linkage of all funcs affected is "static", or 3. the entire cbvector is in an anonymous namespace (which is the default). If you're not careful about this you can get super bizarro bugs when the linker merges functions that aren't actually the same.

old rants