Ziglang - First thoughts!

Anders Lindqvist (breakin)

Contents

1 Introduction
2 Documentation & Package Management
3 WebAssembly
4 Compile Time (Reflection)
4.1 Example: Compile Time Reflection
4.2 Example: Detect hash collisions
5 Debugging
6 Closing words
7 Some References
8 Source code

Introduction

I've been searching for a new (or old!) language to use instead of C++. There have been many languages that have gotten a lot of hype and I've done a hello world program in a few of them. I then usually get stuck on building dependencies and understanding build systems instead of actually writing any code. Being on Windows never helps when trying to get dependencies to work! That being said I do get a glimpse of the languages before I sigh and go back to C++.

Today's language is ziglang. Where Rust tries to be a better C++, ziglang tries to be a better C. I feel that it has a lot of overlap with dlang but the language is smaller, both in terms of language constructs and standard library. I have a feeling that it has a lot of overlap with the Jai programming language too, but I haven't followed Jai that closely so I am not sure if that feeling is right or not!

Either way it has been really easy to get into language-wise and it has been very fun so far!

Documentation & Package Management

The hardest part has been documentation. The intro documentation is good but there is not enough of it. There are really well-written blog posts and GitHub repos out there, but they are often outdated since the language is currently moving quite fast. Without proper documentation it is hard to know how to manually migrate them.

Syntax and standard library have changed a lot and it does not seem to be slowing down, although future changes might mostly affect details.

I reported one bug for the Murmur hash in the standard library. It was fixed, but it was also suggested that I try another hash (probably good advice) and that maybe the Murmur hash should not be part of the standard library. It seems like the standard library will shrink a lot before 1.0, and code that is excluded will probably end up in community-managed packages instead.

As far as I understand it there is no official package manager yet so I am not sure what that will look like. There is, however, a good build system once the packages are out so that is very promising.

The ziglang community has been really helpful and I recommend reaching out if you get stuck. I had good luck with the Discord channel!

WebAssembly

I think one big reason that I'm having a lot of fun this time is that I am targeting WebAssembly instead of native. That means that I don't want all the C++ dependencies. I want a few C++ dependencies but most of them can probably be compiled to WebAssembly using emscripten.

ziglang is good at interacting with C dependencies (they can be included, sometimes source and all) and compiling them to WebAssembly. At least that is the pitch; I haven't tried that many yet and there are caveats. Another perk of sitting in the browser is that if there is something I need support for, I can probably lean on the browser, say to decode an image format I don't have a ziglang decoder for.

ziglang can target WebAssembly without requiring the emscripten toolchain. I have nothing bad to say about emscripten but it sure feels nice to have a smallish well-behaved binary (and standard library) that can compile directly into WebAssembly. No environment variables required! The "ish" in smallish is probably due to the LLVM library that is part of the compiler; it adds quite some space to the compiler binary! The ziglang binary can cross-compile to everything it supports from all operating systems, so that probably increases the size quite a bit too (having to include all the LLVM targets).
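As a rough sketch of what the Zig side of a WebAssembly module can look like: the file name and the `add` function below are hypothetical and not taken from my project. The only thing it relies on is the `export` keyword, which gives the function a symbol that the host (for example JavaScript in the browser) can look up and call.

```zig
// add.zig: a hypothetical module used only for illustration.
// `export` makes the function visible under the name "add" to whoever
// loads the compiled module.
export fn add(a: i32, b: i32) i32 {
    return a + b;
}
```

The WebAssembly target is selected with `-target wasm32-freestanding`. The remaining flags needed to build a module without an entry point have changed between Zig releases, so I leave them out here rather than guess for your version.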

Since I am working as a webdev I thought it would be neat if I could get the zig compiler via NPM (to get compiler and standard library versioned with my repo) but that was not possible right now.

Compile Time (Reflection)

One important feature I've been after in moving from C/C++ is proper compile time reflection. And better compile time interaction overall. This has always been something dlang has been good at. Ziglang delivers so far!

What I'm after is a good replacement of the C-preprocessor. I want something that is aware of types but that is not as complex as C++ templates in terms of readability and compile time.

In ziglang I've found that a lot of code seems to be fully usable at compile time and the compiler is helpful in telling you when you violate things. You can also check that a function really is running at compile time.
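A minimal sketch of the idea, assuming a reasonably recent Zig release (older compilers may need small changes). `sumTo` is a made-up function; nothing in its definition marks it as compile-time only, and the `comptime` keyword at the call site is what forces the compiler to evaluate it.

```zig
const std = @import("std");

// An ordinary function with no compile-time annotations.
fn sumTo(n: u32) u32 {
    var total: u32 = 0;
    var i: u32 = 1;
    while (i <= n) : (i += 1) {
        total += i;
    }
    return total;
}

pub fn main() void {
    // `comptime` forces the call to be evaluated by the compiler. If sumTo
    // did something that cannot be done at compile time, this line would be
    // a compile error instead of quietly running at runtime.
    const at_compile_time = comptime sumTo(10);

    // An array length has to be known at compile time, so being able to
    // use the value here is a second check that it exists during compilation.
    const Buffer = [at_compile_time]u8;

    std.debug.print("{d} {d}\n", .{ at_compile_time, @sizeOf(Buffer) });
}
```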

A better introduction to ziglang compile time can be found here!

I have not found a way to inspect the code generated at compile time, but it is possible to log during compile time to a special log, so you could verify that good things happen that way.

Example: Compile Time Reflection

I'm looking to build a material system for a WebGL2 renderer.

First I have a struct

pub const MaterialStateTexture = struct {
    texture_id: c_uint = 0
};

Here we can see that types (such as structs) are first class in ziglang. The type is stored in the variable MaterialStateTexture.
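To make the "types are values" point a bit more concrete, here is a small sketch. `Pair` is a hypothetical helper, not something from my renderer: it is a plain function that takes a type and returns a new struct type, and the result is stored in a constant just like `MaterialStateTexture` above.

```zig
const std = @import("std");

// A function from a type to a type. The returned struct is created at
// compile time; there is no separate template syntax involved.
fn Pair(comptime T: type) type {
    return struct {
        first: T,
        second: T,
    };
}

pub fn main() void {
    // The type is stored in a constant and then used like any other type.
    const IntPair = Pair(i32);
    const p = IntPair{ .first = 1, .second = 2 };
    std.debug.print("{d}\n", .{p.first + p.second});
}
```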

Now let us define a struct holding some properties

pub const PBRMaterialState = struct {
    albedo_texture: MaterialStateTexture = MaterialStateTexture {},
    emissive_texture: MaterialStateTexture = MaterialStateTexture {},
    roughness: f32 = 0.0,

    ...

Now let us say that we want to create a function that calculates a hash key based on the content of the struct. Note that while I use the Murmur hash here I don't recommend it. It is only to show what is possible; do research on better hash functions if you need them!

We add a member function to our struct

    pub fn calculate_hash(self: PBRMaterialState) u32 {
        const hasher = hash.Murmur2_32;
        var hv : u32 = 0;
        const t = @typeInfo(PBRMaterialState);
        inline for (t.Struct.fields) |value| {
            if (value.field_type == MaterialStateTexture) {
                hv = hasher.hashUint32WithSeed(
                    @field(self, value.name).texture_id,
                    hv
                );
            }
        }
        return hv;
    }

There is a lot to unpack here. The function is executed at runtime, but the inline for is evaluated at compile time such that at runtime it is no longer a loop. The function @typeInfo can only work at compile time when we have type information. The same goes for the expression inline for. Since value is compile-time known the if-statement will also be evaluated at compile time.

I imagine that what is left looks like this when preprocessed

    pub fn calculate_hash_preprocessed(self: PBRMaterialState) u32 {
        const hasher = hash.Murmur2_32;
        var hv : u32 = 0;
        hv = hasher.hashUint32WithSeed(self.albedo_texture.texture_id, hv);
        hv = hasher.hashUint32WithSeed(self.emissive_texture.texture_id, hv);
        return hv;
    }

Now all we have to do is to finish the struct

};

Now in our program we can do

const ms = PBRMaterialState {};
const hash_value = ms.calculate_hash();
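Putting the fragments together, a self-contained sketch could look like the following. It is an assumption-laden variant rather than my exact source: it uses `std.meta.fields` and `@TypeOf` to get at the fields, because the spelling of the `@typeInfo` result has changed between Zig releases, and `calculate_hash_unrolled` is the hand-written version I imagined above, included only so the two can be compared.

```zig
const std = @import("std");
const hash = std.hash;

pub const MaterialStateTexture = struct {
    texture_id: c_uint = 0,
};

pub const PBRMaterialState = struct {
    albedo_texture: MaterialStateTexture = MaterialStateTexture{},
    emissive_texture: MaterialStateTexture = MaterialStateTexture{},
    roughness: f32 = 0.0,

    // Reflective version: the loop is unrolled by the compiler and the
    // type check is resolved per field, so only texture fields are hashed.
    pub fn calculate_hash(self: PBRMaterialState) u32 {
        var hv: u32 = 0;
        inline for (std.meta.fields(PBRMaterialState)) |field| {
            if (@TypeOf(@field(self, field.name)) == MaterialStateTexture) {
                hv = hash.Murmur2_32.hashUint32WithSeed(
                    @field(self, field.name).texture_id,
                    hv,
                );
            }
        }
        return hv;
    }

    // Hand-unrolled version, written out for comparison.
    pub fn calculate_hash_unrolled(self: PBRMaterialState) u32 {
        var hv: u32 = 0;
        hv = hash.Murmur2_32.hashUint32WithSeed(self.albedo_texture.texture_id, hv);
        hv = hash.Murmur2_32.hashUint32WithSeed(self.emissive_texture.texture_id, hv);
        return hv;
    }
};

pub fn main() void {
    const ms = PBRMaterialState{
        .albedo_texture = MaterialStateTexture{ .texture_id = 3 },
        .emissive_texture = MaterialStateTexture{ .texture_id = 7 },
    };
    // Both versions should print the same number.
    std.debug.print("reflective: {d}, unrolled: {d}\n", .{
        ms.calculate_hash(),
        ms.calculate_hash_unrolled(),
    });
}
```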

If we wanted to, we could have made the calculate_hash function a generic function, taking any type. The signature would then be

pub fn calculate_hash(our_struct: anytype) u64 {

This allows us to do

const ms = PBRMaterialState {};
const hash_value = calculate_hash(ms);

If we need the type of our_struct inside calculate_hash we can get it using @TypeOf(our_struct).
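A sketch of what a body for the generic version might look like, under a few assumptions of mine: it returns `u32` because that is what the Murmur functions used earlier return, it only hashes fields whose type is exactly `u32`, and `Decal` and `Sprite` are made-up structs that exist purely to show the same function accepting two different types.

```zig
const std = @import("std");

// Hashes every u32 field of whatever struct is passed in.
pub fn calculate_hash(our_struct: anytype) u32 {
    // The concrete type is recovered from the argument at compile time.
    const T = @TypeOf(our_struct);
    var hv: u32 = 0;
    inline for (std.meta.fields(T)) |field| {
        if (@TypeOf(@field(our_struct, field.name)) == u32) {
            hv = std.hash.Murmur2_32.hashUint32WithSeed(@field(our_struct, field.name), hv);
        }
    }
    return hv;
}

// Hypothetical structs for the example.
const Decal = struct { texture_id: u32 = 1, opacity: f32 = 1.0 };
const Sprite = struct { texture_id: u32 = 1, frame: u32 = 0 };

pub fn main() void {
    // One function, instantiated once per struct type it is called with.
    std.debug.print("{d} {d}\n", .{ calculate_hash(Decal{}), calculate_hash(Sprite{}) });
}
```

Passing something that is not a struct should fail at compile time rather than at runtime, since the field lookup has nothing to iterate over.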

Example: Detect hash collisions

This one kinda failed but it is still interesting to think about. A common way to store strings is to use hashing (and hope that there is no collision) or perfect hashing (when all strings are known in advance). ziglang can easily make a hash function for a string that returns a number, and if it is called many times with the same argument it will even remember the returned value and only run once (see Memoization).

First. Writing a hasher is trivial. Here is one for numbers:

fn fn_hash(comptime T : u32) u32 {
    comptime {
        const hasher = hash.Murmur2_32;
        var a : u32 = 0;
        a = hasher.hashUint32WithSeed(T, a);
        return a;
    }
}

The same approach works for strings too since they are known at compile time, but you have to change the parameter type and the hash call.
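Here is a sketch of the string variant. Note that I have swapped the hash: instead of Murmur this is a hand-written 32-bit FNV-1a, chosen only because it is a few lines of plain integer arithmetic that the compiler has no trouble evaluating. `hashString` is a name I made up for the example.

```zig
const std = @import("std");

// 32-bit FNV-1a (not the Murmur hash used in the number example).
fn hashString(s: []const u8) u32 {
    var h: u32 = 2166136261; // FNV offset basis
    for (s) |byte| {
        h ^= @as(u32, byte);
        h *%= 16777619; // FNV prime, with wrapping multiplication
    }
    return h;
}

pub fn main() void {
    // The string literal is known at compile time, so the whole call can be
    // evaluated by the compiler and only the resulting number remains.
    const albedo_id = comptime hashString("albedo");
    std.debug.print("{x}\n", .{albedo_id});
}
```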

But what about collisions? I have not been able to write something that checks whether there was a collision. It is currently possible to create a closure inside the compile-time function that remembers things between invocations (with unspecified order). Using that, an array of hashes and the first string to produce each of them can be built up. Every time a new string is hashed it can be checked that it is the same string that produced that hash the first time. But rumor on the Discord channel was that these closures would probably go away soon, and without them I don't know if it is solvable.

It is also not possible to do lazy compile time values, which makes sense since they can influence compilation. I am still discovering what makes sense or not and I don't know if this is a real problem or not.

I think my approach now will be to specify all strings up-front, but let each file contribute a list of strings such that it can be somewhat modular. It would be nice to be able to detect unused strings somehow though. Let's see what I come up with.
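The up-front approach could be sketched like this. Everything here is an assumption about how I might structure it, not finished code: the string list is hypothetical, `hasCollision` is a made-up helper, and the hash is `std.hash.Fnv1a_32` from the standard library rather than Murmur. The check is a brute-force comparison of all pairs, which is fine for a short list but would need a raised evaluation quota (or a smarter algorithm) for a long one.

```zig
const std = @import("std");

// All strings registered in one place. The names are hypothetical.
const material_strings = [_][]const u8{ "albedo", "emissive", "roughness" };

// Returns true if any two entries produce the same hash. A string that is
// listed twice is reported as a collision as well.
fn hasCollision(list: []const []const u8) bool {
    var i: usize = 0;
    while (i < list.len) : (i += 1) {
        var j: usize = i + 1;
        while (j < list.len) : (j += 1) {
            if (std.hash.Fnv1a_32.hash(list[i]) == std.hash.Fnv1a_32.hash(list[j])) {
                return true;
            }
        }
    }
    return false;
}

// Evaluated during compilation: a collision stops the build.
comptime {
    if (hasCollision(&material_strings)) {
        @compileError("two registered strings share a hash value");
    }
}

pub fn main() void {
    std.debug.print("{d} strings checked\n", .{material_strings.len});
}
```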

Debugging

I was able to load my binary in Visual Studio Community and place breakpoints. I could step around in my code and I could inspect numbers and structs on the stack.

Figure: Debugging of numbers and structs

It is possible to step into runtime functions that have comptime elements in them but it feels a bit weird. I can't inspect compile time information and I also can't see the generated code. If I enable ASM I do see that it is working like I want it to.

Figure: Debugging of comptime

There is a compile time log that you can use while designing compile time algorithms, where the code can let you know what it is up to. It is also sometimes possible to make sure that things are happening at compile time by storing results in constant variables.
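A small sketch of both techniques, with made-up names (`square`, `table_size`). The constant at file scope has to be evaluated by the compiler, and the `comptime` block checks its value during the build. The `@compileLog` call is left commented out because a build that still contains one is reported as an error; it is meant to be used temporarily while developing.

```zig
const std = @import("std");

fn square(x: u32) u32 {
    return x * x;
}

// A constant at file scope is evaluated by the compiler, so storing the
// result here guarantees that square(8) ran at compile time.
const table_size = square(8);

comptime {
    // Fails the build if the compile-time result is not what we expect.
    std.debug.assert(table_size == 64);

    // Uncomment while developing to print the value during compilation.
    // @compileLog("table_size =", table_size);
}

pub fn main() void {
    std.debug.print("{d}\n", .{table_size});
}
```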

Closing words

I have no idea if I will actually use ziglang or not but I will at least keep an eye on it. It feels like it could be a better C and might actually be exactly what I want for a lot of cases. It makes sense for anything embedded as well as WebAssembly.

Things I am excited about:

Compile time reflection
Wasm native!
Much talk about doing super fast incremental builds. I love the focus on build time/performance.
Looks like a good contender for the better-C that I've been looking for.

Some things I want to learn more about:

No inheritance and no virtual functions, how to structure programs? Can I at least inherit structs?
Will I miss not being able to see compile time generated code? Is it possible?
How does it compare to emscripten when targeting wasm? Both performance wise and size wise.
Is it mature enough now? If not, when will it be?
What is the story on concurrency and SIMD? The language seems to talk about and partially support both, but I haven't looked that closely at it yet since my main focus has been single-threaded WebAssembly for now.

I could see myself writing a lot of different types of programs in it. In the end I might end up in Rust instead, but Rust just doesn't feel fun, and fun is a big factor for smaller projects, especially hobby projects. But I have a feeling that Rust maybe becomes fun once you've given in to it :) Either way I recommend testing zig out!

Some References

Here are some more references I found while playing around:

Small ziglang+wasm+Webgl+node tests I did - zig-webgl-node-test
- Very similar to zig-wasm-webgl but with node
Ziglang Documentation
Ziglang Standard library (caveat)
String Matching based on Compile Time Perfect Hashing in Zig

Source code

Source code for my comptime example can be found here. Build using zig build-exe ziglang-1.zig.