April 2019 progress report

This has been a productive and eventful month for Aki development. Read on:

Added

  • Pointer types. These are not useful for anything yet, but the language does have them implemented in some form. Ideally, you should never have to touch them in the course of most daily work.
  • Function pointers. Each function pointer also has a discrete named type associated with it. Again, this is a low-level implementation detail. Most of us should never have to use it.
  • ref/deref. These new commands create pointer references from existing variables, and dereference them. I decided to use these instead of symbols like * or &, which I've always found noisy and ugly. Nothing says this can't change, though; it's just a preference I find attractive right now.
  • Rudimentary string type. We can't do much of anything with strings yet, but they're there.
  • Rudimentary caching of compiled code. From what I can tell, this speeds up the loading and parsing of previously compiled code by about a factor of two. Also, any cached code generated with an earlier build of the compiler will be discarded and recompiled automatically.
  • A basic object format. Objects are things like strings or arrays -- non-scalars. They each have a header with some descriptive information (such as the length of the object), and a pointer to the object itself.
  • Unsafe mode. Some operations, like casts, are considered unsafe and will only be allowed inside an unsafe block. I'm considering some further refinements of the way unsafe behaves -- for instance, having the compiler complain (with a warning, not a full-blown error) if you try to pack too much into an unsafe block -- but I'm not committing to anything there yet.
  • Inline type definitions for constants. If you want to specify that a given constant is a u64, you can just type 256:u64. It's consistent with the other ways types are defined, and it's easy to read.
  • Control characters and Unicode escape sequences in strings. The syntax is essentially stolen as-is from Python: \n for newline, \xhh for a character given by two hex digits, and \u/\U for 16- and 32-bit Unicode escapes, respectively.
  • Array types. You can only make arrays out of scalars for now -- no pointers or other objects -- but in time this ought to change.
  • The Conway's Game Of Life demo is now live. The sample programs bundled with the Aki compiler now include a simple console-based version of Conway's Game Of Life. This is a major milestone, actually, since it means we have enough of the language together to create programs that are at least trivially useful.
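Since the string escape syntax listed above is borrowed from Python, the intended behavior can be checked against Python itself. This is a minimal sketch of Python's handling only; it assumes Aki follows Python here and says nothing about how Aki implements the escapes. The helper code_points is a made-up name for illustration.

```python
# Python's own escape handling, which the Aki string syntax is modeled on.
# Whether Aki matches every detail is an assumption, not something verified.

newline = "a\nb"        # \n is a single line-feed character (code point 10)
hex_escape = "\x41"     # \xhh: two hex digits, here 0x41, the letter "A"
short_escape = "\u00e9" # \u: exactly four hex digits (16-bit form)
long_escape = "\U0001F600"  # \U: exactly eight hex digits (32-bit form)


def code_points(text):
    """Return the list of integer code points in a string (illustrative helper)."""
    return [ord(ch) for ch in text]


print(code_points(newline))
print(code_points(hex_escape))
print(code_points(short_escape))
print(code_points(long_escape))
```

Each escape collapses to one character in the resulting string, no matter how many characters it takes to write in source.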

Changed

  • Lexer and parser logic was reworked. I unsnarled a number of subtle bugs this way.
  • Type descriptors now use an AST. They used to be stored as strings, which was fine when we only had scalar types and nothing else. Now, with pointers, this approach no longer works. Variable types are essentially little grammars of their own, so we might as well make them into full-blown ASTs. How else would we represent something like :ptr func(:ptr i32, :i32):u64?
  • The entire way types are instantiated for a module has been reworked. This allows us to have variable types that are directly linked to the LLVM target, so there's no guesswork about pointer or byte sizes. (It's assumed bytes are 8 bits on all platforms, but pointer sizes are always derived from the target machine.)
  • A major overhaul of how Aki-specific type information is propagated with LLVM types. Every Aki type is backed by an LLVM type, but the two aren't necessarily the same thing. (An i1 could be a one-bit integer, or it could be a bool value. No way to tell in LLVM alone.) My original way of associating Aki-specific information with LLVM types was incredibly clumsy. I tore it out and completely reworked it.
  • The way object types are extracted for the REPL has been revamped. Originally, we had a custom function for each Aki type that extracted the data for the sake of returning a C-friendly value in the REPL. This proved cumbersome, so I reworked it to simply use the existing c_data function by way of a custom code generation routine.
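To make the point about type descriptors concrete, here is a minimal sketch of how a descriptor like :ptr func(:ptr i32, :i32):u64 might be held as a small tree instead of a string. The class names (Scalar, Pointer, Function) and the render function are hypothetical, invented for this illustration; they are not Aki's actual AST classes.

```python
from dataclasses import dataclass
from typing import Tuple, Union


# Hypothetical node classes; Aki's real AST is not shown in this report.
@dataclass(frozen=True)
class Scalar:
    name: str  # e.g. "i32" or "u64"


@dataclass(frozen=True)
class Pointer:
    target: "TypeNode"  # the type being pointed to


@dataclass(frozen=True)
class Function:
    params: Tuple["TypeNode", ...]  # argument types, in order
    returns: "TypeNode"             # return type


TypeNode = Union[Scalar, Pointer, Function]


def render(node: TypeNode) -> str:
    """Turn a type tree back into descriptor text, without the leading colon."""
    if isinstance(node, Scalar):
        return node.name
    if isinstance(node, Pointer):
        return "ptr " + render(node.target)
    if isinstance(node, Function):
        args = ", ".join(":" + render(p) for p in node.params)
        return f"func({args}):{render(node.returns)}"
    raise TypeError(f"unknown type node: {node!r}")


# A pointer to a function taking (pointer to i32, i32) and returning u64.
descriptor = Pointer(
    Function(
        params=(Pointer(Scalar("i32")), Scalar("i32")),
        returns=Scalar("u64"),
    )
)

text = ":" + render(descriptor)
print(text)
```

With the nesting stored explicitly, questions like "what does this pointer point to?" become attribute lookups on the tree, where a string representation would need re-parsing each time.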

Needs fixing

  • We still don't have a way to create compile-time constants. That's coming shortly.

Still missing

A lot.

  • Constants of any kind.
  • Any documentation.
  • Pointer operations (we can't do pointer math yet, for instance, or even ref/deref pointers).
  • String operations.
  • Heap allocations.
  • Error trapping of any kind, such as integer rollovers (the best place to start with that, really).
  • Any form of memory management.