ShellMonk

Monk Coding Challenge 2: JSON parser in Raku

Writing a JSON parser with Perl's younger brother - Raku

Intro

Hello again, dear reader. Welcome to the second post of the Monk Coding Challenges series, where we write a JSON parser using one of the most fascinating languages I have ever seen - Raku. If you want to skip the text and go straight to the conclusion, jump to the final verdict.

Raku and me

Oh boy. Playing with Raku is a hell of a lot of fun. This weird language is so densely packed with features that even a 5,000-word text would not scratch the surface. And I’m not kidding!

I first stumbled upon Raku while watching ThePrimeagen’s video, where he went through the “Raku: A Language for Gremlins” text by Hillel Wayne, the author of the book “Practical TLA+: Planning Driven Development” (great text and an excellent book, btw). Raku is the rebranded Perl 6, so if you have Perl experience, Raku’s syntax will look familiar.

There are many cute things about Raku. For instance, Raku’s logo, Camelia, was designed by Larry Wall himself. If you look at official websites, docs, and so on, you won’t find anything fancy. They all look like early ’00s web pages designed in Dreamweaver by computer scientists. And this is how the language feels - a brainchild of outstanding engineers who are in love with their creation.

Fun bits

Rational numbers

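One of the unexpected features of Raku is how it handles rational numbers. If you write a decimal literal such as 0.1, it is stored internally as a rational number (a Rat) rather than a binary floating-point one, so the classic 0.1 + 0.2 rounding error never shows up. How cool is that!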

$ raku -e 'say 0.1 + 0.2 - 0.3'
0
$ raku -e 'say 1/10 + 2/10 - 3/10'
0
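
To make the difference visible, here is a small sketch that compares decimal literals (Rat) with literals written in exponent notation (Num, which is ordinary floating point). The variable names are mine, picked just for illustration.

```raku
# Decimal literals are Rat: an exact numerator/denominator pair.
my $sum = 0.1 + 0.2;
say $sum.^name;            # Rat
say $sum.nude;             # (3 10), i.e. numerator 3, denominator 10
say $sum == 0.3;           # True

# Literals with an exponent are Num: IEEE 754 doubles.
my $float-sum = 0.1e0 + 0.2e0;
say $float-sum.^name;      # Num
say $float-sum == 0.3e0;   # False, the usual floating-point rounding
```

So you still get plain floating point when you ask for it; you just have to ask.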

Operators

Holy mother of God. I feel like there are more operators in Raku alone than in all other languages combined. Don’t believe me? Look at this:

$ raku -e 'say 2 (elem) (1, 2, 3)' # Wanna check if element is part of the set
True 
$ raku -e 'say 4 ∈ (1, 2, 3)' # How about using some Unicode?
False
$ raku -e 'say [+] <1 2 3 4 5>' # Wanna infix reducer?
15
$ raku -e 'say [\+] <1 2 3 4 5>' # How about some accumulators?
(1 3 6 10 15)

You can even define your own! Prefix, infix, postfix, circumfix, or postcircumfix, it’s your choice:

$ raku -e 'sub postfix:<♥>( $a ) { say „I love $a!“ }; 42♥;'
I love 42!
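
The same trick works for the other operator kinds. As a minimal sketch, here is an infix operator; the name avg is something I made up for this example, not a built-in.

```raku
# Hypothetical word-like infix operator that averages its two operands.
sub infix:<avg>($a, $b) { ($a + $b) / 2 }

say 4 avg 10;   # 7
say 1 avg 2;    # 1.5
```

Since no precedence trait is given, the new operator gets a default precedence; traits such as "is tighter" or "is looser" let you place it relative to existing operators.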

The list goes on and on, and I want to keep this short, so look for yourself; it’s fun!

Solving the challenge

The reason I chose Raku for this particular challenge is one very peculiar language construct it offers: “grammars.” It’s an extremely useful feature that very few languages build in. You can define custom grammars as part of the language, which makes writing all kinds of parsers a breeze. (You can even extend Raku itself, but this is too advanced for this challenge, and I’m not smart enough to grasp it quickly.)

The usual way to write a parser is to start with a tokenizer or a lexer, generate a stream of tokens, and plug that into some kind of parser or interpreter. We’re not doing this today. I chose Raku for its “grammars” and I’m dying to try them.

Now, doing this challenge using grammars isn’t doing the language any favors, as it explores only one exotic feature. Hopefully, this will be enough for enthusiastic readers to pursue further learning.

Basic program shell

One of the excellent Raku features is how it handles command-line input. If you define a “MAIN” function, its parameters are automatically turned into command-line arguments: positional parameters become positional arguments, and named parameters become flags.

# read the file name from the command line if present;
# if not, default to tests/example.json
# (forward slashes also work on Windows)
sub MAIN(Str $file where *.IO.f = 'tests/example.json') {
    my $contents = $file.IO.slurp;

    say $contents;

    # ...
}
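
To show how a named parameter turns into a flag, here is a small sketch of a variation on the shell above. The --verbose flag and the script name parse.raku are my own assumptions for illustration; they are not part of the challenge code.

```raku
# Hypothetical variation: one positional argument plus one named flag.
sub MAIN(
    Str $file = 'tests/example.json',   # positional argument with a default
    Bool :$verbose = False              # named parameter, exposed as --verbose
) {
    note "Reading $file" if $verbose;   # note() writes to standard error
    say $file.IO.e ?? $file.IO.slurp !! "No such file: $file";
}
```

You would call it as "raku parse.raku --verbose data.json". By default the named arguments have to come before the positional ones, and if the arguments do not fit the signature, Raku prints a usage message generated from it.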

Defining a grammar

Now, for the juicy part, the parser itself. It’s funny how easy it is to write parsers with grammars once you understand them (and regex, of course).

(Note: this is not a fully JSON-compliant grammar definition, but the proper one isn’t much more complicated; take a look at JSON::Tiny.)

grammar MCCJSON
{
    # first we match start and end
    # and ignore leading and trailing whitespaces
    token TOP { ^ \s* <object> \s* $ }
    
    # next we define an object as optional pair list
    # between curly brackets 
    token object { '{' \s* <pairlist>? \s* '}' }

    # pair list is one or more pairs, separated by commas,
    # with optional whitespace around each comma
    token pairlist { <pair> \s* [',' \s* <pair> \s*]* }

    # pair is, unsurprisingly, "key : value"
    token pair { <string> \s* ':' \s* <value> }

    # next, we need an array, which is an optional list of values
    # between square brackets
    token array { '[' \s* <valuelist>? \s* ']' }

    # value list is one or more values, separated by commas
    token valuelist { <value> \s* [',' \s* <value> \s*]* }

    # value can be string, number, object, array
    # or special cases: true, false or null
    token value {
        | <string>
        | <number>
        | <object>
        | <array>
        | 'true' | 'false' | 'null'
    }

    # string regex is a bit more complicated to describe here
    # but https://docs.raku.org/language/regexes is our friend
    token string { '"' ( <-[\\"]>+  | '\\' . )* '"' }

    # matching numbers, including negative and ones with floating point
    token number { '-'? \d+ [ \. \d+ ]? }
}
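
As a side note, the hand-rolled comma handling in pairlist and valuelist can also be written with the % modifier on a quantifier, which means "separated by". Here is a minimal sketch on a deliberately tiny grammar; NumberList is a made-up name, and it only handles flat arrays of numbers.

```raku
grammar NumberList {
    # <number>* % [...] reads: zero or more numbers, separated by
    # a comma with optional whitespace on either side
    token TOP    { ^ \s* '[' \s* <number>* % [ \s* ',' \s* ] \s* ']' \s* $ }
    token number { '-'? \d+ [ '.' \d+ ]? }
}

for '[1, 2 ,3.5]', '[]', '[1,]', '[1,,2]' -> $input {
    my $match = NumberList.parse($input);
    say "$input => ", $match ?? $match<number>».Str !! 'no match';
}
```

The first two inputs parse, while the trailing comma and the double comma are rejected, because % only allows the separator between two items.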

Using the grammar

Grammars are internally represented as classes, so using them is pretty simple:

my $parsed = MCCJSON.parse($contents);

if $parsed {
    say "JSON is valid!";
    # This can be done nicely with grammar actions
    # but we'll leave that for another version
    say $parsed.raku;
} else {
    say "Invalid JSON!";
}

Note: grammar action objects in Raku are a very nice tool for building an API for accessing the parsed text.
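
To give a taste of what actions look like, here is a minimal sketch on a toy grammar rather than the full MCCJSON one. KeyValue and KeyValueActions are hypothetical names. Each action method has the same name as a token and uses make to attach a value to the match, which you later read back with made.

```raku
grammar KeyValue {
    token TOP   { ^ <key> \s* ':' \s* <value> $ }
    token key   { \w+ }
    token value { \d+ }
}

class KeyValueActions {
    method key($/)   { make ~$/ }   # keep the key as a string
    method value($/) { make +$/ }   # turn the digits into a number
    method TOP($/)   { make ($<key>.made => $<value>.made) }
}

my $pair = KeyValue.parse('answer: 42', actions => KeyValueActions).made;
say $pair;             # answer => 42
say $pair.value + 1;   # 43
```

A JSON version would follow the same pattern, with the object and array actions building hashes and arrays out of their children.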

Final verdict

Perl hasn’t gotten much love in recent years, although old engineers refer to it as “duct tape that holds the Internet together,” and the rebranding as Raku feels like an honest attempt to revive Perl’s old glory. Whether this attempt succeeds, time will tell.

Pros

Raku is one of those languages that make you feel like you’re not smart enough, but in a good way. There are not many programming language constructs found elsewhere that are not present in some shape or form in Raku. I’d say that developers who have mastered Raku can write pretty much any piece of software they can imagine. It’s really that powerful.

Cons

However, if you’re not familiar with Perl, Raku will feel like an alien language, making the learning curve damn steep. There are so many constructs not present in popular languages today that learning Raku feels like a lobotomy. Documentation is poor right now, the interpreter is slow, and you can make a case that the rebranding from Perl is only halfway done.

When and where to use it

  • You want a Swiss Army knife of a language, and you have time and brain power to invest in learning it
  • You are in an environment where you need to do a lot of duct-taping, writing scripts, parsers, or any text-processing programs
  • You love exploring exotic concepts in computing and need a language that will make your brain melt by opening new horizons
  • You want to improve your Perl codebase
  • You have a long white beard and love regex

Raku in 2024 and beyond

When it comes to Raku’s future, to be perfectly honest, I have no predictions. There’s a pretty big and enthusiastic community behind Raku, and people who use it really love it. And I can see why. Although Perl keeps dropping in popularity, I will keep an eye on Raku, if nothing else, to see what the intelligent alien engineers behind it will come up with next. However, the world, and especially the software development world, is moving fast, and competition between languages is getting hotter. I truly hope Raku finds its niche and survives, but I would not bet my future on it.

Outro

This was a hell of a ride, and as usual, if you came this far, thank you for reading. See you in the next challenge.