Arnaud/KDL: a cuddly Document Language

Created Wed, 20 Oct 2021 22:08:28 +0200

“A cuddly Document Language” - aka KDL

A few months ago, in mid-September, I came across a new document language. I think it was thanks to this tweet, retweeted by one of the accounts I follow:

As I’m kinda curious, I quickly read a bit more about it.

On the website and in the GitHub repo, it is defined like this:

KDL is a document language with xml-like semantics that looks like you’re invoking a bunch of CLI commands! It’s meant to be used both as a serialization format and a configuration language, much like JSON, YAML, or XML.

and

KDL is a node-oriented document language. Its niche and purpose overlaps with XML, and as do many of its semantics. You can use KDL both as a configuration language, and a data exchange or storage format, if you so choose.

Then I wondered what this language was, what its purpose is, and how it differs from all the existing ones.

The purpose of this post is to share some thoughts and opinions about it.

Why a new document language?


Well, it seems we are not the first ones to ask this, as you can read in the FAQ:

  • Why yet another document language?
  • Ok, then, why not SDLang?
  • What about YAML?
  • What about JSON?
  • What about TOML?
  • What about XML?

I particularly appreciated the “Have you seen that one XKCD comic about standards?” 😆

Spoiler: scroll down to see the mentioned comic strip 😄

I’ll keep my opinion on this until the conclusion of this post.

In the meantime, I encourage you to read the FAQ, where the author explains her thoughts about the other languages quite clearly.

A language means specifications

The KDL specification can be found on the project’s GitHub site.

I read it… but I have to admit, I was not really comfortable when I reached the grammar description:

nodes := linespace* (node nodes?)? linespace*

node := ('/-' node-space*)? type? identifier (node-space+ node-prop-or-arg)* (node-space* node-children ws*)? node-space* node-terminator
node-prop-or-arg := ('/-' node-space*)? (prop | value)
node-children := ('/-' node-space*)? '{' nodes '}'
node-space := ws* escline ws* | ws+
node-terminator := single-line-comment | newline | ';' | eof

identifier := string | bare-identifier
bare-identifier := ((identifier-char - digit - sign) identifier-char* | sign ((identifier-char - digit) identifier-char*)?) - keyword
identifier-char := unicode - linespace - [\/(){}<>;[]=,"]
keyword := boolean | 'null'
prop := identifier '=' value
value := type? (string | number | keyword)
type := '(' identifier ')'
...
...

A little bit too detailed for me. As I did not want to develop my own KDL parser/interpreter, I skipped this part 😄!

However, the description of the language is clear and well written, ready for implementers.

That is how I discovered a huge number of possible whitespace characters 😨:

Name                        Code point
Character Tabulation        U+0009
Space                       U+0020
No-Break Space              U+00A0
Ogham Space Mark            U+1680
En Quad                     U+2000
Em Quad                     U+2001
En Space                    U+2002
Em Space                    U+2003
Three-Per-Em Space          U+2004
Four-Per-Em Space           U+2005
Six-Per-Em Space            U+2006
Figure Space                U+2007
Punctuation Space           U+2008
Thin Space                  U+2009
Hair Space                  U+200A
Narrow No-Break Space       U+202F
Medium Mathematical Space   U+205F
Ideographic Space           U+3000

Source
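To make that list a bit more concrete, here is a minimal Python sketch of what a tokenizer has to care about. It is only an illustration: the names `KDL_WHITESPACE`, `is_kdl_whitespace` and `split_on_kdl_whitespace` are hypothetical, the code points are simply copied from the table above, and a real KDL parser also has to deal with newlines, comments and quoted strings, which this sketch ignores.

```python
# Code points copied from the whitespace table above (18 entries).
KDL_WHITESPACE = {
    0x0009, 0x0020, 0x00A0, 0x1680,
    0x2000, 0x2001, 0x2002, 0x2003, 0x2004, 0x2005,
    0x2006, 0x2007, 0x2008, 0x2009, 0x200A,
    0x202F, 0x205F, 0x3000,
}


def is_kdl_whitespace(ch: str) -> bool:
    """Return True if the single character `ch` is listed in the table."""
    return len(ch) == 1 and ord(ch) in KDL_WHITESPACE


def split_on_kdl_whitespace(text: str) -> list:
    """Split `text` on runs of the whitespace characters listed above.

    Assumption: no quoted strings, comments or newlines are handled here.
    """
    tokens, current = [], []
    for ch in text:
        if is_kdl_whitespace(ch):
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


# A no-break space and an em space separate tokens just like a plain space.
print(split_on_kdl_whitespace("node\u00a0arg1\u2003arg2"))
```

The point is simply that a plain `text.split(" ")` would not be enough: several of these characters look identical to a regular space in an editor.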

What does a KDL file look like?

To try this, I’ll take an example of mine (not one from the GitHub repo… otherwise, you wouldn’t need me 😉).

So, I started with a yaml file.

Fun story

The first yaml specification is 20 years old (released in May 2001). At that time, yaml stood for Yet Another Markup Language.

By version 1.1, in January 2005, yaml stood for YAML Ain’t Markup Language… the wind had changed!

Let’s take a classic yaml file used to declare a deployment in the Kubernetes world:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80

Note: this is the Kubernetes doc example you can find here

Now, I will try to convert it to json. I think it could be something like this:

{
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {
        "name": "nginx-deployment",
        "labels": {
            "app": "nginx"
        }
    },
    "spec": {
        "replicas": 3,
        "selector": {
            "matchLabels": {
                "app": "nginx"
            }
        },
        "template": {
            "metadata": {
                "labels": {
                    "app": "nginx"
                }
            },
            "spec": {
                "containers": [
                    {
                        "name": "nginx",
                        "image": "nginx:1.14.2",
                        "ports": [
                            {
                                "containerPort": 80
                            }
                        ]
                    }
                ]
            }
        }
    }
}
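Converting a nested document by hand is error-prone (a mistyped key or a stray separator is easy to miss), so here is a small Python sketch that lets the standard `json` module do the serialization instead. The `deployment` dictionary is my own transcription of the yaml example above, not something generated by Kubernetes tooling.

```python
import json

# The yaml deployment above, transcribed as a plain Python structure.
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "nginx-deployment", "labels": {"app": "nginx"}},
    "spec": {
        "replicas": 3,
        "selector": {"matchLabels": {"app": "nginx"}},
        "template": {
            "metadata": {"labels": {"app": "nginx"}},
            "spec": {
                "containers": [
                    {
                        "name": "nginx",
                        "image": "nginx:1.14.2",
                        "ports": [{"containerPort": 80}],
                    }
                ]
            },
        },
    },
}

# indent=4 gives the same one-bracket-per-line layout as a hand-written file.
json_text = json.dumps(deployment, indent=4)

# Round trip: parsing the text must give back the same structure.
assert json.loads(json_text) == deployment
print(json_text)
```

Pasting a hand-written version into `json.loads` is also a quick way to catch syntax slips: it raises `json.JSONDecodeError` on anything that is not valid json.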

And finally, let me try to write the same kind of thing in KDL:

image

📝 NOTE

I placed an image here instead of plain text because my website engine does not know how to highlight KDL… and VSCode does, thanks to the appropriate extension.

Should you need the text itself (I’m a fan of copy/paste too, don’t worry), you can find it here

Weight comparison

Format   Size in bytes   Ratio vs smallest
yaml     340             (smallest)
kdl      389             +14%
json     874             +157%

From the weight perspective, i.e. the size of the same message in each format, json is clearly the loser here.

yaml and kdl are roughly equivalent, with no significant difference.
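For the curious, the ratio column is simple arithmetic on the byte counts. Here is a small Python sketch of it; `size_in_bytes` and `overhead_vs_smallest` are hypothetical helper names, and the sizes are hard-coded from the table rather than measured from the actual files.

```python
def size_in_bytes(text: str) -> int:
    """Size of a document once encoded as UTF-8 (what ends up on disk)."""
    return len(text.encode("utf-8"))


def overhead_vs_smallest(sizes: dict) -> dict:
    """Extra size of each format, in percent, relative to the smallest one."""
    smallest = min(sizes.values())
    return {
        name: round((size - smallest) / smallest * 100)
        for name, size in sizes.items()
    }


# Byte counts taken from the table above.
sizes = {"yaml": 340, "kdl": 389, "json": 874}

for name, percent in overhead_vs_smallest(sizes).items():
    print(f"{name}: +{percent}%")
# yaml: +0%
# kdl: +14%
# json: +157%
```

Keep in mind that the json figure depends heavily on formatting: most of its extra weight here comes from indentation and closing brackets, and a minified version would be much smaller.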

Readability

This is a tricky evaluation topic as it is really subjective!

I have been working with json for a long time now, so I feel familiar with it. However, the example above is quite impressive, especially the final 8 lines, dedicated to closing brackets (curly and square)!

yaml is a little more recent for me. I have used it a lot to describe my API specifications (using the famous OpenAPI Specification); however, I still feel uncomfortable when the time comes to write an array: I get a little lost in the indentation 😉.

In addition, when the document is very long, finding the right level of indentation is not an easy task. Fortunately, editors like VSCode offer plenty of useful plugins, linters, etc. to help! 😄

Regarding kdl, I do not feel it improves readability compared to yaml.

I think this comes from the fact that multiple properties can be defined on the same line, like containers name="nginx" image="nginx:1.14.2"… the yaml equivalent looks better to me.

Implementations

Of course, languages like json and yaml are now well known and widely used, and therefore have a lot of implementations, often more than one per programming language.

Even though kdl is pretty new, it already has implementations in many languages: Rust, JavaScript, Ruby, Dart, Java, PHP, Python, Elixir and even an XSLT (to represent your xml in kdl).

TL;DR

I have to admit I’m not a fan of this language.

The complaints about the existing ones, like yaml for instance, do not look fair to me, or at least do not justify the creation of something new… and it leaves me with the same feeling as the one mocked by XKCD:

xkcd

I’m pretty sure it must have been an exciting adventure to define this language. I would have loved to do it, to define a complete grammar and rules, etc.

But, to be honest, if I had to develop a tool today that requires a config/input file, I would not use kdl.

In my day-to-day job, xml is king; json and yaml are starting to find their place, but there are also odata, edoc… and other formats invented for our own business. I’m not sure kdl will find a place here (unless it is pushed by a major software vendor).

Again, this is just a personal opinion. Maybe the future will prove me wrong 😄
