Usability in programming language concept implementations

Why does it feel easier and more joyful to write in one programming language than another? Most of the work is coming up with the algorithm, which happens largely without considering language. Translating that into a programming language is not usually the biggest problem (unless you are doing it in J 😉). Puzzling out how to make it work in PostScript (or J) can even be a reward in itself. But then there are those times when you just want to get an answer.

Prequel

As usual I have been doing adventofcode, both because it is fun and because it is an opportunity to use my Tailspin programming language. I also take the opportunity to try out a new language (or several), and as the mood takes me I might use an old favourite.

In 2017 I wanted to improve my JavaScript and also ended up learning SequenceL; in 2018 I wanted to see how SQL could work for programming and also started designing Tailspin as "how would I want to write this code"; in 2019 I had just developed Tailspin so that was it; in 2020 I wanted to learn Julia; in 2021 F#; in 2022 Smalltalk (although admittedly I should probably do more with that; I really want to give Newspeak a try). Now and then I give Dart or Java a whirl. I've tried more languages, but you really need to write more than a couple of programs to get a good feel for one.

This year I wanted to see how Pyret worked, as it is designed to be a good beginner language. Some things I particularly liked were the way currying is done and the example tests for functions. The table data structure was also very nice, even if I only used it once. Pyret is quite pleasant to work with, so this is no criticism of Pyret, but as I went along it just felt a little easier and more pleasant to reach for Tailspin than Pyret, particularly on the days I was a little stressed or short of time.

I realized that the same thing had happened in previous years. One thing that struck me is that being able to emit an arbitrary number of result values from a transform/function is really a superpower. I have previously written about the power of emitting nothing at all, but emitting many values is just as liberating. While that is a nice insight in itself, is there a more general way to analyze this?

What is "better"?

Obviously what is "better" can depend on your purpose, so I should start by defining what I think is better for this analysis:

  1. Clearly (and, secondarily, concisely) showing what is being done (rather than how)
  2. Allowing easy restructuring and recombination to fit a slightly altered purpose
  3. Helping to avoid errors (or at least making sources of errors easy to find)
  4. Making it easier to follow best practice and harder to be "clever"

These are the principles I applied when designing Tailspin, so it should come out looking good in this analysis.

Looking at underlying concepts

Last year, I evaluated Tailspin on the Cognitive dimensions of notation, and one way of trying to find usability differences would be to evaluate some other languages the same way. I think that would be useful, but I don't really want to go into so much depth on each language. So I had the idea of looking at concepts, Daniel Jackson style.

I think all programming languages mostly have to implement the same concepts, but the way they are implemented and exposed to the programmer may result in big differences in usability (and certainly I could keep the cognitive dimensions in mind here). Muddling and overloading of concepts, forced synchronizations between concepts, or a proneness to erroneous expression could also explain the friction that arises.

I propose the following concepts:

  • Repetition - Purpose: Allows repeating similar operations with slight variation. Operational Principle: If you provide the parts that vary, the algorithm is repeated with those variations.
  • Aggregation - Purpose: Collects separate related units into one unit. Operational Principle: If you provide the parts and specify how they relate to each other, an aggregate unit is created.
  • Projection - Purpose: Creates a view of parts of an aggregate. Operational Principle: If you specify the location(s) within the aggregate you want to access, those parts are extracted into a smaller aggregate (or single value).
  • Selection - Purpose: Allows a choice of which units to process and which to ignore. Operational Principle: If you specify the selection criteria, the matching units will be processed.
  • Verification - Purpose: Allows checking correctness criteria of the program. Operational Principle: If you specify a criterion and run the verification procedure (which may be built in), a warning/failure is issued if the program does not fulfil the criterion.
  • Communication - Purpose: Allows communicating aspects of the design and purpose to a future reader of the program. Operational Principle: If you specify the relevant information, a future reader will receive it when needed.

Repetition

I suppose the most primitive expression of this concept would be the GOTO. The biggest problem with the GOTO is that it makes it very easy to create complicated execution trees, as Dijkstra famously pointed out. Enforcing and clearly showing more structured repetition, like loops and procedure calls, is advantageous. It is possible that there exists some algorithm that is most clearly represented by GOTO, but I don't remember any convincing example.

Loops

The classic here is the integer count from a start to an end: FOR i=1 TO 10. This is often used to do something a specific number of times, but despite the beautiful clarity of something like 10 times: [...], I think it perhaps doesn't improve enough over the FOR to be worth it, and there is still a need to start at an arbitrary point and perhaps use a different increment. On the other hand, the C-style loop with arbitrary initialization, test and "increment" clauses is far too flexible to be easily understood at a glance, at least in the more exotic cases.

Mostly the for loop is used to get an index into an array to process each element. The for each construct, e.g. for value in values, is a clear improvement both readability-wise and for avoiding indexing errors. It also allows processing each element of an unordered collection. When we combine the for each with a way to construct a collection of index values, e.g. for i in range(1,10), we can do away with the old counting loop. With good constructors of index iterators we can even get back some of the flexibility of the C version in a more readable way, e.g. for i in powersOfTwo(). A possible downside of the for each is that it often relies on mutation (see below) to construct the result.

Functional languages popularized the higher-order map, filter and flatMap functions to abstract away the iteration itself, applying the repetition concept again to re-use the iteration with a modified operation to be done at each step. Unfortunately this means that the simple iteration code is redefined with innumerable variations, each with a different name and parameter list in order to handle different types of operations. Consider the following tortuous code to map to an optional value: values |> map maybeInt |> filter somes |> map unwrapSome. Since it is fairly common, F# defines yet another variation to handle it, values |> choose maybeInt. The small differences in exactly which type of processing is being done hardly seem relevant enough to justify the cognitive investment needed to keep track of all the variations, and the for each covers all of them much more clearly in a single construct.
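
To make the contrast concrete, here is a rough Python sketch of the same optional-mapping task (maybe_int is a hypothetical helper), first with the higher-order vocabulary and then with a plain for each:

    def maybe_int(s):
        try:
            return int(s)
        except ValueError:
            return None

    values = ['1', 'x', '3']

    # Higher-order style: separate vocabulary words chained to express one simple idea.
    ints = list(filter(lambda v: v is not None, map(maybe_int, values)))

    # Plain for each: one construct covers mapping, filtering and flattening alike.
    ints = []
    for v in values:
        n = maybe_int(v)
        if n is not None:
            ints.append(n)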

Tailspin simplifies the for each further by simply streaming values into a pipeline, values..., or a counted range 1..10. This isn't really recognizable as a loop any longer even though it works exactly the same. The operations or transforms to be performed are placed along the pipeline without higher-order function wrappers. To enable filtering, transforms can withhold emitting a value and to enable flatMap, transforms can emit more than one value. This ends up being extremely composable and refactorable.
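
A rough Python analogue of this emit-zero-or-many behaviour, using a generator (digits_of_even is a made-up example transform):

    def digits_of_even(n):
        # Emits nothing for odd input (a filter) and several values for even (a flatMap).
        if n % 2 == 0:
            yield from (int(d) for d in str(n))

    [d for n in [12, 7, 304] for d in digits_of_even(n)]  # [1, 2, 3, 0, 4]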

It might be worth considering a special construct for getting a value and its index, like Go's for index, element := range someSlice, even though the "range" keyword there doesn't sound quite right to me. Tailspin has such a construct as well, adding an "array deconstructor" to an inline transform, someSlice -> \[index](...\). This is effective but a bit of a syntax oddity. Perhaps a better approach would be to just have an iterator constructor that combines an index with the value and use the standard "for each"; see the sketch below.
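
Python's enumerate is an existing iterator constructor of exactly that kind:

    values = ['a', 'b', 'c']
    for i, v in enumerate(values, start=1):
        print(i, v)  # index and value both come from the iterator, no special syntax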

The while loop is a really wild beast and I'll get back to that later.

Recursion

Any function or procedure call is also an application of the repetition concept. Recursion calls the same function repeatedly with slight modification, which ends up being equivalent to an iterative loop. The advantage of recursion is that it forces the changes to the iteration index, the current value and the result to happen at the same logical point in the code, which simplifies the logical structure.

A typical recursion implementation will have an entry function with a recursive helper function that must contain a test for whether to recurse further or return a result. The helper function is most clearly implemented as a match (or switch) statement listing the different possible conditions and what should be done for each. It may be worth noting that in the Guarded Command Language (GCL), which is intended to make correctness proofs easier, a loop contains exactly such a switch/match, whose cases are essentially evaluated in random order.

In Tailspin, each templates (function) is by default expected to be a list of matchers (guards); there is no match keyword, it just is. A value can be sent back to be matched by a #. An optional prelude can set things up before recursing internally on the matchers; see the list-flattening sketch below. While this doesn't enforce good practice, it guides it and makes it easier to follow.
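
Tailspin's actual syntax differs, but the shape (a function whose body is a list of matchers, with values sent back around for further matching) can be roughly mimicked in Python with a match statement:

    def flatten(xs):
        # Each case plays the role of a matcher; the recursive calls
        # stand in for sending values back with #.
        match xs:
            case []:
                return []
            case [list() as head, *tail]:
                return flatten(head) + flatten(tail)
            case [head, *tail]:
                return [head] + flatten(tail)

    flatten([1, [2, [3, 4]], 5])  # [1, 2, 3, 4, 5]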

Mutation

Mutating a variable is equivalent to calling a function at that point with a parameter holding the new value; Erlang handles mutable state exactly that way. So mutation is also an implementation of repetition. Like the GOTO, rampant mutation can make it very difficult to unravel the execution tree.

It is in the light of mutation that we can examine the while loop, because it is completely dependent on mutation to change the condition that eventually breaks the loop. With that in mind, it should be clear that recursion is much preferable to while loops.
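
A minimal sketch of the difference, using Euclid's algorithm:

    # While loop: the exit depends on mutations scattered through the body.
    def gcd_while(a, b):
        while b != 0:
            a, b = b, a % b
        return a

    # Recursion: everything that changes does so at one point, the recursive call.
    def gcd_rec(a, b):
        return a if b == 0 else gcd_rec(b, a % b)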

Despite the possibility of complications, mutation makes the expression of some algorithms, like folds or aggregations, much simpler. Mutation is also quite easy to understand in a local context. Even mutation affecting code in another part of the program can be extremely useful to relay information, but needs to be packaged and shared in a way to make the intent clear, channel in Go being a good example. But that starts digressing into other concepts that I haven't analyzed.

Tailspin allows one mutable variable inside a function, but it can be an aggregate (see the Aggregation concept below). Since functions can be defined inside functions, this can get complex, but it can't escape its confines unless you work hard at it by storing a closure in an object.

Tailspin also provides object constructs (called processors, just to try to free the mind). These allow holding mutable state between accesses.

Vector operations and other repetitions

Fortran90 introduced vector operations, where whole arrays could be used in arithmetic for elementwise evaluation. Hugely convenient, but also performant if you had a machine with pipelined circuits.

In Julia, a . can be used to turn any function into a vector function, e.g. f.(xs) applies f elementwise.

The Normalize-transpose mechanism of SequenceL automatically vectorizes any function call.
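
NumPy gives Python a similar flavour (a sketch for comparison, not how any of those languages spell it):

    import numpy as np

    xs = np.array([1.0, 2.0, 3.0])
    ys = np.array([10.0, 20.0, 30.0])

    xs * ys + 1.0  # whole-array arithmetic, evaluated elementwise: [11., 41., 91.]
    np.sqrt(xs)    # any scalar function applied elementwise, akin to Julia's f.(xs)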

I haven't felt comfortable adding something like this to Tailspin yet, but there is one Projection that has a somewhat magical repetition (perhaps too magical). A call like $({z: §.x + §.y}) will repeat through any level of array nesting and transform each record at the bottom as specified.

Tailspin also has a built-in construct for producing cartesian products: [by 1..3, by 1..3] will produce all 9 pairs in a stream.
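
For comparison, the Python equivalent of that cartesian product would be something like:

    from itertools import product

    pairs = list(product(range(1, 4), range(1, 4)))  # all 9 pairs, (1, 1) through (3, 3)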

Inline-defined functions (lambdas)

Theoretically there is no difference between a function defined inline or a function defined separately, but there is a curious effect I have observed. This is not really about the repetition concept but rather about asynchronous programming, which I am not currently analyzing.

Using small inline-defined functions to modify behaviour in immediately-executing code like map and filter is handy and doesn't cause any problems. But when an inline-defined function is passed as a callback, it can cause temporal confusion.

The observation came from writing Java code with callbacks defined as separate task objects, which was a great way to write asynchronous code. The object provided a clear boundary and signalled that the execution context was different. When Java got lambdas, some callbacks got defined as lambdas and a few curious and difficult-to-analyze bugs started appearing. Those bugs all arose because the brain too easily assumes that the code it is reading inline also gets executed at the same time and in the same context.
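
Python has a well-known incarnation of the same trap, where an inline lambda reads as if it runs now but actually runs later, in a different context:

    callbacks = []
    for i in range(3):
        callbacks.append(lambda: print(i))  # reads as if it will print 0, 1, 2

    for cb in callbacks:
        cb()  # prints 2, 2, 2: the lambdas run later, when i has its final value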

I'm a little on the fence here, but Tailspin currently does not allow inline-defined functions in a parameter, only in pipeline stages. There is less need because Tailspin doesn't need map, filter and friends. A function being passed as a parameter is obviously a crucial component of your algorithm, so maybe it is worth giving it a name? In Tailspin, parameters are also named, so this could result in a function call like sort&{by: nameAscending}.

Aggregation

The simplest form of aggregation is probably addition. If I were splitting hairs, I would also argue that subtraction is a form of projection, which is the inverse of aggregation. Anyway, I think we can agree that arithmetic is most clearly represented in the normal infix way, as 5 * (3 + 1) rather than something like 5 3 1 add mul. This, I think, sets the tone for this section.

Lists/arrays

The most fundamental and flexible way of providing aggregation is undoubtedly cons. This might be a little too flexible, though, and definitely a bit onerous when constructing longer list values. On the other side of the room, older languages closer to the metal require that you allocate arrays to a fixed length before you start filling them with data. If you ever worked in a language where you had to allocate space for strings, you know the pain.

On a point of order, I suppose there is a distinction between lists, being ordered, iterable and variable length, and arrays, being indexable and usually fixed in size. I want all those properties at once and tend to be a little loose about which term I use. In Java, I don't know if there exist any cases where you would use a LinkedList instead of an ArrayList (although I could find uses for a Node class with pointers to next).

Modern languages provide a more direct and declarative array literal syntax, like ['apple', 'pear', 'orange']. For calculated values, list comprehensions are the way to go, e.g. [[i + j for j in 1:3] for i in 1:3], re-using the for each construct. The only thing to object to is the reversal of the repeated clause and the for.

Tailspin provides the literal list constructor, but since the square brackets just capture a stream of values, the list comprehensions read just like normal code, [1..3 -> \(def i: $; [1..3 -> $i + $]! \)], where an inline-defined function is used to capture the i for use in the nested pipeline. All types of expressions can just be combined inside the list literal, e.g. [5, 9, 1..3, 17, 4..6 -> 3 * $, 79]. This composability means there is no need for a concatenation operator, just do [a1..., a2...].

Text (strings)

In this multi-cultural world which is also full of emojis, text can no longer be considered to be an array of anything, except perhaps glyphs. Since glyphs can be made up of several concatenated unicode codepoints, a glyph array is probably not suitable as a basic representation.
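
Python makes the codepoint/glyph mismatch easy to see:

    glyph = 'e\u0301'  # 'é' written as a base letter plus a combining accent
    len(glyph)         # 2: two codepoints, though it renders as a single glyph
    glyph[::-1]        # naive reversal detaches the accent from its letter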

Computer programs don't need to be interpreted line by line like some remnant of punch cards, so text strings should be allowed to span multiple lines.

String templates are a good idea, but positional parameters to them are not, because word order often needs to change when translated to another language. String interpolation is a good idea, and perhaps an interpolated template should be easy to capture as a function. With string interpolation, no concatenation operator is needed. It should probably be possible to enter unicode codepoints numerically and be able to transform to and from byte or integer encodings.
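
A Python sketch of these points (the template strings are invented examples):

    name, count = 'Ada', 3

    # Positional parameters break when a translation needs a different word order:
    '{0} has {1} items'.format(name, count)

    # Named interpolation keeps each value attached to its slot:
    f'{name} has {count} items'

    # An interpolated template captured as a function:
    greet = lambda name: f'Hello {name}'
    greet('Ada')  # 'Hello Ada'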

Tailspin treats text strings as immutable entities produced by string literals with interpolated expressions. As a result of this analysis, Tailspin will be getting first-class string templates; perhaps :'Hello §.name;' would fit syntax-wise.

Structures/records (and function parameters)

While lists can be used generally to associate different pieces of data, it is often not a great idea, even if you call it a tuple. Having a type name constructor and typed values is a little better, but not much. What is a Rectangle(1, 3, 5, 9)? Is it a Rectangle(Point(1, 3), Point(5, 9)) or a Rectangle(Point(1, 3), Dimension(5, 9))? Even then, the structure of Point isn't entirely clear, nor how the Point relates to the Rectangle.

Again, a more literal declarative approach is superior. A literal {left: 1, top: 3, width: 5, height: 9} or named function parameters, Rectangle(left=1, top=3, width=5, height=9), may feel like too much to type sometimes, but it wins in the long run. This gets even better when considering the Communication concept below, but even just taking the cognitive load off constructing the aggregation should be beneficial. Anyone who has switched between Java and C# has run into the following: what does "abcdefg".[sS]ubstring(2, 4) give? In Java it is "cd"; in C# it is "cdef".
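
In Python, the named-parameter version of the rectangle might look like this (a sketch, reusing the field names from the example above):

    from dataclasses import dataclass

    @dataclass
    class Rectangle:
        left: int
        top: int
        width: int
        height: int

    # The call site documents itself; no guessing whether 5 and 9 are a point or a size.
    r = Rectangle(left=1, top=3, width=5, height=9)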

Needless to say, Tailspin supports the literal structure syntax and requires named parameters (in fact, a set of parameters is represented as a literal structure).

Other datastructures

Whether as built-ins or libraries, it is useful to have Maps (Dictionaries) and Sets available. Some languages (I know of Pyret and Ballerina) also have a Table construct that is a bag or set of records, with record fields treated as columns. Tables allow for interesting projections and relational operators like join.

Tailspin does not have Maps or Sets, but it has Relations, which are sets of records with relational algebra operators, like Tables. They can be used as Maps or Sets, but are more flexible because every column could be considered a key (although Tailspin's relations currently aren't implemented very performantly, and replacing a value entails removing and adding).

Projection

Most fundamentally, a Projection is the inverse of Aggregation, so for cons that means car and cdr (or head and tail), but projections can be so much more. Array slices were introduced already in Fortran90 and are now happily present in more and more languages. One might argue that slices are an Aggregation of a Repetition of a simple Projection, but I feel that a Projection could be anything showing a smaller or transformed view of the aggregate object. Keeping the substructure in the projection, rather than just streaming out the values, is probably a good idea: the substructure can be difficult to reconstruct, while it is easy to add a separate streaming step when needed.

Fields of a record are usually accessed by name, often with the dot-syntax record.field. Record deconstruction by patterns whereby record fields can be projected onto local variables is very handy and now even available in Java.
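
Java's record patterns do this for records; Python's structural pattern matching gives the same flavour:

    point = {'x': 1, 'y': 3}
    match point:
        case {'x': x, 'y': y}:
            print(x, y)  # the fields are now plain local variables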

LINQ is a great way to create projections, by selecting subsets of records, filtering, aggregating, grouping and ordering. Languages with Tables provide similar operations on those.

On a side-note, projection specifications can be first-class values: Haskell has lenses, which essentially are values that specify which part of an aggregation to return (i.e. which projection to apply).

Tailspin has many ways to project data:

  • arrays can be sliced by ranges but also be selected by index arrays, so ['e', 'h', 'l', 'o'] -> $([2,1,3,3,4]) gives ['h','e','l','l','o'].
  • Individual fields can be selected by dot-syntax, $.field, or by key-projection, $(field:).
  • Records can be transformed, e.g. $.from({x:,y:,z: §.z - $drop}), which copies the x and y fields of the from record and subtracts drop from the z field.
  • Array slicing and key-projection can be performed in several "dimensions", digging ever deeper, e.g. $@.space($.from.z; $.from.y..$.to.y; $.from.x..$.to.x) (dimensions separated by semi-colon).
  • Tailspin projections can be captured as lenses, e.g. :(to:;x:), which selects the x field of the record in the to field when applied.

Many of these things, as well as the use of Relations and the matching (semi-join) operator, occur in my solution to day 22 of adventofcode 2023.

Tailspin also has a group-aggregation projection, e.g. $(collect {total: Sum&{of: :(score:)}} by {player:})

Text

When the idea of text being an array of characters is discarded, the question arises of how to project parts of text strings. Transforming to UTF-8 bytes or an integer codepoint array should be supported, but it doesn't help in proper unicode text manipulation, as seen in the rosettacode exercises for reversing a string or redacting words. Splitting into an array of glyphs that are themselves text objects works for these, but glyphs vary in size so there may be issues with that approach as a general solution.

Regular expressions are pretty standard and ubiquitous now, so I think we should just embrace them for extracting and replacing parts of text. It might be beneficial to have some form of parsing syntax to project text data into structured data, with bonus points if they can work reversibly as output formatters.
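
As a sketch, named groups in a regular expression already get partway to structured data (this mirrors the Tailspin parser example in the next paragraph):

    import re

    m = re.fullmatch(r'(?P<hand>\w+)\s+(?P<bid>\d+)', '32T3K 765')
    record = {'hand': m['hand'], 'bid': int(m['bid'])}  # {'hand': '32T3K', 'bid': 765}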

Tailspin does allow splitting into glyphs by streaming a string, e.g. ['abcde'...] gives ['a','b','c','d','e']. It also allows getting the text as UTF-8 bytes or a codepoint array. Most useful is the parser syntax that allows declarative specification of a data structure to parse the string into, e.g. { hand: <'\w+'>, (<WS>) bid: <INT"d"> } to parse '32T3K 765' into { hand: '32T3K', bid: 765"d" } ("d" is the unit). I haven't tried to make them reversible yet, but suspect that will only be possible with some restrictions to the current functionality.

Selection

One of the first selection mechanisms in higher-level languages was the arithmetic IF which redirected execution to one of three different locations depending on whether the argument was negative, zero or positive. Nowadays, the boolean IF has taken over.

Most of the time, the real state space contains more than just two cases. While it makes perfect logical sense to create a decision tree, and sets of boolean flags are easily represented in binary computers, nested if statements/expressions are not so easy for the human brain to grapple with, because each level down means more to keep in mind.

Flattening the cases to if...else if...else if...else..., more nicely represented as match (or switch or CASE), helps. Even then, since evaluation of cases is top to bottom, a human reader must keep in mind the implicit "and not any of the above". Perhaps the GCL is correct in (essentially) evaluating the cases in random order, enabling a human to consider each case on its own, and making the code more robust to reordering.
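
A small Python sketch of the flattening (sign_nested and sign_flat are invented examples):

    # Nested ifs: each level adds context the reader must hold in mind.
    def sign_nested(n):
        if n < 0:
            return 'negative'
        else:
            if n == 0:
                return 'zero'
            else:
                return 'positive'

    # Flat match: each case reads on its own, modulo the implicit
    # "and not any of the above".
    def sign_flat(n):
        match n:
            case _ if n < 0:
                return 'negative'
            case 0:
                return 'zero'
            case _:
                return 'positive'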

On some level there really is no distinction between a Selection and a Projection, both being choices made according to specifications. A map operation is a projection of each value of a list, while a filter is a selection of values in a list. But filter could equally be considered a projection of the list itself.

Perceptually, perhaps a Projection selects data while a Selection selects code paths, but this gets muddied because a code path can simply result in data. Are null-conditional access operators (?. and ?[]) or the elvis operator (?:) Projections or Selections? What about if-expressions?

In Tailspin, templates (functions) by default just consist of a list of matchers (guard conditions) with a corresponding block to execute for each. Matchers are evaluated top to bottom and the block of the first true condition is executed. Matchers do not nest, although inline-defined templates can contain their own matchers.

Subtype polymorphism and typestates

In OO programming, code can become much simpler by utilizing the dynamic dispatch of subtype polymorphism instead of explicitly performing a Selection based on the type or state of the object. An object itself can be organized according to the State pattern, delegating to an internal state object with different state subtypes, rather than checking state variables in each method. This also organizes the different behaviours of a particular state together, which is how humans tend to think about the world.
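
A minimal Python sketch of the State pattern (File, Open and Closed are invented for illustration):

    class Open:
        def read(self, file): return file.data
        def close(self, file): file.state = Closed()

    class Closed:
        def read(self, file): raise IOError('file is closed')
        def close(self, file): pass

    class File:
        def __init__(self, data):
            self.data, self.state = data, Open()
        # No state checks in the methods; behaviour lives with each state class.
        def read(self): return self.state.read(self)
        def close(self): self.state.close(self)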

Typestate is a similar idea, but the interface is allowed to change with the state so as to only expose methods valid in that state.

Function overloading similarly uses the dispatch mechanism to choose the implementation based on the argument type(s). The difference lies in where the functionality is defined. Overloaded functions are usually all defined together for all the different types, which mostly doesn't make sense because the implementations are unrelated except for having the same interface description (relation between input and output).

Some languages allow overloads to be defined anywhere, presumably where it makes most sense to the programmer. This ends up being even worse because it becomes almost impossible to find the implementations or even to know which of them have actually been defined and included in the program.

Tailspin allows organizing objects (called processors) into internal states. There is currently no way to find out which state is active, unless explicit query methods are added by the programmer.

Verification

How can you feel confident that the program works as expected (at least most of the time)?

Tests

The simplest way to try to gain some confidence in the program's correctness is to run it on a known case, or several. Unfortunately, tests only guarantee that a correct program will pass, but many incorrect ones may pass as well. More tests only help if they eliminate a hypothesis.

In days gone by, testing was only done manually, even if test-first methodologies were already being applied here and there. Then testing became more automated, with special test runner programs executing tests defined using a test harness library.

More modern languages recognize that tests are important enough to be built into the language and that they should be written next to the code they test. I really appreciated Pyret's where clause at the end of a function to specify examples (tests, really). Pyret additionally has the ability to specify tests not directly related to a specific function.

In Pyret, the tests are run every time the file is interpreted. I think this is probably the right thing to do; tests are useless if they aren't run.
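
Python's doctest is a rough mainstream analogue: the examples sit next to the code and can be made to run whenever the file is executed:

    def double(n):
        """Return n times two.

        >>> double(2)
        4
        >>> double(5)
        10
        """
        return n * 2

    if __name__ == '__main__':
        import doctest
        doctest.testmod()  # checks the examples above against actual output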

In Tailspin, tests are defined with the code, and it is easy to replace (mock, stub, fake) resource-demanding parts of the program. There is no automatic running, but that is probably something to consider.

Contracts

A contract means that if you do your part, I will do mine. In programming contracts, the preconditions required by a function or method can be specified and they are checked on every call. Preconditions check that functions are called correctly. On the other side of the bargain, the postconditions that are guaranteed by the function can be specified and are also checked on every call. Postcondition checks are like tests that are run on the actual input used, to verify that the function is implemented correctly.
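
Plain assertions give a bare-bones sketch of the idea in Python (sqrt_floor is an invented example):

    def sqrt_floor(n):
        assert n >= 0, 'precondition: caller must pass a non-negative number'
        r = 0
        while (r + 1) * (r + 1) <= n:
            r += 1
        # The postcondition acts as a test run on every actual input.
        assert r * r <= n < (r + 1) * (r + 1), 'postcondition: r is the integer sqrt floor'
        return r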

Related to contracts is also the ability to specify class invariants and loop variants and invariants. It seems a little too much work, but maybe those are things you have to think about anyway, in which case having to think clearly enough to specify them and having them checked at runtime could be beneficial.

Contracts are central to the Eiffel programming language and rumour has it that it produces stable and reliable software, but evidently this has mostly not resulted in purchases of Eiffel developer seats. I haven't found any hard research proof, but contracts may be the sweet spot between types and tests. The tl;dr is that preconditions allow more precise definitions than types do, and postconditions are fantastic as the built-in asserts for property-based testing.

Kotlin contracts seem to be something entirely different, not contributing to verification, but are rather ways of working around the compiler and type-checker.

Tailspin does not have contracts yet, but it is on the roadmap.

Types

There is actually no consensus on what types are or what they are for. On a practical level, it is useful to keep track of fixed-point and floating-point numbers so the correct machine operation is applied. On a theoretical level there is a similarly-named but different thing in logic and mathematics, and then there is the Curry-Howard correspondence whereby programs are proofs of their type signatures.

One purpose of types is to serve as really anemic contracts to verify that your program was put together sensibly. By limiting what can be expressed, it becomes easier to check the types than to run the program, so it can run as a sanity check during development. Also, type inference can be used to make this mostly automatic, although explicitly writing the types serves as an even better sanity check.

A proven benefit of types is to serve as lightweight documentation, but clearly contracts could do that better. Only explicit typing will do for this, type inference doesn't help at all. As discussed under Aggregation, names probably beat types here, but a combination might be even better. This was a digression into the Communication concept, though.

Types can be given interesting properties to help verify and enforce correctness, like Rust's ownership rules. Linear types that must be used exactly once are another example, as are modes in OCaml. Instead of putting all these things in the same type system, they could be separate orthogonal mini type systems.

In Tailspin, types are currently checked at runtime, and they can be defined to have any property expressible in the language. Tailspin enforces that record fields with the same name have the same type, as per domain-driven design principles. Tailspin will also automatically create tagged types from strings or numbers, so that country´'Georgia' is different from first_name´'Georgia' or state´'Georgia', and part_id´9 is different from shoe_size´9.

Proofs

Formal proofs of correctness are a bit of a Holy Grail in programming. From my limited understanding, it is possible to some extent but rapidly becomes harder as the program increases in complexity. Anecdotally I have heard that working with a prover can suck up aeons of time.

Having integrated specifications in the language, as Dafny does, is probably helpful. The question is whether it feels worth the effort.

Communication

Since communication was also my main "goodness" criterion, I have already touched on it under the other sections, so I won't repeat those details here. I have also previously written some thoughts about how programming structures can be used in communication.

It may seem plausible that having many different ways of expressing code would be helpful to communicate nuances, but more ways of writing means that readers need more knowledge. Scala is infamous for teams not being able to understand each other's preferred subsets of the language. In contrast, Go is designed with the goal of having only one way of expressing a particular algorithm.

In the Repetition section, I questioned whether the map, filter et al. functions conveyed any meaningful difference or were simply "mechanics". An example of this is perhaps the day 9 adventofcode solutions I wrote in Pyret and Tailspin. Despite expressing the identical algorithm, the Pyret solution just seems more obscured by "mechanics".

On the other hand, being able to choose whether to express the algorithm in terms of the input (e.g. by matchers in Tailspin) or the output (by literal constructor) can make a huge difference in understandability. See the roman numerals example in section 8 of Jeremy Gibbons's "How to design co-programs". In Tailspin, encoding roman numerals is naturally expressed in terms of the output, while decoding them naturally becomes a repeated projection of the input.

Final thoughts

It turned out to be easier than I first thought to come up with concepts to analyze. I think it was definitely worth doing, and I'll have to try to do this for error handling and concurrency as well. For my choice of "good", I think my design choices for Tailspin are supported by this concept analysis.
