I've been noodling for the umpteenth time on a representation for programs that reduces the need to "play computer". My post last night on #two-minute-week (https://futureofcoding.slack.com/archives/C0120A3L30R/p1600587602007800) triggered enough unexpected thinking together to get me to write up my recent attempts and try to trigger more.
We all simulate programs in our heads. The activity seems to break down into two major use cases:
Forward path: Extracting functions out of arbitrary computations.
Backward path: Imagining the execution of arbitrary computations containing function calls.
The forward path fits very well with ideas like starting with concrete examples and emphasizing data at all times. Nobody should ever have to start with a function definition. Instead, start with an example computation like `18 * 9/5 + 32`, and incrementally end up at a function like `celsius-to-fahrenheit`.
The backward path fits with various metaphors for debugging programs: debug by print, debug by step, time-travel debugging. A key concern is how to uncoil static computations (loops, recursion) into dynamic metaphors (traces, stack frames, interactive movements).
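The forward path can be sketched minimally in Python (the names and refactoring steps here are my own illustration, not anything from the thread): start from the concrete computation, then name the varying part.

```python
# Forward path: start from a concrete example computation...
example = 18 * 9 / 5 + 32  # the throwaway calculation comes first

# ...then name the varying input, turning the example into a function.
def celsius_to_fahrenheit(c):
    return c * 9 / 5 + 32

# The original concrete computation becomes the first test case.
assert celsius_to_fahrenheit(18) == example
```

The point is the direction of travel: the function definition is the *last* artifact produced, not the first thing typed.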
Postfix expressions fit beautifully with the backward path. As the demo of Brief (https://www.youtube.com/watch?v=R3MNcA2dpts) showed, execution is already quite uncoiled, with no backward jumps. While the Brief demo didn't show it (it's easy to spot where the presenter plays computer in their head), it's reasonable to imagine a way to drill down into function calls, replacing words with their definitions. By contrast, conventional expressions -- tree-shaped and using names -- immediately throw up impediments to understanding what happens first.
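Here's a tiny Python sketch (my own illustration, not Brief's implementation) of why postfix execution is "uncoiled": evaluation is a single left-to-right pass with no jumps, and the trace is just a snapshot of the stack after every word.

```python
# Evaluate a postfix program left to right, recording the stack after
# each word. No backward jumps: the trace is as linear as the source.
def run_postfix(program):
    ops = {"+": lambda a, b: a + b,
           "-": lambda a, b: a - b,
           "*": lambda a, b: a * b}
    stack, trace = [], []
    for word in program.split():
        if word in ops:
            b, a = stack.pop(), stack.pop()
            stack.append(ops[word](a, b))
        else:
            stack.append(int(word))
        trace.append(list(stack))  # snapshot for the backward path
    return stack, trace

stack, trace = run_postfix("3 4 + 5 *")
# stack == [35]; trace == [[3], [3, 4], [7], [7, 5], [35]]
```

Reading `trace` top to bottom is exactly the "debug by step" experience, with no mental re-ordering needed.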
However, the forward path is thornier:
It's common to claim that point-free programs make it easy to factor out new definitions, but that's only true when the definition consists of consecutive words. Consider how you would go from `* 3 3` to a definition of `square`, or from `3 4 + 5 *` to a definition of `(a+b)*c`. After they're extracted, point-free functions are harder to understand. What does the stack need to look like at the start? How many items, of what types, how many get consumed: all these questions require simulating programs in your head. Or a manual comment.
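To make the difficulty concrete, here's a Python sketch (my own framing, under the assumption of a toy stack machine): the point-free definition of square contains a word, `dup`, that appears nowhere in the concrete computation, so extraction isn't a simple cut of consecutive words.

```python
# The concrete computation mentions 3 twice. The point-free definition
# ": square dup * ;" must instead duplicate its single input, so
# extracting it means inventing a stack shuffle (dup) that was never
# present in the original program.
def run(words, stack=None):
    stack = list(stack or [])
    for w in words:
        if w == "dup":
            stack.append(stack[-1])     # duplicate top of stack
        elif w == "*":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            stack.append(int(w))
    return stack

concrete = run(["3", "3", "*"])         # [9]: two literal 3s
square   = ["dup", "*"]                 # point-free: one input, plus dup
factored = run(["3"] + square)          # [9]: same answer, new word
assert concrete == factored
```

The refactoring is semantics-preserving, but it isn't *textually* present in the example, which is exactly what makes the forward path thorny.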
This was the idea maze in my head until I saw LoGlo (https://loglo.app/2020-06-16). The cool idea here has taken me weeks to articulate: lines have names and get separate stacks. Forth typically writes to names within lines with words like `!`. Limiting definitions to one per line strikes me as an advance. And having names gives us a way to make Forth words less point-free. I start imagining flows like turning `* 3 3` into `* x x` using two 'rename' operations, and then turning the entire line into a new function. Like, imagine a UI with a code side on the left, and a scratch computation on the right:
│
│ x: 3
│ * x x
│
After defining a function it might look like this:
│
: sq x │ sq 3
* x x │
│
Notice how the definition of `x:` above gets replaced by the call to `sq` below. That's kinda pleasing.
But there are issues. This manipulation requires modifying definitions of free variables. Worse, I ended up with the function call in prefix order. Trying to make things consistent got me stuck up a tree with a 2D layout, until I noticed I'd lost the benefits of postfix that got me down this road in the first place. I'll include it here just in case it sparks ideas for others, but I'm starting to think it's a dead end.
Anyways, that's where I am, still looking for a representation that's easy to uncoil and where inlining function calls is a 'smooth' visualization.
📷 tree.png
Wait, do I just need to switch how I define names?
│ 3 :x
│ * x x
│
=>
: sq x │ 3 sq
* x x │
│
i have to read this in more detail later. you may be interested in http://www.nsl.com/k/xy/xy.htm, and https://hypercubed.github.io/joy/html/jp-flatjoy.html, two concatenative languages that eschew nested quotations (what I think you’re dealing with that got you to the prefix stuff).
xy lets every program have both a stack (representing… the stack) and a queue (representing the stream of tokens that are the rest of the program). It’s an interesting route around the problem you’re talking about, and I feel like it’s either exactly right or maybe the inverse of right
also: this got posted here a while ago and seems like an interesting route around the named variable problem: https://suhr.github.io/papers/calg.html
if im understanding the way you’ve written the example code-with-scratchpad above, the forthy way to write assignment would be : DEFINE X 3; which could be sugared down to :x 3 and also maybe punned to read as “put the symbol :x on the stack” ? i dont think that would work actually
Yeah, the idea is that naming is special syntax that doesn't sugar down to stack ops, and that happens outside the scope of any stack.
^^ that’s the same as forth. : enters immediate mode. dont really understand what that means but you stop interacting with the stack
I was referring to Forth's two kinds of names: definitions and variables (memory locations). `x: 3` is intended to replace `3 x !`, not `: define x 3 ;`.
oops i must have skimmed past that part in the stuff i’ve attempted to read on forth
i always spend time thinking about the define semantics because 1. the scoping is really weird and interesting and seems to “just work” even though they have a funky way of handling shadowing and 2. i think you can use the same thing for data if you just use quotations a la joy
Here's a diagram I scrawled of a more fully worked example (sum of squares). It shows 2 ways to imagine inlining. In option A we have a standard Forth, and inlining replaces words with their definitions. The stack stays the same, it just gets more intermediate steps.
In option B we still have a stack, but execution state also includes a namespace of values for (immutable?) variables. For example, adding `:x` to a line saves the top of the stack as the value of `x`. Every line starts with an empty stack, but can share data with previous lines via variables.
Now inlining shows a second stack in its own row. We might even want to expand the stack of the caller to fit the callee in, just one 'row' down, to show that it's an independent, isolated stack. (The `x` in `sq` is unrelated to the `x` in the caller since each function has its own namespace.)
I think both are equally uncoiled. The big benefit of option B to me is that the accidental complexity of stack manipulation (`swap` and `dup`) has been eliminated.
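Here's a sketch of option B's semantics in Python (my own reading of the diagram, assuming `:x` pops the top of the stack into a namespace shared across lines, and each line starts fresh):

```python
# Option B: each line starts with a fresh, empty stack. ":x" pops the
# top of the stack and saves it as x in a namespace shared across
# lines; a bare name pushes its saved value. Later definitions shadow
# earlier ones.
def run_lines(lines):
    env = {}
    stack = []
    for line in lines:
        stack = []                       # fresh stack per line
        for word in line.split():
            if word.startswith(":"):     # ":x" -> save top of stack as x
                env[word[1:]] = stack.pop()
            elif word == "+":
                b, a = stack.pop(), stack.pop()
                stack.append(a + b)
            elif word == "*":
                b, a = stack.pop(), stack.pop()
                stack.append(a * b)
            elif word in env:            # bare name -> push its value
                stack.append(env[word])
            else:
                stack.append(int(word))
    return env, stack

# Sum of squares with no dup or swap anywhere:
env, stack = run_lines(["3 :x", "4 :y",
                        "x x * :x2", "y y * :y2",
                        "x2 y2 +"])
# stack == [25]
```

Every line's stack depth stays shallow, so there's nothing to shuffle; the sharing between lines is carried entirely by names.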
we’re both in agreement that stack manipulation is bad and makes everyone feel bad for sure.
i’m not opposed to variables but i do think that B is more coiled—there’s shadowing in the example, which is hard for beginners (and sometimes annoying for experts), and you did draw a coil with your arrows to the inline function 😉.
id argue that B is more readable/comprehensible but also more coiled. at the very least, it would be harder to communicate visually where exactly x and y are coming from with the same clean and regular table view as you were using before. and the definition of anything using x or y now depends on your environment which means it can’t be factored seamlessly anymore
factor, joy, and other modern concatenatives use quotations and combinators to get around a lot of the shuffling, but combinators IMO are just as bad (the canonical example is called “bi”, and i can never remember whether it expects two quotes on top of the stack and applies each to the next two items respectively, returning them in order, or some other permutation. even trying to describe my misunderstanding in words is hard).
i definitely don’t have any immediate solutions yet but the comma product article i posted is one attempt, i think pattern matching on the stack itself could be another, and (weirdly) thinking of variables as almost like vim macros that replace with values in-place could be a third direction (like some shells let you expand, eg, how `rm *.txt` <tab> can expand the text at your prompt to `rm hello.txt readme.txt getting_started.txt` in a directory containing all those files). i think i have about 8 total options for this but my notecards aren’t with me and those are the ones i both know of and can remember off the top of my head
another option might be to have every stack also hold an environment, which is almost like an interface in that it just “expects” an x or y to be defined. so like your variable dictionary gets carried around with your stacks. that’s appealing to me, since i have the conspiratorial belief that variable assignment is just nested record modification that’s been hidden behind some weird syntax
Yeah, you're mirroring many of my own concerns.
One thing I want to point out: the shadowing you mentioned is essential to this idea. Basically I rely on shadowing to avoid making decisions about variable names. And once a function is defined, it can be invoked with entirely different names.
Kartik Agaram Could you say a few more words about what you dislike in defining `square` as `dup *`? I always liked that sort of thing in FORTH, which is clouding my understanding.
Me too! I always enjoy stack manipulation puzzles. This isn't about fixing something broken in Forth, but about seeing what part of Forth I can take with me to spreadsheets and example-oriented programming.
I'm probably still failing to internalize the previous discussion at https://futureofcoding.slack.com/archives/C5U3SEW6A/p1599496562080600, so this would be a great place for a rebuttal-by-screencast. Assume you've already run `(* 13 13)` (or its Forth equivalent) at your favorite REPL. How do you get from that expression to a definition of the function `square` in persistent form?
The assumption I'm making is that nobody calculates the square of 13 at the REPL by typing `13 dup *`. So it seems to me that we need some way to nudge people to massage `13 13 *` into a form that needs a single copy of the input(s).
It's just a thought experiment in the end. It's a frame of reference I'm taking on for the duration, and in the process I'm working against some of my own discomfort with spreadsheets and other 2D representations.
[September 7th, 2020 9:36 AM] garth: https://news.ycombinator.com/item?id=23811382 great post on the lisp REPL’s difference from other REPLs. sidenote: can someone explain to me how lisp programmers go from talking to the REPL to updating the code in their source files? in talks and stuff it always seems to be with copy and paste. is that accurate?
This is an interesting question. When I look at `(* 13 13)` I see `169`. Or, to be more clear, when I see a form that operates only on constants, I think of it as a constant itself. So I would only execute a form like that one in a Lisp REPL using `eval-and-replace` semantics to calculate some constant I want to embed in the code. If, on the other hand, I were planning to parameterize such a calculation I'd use a lambda (say, `#(* % %)` in Clojure notation), which lifts quite naturally to a `def` or `defn` if needed.
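The same progression reads naturally in Python too (my own transliteration of the Clojure workflow described above): constant form, then lambda, then a persistent named definition.

```python
# A form over constants is just a constant: evaluate and embed it.
area = 13 * 13            # "eval-and-replace": the code keeps 169

# Planning to parameterize? Reach for a lambda first...
square = lambda x: x * x

# ...which lifts naturally to a named, persistent definition.
def square_fn(x):
    return x * x

assert area == square(13) == square_fn(13)
```

Each step keeps the previous form runnable, so there's never a moment where you have to play computer to check the refactoring.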
BTW, although I understand that you aren't trying to "fix FORTH" in any sense, some of what came up in these last few threads reminded me of this paper that talks about adding various functional programming constructs to FORTH starting from the fewest possible combinators:
http://soton.mpeforth.com/flag/jfar/vol4/no4/article6.pdf
... and this classic about implementing a linear logic Lisp that compiles to FORTH:
https://hashingit.com/elements/research-resources/1994-03-ForthStack.pdf
this is vague and maybe irresponsible spitballing from a staying-on-topic perspective, but what if there was a kind of a stack shuffling operator, that split the stack n items down, and then pattern matched them back on top. so maybe you’d pass a series of symbols into it and transform the stack. eg `( n1 n2 -- n1 n1 n2 )`
would that break the ability to break apart programs anywhere? i’m not sure but i suspect not
Garth Goldwater you can do that in gforth with `{ n1 n2 } n1 n1 n2` syntax. (`{ ... }` introduces local bindings which you can reference at any point later.) I find it quite a useful feature when the word definition grows and juggles 3+ pieces of data
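A Python sketch of that idea (mine, not gforth's implementation; it assumes the rightmost binding names the top of the stack, as gforth locals do): split the stack n items down by binding names, then push back any pattern of those names.

```python
# Pattern-matched stack shuffle: bind the top len(bindings) items to
# names (rightmost name = top of stack), then push an arbitrary
# pattern of those names back on. "{ n1 n2 } n1 n1 n2" duplicates the
# deeper of the two items.
def shuffle(stack, bindings, pattern):
    n = len(bindings)
    names = dict(zip(bindings, stack[-n:]))   # bind top n items in order
    return stack[:-n] + [names[p] for p in pattern]

# ( n1 n2 -- n1 n1 n2 ), with an untouched 7 below:
assert shuffle([7, 3, 4], ["n1", "n2"], ["n1", "n1", "n2"]) == [7, 3, 3, 4]
```

Arbitrary swaps and dups all become the one operation, e.g. `shuffle(stack, ["a", "b"], ["b", "a"])` is `swap`.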
One advantage of `dup` and shuffling is that they force you to pay the cost of non-linearity. Since named variables give you non-linearity for free, reference hydras sprout up easily.
I guess my bigger qualm is that concatenative programs tend to feel less concatenative and more conventionally referential as they get larger. How can one keep the concatenative spirit in bigger programs?
Maybe some kind of dynamic scope? Instead of one stack, each dynamic variable names the top of a stack.
With queues instead of stacks, you get to Lucid. Concatenative reactive programming, here we come!
lucid is really interesting, thanks for the reference! wish there was more than a short wikipedia article and a ~270 page book
that also led me to lustre, which has a feature i didn’t know i wanted: you name the variable you return in the function signature
seems obvious in retrospect since we basically do that with modules when we mark things as `public` or `export`
i just like it as like an outline—“heads up, this is the variable we’re going to finish with”. i guess it’s almost like a mental hyperlink to the answer to the question “ok... what are we doing this all for?”