Future of Coding History

Jarno Montonen 2022-12-20 07:36:37

Any opinions on whether editors in which you edit a structure (AST) rather than text should be called Structure Editors, Structured Editors, Structural Editors (just saw Peter Saxton use this term in his post), or Projectional Editors? Are all of these synonyms or are they just related in specific ways?

Personally I like Structure Editor the best, as opposed to Text Editor. Although Structured and Structural sound correct, I feel like they refer to the editor UI, similar to how the terms graphical and visual are used. But a Structure Editor could be either visual or textual, so having Structure Editor as the 'base term' would allow being more specific by saying Graphical Structure Editor or Textual Structure Editor. Also, even if technically a structure editor would always use a projection of the structure, I feel like it would be best to reserve the term Projectional Editor for editors that explicitly support multiple projections of said structure. However, Projection(al) might be a somewhat foreign term to people not familiar with the topic, so I would rather just use Structure Editor 99% of the time. I feel like one of the barriers to more widespread usage of structure editors is that you have to explain to people what they even are, and it would certainly be easier to change this if the developer field could agree on the terminology 🙂.

Any thoughts?

Lu Wilson 2022-12-20 07:45:40

My vote is for Structure Editor ✋

"With a text editor, you edit some text that represents your code. With a structure editor, you edit a structure that represents your code." ?

Lu Wilson 2022-12-20 07:49:02

on the other hand, it doesn't roll off the tongue so well, so scrap that

Jarno Montonen 2022-12-20 07:49:21

Good choice! 🙂 Although I don't think the thing 'represented' necessarily needs to be code.

Jarno Montonen 2022-12-20 07:49:59

roll off as opposed to Structured or Structural?

Lu Wilson 2022-12-20 07:53:07

I guess swap out 'code' for 'X' depending on the context.

And yes, but that's possibly just an accent thing. Not sure whether to put a glottal stop in between the two words or not :)

Jarno Montonen 2022-12-20 07:56:23

I feel like for a Finn, Projectional would be easiest, but out of the other three, Structure

Jack Rusher 2022-12-20 09:06:22

all variations of structur* occur in the literature, but I also prefer "Structure Editor" 🙂

Jack Rusher 2022-12-20 09:07:12

("Projectional editor" means an editor that can project the same underlying structure in multiple ways, so it connotes a sort of meta structure editor.)

Václav Blažej 2022-12-20 13:03:03

'Structure editor' seems like a good general term, but software that directly explores and edits an AST may be called ... 'AST Editor'?

Paul Tarvydas 2022-12-20 17:23:12

I would call it “Lisp” or a “tree editor”.

Features of the underlying syntax:

very regular syntax, limited choices

recursive definition

machine-readable, machine-writable.

The different views (“projections”) might be called “skins”. They are micro-syntaxes that make the machine-readable stuff more palatable to humans, i.e. mappings from human-readable -> machine-readable.

Jarno Montonen 2022-12-21 16:54:45

Paul Tarvydas why "Lisp" ?

Paul Tarvydas 2022-12-21 19:22:12

@Jarno Montonen Squinting:

Lisp source code is: hand-written ASTs

e.g. in a fictitious high level language: a := b + c

In fictitious Lisp: (assign a (plus b c))

In real Common Lisp: (setf a (+ b c))

Lisp’s main operations are tree manipulation operators - CAR, CDR, CONS. The rest are nice-to-have noise operations that deal with the contents of tree nodes and/or convenience functions.

Any AST editor boils down to Lisp operations.
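To make the "hand-written ASTs" point concrete, here is a minimal sketch in Python rather than Lisp. It assumes an AST is stored as nested tuples, and `cons`, `car`, `cdr` and `replace_at` are hypothetical helpers written for this illustration, not part of any library. It shows one structural edit expressed with only those three tree operations.

```python
# The fictitious statement `a := b + c` as nested tuples,
# mirroring the Lisp form (assign a (plus b c)).

def cons(head, tail):
    """Prepend head to the tuple tail."""
    return (head,) + tail

def car(tree):
    """First element of a non-empty tuple."""
    return tree[0]

def cdr(tree):
    """Everything after the first element."""
    return tree[1:]

def replace_at(tree, path, new):
    """Return a copy of tree with the node at path replaced by new.

    path is a tuple of child indices; only car/cdr/cons touch the tree.
    """
    if not path:
        return new
    index, rest = path[0], path[1:]
    if index == 0:
        return cons(replace_at(car(tree), rest, new), cdr(tree))
    return cons(car(tree), replace_at(cdr(tree), (index - 1,) + rest, new))

ast = ("assign", "a", ("plus", "b", "c"))
# Structural edit: swap the operator of the right-hand side.
edited = replace_at(ast, (2, 0), "minus")
print(edited)
```

The original tree is left untouched; the edit builds a new tree that shares nothing mutable with the old one.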

Peter Saxton 2022-12-22 09:17:52

What are the list operations for dealing with tagged structures, i.e. records and unions? I see how Lisp maps to lambda reductions, but I've been adding unions/records/effects, all based on row types, and am unsure what their native manipulations would be.

Paul Tarvydas 2022-12-22 12:58:51

short answer: functions

x.y is really y(x) and is written in Lisp as (y x)

method call self.y(z) is really y(self,z) and is written in Lisp as (y self z)

longer answer: all you get is ASTs (things and lists of things)

Yes, that is very low-level.

McCarthy decided to drape meaning over ASTs, i.e. the root node of an AST is always considered to be a function.
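A rough sketch of the "x.y is really y(x)" idea, in Python for illustration. It assumes a record is a tagged list of (field, value) pairs; `tag`, `field` and `norm_squared` are hypothetical names invented here.

```python
# A record as a tagged list: (tag, (field, value), ...)
point = ("point", ("x", 3), ("y", 4))

def tag(record):
    """The head of the list says which variant of a union this is."""
    return record[0]

def field(name):
    """Build an accessor so that `record.name` is written `name(record)`."""
    def accessor(record):
        for key, value in record[1:]:
            if key == name:
                return value
        raise KeyError(name)
    return accessor

x = field("x")
y = field("y")

def norm_squared(self):
    """The method call self.norm_squared() written as a plain function of self."""
    return x(self) ** 2 + y(self) ** 2

print(tag(point), x(point), norm_squared(point))
```

Dispatching on `tag(record)` is one way to treat unions in this style; it is only things and lists of things underneath.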

In the past, if you wanted to create more syntactic sugar draped over your ASTs, you would lock yourself away in a room for years and invent a “new language”.

Or, if you were a Lisper, you would create functions called “macros”, but, the resulting syntax always looked like more ASTs (lists).

Character-based syntax was reserved for compiler gurus who knew how to use parser tools.

Today, though: Ohm-JS provides a way to drape character-based syntax over ASTs in an afternoon (it even comes with a REPL for helping you design/debug a syntax - “Ohm-Editor”). All you need is a toolbag of functionality plus Ohm-JS. (i.e. Common Lisp + Ohm-JS, or, JavaScript + Ohm-JS, or …). PEG-based parsers (like Ohm) can do things that CFG parsers can’t. As a result, quickie grammars can be incredibly short (i.e a couple of lines, slightly longer than a REGEX, but way shorter than most YACC-based parsers).

Back to your question: If you want to see other people’s ideas on how to structure data, or if you don’t want to roll your own, see “CLOS” and “DEFCLASS” and “DEFSTRUCT”. CLOS method dispatch is different and better and more flexible than the usual OO stuff.

[The learning curve is probably steep. Lispers are usually glad to help. There are >1 Lispers here].

Konrad Hinsen 2022-12-22 13:49:20

Paul Tarvydas Your reference to "Common Lisp + Ohm-JS" suggests that parsers written in Ohm-JS can be used with languages outside of the JS ecosystem. That's not my impression from looking at the Ohm-JS Web site, which only mentions TypeScript as an alternative target to JavaScript. Is there something I overlooked?

Background: Ohm-JS looks very interesting for some ideas I'd like to play with, but I have no investment in JS or Web programming in general.

Paul Tarvydas 2022-12-23 06:01:44

[hmm long answer, I wish it were shorter...]

Correct, the parser technology is built in JavaScript and runs in a browser and in node.js.

I was getting at something else. It’s my fault that the idea wasn’t made clear. Let’s try again:

1) I generate compilers using Ohm-JS that transpile from syntax I invent to other languages like Common Lisp, Python and JS (I believe that I could do more languages, but I haven’t needed to do so (FYI, I have a P.O.C. WASM generator, but more work is needed (this was my first encounter with WASM and more learning curve is needed))).

2) Then, I run those generated compilers to compile code written in the new languages.

3) Then, I run the generated code on the command line, and, sometimes, in the browser.

Ohm-JS, based on PEG, is the game changer in the way I now look at problems. CFGs (LR(k), YACC, etc.), REGEX, and, hand-written recursive descent parsers are just too cumbersome to use in the same way that I use Ohm-JS.

As an aside, one of the first things I wrote was a compiler that produced code that could be bolted into an Ohm-JS project, with the result that, in many use-cases, I don’t have to write any JavaScript code at all. I can write the grammar in Ohm’s grammar syntax and I can write the transform in my own FABrication syntax, which is more succinct than JS.

Further aside: the first thing I did with Ohm-JS was to write a Scheme-subset-to-JS transpiler and used it to convert Nils Holm’s PROLOG in Scheme to JavaScript.

Example: I am deeply interested in true concurrency. My code uses messages that look like:

⟨a b c d⟩

and I use Ohm-JS to transpile this nano-syntax into something like:

⟨Message a b c d⟩

using 2 specifications:

‛⟪«p» «d» «s» «m»⟫’
‛⟨Message «p» «d» «s» «m»⟩’

N.B. the whole spec for the pattern matching is 1 line long, and, the whole spec for transforming is 1 line long.

This specific example could be done with a Python script (or sed with Unicode support), but there are other details that I’m trying to skip over for this example, e.g. messages might contain other messages recursively, for which it helps to have a parser that can express matching brackets.
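For the recursive case, here is a minimal hand-written sketch in Python. It is not Ohm-JS and not the FABrication syntax; it assumes messages are delimited by ⟨ and ⟩ as in the example above, and `rewrite` and `_items` are hypothetical names. It only illustrates why matching brackets calls for recursion.

```python
OPEN, CLOSE = "⟨", "⟩"

def rewrite(text):
    """Insert 'Message ' after every opening bracket, checking that brackets balance."""
    out, pos = _items(text, 0)
    if pos != len(text):
        raise ValueError("unmatched closing bracket at position %d" % pos)
    return out

def _items(text, pos):
    """Rewrite from pos up to the end of text or an unmatched closer.

    Returns (rewritten_text, position_where_scanning_stopped).
    """
    parts = []
    while pos < len(text) and text[pos] != CLOSE:
        if text[pos] == OPEN:
            # Recurse for the bracket's contents, then expect the closer.
            inner, pos = _items(text, pos + 1)
            if pos >= len(text):
                raise ValueError("missing closing bracket")
            parts.append(OPEN + "Message " + inner + CLOSE)
            pos += 1  # skip the closer
        else:
            parts.append(text[pos])
            pos += 1
    return "".join(parts), pos

print(rewrite("⟨a b c d⟩"))
print(rewrite("⟨a ⟨b c⟩ d⟩"))
```

A grammar-based tool expresses the same recursion declaratively instead of through explicit position bookkeeping.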

The input to Ohm-JS is a JS String. The output of my FABricator compiler is a JS String. All of the above steps can be done in one fell swoop in a JavaScript program that feeds strings to Ohm-JS and calls Ohm-JS twice. At one point, I need to compile a generated String to executable code. JavaScript’s “eval()” does this. (A “compiler” is “eval()”)
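The "string in, string out, then eval" shape can be sketched in Python, with `exec` standing in for JavaScript's `eval()`. The toy `name := expression` syntax and the `transpile` function are assumptions made up for this illustration; the real pipeline described above uses Ohm-JS grammars instead of `split`.

```python
def transpile(source):
    """Translate lines of the form `name := expression` into Python assignments."""
    lines = []
    for line in source.strip().splitlines():
        target, expression = line.split(":=", 1)
        lines.append(target.strip() + " = " + expression.strip())
    return "\n".join(lines)

program = """
b := 2
c := 3
a := b + c
"""

python_source = transpile(program)  # a string goes in, a string comes out
namespace = {}
exec(python_source, namespace)      # turn the generated string into running code
print(python_source)
print(namespace["a"])
```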

Example: here’s a contrived example of something that I would never do with YACC, but would do with Ohm-JS:

Contrived Problem: scan this big JavaScript program and list every name of every top-level function. Using YACC, you need to write a full spec (“grammar”) for JavaScript; with PEG (Ohm-JS), you can say something more obvious and succinct: a function is function id (...) { ... } where the ‘...’ stuff is anything including recursively bracketed bits. The point here is not whether I wrote a correct pattern match, but the difference between “omg, I have to write a grammar for every nook and cranny in JavaScript” vs. writing a grammar with “I don’t care about this part”. This contrived example can probably be done with a REGEX, but if the problem is expanded to be something like “list every function with the name of each parameter” then REGEX works less well than a parser.
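A hedged sketch of the "I don't care about this part" idea, written as a small hand-rolled scanner in Python rather than as an Ohm-JS grammar. `top_level_function_names` is a hypothetical helper; it deliberately ignores strings, comments, regex literals and arrow functions, so it is an illustration and not a JavaScript parser.

```python
import re

# `function`, a name, then the opening parenthesis of the parameter list.
FUNCTION_HEAD = re.compile(r"\bfunction\s+([A-Za-z_$][\w$]*)\s*\(")

def top_level_function_names(source):
    """List names of `function name(...) {...}` declarations at bracket depth 0."""
    names = []
    depth = 0
    pos = 0
    while pos < len(source):
        char = source[pos]
        if char in "{([":
            depth += 1
        elif char in "})]":
            depth -= 1
        elif depth == 0:
            match = FUNCTION_HEAD.match(source, pos)
            if match:
                names.append(match.group(1))
                pos = match.end() - 1  # resume at "(" so depth tracking sees it
                continue
        pos += 1
    return names

js = """
function alpha(a, b) {
  function inner() { return 1; }
  return inner();
}
var x = { f: function hidden() {} };
function beta() { if (x) { alpha(1, 2); } }
"""
print(top_level_function_names(js))
```

Everything inside brackets is skipped by counting depth, which is the part a regular expression alone cannot express.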

Konrad Hinsen 2022-12-23 08:56:40

Thanks Paul Tarvydas, that's a very good answer (better long than cryptic!). I was aware of the advantages of PEG, and of the exceptional tooling support in Ohm-JS for PEG. So my summary of your explanation is "Rather than using a PEG library for language X, use Ohm-JS to generate something that you can process in/with language X." You gain better tooling, at the price of build system complexity if your ecosystem is not already JS.

Paul Tarvydas 2022-12-25 14:40:33

Addendum: There is “lightweight” pattern matching and “heavyweight” pattern matching.

REGEX falls into the lightweight category, while CFG-based parser generators fall into the heavyweight category.

The terms “lightweight” and “heavyweight” refer to Economy of Expression.

Ohm-JS’s big win is that it fills the gap between REGEX and CFG technologies, enabling a new niche for thought.

Ohm-JS falls into the lightweight category. Ohm-JS can do things that REGEX can’t do, like recursive matching and matching of balanced constructs.
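As a small illustration of recursive matching of balanced constructs, here is a Python sketch that hand-codes a single PEG-style rule. The rule and the names `match_balanced` and `is_balanced` are assumptions for this example, not Ohm-JS output.

```python
def match_balanced(text, pos=0):
    """Hand-coded PEG-style rule:  Balanced <- ( "(" Balanced ")" / [^()] )*

    Returns the position where the match ends.
    """
    while pos < len(text):
        if text[pos] == "(":
            inner_end = match_balanced(text, pos + 1)
            if inner_end >= len(text) or text[inner_end] != ")":
                return pos  # this "(" has no partner, so the repetition stops before it
            pos = inner_end + 1
        elif text[pos] == ")":
            return pos      # a closer belongs to the caller
        else:
            pos += 1
    return pos

def is_balanced(text):
    """True when the rule consumes the whole input."""
    return match_balanced(text) == len(text)

print(is_balanced("f(a, g(b), (c))"))
print(is_balanced("f(a"))
```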

Paul Tarvydas 2022-12-23 14:51:16

Question:

How did REGEX jailbreak from Compiler Technology and become popular with non-compilerists?

First guess: (1) grep, sed, awk, etc. broke the ice, and, (2) perl brought REGEX into the mainstream.

I guess that perl solved a latent problem, which made it very popular. What was that problem?

Orion Reed 2022-12-23 14:56:03

I think part of the answer is that its representation as strings meant that it was supported by a near-ubiquitous infrastructure of plaintext, which made it much easier to share, store, and use in everyday systems.

Justin Blank 2022-12-23 15:01:33

The real question for me is why isn’t grep/regex search more commonly used by non-programmers?

Duncan Cragg 2022-12-23 15:05:22

It's funny how all declarative syntaxes (syntaxii?) acquire a reputation for fiendishness! Take CSS - coulda been a non-programmer language but it's notoriously hard to do simple things like line stuff up; take SQL - a team at my work had the "SQL expert" who everyone went to cos it was so bloody hard, and of course, the most notorious of all: regexes - if only there were a non-brain-twisting way of doing them!

Jack Rusher 2022-12-23 17:54:04

Duncan Cragg There are better ways of writing them that date back to the 1960s. This rant from Sussman captures the problems, and the implementation that follows shows one way of improving matters:

github.com/bzinberg/regex-combinator/blob/c49d6aba03d2a42c33b1bdafc7d5e5ded9d60eb8/ps.txt#L33

Justin Blank 2022-12-23 18:00:30

One part of that seems a bit off. The nice formalism is for regular expressions, which don’t support backreferences (one of the ways that modern “regexes” drifted from the original concept—towards more power, but more obscurity).

Justin Blank 2022-12-23 18:04:46

Combinators are neat, and I wish they were better integrated as an option that plays nicely with existing libraries. The downside is that they give up the very powerful idea of regular expressions as a language, instead of just another API for doing string processing.

Justin Blank 2022-12-23 18:05:55

As a result, they’re most powerful for programmers who rarely use regular expressions, and programmers who want to do certain more abstract manipulations of regular expressions, but weak for non-programmer end users, and programmers who routinely use regular expressions.
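For readers who have not seen the combinator style, a minimal sketch in Python follows. It is not the implementation in the linked repository; `lit`, `seq`, `alt`, `opt` and `plus` are hypothetical helpers that simply build a pattern string for the standard `re` module.

```python
import re

def lit(text):
    """Match text literally, escaping any regex metacharacters."""
    return re.escape(text)

def seq(*parts):
    """Match each part in order."""
    return "".join("(?:" + part + ")" for part in parts)

def alt(*parts):
    """Match any one of the parts."""
    return "(?:" + "|".join("(?:" + part + ")" for part in parts) + ")"

def opt(part):
    """Match part zero or one time."""
    return "(?:" + part + ")?"

def plus(part):
    """Match part one or more times."""
    return "(?:" + part + ")+"

DIGIT = "[0-9]"

# An optionally signed decimal number such as "-12.5" or "3".
number = seq(
    opt(alt(lit("-"), lit("+"))),
    plus(DIGIT),
    opt(seq(lit("."), plus(DIGIT))),
)
pattern = re.compile(number)

print(bool(pattern.fullmatch("-12.5")))
print(bool(pattern.fullmatch("1x5")))
```

The trade-off described above is visible here: the pieces compose and escape safely, but the result is an API call tree rather than a compact pattern language.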

Andrew F 2022-12-23 23:45:34

Parsing is an essential part of any IO more complicated than accepting and echoing byte strings. Regex solves or helps solve parsing across a wide variety of problems. I think that answers why it "jailbroke" from "compilerists": it was never just a compiler thing, everyone needs parsing. Once someone figured out the math, something like regex was inevitable.

The question of Perl's adoption, and by extension PCRE, is IMO a separate one, more historical than theoretical. I assume it's the usual right-place-right-time/path-dependent/worse-is-better type of story. Certainly Perl is more than PCRE, and was responsive to more problems than just parsing.

You are viewing archived messages. Go here to search the history.
