
Breck Yunits 2021-08-31 00:55:59

Has anyone seen a study on "how a DSL becomes a GPL"? Or, alternatively, "what is the most common path travelers take before arriving at Greenspun's 10th rule?"

Is there something like "DSLs either die or eventually add identifiers, then functions, then branching, then macros, etc."? I'm curious whether you can look ahead and say "well, if this is successful, it will eventually require so many things, so we might as well not do a DSL in the beginning and instead start with a GPL and build a library".

Jack Rusher 2021-08-31 08:37:28

I've never seen it studied, but I've seen it happen over and over. This is why DSLs should be embedded in real programming languages.

https://twitter.com/jackrusher/status/1348645505811828737

🐦 ⸘Jack Rusher‽: Starting a DSL from scratch rather than embedding it in a real programming language is folly. You should use languages that make embedded DSLs easy! https://pbs.twimg.com/media/ErdHpSyXEAET5-R.jpg

Konrad Hinsen 2021-09-01 07:11:53

It's unfair that there is a 💯 emoji but not one for 50% agreement!

My first question when choosing embedded vs. standalone for a DSL is: is the stuff that you encode using the DSL more like "code" or more like "data"? In the former case, go for embedded (for the reason Jack Rusher gave). In the latter case, go for standalone in order to keep your data independent of a single language ecosystem, and thus more widely usable.

There is of course no clear borderline between code and data, all code being data from another point of view. But in the context of a specific domain, the choice is often obvious.

Jack Rusher 2021-09-02 07:39:52

Konrad Hinsen The line between code-like and data-like for me is whether or not the data will be interpreted (in the abstract interpreter sense) and thus encodes computation. If one is — for example — just writing a bunch of sensor readings from an experiment, one might as well do it as packed binary data frames. Whereas, if one is creating a specification/configuration language that includes constructs for things like conditionals and loops/recursion, it's already too late — just use scheme/Smalltalk/FORTH and be done with it. 😊
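The "packed binary data frames" case Jack mentions can be sketched in a few lines of Python. The frame layout and field names here are invented for illustration; the point is that purely data-like content needs no interpreter, just a fixed decoding loop.

```python
import struct

# Hypothetical frame layout: one reading = (timestamp: u32, sensor_id: u16, value: f32).
# "<IHf" = little-endian unsigned 32-bit, unsigned 16-bit, 32-bit float.
FRAME = struct.Struct("<IHf")

readings = [(1693526400, 7, 21.5), (1693526401, 7, 21.6)]
blob = b"".join(FRAME.pack(*r) for r in readings)

# Decoding is plain iteration over fixed-size frames; nothing is "interpreted"
# in the abstract-interpreter sense, so there is no computation to encode.
decoded = [FRAME.unpack_from(blob, i * FRAME.size)
           for i in range(len(blob) // FRAME.size)]
```

The moment such a format grows conditionals or loops, it crosses the line Jack describes and the embedded-language argument kicks in.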

Konrad Hinsen 2021-09-02 09:07:33

Exactly. I thought a lot about this because my current project in DSL space is very much on the borderline. Leibniz (https://github.com/khinsen/leibniz-pharo) is a domain-specific specification language, which does include conditionals etc. (it's a term rewriting system). But its reason for existence is the documentation of computational models for humans, independently of any concrete implementation in code, so I ended up choosing the standalone approach, explicitly to remove the temptation of the quick hack in whoever's favorite programming language of the day.

Konrad Hinsen 2021-09-02 09:09:03

BTW, I do consider a schema for packed binary data frames a DSL, although I am not sure everybody would agree with that.

Andreas S. 2021-09-02 10:48:39

I’m so grateful to Jack Rusher (or was it Konrad Hinsen?) for introducing me to Ivan Illich. Here is an interesting piece by him on technology (computers) and how it relates to the concept of the commons. https://dlc.dlib.indiana.edu/dlc/bitstream/handle/10535/5962/Silence%252520is%252520a%252520Commons.html?sequence=1&isAllowed=y

Jack Rusher 2021-09-03 09:24:55

Could have been either of us 😹

Rob Haisfield 2021-09-02 22:13:22

Anybody know of a solid GPT-3 generator for GraphQL queries?

Miles Sabin 2021-09-03 09:46:16

Specifically GPT-3? Or are you looking for a fuzz tester?

Rob Haisfield 2021-09-03 14:08:53

Something that could write GraphQL queries for you by specifying what you want in natural language

Rob Haisfield 2021-09-03 14:09:11

Though I’m curious what you know about fuzz testers

Vijay Chakravarthy 2021-09-03 15:06:35

Not sure the models do very well without some implicit understanding of the underlying data.

Denny Vrandečić 2021-09-02 22:22:34

When designing a programming language, what are good resources for designing the error / exception system?

Alexander Chichigin 2021-09-03 07:32:39

Algebraic Effects obviously! 😄

Jack Rusher 2021-09-03 09:27:25

The process outlined in @abeyer's link — "capture an error, inspect the stack, edit and re-evaluate code, then attempt to continue from the same point" — is an amazingly powerful tool for the working programmer.

Alexander Chichigin 2021-09-03 09:36:18

And I find it kinda amusing that that's basically how Algebraic Effects work, but with a static type system on top. 😉

Jack Rusher 2021-09-03 15:40:07

Alexander Chichigin Algebraic Effects are completely orthogonal to what I've described here, which is a feature of the development environment rather than a property of the language itself.

Alexander Chichigin 2021-09-03 16:00:49

Jack Rusher funny enough, "evaluate a handler in the context of an error location and continue execution from that point" was first and foremost implemented in Smalltalk and Common Lisp, which support it at the language level. Other IDEs still struggle to implement this functionality in full. 🙂

Tony O'Dell 2021-09-03 19:40:04

Use monads instead of errors

Konrad Hinsen 2021-09-04 07:42:29

Alexander Chichigin Common Lisp and Smalltalk make no distinction between "language" and "development environment", being live systems. Put differently, they are more integrated than anything called IDE today.

William Taysom 2021-09-04 12:03:08

Came here to echo Alexander Chichigin about Common Lisp conditions. Practical Common Lisp has a good introduction https://gigamonkeys.com/book/beyond-exception-handling-conditions-and-restarts.html.

Jack Rusher had a good tl;dr.

More generally, I feel we underutilize the stack as a way to abstract. Instead of explicit passing, build context through selective use of dynamically scoped variables.

Alternatively, I guess there's the aspect oriented notion of CFlow https://schuchert.github.io/wikispaces/pages/aop/AspectJ_CFlowExplained.

And I guess better functional programmers than me could say something about comonads, but in as much as option, collection, either, and exception monads are all about returning things, there should be a sort of opposite construction.
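The restart idea from the Practical Common Lisp chapter linked above (low-level code offers recovery strategies, high-level code picks one, and execution continues at the point of the error) can be loosely sketched in Python. This is not a real condition system — Python exceptions always unwind — so the sketch fakes the "continue from the same point" behavior with a dynamically scoped handler variable, which also illustrates the dynamically-scoped-variables point. All names here are invented.

```python
import contextvars

# Dynamically scoped handler slot: callers higher on the stack install policy,
# callees consult it without any explicit argument passing.
handler = contextvars.ContextVar("handler", default=None)

def parse_entry(line):
    try:
        return int(line)
    except ValueError:
        h = handler.get()
        if h is not None:
            # Offer two "restarts"; the handler chooses one, and we continue
            # right here instead of unwinding the stack.
            return h(line, restarts={"use_value": lambda v: v,
                                     "skip": lambda: None})
        raise

def parse_log(lines):
    return [v for v in (parse_entry(l) for l in lines) if v is not None]

# High-level policy, installed far from the error site: replace bad entries with 0.
token = handler.set(lambda line, restarts: restarts["use_value"](0))
result = parse_log(["1", "oops", "3"])
handler.reset(token)
```

In Common Lisp or Smalltalk the equivalent is built in (`restart-case` / `handler-bind`, or resumable exceptions), and crucially the debugger can play the role of the handler interactively.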

Duncan Cragg 2021-09-05 18:03:09

Give an example of an error/exception that is meaningful in your language!

Duncan Cragg 2021-09-05 18:03:52

I built a complete programming language without them, but then again, I'm eccentric

Duncan Cragg 2021-09-05 18:04:58

If your programming language is used to model reality in any way, then there shouldn't be any, because reality has none either

Duncan Cragg 2021-09-05 18:05:44

(e.g. electronic circuits - maybe a transistor overheating is an exception?)

Konrad Hinsen 2021-09-05 18:55:32

My washing machine has exception handling. If I overload it, it beeps and shows an error code on the display.

Legal documents (including law itself) also have exception handling. There's often a description of the "normal" case, followed by special treatments for exceptional cases.

I find the distinction between "normal" and "exceptional" useful in many circumstances. Some programmers overuse or even abuse exception handling, but overall it looks like a good way of structuring code for humans.

Duncan Cragg 2021-09-06 06:28:37

Hmmmm... these are both examples of domain-level exceptions, also known as "just normal programming"!

Konrad Hinsen 2021-09-06 07:27:16

Indeed. But without an exception system, normal programming means having the exceptions appear all over the code, either as explicit tests at all levels of abstraction (as in good old C), or as messy types (e.g. monads in Haskell).
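The contrast Konrad describes can be made concrete with a toy example (the function and file names are made up):

```python
# C-style: the error travels as a return value, and every level must test
# and propagate it explicitly.
def read_config_c_style(path, files):
    if path not in files:
        return None, "not found"
    text = files[path]
    if not text.strip():
        return None, "empty"
    return text.upper(), None

# Exception style: the "normal" case reads straight through; the exceptional
# cases are handled once, at whatever level actually cares.
def read_config(path, files):
    text = files[path]                      # raises KeyError if missing
    if not text.strip():
        raise ValueError(f"{path} is empty")
    return text.upper()

files = {"app.cfg": "debug=1"}
try:
    cfg = read_config("app.cfg", files)
except (KeyError, ValueError):
    cfg = ""
```

The structural benefit is exactly the "normal case first, special treatments after" shape Konrad noted in legal documents.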

Nick Smith 2021-09-04 08:57:34

Huge idea: what if tensors are the next-generation replacement for RAM? Classic RAM (I'm talking about the software abstraction, not the physical hardware) is just a vector with 2^64 cells, most of which are zero and not backed by physical memory. This is commonly known as a sparse vector. The current AI boom has made it obvious that higher-dimensional memory chunks, known as tensors, are an important idea, especially sparse ones. Other than being higher-dimensional, key differences between tensors and RAM include:

  • An AI app will typically work with multiple tensors, but a classical app will only work with one RAM. (Though Wasm can have multiple RAMs, known as "linear memories", and of course, you can pretend to have multiple memories using abstractions like malloc).
  • Tensors can be subjected to unary operations such as slicing, permuting, and aggregation (min, max, sum, product), that generalize the boring read and write operations on RAM.
  • Tensors can be subjected to binary operations such as multiplication/contraction (generalizing matrix multiplication), convolution, and element-wise addition.

The data of everyday programs is often very heterogeneous, which corresponds to having lots of sparse tensors. Sparse tensors need good support in software and ideally in hardware. Thankfully, there is AI hardware being developed that is designed to operate on sparse tensors, by way of dedicated circuits that can compress and decompress them. Tenstorrent is probably the leader here.

Here's a fun fact: multiplication of sparse Boolean tensors is equivalent to a database equi-join. So if you think databases are important, then maybe you should give tensors some thought.
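That fun fact is easy to check directly. Encode each relation as a sparse Boolean matrix (only the True entries are stored); then the Boolean matrix product, computed over nonzeros only, is precisely an equi-join on the shared index. The tables below are made up for illustration.

```python
# A relation as a sparse Boolean matrix: the pair (a, b) is stored iff R[a, b] = True.
follows = {("alice", "bob"), ("bob", "carol")}        # R(a, b)
lives_in = {("bob", "berlin"), ("carol", "paris")}    # S(b, c)

# Boolean matrix product: (R @ S)[a, c] = OR over b of (R[a, b] AND S[b, c]).
# Iterating only over stored entries makes this an equi-join on the middle index b.
product = {(a, c)
           for (a, b1) in follows
           for (b2, c) in lives_in
           if b1 == b2}
```

A real implementation would index one side by `b` to avoid the nested loop, which is exactly a hash join.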

And relatedly: operations on tensors are typically massively-parallelizable, thus could be a good foundation for a high-performance programming language that compiles to AI hardware.

Shubhadeep Roychowdhury 2021-09-04 10:18:03

And relatedly: operations on tensors are typically massively-parallelizable, thus could be a good foundation for a high-performance programming language that compiles to AI hardware.

You hooked me there

Luke Persola 2021-09-04 17:59:41

So we already store tensors in RAM and perform various operations on them. You said software not hardware, so is the difference here that the abstraction between the 1D (flattened) data and its higher dimensional form is provided at a lower level in the software?

Nick Smith 2021-09-04 22:42:41

I mean the assembly language of the hardware should be phrased in terms of operations on tensors. 🙂 The programmer should not be concerned with whether the tensor is ultimately flattened into a linear array of SRAM or DRAM cells. (In AI hardware, they definitely won’t be.)

Nick Smith 2021-09-04 22:45:36

My goal with this post is just to get people thinking a little differently about the memory model upon which a programming language is built. Tensor-based memory models are about to become mainstream (next 5 years) thanks to the AI boom. Could lead to some exciting new paradigms of programming.

Nick Smith 2021-09-04 22:51:16

Here’s a challenge for everyone: when you visualise the act of “allocating memory”, what do you see? If you see a big linear chunk that you can address with a pointer, then maybe you’re trapped in 1-dimensional thinking. I certainly was/am.

Denny Vrandečić 2021-09-05 01:48:03

How's your memory mostly zeros? If it is, you should use smaller machines. I think the idea that RAM is similar to a sparse vector is often not right. At least not if you use Chrome for browsing.

Nick Smith 2021-09-05 01:53:38

I’m referring to virtual memory (i.e. what apps see). I guarantee you that your 64-bit virtual address space is mostly zeroes! And with a paging system, you can write across vast swathes of the memory whilst only consuming physical resources for the pages you actually touch. That’s what I mean when I describe linear memory as a sparse vector.

Nick Smith 2021-09-05 02:00:42

Now imagine what it would be like to have a memory model where your memory is sparse at the byte-level, and is multidimensional, and you can have multiple memories and perform massively-parallel operations (such as aggregations) over them. This is what sparse tensors are.
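A toy model of the memory Nick describes, sketched as a Python dict keyed by index tuples (real sparse-tensor formats like COO or CSF are more compact, but the semantics are the same): unstored cells read as zero, the index space can be astronomically large, and aggregation is a first-class operation.

```python
# Sparse multidimensional memory: index tuple -> value; absent means 0.
mem = {}
mem[(3, 1_000_000, 42)] = 5    # writes can land anywhere in a huge index space
mem[(3, 2, 7)] = 10

# Reading an untouched cell consumes no storage and yields zero:
untouched = mem.get((9, 9, 9), 0)

# An aggregation primitive, e.g. sum over the slice where the first index is 3.
# On sparse hardware this touches only stored cells, not the whole index space.
total = sum(v for idx, v in mem.items() if idx[0] == 3)
```

The parallelism claim comes from the fact that such aggregations and element-wise operations have no loop-carried dependencies across stored cells.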

Konrad Hinsen 2021-09-05 08:04:00

N-dimensional arrays as a fundamental data representation? That's an idea that has been around since the days of Fortran and APL. The 1960s. Efficient parallelization has been investigated as well, with today's Fortran containing very good support, though it's less automatic / miraculous than people tend to expect.

Konrad Hinsen 2021-09-05 08:05:26

BTW, I avoid calling N-dimensional arrays tensors because a tensor, for me, is an algebraic and geometric object, not a data structure: https://en.wikipedia.org/wiki/Tensor

Nick Smith 2021-09-05 08:47:51

Does Fortran handle large and high-dimensional (10000x10000x10000x...) sparse tensors, though? That's the main enabler of a lot of interesting applications. Tenstorrent handles sparse tensors completely in hardware; as a programmer you work with them as if they were dense. For context: if you multiply a pair of 99% sparse tensors using a dense multiplication algorithm, you’re doing 10000x more work than you need to (repeatedly multiplying by 0). In general, the asymptotic complexity is different.

Nick Smith 2021-09-05 08:50:33

I'm aware of the more "mathematical" definition of tensor. But I believe the difference is just that the array representation is what you get once you've chosen a basis. You can also talk about tensors without reference to any basis.

Konrad Hinsen 2021-09-05 18:48:35

Fortran doesn't support sparse arrays as a language feature, but library support has been around for decades, getting better all the time.

As for the tensor, yes, once you pick a basis, you get an array representation. But the whole point of tensor algebra and tensor analysis is that the tensor has a meaning (and properties) independent of the choice of a basis.