Florian Schulz 2021-09-06 05:01:04

When Ivan Reese talks about the data model in Hest (Hest Podcast, Episode 25), he mentions things like:

  • AOS: Array of Structs
  • SOA: Struct of Arrays

He also mentions a way to organise data where each property (name, color, position, …) is stored in its own array – in contrast to an array of objects, each object carrying all of its properties.
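
For concreteness, here is a minimal C sketch of the two layouts (the field names and sizes are just illustrative):

    // Array of Structs (AoS): each entity's properties sit together in memory.
    struct ParticleAoS {
        float    x, y;       // position
        unsigned color;
        char     name[16];
    };
    struct ParticleAoS particles_aos[1024];

    // Struct of Arrays (SoA): each property gets its own contiguous array,
    // so a pass that only touches positions streams through dense memory
    // without dragging colors and names through the cache.
    struct ParticlesSoA {
        float    x[1024], y[1024];
        unsigned color[1024];
        char     name[1024][16];
    };
    struct ParticlesSoA particles_soa;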

This reminded me of a talk I had seen at Unite (the Unity conference) about “Data-Oriented Design”. While it makes a lot of sense in terms of memory usage, caching, and lookup performance, I rarely see data-oriented design in use today.

What are examples of programming languages or software that use or promote data-oriented design outside of game development?

Konrad Hinsen 2021-09-06 07:31:24

Scientific computing has many of the same issues as games: large but simply structured datasets, and hardware limitations in memory and performance to deal with. But explicit support in programming languages is rare. One reason there was a lot of enthusiasm for C++ for scientific computing in the 1990s was the hope of managing the AOS vs. SOA tradeoff better than in Fortran. Not sure it was a success.

Alexander Chichigin 2021-09-06 09:38:10

Probably, "the languages promoting data-oriented design" are exactly the "array programming languages", which have become kinda popular again lately with the spread of GPGPU, but mostly around scientific computing, DSP, and the like. 🙂

Besides, plain SoA isn't efficient enough on modern hardware, considering all the levels of caching, NUMA, and vectorization. Thus in practice, for maximum performance, people use combined layouts like SoAoS. But for that you have to know the sizes and layouts of your base structures, where and how they are used, and adapt your algorithms accordingly. So there's no "one size fits all" solution here; it's all careful, conscious, and methodical manual optimisation.
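
A combined layout of the kind Alexander describes might look like this minimal C sketch: small fixed-size SoA tiles stored in an array, with the tile width and field names purely illustrative.

    #define TILE 8  /* e.g. 8 floats per AVX register; pick to match your hardware */

    // Each tile is a small struct-of-arrays; the data set is an array of tiles.
    // Properties used together stay close in memory, while within a tile each
    // property is contiguous, so inner loops over a tile can be vectorized.
    struct ParticleTile {
        float    x[TILE], y[TILE];
        unsigned color[TILE];
    };

    struct ParticleTile tiles[128];  // 128 tiles * 8 lanes = 1024 particles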

Jimmy Miller 2021-09-06 13:46:40

I know Andrew Kelley just applied some data-oriented techniques to Zig for compiler performance improvements. https://vimeo.com/507318005

Ivan Reese 2021-09-07 05:05:08

Jonathan Blow's work-in-progress programming language Jai has language-level support for AOS and SOA:

However, as programs get larger, it becomes much more difficult to reorganize the data. Testing whether a single, simple change has any effect on performance can take the developer a long time, because once the data structures must change, all of the code that acts on that data structure breaks. So Jai provides mechanisms for automatically transitioning between SoA and AoS without breaking the supporting code.

Which makes me wonder if this sort of generality of data access is similar to the approach Clojure takes to its data structures — and if so, whether they're directly aligned or lie along slightly different axes.

Jimmy Miller 2021-09-07 14:23:07

For what it is worth, I am fairly sure that Jonathan Blow got rid of those abilities. Instead you have macros and arbitrary compile-time code execution that would let you reimplement something like that yourself.

Chris Granger 2021-09-07 16:24:24

Anytime someone talks about a column store vs a row store, they’re effectively talking about SoA vs AoS
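
A minimal C sketch of that equivalence, using a hypothetical two-column "users" table: an aggregate such as SUM(balance) only has to scan one dense array in the columnar (SoA) layout.

    // Row store ~ AoS: each record's fields are stored together.
    struct UserRow { int id; float balance; };

    float sum_balances_rows(const struct UserRow *rows, int n) {
        float total = 0;
        for (int i = 0; i < n; i++)
            total += rows[i].balance;   // strides over the ids as well
        return total;
    }

    // Column store ~ SoA: each column is its own array.
    float sum_balances_column(const float *balance, int n) {
        float total = 0;
        for (int i = 0; i < n; i++)
            total += balance[i];        // contiguous floats only
        return total;
    }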

Nick Smith 2021-09-08 23:12:43

Calling all software engineers who know basic category theory: some applied category theorists in the US are organizing a hackathon to explore how the theory of polynomial functors can be applied to software development (new programming libraries and/or languages). The attendees would be a mix of software engineers (who know category theory) and category theorists. If you'd be interested in attending or brainstorming about such a hackathon let me know. I've offered to help organize it, but we need more input from software engineers. 🙂 This would be an in-person event in the US, taking place ~March next year.

There is a short course and book draft on the category of polynomial functors, outlining its applicability to interactive systems and databases. I expect these applications will be the focus of the hackathon.

Don Abrams 2021-09-11 13:02:23

I’m interested in this, but I am only familiar with the basics of CT.

FYI: I’ve been learning what I can about generating code from layman-defined models+properties, where the generated code “smartly” chooses between runtime (based on collected metrics) and static (based on a machine model) strategies (such as sort algorithm, data layouts, GPU vs. CPU targeting, batch/row sizes, etc.). There’s also a very difficult problem of providing feedback or even choices when “things” are over-constrained/infeasible.

Any thoughts on whether Poly could be a helpful way to model this problem? In the Poly book, they restrict the lens structure in a way I don’t understand, so I don’t know if it can model more complex optics like monadic lenses (which are one way to model+execute runtime strategy decisions). I also really don’t understand the implications of things like “preserving op-cartesian arrows.”

Nick Smith 2021-09-11 23:59:32

I’ve been learning what I can about generating code from laymen-defined models+properties

It sounds like you're interested in compilers for high-level languages? Your description reminds me of MLIR, which aims to be a good framework for compiling programs from (vastly different) high-level domains, such as neural networks and hardware circuits. It definitely does all those static optimizations you describe.

As for whether Poly can help with this: my hunch is no, not right now. (Disclaimer though: I'm not a category theorist, I'm just a software engineer who's been in contact with the book authors.) Poly is being investigated as a modelling language, but not as a compiler. The main challenge right now is connecting Poly back to everyday programming constructs. We've got an understanding of how it connects to type systems, I/O, and databases, but other important stuff like recursion is missing. It's unclear how deep the Poly rabbit hole goes; it's still very early days. That's why there's a hackathon being planned: to support further exploration!

I don’t know if it can model more complex optics like monadic lenses

Nor do I. I don't even know what a monadic lens is 🙂

I also really don’t understand the implications of things like “preserving op-cartesian arrows.”

The book is mainly targeted at category theorists right now, and so its style is to list out a bunch of theorems on every aspect of Poly. Some of them have clear implications, some of them don't. My personal strategy has been to focus on the theorems that seem interesting to me as a software engineer 😇.

Don Abrams 2021-09-12 10:47:38

Hah, well my current hope is that the compiler input isn’t a high-level language, but a model (think more like Minecraft or sketch)

I’m looking for the right model(s) to slowly increase power/complexity, basically adding a dimension at a time (and trying to keep people thinking about the “what” rather than the “how” when manipulating the model)

Nick Smith 2021-09-12 23:51:15

What do you mean by "model" here? That word has a lot of meanings!

For example, how do you wish me to perceive Minecraft as a model? I see it as a fully-fledged interactive application! But if I were to think of Minecraft as a tool for making models, I can recall how people use Minecraft blocks to build physical models of locations and buildings. But those aren't models of interactive systems. You can also use Minecraft's redstone system to build circuits, but that's equivalent to a low-level programming language, so I don't see it as something special. And of course you can write mods for Minecraft, but that's just conventional programming.

Ivan Reese 2021-09-09 03:06:00

I'm thinking more about the sensation of time, as it pertains to the execution of code. (Yeah, back on my bullshit.) I see a spectrum here — a spectrum of different sensations of time for different ways we interact with computation.

On one end of the spectrum, we have raw math. There's not supposed to be any sensation of time in the evaluation of math. A variable always has a specific value; relationships either exist or they don't. It might take you (or a computer) some time to crunch values and arrive at a result, but that's absolutely not supposed to be part of the aesthetic. Conal Elliott's Denotational Design is an application of this sort of thinking to software design. Lambda calculus, Curry-Howard, and some of the more hardcore FP languages all exist — infinitely, suspended frozen in midair — over here. Of course, no computer language is actually timeless (that violates physics, and this is addressed in Church-Turing and related work, and we all agree never to try that again), but the desired sensation — the aesthetic — is one in which time is not a constraint or concern.

On the other end of the spectrum, we have mechanical computers. There's no avoiding the sensation of time when operating these clockwork beasts. You're required to think about the passage of time as you plan an execution, or else the result will be nonsense, or malfunction. Nothing is instant, nothing exists until you build it. Here we find the CAP theorem, Turing machines, and Rich Hickey's The Language of the System, all of them toiling, sweating, grinding, churning.

[Aside: note that Functional Programming is orthogonal to this spectrum — it is not bound to the math side. On either extreme and anywhere in between, you can have programming that's about immutable values (or not), static vs dynamic binding, data & behaviour co-located (or not), for-each vs map, place-oriented vs value-oriented, and so forth.]

I've spent all my time over in the mechanical labour camp — this is where Hest lives. So I don't have much insight at all into the crystal tower of pure evaluation. So beyond just suggesting "Hey, talk about this spectrum" (which, granted, I am suggesting), I'd also like to know what examples you can point to that obviously violate this common alignment of aesthetics. For example: what's the most "I feel the passage of time in execution" you can get when working with something like Coq, or Haskell, or APL? Is there some step debugger for SML that really lets you feel the iterative progress through execution? Or on the other side, what's out there that takes a process rooted in time — like CAP — and makes it shake that feeling of temporality? Look at something like Erlang/OTP — they take Prolog (timeless) and reify the sensation of process ("let it crash"). Who else is doing that? Is anyone doing it in the other direction?

Cole Lawrence 2021-09-09 11:19:00

Potentially in this direction could be something like temporal.io, which is a workflow language that makes keeping value in the stack for weeks at a time a comfortable endeavor. (The contrast: usually we put values into a database so they can survive an upgrade over the course of weeks.)

Cole Lawrence 2021-09-09 11:22:17

Also, on another side of this that may or may not bear any fruit is https://www.liquidsoap.info/, a language for programming media streams.

Chris Knott 2021-09-09 13:12:42

I don't really know enough about the Haskell end of this spectrum but my sense is that the units for it are probably something like "percent of the code that has some kind of invariance guarantee".

If you take something multi-paradigm like Java or C++, it is possible to write very mathsy code by leaning heavily on const, interfaces, classes, etc. and earn yourself very strong guarantees about invariance over time. Different languages force or encourage you to bake more guarantees into your code, but it's something that is possible in any language. Even something like Python, in the real world, will probably come with a stack of Organisational Practices and socially enforced norms that in practice allow you to make similar assumptions, even though you could in theory override the + operator at runtime.

Accepting these kinds of restrictions limits your expressivity but can superpower your understanding of a codebase because it lets you make massively compressing abstractions and draw very clean lines ("Ok... this can't possibly affect this... It must have come from A, B or C...").

Time invariance is an affordance that lets you better "play computer in your head", as Bret Victor would say.

So basically I think the reason that Haskell programmers play down the actual execution of their code is they are backing themselves to have already run it accurately in their head beforehand.

Chris Knott 2021-09-09 13:18:46

I basically believe you should start from the opposite end.

Computers are physical machines that literally, actually, move and change with the forward arrow of time as part of the physical universe. Because they change in very small ways and very quickly, they kind of fly below our intuition radar. They are also highly deterministic compared to naturally occurring phenomena.

Still, they actually do perform an irreversible action just like a computer built out of metal ball bearings falling through chutes (https://www.turingtumble.com/).

If you want to understand what happened, you should just record what happened and then interrogate that recording with a powerful suite of thinking tools.

Mariano Guerra 2021-09-09 13:21:00

Temporal Logic of Actions sounds like an in between in your spectrum:

This paper introduces TLA, which I now believe is the best general formalism for describing and reasoning about concurrent systems. The new idea in TLA is that one can use actions–formulas with primed and unprimed variables–in temporal formulas. An action describes a state-transition relation. For example, the action x’=x+1 means approximately the same thing as the programming-language statement x := x+1.

https://www.microsoft.com/en-us/research/publication/the-temporal-logic-of-actions/
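
As a rough sketch of how such actions sit inside a temporal formula (this is the standard shape of a TLA specification, not a quote from the paper):

    \[
    Init \;\triangleq\; x = 0 \qquad
    Next \;\triangleq\; x' = x + 1 \qquad
    Spec \;\triangleq\; Init \,\land\, \Box[Next]_x
    \]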

Andreas S. 2021-09-09 18:21:15

🐦 Vanessa wrote a thing: What is @croquetio anyways?

A thread 🧵

Luke Persola 2021-09-10 00:18:29

Ivan Reese

Functional Programming is orthogonal to this spectrum

I don’t get this. While I would normally think of functional programming as being on the timeless end of the spectrum, you make a good point that it can also be on the other end. But isn’t it the case that programming styles that contrast with FP often do have to be structured in time? To pick from your examples, how would you have mutable values without modeling time? Or with forEach, don’t you need time for side effects to work?

Luke Persola 2021-09-10 01:34:22

Chris Knott

Are you taking an anti-abstraction stance that programmers should be aware of time as it exists in the internals of the language/environment they’re using (because they’ll have to deal with it eventually anyway)? Or are you just saying it’s good for them to think about/deal with time in their code, but it’s OK if it’s abstracted into a different (likely more limited) form?

Chris Granger 2021-09-10 17:14:10

synchronous languages (Lucid, Esterel, Lustre) have a pretty qualitatively different experience of time

Ivan Reese 2021-09-10 17:35:22

Great references and exploration of the topic — thanks everyone.

Cole Lawrence

keeping value in the stack for weeks at a time

I love this idea. I'm so accustomed to non-live programming that I forget about the completely different sensation of time that you get from a live environment, where state can be thought of as non-volatile by default.

Chris Knott — I like that your two comments are "here's why [existing thing] is what it is" and "here's what I want to exist". Both make sense to me!

Mariano Guerra — Queued! Though unlike Lamport's earlier work (you know, the hits), this one looks like it might fly over my head a bit :$

Andreas S — Croquet sure is interesting, hey? Though I'm not sure how it relates here — I can imagine ways, but if you had something specific about it you wanted to highlight that'd be appreciated.

@Luke Persola — You're right, I under-considered what I was saying there. I was focused on pointing out that you can have all that conventionally "functional" stuff within a programming system that does feel very mechanical / process-oriented. I didn't consider enough whether it's possible to have the non-FP stuff within a system that feels timeless. Good point, will have to think about this some more.

Chris Granger — Nice pulls! I suppose the same applies to RT OSes, CAN Bus (etc), and perhaps even programming within a high-end game engine (eg: frame cadence, Carmack's preference for algorithms that are slower average-case with less variability, etc). Would love to be able to just... order a study on this.

Shalabh 2021-09-11 00:16:36

ok, stream of consciousness incoming...

On the concept of math vs mechanics, I perceive math as this vast graph that always exists, but in imagination. Mechanics is about materializing parts of this graph in some physical form. The machine manifests a subset of this graph, and traverses to other nodes, pulling in more and more of this graph, from the imaginary to the physical, as physical time progresses. If you type 2 + 3 in a system, you've got a graph with three nodes (2, 3, +, connected in a nice tree) and after some time, you've got a 4th node (5, connected to the [2, 3, +] bundle of nodes), but in mathland all 4 nodes (and the edges and more) pre-existed.

On the intersection of these ideas, CRDTs come to mind. You have the semi-lattice, which is very mathematical and static. However you have the actual values at different nodes, which correspond to one node in the lattice at any point in time, but they change over time and eventually walk up the math lattice to meet at the top.
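
As a tiny concrete instance of "walking up the lattice", here is a hedged C sketch of a grow-only counter CRDT (the replica count and names are made up): merge is the lattice join, an element-wise max, so replicas applying merges in any order converge to the same least upper bound.

    #define REPLICAS 3

    // Grow-only counter: each replica only ever increments its own slot.
    struct GCounter { unsigned long count[REPLICAS]; };

    void increment(struct GCounter *c, int replica_id) {
        c->count[replica_id]++;
    }

    // Merge is the semilattice join: element-wise max. It is idempotent,
    // commutative, and associative, which is what lets every replica
    // eventually "meet at the top".
    void merge(struct GCounter *into, const struct GCounter *from) {
        for (int i = 0; i < REPLICAS; i++)
            if (from->count[i] > into->count[i])
                into->count[i] = from->count[i];
    }

    unsigned long value(const struct GCounter *c) {
        unsigned long total = 0;
        for (int i = 0; i < REPLICAS; i++)
            total += c->count[i];
        return total;
    }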

Croquet came to my mind as well when reading the prompt. On the surface it is fully mechanical. The machine moves, step by step, and is implemented in that style. However, look between steps - each step is functional, it must be deterministic - that's what keeps all the different systems in sync. There is no logical time within a step (i.e. a step can be considered instantaneous; the next input cannot interrupt a step).

Chris Knott 2021-09-11 09:32:36

@Luke Persola Basically the first. You should at least be aware of which end of this spectrum is the actual bedrock, and which is the potentially leaky abstraction.

I've used this example before, but CPython and PyPy implement list.pop(0) (popping the first item from a list) differently. CPython bumps a pointer, PyPy moves the list. So it's O(1) vs O(n). There is literally no way to discover this "in system". This is the sort of thing that comes up in end-user programming like data science and will actually confuse users. It can make the difference between a visualisation being interactive or not.
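
For illustration, a minimal C sketch of the two strategies being contrasted (hypothetical, not actual interpreter code): popping the front by bumping a start offset is O(1), while shifting every remaining element down is O(n).

    #include <string.h>

    struct IntList {
        int    *items;
        size_t  start;  // index of the first live element (used by the bump strategy)
        size_t  len;    // number of live elements
    };

    // Strategy A: O(1) pop-front by advancing the start offset.
    int pop_front_bump(struct IntList *l) {
        int v = l->items[l->start];
        l->start++;
        l->len--;
        return v;
    }

    // Strategy B: O(n) pop-front by shifting the remaining elements down by one.
    int pop_front_shift(struct IntList *l) {
        int v = l->items[0];
        memmove(&l->items[0], &l->items[1], (l->len - 1) * sizeof(int));
        l->len--;
        return v;
    }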

The weakness in my position is that in a lot of cases, you truly don't ever need to worry about how the abstraction is actually realised/implemented. Most actual use cases are just IFTTT style plumbing or CRUD apps with N in the hundreds.

Jimmy Miller 2021-09-10 15:19:41

I know a lot of us here have been influenced by talks like Bret Victor’s “Inventing on Principle”. But I’m curious about your favorite papers that are somehow related to the future of coding. I’ll start with mine.

PILOT: A Step Toward Man-Computer Symbiosis - Warren Teitelman

This is actually a thesis, so it is a bit long, though much of that length is taken up with a transcript. I will admit there is a lot in this paper that isn’t great. The resulting system is almost certainly something no one would want to use today. But yet in it are such interesting ideas.

It is often considered to be the paper that introduced aspect-oriented programming, but I believe that sells it a bit short. PILOT is an integrated, live, editing and computational system. Its goal is to allow programmers to 1) customize their interface and syntax, 2) edit programs they are unfamiliar with, 3) make changes not just to current functions in the program but to future ones as well, 4) control how the program itself executes, and so much more. It is a bit of a historical trip, but includes so many fascinating ideas.

Programming as Theory Building - Peter Naur

Naur lays out a view of the activity of programming that is radical and yet highly attractive. A key consequence of his theory is that the real end product of programming is not the source code, not the build artifact, not the running system, but the knowledge that a programmer builds. So much follows from this. It is a fascinating paper that I highly recommend reading.

The Structure of a Programming Language Revolution - Richard P Gabriel

A beautiful paper about the changes to programming language research that Gabriel has seen over his career. I will just leave you with this quote that sets the mood for the paper.

That night I pulled the paper down from the ACM server and read it while outside enormous puffed clouds dwelled overhead, lit from beneath by the town of Porto de Galinhas on the Brazilian coast; the smell of burning sugarcane and bitter ocean pushed into my room.

What are your favorites? What papers have really pushed you in a particular direction? What papers do you think people should read, even if you disagree with them? Whatever the criteria for the paper being good, I want to know about it.

Konrad Hinsen 2021-09-10 18:47:07

Beyond Programming Languages - Terry Winograd (1979)

Discusses higher-level programming which is less about algorithms and data structures and more about different views on a complex software system that can be manipulated by programmers.

Chris Rabl 2021-09-12 20:01:53

I'll second the Winograd paper! A few that have influenced my thoughts around programming tools are:

Jimmy Miller 2021-09-13 00:04:33

Konrad Hinsen

Just finished the Winograd paper. Thank you so much for that recommendation!

Chris Rabl

Love Beyond Being There. The others I have not read (though I am familiar with Racket). Thanks for the recommendations :) Look forward to diving in.