SAD theorem
As programs grow complex, you will be repeatedly forced to either:
maintain some State,
perform some computations Again,
or Duplicate some code.
Has anyone encountered a formulation like this in the literature?
S and A seem to relate to caching, the D talks about code, but if it talked about data it would definitely be about cache invalidation 😛
I'm only concerned about the code architecture. S and A are about caching only to the extent that all data structures are caches (a reasonable view)
I'm going to flesh out a concrete case study. But if y'all think of any papers that may be relevant please throw them here.
right now I can only think of "On the criteria to be used in decomposing systems into modules" by parnas
Maybe of interest:
NiCad Clone Detector https://www.researchgate.net/publication/221219568_The_NiCad_clone_detector
Framing Software Reuse https://www.amazon.ca/Framing-Software-Reuse-Lessons-World/dp/013327859X
My thoughts on RY as a Programming Primitive
https://guitarvydas.github.io/2021/05/24/RY-Repeat-Yourself-as-a-PL-Primitive.html
StateCharts
https://guitarvydas.github.io/2020/12/09/StateCharts.html
StateCharts II
https://guitarvydas.github.io/2021/02/25/statecharts-(again).html
we might say there was no such thing as "state" once upon a time. "state" is an illusion invented to decouple computation.
@taowen Spotted just now:
a filesystem is a kind of network protocol that allows for communicating across time
or an abstraction to generalize the past: no matter how many events happened in what order, we can always summarize them as the "same state", to simplify integration
Seen from a purely computational perspective, yes. But state is also a feature of our physical universe, and can thus be an important aspect of models of the physical world. Much of the OO vs. FP debate could be eliminated if both sides made a clear distinction between state as part of the model and computational state as an optimization technique.
S and A do seem like things that can be traded against each other, but I'm having a hard time seeing how D comes in. How does duplicated code trade against State? Is the idea that the branch of code you find yourself in implicitly encodes the state?
Say you perform some complex function (rendering some structured data to screen) and then need the result of a sub-computation (where did ___ get drawn?) You could either save some state during the computation (render) or redo a slice of it (a pretend-render function that duplicates some of the logic in render).
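A minimal sketch of that render example (all names here are my invention), showing the three options as code: remember positions while rendering (State), redo the layout slice on demand (Again), or re-state the layout rule inline in a pretend-render (Duplicate).

```python
def layout(items):
    # shared layout logic: each item is drawn 20px below the previous one
    return {item: 20 * row for row, item in enumerate(items)}

# S: maintain State -- save positions as a side effect of rendering.
positions = {}
def render_with_state(items):
    positions.update(layout(items))   # cache the sub-result for later queries

# A: compute Again -- redo a slice of the render on demand, storing nothing.
def where_drawn_again(items, item):
    return layout(items)[item]

# D: Duplicate -- a pretend-render that duplicates the layout rule inline.
def where_drawn_duplicated(items, item):
    return 20 * items.index(item)     # duplicated copy of the "20px per row" logic

items = ["a", "b", "c"]
render_with_state(items)
assert positions["c"] == where_drawn_again(items, "c") \
                      == where_drawn_duplicated(items, "c") == 40
```

All three answer "where did `c` get drawn?", and each drags in exactly one of S, A, or D.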
Perhaps higher-order functions will help here? I have a tendency to forget them after 2 years of programming in machine code.
Oh, reusing a HOF would correspond to doing some potentially complex computation again for the complex traversal logic. This reminds me of the Scrap Your Boilerplate papers from the Haskell world.
This feels very related to the primary thesis of Out of the Tar Pit
Chris Granger Thank you for bringing up Out of the Tar Pit! I've tried several times to appreciate it, and mostly failed to do so in the past[1]. Your nudge in this context got me to do so one more time, and I got a lot further. This time I went past the preliminaries in chapters 1-7 which had given me a lot of trouble before, and focused on their solution outline in chapters 8-10.
I don't think I'd ever noticed before that their FRP is not Conal Elliott's FRP! Functional Relational Programming, not Functional Reactive Programming. Did everybody else know this? This is the kind of stuff that reminds me that my brain is just a jumble of wires, and all my seeming insights are illusory. I'll call it a Moseley-Marks or MM system, just to get past the camouflage.
Have there been any attempts to build an MM system? Pretty much every FRP mention out there seems to be reactive rather than relational. The closest I got was systems inspired directly by Codd rather than MM: https://wiki.c2.com/?TutorialDee
Anyways, once I pruned away the stuff I was distracted by in the past, I see now a kernel of ideas that seem very useful. It's not essential vs accidental complexity that an MM system manages, because I'm skeptical of our ability to separate those categories, but rather functional invariants vs cross-cutting concerns:
- Base state consisting of immutable value types and relations between them (subsets of points in tuple space).
- Functions over value types.
- Derived relations that aren't needed to describe the problem, but useful to a specific implementation.
- Integrity constraints for base and derived relations.
- Hints on what to store, what order to store it, indexes, etc. A Prolog-like search strategy would fit in here, I think.
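A rough sketch of those ingredients (not from the paper; the domain and names are made up), using plain Python sets of tuples as relations:

```python
# Base state: immutable values and a relation between them
# (a subset of points in tuple space).
employs = {("acme", "alice"), ("acme", "bob"), ("globex", "carol")}

# A function over value types.
def normalize(name):
    return name.lower()

# A derived relation: not needed to state the problem,
# but useful to this specific implementation.
employee_count = {(co, sum(1 for c, _ in employs if c == co))
                  for co, _ in employs}

# An integrity constraint over base and derived relations.
assert all(n >= 1 for _, n in employee_count)

assert ("acme", 2) in employee_count
```

Storage hints and search strategy would then live outside this logical layer entirely.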
The critical new insight for me: this doesn't have to be an all-encompassing framework. Calling it Functional makes it hard to see that I can actually use the framework even in an imperative setting. Set up a phase of a program where it goes through deriving relations from the input, then query the relation store in various ways to create the desired output. Unlike properties like referential transparency, a little impurity here doesn't actually make it impossible to assess the remainder. I can imagine a fairly conventional language toolchain that adds a relvar
type, along with operations to insert into, query and clear relational variables. Use them in the "lumpy" parts of your program, where you're tempted to duplicate code or no obvious new abstractions present themselves. The toolchain could even give feedback in a complexity score every time it rebuilds a program. The only new domain-independent constraint: you can't mutate a value in the relation store.
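A minimal sketch of what that relvar type could look like, as a store of immutable tuples with insert, query and clear; the class and method names are my invention, not anything from the paper.

```python
class Relvar:
    """A dynamically typed store of immutable tuples, easy to query."""

    def __init__(self):
        self._tuples = set()

    def insert(self, *tup):
        # tuples are immutable, so nothing in the store can be mutated in place
        self._tuples.add(tup)

    def query(self, **where):
        # match by positional field, e.g. query(_0="alice") matches tuples
        # whose element 0 is "alice"; no kwargs returns the whole relation
        return {t for t in self._tuples
                if all(t[int(k[1:])] == v for k, v in where.items())}

    def clear(self):
        self._tuples.clear()

drew = Relvar()
drew.insert("alice", 10)
drew.insert("bob", 30)
assert drew.query(_0="bob") == {("bob", 30)}
drew.clear()
assert drew.query() == set()
```

Used in the "lumpy" phase of an otherwise imperative program: derive relations from the input, query them to produce the output, then clear.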
This framework feels enormously useful once I stop expecting it to be a silver bullet, and start thinking of it instead as a stepping stone to the right architecture. A dynamically typed store of global state that is easy to query. The problem with mutable global state is really just one of UX: it's too easy to get into situations where mutations get squirreled away where they're easy to forget. Creating immutable copies and local variables can lead to the same pathologies; they just tend to do so less often. Given the gradual nature of the benefits, requiring 100% purity to get any benefits feels like a bad trade. The MM system permits more graceful trade-offs.
[1] My opinion of it as of last week, mostly honed in the course of discussions here: https://lobste.rs/s/1yfrup/into_tar_pit#c_3ikuv9
I don't think I'd ever noticed before that their FRP is not Conal Elliott's FRP! Functional *Relational* Programming, not Functional Reactive Programming. Did everybody else know this?
nod
Have there been any attempts to build an MM system?
Last time I asked / researched this, I only heard/saw "no" beyond the example in the paper. I think a lot of folks (myself included) have made feints in that direction, before diverting off elsewhere for whatever reason.
It's not essential vs accidental complexity that an MM system manages, because I'm skeptical of our ability to separate those categories
Need one of those room-scale "💯" buttons I can jump up and down on.
I think that the Cell language is MM inspired. I'd guess heavily inspired, except that the author doesn't cite the tarpit paper. http://cell-lang.net/
Many versions of Eve were essentially MM systems that tried out different points in the implementation space. From pure log + view to various forms of mutable.