I've been meaning to look into it, so these are mostly notes, that others might find useful,
arenas are x86_64 only, not sure how involved are ports, but there's at least a VOP, that as of right now exists only on x86_64. (the build obviously fails on other systems)
I'm trying it on m1, so I can build with
arch -arch x86_64 sh make.sh --with-system-tlabs
and then run with
arch -arch x86_64 sh run-sbcl.sh
it's pretty raw, so for example allocating object larger than arena results in ldb. there's a lot of sample code in tests/arena.impure.lisp
and the simplest possible test code just to see what's up,
well, so at least we know the array is put somewhere. finally one calls (destroy-arena a), but you can still access the data, so presumably there's no checking involved, and being undisciplined about retaining arena pointers will create all kinds of interesting bugs.
This is always the tradeoff with arenas. Either the runtime keeps it safe and postpones actually destroying the arena if there are references leading to it. Or it trusts the programmer, which will eventually result in fun late-night debugging sessions. TA indicates that SBCL is rather of the former variety since it describes debugging tools to find references leading from the heap to the arena.
I guess it is best practice to bury arena management inside some sort of framework* and don't fiddle with them in business code.
I don't see why a language would need special support for arenas, unless it is GCed. It's a simpler alloction model than malloc, and C++ doesn't really have builtin support for malloc either, it just calls free() in a bunch of destructors in the STL.
In fairness std::allocator has always been a wreck, and in fairness RAII let’s you do a fairly low friction arena: but you run to the same set of problems that the C folks have with things as common as doubly linked lists. There are plenty of good ones, but every project has a different one.
Jai just throws you a bump-allocated heap arena with a (tunable) default bigger than 8MB that you can hit with idiomatic syntax in every basic block.
None of this is impossible in C or C++ or Rust or Java: but idiomatic defaults and notation matter.
Most per-method allocations should be pointer bumped in an arena (we’re already paying for page tables and a TLB, might as well get our money’s worth), and the languages just don’t encourage it.
> In fairness std::allocator has always been a wreck
std::allocator is just fine - it's no more and no less than a wrapper around malloc/free and that's what it really only needs to be. If you need a different allocation scheme that your workload is going to benefit from: (1) you first need to come up with it by finding out what scheme is going to benefit you, (2) you then implement it through a custom allocator.
> Most per-method allocations should be pointer bumped in an arena ... and the languages just don’t encourage it.
System malloc implementation will do that by themselves whenever they see fit. And this is not the programming language problem to begin with at all.
Pascal in its various variants had support for arenas already into the language.
C++ new isn't required to call malloc()/free() per ISO C++, underlying OS APIs can be called instead, there are the whole allocator infrastrcture, and placement new as well.
Google Flights came about, in particular, from the acquisition of ITA Software which was a Common Lisp shop. Much of that system is, reportedly, still in Common Lisp.
Based on the commit message [0], and the references to "user code" in this document, my guess is that user programs have or will have access, but it's not finalized enough to be documented.
That being said, I suppose if you're developing an internal API for a compiler/interpreter, your "users" could be other parts of the project rather than language users.
An interesting idea for sure. I spent too much time worrying about fragmentation on Linux and ended up with jemalloc (based on Firefox not the default).
I also had a local thread based pool I hand wrote for fixed sized objects that were generated en masse.
There's not much context here that explains why this is noteworthy. Arena allocation by itself isn't anything to write home about- is anyone able to explain why this is special?
The standard that you're describing is not really the standard that most HN'ers hold submissions to. The guidelines[1] suggest the following:
> On-Topic: Anything that good hackers would find interesting. That includes more than hacking and startups. If you had to reduce it to a sentence, the answer might be: anything that gratifies one's intellectual curiosity.
someone might find a boots-on-the-ground, in-context treatment of arena allocation particularly interesting, moreso than just reading about it in the abstract on wikipedia
someone might have never heard of anything of the sort
someone might be interested to see lispers present a use case where they care about reaching down to the finer points of memory allocation in a language that typically benefits from being above that; see also the wealth of info on the garbage collector in the ocaml documentation
the number of users on this website who might get some sort of kick out of this link is nonzero which is the generally agreed-upon criterion for being worth posting
I knew there'd be at least one pedant! And I almost altered what I wrote to pre-satisfy you!
Begone pedant, and feast thine eyes on the surreal numbers [0]!
> In mathematics, the surreal number system is a totally ordered proper class containing not only the real numbers but also infinite and infinitesimal numbers, respectively larger or smaller in absolute value than any positive real number.
(Also, go look at the `fil` unit in TeX for a good practical application of this ;).)
That Wikipedia article is confusing to me, because I’m really used to “region” being used to refer to type-system-level allocation tracking, Tofte&Talpin style (even if the T&T hope of inferring everything didn’t really work out, and the Rust tradition decided to rename the concept to “lifetimes”). “Arena” seems to be a more common term for a purely runtime construct.
i'm not sure how it's going to work in CL application code, but if you can re-use fixed memory and avoid runtime allocation with it like you can in C, it seems like it would be really nice for lisp game development, where avoiding GC pauses can be unintuitive or annoying
there have been other posts related to Arena based allocation on hn recently and I was grateful because it wasn’t something I was aware of. I found this interesting - I see this post is part of a trend.
Right, how is this top of HN? Something is off, as there is nothing newsworthy about this. If I submitted to HN a random documentation page from LWN or Linux kernel, it would surely not be #1.
There's nothing edifying about yucking other people's yums. If many others are upvoting something, and you don't see why it's interesting, maybe it's OK for the situation to be as simple as that: it's interesting to others even though it isn't interesting to you.
Also, there is absolutely no requirement that HN submissions all qualify as newsworthy. As others have pointed out in this thread, the gist of what's considered appropriate is quite simple and broad: "Anything that good hackers would find interesting."
arenas are x86_64 only, not sure how involved are ports, but there's at least a VOP, that as of right now exists only on x86_64. (the build obviously fails on other systems)
I'm trying it on m1, so I can build with
and then run with it's pretty raw, so for example allocating object larger than arena results in ldb. there's a lot of sample code in tests/arena.impure.lispand the simplest possible test code just to see what's up,
well, so at least we know the array is put somewhere. finally one calls (destroy-arena a), but you can still access the data, so presumably there's no checking involved, and being undisciplined about retaining arena pointers will create all kinds of interesting bugs.neat!