Collaborating #1

Open
SwiftLawnGnome opened this issue Sep 24, 2019 · 6 comments

Comments

@SwiftLawnGnome

Hello,

Several months ago I forked this repo and started working on it. After some additions, I found myself unable to work with such a large codebase with which I was so unfamiliar, so I decided to try starting from scratch, borrowing some pieces from your project. I expected this to be something I played with for a week then forgot about, but I ended up spending a great deal of my free time working on it, and have made a lot of progress. I had assumed this project was abandoned so I didn't feel the urgent need to contact you, but now that I see you're still maintaining this project I'd like us to collaborate.

Some of the more significant things my version has (not sure which of these are also in yours):

  • true (distinct from cl:t), false, and empty list (both distinct from cl:nil)
  • persistent lists
  • persistent vectors (based on your implementation), and persistent maps and sets (based on cl-hamt)
    • all are funcallable
  • IMeta, IHashEq, IReference and many other interfaces from Clojure's java source
    • Currently all implemented with cl:defclass, ideally these would be defined with defprotocol
  • destructuring, both for maps and sequences
  • loop, but using labels so that TCO is performed by the compiler, and no check is performed to ensure it's used properly
    • Note: I have toyed with a version using tagbody, which works fine, but also doesn't confirm that recur is in tail position
  • fn, supporting multiple arities, a name, and recur within the bodies
    • normally a regular function is returned, but using with-meta on it upgrades it to a Fn.
  • for (producing an actual lazy-seq)
  • A readtable with map, set and vector literals, #_ to discard next sexp, @ to deref, ^ for metadata, #() for functions
    • This readtable is a bit hacky, but I'm also testing one that parses forms into clojure data structures, and supports syntax-quote (including ~@ with any Seq, and auto-gensym# syntax). It seems to work pretty well.
  • reduce, transduce, and many, if not most, transducers from Clojure.core implemented
  • Atoms, Namespaces, Vars
    • Not tested in depth. Vars are probably in the most complete state of the three.
    • Vars are funcallable, and #' syntax for vars works, because def sets the symbol's value a la cl:defvar, and sets the symbol's function to a Var. The Var's funcallable-instance-function just applies its value to the arguments.
  • A lot of scaffolding for Agents
    • If I remember correctly they do work fairly well, but the STM has to be developed before moving forward
  • Symbols and Keywords, distinct from CL keywords and symbols
    • this part's really tricky, and my glue code between the two is very shoddy
  • = (it works okay, still plenty of work to be done)
  • Pretty printing vectors, sets, maps, namespace qualified Vars, Symbols and Keywords, etc.
  • A ton of other clojure.core functions and macros

I think that's most of the major developments. Some major roadblocks I've realized while working on this:

  • Atoms. The only free/libre CL implementations that support CAS on struct/object slots are SBCL and ECL. This means that these are probably the only dists that can be targeted
    • Additionally, ECL (somehow) seems to not support timeouts in any form (needed for deref), so it may also not be a viable platform, leaving just SBCL
  • variadic functions. In CL, the &rest arg must be a proper list. In Clojure, an & arg can contain any ISeq (from what I can tell). Converting every ISeq to a proper list won't do either, because in Clojure you can apply functions to infinite lazy sequences.
  • monitors. every Object in java, and consequently all data in Clojure, can function as a monitor, and the locking macro can be used to synchronize code around an object. I have no idea how to make (locking 3 ...) work in CL.
  • thread local bindings. From what I can tell, in SBCL the only way to establish thread local bindings is via a let in the thread's function, but documentation on this is practically non-existent. push-thread-bindings and co may need to be manually implemented with hash-tables/maps or similar.
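
To make the CAS requirement in the first roadblock concrete, here is a minimal SBCL-only sketch of an atom-like box with a swap! style update. `box`, `make-box`, and `swap-box` are hypothetical names chosen for illustration and are not taken from either project; `sb-ext:compare-and-swap` is SBCL's own CAS operator, and the sketch assumes the struct slot is declared with type `t`.

```lisp
;; SBCL only. BOX and SWAP-BOX are hypothetical names for illustration.
(defstruct (box (:constructor make-box (value)))
  (value nil :type t))

(defun swap-box (box f &rest args)
  "Atomically replace BOX's value with (apply F old-value ARGS), retrying on
contention. F may be called more than once, so it should be side-effect free."
  (loop
    (let* ((old (box-value box))
           (new (apply f old args)))
      ;; COMPARE-AND-SWAP returns the value it found in the slot;
      ;; the swap succeeded only if that is the value we read.
      (when (eq old (sb-ext:compare-and-swap (box-value box) old new))
        (return new)))))

;; Usage
(defparameter *counter* (make-box 0))
(swap-box *counter* #'+ 5)
```

The retry loop is the same shape Clojure uses for `swap!`; on implementations without slot-level CAS, a lock around the read-modify-write would be the obvious (slower) fallback.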

If you'd like to merge the projects then let me know and I can put my repo on GitHub and we can figure out how to fit the pieces together (but I must warn you that I'm very busy and generally suck at responding)

Thanks

@joinr
Owner

joinr commented Sep 24, 2019

Thanks for the note. Very glad that someone was able to get some use out of this hobby project, and also very glad that you made progress.

> I expected this to be something I played with for a week then forgot about, but I ended up spending a great deal of my free time working on it, and have made a lot of progress.

That sounds very familiar. I typically dust this off once a year as a side-project. This year has had significantly more progress, as well as a change in strategy. Rather than trying to implement all of Clojure, I'm just targeting "enough" to get tools.analyzer and tools.reader bootstrapped. This still means implementing a bunch of the prior core functionality (protocols and friends), so I was able to repurpose most of what I'd experimented with and drive on.

> true (distinct from cl:t), false, and empty list (both distinct from cl:nil)

This is a pending issue which I plan to cover with a custom emitter via tools.analyzer.

> persistent lists

To date, I just extended the seq protocol to cons and null, and don't provide mutable interfaces to them. It's persistent, unless you decide to setf your own. So, bootstrapping-wise, CL lists are persistent lists. I do have a separate implementation for lazy sequences though now.

> persistent vectors (based on your implementation), and persistent maps and sets (based on cl-hamt); all are funcallable

I bootstrapped a simple copy-on-write map, since all of these are defined in cljs.core using native clojure. I could've done the same cow-based implementation for bootstrapping, but I was actually learning how the pvector was implemented at the time along with CL, so it's in there. I think CL-HAMT came along after I'd started. Definitely interested in the implementation.

> IMeta, IHashEq, IReference and many other interfaces from Clojure's java source. Currently all implemented with cl:defclass, ideally these would be defined with defprotocol

I have these all implemented via protocols (ported from cljs.core). The only current hangup I have is getting variadic protocols working, but that's a skip away. I'm finding I strongly prefer the defprotocol approach where possible. My implementation has them as structures specifying which types conform, but the implementation is technically normal CLOS generic functions. They are, however, much much nicer to extend to types (you have bundled functionality vs. a diaspora of functions).
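
As a rough illustration of the idea described above (a protocol as a structure that records which types conform, with the methods being ordinary CLOS generic functions), here is a minimal sketch. It is not the code from this repo: `defprotocol*`, `extend-type*`, `satisfies-p`, and the `icounted` example are all hypothetical names, and only single-arity methods are handled.

```lisp
;; Hypothetical sketch; none of these names come from the actual project.
(defstruct protocol name methods (implementors '()))

(defmacro defprotocol* (name &body signatures)
  "Each signature is (method-name (arg ...)). Defines one generic function per
method and binds NAME to a PROTOCOL structure describing them."
  `(progn
     (defparameter ,name
       (make-protocol :name ',name :methods ',(mapcar #'first signatures)))
     ,@(loop for (method lambda-list) in signatures
             collect `(defgeneric ,method ,lambda-list))
     ',name))

(defmacro extend-type* (type protocol &body impls)
  "Each impl is (method-name (this arg ...) body...). The first argument is
specialized on TYPE, and TYPE is recorded as an implementor of PROTOCOL."
  `(progn
     (pushnew ',type (protocol-implementors ,protocol))
     ,@(loop for (method (this . more) . body) in impls
             collect `(defmethod ,method ((,this ,type) ,@more) ,@body))
     ',type))

(defun satisfies-p (protocol object)
  "True if OBJECT is of some type that was extended to PROTOCOL."
  (some (lambda (type) (typep object type))
        (protocol-implementors protocol)))

;; Usage
(defprotocol* icounted
  (-count (coll)))

(extend-type* list icounted
  (-count (coll) (length coll)))

(extend-type* vector icounted
  (-count (coll) (length coll)))

(-count '(1 2 3))            ; dispatches to the LIST method
(satisfies-p icounted "abc") ; strings are vectors
```

The bundling is the point: one `extend-type*` form groups every method a type needs for a protocol, instead of scattering `defmethod` forms.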

> loop, but using labels so that TCO is performed by the compiler, and no check is performed to ensure it's used properly. Note: I have toyed with a version using tagbody, which works fine, but also doesn't confirm that recur is in tail position

I just came up with this the other day due to the presence of loop/recur in some core sequence functions (functions I'd like to use for bootstrapping convenience, and later delegate to via the emitter from the analyzer). I actually started going the labels route, but ran into what I "thought" was a blown stack at one point in SBCL. Might've been due to not being compiled via compile-file (I think you have to do that to get TCO to kick in), as I was testing live at the repl. I just wrote my own implementation on the spot (learned a bit in the process), and use tagbody/go to get the job done. Seems to work for now, although I'd rather delegate more to the compiler. I think it will be sufficient for bootstrapping, but probably not something I'd bet my salary on.
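
For readers following along, the tagbody/go approach can be sketched roughly as below. This is a hypothetical reconstruction, not the implementation from either project: the macro name `with-recur` is borrowed from the discussion, but its argument shape here is an assumption. As noted in the thread, nothing checks that `recur` is in tail position.

```lisp
(defmacro with-recur ((&rest bindings) &body body)
  "Hypothetical sketch of a Clojure-style loop. BINDINGS is ((var init) ...).
Inside BODY, (recur new-val ...) reassigns the variables and jumps to the top."
  (let ((vars (mapcar #'first bindings))
        (top (gensym "TOP"))
        (done (gensym "DONE")))
    `(let* ,bindings                    ; Clojure's loop binds sequentially
       (block ,done
         (tagbody
            ,top
            (return-from ,done
              (macrolet ((recur (&rest args)
                           ;; PSETQ assigns all loop variables in parallel,
                           ;; then GO jumps without growing the stack.
                           `(progn (psetq ,@(mapcan #'list ',vars args))
                                   (go ,',top))))
                ,@body)))))))

;; Usage
(defun factorial (n)
  (with-recur ((i n) (acc 1))
    (if (<= i 1)
        acc
        (recur (1- i) (* acc i)))))

(factorial 10)
```

Because the jump is an explicit `go`, this does not depend on the compiler performing tail-call elimination, which is the trade-off against the labels-based version.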

> destructuring, both for maps and sequences

Very interested in this. I was going to just cop the clojure implementation of destructuring once I had enough of the substrate implemented (close enough), but if you've got something pre-made, sounds nice.

> fn, supporting multiple arities, a name, and recur within the bodies; normally a regular function is returned, but using with-meta on it upgrades it to a Fn.

Curious to see your implementation here. I keep wondering if there's a better way to define these outside of counting args and dispatching. Clojure jvm has runtime support for arg dispatch, and uses constructors with custom arity (I think). It probably doesn't matter for bootstrapping though. After getting tail calls in, I'm about to shove the with-recur form into all of the lower level lambda* implementations to enable default function recur points. This will propagate to fn as well (since it uses lambda*). I don't have names in fn forms yet, nor meta.
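
The "counting args and dispatching" approach mentioned above might look like the following minimal sketch. `fn*` is a hypothetical name, not the macro from either project, and the sketch covers fixed arities only (no `&` rest parameter, no name, no metadata).

```lisp
(defmacro fn* (&rest clauses)
  "Hypothetical sketch: each clause is ((param ...) body...), fixed arity only.
Returns a plain function that dispatches on the number of arguments."
  (let ((args (gensym "ARGS")))
    `(lambda (&rest ,args)
       (case (length ,args)
         ,@(loop for (params . body) in clauses
                 collect `(,(length params)
                           (destructuring-bind ,params ,args ,@body)))
         (t (error "Wrong number of args (~D) passed to fn"
                   (length ,args)))))))

;; Usage
(defparameter *add*
  (fn* ((x) x)
       ((x y) (+ x y))
       ((x y z) (+ x y z))))

(funcall *add* 1 2)
```

Consing a `&rest` list on every call is the obvious cost of this design; a more serious version would presumably generate a specialized lambda list per arity.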

> for (producing an actual lazy-seq)

I think this was on the short-term target for the sequences package, although I hadn't ported it yet. I was looking at just porting clojure.core's implementation (since I have enough to parse it I think).

> A readtable with map, set and vector literals, #_ to discard next sexp, @ to deref, ^ for metadata, #() for functions
> This readtable is a bit hacky, but I'm also testing one that parses forms into clojure data structures, and supports syntax-quote (including ~@ with any Seq, and auto-gensym# syntax). It seems to work pretty well.

Interested in this also. Very interested in how you approached the "Eval" problem I ran into with clojure literals. I had to hack SBCL's eval, along with the aforementioned reader macros, to get things working correctly (e.g. for supporting data literals as arbitrary syntax objects for macros and the like).
Most of this is convenience for bootstrapping though; I think implementing a full reader via parsing tools.reader, and emitting CL, ends up getting effectively a native clojure reader in CL, which ultimately obviates the need for reader macros and the like (just use the clojure read implementation). Still, for porting as much as possible in the bootstrap process, this is very useful.

> reduce, transduce, and many, if not most, transducers from Clojure.core implemented

I got a naive reduce baked off in a half hour. Mine is compatible with CLOS sequence types, and extends the reduced? semantics to them as well (ended up being tricky). Again, mostly for immediate utility in bootstrapping. A full transducer implementation (as well as reducers e.g. fold and the like) is definitely appealing. Curious to see implementation.
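
A naive reduce of the kind described above, with early termination, can be sketched in a few lines. This is an illustration only, not the implementation from this repo: `reduce*` and the `reduced` box are hypothetical names, and the sketch handles only standard CL sequences (lists and vectors), not lazy or user-defined sequence types.

```lisp
;; REDUCED wraps a value to signal "stop here", as in Clojure.
(defstruct (reduced (:constructor reduced (value)))
  value)

(defun reduce* (f init coll)
  "Left fold over any CL sequence; stops early when F returns a REDUCED box."
  (let ((acc init))
    (block done
      (map nil
           (lambda (x)
             (setf acc (funcall f acc x))
             (when (reduced-p acc)
               ;; Unwrap and exit the traversal immediately.
               (return-from done (reduced-value acc))))
           coll)
      acc)))

;; Usage
(reduce* #'+ 0 #(1 2 3 4))

;; Sum elements until one exceeds 2, then stop with the running total.
(reduce* (lambda (acc x) (if (> x 2) (reduced acc) (+ acc x)))
         0
         '(1 2 3 4 5))
```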

> Atoms, Namespaces, Vars
> Not tested in depth. Vars are probably in the most complete state of the three.
> Vars are funcallable, and #' syntax for vars works, because def sets the symbol's value a la cl:defvar, and sets the symbol's function to a Var. The Var's funcallable-instance-function just applies its value to the arguments.

Very nice. This was an early area I struggled with, and resorted to "living" with CL as a primary host, although the long-term solution is exactly what you described. One thing I had thought about, but hadn't figured out, is implementing namespaces. To the max extent possible, I'd thought to try to integrate them with the existing package system, or provide really seamless interop (say you want to refer to something in Clojure from Common Lisp). The way CL deals with packages and symbol interning is different from Clojure (as I recently found out when implementing the tail call stuff), and I haven't explored the depth to which they can be reconciled. The aforementioned strategy of porting the analyzer and reader doesn't do anything to help here...

> A lot of scaffolding for Agents
> If I remember correctly they do work fairly well, but the STM has to be developed before moving forward

Good deal, very interested. This is one area that cljs doesn't bring anything to bear on and you need a solid host implementation. I'd assumed CL-STM would be decent, but am relatively ignorant there.

> Symbols and Keywords, distinct from CL keywords and symbols
> this part's really tricky, and my glue code between the two is very shoddy

Concur on the tricky part. Curious to see how you handled it.

> = (it works okay, still plenty of work to be done)

Also curious to see how this works. I think, like the JVM implementation, there are lots of wrappers required around the host system (e.g. for numerics) along with a nice implementation of the Equiv protocol. cljs.core brings in IEquiv though.

> Pretty printing vectors, sets, maps, namespace qualified Vars, Symbols and Keywords, etc.

Yeah, that's another animal; although I've got a chapter on that in the Common Lisp Recipes book by Weitz.

> A ton of other clojure.core functions and macros

Good deal. Curious to see what's there.

> Atoms. The only free/libre CL implementations that support CAS on struct/object slots are SBCL and ECL. This means that these are probably the only dists that can be targeted
>
> Additionally, ECL (somehow) seems to not support timeouts in any form (needed for deref), so it may also not be a viable platform, leaving just SBCL

I have no qualms about targeting SBCL. ECL would be nice, although not necessary for my goals. It may be that you end up going the CLJS route for anything except for SBCL (or else define a custom implementation somehow...I'm not smart enough on the CAS stuff to do so competently, but could surely start an amateur implementation).

> variadic functions. In CL, the &rest arg must be a proper list. In Clojure, an & arg can contain any ISeq (from what I can tell). Converting every ISeq to a proper list won't do either, because in Clojure you can apply functions to infinite lazy sequences.

Interesting. I'd think some macro conversions would help here; perhaps I haven't gotten far enough for this to be a problem. I can "imagine" either hacking the internal implementation, or finding a work around with some janky intermediate representation. On the other hand, if you've got funcallable objects emitted by the clojure analyzer, it may not be a problem at all.

> monitors. every Object in java, and consequently all data in Clojure, can function as a monitor, and the locking macro can be used to synchronize code around an object. I have no idea how to make (locking 3 ...) work in CL.

Interesting. This is deep magic to me (lack of familiarity with the JVM's memory/synch model and internals). Sounds like more work-arounds or implementation hacking. I have no idea either.

> thread local bindings. From what I can tell, in SBCL the only way to establish thread local bindings is via a let in the thread's function, but documentation on this is practically non-existent. push-thread-bindings and co may need to be manually implemented with hash-tables/maps or similar.

Yeah, I hadn't even gotten to this consideration. There are a lot of features in SBCL that only pop out if you read the source :) (I'm learning that). May be worth asking the brains in sbcl or lisp reddit at some point.

> If you'd like to merge the projects then let me know and I can put my repo on GitHub and we can figure out how to fit the pieces together (but I must warn you that I'm very busy and generally suck at responding)

I'd encourage you to put your repo up regardless (given the high latency) so others can benefit from your work like you did mine. Certainly interested in collaboration, so far as I can fit things into my revised strategy. Rather than implementing everything in CL via reader macros, hacking eval, etc., the plan is to implement just enough to get tools.analyzer and tools.reader working well enough to compile cljs.core (likely a modified version), and then get a REPL going. I figure this way, things like testing and other stuff are already out there and written. Taking advantage of a majority clojure-in-clojure implementation also seems like a win, with the ability to benefit downstream from upstream improvements to the reader and analyzer (which both seem to be maintained). A secondary emergent goal is to further refine the porting process, and make it even easier in the future. I figure getting CL carved out paves the way for distilling the absolute bare minimums for reading and analyzing. That then opens the door for additional targets (hopefully simpler) like the various Schemes, and even a Julia target.

There's another person @pangloss interested who recently reached out too. There may be a budding community effort, which I find surprising :)

@arichiardi

arichiardi commented Nov 3, 2019

Just stopped by to say that this is very cool and indeed interesting. The Clojure API is just too good to be left on the JVM only.

I am quite a Common Lisp newbie, but I'm slowly but surely learning, and maybe at some point I will understand enough to review some PRs 😉

@joinr
Owner

joinr commented Nov 3, 2019

Always glad to see another interested party. We apparently have maybe 4 people at this point who have actively reached out/discussed. I'm still doing this off/on, but trying to ramp the pace up from annual to monthly (or faster, depending on progress). I'd like an end-of-year goal to be a working reader and analyzer/compiler. The current minor obstacle is rewriting protocol / type definitions to allow non-implemented arities for multiple-arity protocol functions. Multimethods follow (should be very easy though). After that...I should have most if not all of the cljs.core standard lib available and parsing (a significant proto-clojure), and from there getting the reader and analyzer seems very feasible (perhaps with some minor modifications/simplifications).

@joinr
Owner

joinr commented Nov 27, 2019

Fyi, cloture popped up on reddit the other day. Looks like a pro CL bubba, taking a different approach. I think the focus there is embedding Clojure in CL; the bidirectional calling stuff looks interesting.

@arichiardi

Yep, I have noticed that, but I have to say that all that uppercase vs. lowercase stuff feels a bit painful 😄

Maybe there is some actual limitation there that cannot be worked around though.

@joinr
Owner

joinr commented Nov 28, 2019

I think the approach cljs took is worth emulating on the clj side (expose a common-lisp namespace for access to sbcl maybe, or wire up the imports and requires to be aware of asdf). Also curious about the possible (albeit rare) name collisions with that approach. The file integration with asdf is nice (I'm unfamiliar there). Also, exposing cl evaluation via a reader form (like #js) could be nice (hopefully not necessary though?). I also noticed a bunch of reader conditionals for cl there. I'd like to avoid that if possible, like the declare-keywords function. I think there's a way around that.


3 participants