Maybe we should stop forcing async-bytecomp-package-mode onto users and maintainers #139

@tarsius

And by we I mean @thierryvolpiatto and me, and anyone else who adds something like the following to some of their packages:

(and (require 'async-bytecomp nil t)
     (let ((pkgs (bound-and-true-p async-bytecomp-allowed-packages)))
       (if (consp pkgs)
           (cl-intersection '(all magit) pkgs)
         (memq pkgs '(all t))))
     (fboundp 'async-bytecomp-package-mode)
     (async-bytecomp-package-mode 1))

async-bytecomp-package-mode advises the relevant parts of package.el to use a separate Emacs instance to compile packages. This makes it almost impossible for an already loaded version of some package to leak into the byte-code of the newer version when package.el builds the new version in an Emacs session in which the old version has already been loaded.

Back when async-bytecomp-package-mode was created, package.el did not try to prevent such leakage at all, but nowadays it tries to accomplish the same by first unloading the old version of the package. This is an improvement, but it does not actually guarantee that nothing leaks; it just makes it less likely, and that simply is not good enough.

Bugs that result from old versions of a package leaking into the byte-code of the new version are extremely difficult to debug. They are always heisenbugs because the code that causes the bug does not actually exist in any version of the package.
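To make the leakage concrete, here is a minimal sketch that simulates it inside a single session. The names demo-greeting and demo-caller are hypothetical and stand in for a macro and a function from two versions of some package; nothing here is taken from package.el or async-bytecomp itself.

```elisp
(require 'bytecomp)

;; The "old version" of the package is loaded and defines this macro.
(defmacro demo-greeting ()
  "old")

;; A function from the "new version" is byte-compiled while the old
;; macro is still the one that is loaded, so the old expansion is
;; baked into the resulting byte-code.
(defalias 'demo-caller
  (byte-compile '(lambda () (demo-greeting))))

;; Only now does the new definition of the macro get loaded.
(defmacro demo-greeting ()
  "new")

;; The compiled function still returns "old", although the macro
;; definition that is now in effect says "new".
(demo-caller)
```

A real case is messier than this, because the stale expansion ends up in an .elc file on disk and only goes away once that file is recompiled.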

If one maintains a handful of not all that popular packages, then that is probably okay. You might run into this issue without realizing it and that is okay--one mystery bug per year is okay. Similarly if you maintain Emacs but not any separate packages, then you won't have to deal with this issue much either.

But if you maintain Helm or Magit, then this happens all the time. And it gets really really frustrating to deal with bugs that simply are not possible because there just isn't any code that could possibly be doing what the user claims it is doing. For every bug report you have to keep in the back of your mind that this could be yet another instance of leaked older implementations.

As far as I remember, the Emacs maintainers considered such leakage mostly a theoretical issue, but as the maintainer of a very popular package I experienced it very often, and so I felt ignored and neglected.

We should now avoid doing something similar to other maintainers and the users of their packages.

There have been reports for a while now about what I call mystery bugs, which may very well be caused by our kludges that are supposed to prevent (a different class of) mystery bugs. E.g. #108, magit/with-editor#85 and magit/magit#2404 (comment).

I for one plan to remove the async kludges from my packages, at least for a while. Who knows, maybe the unloading kludge used by package.el now actually is good enough.

Instead I will distribute with Magit a tool for recompiling Magit and its dependencies and always ask users to run that when they report a bug that I cannot immediately reproduce and that sounds even just slightly fishy. It's a bit sad.
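As a rough idea of what such a recompilation step could look like, here is a minimal sketch built on the stock byte-recompile-directory function. The helper name demo-force-recompile is hypothetical; this is not the tool mentioned above, and it handles only a single directory rather than a package plus its dependencies.

```elisp
(require 'bytecomp)

(defun demo-force-recompile (directory)
  "Recompile every Emacs Lisp file in DIRECTORY, ignoring timestamps.
Hypothetical helper for illustration only."
  ;; ARG 0 compiles files that have no .elc yet without asking;
  ;; FORCE t recompiles files whose .elc already exists, even if
  ;; the .elc looks up to date.
  (byte-recompile-directory directory 0 t))
```

Assuming the library can be found on the load-path, it could be called as (demo-force-recompile (file-name-directory (locate-library "magit"))). To avoid picking up stale definitions again, this would presumably best be run in a freshly started Emacs.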

The removal of the async kludges from my packages is experimental. If this causes a bunch of mystery bugs, then I will have to revert.

/cc @jwiegley @Malabarba @Stebalien
