Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement clang resource directory logic #4665

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

the-horo
Copy link
Contributor

Rationale: On linux distributions the location of the compiler-rt
libraries varies quite a lot:

  • On Gentoo they are in: /usr/lib/clang/<version>/lib/linux
  • On Ubuntu and Debian: /usr/lib/llvm-<major_version>/lib/clang/<version>/lib/linux
  • On Fedora: /usr/lib/clang/<version>/lib/x86_64-redhat-linux-gnu
  • On Arch: /usr/lib/clang/<version>/lib/linux
  • On Nix: /nix/store/<sha>/lib/linux

Additionally, when using a llvm dist tarball the libraries would be in:

  • <LLVM_LIB_DIR>/clang/<version>/lib/<triple>

Out of all the distributions above only Arch properly configures ldc
to be able to link compiler-rt libraries but they require patching to
make it work: PKGBUILD#L37

This problem has tried to be solved in pr#4659 but the implementation
can be made better. It's main fault is that it would embed the whole
path to the library directory making it require more configure
options (specifying the os/triple manually) and it has no runtime
support for changing it.

This implementation tries to make the situation better by allowing to
specify the clang resource directory, which is the path up to the
clang/<version> component either by:

  • the --clang-resource-directory cli option
  • the -DCLANG_RESOURCE_DIR_BASE cmake option which takes in the path
    up to clang/, without <version>. This value defaults to
    <LLVM_LIB_DIR>/clang.

The reason behind CLANG_RESOURCE_DIR_BASE not taking the version is to
make configuration easier, since in most cases <version> needs to be
appended and it can be easily calculated in the cmake file but it may
be harder to do so when invoking cmake itself. This means that Nix
would be left out in favor of other Unixes, and its solution would be
to add --clang-resource-directory to the default ldc switches.

The code for calculating the compiler-rt libs' filenames has also been
brought closer to Clang's, it is now able to deduce the correct
filenames for x86 and android and it is able to calculate the <os>
or the <triple> field in the final path name.

The list of locations in which libraries are searched is now:

  • libldc_rt.asan.a in every configured libdir.

  • libclang_rt.asan.a in every configured libdir.
    This is how libraries are named if they are in the <triple>
    sub-directory.

  • libclang_rt.asan-<arch>.a in every configured libdir.
    This is how libraries are named if they are in the <os>
    sub-directory.

  • <CLANG_RESOURCE_DIR>/<triple>/libclang_rt.asan.a

  • <CLANG_RESOURCE_DIR>/<os>/libclang_rt.asan-<arch>.a

  • <LLVM_LIB_DIR>/clang/<version>/lib/<os>/libclang_rt.asan-<arch>.a
    This form is for backwards compatibility. It is replaceable by
    -DCLANG_RESOURCE_DIR_BASE=<LLVM_LIB_DIR>/clang which is the
    default anyways.

Paths that are no longer looked up:

  • clang/<full_version>/lib/<os>/libclang_rt.asan-<arch>.a in every
    configured libdir. This path is wrong with recent versions of
    LLVM (>=16) since they use only the major version component instead of
    the full version. This happens to be the way Arch get its paths
    because LLVM_LIB_DIR is the same as the ldc2 one (the one in the
    config file). They should not be affected as CLANG_RESOURCE_DIR_BASE
    defaults to LLVM_LIB_DIR/clang and LLMV_LIB_DIR/clang is still
    searched.

This change mainly focuses on Linux but other OS-es should either not
be affected at all or receive improvements in the calculations of
<arch>.

Another small advantage of this implementation is being able to
cross-compile easier when using compiler-rt libraries as the
--clang-resource-directory switch can be passed on the command line
instead of editing ldc2.conf to reference the directory, though users
of this feature are small in number or non-existent, probably.

@the-horo the-horo changed the title Compiler rt Implement clang resource directory logic May 20, 2024
@kinke
Copy link
Member

kinke commented May 20, 2024

Without looking at the diff yet, I really hate all of this logic. I'm a fan of LDC normalizing this crap at CMake time, with its own renamed libldc_rt.* (no platform suffix!) libs in the lib directory, next to druntime, Phobos etc. For a distro, I take it that's not really nice, at least the duplication of these libs. I would have hoped that symlinks would be an option.

@the-horo
Copy link
Contributor Author

I'm a fan of LDC normalizing this crap at CMake time

This is mostly passing along the complexity for determining the correct suffixes/paths to the user. I guess this is overall easier then trying to reimplement all the logic from Clang but still.

I would have hoped that symlinks would be an option.

I can't speak for other distros but this would have a negative effect in Gentoo.

The current way a dependency on compiler-rt would be implemented (it is not yet due to some bugs, see below) is through the use of optfeature. What it means is that ldc can use compiler-rt if they are present, so the package manager doesn't enforce any relationship between them. From the point of the user this translates in:

  • I have ldc and compiler-rt installed => I can use ldc with compiler-rt
  • I have ldc installed but not compiler-rt => I can use ldc but I can't use compiler-rt related functionality.

You also have the perks of being able to uninstall or reinstall compiler-rt without touching ldc and everything will still work. This is kinda relevant since, on Gentoo, you can select which compiler-rt libraries you want installed so you would be able to toggle them on and off. I would guess that there won't be any users who would ever hit this but having ldc not depend on compiler-rt is very nice.

Otherwise, if cmake were to symlink the files at build time, ldc will now depend on compiler-rt. The problem with this is that I also need to specify which portions of compiler-rt ldc depends on and, since cmake would try to symlink every library, I have to specify every library. However, not every architecture has access to all compiler-rt libraries so I wouldn't be able to install ldc on those architectures. Technically, there is another solution in which you would have ldc include every configuration (library) of compiler-rt in its configuration. This adds way too much complexity so it's not viable either.

Finishing my rant about what is ideal for Gentoo, I think we could tone down the complexity a little.

The main change I wanted implemented and what prevented me from support compiler-rt in Gentoo is:

- return triple.getArchName();
+ return llvm::Triple::getArchTypeName(triple.getArch()).str();

Because the top line generates i686 on x86 but the compiler-rt libs are suffixed with i386. x86_64 and aarch64 work as expected which are the only other 2 arches that ldc is supported on currently.

Looking at the other distributions the only other relevant tweak is:

    if (triple.getArch() == llvm::Triple::arm || triple.getArch() == llvm::Triple::armeb)
        return (floatABI == llvm::FloatABI::Hard && !IsWindows)
	    ? "armhf"
	    : "arm";

Which fixes the suffix on debian arm and armeb, I don't know how many users use ldc there.

If it were entirely by me it would be sufficient for ldc to look for (with the change above for <arch>):

  1. libldc_rt*<no_arch>
  2. libclang_rt*<arch>
    In every configured libdir.

Since Fedora doesn't use <arch> for compiler-rt and it probably won't add too much complexity you could also do:

  1. libclang_rt*<no-arch>

  2. Have an option for specifying /usr/lib/clang/<version>/<whatever> and embed it in the config file (already implemented). I could make do without this but I prefer not having to sed the config file.

@kinke
Copy link
Member

kinke commented May 20, 2024

This is mostly passing along the complexity for determining the correct suffixes/paths to the user. I guess this is overall easier then trying to reimplement all the logic from Clang but still.

Yeah, mainly to package manager maintainers, so that they might likely need to tweak it to their particular compiler-rt layout on their distro, without having to hardcode brittle look-up logic in the compiler. It's IMO already a shame that the filenames (and distro package names?) are libclang*, not more generic for all LLVM-based compilers. ;)

The current way a dependency on compiler-rt would be implemented (it is not yet due to some bugs, see below) is through the use of optfeature. What it means is that ldc can use compiler-rt if they are present, so the package manager doesn't enforce any relationship between them.

Thx for elaborating on this, I wasn't aware of that (pretty cool) optfeature stuff.

I was assuming that a distro LDC package would simply depend on the compiler-rt package (or the supported subset if split up into multiple packages - the libs aren't huge, are they?), and the LDC package then e.g. being free to symlink the original libraries. Not necessarily/likely during CMake, but potentially as some external post-processing step in the package recipe. The compiler-rt libs at CMake time are mainly handy for the tests, to circumvent again brittle lit logic trying to find distro-provided libs (so we have such logic in CMake, compiler and lit scripts at the moment...).

@the-horo
Copy link
Contributor Author

I was assuming that a distro LDC package would simply depend on the compiler-rt package (or the supported subset if split up into multiple packages - the libs aren't huge, are they?)

So far only Arch added a dependency. What I meant about the Gentoo dependency I was practically referring to having to write:

DEPEND="sys-libs/compiler-rt-sanitizers[asan,libfuzzer,lsan,msan,profile,tsan,xray]"

To require that all those libraries are present. This will only work on amd64, on x86 I would need to do:

DEPEND="sys-libs/compiler-rt-sanitizers[asan,libfuzzer,lsan,profile]"

The problem is that I'm not allowed to dynamically alter dependencies, so I need to pick one. It's the same compiler-rt package, compiled in a certain way.

and the LDC package then e.g. being free to symlink the original libraries. Not necessarily/likely during CMake, but potentially as some external post-processing step in the package recipe.

I could perform the symlinking as a step in the build instead of doing it in cmake but I'm still left with the same problem. The things that ldc installs are dependent on the compiler-rt libraries that existed at build time.

The compiler-rt libs at CMake time are mainly handy for the tests

I can afford not to run those tests so I won't suffer from not having the compiler-rt libraries available at build-time.

to circumvent again brittle lit logic trying to find distro-provided libs (so we have such logic in CMake, compiler and lit scripts at the moment...).

While I do want to have more support for platforms I don't want the code to become too complex and have to treat every possible esoteric setup. To me it seems reasonable to have the implementation described above, allow the user to specify the libdir to cmake, embed it in the config and hope for the best. In the ldc code you would only need to handle the <arch> suffix which is maybe an acceptable level of complexity. I won't mind if you or somebody else thinks differently, I'm open to changing my view.

@kinke
Copy link
Member

kinke commented May 20, 2024

Finishing my rant about what is ideal for Gentoo, I think we could tone down the complexity a little.
[…]
While I do want to have more support for platforms I don't want the code to become too complex and have to treat every possible esoteric setup. To me it seems reasonable to have the implementation described above, allow the user to specify the libdir to cmake, embed it in the config and hope for the best. In the ldc code you would only need to handle the suffix which is maybe an acceptable level of complexity.

Yeah what you propose there sounds actually good.

Since Fedora doesn't use for compiler-rt and it probably won't add too much complexity you could also do: libclang_rt*<no-arch>

👍

@the-horo
Copy link
Contributor Author

Alright, I'll give this some more time so I can do other stuff and consider possible implementations or other solutions and maybe someone else who is affected by this will chime in.

@kinke
Copy link
Member

kinke commented May 20, 2024

Alright, and thanks for taking the time to investigate the situation with other package managers/distros, for a nice overview. 👍

@JohanEngelen
Copy link
Member

Can you add this compiler-rt search description as a document in the docs folder? (we'll figure out how to render LDC docs at some later moment, but for now let's at least preserve texts like this!)

the-horo added 3 commits May 28, 2024 21:33
Better support non-x86_64 targets when it comes to the architecture
suffix in the library filename.

Change the default search paths by dropping the relative clang
compiler-rt path (`clang/<version>/lib/<os>/<libname>`) logic as:
1. The version component is wrong with `>=llvm-16`
2. This path is already specified fully, by default, in the config
   file

When looking for `libclang_rt.*` also search for filenames without the
architecture suffix. This situation can happen on Linux and can be
observed in Fedora with its `compiler-rt` package.

Signed-off-by: Andrei Horodniceanu <[email protected]>
Add documentation about how LDC searches for compiler-rt libraries and
how a user can configure it.

Signed-off-by: Andrei Horodniceanu <[email protected]>
@the-horo
Copy link
Contributor Author

Notice that I've changed COMPILER_RT_BASE_DIR to require the /clang subdirectory, so it would need to be set to /usr/lib/clang if, previously, it would have been set to /usr/lib, because I consider it more intuitive for someone to figure out the path they need to provide.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants