Output base shows up in outputs (determinism issues) #2278

Open
nmattia opened this issue Jan 14, 2025 · 1 comment
nmattia commented Jan 14, 2025

Describe the bug

There seem to be several ways the output base ends up in build outputs:

  1. the toolchain executables (symlinks) use absolute paths
  2. the config files generated by ./configure include absolute paths
  3. the compiled Python (.pyc) includes absolute paths
  4. the patch_bins script includes absolute paths
  5. the Haskell interface files include absolute paths

Below is a quick fix for problems 1 to 4. For 5, I'm not sure where to start looking.

diff --git a/haskell/ghc_bindist.bzl b/haskell/ghc_bindist.bzl
index 51b740db..92da91d3 100644
--- a/haskell/ghc_bindist.bzl
+++ b/haskell/ghc_bindist.bzl
@@ -161,6 +161,8 @@ def _ghc_bindist_impl(ctx):
 
     is_hadrian_dist = ctx.path(unpack_dir).get_child("config.mk.in").exists
 
+    print("is_hadrian_dist", is_hadrian_dist)
+
     # On Windows the bindist already contains the built executables
     if os != "windows":
         # IMPORTANT: all these scripts have to be compatible with BSD
@@ -177,6 +179,7 @@ include Makefile""")
             execute_or_fail_loudly(ctx, ["sed", "-e", "s/RelocatableBuild = NO/RelocatableBuild = YES/", "-i.bak", "mk/config.mk.in"], working_directory = unpack_dir)
             execute_or_fail_loudly(ctx, ["rm", "-f", "mk/config.mk.in.bak"], working_directory = unpack_dir)
 
+        print("config prefix", bindist_dir.realpath)
         execute_or_fail_loudly(ctx, ["./configure", "--prefix", bindist_dir.realpath], working_directory = unpack_dir)
 
         make_loc = ctx.which("make")
@@ -189,6 +192,7 @@ include Makefile""")
             ctx.file("{}/mk/relpath.sh".format(unpack_dir), ctx.read(ctx.path(ctx.attr._relpath_script)), executable = False, legacy_utf8 = False)
             execute_or_fail_loudly(ctx, ["chmod", "+x", "mk/relpath.sh"], working_directory = unpack_dir)
 
+        print("make args", make_args)
         execute_or_fail_loudly(
             ctx,
             ["make", "install"] + make_args,
@@ -203,21 +207,27 @@ include Makefile""")
             working_directory = unpack_dir,
         )
 
-        if not is_hadrian_dist:
-            ctx.file(paths.join(unpack_dir, "patch_bins"), executable = True, content = r"""#!/usr/bin/env bash
+        execute_or_fail_loudly(
+            ctx,
+            ["rm", "config.status", "config.log", "config.mk"],
+            working_directory = unpack_dir,
+        )
+
+        ctx.file(paths.join(unpack_dir, "patch_bins"), executable = True, content = r"""#!/usr/bin/env bash
+
+set -euo pipefail
+BINDIST_DIR="$1"
 find bin -type f -print0 | xargs -0 \
-grep --files-with-matches --null {bindist_dir} | xargs -0 -n1 \
+grep --files-with-matches --null "$BINDIST_DIR" | xargs -0 -n1 \
     sed -i.bak \
         -e '2i\
 DISTDIR="$( dirname "$(resolved="$0"; cd "$(dirname "$resolved")"; while tmp="$(readlink "$(basename "$resolved")")"; do resolved="$tmp"; cd "$(dirname "$resolved")"; done; echo "$PWD/$(basename "$resolved")")" )/.."' \
-        -e 's:{bindist_dir}:$DISTDIR:'
+        -e 's:'"$BINDIST_DIR"':$DISTDIR:'
 find bin -type f -print0 | xargs -0 \
-grep --files-with-matches --null {bindist_dir} | xargs -0 -n1 \
+grep --files-with-matches --null "$BINDIST_DIR" | xargs -0 -n1 \
 rm -f
-""".format(
-                bindist_dir = bindist_dir.realpath,
-            ))
-            execute_or_fail_loudly(ctx, [paths.join(".", unpack_dir, "patch_bins")])
+""")
+        execute_or_fail_loudly(ctx, [paths.join(".", unpack_dir, "patch_bins"), bindist_dir.realpath])
 
     # As the patches may touch the package DB we regenerate the cache.
     if len(ctx.attr.patches) > 0:
diff --git a/haskell/private/pkgdb_to_bzl.bzl b/haskell/private/pkgdb_to_bzl.bzl
index a256ec76..91a6fc84 100644
--- a/haskell/private/pkgdb_to_bzl.bzl
+++ b/haskell/private/pkgdb_to_bzl.bzl
@@ -16,7 +16,7 @@ def pkgdb_to_bzl(repository_ctx, paths, libdir):
         paths["@rules_haskell//haskell:private/pkgdb_to_bzl.py"],
         repository_ctx.attr.name,
         libdir,
-    ])
+    ], environment = { "PYTHONDONTWRITEBYTECODE": "1" })
     if result.return_code:
         fail("Error executing pkgdb_to_bzl.py: {stderr}".format(stderr = result.stderr))
     elif result.stderr:
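
For background on item 3: CPython stores the source file's path inside the compiled bytecode, and setting `PYTHONDONTWRITEBYTECODE` stops the interpreter from writing `.pyc` files when modules are imported. The following is a minimal sketch that shows the embedded path in isolation. It is not part of rules_haskell, and `pyc_mentions_source_path` and `example.py` are hypothetical names used only for illustration.

```python
import os
import py_compile
import tempfile


def pyc_mentions_source_path(directory):
    """Compile a tiny module in `directory` and report whether the
    resulting .pyc contains the source file's path as raw bytes."""
    source = os.path.join(directory, "example.py")
    with open(source, "w") as handle:
        handle.write("x = 1\n")
    # py_compile.compile returns the path of the bytecode file it wrote.
    compiled = py_compile.compile(
        source,
        cfile=os.path.join(directory, "example.pyc"),
        doraise=True,
    )
    with open(compiled, "rb") as handle:
        return os.fsencode(source) in handle.read()


if __name__ == "__main__":
    # mkdtemp returns an absolute path, standing in for an output base.
    print(pyc_mentions_source_path(tempfile.mkdtemp()))
```

Because the path is part of the file's bytes, two otherwise identical `.pyc` files produced under different output bases will differ.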

To Reproduce

bazel --output_base=output-base-1 build @stackage//:some-lib
bazel --output_base=output-base-2 build @stackage//:some-lib

and compare the outputs
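
The comparison step could be sketched in Python as follows. This is a minimal sketch and not part of rules_haskell: `tree_digests` and `differing_paths` are hypothetical helper names, and which directories to pass in (for example, the relevant directories under each output base) is left to the reader. It hashes every regular file, records symlink targets as text so that absolute symlinks such as those in item 1 show up, and lists the relative paths that differ.

```python
import hashlib
import os
import tempfile


def tree_digests(root):
    """Map each path relative to `root` to a fingerprint: the SHA-256 of
    a regular file's contents, or the link target for a symlink."""
    digests = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Symlinks to directories are listed in dirnames and not followed.
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            if os.path.islink(path):
                digests[rel] = "symlink -> " + os.readlink(path)
            elif os.path.isfile(path):
                with open(path, "rb") as handle:
                    digests[rel] = hashlib.sha256(handle.read()).hexdigest()
    return digests


def differing_paths(root_a, root_b):
    """Return the sorted relative paths that are missing on one side or
    whose fingerprints differ between the two trees."""
    a, b = tree_digests(root_a), tree_digests(root_b)
    return sorted(rel for rel in a.keys() | b.keys() if a.get(rel) != b.get(rel))


if __name__ == "__main__":
    # Two throwaway trees: one file is identical, one embeds its own root.
    first, second = tempfile.mkdtemp(), tempfile.mkdtemp()
    for root in (first, second):
        with open(os.path.join(root, "same.txt"), "w") as handle:
            handle.write("identical\n")
        with open(os.path.join(root, "leaky.txt"), "w") as handle:
            handle.write(root + "\n")
    print(differing_paths(first, second))
```

An empty result means the two trees are byte-for-byte identical under this comparison.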

Expected behavior

Build outputs (actualOutputs per the exec log) should be the same regardless of the output base.
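
A complementary check that needs only one build is to search the outputs for the absolute output base path itself (which `bazel info output_base` prints). The sketch below assumes the path appears verbatim as bytes in the affected files. `files_mentioning` is a hypothetical helper name, not something provided by Bazel or rules_haskell.

```python
import os
import tempfile


def files_mentioning(root, needle):
    """Return the sorted relative paths of regular files under `root`
    whose raw bytes contain the string `needle`."""
    needle_bytes = os.fsencode(needle)
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and special files; only file contents are scanned.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            with open(path, "rb") as handle:
                if needle_bytes in handle.read():
                    hits.append(os.path.relpath(path, root))
    return sorted(hits)


if __name__ == "__main__":
    # Throwaway tree standing in for a directory of build outputs.
    root = tempfile.mkdtemp()
    with open(os.path.join(root, "clean.txt"), "w") as handle:
        handle.write("no paths here\n")
    with open(os.path.join(root, "leaky.txt"), "w") as handle:
        handle.write("prefix=" + root + "\n")
    print(files_mentioning(root, root))
```

This reads each file fully into memory, which keeps the sketch short. Very large outputs would call for chunked reading instead.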

Environment

  • OS name + version: x86 Linux
  • Bazel version: 7.4.1
  • Version of the rules: 1.0

avdv commented Jan 20, 2025

@nmattia Thank you for reporting this issue, and for providing solutions for most of the problems! I'll see if I can come up with tests and a PR, if no-one beats me to it...
