From 1ab60bcbf7b1898c21524eaa844a396c46160b9f Mon Sep 17 00:00:00 2001 From: "Alan D. Salewski" Date: Mon, 31 Aug 2020 03:41:59 -0400 Subject: [PATCH 1/7] [0 of 6] New feature: allow 'git::' forced filepaths, both absolute and relative This series of changesets introduces a feature that allows the 'git::' forcing token to be used on local file system paths to reference Git repositories. Both absolute paths and relative paths are supported. For example: git::./some/relative/path/to/a/git-repo//some-subdir?ref=v1.2.3 or: git::../../some/relative/path/to/a/git-repo//some-subdir?ref=v1.2.3 or: git::/some/absolute/path/to/a/git-repo//some-subdir?ref=v4.5.6 Only filepaths that are prefixed with the 'git::' forcing token are considered for processing. Internally, go-getter transforms the provided string into a 'file://' URI with an absolute filepath, with query string params and subdirectory retained. The rationale for using a 'file://' URI internally is that the Git clone operation can already work with 'file://' URIs, and using them for this feature allows us to leverage the existing go-getter URI-handling machinery. That gets us support for query params (to clone a specific git ref (tag, commit hash, ...)) "for free". The rationale for using an absolute filepath (even when the provided string is a relative filepath) is that (per RFC 1738 and RFC 8089) only absolute filepaths are legitimate in 'file://' URIs. But more importantly here, the Git clone operation only supports 'file://' URIs with absolute paths. Q: Why support this functionality at all? Why not just require that a source location use an absolute path in a 'file://' URI explicitly if that's what is needed? A: The primary reason is to allow support for relative filepaths to Git repos. There are use cases in which the absolute path cannot be known in advance, but a relative path to a Git repo is known. For example, when a Terraform project (or any Git-based project) uses Git submodules, it will know the relative location of the Git submodule repos, but cannot know the absolute path in advance because it will vary based on where the "superproject" repo is cloned. Nevertheless, those relative paths should be usable as clonable Git repos, and this mechanism would allow for that. Support for filepaths that are already absolute is provided mainly for symmetry. It would be surprising for the feature to work with relative file paths, but not for absolute filepaths. For projects using Terraform, in particular, this feature (along with a small change in the Terraform code to leverage it) enables the non-fragile use of relative paths in a module "call" block, when combined with Git submodules: module "my_module" { source = "git::../git-submodules/tf-modules/some-tf-module?ref=v0.1.0" // ... } In the above example "superproject" Git repo (the one "calling" the terraform module) knows the relative path to its own Git submodules because they are embedded in a subdirectory beneath the top-level of the "superproject" repo. Two downstream Terraform issues that would require go-getter support for this feature (or something like it) are at [0] and [1]. This first changeset in the series updates the README.md documentation to note the new feature and provide examples. [0] "Unable to use relative path to local Git module" hashicorp/terraform#25488 [1] "In 0.12, modules can no longer be installed from local git repositories at relative paths" hashicorp/terraform#21107 Design Notes ------------ In order for this feature to work, additional contextual information is needed by the Git detector than can be provided using the existing Detector API. Internally, the Detector's Detect method does not pass along to the Detector implementations all of the contextual information that it has available. In particular, the forcing token and go-getter subdir component are stripped out of the source string before invoking the implementation's Detect method. In the particular case of the Git detector, that means it cannot know that a 'git::' forcing token was provided on an input string that otherwise looks like a file system path. And /that/ means that it is not correct or safe for it to identify any filepath string value as a Git repository. Externally, callers (such as Terraform) already provide a value for the 'pwd' parameter of Detect, but it is not (necessarily) the location from which a relative path in a 'git::' string should be resolved. In a Terraform module (which may be in an arbitrary subdirectory from the process current working directory), module "source" references that contain relative paths must be interpreted relative to the location of the module source file. Terraform has that information available, but in the existing Detect API there is no way to convey it to go-getter. Constraints ----------- Additional Detector methods cannot be added without burdening all existing detectors (both internal and in the wild) with the need to support them. Additional Detect method params cannot be added without breaking all existing Detector implementations (internal, wild). Additional parameters cannot be added to the Detect dispatching function without affecting all callers. Approach -------- The goal is to provide the feature in a way that is as minimally invasive as possible. But above all else it needs to avoid breaking backward compatibility in any way. Given that, the approach taken by this changeset series is to introduce the concept of a "Contextual Detector". It is structured in the same way as the current Detector framework, but works through a new CtxDetector interface that is not constrained by the existing API. The only callers affected by this change would be those that wish to take advantage of the additional capabilities. And for those, the migration path straight-forward because the new API is structured like the existing one. In particular, this changeset series introduces four new elements: 1. CtxDetector interface 2. CtxDetect dispatching function 3. CtxDetect method on the CtxDetector interface 4. Full suite of CtxDetector implementations that are analogues of the existing detectors (most of which (currently) just delegate to the existing Detector implementations). There is also a global 'ContextualDetectors' list that serves a function analogous to the existing 'Detectors' list. Signed-off-by: Alan D. Salewski --- README.md | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/README.md b/README.md index 7f380fdcc..8a28e423b 100644 --- a/README.md +++ b/README.md @@ -304,6 +304,19 @@ The "scp-style" addresses _cannot_ be used in conjunction with the `ssh://` scheme prefix, because in that case the colon is used to mark an optional port number to connect on, rather than to delimit the path from the host. +Git repositories that reside on the local filesystem can be accessed by +prefixing the `git::` forcing token to the file path. Both absolute and +relative paths are accepted, and may contain query parameters and/or the a +double-slash `//` subdirectory component. Some examples: + +#### Git File Path Examples + +- `git::/path/to/some/git/repo` +- `git::/path/to/some/git/repo//some/subdir` +- `git::/path/to/some/git/repo//some/subdir?ref=v1.2.3` +- `git::./path/to/some/git/repo//some/subdir?ref=v1.2.3` +- `git::../../path/to/some/git/repo//some/subdir?ref=v1.2.3` + ### Mercurial (`hg`) * `rev` - The Mercurial revision to checkout. From bf7a422354549569ad1055489bfab0496f814e40 Mon Sep 17 00:00:00 2001 From: "Alan D. Salewski" Date: Thu, 27 Aug 2020 17:34:48 -0400 Subject: [PATCH 2/7] [1 of 6] getter.Detect(...): extract function handleDetected(...) Once the Detect(...) function finds a detector for a source string it combines the obtained result string with other data bits to produce the result returned to the caller. That processing is here extracted into a new handleDetected(...) function, which will be called from an additional context in an upcoming commit. The new handleDetected(...) function lives in a new file: detect_common.go The move emphasizes the function's slightly wider use now by both the Detect(...) and (soon-to-be-introduced) CtxDetect(...) dispatch functions. With the introduction of CtxDetect, the logic in handleDetected is no longer used exclusively by Detect. This changeset provides an implementation modification, but no behavioral change. Signed-off-by: Alan D. Salewski --- detect.go | 39 +++------------------------- detect_common.go | 66 ++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 69 insertions(+), 36 deletions(-) create mode 100644 detect_common.go diff --git a/detect.go b/detect.go index f134f7705..cbebeaaff 100644 --- a/detect.go +++ b/detect.go @@ -2,7 +2,6 @@ package getter import ( "fmt" - "path/filepath" "github.com/hashicorp/go-getter/helper/url" ) @@ -62,41 +61,9 @@ func Detect(src string, pwd string, ds []Detector) (string, error) { continue } - var detectForce string - detectForce, result = getForcedGetter(result) - result, detectSubdir := SourceDirSubdir(result) - - // If we have a subdir from the detection, then prepend it to our - // requested subdir. - if detectSubdir != "" { - if subDir != "" { - subDir = filepath.Join(detectSubdir, subDir) - } else { - subDir = detectSubdir - } - } - - if subDir != "" { - u, err := url.Parse(result) - if err != nil { - return "", fmt.Errorf("Error parsing URL: %s", err) - } - u.Path += "//" + subDir - - // a subdir may contain wildcards, but in order to support them we - // have to ensure the path isn't escaped. - u.RawPath = u.Path - - result = u.String() - } - - // Preserve the forced getter if it exists. We try to use the - // original set force first, followed by any force set by the - // detector. - if getForce != "" { - result = fmt.Sprintf("%s::%s", getForce, result) - } else if detectForce != "" { - result = fmt.Sprintf("%s::%s", detectForce, result) + result, err = handleDetected(result, getForce, subDir) + if err != nil { + return "", err } return result, nil diff --git a/detect_common.go b/detect_common.go new file mode 100644 index 000000000..c1c2d927a --- /dev/null +++ b/detect_common.go @@ -0,0 +1,66 @@ +package getter + +import ( + "fmt" + "path/filepath" + + "github.com/hashicorp/go-getter/helper/url" +) + +// handleDetected is a helper function for the Detect(...) and CtxDetect(...) +// dispatch functions. +// +// Both dispatch functions work in the same general way: +// +// * Each breaks-apart its input string to extract any provided 'force' +// token and/or extract any '//some/subdir' element before supplying the +// downstream {,Ctx}Detect methods with input to chew on. +// +// * When a given detector indicates that it has processed the input +// string, the dispatch function needs to re-introduce the previously +// extracted bits before returning the reconstituted result string to +// its caller. +// +// Given the originally extracted bits along with the result obtained from the +// detector, this function performs that reconstitution. +// +func handleDetected(detectedResult, srcGetForce, subDir string) (string, error) { + var detectForce string + detectForce, result := getForcedGetter(detectedResult) + result, detectSubdir := SourceDirSubdir(result) + + // If we have a subdir from the detection, then prepend it to our + // requested subdir. + if detectSubdir != "" { + if subDir != "" { + subDir = filepath.Join(detectSubdir, subDir) + } else { + subDir = detectSubdir + } + } + + if subDir != "" { + u, err := url.Parse(result) + if err != nil { + return "", fmt.Errorf("Error parsing URL: %s", err) + } + u.Path += "//" + subDir + + // a subdir may contain wildcards, but in order to support them we + // have to ensure the path isn't escaped. + u.RawPath = u.Path + + result = u.String() + } + + // Preserve the forced getter if it exists. We try to use the + // original set force first, followed by any force set by the + // detector. + if srcGetForce != "" { + result = fmt.Sprintf("%s::%s", srcGetForce, result) + } else if detectForce != "" { + result = fmt.Sprintf("%s::%s", detectForce, result) + } + + return result, nil +} From 1c5408a44bc960977f7ad9ecf98df1175feb6aa1 Mon Sep 17 00:00:00 2001 From: "Alan D. Salewski" Date: Sat, 29 Aug 2020 14:29:19 -0400 Subject: [PATCH 3/7] [2 of 6] introduce CtxDetector ("contextual detector") The existing Detector interface cannot be extended in a backward compatible way to support new features desired for the GitDetector implementation[0]. The new CtxDetector interface is introduced to allow such extension without breaking backward compatibility with existing users. The new interface also avoids adding new methods to the existing Detector interface that would then need to be supported by all Detector implementations, both in-library and in the wild. A CtxDetector is slightly more cumbersome to use than Detector. Callers can (and should) continue to use Detector unless the enhanced capabilities of one or more of the CtxDetector implementation modules is needed. At the time of writing (2020-08), the only CtxDetector with such extra mojo is the forthcoming GitCtxDetector. Existing Detector implementations can easily be wrapped by CtxDetector implementations. The information available to a CtxDetector impl. is a strict superset of the information provided to a Detector. Where there is no need for the additional context info provided by the CtxDetect dispatch function, impls. can simply pass through the common subset to the Detect method of the analogous Detector impl. CAVEAT: In this changeset the list of ContextualDetectors is commented- out. This is intended to make a clear introduction of the interface type prior to introducing any implementations of it. A forthcoming change will provide such wrapping for all in-tree Detector impls, followed by the introduction of specialization for the GitCtxDetector impl. [0] C.f., https://github.com/hashicorp/go-getter/issues/268 Signed-off-by: Alan D. Salewski --- detect_ctx.go | 169 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 169 insertions(+) create mode 100644 detect_ctx.go diff --git a/detect_ctx.go b/detect_ctx.go new file mode 100644 index 000000000..d32b0d22f --- /dev/null +++ b/detect_ctx.go @@ -0,0 +1,169 @@ +package getter + +import ( + "fmt" + + "github.com/hashicorp/go-getter/helper/url" +) + +// CtxDetector (read: "Contextual Detector"), the evil twin of Detector. +// +// Like its Detector sibling, CtxDetector defines an interface that an invalid +// URL or a URL with a blank scheme can be passed through in order to +// determine if its shorthand for something else well-known. +// +// CtxDetector expands on the capabilities of Detector in the following ways: +// +// * A CtxDetector allows the caller to provide more information about its +// invocation context to the CtxDetect dispatch function. This allows for +// some types of useful detections and transformations that were not +// previously possible. +// +// * The CtxDetect dispatch function provides the CtxDetector +// implementations with all of the context information it has available +// to it, including the force token flag (e.g., "git::"). This allows the +// implementations to safely take (or avoid taking) actions that would be +// unsafe otherwise. +// +// A CtxDetector is slightly more cumbersome to use than Detector. Callers can +// (and should) continue to use Detector unless the enhanced capabilities of +// one or more of the CtxDetector implementation is needed. At the time of +// writing (2020-08), the only CtxDetector with such extra mojo is +// GitCtxDetector (q.v.). +// +type CtxDetector interface { + + // CtxDetect will detect whether the string matches a known pattern to + // turn it into a proper URL + // + // 'src' (required) is the input string to be interpretted. In the common + // case this value will have been preparsed by the CtxDetect dispatch + // function; its forcing token (if any) will have been removed; same for + // any 'go-getter' subdir portion('//some/subdir'). Some examples: + // + // "s3-eu-west-1.amazonaws.com/bucket/foo/bar.baz?version=1234" + // + // "github.com/hashicorp/foo.git" + // + // "git@github.com:hashicorp/foo.git?foo=bar" + // + // "../../git-submods/tf-mods/some-tf-module?ref=v1.2.3" + // + // 'pwd' (optional, sometimes) is the filepath that should be taken as the + // current working directory (mainly for the purpose of resolving + // filesystem paths; may be overridden for that purpose by + // 'srcResolveFrom'). Some CtxDetector implementation may require this + // path to be an abosolute filepath. + // + // 'forceToken' (optional) is the forcing token, if any, extracted from + // the input string submitted to the CtxDetect dispatch function. It is + // provided as a param to the CtxDetect method so that CtxDetector + // implementations may recognize 'src' strings intended for them. This + // removes ambiguity when a given 'src' value could be legitimately + // processed by more than one CtxDetector implementation. + // + // 'ctxSubDir' (optional) is the 'go-getter' subdir portion (if any) + // pre-extracted from the source string (as noted above). It is provided + // to the CtxDetector implementation only for contextual awareness, which + // conceivably could inform its decision-making process. It should not be + // incorporated into the result returned by the CtxDetector impl. + // + // 'srcResolveFrom' (optional, sometimes) A caller-provided filepath to be + // used as the directory from which any relative filepath in 'src' should + // be resolved, instead of relative to 'pwd'. An individual CtxDetector + // implementation may require that this value be absolute. + // + // Protocol: Where they need to be resolved, relative filepath values in + // 'src' will be resolved relative to 'pwd', unless + // 'srcResolveFrom' is non-empty; then they will be resolved + // relative to 'srcResolveFrom'. + // + // Note that some CtxDetector impls. (FileCtxDetector, + // GitCtxDetector) can only produce meaningful results in some + // circumstances if they have an absolute directory to resolve + // to. For best results, when 'srcResolveFrom' is non-empty, + // provide an absolute filepath. + // + // The CtxDetect interface itself does not require that either + // 'pwd' or 'srcResolveFrom' be absolute filepaths, but that + // might be required by a particular CtxDetector implementation. + // Know that RFC-compliant use of 'file://' URIs (which some + // CtxDetector impls. emit) permit only absolute filepaths, and + // tools (such as Git) expect this. Providing relative filepaths + // for 'pwd' and/or 'srcResolveFrom' may result in the + // generation of non-legit 'file://' URIs with relative paths in + // them, and a CtxDetector implementation is permitted to reject + // them with an error if it requires an absolute path. + // + CtxDetect(src, pwd, forceToken, ctxSubDir, srcResolveFrom string) (string, bool, error) +} + +// ContextualDetectors is the list of detectors that are tried on an invalid URL. +// This is also the order they're tried (index 0 is first). +var ContextualDetectors []CtxDetector + +func init() { + ContextualDetectors = []CtxDetector{ + // new(GitHubCtxDetector), + // new(GitLabCtxDetector), + // new(GitCtxDetector), + // new(BitBucketCtxDetector), + // new(S3CtxDetector), + // new(GCSCtxDetector), + // new(FileCtxDetector), + } +} + +// CtxDetect turns a source string into another source string if it is +// detected to be of a known pattern. +// +// An empty-string value provided for 'pwd' is interpretted as "not +// provided". Likewise for 'srcResolveFrom'. +// +// The (optional) 'srcResolveFrom' parameter allows the caller to provide a +// directory from which any relative filepath in 'src' should be resolved, +// instead of relative to 'pwd'. This supports those use cases (e.g., +// Terraform modules with relative 'source' filepaths) where the caller +// context for path resolution may be different than the pwd. For best result, +// the provided value should be an absolute filepath. If unneeded, use specify +// the empty string. +// +// The 'cds' []CtxDetector parameter should be the list of detectors to use in +// the order to try them. If you don't want to configure this, just use the +// global ContextualDetectors variable. +// +// This is safe to be called with an already valid source string: CtxDetect +// will just return it. +// +func CtxDetect(src, pwd, srcResolveFrom string, cds []CtxDetector) (string, error) { + + getForce, getSrc := getForcedGetter(src) + + // Separate out the subdir if there is one, we don't pass that to detect + getSrc, subDir := SourceDirSubdir(getSrc) + + u, err := url.Parse(getSrc) + if err == nil && u.Scheme != "" { + // Valid URL + return src, nil + } + + for _, d := range cds { + result, ok, err := d.CtxDetect(getSrc, pwd, getForce, subDir, srcResolveFrom) + if err != nil { + return "", err + } + if !ok { + continue + } + + result, err = handleDetected(result, getForce, subDir) + if err != nil { + return "", err + } + + return result, nil + } + + return "", fmt.Errorf("invalid source string: %s", src) +} From 6b60381782b9d3947c423ca9dcd8c04efadce9d2 Mon Sep 17 00:00:00 2001 From: "Alan D. Salewski" Date: Sun, 30 Aug 2020 17:45:41 -0400 Subject: [PATCH 4/7] [3 of 6] Add full set of stock CtxDetector implementations ...and uncomment the 'ContextualDetectors' defult list of them in CtxDetector. The CtxDetector implementations satisfy the CtxDetector interface for all of the built-in detectors (bitbucket, github, gitlab, file, git, GCS, and S3), but do not (yet) take advantage of the the additional contextual information available. These implementations all just pass-through a subset of their arguments to the existing Detector impl. analogue. These are intended to make it comfortable to swap-in use of the new "Contractual Detector" without having to change much code, and also provide a place to hang enhancement code that can take advantage of the contextual information made available by the CtxDetect dispatch function. Signed-off-by: Alan D. Salewski --- detect_ctx.go | 14 +++++++------- detect_ctx_bitbucket.go | 14 ++++++++++++++ detect_ctx_file.go | 14 ++++++++++++++ detect_ctx_gcs.go | 14 ++++++++++++++ detect_ctx_git.go | 14 ++++++++++++++ detect_ctx_github.go | 14 ++++++++++++++ detect_ctx_gitlab.go | 14 ++++++++++++++ detect_ctx_s3.go | 14 ++++++++++++++ 8 files changed, 105 insertions(+), 7 deletions(-) create mode 100644 detect_ctx_bitbucket.go create mode 100644 detect_ctx_file.go create mode 100644 detect_ctx_gcs.go create mode 100644 detect_ctx_git.go create mode 100644 detect_ctx_github.go create mode 100644 detect_ctx_gitlab.go create mode 100644 detect_ctx_s3.go diff --git a/detect_ctx.go b/detect_ctx.go index d32b0d22f..ff508e8de 100644 --- a/detect_ctx.go +++ b/detect_ctx.go @@ -104,13 +104,13 @@ var ContextualDetectors []CtxDetector func init() { ContextualDetectors = []CtxDetector{ - // new(GitHubCtxDetector), - // new(GitLabCtxDetector), - // new(GitCtxDetector), - // new(BitBucketCtxDetector), - // new(S3CtxDetector), - // new(GCSCtxDetector), - // new(FileCtxDetector), + new(GitHubCtxDetector), + new(GitLabCtxDetector), + new(GitCtxDetector), + new(BitBucketCtxDetector), + new(S3CtxDetector), + new(GCSCtxDetector), + new(FileCtxDetector), } } diff --git a/detect_ctx_bitbucket.go b/detect_ctx_bitbucket.go new file mode 100644 index 000000000..c3866d308 --- /dev/null +++ b/detect_ctx_bitbucket.go @@ -0,0 +1,14 @@ +package getter + +// BitBucketCtxDetector implements CtxDetector to detect BitBucket URLs and +// turn them into URLs that the Git or Hg Getter can understand. +// +type BitBucketCtxDetector struct{} + +func (d *BitBucketCtxDetector) CtxDetect(src, pwd, _, _, _ string) (string, bool, error) { + + // Currently not taking advantage of the extra contextual data available + // to us. For now, we just delegate to BitBucketDetector.Detect. + // + return (&BitBucketDetector{}).Detect(src, pwd) +} diff --git a/detect_ctx_file.go b/detect_ctx_file.go new file mode 100644 index 000000000..5ea7025fd --- /dev/null +++ b/detect_ctx_file.go @@ -0,0 +1,14 @@ +package getter + +// "Contextual Detector" implementation for producing 'file://' URIs from +// generic file system paths. +// +type FileCtxDetector struct{} + +func (d *FileCtxDetector) CtxDetect(src, pwd, _, _, _ string) (string, bool, error) { + + // Currently not taking advantage of the extra contextual data available + // to us. For now, we just delegate to FileDetector.Detect. + // + return (&FileDetector{}).Detect(src, pwd) +} diff --git a/detect_ctx_gcs.go b/detect_ctx_gcs.go new file mode 100644 index 000000000..83e71bbb9 --- /dev/null +++ b/detect_ctx_gcs.go @@ -0,0 +1,14 @@ +package getter + +// GCSCtxDetector implements CtxDetector to detect GCS URLs and turn them into +// URLs that the GCSGetter can understand. +// +type GCSCtxDetector struct{} + +func (d *GCSCtxDetector) CtxDetect(src, pwd, _, _, _ string) (string, bool, error) { + + // Currently not taking advantage of the extra contextual data available + // to us. For now, we just delegate to GCSDetector.Detect. + // + return (&GCSDetector{}).Detect(src, pwd) +} diff --git a/detect_ctx_git.go b/detect_ctx_git.go new file mode 100644 index 000000000..371047d27 --- /dev/null +++ b/detect_ctx_git.go @@ -0,0 +1,14 @@ +package getter + +// GitCtxDetector implements CtxDetector to detect Git SSH URLs such as +// git@host.com:dir1/dir2 and converts them to proper URLs. +// +type GitCtxDetector struct{} + +func (d *GitCtxDetector) CtxDetect(src, pwd, _, _, _ string) (string, bool, error) { + + // Currently not taking advantage of the extra contextual data available + // to us. For now, we just delegate to GitDetector.Detect. + // + return (&GitDetector{}).Detect(src, pwd) +} diff --git a/detect_ctx_github.go b/detect_ctx_github.go new file mode 100644 index 000000000..0a739ee54 --- /dev/null +++ b/detect_ctx_github.go @@ -0,0 +1,14 @@ +package getter + +// GitHubCtxDetector implements CtxDetector to detect GitHub URLs and turn +// them into URLs that the Git Getter can understand. +// +type GitHubCtxDetector struct{} + +func (d *GitHubCtxDetector) CtxDetect(src, pwd, _, _, _ string) (string, bool, error) { + + // Currently not taking advantage of the extra contextual data available + // to us. For now, we just delegate to GitHubDetector.Detect. + // + return (&GitHubDetector{}).Detect(src, pwd) +} diff --git a/detect_ctx_gitlab.go b/detect_ctx_gitlab.go new file mode 100644 index 000000000..9581dfcb1 --- /dev/null +++ b/detect_ctx_gitlab.go @@ -0,0 +1,14 @@ +package getter + +// GitLabCtxDetector implements CtxDetector to detect GitLab URLs and turn +// them into URLs that the Git Getter can understand. +// +type GitLabCtxDetector struct{} + +func (d *GitLabCtxDetector) CtxDetect(src, pwd, _, _, _ string) (string, bool, error) { + + // Currently not taking advantage of the extra contextual data available + // to us. For now, we just delegate to GitLabDetector.Detect. + // + return (&GitLabDetector{}).Detect(src, pwd) +} diff --git a/detect_ctx_s3.go b/detect_ctx_s3.go new file mode 100644 index 000000000..4a3eaa2f3 --- /dev/null +++ b/detect_ctx_s3.go @@ -0,0 +1,14 @@ +package getter + +// S3CtxDetector implements CtxDetector to detect S3 URLs and turn them into +// URLs that the S3 getter can understand. +// +type S3CtxDetector struct{} + +func (d *S3CtxDetector) CtxDetect(src, pwd, _, _, _ string) (string, bool, error) { + + // Currently not taking advantage of the extra contextual data available + // to us. For now, we just delegate to S3Detector.Detect. + // + return (&S3Detector{}).Detect(src, pwd) +} From 9797cde199f523808e2c5e3adf21ca0658850398 Mon Sep 17 00:00:00 2001 From: "Alan D. Salewski" Date: Mon, 31 Aug 2020 03:09:20 -0400 Subject: [PATCH 5/7] [4 of 6] Add unit test for full set of stock CtxDetector impls. ...except for GitCtxDetector, which will be getting special treatment since it is the module behind the motivation for this series of changes. The unit tests committed here are all just minimal adaptations of the existing detector unit tests to satisfy the CtxDetect method (which takes three more params than Detect). These tests basically test the "pass through" behavior of their respective implementations, but that's useful to demonstrate that they do not interfere with the traditional behaviors. Signed-off-by: Alan D. Salewski --- detect_ctx_bitbucket_test.go | 71 ++++++++++++++++++++ detect_ctx_file_test.go | 125 +++++++++++++++++++++++++++++++++++ detect_ctx_gcs_test.go | 45 +++++++++++++ detect_ctx_github_test.go | 48 ++++++++++++++ detect_ctx_gitlab_test.go | 48 ++++++++++++++ detect_ctx_s3_test.go | 88 ++++++++++++++++++++++++ 6 files changed, 425 insertions(+) create mode 100644 detect_ctx_bitbucket_test.go create mode 100644 detect_ctx_file_test.go create mode 100644 detect_ctx_gcs_test.go create mode 100644 detect_ctx_github_test.go create mode 100644 detect_ctx_gitlab_test.go create mode 100644 detect_ctx_s3_test.go diff --git a/detect_ctx_bitbucket_test.go b/detect_ctx_bitbucket_test.go new file mode 100644 index 000000000..d94e27ce3 --- /dev/null +++ b/detect_ctx_bitbucket_test.go @@ -0,0 +1,71 @@ +package getter + +import ( + "net/http" + "strings" + "testing" +) + +const testCtxBBUrl = "https://bitbucket.org/hashicorp/tf-test-git" + +func TestBitBucketCtxDetector(t *testing.T) { + t.Parallel() + + if _, err := http.Get(testCtxBBUrl); err != nil { + t.Log("internet may not be working, skipping BB tests") + t.Skip() + } + + cases := []struct { + Input string + Output string + }{ + // HTTP + { + "bitbucket.org/hashicorp/tf-test-git", + "git::https://bitbucket.org/hashicorp/tf-test-git.git", + }, + { + "bitbucket.org/hashicorp/tf-test-git.git", + "git::https://bitbucket.org/hashicorp/tf-test-git.git", + }, + { + "bitbucket.org/hashicorp/tf-test-hg", + "hg::https://bitbucket.org/hashicorp/tf-test-hg", + }, + } + + pwd := "/pwd" + forceToken := "" + ctxSubDir := "" + srcResolveFrom := "" + + f := new(BitBucketCtxDetector) + for i, tc := range cases { + var err error + for i := 0; i < 3; i++ { + var output string + var ok bool + output, ok, err = f.CtxDetect(tc.Input, pwd, forceToken, ctxSubDir, srcResolveFrom) + if err != nil { + if strings.Contains(err.Error(), "invalid character") { + continue + } + + t.Fatalf("err: %s", err) + } + if !ok { + t.Fatal("not ok") + } + + if output != tc.Output { + t.Fatalf("%d: bad: %#v", i, output) + } + + break + } + if i >= 3 { + t.Fatalf("failure from bitbucket: %s", err) + } + } +} diff --git a/detect_ctx_file_test.go b/detect_ctx_file_test.go new file mode 100644 index 000000000..3b30138a8 --- /dev/null +++ b/detect_ctx_file_test.go @@ -0,0 +1,125 @@ +package getter + +import ( + "fmt" + "os" + "path/filepath" + "runtime" + "strings" + "testing" +) + +type ctxFileTest struct { + in, pwd, out string + err bool +} + +var ctxFileTests = []ctxFileTest{ + {"./foo", "/pwd", "file:///pwd/foo", false}, + {"./foo?foo=bar", "/pwd", "file:///pwd/foo?foo=bar", false}, + {"foo", "/pwd", "file:///pwd/foo", false}, +} + +var unixCtxFileTests = []ctxFileTest{ + {"./foo", "testdata/detect-file-symlink-pwd/syml/pwd", + "testdata/detect-file-symlink-pwd/real/foo", false}, + + {"/foo", "/pwd", "file:///foo", false}, + {"/foo?bar=baz", "/pwd", "file:///foo?bar=baz", false}, +} + +var winCtxFileTests = []ctxFileTest{ + {"/foo", "/pwd", "file:///pwd/foo", false}, + {`C:\`, `/pwd`, `file://C:/`, false}, + {`C:\?bar=baz`, `/pwd`, `file://C:/?bar=baz`, false}, +} + +func TestFileCtxDetector(t *testing.T) { + if runtime.GOOS == "windows" { + ctxFileTests = append(ctxFileTests, winCtxFileTests...) + } else { + ctxFileTests = append(ctxFileTests, unixCtxFileTests...) + } + + // Get the pwd + pwdRoot, err := os.Getwd() + if err != nil { + t.Fatalf("err: %s", err) + } + pwdRoot, err = filepath.Abs(pwdRoot) + if err != nil { + t.Fatalf("err: %s", err) + } + + f := new(FileCtxDetector) + + forceToken := "" + ctxSubDir := "" + srcResolveFrom := "" + + for i, tc := range ctxFileTests { + t.Run(fmt.Sprintf("%d", i), func(t *testing.T) { + pwd := tc.pwd + + out, ok, err := f.CtxDetect(tc.in, pwd, forceToken, ctxSubDir, srcResolveFrom) + if err != nil { + t.Fatalf("err: %s", err) + } + if !ok { + t.Fatal("not ok") + } + + expected := tc.out + if !strings.HasPrefix(expected, "file://") { + expected = "file://" + filepath.Join(pwdRoot, expected) + } + + if out != expected { + t.Fatalf("input: %q\npwd: %q\nexpected: %q\nbad output: %#v", + tc.in, pwd, expected, out) + } + }) + } +} + +var noPwdCtxFileTests = []ctxFileTest{ + {in: "./foo", pwd: "", out: "", err: true}, + {in: "foo", pwd: "", out: "", err: true}, +} + +var noPwdUnixCtxFileTests = []ctxFileTest{ + {in: "/foo", pwd: "", out: "file:///foo", err: false}, +} + +var noPwdWinCtxFileTests = []ctxFileTest{ + {in: "/foo", pwd: "", out: "", err: true}, + {in: `C:\`, pwd: ``, out: `file://C:/`, err: false}, +} + +func TestCtxFileCtxDetector_noPwd(t *testing.T) { + if runtime.GOOS == "windows" { + noPwdCtxFileTests = append(noPwdCtxFileTests, noPwdWinCtxFileTests...) + } else { + noPwdCtxFileTests = append(noPwdCtxFileTests, noPwdUnixCtxFileTests...) + } + + f := new(FileCtxDetector) + + forceToken := "" + ctxSubDir := "" + srcResolveFrom := "" + + for i, tc := range noPwdCtxFileTests { + out, ok, err := f.CtxDetect(tc.in, tc.pwd, forceToken, ctxSubDir, srcResolveFrom) + if err != nil != tc.err { + t.Fatalf("%d: err: %s", i, err) + } + if !ok { + t.Fatal("not ok") + } + + if out != tc.out { + t.Fatalf("%d: bad: %#v", i, out) + } + } +} diff --git a/detect_ctx_gcs_test.go b/detect_ctx_gcs_test.go new file mode 100644 index 000000000..a07c60efe --- /dev/null +++ b/detect_ctx_gcs_test.go @@ -0,0 +1,45 @@ +package getter + +import ( + "testing" +) + +func TestGCSCtxDetector(t *testing.T) { + cases := []struct { + Input string + Output string + }{ + { + "www.googleapis.com/storage/v1/bucket/foo", + "gcs::https://www.googleapis.com/storage/v1/bucket/foo", + }, + { + "www.googleapis.com/storage/v1/bucket/foo/bar", + "gcs::https://www.googleapis.com/storage/v1/bucket/foo/bar", + }, + { + "www.googleapis.com/storage/v1/foo/bar.baz", + "gcs::https://www.googleapis.com/storage/v1/foo/bar.baz", + }, + } + + pwd := "/pwd" + forceToken := "" + ctxSubDir := "" + srcResolveFrom := "" + + f := new(GCSCtxDetector) + for i, tc := range cases { + output, ok, err := f.CtxDetect(tc.Input, pwd, forceToken, ctxSubDir, srcResolveFrom) + if err != nil { + t.Fatalf("err: %s", err) + } + if !ok { + t.Fatal("not ok") + } + + if output != tc.Output { + t.Fatalf("%d: bad: %#v", i, output) + } + } +} diff --git a/detect_ctx_github_test.go b/detect_ctx_github_test.go new file mode 100644 index 000000000..3e7cb1b22 --- /dev/null +++ b/detect_ctx_github_test.go @@ -0,0 +1,48 @@ +package getter + +import ( + "testing" +) + +func TestGitHubCtxDetector(t *testing.T) { + cases := []struct { + Input string + Output string + }{ + // HTTP + {"github.com/hashicorp/foo", "git::https://github.com/hashicorp/foo.git"}, + {"github.com/hashicorp/foo.git", "git::https://github.com/hashicorp/foo.git"}, + { + "github.com/hashicorp/foo/bar", + "git::https://github.com/hashicorp/foo.git//bar", + }, + { + "github.com/hashicorp/foo?foo=bar", + "git::https://github.com/hashicorp/foo.git?foo=bar", + }, + { + "github.com/hashicorp/foo.git?foo=bar", + "git::https://github.com/hashicorp/foo.git?foo=bar", + }, + } + + pwd := "/pwd" + forceToken := "" + ctxSubDir := "" + srcResolveFrom := "" + + f := new(GitHubCtxDetector) + for i, tc := range cases { + output, ok, err := f.CtxDetect(tc.Input, pwd, forceToken, ctxSubDir, srcResolveFrom) + if err != nil { + t.Fatalf("err: %s", err) + } + if !ok { + t.Fatal("not ok") + } + + if output != tc.Output { + t.Fatalf("%d: bad: %#v", i, output) + } + } +} diff --git a/detect_ctx_gitlab_test.go b/detect_ctx_gitlab_test.go new file mode 100644 index 000000000..e6d407542 --- /dev/null +++ b/detect_ctx_gitlab_test.go @@ -0,0 +1,48 @@ +package getter + +import ( + "testing" +) + +func TestGitLabCtxDetector(t *testing.T) { + cases := []struct { + Input string + Output string + }{ + // HTTP + {"gitlab.com/hashicorp/foo", "git::https://gitlab.com/hashicorp/foo.git"}, + {"gitlab.com/hashicorp/foo.git", "git::https://gitlab.com/hashicorp/foo.git"}, + { + "gitlab.com/hashicorp/foo/bar", + "git::https://gitlab.com/hashicorp/foo.git//bar", + }, + { + "gitlab.com/hashicorp/foo?foo=bar", + "git::https://gitlab.com/hashicorp/foo.git?foo=bar", + }, + { + "gitlab.com/hashicorp/foo.git?foo=bar", + "git::https://gitlab.com/hashicorp/foo.git?foo=bar", + }, + } + + pwd := "/pwd" + forceToken := "" + ctxSubDir := "" + srcResolveFrom := "" + + f := new(GitLabCtxDetector) + for i, tc := range cases { + output, ok, err := f.CtxDetect(tc.Input, pwd, forceToken, ctxSubDir, srcResolveFrom) + if err != nil { + t.Fatalf("err: %s", err) + } + if !ok { + t.Fatal("not ok") + } + + if output != tc.Output { + t.Fatalf("%d: bad: %#v", i, output) + } + } +} diff --git a/detect_ctx_s3_test.go b/detect_ctx_s3_test.go new file mode 100644 index 000000000..9f6e555e7 --- /dev/null +++ b/detect_ctx_s3_test.go @@ -0,0 +1,88 @@ +package getter + +import ( + "testing" +) + +func TestS3CtxDetector(t *testing.T) { + cases := []struct { + Input string + Output string + }{ + // Virtual hosted style + { + "bucket.s3.amazonaws.com/foo", + "s3::https://s3.amazonaws.com/bucket/foo", + }, + { + "bucket.s3.amazonaws.com/foo/bar", + "s3::https://s3.amazonaws.com/bucket/foo/bar", + }, + { + "bucket.s3.amazonaws.com/foo/bar.baz", + "s3::https://s3.amazonaws.com/bucket/foo/bar.baz", + }, + { + "bucket.s3-eu-west-1.amazonaws.com/foo", + "s3::https://s3-eu-west-1.amazonaws.com/bucket/foo", + }, + { + "bucket.s3-eu-west-1.amazonaws.com/foo/bar", + "s3::https://s3-eu-west-1.amazonaws.com/bucket/foo/bar", + }, + { + "bucket.s3-eu-west-1.amazonaws.com/foo/bar.baz", + "s3::https://s3-eu-west-1.amazonaws.com/bucket/foo/bar.baz", + }, + // Path style + { + "s3.amazonaws.com/bucket/foo", + "s3::https://s3.amazonaws.com/bucket/foo", + }, + { + "s3.amazonaws.com/bucket/foo/bar", + "s3::https://s3.amazonaws.com/bucket/foo/bar", + }, + { + "s3.amazonaws.com/bucket/foo/bar.baz", + "s3::https://s3.amazonaws.com/bucket/foo/bar.baz", + }, + { + "s3-eu-west-1.amazonaws.com/bucket/foo", + "s3::https://s3-eu-west-1.amazonaws.com/bucket/foo", + }, + { + "s3-eu-west-1.amazonaws.com/bucket/foo/bar", + "s3::https://s3-eu-west-1.amazonaws.com/bucket/foo/bar", + }, + { + "s3-eu-west-1.amazonaws.com/bucket/foo/bar.baz", + "s3::https://s3-eu-west-1.amazonaws.com/bucket/foo/bar.baz", + }, + // Misc tests + { + "s3-eu-west-1.amazonaws.com/bucket/foo/bar.baz?version=1234", + "s3::https://s3-eu-west-1.amazonaws.com/bucket/foo/bar.baz?version=1234", + }, + } + + pwd := "/pwd" + forceToken := "" + ctxSubDir := "" + srcResolveFrom := "" + + f := new(S3CtxDetector) + for i, tc := range cases { + output, ok, err := f.CtxDetect(tc.Input, pwd, forceToken, ctxSubDir, srcResolveFrom) + if err != nil { + t.Fatalf("err: %s", err) + } + if !ok { + t.Fatal("not ok") + } + + if output != tc.Output { + t.Fatalf("%d: bad: %#v", i, output) + } + } +} From 41320d0bdac5ad92964a0ecc751963ea0a5b47fe Mon Sep 17 00:00:00 2001 From: "Alan D. Salewski" Date: Mon, 31 Aug 2020 01:06:03 -0400 Subject: [PATCH 6/7] [5 of 6] add real GitCtxDetector impl: support for 'git::' filepaths GitCtxDetector is able to process 'git::' forced filepath strings, when specified with both absolute and relative filepaths. The following all work now: Absolute: git::/path/to/some/git/repo git::/path/to/some/git/repo//some/subdir git::/path/to/some/git/repo//some/subdir?ref=v1.2.3 Relative (subdir): git::./path/to/some/git/repo//some/subdir?ref=v1.2.3 Relative (parent dir): git::../../path/to/some/git/repo//some/subdir?ref=v1.2.3 GitCtxDetector provides a superset of the functionality of GitDetector, with which it is mainly backward compatible. In fact, GitCtxDetector still delegates Git SSH URL detection to GitDetector. However, because CitCtxDetector has contextual information available to it, it provides the following advantages: * It can safely process 'git::' strings as Git repositories; * It can avoid attempting detection on strings that have force tokens intended for other detectors; and * It can provide error messages when it was unable to process a 'git::' forced string (it provides this last capability "around" the wrapped delegation to GitDetector, too). Signed-off-by: Alan D. Salewski --- detect_ctx_git.go | 430 +++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 424 insertions(+), 6 deletions(-) diff --git a/detect_ctx_git.go b/detect_ctx_git.go index 371047d27..f0e93af29 100644 --- a/detect_ctx_git.go +++ b/detect_ctx_git.go @@ -1,14 +1,432 @@ package getter -// GitCtxDetector implements CtxDetector to detect Git SSH URLs such as -// git@host.com:dir1/dir2 and converts them to proper URLs. +import ( + "fmt" + "net/url" + "path/filepath" + "strings" +) + +// GitCtxDetector implements CtxDetector to detect Git SSH URLs such as: +// +// git@host.com:dir1/dir2 +// +// and 'git::' forced filepaths such as: +// +// git::../../path/to/some/git/repo//some/subdir?ref=v1.2.3 +// +// and convert them to proper URLs that the GitGetter can understand. +// +// The go-getter force token for this CtxDetector implementation is 'git::'. +// +// (If using the sibling Detector interface rather than CtxDetector, see +// GitDetector, instead, which uses the same 'git::' force key). +// +// If a force token is provided but it is something other than 'git::', then +// this module's CtxDetect method will not attempt to interpret the 'src' +// string. In that case it is effectively a NOOP. +// +// If a 'git::' force token is provided, but this module is not able to +// interpet the 'src' string, then an error will be returned. +// +// If no force token is provided, then this module will attempt to determine +// whether the 'src' string can safely be determined to be a Git SSH URL (but +// no filepath detection can be performed). +// +// Details +// ------- +// This CtxDetector implementation attempts to detect Git SSH URLs such as: +// +// git@host.com:dir1/dir2 +// +// and also filepath strings in the form: +// +// git:: +// +// where is either an absolute or relative filepath. For relative +// filepaths, be sure to see the notes on 'srcResolveFrom' below. Some +// examples: +// +// Absolute: +// +// git::/path/to/some/git/repo +// +// git::/path/to/some/git/repo//some/subdir +// +// git::/path/to/some/git/repo//some/subdir?ref=v1.2.3 +// +// Relative (subdir): +// +// git::./path/to/some/git/repo//some/subdir?ref=v1.2.3 +// +// Relative (parent dir): +// +// git::../../path/to/some/git/repo//some/subdir?ref=v1.2.3 +// +// Filepaths are transformed into 'file://' URIs that can be understood by +// GitGetter. The path will be properly encoded and will always be an absolute +// filepath. If an absolute filepath cannot be produced, then an error is +// emitted. // type GitCtxDetector struct{} -func (d *GitCtxDetector) CtxDetect(src, pwd, _, _, _ string) (string, bool, error) { +// CtxDetect is a method defined on the CtxDetector interface. +// +// Note that the 'git::' force token present in some of the above examples +// will have been stripped off of our 'src' param before this method is +// invoked. However, if it was provided, then we detect its presence here, by +// the provided 'forceToken' param, which will be contain just "git". +// +// In the specific case of detecting filepath strings, this module will only +// recognize them if they were specified with the 'git::' forcing +// token. Otherwise it would not be clear that they are supposed to be Git +// repos. +// +// Param Notes: +// ------------ +// +// 'src' (required) +// +// 'pwd' (optional) Relative filepaths are resolved relative to this location +// by default, but can be overridden by the 'srcResolveFrom' param. +// +// 'forceToken' (optional) Can be anything. If non-empty an "git", then src is +// explicitly intended for us to process. If non-empty but is some value +// other than "git", then src is intended for some other detector module +// (so we won't interfere). If empty, then we will attempt to detect +// what is in str. Note that filepath detection can only be performed if +// forceToken is "git". +// +// 'srcResolveFrom' (optional) If non-empty, then relative filepaths are +// resolved relative to this location, overriding the 'pwd' param for +// that purpose. Must be a rooted value in order for path resolution to +// succeed; will emit an error, otherwise. +// +// See also: GitDetector +// +func (d *GitCtxDetector) CtxDetect(src, pwd, forceToken, _, srcResolveFrom string) (string, bool, error) { + + // If the 'git::' force token was specified, then our work here is more + // than just "best effort"; we must complain if we were not able to detect + // how to parse a src string that was explicitly flagged for us to + // handle. This flag tells our response handling how to adapt. + // + mustField := "git" == forceToken // Are we required to field this request, else bust? + + if len(src) == 0 { + if mustField { + return "", false, fmt.Errorf("forced 'git::' handling: src string must be non-empty") + } + return "", false, nil + } + + if len(forceToken) > 0 { + rslt, ok, err := detectGitForceFilepath(src, pwd, forceToken, srcResolveFrom) + if err != nil { + return "", false, err + } + if ok { + return rslt, true, nil + } + } - // Currently not taking advantage of the extra contextual data available - // to us. For now, we just delegate to GitDetector.Detect. + // The remainder of our detection can be done by GitDetector.Detect, as it + // does not require the additional contextual information available + // here. We'll delegate the processing to it in order to avoid duplicating + // the logic here (even if doing so might allow us to provide more precise + // error messages), but we'll intercept it's "not detected" responses and + // change them to errors, if needed. // - return (&GitDetector{}).Detect(src, pwd) + rslt, ok, err := (&GitDetector{}).Detect(src, pwd) + if err != nil { + return "", false, err + } + if ok { + return rslt, true, nil + } + if mustField { + return "", false, fmt.Errorf("forced 'git::' handling: was unable to interpret src string: %s", src) + } + return "", false, nil +} + +// detectGitForceFilepath() allows the 'git::' force token to be used on a +// filepath to a Git repository. Both absolute and relative filepaths are +// supported. +// +// When in-effect, the returned string will contain: +// +// "git::file://" +// OR: +// "git::file://?" +// +// where is the provided src param expanded to an absolute filepath +// (if it wasn't already). If the provided src param is a relative filepath, +// then the expanded absolute file path is based on the 'pwd' param, unless +// 'srcResolveFrom' is non-empty; then the expanded path is based on +// 'srcResolveFrom'. +// +// For consistency with Client, this function will not expand filepaths based +// on the process's current working directory. See comment in client.go +// +// Note that detectGitForceFilepath() only handles filepaths that have been +// explicitly forced (via the 'git::' force token) for Git processing. +// +// +// Q: Why isn't this functionality in our GitDetector.Detect() implementation? +// +// A: It is only safe to do when the 'git::' force token is provided, and the +// Detector interface does not currently support a mechanism for the caller +// to supply that context. Consequently, the GitDetector implementation +// cannot assume that a filepath is intended for Git processing (that is, +// as representing a Git repository). Enter the CtxDetector interface, and +// this function. +// +// +// Q: Why is the returned value embedded in a 'file://' URI? +// +// A. When specifying the 'git::' force token on a filepath, the intention is +// to indicate that it is a path to a Git repository that can be cloned, +// etc. Though Git will accept both relative and absolute file paths for +// this purpose, we unlock more functionality by using a 'file://' URI. In +// particular, that allows our GitGetter to use any provided query params +// to specify a specific tag or commit hash, for example. +// +// +// Q: Why is a relative path expanded to an absolute path in the returned +// 'file://' URI? +// +// A: Relative paths are technically not "legal" in RFC 1738- and RFC 8089- +// compliant 'file://' URIs, and (more importantly for our purposes here) +// they are not accepted by Git. When generating a 'file://' URI as we're +// doing here, using the absolute path is the only useful thing to do. +// +// +// Q: Why support this functionality at all? Why not just require that a +// source location use an absolute path in a 'file://' URI explicitly if +// that's what is needed? +// +// A: The primary reason is to allow support for relative filepaths to Git +// repos. There are use cases in which the absolute path cannot be known in +// advance, but a relative path to a Git repo is known. For example, when a +// Terraform project (or any Git-based project) uses Git submodules, it +// will know the relative location of the Git submodule repos, but cannot +// know the absolute path in advance because that will vary based on where +// the "superproject" repo is cloned. Nevertheless, those relative paths +// should be usable as clonable Git repos, and this mechanism allows for +// that. +// +// Support for filepaths that are already absolute is provided mainly for +// symmetry. It would be surprising if the feature worked for relative +// filepaths and not for absolute filepaths. +// +// +// Param Notes +// ----------- +// 'src' (required) Filepath detection only works if the 'src' param start +// with './', '../', '/', or their Windows equivalents. That is, the +// value must be clearly identifiable as a filepath reference without +// requiring that we actually touch the filesystem itself. +// +// Values such as 'foo', or 'foo/bar', which may or may not be +// filepaths, are ignored (this function will not have an effect on +// them). +// +// 'pwd' (optional) For compatibility with the Detector interface (which has a +// 'pwd' param but not a 'srcResolveFrom' param), this is used as the +// default path against which to resolve a relative filepath in +// 'src'. May be overridden by 'srcResolveFrom'. +// +// When used for filepath resolution, it is /required/ to be a rooted +// filepath (that is, it must be absolute). This is a restriction of +// this CtxDetector implementation that is more strict than what is +// required by CtxDetector.CtxDetect. +// +// 'force' (optional) Indicates whether or not the 'git::' forcing token was +// specified. If empty, or if non-empty, but contains a value other than +// "git", then this function is effectively a NOOP. +// +// 'srcResolveFrom' (optional) If non-empty, then overrides 'pwd' as the +// location from which a relative filepath value in 'src' will be +// resolved. This is useful when the relative filepath in 'src' is +// relative to a different location than the 'pwd' of the process +// (example: Terraform modules in subdirectories -- the filepaths need +// to be interpretted relative to the directory in which the module +// lives). +// +// As with 'pwd', when used for filepath resolution, 'srcResolveFrom' is +// /required/ to be a rooted filepath. This is a restriction of this +// CtxDetector implementation that is more strict than what is required +// by CtxDetector.CtxDetect. +// +// Return value: +// ------------- +// The filepath in our return string is Clean, in the filepath.Clean() +// sense. That means: +// +// 1. Even when 'src' was provided with an absolute filepath value, the +// emitted cleaned value may be different. +// +// 2. Anything that looks like a go-getter "subdir" value +// ('//some/subdir') in 'src' will not be distinguishable as such in +// our result string (because '//' would get cleaned to just '/'). This +// should not be a problem in practice, though as the CtxDetect() +// dispatch function removes such values from 'src' prior to invoking +// the 'CtxDetect()' method of the CtxDetector(s); this function should +// only see such values in 'src' when running under our unit tests. +// +func detectGitForceFilepath(src, pwd, force, srcResolveFrom string) (string, bool, error) { + + // The full force key token is 'git::', but the Detect() dispatcher + // function provides our CtxDetect() method with the parsed value (without + // the trailing colons). + // + if force != "git" { + return "", false, nil + } + + if len(src) == 0 { + // cannot be a filepath; not a value for this function + return "", false, nil + } + + var srcResolvedAbs string + + if filepath.IsAbs(src) { + srcResolvedAbs = src + } else { + + // For our purposes, a relative filepath MUST begin with './' or + // '../', or the Windows equivalent. + // + if !isLocalSourceAddr(src) { + // src is neither an absolute nor relative filepath (at least not + // obviously so), so we'll treat it as "not for us". + return "", false, nil + } + + // Recall that the result of filepath.Join() is Cleaned + if len(srcResolveFrom) > 0 { + + if !filepath.IsAbs(srcResolveFrom) { + return "", false, fmt.Errorf("unable to resolve 'git::' forced filepath (%s)"+ + "; provided srcResolveFrom filepath (%s) is not rooted", src, srcResolveFrom) + } + + srcResolvedAbs = filepath.Join(srcResolveFrom, src) + + } else if len(pwd) > 0 { + + if !filepath.IsAbs(pwd) { + return "", false, fmt.Errorf("unable to resolve 'git::' forced filepath (%s)"+ + "; provided pwd filepath (%s) is not rooted", pwd, srcResolveFrom) + } + + srcResolvedAbs = filepath.Join(pwd, src) + } else { + // There's no way to resolve a more complete filepath for 'src' + // other than to do so relative to the current working directory + // of the process (which go-getter won't do, by design; see + // comment in Client). + + return "", false, fmt.Errorf("unable to resolve 'git::' force filepath (%s)"+ + "; neither 'pwd' nor 'srcResolveFrom' param was provided", src) + } + } + + srcResolvedAbs = filepath.Clean(srcResolvedAbs) + + // To make the filepath usable (hopefully) in a 'file://' URI, we may need + // to flip Windows-style '\' to URI-style '/'. + // + // Note that filepath.ToSlash is effectively a NOOP on platforms where '/' + // is the os.Separator; when running on Unix-like platforms, this WILL NOT + // hose your Unix filepaths that just happen to have backslash characters + // in them. + // + srcResolvedAbs = filepath.ToSlash(srcResolvedAbs) + + if !strings.HasPrefix(srcResolvedAbs, "/") { + + // An absolute file path on Unix will start with a '/', but that is + // not true for all OS's. RFC 8089 makes the authority component + // (including the '//') optional in a 'file:' URL, but git (at least + // as of version 2.28.0) only recognizes the 'file://' form. In fact, + // the git-clone(1) manpage is explicit that it wants the syntax to + // be: + // + // file:///path/to/repo.git/ + // + // Some notes on the relevant RFCs: + // + // RFC 1738 (section 3.10, "FILES") documents a and + // portion being separated by a '/' character: + // + // file:/// + // + // RFC 2396 (Appendix G.2, "Modifications from both RFC 1738 and RFC + // 1808") refines the above by declaring that the '/' is actually part + // of the path. It is still required to separate the "authority + // portion" of the URI from the path portion, but is not a separate + // component of the URI syntax. + // + // RFC 3986 (Section 3.2, "Authority") states that the authority + // component of a URI "is terminated by the next slash ("/"), question + // mark ("?"), or number sign ("#") character, or by the end of the + // URI." However, for 'file:' URIs, only those terminated by a '/' + // characters are supported by Git (as noted above). + // + // RFC 8089 (Appendix A, "Differences from Previous Specifications") + // references the RFC 1738 form including the required '/' after the + // /authority component. + // + // Because it is the most compatible approach across the only + // partially compatible RFC recommendations, and (more importantly) + // because it is what Git requires for 'file:' URIs, we require that + // our absolute path value start with a '/' character. + // + srcResolvedAbs = "/" + srcResolvedAbs + } + + // Our 'srcResolvedAbs' value may have URI query parameters (e.g., + // "ref=v1.2.3"), and the path elements may have characters that would + // need to be escaped in a proper URI. We'll leverage url.Parse() to deal + // with all of that, and then down below the stringified version of it + // will be properly encoded. + // + u, err := url.Parse("file://" + srcResolvedAbs) + if err != nil { + return "", false, fmt.Errorf("error parsing 'git::' force filepath (%s) to URL: %s", srcResolvedAbs, err) + } + + rtn := fmt.Sprintf("%s::%s", force, u.String()) + + return rtn, true, nil +} + +// Borrowed from terraform/internal/initwd/getter.go +// (modified here to accept "." and "..", too, if exact, full matches) +var localSourcePrefixes = []string{ + `./`, + `../`, + `.\\`, + `..\\`, +} +var localExactMatches = []string{ + `.`, + `..`, +} + +func isLocalSourceAddr(addr string) bool { + for _, value := range localExactMatches { + if value == addr { + return true + } + } + for _, prefix := range localSourcePrefixes { + if strings.HasPrefix(addr, prefix) { + return true + } + } + return false } From a19ad2aa513eeb2696885eaaf92247a0cdc87bd6 Mon Sep 17 00:00:00 2001 From: "Alan D. Salewski" Date: Sun, 30 Aug 2020 17:46:36 -0400 Subject: [PATCH 7/7] [6 of 6] Add unit tests for GitCtxDetector Signed-off-by: Alan D. Salewski --- detect_ctx_git_test.go | 1139 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 1139 insertions(+) create mode 100644 detect_ctx_git_test.go diff --git a/detect_ctx_git_test.go b/detect_ctx_git_test.go new file mode 100644 index 000000000..54abd8afd --- /dev/null +++ b/detect_ctx_git_test.go @@ -0,0 +1,1139 @@ +package getter + +import ( + "testing" + + "path/filepath" +) + +func TestCtxGitDetector(t *testing.T) { + cases := []struct { + Input string + Output string + }{ + { + "git@github.com:hashicorp/foo.git", + "git::ssh://git@github.com/hashicorp/foo.git", + }, + { + "git@github.com:org/project.git?ref=test-branch", + "git::ssh://git@github.com/org/project.git?ref=test-branch", + }, + { + "git@github.com:hashicorp/foo.git//bar", + "git::ssh://git@github.com/hashicorp/foo.git//bar", + }, + { + "git@github.com:hashicorp/foo.git?foo=bar", + "git::ssh://git@github.com/hashicorp/foo.git?foo=bar", + }, + { + "git@github.xyz.com:org/project.git", + "git::ssh://git@github.xyz.com/org/project.git", + }, + { + "git@github.xyz.com:org/project.git?ref=test-branch", + "git::ssh://git@github.xyz.com/org/project.git?ref=test-branch", + }, + { + "git@github.xyz.com:org/project.git//module/a", + "git::ssh://git@github.xyz.com/org/project.git//module/a", + }, + { + "git@github.xyz.com:org/project.git//module/a?ref=test-branch", + "git::ssh://git@github.xyz.com/org/project.git//module/a?ref=test-branch", + }, + { + // Already in the canonical form, so no rewriting required + // When the ssh: protocol is used explicitly, we recognize it as + // URL form rather than SCP-like form, so the part after the colon + // is a port number, not part of the path. + "git::ssh://git@git.example.com:2222/hashicorp/foo.git", + "git::ssh://git@git.example.com:2222/hashicorp/foo.git", + }, + } + + pwd := "/pwd" + f := new(GitCtxDetector) + ds := []CtxDetector{f} + for _, tc := range cases { + t.Run(tc.Input, func(t *testing.T) { + output, err := CtxDetect(tc.Input, pwd, "", ds) + if err != nil { + t.Fatalf("unexpected error: %s", err) + } + + if output != tc.Output { + t.Errorf("wrong result\ninput: %s\ngot: %s\nwant: %s", tc.Input, output, tc.Output) + } + }) + } +} + +// The empty string value for the 'pwd' and/or 'srcResolveFrom' params is +// interpretted as "no value provided". When neither is provided, there will +// be no filepath against which to resolve the 'src' param. +// +// If 'src' is already an unambiguous absolute filepath, then Git filepath +// detection will still work (because neither 'pwd' nor 'srcResolveFrom' was +// needed to do so). +// +func Test_detectGitForceFilepath_directly_pos_pwd_and_srcResolveFrom_both_emtpy(t *testing.T) { + + posCases := []struct { + Input string + Output string + }{ + { + "/somedir", + "git::file:///somedir", + }, + { + "/somedir/two", + "git::file:///somedir/two", + }, + { + "/somedir/two with space", + "git::file:///somedir/two%20with%20space", + }, + { + "/somedir/two/three", + "git::file:///somedir/two/three", + }, + { + "/somedir/two with space/three", + "git::file:///somedir/two%20with%20space/three", + }, + + { // subdir-looking thing IS NOT retained here; is "cleaned away" (good) + "/somedir/two//three", + "git::file:///somedir/two/three", + }, + + { + "/somedir/two/three?ref=v4.5.6", + "git::file:///somedir/two/three?ref=v4.5.6", + }, + + { + "/somedir/two with space/three?ref=v4.5.6", + "git::file:///somedir/two%20with%20space/three?ref=v4.5.6", + }, + } + + pwd := "" + srcResolveFrom := "" + force := "git" // parsed form of magic 'git::' force token + + for _, tc := range posCases { + t.Run(tc.Input, func(t *testing.T) { + output, ok, err := detectGitForceFilepath(tc.Input, pwd, force, srcResolveFrom) + if err != nil { + t.Fatalf("unexpected error: %s", err) + } + if !ok { + t.Fatalf("unexpected !ok") + } + + if output != tc.Output { + t.Errorf("wrong result\ninput: %s\ngot: %s\nwant: %s", tc.Input, output, tc.Output) + } + }) + } +} + +// The empty string value for the 'pwd' and/or 'srcResolveFrom' params is +// interpretted as "no value provided". When neither is provided, there will +// be no filepath against which to resolve the 'src' param. +// +// Regardless of whether the whether or not 'pwd' and/or 'srcResolveFrom' +// params are provided, if the 'src' param does not look unambiguously like a +// filepath, then detectGitForceFilepath(...) should not process the input +// string. +// +// For these negative cases, anything that looks like a relative filepath will +// not be processable. Unlike cases in which detectGitForceFilepath() ignores +// values that are not intended for it, in this case the 'git::' force token +// coupled with the relative filepath means the item is definitely intended +// for the function. But it simply cannot be processed because no rooted path +// value was provided against which to resolve 'src'. The function therefore +// emits an error directly. +// +func Test_detectGitForceFilepath_directly_neg_pwd_and_srcResolveFrom_both_emtpy(t *testing.T) { + + // These are cases that should be ignored because the provided string does + // not look like a filepath. These will not be processed, but no error + // should be emitted for them. + // + // See also negErrorCases below. + // + negIgnoreCases := []struct { + Input string + Output string + }{ + { + "", + "", + }, + { + "foo", + "", + }, + { + "foo/bar", + "", + }, + { + "foo/bar/", + "", + }, + { + "git@github.com:hashicorp/foo.git", + "", + }, + { + "git@github.com:org/project.git?ref=test-branch", + "", + }, + { + "git@github.com:hashicorp/foo.git//bar", + "", + }, + { + "git@github.com:hashicorp/foo.git?foo=bar", + "", + }, + { + "git@github.xyz.com:org/project.git", + "", + }, + { + "git@github.xyz.com:org/project.git?ref=test-branch", + "", + }, + { + "git@github.xyz.com:org/project.git//module/a", + "", + }, + { + "git@github.xyz.com:org/project.git//module/a?ref=test-branch", + "", + }, + { + "ssh://git@git.example.com:2222/hashicorp/foo.git", + "", + }, + } + + // These are cases that should cause an error to be emitted directly by + // the function because the provided string looks like a relative + // filepath, but no rooted 'pwd' or 'srcResolveFrom' value was provided + // against which to resolve them. + // + // See also negIgnoreCases above. + // + negErrorCases := []struct { + Input string + Output string + }{ + { + ".", + "", + }, + { + "./", + "", + }, + { + "./.", + "", + }, + { + "././", + "", + }, + { + "././.", + "", + }, + { + "./././", + "", + }, + { + "..", + "", + }, + { + "../", + "", + }, + { + "../..", + "", + }, + { + "../../", + "", + }, + { + "../../..", + "", + }, + { + "../../../", + "", + }, + + { + "./somedir", + "", + }, + { + "./somedir/two", + "", + }, + { + "./somedir/two with space", + "", + }, + { + "./somedir/two/three", + "", + }, + { + "./somedir/two//three", + "", + }, + { + "./somedir/two/three?ref=v4.5.6", + "", + }, + + { + "../some-parent-dir", + "", + }, + { + "../some-parent-dir/two", + "", + }, + + { + "../some-parent-dir/two/three", + "", + }, + { + "../some-parent-dir/two//three", + "", + }, + + { + "../../some-grandparent-dir", + "", + }, + { + "../../some-grandparent-dir?ref=v1.2.3", + "", + }, + + { + "../../../some-great-grandparent-dir", + "", + }, + { + "../../../some-great-grandparent-dir?ref=v1.2.3", + "", + }, + + { + "../../../../some-2x-great-grandparent-dir", + "", + }, + { + "../../../../some-2x-great-grandparent-dir?ref=v1.2.3", + "", + }, + + { + "../../../../../some-3x-great-grandparent-dir", + "", + }, + { + "../../../../../some-3x-great-grandparent-dir?ref=v1.2.3", + "", + }, + } + + pwd := "" + srcResolveFrom := "" + force := "git" + + for _, tc := range negIgnoreCases { + t.Run(tc.Input, func(t *testing.T) { + output, ok, err := detectGitForceFilepath(tc.Input, pwd, force, srcResolveFrom) + if err != nil { + t.Fatalf("unexpected error: %s", err) + } + if ok { + t.Errorf("unexpected ok on input: %s", tc.Input) + } + if output != "" { + t.Errorf("unexpected non-empty output string; input: %s; output: %s", tc.Input, output) + } + }) + } + + for _, tc := range negErrorCases { + t.Run(tc.Input, func(t *testing.T) { + output, ok, err := detectGitForceFilepath(tc.Input, pwd, force, srcResolveFrom) + if err == nil { + t.Errorf("unexpected non-error; input: %s; output: %s; ok: %t", tc.Input, output, ok) + } + }) + } +} + +// When the 'srcResolveFrom' param is empty, then filepath resolution should +// be performed against the 'pwd' param (unless it, too, is empty). In this +// function, srcResolveFrom is always empty, and pwd is always non-empty and +// rooted. +// +func Test_detectGitForceFilepath_directly_pos_empty_srcResolveFrom(t *testing.T) { + + // Test case input values represent the parsed inputs, similar to what + // would be presented by the 'CtxDetect' dispatch function. + // + // CAREFUL: Recall that the 'CtxDetect' dispatch function does subdir + // parsing before attempting to invoke the 'CtxDetect' methods of + // the CtxDetectors. The subdir value is passed as a separate + // 'subDir' param to the 'CtxDetect' methods, and in the + // particular case of our GitCtxDetector that subDir param is not + // provided to the 'detectGitForceFilepath' function. So in + // practice 'detectGitForceFilepath' should never see a ('//') + // subdir in its 'src' param. Consequently, our positive tests + // below that fake one up, anyway, result in "cleaned" paths (in + // the filepath.Clean() sense). The function is behaving + // correctly by /not/ doing explicit subdir handling, and by + // "cleaning away" filepaths that happen to have subdir-looking + // sections in them. + + pwd := "/some/caller-provided/abs/path" + pwdMinusOne := filepath.Dir(pwd) // dirname + pwdMinusTwo := filepath.Dir(pwdMinusOne) // likewise + pwdMinusThree := filepath.Dir(pwdMinusTwo) // likewise + pwdMinusFour := filepath.Dir(pwdMinusThree) // likewise; is root ('/') + + // Sanity check our "directory math" + if pwdMinusThree == "/" { + t.Fatalf("pwdMinusThree is unexpectedly the top-level root ('/') directory") + } + if pwdMinusFour != "/" { + t.Fatalf("pwdMinusFour is not the top-level root ('/') directory; is: %s", pwdMinusFour) + } + + // The 'CtxDetect' protocol, ultimately implemented here in our + // 'detectGitForceFilepath' function, is that 'src' is resolved relative + // to 'pwd', unless the provided 'srcResolveFrom' param is non-empty; then + // the 'src' is resolved relative to 'srcResolveFrom'. + // + // Because our 'pwd' param is non-empty in all of these cases, these tests + // demonstrate the above protocol working. + // + posCases := []struct { + Input string + Output string + }{ + { + ".", + "git::file://" + pwd, + }, + { + "./", + "git::file://" + pwd, + }, + { + "./.", + "git::file://" + pwd, + }, + { + "././", + "git::file://" + pwd, + }, + { + "././.", + "git::file://" + pwd, + }, + { + "./././", + "git::file://" + pwd, + }, + + { + "..", + "git::file://" + pwdMinusOne, + }, + { + "../", + "git::file://" + pwdMinusOne, + }, + { + "../.", + "git::file://" + pwdMinusOne, + }, + { + ".././", + "git::file://" + pwdMinusOne, + }, + + { + "../../", + "git::file://" + pwdMinusTwo, + }, + + { + "../../../", + "git::file://" + pwdMinusThree, + }, + + { + "../../../../", + "git::file://" + pwdMinusFour, + }, + + { // Same output as previous, since we only have 2x + // great-grandparents before we hit the top-level root directory + "../../../../../", + "git::file://" + pwdMinusFour, + }, + + { + "/somedir", + "git::file:///somedir", + }, + { + "./somedir", + "git::file://" + pwd + "/somedir", + }, + + { + "/somedir/two", + "git::file:///somedir/two", + }, + { + "./somedir/two", + "git::file://" + pwd + "/somedir/two", + }, + + { + "/somedir/two/three", + "git::file:///somedir/two/three", + }, + { + "./somedir/two/three", + "git::file://" + pwd + "/somedir/two/three", + }, + + { // subdir-looking thing IS NOT retained here; is "cleaned away" (good) + "/somedir/two//three", + "git::file:///somedir/two/three", + }, + { + "./somedir/two//three", + "git::file://" + pwd + "/somedir/two/three", // no subdir here (good) + }, + + { + "/somedir/two/three?ref=v4.5.6", + "git::file:///somedir/two/three?ref=v4.5.6", + }, + { + "./somedir/two/three?ref=v4.5.6", + "git::file://" + pwd + "/somedir/two/three?ref=v4.5.6", + }, + + { + "../some-parent-dir", + "git::file://" + pwdMinusOne + "/some-parent-dir", + }, + { + "../some-parent-dir/two", + "git::file://" + pwdMinusOne + "/some-parent-dir/two", + }, + + { + "../some-parent-dir/two/three", + "git::file://" + pwdMinusOne + "/some-parent-dir/two/three", + }, + { + "../some-parent-dir/two//three", + "git::file://" + pwdMinusOne + "/some-parent-dir/two/three", // no subdir here (okay) + }, + + { + "../../some-grandparent-dir", + "git::file://" + pwdMinusTwo + "/some-grandparent-dir", + }, + { + "../../some-grandparent-dir?ref=v1.2.3", + "git::file://" + pwdMinusTwo + "/some-grandparent-dir?ref=v1.2.3", + }, + + { + "../../../some-great-grandparent-dir", + "git::file://" + pwdMinusThree + "/some-great-grandparent-dir", + }, + { + "../../../some-great-grandparent-dir?ref=v1.2.3", + "git::file://" + pwdMinusThree + "/some-great-grandparent-dir?ref=v1.2.3", + }, + + { + "../../../../some-2x-great-grandparent-dir", + "git::file://" + pwdMinusFour + "some-2x-great-grandparent-dir", + // .............................^ + // CAREFUL: No explicit leading '/' here because 'pwdMinusFour' is + // our top-level root directory; is just '/'; we do not + // expect 'file:////some-2x-...' + }, + { + "../../../../some-2x-great-grandparent-dir?ref=v1.2.3", + "git::file://" + pwdMinusFour + "some-2x-great-grandparent-dir?ref=v1.2.3", + }, + + { // Same output as previous, since we only have 2x + // great-grandparents before we hit the top-level root directory + "../../../../../some-3x-great-grandparent-dir", + "git::file://" + pwdMinusFour + "some-3x-great-grandparent-dir", + }, + { + "../../../../../some-3x-great-grandparent-dir?ref=v1.2.3", + "git::file://" + pwdMinusFour + "some-3x-great-grandparent-dir?ref=v1.2.3", + }, + } + + force := "git" // parsed form of magic 'git::' force token + + srcAbsResolveFrom := "" + + for _, tc := range posCases { + t.Run(tc.Input, func(t *testing.T) { + output, ok, err := detectGitForceFilepath(tc.Input, pwd, force, srcAbsResolveFrom) + if err != nil { + t.Fatalf("unexpected error: %s", err) + } + if !ok { + t.Fatalf("unexpected !ok") + } + + if output != tc.Output { + t.Errorf("wrong result\ninput: %s\ngot: %s\nwant: %s", tc.Input, output, tc.Output) + } + }) + } +} + +func Test_detectGitForceFilepath_directly_pos_nonempty_srcResolveFrom(t *testing.T) { + + // Test case input values represent the parsed inputs, similar to what + // would be presented by the 'CtxDetect' dispatch function. + // + // CAREFUL: Recall that the 'CtxDetect' dispatch function does subdir + // parsing before attempting to invoke the 'CtxDetect' methods of + // the CtxDetectors. The subdir value is passed as a separate + // 'subDir' param to the 'CtxDetect' methods, and in the + // particular case of our GitCtxDetector that subDir param is not + // provided to the 'detectGitForceFilepath' function. So in + // practice 'detectGitForceFilepath' should never see a ('//') + // subdir in its 'src' param. Consequently, our positive tests + // below that fake one up, anyway, result in "cleaned" paths (in + // the filepath.Clean() sense). The function is behaving + // correctly by /not/ doing explicit subdir handling, and by + // "cleaning away" filepaths that happen to have subdir-looking + // sections in them. + // + // Compare with: + // Test_detectGitForceFilepath_indirectly() + // which leverages the higher-level processing of CtxDetect(), as + // well. + + srcAbsResolveFrom := "/some/caller-provided/abs/path" + srcAbsResolveFromMinusOne := filepath.Dir(srcAbsResolveFrom) // dirname + srcAbsResolveFromMinusTwo := filepath.Dir(srcAbsResolveFromMinusOne) // likewise + srcAbsResolveFromMinusThree := filepath.Dir(srcAbsResolveFromMinusTwo) // likewise + srcAbsResolveFromMinusFour := filepath.Dir(srcAbsResolveFromMinusThree) // likewise; is root ('/') + + // Sanity check our "directory math" + if srcAbsResolveFromMinusThree == "/" { + t.Fatalf("srcAbsResolvFromMinusThree is unexpectedly the top-level root ('/') directory") + } + if srcAbsResolveFromMinusFour != "/" { + t.Fatalf("srcAbsResolvFromMinusFour is not the top-level root ('/') directory; is: %s", srcAbsResolveFromMinusFour) + } + + // Our test loop below runs one iteration, with a hard-coded, non-empty + // 'pwd' value. + // + // Our non-empty srcAbsResolveFrom values for the 'srcResolveFrom' param + // are provided in addition to the value provided for the 'pwd' param. + // + // The 'CtxDetect' protocol, ultimately implemented here in our + // 'detectGitForceFilepath' function, is that 'src' is resolved relative + // to 'pwd', unless the provided 'srcResolveFrom' param is non-empty; then + // the 'src' is resolved relative to 'srcResolveFrom'. + // + // Because our 'pwd' param is non-empty in all of these cases, these tests + // demonstrate the above protocol working. + // + posCases := []struct { + Input string + Output string + }{ + { + "/somedir", + "git::file:///somedir", + }, + { + "./somedir", + "git::file://" + srcAbsResolveFrom + "/somedir", + }, + + { + "/somedir/two", + "git::file:///somedir/two", + }, + { + "./somedir/two", + "git::file://" + srcAbsResolveFrom + "/somedir/two", + }, + + { + "/somedir/two/three", + "git::file:///somedir/two/three", + }, + { + "./somedir/two/three", + "git::file://" + srcAbsResolveFrom + "/somedir/two/three", + }, + + { // subdir-looking thing IS NOT retained here; is "cleaned away" (good) + "/somedir/two//three", + "git::file:///somedir/two/three", + }, + { + "./somedir/two//three", + "git::file://" + srcAbsResolveFrom + "/somedir/two/three", // no subdir here (good) + }, + + { + "/somedir/two/three?ref=v4.5.6", + "git::file:///somedir/two/three?ref=v4.5.6", + }, + { + "./somedir/two/three?ref=v4.5.6", + "git::file://" + srcAbsResolveFrom + "/somedir/two/three?ref=v4.5.6", + }, + + { + "../some-parent-dir", + "git::file://" + srcAbsResolveFromMinusOne + "/some-parent-dir", + }, + { + "../some-parent-dir/two", + "git::file://" + srcAbsResolveFromMinusOne + "/some-parent-dir/two", + }, + + { + "../some-parent-dir/two/three", + "git::file://" + srcAbsResolveFromMinusOne + "/some-parent-dir/two/three", + }, + { + "../some-parent-dir/two//three", + "git::file://" + srcAbsResolveFromMinusOne + "/some-parent-dir/two/three", // no subdir here (okay) + }, + + { + "../../some-grandparent-dir", + "git::file://" + srcAbsResolveFromMinusTwo + "/some-grandparent-dir", + }, + { + "../../some-grandparent-dir?ref=v1.2.3", + "git::file://" + srcAbsResolveFromMinusTwo + "/some-grandparent-dir?ref=v1.2.3", + }, + + { + "../../../some-great-grandparent-dir", + "git::file://" + srcAbsResolveFromMinusThree + "/some-great-grandparent-dir", + }, + { + "../../../some-great-grandparent-dir?ref=v1.2.3", + "git::file://" + srcAbsResolveFromMinusThree + "/some-great-grandparent-dir?ref=v1.2.3", + }, + + { + "../../../../some-2x-great-grandparent-dir", + "git::file://" + srcAbsResolveFromMinusFour + "some-2x-great-grandparent-dir", + // ...........................................^ + // CAREFUL: No explicit leading '/' here because + // 'srcAbsResolveFromMinusFour' is our top-level root + // directory; is just '/'; we do not expect 'file:////some-2x-...' + }, + { + "../../../../some-2x-great-grandparent-dir?ref=v1.2.3", + "git::file://" + srcAbsResolveFromMinusFour + "some-2x-great-grandparent-dir?ref=v1.2.3", + }, + + { // Same output as previous, since we only have 2x + // great-grandparents before we hit the top-level root directory + "../../../../../some-3x-great-grandparent-dir", + "git::file://" + srcAbsResolveFromMinusFour + "some-3x-great-grandparent-dir", + }, + { + "../../../../../some-3x-great-grandparent-dir?ref=v1.2.3", + "git::file://" + srcAbsResolveFromMinusFour + "some-3x-great-grandparent-dir?ref=v1.2.3", + }, + } + + pwd := "/pwd-ignored" + force := "git" // parsed form of magic 'git::' force token + + for _, tc := range posCases { + t.Run(tc.Input, func(t *testing.T) { + output, ok, err := detectGitForceFilepath(tc.Input, pwd, force, srcAbsResolveFrom) + if err != nil { + t.Fatalf("unexpected error: %s", err) + } + if !ok { + t.Fatalf("unexpected !ok") + } + + if output != tc.Output { + t.Errorf("wrong result\ninput: %s\ngot: %s\nwant: %s", tc.Input, output, tc.Output) + } + }) + } +} + +func Test_detectGitForceFilepath_directly_neg(t *testing.T) { + + // Test case input values represent the parsed inputs, similar to what + // would be presented by the CtxDetect() dispatch function. + // + // These are negative tests; the input represent values that + // detectGitForceFilepath() should effectively ignore. So most outputs are + // expected to be a !ok flag coupled with an empty result string. + // + // Recall that detectGitForceFilepath() considers as relative only those + // paths that begin with './' or '../', or their Windows + // equivalents. Paths in the form 'parent-dir/child-dir' are not + // considered relative. + + negCases := []struct { + Input string + Output string + }{ + { + "", + "", + }, + { + "somedir", + "", + }, + { + "somedir/two", + "", + }, + { + "somedir/two//three", + "", + }, + { + "somedir/two/three?ref=v4.5.6", + "", + }, + } + + pwd := "/pwd-ignored" + srcResolveFrom := "/some/absolute/file/path/ignored" + + // We'll loop over our tests multiple times, with 'force' set to different + // values. The tests in which the value is not 'git' should short circuit + // quickly because the function should only have an effect when the force + // token is in effect. For these iterations, the input string values + // should be irrelevant. + // + // Those tests in which 'force' is set to "git" should also result in + // negative results here, but exercise a different part of the + // logic. These simulate the force token being in effect, and the function + // interpretting the string input values to decide whether to + // respond. Check the coverage report. + // + // The parsed form of the magic 'git::' force token is "git". All of the + // other values (including "git::") should be ignored by + // detectGitForceFilepath(). + // + forceVals := []string{"", "blah:", "blah::", "git:", "git::", "git"} + + for _, force := range forceVals { + + for _, tc := range negCases { + t.Run(tc.Input, func(t *testing.T) { + output, ok, err := detectGitForceFilepath(tc.Input, pwd, force, srcResolveFrom) + if err != nil { + t.Fatalf("unexpected error: %s", err) + } + if ok { + t.Errorf("unexpected ok on input: %s", tc.Input) + } + if output != "" { + t.Errorf("unexpected non-empty output string; input: %s; output: %s", tc.Input, output) + } + }) + } + } +} + +func Test_detectGitForceFilepath_indirectly_pos(t *testing.T) { + + // Test case input values represent the raw input provided to the + // CtxDetect() dispatch function, which will parse them prior to invoking + // GitCtxDetector's CtxDetect() method (which in turn invokes + // detectGitForceFilepath()). + + srcAbsResolveFrom := "/some/caller-provided/abs/path" + srcAbsResolveFromMinusOne := filepath.Dir(srcAbsResolveFrom) // dirname + srcAbsResolveFromMinusTwo := filepath.Dir(srcAbsResolveFromMinusOne) // likewise + srcAbsResolveFromMinusThree := filepath.Dir(srcAbsResolveFromMinusTwo) // likewise + srcAbsResolveFromMinusFour := filepath.Dir(srcAbsResolveFromMinusThree) // likewise; is root ('/') + + // Sanity check our "directory math" + if srcAbsResolveFromMinusThree == "/" { + t.Fatalf("srcAbsResolvFromMinusThree is unexpectedly the top-level root ('/') directory") + } + if srcAbsResolveFromMinusFour != "/" { + t.Fatalf("srcAbsResolvFromMinusFour is not the top-level root ('/') directory; is: %s", srcAbsResolveFromMinusFour) + } + + posCases := []struct { + Input string + Output string + }{ + { + "git::/somedir", + "git::file:///somedir", + }, + { + "git::./somedir", + "git::file://" + srcAbsResolveFrom + "/somedir", + }, + + { + "git::/somedir/two", + "git::file:///somedir/two", + }, + { + "git::./somedir/two", + "git::file://" + srcAbsResolveFrom + "/somedir/two", + }, + + { + "git::/somedir/two/three", + "git::file:///somedir/two/three", + }, + { + "git::./somedir/two/three", + "git::file://" + srcAbsResolveFrom + "/somedir/two/three", + }, + + { + "git::/somedir/two//three", + "git::file:///somedir/two//three", // subdir is preserved + }, + { + "git::./somedir/two//three", + "git::file://" + srcAbsResolveFrom + "/somedir/two//three", // subdir is preserved + }, + + { + "git::/somedir/two/three?ref=v4.5.6", + "git::file:///somedir/two/three?ref=v4.5.6", + }, + { + "git::./somedir/two/three?ref=v4.5.6", + "git::file://" + srcAbsResolveFrom + "/somedir/two/three?ref=v4.5.6", + }, + + { + "git::/somedir/two//three?ref=v4.5.6", + "git::file:///somedir/two//three?ref=v4.5.6", // subdir is preserved + }, + { + "git::./somedir/two//three?ref=v4.5.6", + "git::file://" + srcAbsResolveFrom + "/somedir/two//three?ref=v4.5.6", // subdir is preserved + }, + + { + "git::/somedir/two//three?ref=v4.5.6", + "git::file:///somedir/two//three?ref=v4.5.6", // subdir is preserved + }, + { + "git::./somedir/two//three?ref=v4.5.6", + "git::file://" + srcAbsResolveFrom + "/somedir/two//three?ref=v4.5.6", // subdir is preserved + }, + + { + "git::../some-parent-dir", + "git::file://" + srcAbsResolveFromMinusOne + "/some-parent-dir", + }, + { + "git::../some-parent-dir/two", + "git::file://" + srcAbsResolveFromMinusOne + "/some-parent-dir/two", + }, + { + "git::../some-parent-dir/two/three", + "git::file://" + srcAbsResolveFromMinusOne + "/some-parent-dir/two/three", + }, + { + "git::../some-parent-dir/two//three", + "git::file://" + srcAbsResolveFromMinusOne + "/some-parent-dir/two//three", // subdir is preserved + }, + { + "git::../../some-grandparent-dir", + "git::file://" + srcAbsResolveFromMinusTwo + "/some-grandparent-dir", + }, + { + "git::../../some-grandparent-dir?ref=v1.2.3", + "git::file://" + srcAbsResolveFromMinusTwo + "/some-grandparent-dir?ref=v1.2.3", + }, + { // subdir is preserved on output + "git::../../some-grandparent-dir/childdir//moduledir?ref=v1.2.3", + "git::file://" + srcAbsResolveFromMinusTwo + "/some-grandparent-dir/childdir//moduledir?ref=v1.2.3", + }, + } + + pwd := "/pwd-ignored" + + f := new(GitCtxDetector) + ds := []CtxDetector{f} + + for _, tc := range posCases { + t.Run(tc.Input, func(t *testing.T) { + + output, err := CtxDetect(tc.Input, pwd, srcAbsResolveFrom, ds) + if err != nil { + t.Fatalf("unexpected error: %s", err) + } + + if output != tc.Output { + t.Errorf("wrong result\ninput: %s\ngot: %s\nwant: %s", tc.Input, output, tc.Output) + } + }) + } +} + +func Test_detectGitForceFilepath_indirectly_neg(t *testing.T) { + + // Test case input values represent the path string input provided to the + // CtxDetect() dispatch function, which will parse them prior to invoking + // the CtxDetect() method on GitCtxDetector (which internally invokes + // detectGitForceFilepath()). + // + // We loop over the list several times with different force key values to + // trigger different paths through the code. All of these tests should + // produce negative results either due to the absence of a legit 'git::' + // force key, or because the path value does not represent a relative in + // the form required. + // + // When the 'git::' force key is specified for these values, then the + // GitCtxDetector will yield an error because the fields are clearly + // flagged for processing by that detector, but it is unable to do so + // because the input is not meaningful to it. + + negCases := []struct { + Input string + Output string + }{ + // FIXME: This first case (which is pathological, to be sure) + // incorrectly succeeds when joined with the 'git::' + // prefix. The reason is because the getForcedGetter function + // (in get.go) that is used to identify the force tokens is + // expecting at least one character after the '::' separator; + // at the time of writing it is using this regex: + // + // `^([A-Za-z0-9]+)::(.+)$` + // + // Since the length of the token on the left of the '::' is not + // size restricted, how to fix that without accidentally + // breaking somebody's use case? + // { + // "", + // "", + // }, + { + "somedir", + "", + }, + { + "somedir/two", + "", + }, + { + "somedir/two/three", + "", + }, + { + "somedir/two//three", + "", + }, + { + "somedir/two/three?ref=v4.5.6", + "", + }, + { + "somedir/two//three?ref=v4.5.6", + "", + }, + } + + pwd := "/pwd-ignored" + srcResolveFrom := "/src-resolve-from-ignored" + + f := new(GitCtxDetector) + ds := []CtxDetector{f} + + // Empty-string force will fail because no CtxDetector handles these + // quasi-relative filepaths. + // + // The 'git::' force will fail because neither GitCtxDetector.CtxDetect() + // nor detectGitForceFilepath() recognize the quasi-relative filepaths + // (and neither do any other Detector implemenations, though they are not + // invoked in this configuration). + // + forceVals := []string{"", "git::"} + + for _, force := range forceVals { + for _, tc := range negCases { + t.Run(tc.Input, func(t *testing.T) { + + output, err := CtxDetect(force+tc.Input, pwd, srcResolveFrom, ds) + if err == nil { + // When force is "", the error would be an "invalid source + // string" error from the CtxDetect dispatch function. + // + // When force is "git::", the error would be something + // from the GitCtxDetector's CtxDetect method complaining + // that the string value was force tagged for it to + // process, but it is not able to. + // + t.Fatalf("was expecting error, but call succeeded: output: %s (force is: %s)", output, force) + } + + if output != "" { + t.Errorf("unexpected non-empty output string; input: %s; output: %s (force is: %s)", tc.Input, output, force) + } + }) + } + } + +}