Add support for installing packages from authenticated repositories, with keyring support #729

Open
atheriel opened this issue Dec 20, 2024 · 1 comment

@atheriel

We're expecting to add an authenticated repositories feature to Posit Package Manager soon. Because we're hoping to support both Python and R, we're limited by the fact that pip and friends only support HTTP Basic Auth. So we know there won't be e.g. complex OAuth flows involved.

Now, it's technically possible to install packages from repos that use basic authentication today, e.g. with

options(repos = c(CRAN = "https://username:password@ppm.internal/cran/latest"))

but I'm really not a fan of this approach because I think it will lead users to copy around and embed plaintext credentials in configuration files. We've also found that it can cause issues on Windows if the password is too long -- which it will be in the case of a JWT.

Instead, I think we should follow the example of tools like pip and uv here, too, by checking the system keyring for passwords automatically.

(renv also has a very flexible mechanism for configuring auth headers that could be wired up to the system keyring, too.)

For example, when pak is installing packages from a repository URL like https://username@ppm.internal/cran/latest, we could automatically check whether there is a corresponding "password" (or more likely, a token or API key of some kind) in the system keyring and use that to construct an Authorization header:

repo_auth <- function(repo_url) {
  # Without the keyring package there is nowhere to look up credentials.
  if (!requireNamespace("keyring", quietly = TRUE)) {
    return(NULL)
  }

  # Pull the username out of the repo URL.
  parsed <- httr2::url_parse(repo_url)
  username <- parsed$username
  if (is.null(username)) {
    return(NULL)
  }

  # Reconstitute the repo URL without the username.
  parsed$username <- NULL
  base_url <- httr2::url_build(parsed)

  # A missing keyring entry raises an error; treat that as "no auth".
  tryCatch(
    {
      pwd <- keyring::key_get(base_url, username)
      auth <- paste(username, pwd, sep = ":")
      c("Authorization" = paste("Basic", openssl::base64_encode(auth)))
    },
    error = function(e) NULL
  )
}

The above should work with renv, too, via options(renv.download.headers = repo_auth).

To help users out, we could also have a utility function to set this "password". I'm imagining an API something like the following:

pak::repo_set_auth("https://username@ppm.internal/cran/latest")

This would call the equivalent of

keyring::key_set(
  "https://ppm.internal/cran/latest",
  "username",
  prompt = "Password, Token, or API Key: "
)

under the hood.
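
To make the proposal a bit more concrete, here is a minimal sketch of how such a helper might split the URL before calling keyring. The names split_repo_url() and repo_set_auth_sketch() are hypothetical and not part of pak; the sketch assumes the URL carries a username but no password, and that the keyring package is installed when the second function is called.

```r
# Hypothetical helper (not part of pak): split "https://user@host/path" into
# the username and the URL without it. Base R only. Assumes the URL contains
# no password; URLs of the form user:password@host are treated as having no
# username and are returned unchanged.
split_repo_url <- function(repo_url) {
  pattern <- "^([a-zA-Z][a-zA-Z0-9+.-]*://)([^/@:]+)@(.*)$"
  m <- regmatches(repo_url, regexec(pattern, repo_url))[[1]]
  if (length(m) == 0) {
    return(list(username = NULL, base_url = repo_url))
  }
  list(username = m[3], base_url = paste0(m[2], m[4]))
}

# Hypothetical sketch of the proposed repo_set_auth(): prompt for the secret
# and store it under the username-free URL as the keyring "service".
repo_set_auth_sketch <- function(repo_url) {
  parts <- split_repo_url(repo_url)
  if (is.null(parts$username)) {
    stop("The repository URL must contain a username.")
  }
  if (!requireNamespace("keyring", quietly = TRUE)) {
    stop("The keyring package is required to store credentials.")
  }
  keyring::key_set(
    service = parts$base_url,
    username = parts$username,
    prompt = "Password, Token, or API Key: "
  )
}
```

Keying the entry on the username-free URL matches the lookup that repo_auth() performs above, so the two stay in sync.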

A more advanced implementation might also prompt the user for credentials when a repo returns an HTTP 401 in an interactive session, and offer to save them in the system keyring. (This is impossible with install.packages() because that function swallows 401 responses.)
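
The retry-on-401 idea could be sketched roughly as follows. Everything here is an assumption for illustration: with_auth_retry() is a hypothetical name, fetch is assumed to be a function that takes an optional Authorization header value and returns a list with a status element, and ask_credentials is assumed to prompt the user and return a header value (or NULL if they decline).

```r
# Hypothetical sketch: retry a request once with credentials after a 401,
# but only in interactive sessions. `fetch` and `ask_credentials` are
# caller-supplied functions, not real pak or pkgcache APIs.
with_auth_retry <- function(fetch, ask_credentials,
                            interactive_session = interactive()) {
  # First attempt without any credentials.
  response <- fetch(NULL)
  if (response$status != 401 || !interactive_session) {
    return(response)
  }

  # Ask the user; a NULL answer means they declined, so keep the 401.
  header <- ask_credentials()
  if (is.null(header)) {
    return(response)
  }

  # Second and final attempt with the supplied Authorization header.
  fetch(header)
}
```

Offering to save the credentials in the keyring would slot in after a successful second attempt, so that bad credentials are never stored.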

Note: one could test basic auth support in pak by running a local NGINX with basic auth enabled proxying to https://p3m.dev.

I'm happy to help out with this, but I need some pointers on where the relevant changes would need to be made.

@gaborcsardi
Member

gaborcsardi commented Jan 15, 2025

My thoughts about this, initially a brain dump, to be edited and extended, here or elsewhere.

Must have

  • We need a way to store credentials that works with renv, Python, etc. as well, preferably OOTB.
  • Credentials must be specific to host names and/or URLs. (We can probably do the same as git here.)
  • We need to handle credentials in non-interactive sessions, preferably better than pip does. We could run keyring in a subprocess, with a timeout, or (much better) we could introduce a timeout into the keyring package, if possible. If we use a subprocess, then we'll probably need to cache credentials in env vars.
  • We need to keep pak self-contained, i.e. embed the keyring package or the oskeyring package into pak, probably. Also, preferably have a static pak build on Linux that is able to use the secret service API via dbus.
  • We need a config option to add arbitrary headers to pak HTTP queries.
  • Caching. We probably need to cache credentials in (host or URL specific) env vars, like the gitcreds package does.
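
As a rough illustration of the env var caching point, the sketch below derives a host-specific variable name and reads and writes the cached header through it. The PKG_REPO_AUTH_ prefix and all function names are hypothetical; this is not how gitcreds or pak name their variables.

```r
# Hypothetical naming scheme: PKG_REPO_AUTH_<HOST>, where every character of
# the host that is not a letter or digit becomes "_" and the result is
# upper-cased.
auth_envvar_name <- function(repo_url) {
  host <- sub("^[a-zA-Z][a-zA-Z0-9+.-]*://", "", repo_url)  # drop the scheme
  host <- sub("[/?#].*$", "", host)                         # drop path, query
  host <- sub("^.*@", "", host)                             # drop userinfo
  host <- sub(":[0-9]+$", "", host)                         # drop the port
  paste0("PKG_REPO_AUTH_", toupper(gsub("[^A-Za-z0-9]", "_", host)))
}

# Store an Authorization header value for this host in the environment.
cache_auth <- function(repo_url, header) {
  args <- stats::setNames(list(header), auth_envvar_name(repo_url))
  do.call(Sys.setenv, args)
  invisible(header)
}

# Return the cached header value, or NULL if nothing is cached.
cached_auth <- function(repo_url) {
  value <- Sys.getenv(auth_envvar_name(repo_url), unset = NA_character_)
  if (is.na(value)) NULL else value
}

cache_auth("https://ppm.internal/cran/latest", "Basic abc")
cached_auth("https://ppm.internal/cran/latest")
```

Because child processes inherit environment variables, a value cached this way would also be visible to a subprocess without a second keyring lookup.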

Nice to have

  • Better credential storage on server Linux, where there is effectively no system credential store.
  • Pluggable auth.
  • netrc file support, like pip.
  • We could probably use the same credential format that the git credential helpers use.
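
For the netrc item, a minimal base R reader might look like the sketch below. read_netrc() is a hypothetical name, and the parser only understands whitespace-separated machine, login, and password tokens; default entries, macdef blocks, comments, and quoted values are deliberately ignored.

```r
# Hypothetical sketch of a minimal netrc reader. Returns a named list keyed
# by machine name, each element holding whatever login/password was found.
read_netrc <- function(path) {
  lines <- trimws(readLines(path, warn = FALSE))
  tokens <- unlist(strsplit(lines, "[[:space:]]+"))
  tokens <- tokens[nzchar(tokens)]

  entries <- list()
  current <- NULL
  i <- 1
  # Walk the token stream as key/value pairs; stop before a dangling key.
  while (i < length(tokens)) {
    key <- tokens[i]
    value <- tokens[i + 1]
    if (key == "machine") {
      current <- value
      entries[[current]] <- list()
      i <- i + 2
    } else if (key %in% c("login", "password") && !is.null(current)) {
      entries[[current]][[key]] <- value
      i <- i + 2
    } else {
      # Unknown token: skip it and keep scanning.
      i <- i + 1
    }
  }
  entries
}

# Usage: write a small netrc file and look up one host.
path <- tempfile()
writeLines(c(
  "machine ppm.internal login alice password s3cret",
  "machine other.example",
  "  login bob",
  "  password hunter2"
), path)
read_netrc(path)[["ppm.internal"]]$login
```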

UI for users

I am not a big fan of the renv UI where the user needs to set an option to a function for a couple of reasons:

  • You need to edit your profile(s) to set options, and the profile does not run in --vanilla sessions, so there is no auth there.
  • The function might refer to packages (e.g. keyring), but that's problematic if keyring is the package that pak/renv is installing, especially on Windows.
  • It does not work well with pak doing things in a subprocess, because pak would need to copy the function to another process, and that's always error-prone.

I like the idea of having utility functions to get/set credentials:

pak::repo_auth("https://ppm.internal/cran/latest")
pak::repo_set_auth("https://ppm.internal/cran/latest")

I would not store the username in the repo URL, or at least that should not be required. If the username is not there, then the admin can configure repo URLs for all users more easily. Again, we could follow what git does here when it looks up credentials (including usernames) from the credential store.
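
One way to picture a lookup that does not need the username in the URL is sketched below: try the full repo URL first, then progressively shorter prefixes down to the bare host, and return the first stored entry (username included). This is only loosely inspired by git and is not its actual matching algorithm; the function names and the in-memory named list standing in for a credential store are assumptions.

```r
# Hypothetical sketch: list the keys to try, most specific first.
lookup_candidates <- function(repo_url) {
  # Drop any userinfo, query, fragment, and trailing slashes.
  url <- sub("^([a-zA-Z][a-zA-Z0-9+.-]*://)[^/@]*@", "\\1", repo_url)
  url <- sub("[?#].*$", "", url)
  url <- sub("/+$", "", url)
  scheme_host <- sub("^([a-zA-Z][a-zA-Z0-9+.-]*://[^/]+).*$", "\\1", url)

  candidates <- url
  # Strip one trailing path segment at a time until only the host is left.
  while (url != scheme_host && nchar(url) > nchar(scheme_host)) {
    url <- sub("/[^/]*$", "", url)
    candidates <- c(candidates, url)
  }
  candidates
}

# Return the first matching entry from `store` (a named list), or NULL.
find_credential <- function(store, repo_url) {
  for (key in lookup_candidates(repo_url)) {
    if (!is.null(store[[key]])) {
      return(store[[key]])
    }
  }
  NULL
}

# Usage: a host-wide entry covers every repo path on that host.
store <- list(
  "https://ppm.internal" = list(username = "alice", password = "token")
)
find_credential(store, "https://ppm.internal/cran/latest")$username
```

With this shape, an admin can publish plain repo URLs for everyone, and each user's stored entry supplies both the username and the secret.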

Issues with the keyring and oskeyring packages

We need to solve these at some point, some seem urgent, some not.

  • keyring has a lot of dependencies that we would need to get rid of before embedding it into pak. Alternatively, embedding a simplified version should also be possible.
  • oskeyring has no dependencies, but it is also not pluggable.
  • oskeyring does not support the secret service API on Linux, AFAIR. Could be easily added, though.
  • On macOS, keyring and oskeyring both use a deprecated macOS API. (See keyring#160, "macOS: use SecItem API".)

Implementation

A significant part of this should go into the pkgcache package. pkgcache handles all HTTP for downloading metadata and packages. So I think the first step would be to implement everything in pkgcache, and then solve the issues with embedding the new pkgcache into pak.

The HTTP client functions in https://github.com/r-lib/pkgcache/blob/main/R/async-http.R have a headers argument; that is where the additional headers need to be passed in.

atheriel added a commit to atheriel/pkgcache that referenced this issue Jan 16, 2025
This commit updates both the metadata and package caches to support
downloading packages and package indexes from repositories that require
HTTP basic authentication to access.

Initial support for these authenticated repositories is very narrow: the
repository URL must contain a username, no password, and have an entry
in the system keyring. We also don't make any attempt to prompt users
for credentials when requests fail.

Unit tests are included for the new authentication header helpers, but
there are currently no tests of end-to-end workflows with an
authenticated repository, and I may have missed something.

Part of r-lib/pak#729.

Signed-off-by: Aaron Jacobs <[email protected]>