Skip to content

Latest commit

 

History

History
111 lines (91 loc) · 7.18 KB

README.md

File metadata and controls

111 lines (91 loc) · 7.18 KB

binlore - generate and aggregate info about executables in nix packages

Since binlore is very young and currently has a limited scope, the vision may make a little more sense if I outline what it currently does and why. If you'd like to help improve binlore, see How to help

Why / Motive

I'm building binlore to help resholve decide how likely the executables it finds in shell scripts are to also execute one of their arguments.

This information helps resholve scrutinize these invocations more carefully and require user triage as-needed (without wasting user time on unlikely cases).

What / API (high level)

binlore itself is a Nix API with three main functions:

  • make which builds a derivation that runs a black-box we'll call [analysis] for a single package and outputs a directory with one or more named files containing some or all of the output from that analysis.
  • collect which builds a derivation that depends on and aggregates the output of make for each package in a list. In this case, aggregation means concatenating files with the same name for every package into a single file of the same name.
  • synthesize which (with the aid of a Shell DSL) makes it easy to attach manually-generated lore overrides to a specific package.
  • It'll take some use for norms/patterns to settle, but I tentatively see each file as a "type" or "kind" of lore.

Trying out the API

If you want to contribute to binlore or use binlore in your own project, then you’ll probably want to see what lore binlore produces for particular commands. Here’s how you would get the lore for hello and haskellPackages.hello:

$ git clone https://github.com/abathur/binlore.git
$ cd binlore
$ nix-build -E 'with import <nixpkgs> { }; (callPackage ./binlore.nix { binloreSrc = ./.; }).collect { drvs = [ hello haskellPackages.hello ]; }'
/nix/store/...-more-binlore
$ cat result/execers
cannot:/nix/store/...-hello-2.12.1/bin/hello
cannot:/nix/store/...-hello-1.0.0.2/bin/hello
$ cat result/wrappers
$

In this example, result/wrappers is empty.

How / Analyses

The only [analysis] so far meets resholve's immediate needs. You can find its definition in the loreDev attr in default.nix, but the broad strokes are that it:

Trying out the analyses (low-level)

Sometimes it's enough to see the high-level lore produced by the collect function for a specific package or executable, and other times you'll need to pop open the hood to understand how binlore's YARA rules are leading to a specific result. (Perhaps because it's wrong, or perhaps because binlore just doesn't have rules for a specific language or binary format.)

With the traditional nix commands, you can do something like:

$ git clone https://github.com/abathur/binlore.git
$ cd binlore
$ nix-shell
$ binlore_yara /nix/store/...-diffutils-3.10
...
executable /nix/store/...-diffutils-3.10/bin/cmp
macho_binary /nix/store/...-diffutils-3.10/bin/cmp
binary /nix/store/...-diffutils-3.10/bin/cmp
macho_cannot_exec /nix/store/...-diffutils-3.10/bin/cmp
decidable /nix/store/...-diffutils-3.10/bin/cmp
cannot_exec /nix/store/...-diffutils-3.10/bin/cmp
executable /nix/store/...-diffutils-3.10/bin/diff
macho_binary /nix/store/...-diffutils-3.10/bin/diff
binary /nix/store/...-diffutils-3.10/bin/diff
macho_execve /nix/store/...-diffutils-3.10/bin/diff
execve /nix/store/...-diffutils-3.10/bin/diff
decidable /nix/store/...-diffutils-3.10/bin/diff
can_exec /nix/store/...-diffutils-3.10/bin/diff

Using the experimental CLI you can do something like:

$ nix develop github:abathur/binlore
$ binlore_yara /nix/store/...-diffutils-3.10
...
executable /nix/store/...-diffutils-3.10/bin/cmp
macho_binary /nix/store/...-diffutils-3.10/bin/cmp
binary /nix/store/...-diffutils-3.10/bin/cmp
macho_cannot_exec /nix/store/...-diffutils-3.10/bin/cmp
decidable /nix/store/...-diffutils-3.10/bin/cmp
cannot_exec /nix/store/...-diffutils-3.10/bin/cmp
executable /nix/store/...-diffutils-3.10/bin/diff
macho_binary /nix/store/...-diffutils-3.10/bin/diff
binary /nix/store/...-diffutils-3.10/bin/diff
macho_execve /nix/store/...-diffutils-3.10/bin/diff
execve /nix/store/...-diffutils-3.10/bin/diff
decidable /nix/store/...-diffutils-3.10/bin/diff
can_exec /nix/store/...-diffutils-3.10/bin/diff

Each line here indicates that a YARA rule of the same name (currently in execers.yar) matched for that path.

Note: You may also want to fork this repo, add the relevant package to big.nix (if it isn't already there), and push it up to github to run the CI process. The CI job will dump this information (and some additional analysis) for all included packages.

Meta

I'm not sure what binlore's long-term relationship to individual analyses will be (or that the current abstractions are right).

  • I suspect there may be utility in having a collection of standardized analyses that people can discover and apply without needing to understand enough to write one. This makes me want to collect them in binlore for now. (But there's no real mechanism yet--I don't want to fall into over-designing this until someone's asking.)
  • But I also realize that needs may be too idiomatic for standardized analyses to ever be a good fit. If it smells like analyses are accumulating with no real re-use (no in-the-clear uses, re-use that almost always entails tweaking/adjusting an existing form, etc.)

Lore Formats

There are currently two "kinds" of lore. In both cases below, the field separator is equivalent to FIELD_SEPARATOR=$':':

  • $out/execers, which has the format: ${verdict}${FIELD_SEPARATOR}${executable_path} where:
    • verdict=can|cannot|might
    • executable_path is whatever path YARA printed for the match
  • $out/wrappers, which has the format ${wrapper_path}${FIELD_SEPARATOR}${wrapped_path} where:
    • wrapper_path is a path identified as a wrapper by YARA
    • wrapped_path is the path that the shell_wrapper yallback and exec_target function in execers.yall pick out of the source of the wrapper

Usage / Examples

For now, at least, resholve's own uses of binlore should serve as a good example of the rough intent and usage: