bips-hb/cARFi_paper

Conditional ARF Feature Importance (cARFi)

This repository contains the code and material to reproduce the results of the manuscript "Conditional Feature Importance with Generative Modeling using Adversarial Random Forests".

The core method introduced in our paper, cARFi, is implemented in the R script cARFi.R. This script contains the main functions, together with a description of all parameters and how to use them.

The repository is structured as follows:

  • eval_bike_sharing/ contains the code to evaluate the cARFi method on the Bike-Sharing dataset considering various conditioning sets of variables (Fig. 4)
  • eval_conditioning_set/ contains the code to compare cARFi under various conditioning sets against marginal methods (PFI, SAGE) and conditional methods (CS, CPI with Gaussian and sequential knockoffs, LOCO) on our modified DAG of König et al. (2020) (Fig. 3)
  • eval_mixed_data/ contains the mixed data simulation based on the DAG of Blesch et al. (2023) (Fig. 2 and Appendix S7 + S8)
  • eval_proof_of_concept/ contains the proof of concept simulation (Fig. 1 and Appendix S1-S6)
  • figures/ contains the figures generated by the simulations and analyses from above

To reproduce the results, run the run_simulation.R file in the corresponding simulation directory. Each run_simulation.R file automatically saves its results in the figures/ directory. For example:

Rscript eval_bike_sharing/run_simulation.R
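
To reproduce all four simulations in one go, a small shell wrapper along the following lines may help. This is only a sketch and not part of the repository: the function name run_all and the DRY_RUN switch are made up for illustration, the order (Fig. 1 to Fig. 4) is an arbitrary choice, and it assumes the commands are issued from the repository root, as in the example above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): run every simulation
# directory listed above, one after another.
# Set DRY_RUN=1 to only print the commands instead of executing them.
run_all() {
  for dir in eval_proof_of_concept eval_mixed_data \
             eval_conditioning_set eval_bike_sharing; do
    script="$dir/run_simulation.R"
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "Rscript $script"
    else
      # Stop at the first simulation that fails.
      Rscript "$script" || return 1
    fi
  done
}
```

After sourcing the snippet, calling DRY_RUN=1 run_all lists the commands, and run_all executes them. The simulations may take a long time, so a dry run first is a cheap way to check the paths.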

Requirements

This project was developed and tested using R version 4.4.1 and requires the following R packages:

Non-CRAN packages

  • arf version ≥ 0.2.2 installed by running pak::pkg_install("bips-hb/arf")
  • seqknockoff installed by running pak::pkg_install("kormama1/seqknockoff")
  • cs installed by running pak::pkg_install("christophM/paper_conditional_subgroups")
  • cpi version ≥ 0.1.5 installed by running pak::pkg_install("bips-hb/cpi")

CRAN packages

  • ggplot2
  • batchtools
  • data.table
  • here
  • envalysis
  • ggpubr
  • ggsci
  • fastDummies
  • Metrics
  • mlr3verse
  • ranger
  • microbenchmark
  • pak
  • dplyr

Note: The script setup.R ensures that all necessary R packages are installed and is called before any analysis is run. It also automatically sets up the environment by creating required folders, setting the ggplot2 theme, and managing CPU usage during simulations.
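
If you would rather install the dependencies by hand than rely on setup.R, the lists above can be collected into a single pak call. The following is a minimal sketch under stated assumptions: the function name deps_expr and the file name install_deps.R are hypothetical, the CRAN mirror URL is just one possible choice, and the snippet does not check the minimum versions required for arf and cpi.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): print an R script that
# installs pak if needed and then every package named in this README.
deps_expr() {
  cat <<'EOF'
# Install pak first; the mirror below is an assumption, any CRAN mirror works.
if (!requireNamespace("pak", quietly = TRUE)) {
  install.packages("pak", repos = "https://cloud.r-project.org")
}
# CRAN packages are given by name, non-CRAN packages as GitHub "user/repo".
pak::pkg_install(c(
  "ggplot2", "batchtools", "data.table", "here", "envalysis", "ggpubr",
  "ggsci", "fastDummies", "Metrics", "mlr3verse", "ranger",
  "microbenchmark", "dplyr",
  "bips-hb/arf", "kormama1/seqknockoff",
  "christophM/paper_conditional_subgroups", "bips-hb/cpi"
))
EOF
}
```

One way to use it is deps_expr > install_deps.R followed by Rscript install_deps.R. Printing the R code first, rather than running it directly, lets you review or edit the package list before anything is installed.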
