File-specific rules #102

Open
sbibauw opened this issue Jul 14, 2021 · 5 comments

sbibauw commented Jul 14, 2021

On this page, @josephmturner and @adept have proposed to use file-specific rules, i.e. hyperspecific rules that would only apply to one specific imported file, typically because they only target one transaction or transactions from this specific timeframe. The main reason for this is to avoid cluttering the main rules file while still being able to correctly allocate specific transactions, even by overriding general rules in some cases. The second advantage is in terms of performance optimization, as these rules won't be unnecessarily evaluated on every single file.

I think it would make sense for hledger-flow to allow similar file-specific rules and to enforce a way to implement them. In your example directory, I'd see it as:

└── import
    └── gawie
        └── bogart
            └── cheque
                ├── 1-in
                │   └── 2016
                │       ├── 123456789_2016-03-30.csv
                │       └── ...
                ├── 2-preprocessed
                │   └── 2016
                │       ├── 123456789_2016-03-30.csv
                │       └── ...
                ├── 3-journal
                │   └── 2016
                │       ├── 123456789_2016-03-30.journal
                │       └── ...
                ├── bogart-cheque.rules                 # General rules
                ├── ...
                └── rules                               # <-- File-specific rules go here
                    └── 2016
                        └── 123456789_2016-03-30.rules  # File-specific rules, same name as file

Process

Now, as I don't think hledger allows specifying multiple rules files, we'd have to include the general rules in the specific rules. I'd see the process as follows (but maybe there's a better way to do it):

  1. On hledger-flow import, check for every detected CSV whether a file-specific rules file exists, and create it if it doesn't (never overwrite an existing one). Write the following in each new file:

    include ../../../../../bogart.rules  # if file exists
    include ../../bogart-cheque.rules    # if file exists
    include ../../../../../all.rules     # if file exists (just an idea, that's where I put my generic transactions account-assignment rules, feel free to ignore, as it can also be included manually in any of the above files)
    
    # Write specific transactions rules below this line
    
  2. Then, to import, use the file-specific rules only (as they contain all generic rules).
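
The two steps above could be sketched roughly as follows. This is only an illustration in Haskell (the language hledger-flow is written in), not hledger-flow's actual code: the names `scaffoldContents` and `ensureRulesFile` are hypothetical, and the candidate include paths are supplied by the caller rather than discovered automatically.

```haskell
import Control.Monad (filterM, unless)
import System.Directory (createDirectoryIfMissing, doesFileExist)
import System.FilePath (takeDirectory, (</>))

-- | Initial contents of a file-specific rules file (hypothetical helper):
-- one @include@ line per general rules file, a blank line, then a marker.
scaffoldContents :: [FilePath] -> String
scaffoldContents includes =
  unlines (map ("include " ++) includes ++ ["", marker])
  where
    marker = "# Write specific transactions rules below this line"

-- | Create the file-specific rules file unless it already exists.
-- Candidate paths are relative to the directory of the new rules file;
-- only candidates that exist on disk end up as @include@ lines.
ensureRulesFile :: FilePath -> [FilePath] -> IO ()
ensureRulesFile rulesFile candidates = do
  exists <- doesFileExist rulesFile
  unless exists $ do
    let dir = takeDirectory rulesFile
    -- Create the directory first so that "../.." style paths resolve.
    createDirectoryIfMissing True dir
    present <- filterM (\p -> doesFileExist (dir </> p)) candidates
    writeFile rulesFile (scaffoldContents present)
```

Because an existing file is never touched, running this on every import would be safe for rules files that have already been edited by hand.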

Potential issues

  • If there are no rules files whatsoever, there's nothing to include in the specific files, and hledger will complain about an empty rules file. But I suppose that an error message indicating that it cannot import without a rules file would make sense then.
  • Maybe it could be difficult for some users to understand where to put their account-identification rules? Or it could incite avoiding generic rules and favoring transaction-specific rules (inefficient)? But I suppose hledger-flow users know better ;-)
@apauley
Owner

apauley commented Jul 14, 2021

Could this case be covered by naming the file in question using the rfo conventions?

They are described here:
https://github.com/apauley/hledger-flow#statement-specific-rules-files

@sbibauw
Author

sbibauw commented Jul 21, 2021

Actually, no. The format of these files is still the same and all generic rules still apply.

It's just a way to isolate hyperspecific transaction allocation rules that only apply once.

E.g., in your hledger-flow-example:

All generic rules are in import/gawie/bogart/cheque/bogart-cheque.rules:

skip 1
fields _, _, date, desc1, desc2, desc3, amount, balance, _
currency R
account1 Assets:Current:Gawie:Bogart:Cheque
description %desc1/%desc2/%desc3

# Generic transactions

if ATM,Cash Withdrawal
    account2   Expenses:Cash

but rules that only have to be applied to one file are in a specific rules file:

import/gawie/bogart/cheque/rules/2016/123456789_2016-03-30.rules

include ../../bogart-cheque.rules

# Write specific transactions rules below this line

if 2016-03-15,ATM,Cash Withdrawal,,-3000.00
    account2   Expenses:Restaurant
    comment    Restaurant with Mike, cash used for paying the bill

As you can see in the example, "ATM Cash Withdrawals" are by default categorized as Expenses:Cash. But for one specific transaction on a specific date, we specify that this has to be overridden (any specific rule will override the generic ones) and allocated to Expenses:Restaurant.

Because this rule is only useful for this specific transaction, it is much clearer in my opinion to have it in a specific rules file. This keeps the main rules file readable and sensible, reduces the number of rules evaluated across all files, and avoids the risk of a hyperspecific rule being applied outside its intended scope.

@bronislav
Contributor

I agree that file-specific rules might be useful in my situation as well. Creating a rules override in every case seems like too much manual toil.

@bronislav
Contributor

@sbibauw I think this can be done with the following patch:

diff --git a/src/Hledger/Flow/Import/CSVImport.hs b/src/Hledger/Flow/Import/CSVImport.hs
index b95c97c..3584c2b 100644
--- a/src/Hledger/Flow/Import/CSVImport.hs
+++ b/src/Hledger/Flow/Import/CSVImport.hs
@@ -198,14 +198,8 @@ generalRulesFiles importDirs = do
 
 statementSpecificRulesFiles :: TurtlePath -> ImportDirs -> [TurtlePath]
 statementSpecificRulesFiles csvSrc importDirs = do
-  let srcSuffix = snd $ T.breakOnEnd "_" (Turtle.format Turtle.fp (Turtle.basename csvSrc))
-
-  if ((T.take 3 srcSuffix) == "rfo")
-    then
-    do
-      let srcSpecificFilename = T.unpack srcSuffix <.> "rules"
-      map (</> srcSpecificFilename) [accountDir importDirs, bankDir importDirs, importDir importDirs]
-    else []
+  let srcSpecificFilename = T.unpack (Turtle.format Turtle.fp (Turtle.basename csvSrc)) <.> "rules"
+  [accountDir importDirs </> "rules" </> srcSpecificFilename]
 
 customConstruct :: RuntimeOptions -> TChan FlowTypes.LogMessage -> TurtlePath -> Turtle.Line -> Turtle.Line -> Turtle.Line -> TurtlePath -> TurtlePath -> IO TurtlePath
 customConstruct opts ch constructScript bank account owner csvSrc journalOut = do

This replaces the rules override logic with statement-specific rules at <name>/<bank>/<account>/rules/<statement-file-basename>.rules
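
To make the resulting lookup concrete, here is a small sketch of the same path computation using plain `System.FilePath` instead of the Turtle path functions the patch uses. The function name `statementRulesFile` is illustrative only and does not exist in hledger-flow.

```haskell
import System.FilePath (takeBaseName, (<.>), (</>))

-- | Given the account directory and the path of a CSV statement, return
-- the path where a statement-specific rules file would be looked up
-- (hypothetical stand-in for the patched statementSpecificRulesFiles).
statementRulesFile :: FilePath -> FilePath -> FilePath
statementRulesFile accountDir csvSrc =
  accountDir </> "rules" </> takeBaseName csvSrc <.> "rules"
```

For the account directory `import/gawie/bogart/cheque` and the statement `import/gawie/bogart/cheque/1-in/2016/123456789_2016-03-30.csv`, this gives `import/gawie/bogart/cheque/rules/123456789_2016-03-30.rules`. As in the patch, the rules file sits directly under `rules`, with no per-year subdirectory.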

@bronislav
Contributor

@sbibauw Are you still looking for this feature? There is a pull request with this change that you can test. Please take a look and let me know if anything works differently than you would expect.
