feat: Remove largest duplicate substr of reg name #268

Closed

@shaydonb commented Jan 10, 2025

Problem: Sometimes register names contain duplicate strings that unnecessarily lengthen them.

Solution: Remove the largest duplicated substring, with a specified delimiter to avoid removing mere duplicated characters.

Testing: Unit-level test of the helper function, plus an integration-level test with an SVD file in and the expected output out.

Issues: Closes #267.
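
For readers skimming the thread, the idea in the description can be sketched as follows. This is a minimal, standard-library-only illustration, not the code from this PR: the function name `remove_largest_duplicate` is hypothetical, and it assumes "duplicate" means a run of delimiter-separated tokens that is immediately repeated. Working on whole tokens rather than characters is what keeps merely repeated characters from being removed.

```rust
/// Hypothetical helper (not the PR's implementation): drop the second copy
/// of the longest run of delimiter-separated tokens that is immediately
/// repeated. Names without such a run are returned unchanged.
fn remove_largest_duplicate(name: &str, delim: char) -> String {
    let tokens: Vec<&str> = name.split(delim).collect();
    let n = tokens.len();
    // Try the longest possible run first so the largest duplicate wins.
    for len in (1..=n / 2).rev() {
        for start in 0..=(n - 2 * len) {
            let first = &tokens[start..start + len];
            let second = &tokens[start + len..start + 2 * len];
            if first == second {
                // Keep everything up to the end of the first copy,
                // then skip the second copy.
                let mut kept: Vec<&str> = tokens[..start + len].to_vec();
                kept.extend_from_slice(&tokens[start + 2 * len..]);
                let sep = delim.to_string();
                return kept.join(sep.as_str());
            }
        }
    }
    name.to_string()
}
```

Under these assumptions, `remove_largest_duplicate("UART_CTRL_UART_CTRL_EN", '_')` yields `UART_CTRL_EN`, while a name such as `ADDR` is left alone even though it contains a doubled character.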

@shaydonb force-pushed the feat/remove-duplicates branch from 603e2fb to 1f9bdf8 on January 10, 2025 06:43
@shaydonb force-pushed the feat/remove-duplicates branch from 1f9bdf8 to 0e3ac47 on January 10, 2025 07:04
Comment on lines +481 to +483
// Define a regex to match duplicated groups of underscore-separated words
// As it turns out, this is not viable since backreferences are not supported by Regex in Rust
// let re = Regex::new(r"(?i)(\b(?:\w+_)+\w+)_\1").unwrap();
Member


That doesn't sound very good.

To me, this case looks too specific. Which vendor does this?

Some time ago I was thinking about a more general approach, such as supporting regex-based replacement. Something like:

PER:
  _replace:
    "REGSPEC*":
      name: # what string entry to process
        "namematch": "to"  # first replacement
        "namematch2": "to2"  # second replacement
      description:
        "descmatch": "to"
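
To make the proposed semantics concrete, here is a rough sketch of how one string entry might be processed by an ordered list of match/replacement pairs. Everything in it is an assumption for illustration: `_replace` is only a suggestion in this thread, `apply_replacements` is a hypothetical name, and literal substring matching stands in for regex matching so that the sketch needs only the standard library.

```rust
/// Hypothetical sketch of the suggested `_replace` behaviour for a single
/// string entry (e.g. `name` or `description`): apply an ordered list of
/// (pattern, replacement) pairs. Literal substring matching is used here
/// as a stand-in for regex matching.
fn apply_replacements(value: &str, rules: &[(&str, &str)]) -> String {
    let mut out = value.to_string();
    for &(pattern, to) in rules {
        // Rules apply in order, so a later rule sees the earlier rule's output.
        out = out.replace(pattern, to);
    }
    out
}
```

With the rules `[("TIM1_TIM1_", "TIM1_"), ("CR", "CTRL")]`, the name `TIM1_TIM1_CR` becomes `TIM1_CTRL`. An actual implementation would presumably compile each key as a regex and would still need to map the `name` and `description` keys onto the right struct members.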

//let test_regex = r"^(.+)\1$";
//let test_regex = r"^(\w+) (\1)$";
//let test_regex = r"^[^_]+_";
let test_regex = r"\b((?:\w+_)*\w+)\b_\1\b";
Member


Please add some docstrings explaining what each regex should do. Also add a link from the README docs to here as a usage example.

@shaydonb (Author)

Thanks for humoring this change @burrbull, but I'm going to close this out for the time being. What I don't like about my attempt is that it requires intimate knowledge of the internals of each layer's struct contents, including which members are string types, because modifying via dim elements requires knowing the dim element name. To me that defeats the purpose: I'm hoping to make generic changes to patterns of registers with extraneous garbage in the names of fields, and I don't want to write a bunch of hard-to-maintain boilerplate to map the YAML keys to the right struct members of each layer's dim values.

If you want to pick this up from here, you are more than welcome to. If you have suggestions for a better approach than the one I started, perhaps you could convince me to pick this back up. But I'm getting too near my deadline at work to continue with the current implementation approach, and I don't have high confidence it would even be of high enough quality for you to want in mainline.

@shaydonb closed this Jan 16, 2025