dbSta: Add create_cell_usage_snapshot command #6469
We already have report_cell_usage so it would be better to enhance that than create a similar command. |
Thanks for the feedback! I considered that option, but report_cell_usage is already a bit convoluted: the command itself reports a few different things, and "-verbose" is not very expressive or self-explanatory about what is reported. I think a separate command is more readable and testable. Besides, this is the first of a series of create_.*_snapshot commands. We plan to introduce a timing histogram snapshot (and potentially a power snapshot as well) to capture PPA at a given moment. The general idea is to dump snapshots as JSON so that separate utilities can diff two snapshots and get the delta, which enables evaluation of flow steps. IMHO, it would be cleaner to have create_.*_snapshot commands that write snapshots as JSON files while the other commands keep reporting as part of the tool (log) output. If you have constructive suggestions on how to achieve this without degrading readability and testability, I would be happy to help! Let me know what you think. cc @QuantamHD for viz |
All reports are inherently snapshots, so having both create_.*_snapshot and report_.* commands feels redundant and confusing to me, especially if they capture different information. Having the option to generate a JSON report makes sense to me. I don't see it requiring a different command, just an option on the existing reporting. If you are not fond of -verbose, please explain what options you intend to create to replace it. We can always update the reporting options to be more fine-grained. We also have the metrics format in JSON for dumping QOR information. Can what you need be represented as a metric? |
Thanks for the prompt response on the weekend! I don't disagree that all reports are inherently snapshots, yet if the output goes to the log, it takes extra effort to parse the log and convert it into a machine-readable form.
So far the only options are (i) the directory and (ii) the name of the stage at which the snapshot is taken. The command is designed to do one thing and one thing only, so it is not flexible in terms of output format (i.e., JSON with the structure mapped to the C++ struct in the code) or destination (i.e., it only writes a JSON file and does not print to the log).
The command currently can be called as "create_cell_usage_snapshot -path <path> -stage <stage_name>". Are you fine with something like "report_cell_usage -snapshot -path <path> -stage <stage_name>"? I was a bit concerned whether this would complicate option validation (e.g., if -snapshot is present, then -path and -stage must be present).
Is there a doc for the QoR metrics (or a C++ implementation)? In general we would prefer something more structured than key-value pairs. For example, we can have some hierarchy in the snapshot, like "{"stage":"test_stage","cell_usage_info":[{"name":"snl_bufx1","count":4,"area":1E3},{"name":"snl_ffqx1","count":2,"area":1E3}]}". If you have a pointer to some doc on the metrics, I would be more than happy to take a look. Thank you!
|
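As a rough illustration of the "diff between snapshots" idea, the sketch below compares two snapshots shaped like the JSON example quoted above. The second snapshot, the stage name "next_stage", and the function names are hypothetical; the PR does not define a diff utility, so this is only one way such a tool might look.

```python
import json


def index_by_cell(snapshot):
    """Map cell name -> (count, area) for one snapshot dict."""
    return {c["name"]: (c["count"], c["area"]) for c in snapshot["cell_usage_info"]}


def diff_snapshots(before, after):
    """Return per-cell deltas for cells whose count or area changed."""
    old, new = index_by_cell(before), index_by_cell(after)
    delta = {}
    for name in sorted(old.keys() | new.keys()):
        # A cell missing from one snapshot is treated as zero instances.
        old_count, old_area = old.get(name, (0, 0.0))
        new_count, new_area = new.get(name, (0, 0.0))
        if (old_count, old_area) != (new_count, new_area):
            delta[name] = {"count": new_count - old_count, "area": new_area - old_area}
    return delta


# The first snapshot is the example from the discussion; the second is made up.
before = json.loads(
    '{"stage":"test_stage","cell_usage_info":['
    '{"name":"snl_bufx1","count":4,"area":1E3},'
    '{"name":"snl_ffqx1","count":2,"area":1E3}]}'
)
after = json.loads(
    '{"stage":"next_stage","cell_usage_info":['
    '{"name":"snl_bufx1","count":6,"area":1.5E3},'
    '{"name":"snl_ffqx1","count":2,"area":1E3}]}'
)

print(diff_snapshots(before, after))
# {'snl_bufx1': {'count': 2, 'area': 500.0}}
```

The sketch subtracts the "area" values exactly as reported, without assuming whether they are per-cell or total areas.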
The metrics format is from https://vlsicad.ucsd.edu/Publications/Conferences/388/c388.pdf. You can run any design in ORFS to see the metrics reported by stage. It encodes information in the key name rather than introduce extra hierarchy in the json. I'd like to avoid having to support multiple formats unless there is a compelling reason. What does We already have Would |
Thanks for getting back to me! I would argue that encoding information in the key name is not a great way to do this and that having explicit hierarchy in the data structure is cleaner. Plus, if all the information is encoded in the key, backward compatibility becomes a pain because every entry needs to be updated, whereas with hierarchy, an extra attribute that only exists at a low level leaves most of the information untouched. If I had to pick one, I would drop support for the flat, pure key-value format. It is similar in spirit to Liberty / LEF cell names: much information is encoded in the name (e.g., VT, cell height, etc.), yet it is still more usable if that information is written down explicitly for easier querying.
I would argue that I was thinking about creating snapshots for each module if there are hierarchy, so I put Thank you! |
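To make the two layouts under discussion concrete, the sketch below flattens the hierarchical snapshot into key-value pairs with the hierarchy encoded in the key name. The "__" separator and the "cell_usage" key segment are assumptions made for illustration only; they are not the actual key names of the metrics format.

```python
def flatten(snapshot, sep="__"):
    """Encode the snapshot hierarchy into key names (hypothetical naming scheme)."""
    stage = snapshot["stage"]
    flat = {}
    for cell in snapshot["cell_usage_info"]:
        for attr, value in cell.items():
            if attr != "name":
                # stage, category, cell name, and attribute all end up in the key.
                flat[sep.join((stage, "cell_usage", cell["name"], attr))] = value
    return flat


snapshot = {
    "stage": "test_stage",
    "cell_usage_info": [
        {"name": "snl_bufx1", "count": 4, "area": 1e3},
        {"name": "snl_ffqx1", "count": 2, "area": 1e3},
    ],
}

for key, value in flatten(snapshot).items():
    print(key, value)
```

Both forms carry the same information. The difference is in how it is queried: in the flat form, a per-cell lookup means splitting key strings, while the hierarchical form keeps each cell's attributes in one record.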
I would find it helpful if you wrote a document describing your full set of requirements, rather than my learning about them piecemeal in this discussion. I don't think a PR is the right place for this design discussion. |
I will write a one-pager later this week. |
Given the amount of time I can spend on this, I am going to change it to "report_cell_usage -format json -file". |
What would be the proper way to handle the module_inst argument of report_cell_usage? Currently the command seems to assume that anything other than -verbose is the name of the targeted module. Does it make sense to introduce a -module key for the command (along with -file and -format)? I feel it would be cleaner to first introduce the -module key (and update tests) before introducing -file and -format. Would breaking others' scripts be a concern, since this is effectively an API change? |
You could add a -module option. You can also keep backward compatibility by treating any non-flag argument as the module name. |
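The backward-compatible parsing suggested above could look roughly like the following. This is a Python sketch of the logic only, not the actual Tcl/C++ argument handling in OpenROAD; the function name and the returned dictionary are hypothetical, and the flags are the ones named in this thread (-verbose, -module, -file, -format).

```python
def parse_report_cell_usage_args(args):
    """Accept the module either as '-module <name>' or as a bare (legacy) argument."""
    value_flags = {"-module", "-file", "-format"}
    opts = {"verbose": False, "module": None, "file": None, "format": None}
    it = iter(args)
    for arg in it:
        if arg == "-verbose":
            opts["verbose"] = True
            continue
        if arg in value_flags:
            # Flags that take a value consume the next argument.
            key, value = arg[1:], next(it, None)
            if value is None:
                raise ValueError(f"{arg} requires a value")
        elif arg.startswith("-"):
            raise ValueError(f"unknown option {arg}")
        else:
            # Legacy form: any non-flag argument is the module name.
            key, value = "module", arg
        if opts[key] is not None:
            raise ValueError(f"{key} specified more than once")
        opts[key] = value
    return opts


# Old-style and new-style invocations resolve to the same module.
print(parse_report_cell_usage_args(["-verbose", "my_module"]))
print(parse_report_cell_usage_args(["-module", "my_module", "-format", "json", "-file", "out.json"]))
```

Rejecting a module given both ways (bare and via -module) is a design choice of this sketch; silently preferring one form would also preserve existing scripts.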
Add create_cell_usage_snapshot command to create a cell usage snapshot in JSON format.
create_cell_usage_snapshot -path <path> -stage <stage_name> will create a JSON file named cell_usage_snapshot-<module_name>-<stage_name>.json under <path>. The snapshot records (i) the name of the stage and (ii) cell usage statistics. Currently the cell usage statistics consist of (a) the cell name, (b) the number of instances of the cell, and (c) the cell area in um^2.
Compared to the current implementation of report_cell_usage -v, this is more structured, more extensible, and more testable. We have planned work to support diffing such snapshots so that we can better evaluate the performance of each flow step / stage by capturing snapshots as often as needed.
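The file naming and contents described above can be mimicked in a few lines of Python. This is a sketch for readers who want to produce or consume compatible files, not the command's C++ implementation; the function name, the module name "top", and the sample cell data are hypothetical, and the JSON keys follow the example given in the conversation.

```python
import json
import tempfile
from pathlib import Path


def write_cell_usage_snapshot(path, module_name, stage_name, cells):
    """Write cell_usage_snapshot-<module_name>-<stage_name>.json under path."""
    out_dir = Path(path)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_file = out_dir / f"cell_usage_snapshot-{module_name}-{stage_name}.json"
    snapshot = {
        "stage": stage_name,
        # One record per cell: name, instance count, area in um^2.
        "cell_usage_info": [
            {"name": name, "count": count, "area": area} for name, count, area in cells
        ],
    }
    out_file.write_text(json.dumps(snapshot, indent=2))
    return out_file


with tempfile.TemporaryDirectory() as tmp:
    written = write_cell_usage_snapshot(
        tmp, "top", "test_stage", [("snl_bufx1", 4, 1e3), ("snl_ffqx1", 2, 1e3)]
    )
    print(written.name)  # cell_usage_snapshot-top-test_stage.json
    print(json.loads(written.read_text())["stage"])  # test_stage
```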