GCtoolkit doesnt find the full gc pattern #302

Open
logesh-encipherhealth opened this issue Jul 13, 2023 · 2 comments
Comments

@logesh-encipherhealth

GCToolKit can't identify the full GC present in the log file. The full GC class is there, but no full GC object was created.

@karianna
Member

@logesh056 Can you add the log file in question to this ticket?

@kcpeppe
Collaborator

kcpeppe commented Jul 18, 2023

Hi @logesh056,

Parsing relies on regular expressions. The expressions are built up in interfaces that are specific to each collector. For example, the patterns for parsing CMS log lines are in CMSPatterns. Here is an example:

GCParseRule PARALLEL_REMARK_CLASS_UNLOADING = new GCParseRule("PARALLEL_REMARK_CLASS_UNLOADING", GC_PREFIX + YOUNG_GEN_BLOCK + RESCAN_BLOCK + WEAK_REF_BLOCK + CLASS_UNLOADING_BLOCK + SYMBOL_TABLE_SCRUB_BLOCK + STRING_TABLE_SCRUB_BLOCK + REMARK_BLOCK);

GCParseRule is a utility class of sorts: it knows how to convert the matched strings into useful data types. In the constructor call above, the first argument is the rule's name and the second is the pattern. If you look at YOUNG_GEN_BLOCK, that is a regex fragment that picks out the young generation block in a log entry. You may find that several GCParseRule objects make use of the same fragment. In this case YOUNG_GEN_BLOCK is defined in CMSPatterns.

String YOUNG_GEN_BLOCK = "\\[YG occupancy: " + COUNTER + " K \\(" + COUNTER + " K\\)\\]";

COUNTER is defined in a token class. Here are some definitions. As you can see, DECIMAL_POINT has been internationalized.

String DECIMAL_POINT = "(?:\\.|,)";
String INTEGER = "\\d+";
String COUNTER = "(" + INTEGER + ")";
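
To see how these fragments compose, here is a minimal, self-contained sketch that uses only java.util.regex. The class name TokenCompositionSketch, the helper youngGenOccupancy, and the sample log line are made up for illustration; the INTEGER and COUNTER definitions are the ones quoted above, and YOUNG_GEN_BLOCK is written with the doubled backslashes a Java string literal needs. It is not GCToolKit code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenCompositionSketch {
    // Token definitions as quoted in this comment.
    static final String INTEGER = "\\d+";
    static final String COUNTER = "(" + INTEGER + ")";

    // Regex metacharacters such as [ and ( need a doubled backslash in a Java literal.
    static final String YOUNG_GEN_BLOCK =
            "\\[YG occupancy: " + COUNTER + " K \\(" + COUNTER + " K\\)\\]";

    static final Pattern YOUNG_GEN = Pattern.compile(YOUNG_GEN_BLOCK);

    /** Returns the two captured counters, or null if the line has no YG block. */
    static long[] youngGenOccupancy(String line) {
        Matcher m = YOUNG_GEN.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new long[] { Long.parseLong(m.group(1)), Long.parseLong(m.group(2)) };
    }

    public static void main(String[] args) {
        // Made-up sample line, shaped only to exercise the fragment.
        long[] yg = youngGenOccupancy("[YG occupancy: 1234 K (5678 K)]");
        System.out.println(yg[0] + " K (" + yg[1] + " K)");
    }
}
```

Because each COUNTER carries its own capturing group, the group numbers follow the order in which the fragments are concatenated, which is worth keeping in mind when a rule is assembled from many blocks.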

To add a parse rule, you just need to define it in the appropriate parser patterns interface. These regex segments are useful in helping build a matching rule that can be used to extract information from the log entry.

The second thing to do is add a test for the parse rule. We want to make sure that the rule captures the intended log line and no others. You can add the test to the appropriate test class in the package com.microsoft.gctoolkit.parser.patterns in the testing directories.
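
The "intended line and no others" check can be sketched in plain Java. This is only an illustration of the idea, not the project's test harness: the class RuleSelectivityCheck, the helper matchingLines, and all four sample lines are hypothetical, and the pattern is the YG occupancy fragment from above written out in full.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class RuleSelectivityCheck {
    // The YG occupancy fragment from this comment, with Java-escaped backslashes.
    static final Pattern RULE =
            Pattern.compile("\\[YG occupancy: (\\d+) K \\((\\d+) K\\)\\]");

    /** Returns the indexes of the lines in which the rule finds a match. */
    static List<Integer> matchingLines(Pattern rule, List<String> lines) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            if (rule.matcher(lines.get(i)).find()) {
                hits.add(i);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up lines: one intended hit and three near misses.
        List<String> lines = Arrays.asList(
                "[YG occupancy: 1234 K (5678 K)]",  // intended line
                "[YG occupancy: 1234 (5678)]",      // units missing
                "[OG occupancy: 1234 K (5678 K)]",  // different label
                "[YG occupancy: 12.5 K (5678 K)]"); // first value is not an integer
        List<Integer> hits = matchingLines(RULE, lines);
        if (!hits.equals(Arrays.asList(0))) {
            throw new AssertionError("rule matched unexpected lines: " + hits);
        }
        System.out.println("rule matched only the intended line");
    }
}
```

Near-miss lines are the useful part of such a test: they are what catches a rule that is too permissive and would steal lines from a neighbouring rule.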

Next, add the rule to the parser. To do this, name the rule and then add it to the ParserRule defined at the top of the parser class. You'll need to define a lambda that wraps a matching method, which processes the information into a JVMEvent. In this case it will be a FullGC event. The information is parsed into a forward reference; once the CPU information is seen, the forward reference is converted into a FullGC event, which is then published. All of the conversion and publishing should happen automagically.
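
As a rough picture of that flow, the sketch below pairs each rule with a lambda and holds the parsed data back until a CPU line arrives. Everything in it is an assumption made for illustration: the class ForwardReferenceSketch, the two simplified patterns, the line shapes, and the string used as the "event" are hypothetical and are not GCToolKit's API or real collector output.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ForwardReferenceSketch {
    // Hypothetical, simplified line shapes; real log lines carry far more detail.
    static final Pattern FULL_GC = Pattern.compile("Full GC: (\\d+)K->(\\d+)K");
    static final Pattern CPU = Pattern.compile("Times: user=(\\d+\\.\\d+)");

    // Rule -> handler table: each rule is paired with a lambda or method reference.
    private final Map<Pattern, Consumer<Matcher>> rules = new LinkedHashMap<>();

    // Stand-in for the forward reference: heap data waiting for the CPU line.
    private long[] pending;

    // Stand-in for the published events.
    final List<String> published = new ArrayList<>();

    ForwardReferenceSketch() {
        rules.put(FULL_GC, m -> pending = new long[] {
                Long.parseLong(m.group(1)), Long.parseLong(m.group(2)) });
        rules.put(CPU, this::publishPending);
    }

    // Converts the pending data into an "event" once CPU information is seen.
    private void publishPending(Matcher cpu) {
        if (pending == null) {
            return; // CPU line with nothing waiting for it
        }
        published.add("FullGC " + pending[0] + "K->" + pending[1] + "K user=" + cpu.group(1));
        pending = null;
    }

    /** Feeds one log line to the first rule that matches it. */
    void receive(String line) {
        for (Map.Entry<Pattern, Consumer<Matcher>> rule : rules.entrySet()) {
            Matcher m = rule.getKey().matcher(line);
            if (m.find()) {
                rule.getValue().accept(m);
                return;
            }
        }
    }
}
```

Feeding the sketch a matching full GC line publishes nothing by itself; the event only appears after the following CPU line, which mirrors the two-step behaviour described above.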

The final test that may need to be adjusted is the test for the parsers. There are two sets of tests. One that will ensure that the event object contains the correct information and another that will parse the entire test log. You can find examples of these tests in com.microsoft.gctoolkit.parser.patterns and com.microsoft.gctoolkit.parser.unittests.

Note that since the project is modularized we've had to make some decisions w.r.t. package naming to accommodate the rules regarding split packages.

Finally, the test log. If you have a test log, it needs to be added to the GCToolKit-testdata repository.

HTH
