Fix State transition err when hdfs ccm feature is used #77
@xkrogen Hi Erik, could you please have a look?
Hi @yuxintan, thanks for reporting this. We don't use the caching feature so we haven't encountered this issue. I wonder if there's a better way to handle this by adding support for the CCM feature to Dynamometer? I'm not sure how much work this would be, since I don't know much about how it works. For this patch in particular, I'm fine with excluding the directives, but not too sure about the current logic. First off, is there really no Second, is it guaranteed that a line containing
@xkrogen Thanks very much for reviewing the patch. The CCM feature improves HDFS performance by caching the blocks of specific files in DataNode memory; the NameNode only needs to keep these paths in its own memory, so there is little or no effect on the NN. That is why we don't add support for the CCM feature to Dynamometer. As for the XML tag issue, the lack of clarity in my description led to the confusion. There is
and the directive line won't be split across multiple lines, which is similar to other XML The patch only ignores these lines and bypasses the problem, but I agree with you that it would be more elegant to avoid any such issues by skipping the entire Thank you very much for your attention and kind advice.
Got it, thanks for the detailed description. I'm re-visiting this and feel like the right way to solve this is actually to add a new Let me know what you think.
When we used Dynamometer to test HDFS performance, the test encountered an error while generating the DataNode block info. The error stack is:
Error: java.io.IOException: State transition not allowed; from DEFAULT to FILE_WITH_REPLICATION
    at com.linkedin.dynamometer.blockgenerator.XMLParser.transitionTo(XMLParser.java:107)
    at com.linkedin.dynamometer.blockgenerator.XMLParser.parseLine(XMLParser.java:77)
    at com.linkedin.dynamometer.blockgenerator.XMLParserMapper.map(XMLParserMapper.java:53)
    at com.linkedin.dynamometer.blockgenerator.XMLParserMapper.map(XMLParserMapper.java:26)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:151)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:828)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1690)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
After checking the fsimage XML and the source code, we found that XMLParser cannot parse some lines correctly. These lines look like:
<directive><id>8963</id><path>/user/somepath/path1</path><replication>3</replication><pool>cache_other_pool</pool><expiration><millis>1544454142310</millis><relatilve>false</relatilve></expiration>
<directive><id>8964</id><path>/user/somepath/path2</path><replication>3</replication><pool>cache_hadoop-data_pool</pool><expiration><millis>1544497817686</millis><relatilve>false</relatilve></expiration>
<directive><id>8965</id><path>/user/somepath/path3</path><replication>3</replication><pool>cache_hadoop-peisong_pool</pool><expiration><millis>1544451500312</millis><relatilve>false</relatilve></expiration>
<directive><id>8967</id><path>/user/somepath/path4</path><replication>3</replication><pool>cache_other_pool</pool><expiration><millis>1544497602570</millis><relatilve>false</relatilve></expiration>
These fsimage XML lines are generated when the HDFS Centralized Cache Management (CCM) feature is used.
So we added a patch that ignores the CCM fsimage XML lines when parsing the XML.
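To illustrate the idea, here is a minimal, self-contained sketch of the failure and of the workaround. It is not Dynamometer's actual `XMLParser`: the class `DirectiveSkippingSketch`, the method `parseLine`, the `skipDirectives` flag and the one-boolean state tracking are all hypothetical and only assume what the stack trace and the sample lines above show, namely that a `<replication>` tag seen outside of an inode triggers the state transition error, and that each cache directive sits on a single line starting with `<directive>`.

```java
import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy stand-in for the fsimage XML line parser (hypothetical, for illustration only). */
final class DirectiveSkippingSketch {
  private static final Pattern REPLICATION =
      Pattern.compile("<replication>(\\d+)</replication>");

  // Simplified state: are we currently inside an <inode> element?
  private boolean insideInode = false;

  /** Returns true if the line is a CCM cache directive line like the samples above. */
  static boolean isCacheDirectiveLine(String line) {
    return line.trim().startsWith("<directive>");
  }

  /**
   * Parses one line and returns the replication factor if the line carries one
   * for an inode, or null otherwise. With skipDirectives == false, a directive
   * line reproduces the reported error, because its replication tag shows up
   * while the parser is not inside an inode.
   */
  Integer parseLine(String line, boolean skipDirectives) throws IOException {
    if (skipDirectives && isCacheDirectiveLine(line)) {
      return null; // the workaround: ignore CCM directive lines entirely
    }
    if (line.contains("<inode>")) {
      insideInode = true;
    }
    Integer replication = null;
    Matcher m = REPLICATION.matcher(line);
    if (m.find()) {
      if (!insideInode) {
        throw new IOException(
            "State transition not allowed; from DEFAULT to FILE_WITH_REPLICATION");
      }
      replication = Integer.valueOf(m.group(1));
    }
    if (line.contains("</inode>")) {
      insideInode = false;
    }
    return replication;
  }

  public static void main(String[] args) throws IOException {
    String directive = "<directive><id>8963</id><path>/user/somepath/path1</path>"
        + "<replication>3</replication><pool>cache_other_pool</pool>";
    DirectiveSkippingSketch parser = new DirectiveSkippingSketch();
    System.out.println("skipped directive: " + parser.parseLine(directive, true));
    try {
      parser.parseLine(directive, false);
    } catch (IOException e) {
      System.out.println("without skipping: " + e.getMessage());
    }
  }
}
```

Matching on the line prefix relies on the directive never being split across lines, as noted in the conversation above; skipping the whole enclosing section instead would not depend on that assumption.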