Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Atomistic file extension #16

Draft
wants to merge 26 commits into
base: master
Choose a base branch
from

Conversation

MSallermann
Copy link
Member

@MSallermann MSallermann commented Aug 19, 2021

Implements the format suggested in #15.

Includes a partial refactor of the parsing of the segment header keywords:

  • keyword/value pairs are now specified as their own pegtl rules, see keywords.hpp and atomistic_keywords.hpp
    • This allows one to be flexible regarding the shapes etc., which is useful for e.g for multiline keywords like the new 'basis'
    • Also makes it possible to transparently select, which keyword value pairs are allowed for which file format
  • Validation of keywords with regards to the mesh has been moved to a single pass, once the header is read.
  • Overall the refactor ended up quite verbose ... suggestions welcome.

Then the fileformat discussed in the issue has been implemented with some minor changes.

  • For the basis I chose the format:
# basis:
# 0 0 0
# 0.5 0 0
  • In compatibility mode, the version string reads:
# OOMMF OVF 2.0
##% AOVF_COMP 1.0
  • The lattice meshtype needs to be selected with an additional: #%% meshtype lattice, which overwrites any other meshtype
  • Extensive unit tests for the new format have been added.
  • On reading in a file, the fileformat is automatically detected and saved in a new member of ovf_file : ovf_extension_format.
 /* OVF file formats */
#define OVF_EXTENSION_FORMAT_OVF                0 // Standard OVF 2 format
#define OVF_EXTENSION_FORMAT_AOVF              1 // Atomistic extension, with new keywords
#define OVF_EXTENSION_FORMAT_AOVF_COMP  2 // Atomistic extension in compatibility format, (##% prefix for new keywords)
  • On writing a file, having set meshtype lattice will lead to an error if OVF_EXTENSION_FORMAT_OVF is used.

TODO

  • Fortran bindings
  • Maybe squash some commits to remove noise from the log
  • Formatting
  • General pass to clean up some rough edges
  • Check if performance has suffered
  • Check all edge cases, where unexpected behaviour may occur
  • Naming of some files variables should probably be improved
  • Bravais and basis vectors should probably be saved as doubles not floats
  • When using the plain OVF format ##% lines are not ignored

@MSallermann MSallermann requested a review from GPMueller August 19, 2021 14:06
@coveralls
Copy link

coveralls commented Aug 19, 2021

Coverage Status

Coverage increased (+1.4%) to 94.286% when pulling 32da6aa on MSallermann:atomistic_v2 into 6e3383e on spirit-code:master.

@codecov
Copy link

codecov bot commented Aug 19, 2021

Codecov Report

Merging #16 (32da6aa) into master (6e3383e) will increase coverage by 0.15%.
The diff coverage is 77.58%.

@@            Coverage Diff             @@
##           master      #16      +/-   ##
==========================================
+ Coverage   52.22%   52.37%   +0.15%     
==========================================
  Files           8       10       +2     
  Lines        1779     1814      +35     
==========================================
+ Hits          929      950      +21     
- Misses        850      864      +14     

Copy link
Member

@GPMueller GPMueller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks mostly good so far

Comment on lines 31 to 88
////// bravaisa
// x
struct bravaisa_value_x : pegtl::pad<ovf::detail::parse::decimal_number, pegtl::blank>
{ };

template<>
struct kw_action< bravaisa_value_x >
{
template< typename Input >
static void apply( const Input& in, ovf_file & f, ovf_segment & segment)
{
segment.bravaisa[0] = std::stof(in.string());
}
};

// y
struct bravaisa_value_y : pegtl::pad<decimal_number, pegtl::blank>
{ };

template<>
struct kw_action< bravaisa_value_y >
{
template< typename Input >
static void apply( const Input& in, ovf_file & f, ovf_segment & segment)
{
segment.bravaisa[1] = std::stof(in.string());
}
};

// z
struct bravaisa_value_z : pegtl::pad<decimal_number, pegtl::blank>
{ };

template<>
struct kw_action< bravaisa_value_z >
{
template< typename Input >
static void apply( const Input& in, ovf_file & f, ovf_segment & segment)
{
segment.bravaisa[2] = std::stof(in.string());
}
};

struct bravaisa_value : pegtl::seq<bravaisa_value_x, bravaisa_value_y, bravaisa_value_z, end_kw_value>
{ };

template<>
struct kw_action< bravaisa_value >
{
template< typename Input >
static void apply( const Input& in, ovf_file & f, ovf_segment & segment)
{
f._state->found_bravaisa = true;
}
};

struct bravaisa : TAO_PEGTL_ISTRING("bravaisa")
{ };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why split the vector into distinct rules instead of reading in a vector-3?

Comment on lines 182 to 187
else if( file.version == 1 )
{
// TODO...
file._state->message_latest = fmt::format(
"libovf segment_header: OVF version \'{}\' in file \'{}\' is not supported...",
file.file_name, file.version);
return OVF_INVALID;
if(file.ovf_extension_format == OVF_EXTENSION_FORMAT_AOVF)
success = pegtl::parse< pegtl::plus<v2::segment_header<v2::aovf_keyword_value_line>>, v2::ovf_segment_header_action, v2::ovf_segment_header_control >( in, file, segment );
else if (file.ovf_extension_format == OVF_EXTENSION_FORMAT_AOVF_COMP)
success = pegtl::parse< pegtl::plus<v2::segment_header<v2::caovf_keyword_value_line>>, v2::ovf_segment_header_action, v2::ovf_segment_header_control >( in, file, segment );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't it confusing if the extension formats are discerned here, inside if( file.version == 1), as they are an extension of the version 2 format?

output_to_file += fmt::format( "# meshtype: {}\n", meshtype );

int n_rows = 0;
if( meshtype == "rectangular" )
if( meshtype == "rectangular" && std::string(segment->meshtype) != "lattice")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the second half should be just meshtype != "lattice"

Comment on lines 68 to 69
segment->title = const_cast<char *>("ovf test title - append");
segment->comment = const_cast<char *>("test append");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const_casting here could yield undefined behaviour in other places, see https://en.cppreference.com/w/cpp/language/string_literal
It is used here in Spirit as well...
I'm not sure what the right way to assign to these char *s would be in C++...

Seems like a wrapper like this might be the way to go:

template <size_t N>
char * ptr_from_literal(const char (&str)[N])
{
    char* ptr = new char[N];
    memcpy(ptr, str, N);
    return ptr;
}

(got it from https://www.py4u.net/discuss/101381 )

In fact, if we want to avoid a small memory leak in Spirit, we need to

if( segment->title )
    delete segment->title;
....

etc before each assignment to any of the char *s.

Moritz added 6 commits August 27, 2021 08:41
- magic "%" char now only changes behaviour in AOVF_COMP format
- parse_rules are now templated with a `Version` enum
- Some accompanying changes in parse
- simplified parsing bravais vectors in atomistic_keywords with it
Replaced it with strdup("literal") throughout the code.
Copy link
Member

@GPMueller GPMueller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In several files, you updated the formatting. Please apply clang-format to all sources (including tests) to make it consistent again.

Regarding the naming, I find it hard to say whether it would be better to go with the classical ovf-style naming for the new aovf keywords, or to start using "nicer" names.
If you want to provide an incompatible aovf format, it might be worth considering defining an entirely new, consistent set of keywords.

The changes are too many to really review, so I hope your tests are thorough enough ;)

}
};

struct bravaisa_value : pegtl::seq<bravaisa_value_x, bravaisa_value_y, bravaisa_value_z, end_kw_value>
////// bravaisa
struct bravaisa : TAO_PEGTL_ISTRING("bravaisa")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe "bravais_a" would be a more readable name than "bravaisa"?

struct bravaisc : TAO_PEGTL_ISTRING("bravaisc")
{ };


////// ncellpoints
struct ncellpoints : TAO_PEGTL_ISTRING("ncellpoints")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe "n_cell_points" would be more readable?

Comment on lines +191 to +193
float temp[3];
read_vector( in, temp );
f._state->_basis.push_back( {temp[0], temp[1], temp[2]} );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::vector<std::array<float, 3>> _basis = std::vector<std::array<float, 3>>(0);

I therefore believe it would be more canonical to write

auto atom = f._state->_basis.emplace_back();
read_vector( in, (*atom).data() );

or

std::array<float,3> atom;
read_vector( in, atom.data() );
f._state->_basis.push_back(atom);

f._state->found_basis = true;
if( segment.ncellpoints != f._state->_cur_basis_line ) // Need to make sure that the basis array is already allocated
// Allocate and data in segment struct and copy
segment.basis = new float[3 * f._state->_basis.size()];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid memory leaks, lines like this should probably be preceded by

if( segment.basis )
    delete segment.basis;

@@ -70,7 +70,7 @@ namespace parse
if( success )
{
success = false;
if( file.version == 2 )
if( file.version == 2 || (file.version == 1 && (file.ovf_extension_format == OVF_EXTENSION_FORMAT_AOVF || file.ovf_extension_format == OVF_EXTENSION_FORMAT_AOVF_COMP )))
Copy link
Member

@GPMueller GPMueller Sep 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't the OVF_EXTENSION_FORMAT_AOVF_COMP case be for file.version == 2?

if( file.version == 2
    || (file.version == 2 && file.ovf_extension_format == OVF_EXTENSION_FORMAT_AOVF_COMP)
    || (file.version == 1 && file.ovf_extension_format == OVF_EXTENSION_FORMAT_AOVF) )

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants