Refactoring Proposal #166

Open
4 tasks
AndreG-P opened this issue Jan 29, 2018 · 6 comments

@AndreG-P
Member

This is a general plan to refactor the mathosphere project.

Open Problems and Possible Solutions

Problem: Git conflicts with submodules
Description: By default, a git submodule is pinned to a specific commit (a detached HEAD), not to a branch. When the submodule is updated constantly, this gets messy: every change in the submodule needs a new commit in the mathosphere parent project. This can produce git conflicts caused solely by the submodule versions.
Proposed Solution: Possibly avoid submodules, or configure all submodules to track a specific branch (the master branch), which is possible since Git 1.8.2. Another solution might be to tweak .gitignore?
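
Since Git 1.8.2, a submodule can record a `branch` entry in `.gitmodules`, which `git submodule update --remote` then follows. As a rough aid for the branch-tracking option, the sketch below lists submodules that have no such entry yet. The function name, the submodule paths, and the URLs are made up for illustration and are not taken from the mathosphere repository.

```python
import re

# Matches a section header such as: [submodule "lib/some-tool"]
SECTION = re.compile(r'^\[submodule "(?P<name>[^"]+)"\]\s*$')
# Matches an option line such as: branch = master
OPTION = re.compile(r'^\s*(?P<key>[A-Za-z][\w-]*)\s*=\s*(?P<value>.*?)\s*$')


def submodules_without_branch(gitmodules_text):
    """Return the names of submodules that have no 'branch' entry."""
    modules = {}
    current = None
    for line in gitmodules_text.splitlines():
        header = SECTION.match(line.strip())
        if header:
            current = modules.setdefault(header.group("name"), {})
            continue
        option = OPTION.match(line)
        if option and current is not None:
            current[option.group("key").lower()] = option.group("value")
    return [name for name, opts in modules.items() if "branch" not in opts]


# Hypothetical .gitmodules content (paths and URLs are placeholders).
SAMPLE = """
[submodule "lib/mathmltools"]
    path = lib/mathmltools
    url = https://example.org/mathmltools.git
    branch = master
[submodule "lib/other-tool"]
    path = lib/other-tool
    url = https://example.org/other-tool.git
"""

print(submodules_without_branch(SAMPLE))  # ['lib/other-tool']
```

In a real checkout the text would come from reading the `.gitmodules` file in the repository root; each reported submodule could then be given a `branch` entry.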

Tasks to do before starting the refactoring process

  • Get rid of snapshot versions (no more snapshot imports and no more own versions via snapshots in the master-branch)
  • Clean the branches of this project (some of the branches are over 3 years old)
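
To get a first list of what the snapshot task covers, one could scan each `pom.xml` for versions ending in `-SNAPSHOT`. The following is a minimal sketch using only the Python standard library; the function names and the sample POM (group IDs, artifact IDs, versions) are invented for illustration and do not reflect the actual mathosphere build files.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def find_snapshots(pom_xml):
    """Return (element, artifactId, version) for every SNAPSHOT version."""
    root = ET.fromstring(pom_xml)
    hits = []
    for elem in root.iter():
        # Collect the direct children of this element by local tag name.
        children = {local_name(c.tag): (c.text or "").strip() for c in elem}
        version = children.get("version", "")
        if version.endswith("-SNAPSHOT"):
            hits.append(
                (local_name(elem.tag), children.get("artifactId", "?"), version)
            )
    return hits


# Hypothetical POM; all coordinates below are placeholders.
SAMPLE_POM = """
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <groupId>org.example</groupId>
  <artifactId>mathosphere-core</artifactId>
  <version>3.0.0-SNAPSHOT</version>
  <dependencies>
    <dependency>
      <groupId>org.example</groupId>
      <artifactId>mathmltools</artifactId>
      <version>1.2.0-SNAPSHOT</version>
    </dependency>
    <dependency>
      <groupId>org.example</groupId>
      <artifactId>some-release-lib</artifactId>
      <version>2.1</version>
    </dependency>
  </dependencies>
</project>
""".strip()

for kind, artifact, version in find_snapshots(SAMPLE_POM):
    print(kind, artifact, version)
```

The sketch does not resolve versions defined through Maven properties (for example `${some.version}`) or inherited from a parent POM, so it would only give a lower bound on the snapshot usages.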

Other open tasks after a successful refactoring

  • Update all classes to handle the new gold standard
  • Comprehensive documentation (Javadoc at least for all public functions)

Refactoring Overview

The aim of this new structure is mainly to provide a comprehensive mathosphere API.

mathosphere
├─ lib   <-- Submodules
├─ config    <-- all configs
├─ documentation    <-- a complete documentation and overview of all modules
├─ modules
│   ├─ mathosphere-core    <-- cleaned core, depends on most (maybe all) other mvn-submodules
│   │   └─ ... TODO
│   ├─ mathosphere-utils     <-- a collection of useful utilities (use annotations for static classes) almost everything is public here!
│   │   ├─ utils-pojo    <-- collection of all pojos
│   │   ├─ utils-math    <-- math collection (such as distances calculations or others)
│   │   ├─ utils-xmml    <-- XML and MML utils (depends on OR integrates MathMLTools)
│   │   └─ utils-gold    <-- Utils for the gold standard
│   ├─ mathosphere-basex              <-- ?
│   ├─ mathosphere-evaluation       <-- Evaluation algorithms for gold standard
│   └─ mathosphere-restd               <-- ?
├─ README  <-- Readme quick overview, use wiki for detailed explanations such
... etc.

@physikerwelt
Member

@AndreG-P can you estimate how large the effort for this refactoring project would be? I would assume that it is possible to move the files in one afternoon. In that case, we should do it. Otherwise, I'm a bit skeptical.
Moreover, I wonder if it would be better to make mathosphere much smaller, i.e. remove modules entirely and move them to the respective repos. In particular, we could argue for merging mathosphere utils with mathmltools. The branches, however, still contain good code that should be preserved somehow. Can you create a new bug for "Get rid of snapshot versions (no more snapshot imports and no more own versions via snapshots in the master-branch)" that lists the projects that should be mavenized? After that, I volunteer to do the job.

@physikerwelt
Member

@AndreG-P I think it's better to assign only one person per task.

@AndreG-P
Member Author

@physikerwelt
I would assume the same. Most of the work would be renaming and moving files. However, the snapshot dependency problem might be tougher.

I'm not sure about the multiple repository approach. We should put code that belongs together in one repository. For example, having MathMLTools, MathMLSim, and MathMLConverters as separate repositories is confusing and easily causes code duplication. I would prefer one repository (MathMLTools) that contains submodules for specific tasks (similarity calculations, conversions, etc.). Having one parent repository might also help to avoid code duplication, such as in all those MathML repos.

However, I think the first and most important step is a better naming convention. For example, Apache Commons is split into commons-io, commons-cli, commons-lang, and so on. Following the same logic would make it much easier to find and extend existing code.

@vstange
Member

vstange commented Jan 31, 2018

My first thought is - do we want to delete the following directories:

  • mlp
  • flinkMLtest - the Scala class is not referenced anywhere and seems to be a leftover local test class from Leonard.

and rename the module restd to restx? Was this a typo?

@AndreG-P
Member Author

AndreG-P commented Feb 2, 2018

I would definitely delete mlp. I also checked flinkMLtest and it really looks like just a small test, so I would delete this as well. Maybe we should move the current state to a separate branch to keep it "alive".

The rest I would refactor as discussed in the very first issue comment.

@AndreG-P
Member Author

AndreG-P commented Feb 2, 2018

In addition, we don't need utils-xmml; this should be part of the new MathMLTools repo.
