Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

flagser_memory support #61

Open
davidpitl opened this issue Feb 10, 2021 · 13 comments · May be fixed by #67
Open

flagser_memory support #61

davidpitl opened this issue Feb 10, 2021 · 13 comments · May be fixed by #67
Assignees
Labels
enhancement New feature or request

Comments

@davidpitl
Copy link

davidpitl commented Feb 10, 2021

Would it be possible to support flagser_memory to be able to use it from Giotto-TDA?

@ulupo
Copy link
Contributor

ulupo commented Feb 10, 2021

Hi @davidpitl! This looks like a duplicate of #57, unless I misunderstood. Please let me know if you agree/disagree and feel free to follow/add to the discussion there. I wonder what kind of support you are hoping to see in giotto-tda; at the moment, giotto-tda does not expose data structures for filtered simplicial complexes to the user. (You can reply in the other issue if this one is a duplicate.)

@davidpitl
Copy link
Author

I simply mean having the possibility for Giotto to use the Flagser in memory calculation implementation:
https://github.com/luetge/flagser/blob/master/docs/documentation_flagser.pdf
it talks about a faster calculation using flagser-memory

@ulupo
Copy link
Contributor

ulupo commented Feb 15, 2021

Thanks for the clarification, and sorry for the confusion. You are right that we missed/forgot about this feature from flagser. We will look into it! @MonkeyBreaker this might be of interest to you too.

@ulupo ulupo added the enhancement New feature or request label Feb 15, 2021
@ulupo
Copy link
Contributor

ulupo commented Feb 15, 2021

Actually, I have now looked into the flagser code and it does seem that "in memory" calculation means "keeping the flag complex in memory", so indeed this is another side of the same issue as in #57, though not a strict duplicate as you are only asking to have a performance feature without access ti the data structure necessarily

@MonkeyBreaker
Copy link
Collaborator

Hi,

Sorry for only replying only now.

Thank you @davidpitl for suggesting adding this feature.
One easy way to add it to pyflagser would be to generate a binary that compile with KEEP_FLAG_COMPLEX_IN_MEMORY definition enable.
And from the python, adding a sort of boolean as input parameter to choose to run in memory or "normal".

This doesn't seem in my opinion to do a lot of changes, but I didn't work with the in memory version, some points needs to be verified before enabling this on pyflagser:

  1. Does it output expected barcodes ?
  2. Does it work on all OS (Windows, Mac, Linux) ?
  3. Measure how much faster it is to run the memory version, how much more memory it consumes, etc.

I expect the first point to be a quick verification, but we need to be sure it does work. The second point about if it works on all architectures, this is important for us, because we want that all users can use our package, independent of the OS used.

For the third point, if the two previous are good, I can do it myself, it is just to have a raw estimation of performances.

Unfortunately, at the moment I have little time to work with pyflagser, Issue #57 is still pending. But in my opinion, this issue could be done before #57. Because if I understood correctly, we only want the in memory version for performance, right ?

Best,
Julián

@davidpitl
Copy link
Author

davidpitl commented Feb 23, 2021

  1. Does it output expected barcodes ?
    Looking at the Flagser tests it seems that the output is similar.
  1. Does it work on all OS (Windows, Mac, Linux) ?
    See https://github.com/luetge/flagser/blob/master/src/flagser.cpp
  1. Measure how much faster it is to run the memory version, how much more memory it consumes, etc.
    Running test on our server:
    time ./flagser-memory
    real 0m0,102s
    user 0m0,113s
    sys 0m0,080s
    time ./flagser
    real 2m16,709s
    user 2m42,685s
    sys 0m1,937s

@MonkeyBreaker
Copy link
Collaborator

Indeed, the runtime are an order of magnitude different ! Thank you for the information.

Could you run the same dataset but with a different command (if you're on Linux) in order to get an idea of the memory consumption ?
Command: /usr/bin/time -v ... this should give more details on the run and also output the peak memory consumption.

e.g.

Command being timed: "..."
        User time (seconds): 1.16
        System time (seconds): 0.24
        Percent of CPU this job got: 99%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:01.41
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 611576 <= this value interest me
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 1
        Minor (reclaiming a frame) page faults: 283293
        Voluntary context switches: 4
        Involuntary context switches: 16
        Swaps: 0
        File system inputs: 8216
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

If the output are similar, that's already a good point.

Again, @davidpitl thank you for all the work you can help us :)

Best,
Julián

@davidpitl
Copy link
Author

davidpitl commented Feb 24, 2021

The measurements have been made on a server that is currently very busy with other tasks that we cannot stop at this moment. I think you should repeat them. The results are:

/usr/bin/time -v ./test_flagser_memory
Command being timed: "./test_flagser_memory"
User time (seconds): 0.12
System time (seconds): 0.07
Percent of CPU this job got: 187%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.10
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 4820
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 896
Voluntary context switches: 593
Involuntary context switches: 1
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0

/usr/bin/time -v ./test_flagser
Command being timed: "./test_flagser"
User time (seconds): 154.52
System time (seconds): 1.10
Percent of CPU this job got: 122%
Elapsed (wall clock) time (h:mm:ss or m:ss): 2:07.48
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 557484
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 394732
Voluntary context switches: 27893
Involuntary context switches: 1751
Swaps: 0
File system inputs: 0
File system outputs: 8
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status:

@MonkeyBreaker
Copy link
Collaborator

Thank you for the measurements.
I'm just a bit confuse, I would have expect the contrary in terms of memory consumption => the in memory version should consume more memory compared to the "normal" version.

After looking at test_flagser.cpp and test_flagser_memory.cpp, the later does not test all the dataset, that's why the runtime are different.

Let me run both test but on the same datasets.
I'll post with updated results.

@MonkeyBreaker
Copy link
Collaborator

MonkeyBreaker commented Feb 24, 2021

I made some measurements with the test:

About small and big in the following tables, it's related to the function called inside of test_flagser.cpp and test_flagser_memory.cpp files.

  • small => run_all(false);
  • big => run_all(true);

runtime [s]

normal (small) normal (big) in memory (small) in memory (big)
0.05 75 0.05 72

Memory [GB]

normal (small) normal (big) in memory (small) in memory (big)
0.007 0.57 0.007 0.53

The difference in runtime is around 4% faster when in memory is enable. And memory consumption is similar.

I'll create an issue to discuss this results directly in flagser repository.
If the author confirms that bigger speed-ups are achievable with the in memory version. Then we can consider integrating it inside of pyflagser.

@ulupo what do you think about the results measured ?

Best,
Julián

@davidpitl
Copy link
Author

Seems like a very small size test , doesn't it? Only 0.6 Gb of memory usage.

@MonkeyBreaker
Copy link
Collaborator

I think you're right, it's nothing big.
I don't have other dataset right now to test, but let me come back when I have one that requires much more memory.

@MonkeyBreaker
Copy link
Collaborator

Well, I found out that the test for in memory was not calling the expected data structure, it was using the same one as "normal". Meaning the measure where always using the same data structure.

The results after fixing this:

runtime [s]

normal (small) normal (big) in memory (small) in memory (big)
0.05 75 0.05 65

Memory [GB]

normal (small) normal (big) in memory (small) in memory (big)
0.007 0.57 0.007 1.4

Now we obtain around 13% speed up at run time when using the in memory version, but at the cost of consuming at peak three times more memory.

In my opinion it's interesting to integrate it in pyflagser. The problem is how to do this, because in memory is a compile definition, we should generate a binary for when the flag is enable and disable. And, currently we already do this in order to support coefficients > 2, meaning that we should then support 4 different binaries. I don't think it's the correct approach, let me think a bit on how to do this in a better way.

But, on the meantime, if you needs to test the in memory version with pyflagser, you can build from source.

  1. git clone https://github.com/giotto-ai/pyflagser.git
  2. In the CMakeLists.txt file, change the following lines as follow:
    • from to target_compile_definitions(flagser_pybind PRIVATE RETRIEVE_PERSISTENCE=1 KEEP_FLAG_COMPLEX_IN_MEMORY=1)
    • and from to target_compile_definitions(flagser_coeff_pybind PRIVATE RETRIEVE_PERSISTENCE=1 USE_COEFFICIENTS=1 KEEP_FLAG_COMPLEX_IN_MEMORY=1)
  3. python -m pip install -e .

Let me know if you encounter any issues.

Best,
julián

@reds-heig reds-heig linked a pull request Mar 30, 2021 that will close this issue
@MonkeyBreaker MonkeyBreaker linked a pull request Aug 16, 2022 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants