[BUG] JVM Fatal Error During Video Ingestion on Cottontail Backend (141 videos consistently) #115

flurinB · 2024-09-19T11:47:24Z

Description

When ingesting videos into the nmr-backend (Cottontail version), the JVM runs into a fatal error after a certain number of videos. In our tests it was always after 141 videos, and we always used the same videos in the same order.

In the PostgreSQL version of the backend, the problem seemed to be solved by using the CachedContentFactory and/or ingesting the videos in chunks (although it still crashes later on). This does not seem to work in the Cottontail version. The following plots show the JVM memory over time up until the crash. The values were obtained using the "jstat -gc $PID" command, where PID is the process ID of the JVM process. The measurements were taken using the CachedContentFactory instead of the default InMemoryContentFactory.

CORRECTION: Inserting the videos in chunks DOES seem to work. The best setting I found was a 10-minute break after every 75 videos.
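The chunking workaround can be sketched as a small driver script. This is only a minimal sketch: the `ingest` callback is hypothetical and stands in for whatever actually submits a batch of files to the backend; only the chunk size of 75 and the 10-minute pause come from the observation above.

```python
import time
from typing import Callable, Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def ingest_in_chunks(
    videos: Sequence[str],
    ingest: Callable[[List[str]], None],
    chunk_size: int = 75,
    pause_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Ingest `videos` chunk by chunk, pausing between chunks.

    `ingest` is a hypothetical callback that submits one batch to the backend.
    No pause is taken after the last chunk. Returns the number of chunks.
    """
    chunks = list(chunked(videos, chunk_size))
    for index, chunk in enumerate(chunks):
        ingest(chunk)
        if index < len(chunks) - 1:
            sleep(pause_seconds)
    return len(chunks)
```

With the 141 files from this issue, this would submit one chunk of 75 and one of 66, with a single pause in between. The `sleep` parameter is injectable so the schedule can be checked without actually waiting.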

CCSC: Compressed class space capacity (kB).

CCSU: Compressed class space used (kB).

EC: Current eden space capacity (kB).

EU: Eden space utilization (kB).

FGC: Number of full GC events.

FGCT: Full garbage collection time.

GCT: Total garbage collection time.

MC: Metaspace capacity (kB).

MU: Metaspace utilization (kB).

OC: Current old space capacity (kB).

OU: Old space utilization (kB).

S0C: Current survivor space 0 capacity (kB).

S0U: Survivor space 0 utilization (kB).

S1C: Current survivor space 1 capacity (kB).

S1U: Survivor space 1 utilization (kB).

YGC: Number of young generation GC events.

YGCT: Young generation garbage collection time.

The ingested videos are part of the V3C collection, specifically these files (in this order):
<11565.mp4,09363.mp4,15238.mp4,14120.mp4,05462.mp4,08443.mp4,08269.mp4,06631.mp4,10837.mp4,13252.mp4,17130.mp4,11891.mp4,15743.mp4,12064.mp4,07480.mp4,11825.mp4,05907.mp4,05988.mp4,07252.mp4,16087.mp4,12552.mp4,10767.mp4,04343.mp4,07037.mp4,01392.mp4,11592.mp4,03316.mp4,12106.mp4,15498.mp4,06179.mp4,09130.mp4,03390.mp4,00836.mp4,14499.mp4,17023.mp4,05372.mp4,14579.mp4,15898.mp4,05898.mp4,12263.mp4,04095.mp4,04695.mp4,02452.mp4,04685.mp4,16447.mp4,07044.mp4,02060.mp4,04563.mp4,15830.mp4,01280.mp4,08548.mp4,16064.mp4,10831.mp4,03546.mp4,10318.mp4,13986.mp4,07931.mp4,09304.mp4,07826.mp4,03190.mp4,11586.mp4,14190.mp4,13795.mp4,06869.mp4,17112.mp4,16242.mp4,05553.mp4,08447.mp4,03812.mp4,02957.mp4,12889.mp4,08455.mp4,11470.mp4,06624.mp4,04770.mp4,12460.mp4,14029.mp4,13065.mp4,01798.mp4,06834.mp4,05387.mp4,15484.mp4,12234.mp4,16542.mp4,12901.mp4,02194.mp4,10575.mp4,01687.mp4,04970.mp4,08655.mp4,10439.mp4,15720.mp4,05576.mp4,11562.mp4,17161.mp4,10699.mp4,06145.mp4,00219.mp4,04103.mp4,00186.mp4,13011.mp4,00176.mp4,03531.mp4,11608.mp4,01217.mp4,05944.mp4,05746.mp4,13845.mp4,00872.mp4,07471.mp4,14858.mp4,07676.mp4,13286.mp4,08709.mp4,14191.mp4,16627.mp4,00067.mp4,09122.mp4,00509.mp4,14162.mp4,13614.mp4,13301.mp4,03833.mp4,10498.mp4,09216.mp4,16135.mp4,04261.mp4,02349.mp4,07220.mp4,17181.mp4,05684.mp4,09131.mp4,11283.mp4,12974.mp4,06233.mp4,08984.mp4,01479.mp4,13209.mp4,08411.mp4,00066.mp4,10889.mp4>

Following are the pipeline-config files:
IMAGE.json
MESH.json
VIDEO.json

The backend's own config files are attached as well (these would all be .kt files, but that extension is not supported in a GitHub issue):
APIConfig.txt
Config.txt
MinioConfig.txt

net-cscience-raphael · 2024-09-20T07:51:35Z

Please specify the video collection used and provide the configuration of the schema and pipeline.

flurinB · 2024-09-25T12:58:53Z

> Please specify the video collection used and provide the configuration of the schema and pipeline.

The requested information is now in the issue description, along with a correction (chunked insertion does seem to help).

ppanopticon · 2024-09-26T13:21:03Z

Given the error, I doubt that this has anything to do with Cottontail DB vs. PostgreSQL. Rather, I suspect that a native memory handling problem in the javacv library is the culprit.

That being said: Is it always the same video that causes this error? Or is it just a thing that happens after a while?

ppanopticon · 2024-09-26T13:28:13Z

The concrete error message and condition are documented in the JavaCV repository:

The suggested solution involves manual deallocation of pointers by using a PointerScope.

flurinB added the bug label Sep 19, 2024

net-cscience-raphael assigned net-cscience-raphael and unassigned net-cscience-raphael Sep 20, 2024

ppanopticon self-assigned this Sep 26, 2024

ppanopticon added this to the Release Candidate #2 milestone Sep 26, 2024

net-cscience-raphael added a commit that referenced this issue Sep 30, 2024

adds PointerScope to address issue #115

a648841
