This repository has been archived by the owner on Dec 24, 2019. It is now read-only.

task analysis seems stuck #89

Open · valeriocos opened this issue Aug 16, 2019 · 20 comments

@valeriocos (Member)

I'm trying to analyse the repo https://github.com/chaoss/grimoirelab-perceval. I added a new GitHub token and defined a task that included some metrics to analyse the issue tracker. Since this morning the oss-app seems stuck on org.eclipse.scava.metricprovider.trans.plaintextprocessing.PlainTextProcessingTransMetricProvider.

I attach the log below:

INFO  [MetricListExecutor (grimoirelabperceval, 20180112)] (07:49:02): Ending execution AnalysisTask 'grimoirelabperceval:perceval' with MetricExecution 'org.eclipse.scava.metricprovider.historic.bugs.opentime.OpenTimeHistoricMetricProvider' done in 100 ms
INFO  [ProjectExecutor (w1:grimoirelabperceval:perceval)] (07:49:02): Executing factoids.
INFO  [MetricListExecutor (grimoirelabperceval, 20180112)] (07:49:02): Starting execution AnalysisTask 'grimoirelabperceval:perceval' with MetricExecution 'org.eclipse.scava.factoid.bugs.channelusage.BugsChannelUsageFactoid'
INFO  [MetricListExecutor (grimoirelabperceval, 20180112)] (07:49:02): Adding dependencies to metricProvider 'org.eclipse.scava.factoid.bugs.channelusage.BugsChannelUsageFactoid' into project 'grimoirelabperceval'
INFO  [MetricListExecutor (grimoirelabperceval, 20180112)] (07:49:02): Added dependencies to metricProvider 'org.eclipse.scava.factoid.bugs.channelusage.BugsChannelUsageFactoid' into project 'grimoirelabperceval'
INFO  [AnalysisSchedulingService] (07:49:02): Starting MetricExecution 'org.eclipse.scava.factoid.bugs.channelusage.BugsChannelUsageFactoid'
Aug 16, 2019 7:49:02 AM org.restlet.engine.log.LogFilter afterHandle
INFO: 2019-08-16	07:49:02	172.30.0.9	-	172.30.0.6	8182	GET	/analysis/workers	-	200	9320	0	29	http://oss-app:8182	Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36	http://localhost:5601/
Aug 16, 2019 7:49:22 AM org.restlet.engine.log.LogFilter afterHandle
INFO: 2019-08-16	07:49:22	172.30.0.9	-	172.30.0.6	8182	GET	/analysis/workers	-	200	9312	0	27	http://oss-app:8182	Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36	http://localhost:5601/
	usedNobc.getBugTrackerData().size(): 1
Entering Index Preparation
0 patches captured
	usedNobc.getBugTrackerData().size(): 1
Started org.eclipse.scava.metricprovider.trans.plaintextprocessing.PlainTextProcessingTransMetricProvider
Started org.eclipse.scava.metricprovider.trans.detectingcode.DetectingCodeTransMetricProvider
Started org.eclipse.scava.metricprovider.trans.sentimentclassification.SentimentClassificationTransMetricProvider
Started org.eclipse.scava.metricprovider.trans.emotionclassification.EmotionClassificationTransMetricProvider
00:00:00,000	00:00:00,000	-1	Started org.eclipse.scava.metricprovider.trans.severityclassification.SeverityClassificationTransMetricProvider
00:00:02,431	00:00:02,431	0	prepared bug comments
00:00:00,020	00:00:02,451	0	prepared newsgroup articles
00:00:00,000	00:00:02,451	0	nothing to classify
Entering Index Preparation
0 patches captured
	usedNobc.getBugTrackerData().size(): 1
	usedNobc.getBugTrackerData().size(): 1
Started org.eclipse.scava.metricprovider.trans.plaintextprocessing.PlainTextProcessingTransMetricProvider

Any workarounds to suggest, @Danny2097 @creat89? Thanks

@valeriocos (Member Author)

The cockpit UI is frozen too (no projects or workers are shown)

captura_19
captura_20

@creat89 (Contributor) commented Aug 16, 2019

Hello @valeriocos,

I don't see any reason for it to be stuck in the plain text extractor. Do you have any other stack trace? Also, even if the metric is frozen, the UI should still show elements like the projects and the tasks. Maybe it has something to do with MongoDB, as the metric needs to store information there and the UI retrieves the project data from MongoDB. So, maybe MongoDB is down?
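A quick check for that (a sketch; container and service names taken from the docker ps output later in this thread) would be:

docker ps --filter name=oss-db
docker logs --tail 50 scava-deployment_oss-db_1
docker exec -it scava-deployment_oss-db_1 mongo --eval 'db.runCommand({ ping: 1 })'

If the last command cannot connect or does not answer with ok: 1, MongoDB is likely the culprit.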

@valeriocos (Member Author)

maybe, I'll check later, thanks

@valeriocos (Member Author) commented Aug 16, 2019

I re-did the process. It doesn't stop at the plain text extractor (I will edit the issue title); however, the process doesn't finish (at least according to the task status).
All containers were created and up for 2 hours.

These are the steps I did:

docker system prune -a --volumes

comment out the sections `dashb-importer` and `prosoul` in the docker-compose file

docker-compose -f docker-compose-build.yml build --no-cache --parallel
docker-compose -f docker-compose-build.yml up

set a new GitHub token

click on import project
set https://github.com/chaoss/grimoirelab-perceval/ as URL

create a task analysis
name: perceval
task type: single execution
start date: 2018-01-01
end date: 2018-12-31

select metrics:
Sentiment Classification
Request/Reply Classification
Plain Text Processing
index preparation
Emotion Classifier
Documentation plain text processor.
Documentation processing
Distinguishes between code and natural language
Number of new bugs
Emotions in bug comments
Number of bug comments
Bug header metadata
Overall Sentiment of Bugs
Number of emotions per day per bug tracker
Number Of Bug Comments Per Day

The last Worker's Heartbeat happened at 16/08/2019 21:58:31
captura_25

The container logs are here (except kibiter):
logs.zip

The project perceval appears as Not defined:
captura_26

If I click on the Configure button of perceval, this is what is shown (after a considerable delay):
captura_27

@creat89 could you try to replicate the steps above, or do you have a workaround to suggest? Thanks

When trying to stop the docker-compose, I got this:

^CGracefully stopping... (press Ctrl+C again to force)
Stopping scava-deployment_admin-webapp_1  ... done
Stopping scava-deployment_kb-service_1    ... done
Stopping scava-deployment_api-server_1    ... done
Stopping scava-deployment_oss-app_1       ... 
Stopping scava-deployment_kibiter_1       ... done
Stopping scava-deployment_auth-server_1   ... done
Stopping scava-deployment_oss-db_1        ... error
Stopping scava-deployment_elasticsearch_1 ... error
Stopping scava-deployment_kb-db_1         ... done

ERROR: for scava-deployment_oss-app_1  UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=70)
ERROR: An HTTP request took too long to complete. Retry with --verbose to obtain debug information.
If you encounter this issue regularly because of slow network conditions, consider setting COMPOSE_HTTP_TIMEOUT to a higher value (current value: 60).
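Following the hint in the error message, raising the Compose HTTP timeout before stopping may avoid the read timeouts (a sketch; 300 seconds is an arbitrary value):

export COMPOSE_HTTP_TIMEOUT=300
docker-compose -f docker-compose-build.yml stop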

I then stopped the remaining containers manually:

slimbook@slimbook-KATANA:~/Escritorio/sources/scava-deployment$ docker ps
CONTAINER ID        IMAGE                                   COMMAND                  CREATED             STATUS              PORTS                              NAMES
dbf86d9a31c2        scava-deployment_oss-db                 "docker-entrypoint.s…"   2 hours ago         Up 2 hours          0.0.0.0:27017->27017/tcp           scava-deployment_oss-db_1
00b62bc77808        acsdocker/elasticsearch:6.3.1-secured   "/docker-entrypoint.…"   2 hours ago         Up 2 hours          8000/tcp, 0.0.0.0:9200->9200/tcp   scava-deployment_elasticsearch_1
slimbook@slimbook-KATANA:~/Escritorio/sources/scava-deployment$ docker stop dbf86d9a31c2 00b62bc77808
dbf86d9a31c2
00b62bc77808

Then I ran `docker-compose -f docker-compose-build.yml up` again.
The information about the project is now shown correctly:
captura_28

The worker's heartbeat is updated too
captura_29

Nevertheless, the oss-app log traces don't seem to reflect that a calculation is ongoing:
oss-app-next.zip

These are the logs of the oss-db:

rm: cannot remove '/var/log/mongodb.log': No such file or directory
Starting DB repair
Starting restore session
Waiting for mongo to initialize... (2 seconds so far)
Restoring (1/14)
Restoring (2/14)
Restoring (3/14)
Restoring (4/14)
Restoring (5/14)
Restoring (6/14)
Restoring (7/14)
Restoring (8/14)
Restoring (9/14)
Restoring (10/14)
Restoring (11/14)
Restoring (12/14)
Restoring (13/14)
Restoring (14/14)
killing process with pid: 34
Restore session closed
Starting listening session on port 27017
Starting DB repair
Starting restore session
Waiting for mongo to initialize... (2 seconds so far)
Waiting for mongo to initialize... (4 seconds so far)
Waiting for mongo to initialize... (6 seconds so far)
Waiting for mongo to initialize... (8 seconds so far)
Waiting for mongo to initialize... (10 seconds so far)
Waiting for mongo to initialize... (12 seconds so far)
Restoring (1/14)
Restoring (2/14)
Restoring (3/14)
Restoring (4/14)
Restoring (5/14)
Restoring (6/14)
Restoring (7/14)
Restoring (8/14)
Restoring (9/14)
Restoring (10/14)
Restoring (11/14)
Restoring (12/14)
Restoring (13/14)
Restoring (14/14)
killing process with pid: 47
Restore session closed
Starting listening session on port 27017

When trying to stop the docker-compose, I got these errors:

^CGracefully stopping... (press Ctrl+C again to force)
Stopping scava-deployment_admin-webapp_1  ... done
Stopping scava-deployment_api-server_1    ... 
Stopping scava-deployment_kb-service_1    ... done
Stopping scava-deployment_auth-server_1   ... error
Stopping scava-deployment_oss-app_1       ... error
Stopping scava-deployment_kibiter_1       ... 
Stopping scava-deployment_oss-db_1        ... error
Stopping scava-deployment_kb-db_1         ... done
Stopping scava-deployment_elasticsearch_1 ... error

ERROR: for scava-deployment_api-server_1  UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=70)

ERROR: for scava-deployment_kibiter_1  UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=70)
ERROR: An HTTP request took too long to complete. Retry with --verbose to obtain debug information.
If you encounter this issue regularly because of slow network conditions, consider setting COMPOSE_HTTP_TIMEOUT to a higher value (current value: 60).

Note that Kibiter is currently not used, since the dashboard importer and prosoul are deactivated

valeriocos changed the title from "oss-app stucked on org.eclipse.scava.metricprovider.trans.plaintextprocessing.PlainTextProcessingTransMetricProvider" to "task analysis seems stuck" on Aug 16, 2019
@valeriocos (Member Author) commented Aug 18, 2019

I performed the same steps listed in #89 (comment) on https://github.com/chaoss/grimoirelab, selecting only the Docker metrics.

The analysis stopped at 23%, and when trying to list the projects, I ended up with the same screens as in #89 (comment).

  • Containers are up and running:
slimbook@slimbook-KATANA:~/docker$ docker ps
CONTAINER ID        IMAGE                                            COMMAND                  CREATED             STATUS              PORTS                                                      NAMES
be479811be2b        scava-deployment_admin-webapp                    "/bin/sh /startup.sh"    10 hours ago        Up 10 hours         0.0.0.0:5601->80/tcp                                       scava-deployment_admin-webapp_1
e9897448f0fa        scava-deployment_api-server                      "java -jar scava-api…"   10 hours ago        Up 10 hours         0.0.0.0:8086->8086/tcp                                     scava-deployment_api-server_1
446d6a639825        scava-deployment_kb-service                      "java -jar scava.kno…"   10 hours ago        Up 10 hours         0.0.0.0:8080->8080/tcp                                     scava-deployment_kb-service_1
32245b4f0e01        scava-deployment_auth-server                     "./wait-for-it.sh os…"   10 hours ago        Up 10 hours         0.0.0.0:8085->8085/tcp                                     scava-deployment_auth-server_1
4eaa3acca06c        scava-deployment_oss-app                         "./wait-for-it.sh os…"   10 hours ago        Up 10 hours         0.0.0.0:8182->8182/tcp, 0.0.0.0:8192->8192/tcp, 8183/tcp   scava-deployment_oss-app_1
3685384fe41b        acsdocker/grimoirelab-kibiter:crossminer-6.3.1   "/docker-entrypoint.…"   10 hours ago        Up 10 hours         0.0.0.0:80->5601/tcp                                       scava-deployment_kibiter_1
db99d501fa4f        scava-deployment_oss-db                          "docker-entrypoint.s…"   10 hours ago        Up 10 hours         0.0.0.0:27017->27017/tcp                                   scava-deployment_oss-db_1
92107f217bd2        scava-deployment_kb-db                           "docker-entrypoint.s…"   10 hours ago        Up 10 hours         0.0.0.0:27018->27017/tcp                                   scava-deployment_kb-db_1
6451d4144410        acsdocker/elasticsearch:6.3.1-secured            "/docker-entrypoint.…"   10 hours ago        Up 10 hours         8000/tcp, 0.0.0.0:9200->9200/tcp                           scava-deployment_elasticsearch_1
ERROR [MetricListExecutor (grimoirelab, 20180314)] (01:24:34): Exception thrown during metric provider execution (docker).
java.lang.ArrayIndexOutOfBoundsException: 1
	at org.eclipse.scava.platform.jadolint.model.Cmd.<init>(Cmd.java:65)
	at org.eclipse.scava.platform.jadolint.Jadolint.run(Jadolint.java:62)
	at org.eclipse.scava.metricprovider.trans.configuration.docker.smells.DockerTransMetricProvider.lambda$2(DockerTransMetricProvider.java:114)
	at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
	at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
	at java.util.Iterator.forEachRemaining(Iterator.java:116)
	at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
	at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
	at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
	at org.eclipse.scava.metricprovider.trans.configuration.docker.smells.DockerTransMetricProvider.measure(DockerTransMetricProvider.java:112)
	at org.eclipse.scava.metricprovider.trans.configuration.docker.smells.DockerTransMetricProvider.measure(DockerTransMetricProvider.java:1)
	at org.eclipse.scava.platform.osgi.analysis.MetricListExecutor.run(MetricListExecutor.java:103)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
INFO  [AnalysisSchedulingService] (01:24:34): Ending MetricExecution 'org.eclipse.scava.metricprovider.trans.configuration.docker.smells.DockerTransMetricProvider'
INFO  [MetricListExecutor (grimoirelab, 20180314)] (01:24:34): Starting Metric Execution (newVersionDocker).
INFO  [AnalysisSchedulingService] (01:24:34): Ending MetricExecution 'org.eclipse.scava.metricprovider.trans.configuration.docker.smells.DockerTransMetricProvider' is done.
INFO  [MetricListExecutor (grimoirelab, 20180314)] (01:24:34): Ending execution AnalysisTask 'grimoirelab:grimoirelab' with MetricExecution 'org.eclipse.scava.metricprovider.trans.configuration.docker.smells.DockerTransMetricProvider' done in 925 ms
INFO  [MetricListExecutor (grimoirelab, 20180314)] (01:24:34): Started Metric Executed (newVersionDocker).
INFO  [AnalysisSchedulingService] (01:24:34): Ending MetricExecution 'org.eclipse.scava.metricprovider.trans.newversion.docker.NewVersionDockerTransMetricProvider'
INFO  [AnalysisSchedulingService] (01:24:34): Ending MetricExecution 'org.eclipse.scava.metricprovider.trans.newversion.docker.NewVersionDockerTransMetricProvider' is done.
INFO  [MetricListExecutor (grimoirelab, 20180314)] (01:24:34): Ending execution AnalysisTask 'grimoirelab:grimoirelab' with MetricExecution 'org.eclipse.scava.metricprovider.trans.newversion.docker.NewVersionDockerTransMetricProvider' done in 290 ms
INFO  [MetricListExecutor (grimoirelab, 20180314)] (01:24:34): Starting execution AnalysisTask 'grimoirelab:grimoirelab' with MetricExecution 'org.eclipse.scava.metricprovider.historic.configuration.docker.dependencies'
INFO  [MetricListExecutor (grimoirelab, 20180314)] (01:24:34): Adding dependencies to metricProvider 'org.eclipse.scava.metricprovider.historic.configuration.docker.dependencies' into project 'grimoirelab'
INFO  [MetricListExecutor (grimoirelab, 20180314)] (01:24:34): Added dependencies to metricProvider 'org.eclipse.scava.metricprovider.historic.configuration.docker.dependencies' into project 'grimoirelab'
INFO  [AnalysisSchedulingService] (01:24:34): Starting MetricExecution 'org.eclipse.scava.metricprovider.historic.configuration.docker.dependencies'
INFO  [AnalysisSchedulingService] (01:24:34): Update the worker 'w1' heartBeat Sun Aug 18 01:24:34 UTC 2019'
INFO  [AnalysisSchedulingService] (01:24:34): Starting MetricExecution 'org.eclipse.scava.metricprovider.historic.configuration.docker.dependencies' is done.
INFO  [AnalysisSchedulingService] (01:24:34): Starting find out MetricExecution 'org.eclipse.scava.metricprovider.historic.configuration.docker.dependencies' on Project 'grimoirelab'
INFO  [MetricListExecutor (grimoirelab, 20180314)] (01:24:34): Starting Metric Execution (HistoricDockerDependencies).
INFO  [MetricListExecutor (grimoirelab, 20180314)] (01:24:34): Started Metric Executed (HistoricDockerDependencies).
INFO  [AnalysisSchedulingService] (01:24:34): Ending MetricExecution 'org.eclipse.scava.metricprovider.historic.configuration.docker.dependencies'
INFO  [AnalysisSchedulingService] (01:24:34): Ending MetricExecution 'org.eclipse.scava.metricprovider.historic.configuration.docker.dependencies' is done.
INFO  [MetricListExecutor (grimoirelab, 20180314)] (01:24:34): Ending execution AnalysisTask 'grimoirelab:grimoirelab' with MetricExecution 'org.eclipse.scava.metricprovider.historic.configuration.docker.dependencies' done in 4 ms
WARN  [ProjectExecutor (w1:grimoirelab:grimoirelab)] (01:24:34): Project in error state. Resuming execution.
INFO  [ProjectExecutor (w1:grimoirelab:grimoirelab)] (01:24:34): Date 20180314 Task Execution ( grimoirelab:grimoirelab completed in 10435 ms )
INFO  [ProjectExecutor (w1:grimoirelab:grimoirelab)] (01:24:34): Project grimoirelabTask execution ( grimoirelab:grimoirelab : Date 20180315 )
INFO  [AnalysisSchedulingService] (01:24:34): Starting new daily execution AnalysisTask 'grimoirelab:grimoirelab'
INFO  [AnalysisSchedulingService] (01:24:34): Starting new daily execution AnalysisTask 'grimoirelab:grimoirelab' is done.
INFO  [ProjectExecutor (w1:grimoirelab:grimoirelab)] (01:24:34): Date: 20180315, project: grimoirelab
INFO  [ProjectDelta (grimoirelab,20180315)] (01:24:34): Creating Delta
  • oss-app is not working; http://localhost:8182/ keeps loading forever. Nevertheless, if I enter a container (scava-deployment_kb-db) and ping oss-app (after apt-get update && apt-get install -y iputils-ping), the container is reachable (see also the curl sketch after this list):
root@92107f217bd2:/home# ping oss-app
PING oss-app (172.18.0.7) 56(84) bytes of data.
64 bytes from scava-deployment_oss-app_1.scava-deployment_default (172.18.0.7): icmp_seq=1 ttl=64 time=0.115 ms
64 bytes from scava-deployment_oss-app_1.scava-deployment_default (172.18.0.7): icmp_seq=2 ttl=64 time=0.045 ms
64 bytes from scava-deployment_oss-app_1.scava-deployment_default (172.18.0.7): icmp_seq=3 ttl=64 time=0.408 ms
64 bytes from scava-deployment_oss-app_1.scava-deployment_default (172.18.0.7): icmp_seq=4 ttl=64 time=0.320 ms
64 bytes from scava-deployment_oss-app_1.scava-deployment_default (172.18.0.7): icmp_seq=5 ttl=64 time=0.068 ms
64 bytes from scava-deployment_oss-app_1.scava-deployment_default (172.18.0.7): icmp_seq=6 ttl=64 time=0.099 ms
64 bytes from scava-deployment_oss-app_1.scava-deployment_default (172.18.0.7): icmp_seq=7 ttl=64 time=0.336 ms
64 bytes from scava-deployment_oss-app_1.scava-deployment_default (172.18.0.7): icmp_seq=8 ttl=64 time=0.048 ms
64 bytes from scava-deployment_oss-app_1.scava-deployment_default (172.18.0.7): icmp_seq=9 ttl=64 time=0.112 ms
64 bytes from scava-deployment_oss-app_1.scava-deployment_default (172.18.0.7): icmp_seq=10 ttl=64 time=0.067 ms
^C
--- oss-app ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 9193ms
rtt min/avg/max/mdev = 0.045/0.161/0.408/0.130 ms
  • oss-db seems to work fine (I entered the oss-app container and pinged oss-db):
slimbook@slimbook-KATANA:~/docker$ docker exec -it 4eaa3acca06c bash
root@4eaa3acca06c:/ossmeter# ping oss-db
PING oss-db (172.18.0.4) 56(84) bytes of data.
64 bytes from scava-deployment_oss-db_1.scava-deployment_default (172.18.0.4): icmp_seq=1 ttl=64 time=0.079 ms
64 bytes from scava-deployment_oss-db_1.scava-deployment_default (172.18.0.4): icmp_seq=2 ttl=64 time=0.053 ms
64 bytes from scava-deployment_oss-db_1.scava-deployment_default (172.18.0.4): icmp_seq=3 ttl=64 time=0.072 ms
64 bytes from scava-deployment_oss-db_1.scava-deployment_default (172.18.0.4): icmp_seq=4 ttl=64 time=0.412 ms
64 bytes from scava-deployment_oss-db_1.scava-deployment_default (172.18.0.4): icmp_seq=5 ttl=64 time=0.122 ms
64 bytes from scava-deployment_oss-db_1.scava-deployment_default (172.18.0.4): icmp_seq=6 ttl=64 time=0.056 ms
64 bytes from scava-deployment_oss-db_1.scava-deployment_default (172.18.0.4): icmp_seq=7 ttl=64 time=0.051 ms
64 bytes from scava-deployment_oss-db_1.scava-deployment_default (172.18.0.4): icmp_seq=8 ttl=64 time=0.113 ms
64 bytes from scava-deployment_oss-db_1.scava-deployment_default (172.18.0.4): icmp_seq=9 ttl=64 time=0.056 ms
64 bytes from scava-deployment_oss-db_1.scava-deployment_default (172.18.0.4): icmp_seq=10 ttl=64 time=0.059 ms
64 bytes from scava-deployment_oss-db_1.scava-deployment_default (172.18.0.4): icmp_seq=11 ttl=64 time=0.222 ms
64 bytes from scava-deployment_oss-db_1.scava-deployment_default (172.18.0.4): icmp_seq=12 ttl=64 time=0.110 ms
64 bytes from scava-deployment_oss-db_1.scava-deployment_default (172.18.0.4): icmp_seq=13 ttl=64 time=0.046 ms
64 bytes from scava-deployment_oss-db_1.scava-deployment_default (172.18.0.4): icmp_seq=14 ttl=64 time=0.047 ms
64 bytes from scava-deployment_oss-db_1.scava-deployment_default (172.18.0.4): icmp_seq=15 ttl=64 time=0.360 ms
^C
--- oss-db ping statistics ---
15 packets transmitted, 15 received, 0% packet loss, time 14316ms
rtt min/avg/max/mdev = 0.046/0.123/0.412/0.113 ms
  • When trying to stop the docker-compose, I got these errors (note that Kibiter is not used by any containers since prosoul and the dashboard importer are deactivated):
^CGracefully stopping... (press Ctrl+C again to force)
Stopping scava-deployment_admin-webapp_1  ... done
Stopping scava-deployment_api-server_1    ... 
Stopping scava-deployment_kb-service_1    ... done
Stopping scava-deployment_auth-server_1   ... error
Stopping scava-deployment_oss-app_1       ... error
Stopping scava-deployment_kibiter_1       ... 
Stopping scava-deployment_oss-db_1        ... error
Stopping scava-deployment_kb-db_1         ... done
Stopping scava-deployment_elasticsearch_1 ... error

ERROR: for scava-deployment_api-server_1  UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=70)

ERROR: for scava-deployment_kibiter_1  UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=70)
ERROR: An HTTP request took too long to complete. Retry with --verbose to obtain debug information.
If you encounter this issue regularly because of slow network conditions, consider setting COMPOSE_HTTP_TIMEOUT to a higher value (current value: 60).
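Since ping only shows network reachability, a quick HTTP probe against endpoints that appear in the logs above can tell a hung oss-app apart from a network problem (a sketch):

curl -sS -m 10 http://localhost:8182/analysis/workers
curl -sS -m 10 http://localhost:8182/projects

If these time out while ping works, the application is blocked rather than the network.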

@valeriocos (Member Author) commented Aug 18, 2019

I performed the same steps listed in #89 (comment) on https://gitlab.com/rdgawas/docker-jmeter, selecting only the Docker metrics. The idea is to check whether the error may be related to the GitHub fetching process.

The analysis got blocked in this case too:

^CGracefully stopping... (press Ctrl+C again to force)
Stopping scava-deployment_admin-webapp_1  ... done
Stopping scava-deployment_api-server_1    ... done
Stopping scava-deployment_kb-service_1    ... done
Stopping scava-deployment_auth-server_1   ... done
Stopping scava-deployment_oss-app_1       ... 
Stopping scava-deployment_kibiter_1       ... done
Stopping scava-deployment_oss-db_1        ... error
Stopping scava-deployment_elasticsearch_1 ... error
Stopping scava-deployment_kb-db_1         ... done

ERROR: for scava-deployment_oss-app_1  UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=70)
ERROR: An HTTP request took too long to complete. Retry with --verbose to obtain debug information.
If you encounter this issue regularly because of slow network conditions, consider setting COMPOSE_HTTP_TIMEOUT to a higher value (current value: 60).

As commented by @creat89 in #89 (comment), it is possible that oss-db gets stuck.
Can someone have a look at this issue and provide a fix or a workaround? Thanks

@creat89 (Contributor) commented Aug 19, 2019

Hello @valeriocos, sorry for not replying as fast as expected, but I'm on holiday. I'll try to check that remotely. However, one of the stack traces seems to be an issue with a metric made by @blueoly, I think.

Checking the stack traces, I see that the HTTP request comes from either the oss_app container or the api_server. I don't know why we would have two different containers showing the same type of problem.

Still, I'm guessing it has something to do with Mongo, but I don't know whether the api_server makes requests to Mongo or not.

For the moment, I don't have an idea of how to work around the issues.

@valeriocos (Member Author)

Thank you for answering @creat89, sorry, I didn't know you were on holiday, enjoy :)

@md2manoppello, @tdegueul @MarcioMateus @mhow2 any idea?

@MarcioMateus (Contributor)

Hello @valeriocos, sorry for the late reply. I just returned from holidays.

Regarding the error while stopping the containers, I have already seen similar errors (I don't know if with these containers or others). Usually, when I see messages like that, I do a docker-compose down and then restart the Docker engine (when on Mac OS) or restart the machine (when on Linux).

Regarding the stuck analysis task, I think it has happened to us a few times. Sometimes it was due to reaching the request limit of the GitHub API. But it should restart again in less than one hour (unless a back-off algorithm is implemented that grew too much...). I remember that in our case we left it running during the night and the task eventually restarted.
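For the Linux case, restarting the whole machine can often be replaced by restarting the Docker service (a sketch; assumes systemd manages Docker):

docker-compose -f docker-compose-build.yml down
sudo systemctl restart docker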

@valeriocos (Member Author) commented Aug 20, 2019

Thank you for answering @MarcioMateus :)
I launched the analysis last night with just two metric providers. I followed the suggestion proposed by @ambpro (crossminer/scava#326 (comment)) and commented out the dumps in the oss-db/Dockerfile. The analysis got blocked:
captura_39
captura_40

Then I did a docker-compose down and got the error reported in #86.

slimbook@slimbook-KATANA:~/Escritorio/sources/scava-deployment$ docker-compose -f docker-compose-build.yml down
Stopping scava-deployment_admin-webapp_1   ... done
Stopping scava-deployment_dashb-importer_1 ... done
Stopping scava-deployment_prosoul_1        ... done
Stopping scava-deployment_kb-service_1     ... done
Stopping scava-deployment_api-server_1     ... 
Stopping scava-deployment_oss-app_1        ... error
Stopping scava-deployment_kibiter_1        ... 
Stopping scava-deployment_auth-server_1    ... error
Stopping scava-deployment_kb-db_1          ... done
Stopping scava-deployment_oss-db_1         ... error
Stopping scava-deployment_elasticsearch_1  ... error

ERROR: for scava-deployment_api-server_1  UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=70)

ERROR: for scava-deployment_kibiter_1  UnixHTTPConnectionPool(host='localhost', port=None): Read timed out. (read timeout=70)
ERROR: An HTTP request took too long to complete. Retry with --verbose to obtain debug information.
If you encounter this issue regularly because of slow network conditions, consider setting COMPOSE_HTTP_TIMEOUT to a higher value (current value: 60).

Finally I did a docker stop on the faulty containers:

slimbook@slimbook-KATANA:~/Escritorio/sources/scava-deployment$ docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES

@valeriocos (Member Author)

@MarcioMateus can you tell me the specs of the machine where you are running your CROSSMINER instance (RAM, disk storage, etc.)? I was thinking that all these problems may be related to a limit on my machine (16 GB RAM, Intel(R) Core(TM) i7-6500U CPU @ 2.50GHz), because I'm not able to get a task analysed there.

@MarcioMateus (Contributor)

@valeriocos, my machine is similar to yours. Actually, I usually set the max RAM to 10-12 GB on my Docker engine. Again, as I use Mac OS, some behaviours may be different.

But thinking more about the issue, I rarely run all the containers on my machine, as it causes a huge strain on my PC and I don't need all of them for our use case. When I need to perform more complete tests (with all the containers), I use our server.

You are probably correct. Those resources may not be enough to run the whole CROSSMINER platform.

@valeriocos (Member Author)

Thank you for the info @MarcioMateus .
Are you limiting the RAM with sudo docker run -it --memory="10g" ..., or are there other commands/params I should take into account?

I'm now trying to analyse the repo https://github.com/chaoss/grimoirelab with the minimal conf below (I also blocked the population of oss-db with the dumps). Let's see tomorrow :)

version: "3"
services:
    admin-webapp:
        build: ./web-admin
        environment:
            - API_GATEWAY_VAR=http://localhost:8086
        depends_on:
            - api-server
        networks:
            - default
        expose:
            - 80
        ports:
            - "5601:80"

    oss-app: #Deploys a container with the OSSMETER platform configured to act as master
             # and to run the api server used by oss-web service
        build: ./metric-platform
        entrypoint: ["./wait-for-it.sh", "oss-db:27017", "-t", "0", "--", "./eclipse", "-master", "-apiServer", "-worker", "w1", "-config", "prop.properties"]
        depends_on:
            - oss-db
            - elasticsearch
        networks:
            - default
        expose: #exposes OSSMETER API client port to oss-web application
            - 8182
            - 8183 #Admin API?
            - 8192 # JMX port
        ports:
          - "8182:8182"
          - "8192:8192"

    oss-db: # data storage service
        build: ./oss-db # restores a dump of the Knowledge Base
        # image: mongo:3.4 #current setup uses mongodb
        networks:
            - default
        expose:  #exposes database port to oss-web and oss-app
            - 27017
        ports:
            - "27017:27017"

    api-server: # API gateway to route the access of REST APIs
        build: ./api-gw

        depends_on:
            - oss-app
            - auth-server

        networks:
            - default

        expose:
            - 8086
        ports:
            - "8086:8086"


    auth-server: # Server responsible for the authentication of the
        build: ./auth
        entrypoint: ["./wait-for-it.sh", "oss-db:27017", "-t", "0", "--", "java", "-jar", "scava-auth-service-1.0.0.jar" ]
      #  entrypoint: ["./wtfc.sh", "oss-app:8182", "--timeout=0", "java", "-jar", "scava-auth-service-1.0.0.jar" ]
        depends_on:
            - oss-db

        networks:
            - default

        expose:
            - 8085

        ports:
            - "8085:8085"

    elasticsearch:
        image: acsdocker/elasticsearch:6.3.1-secured
        command: /elasticsearch/bin/elasticsearch -E network.bind_host=0.0.0.0 -Ehttp.max_content_length=500mb
        networks:
            - default
        expose:
            - 9200
        ports:
          - "9200:9200"
        environment:
          - ES_JAVA_OPTS=-Xms2g -Xmx2g
          - ES_TMPDIR=/tmp

@valeriocos (Member Author)

The analysis reached 69%, but then it stopped, waiting for the token to refresh.
oss-app logs:

oss-app_1        | AbstractInterceptor.intercept( https://api.github.com/repos/chaoss/grimoirelab/issues/135/comments?per_page=100&page=1 )
oss-app_1        | Get platform properties ...
oss-app_1        | Aug 20, 2019 10:38:02 PM org.restlet.engine.log.LogFilter afterHandle
oss-app_1        | INFO: 2019-08-20	22:38:02	172.19.0.6	-	172.19.0.5	8182	GET	/platform/properties	-	200	74	0	4	http://oss-app:8182	Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36	http://localhost:5601/
oss-app_1        | {"url":"https://github.com/chaoss/website"}
oss-app_1        | front stripped: github.com/chaoss/website
oss-app_1        | INFO  [importer.gitHub] (22:38:03): ---> processing repository chaoss/website
oss-app_1        | ERROR [importer.gitHub] (22:38:03): API rate limit exceeded. Waiting to restart the importing...Server returned HTTP response code: 403 for URL: https://api.github.com/repos/chaoss/website?access_token=747f5cf0b834cc90f8a776100ae2aed3af97b9fc
oss-app_1        | INFO  [importer.gitHub] (22:38:03): API rate limit exceeded. Waiting to restart the importing...
oss-app_1        | Aug 20, 2019 10:38:44 PM org.restlet.engine.log.LogFilter afterHandle
oss-app_1        | INFO: 2019-08-20	22:38:44	172.19.0.6	-	172.19.0.5	8182	GET	/analysis/workers	-	200	8640	0	5	http://oss-app:8182	Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36	http://localhost:5601/
oss-app_1        | Aug 20, 2019 10:38:44 PM org.restlet.engine.log.LogFilter afterHandle
oss-app_1        | INFO: 2019-08-20	22:38:44	172.19.0.6	-	172.19.0.5	8182	GET	/analysis/tasks	-	200	16665	0	20	http://oss-app:8182	Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36	http://localhost:5601/
oss-app_1        | Aug 20, 2019 10:38:46 PM org.restlet.engine.log.LogFilter afterHandle
oss-app_1        | INFO: 2019-08-20	22:38:46	172.19.0.6	-	172.19.0.5	8182	GET	/projects	-	200	3610	0	5	http://oss-app:8182	Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36	http://localhost:5601/
oss-app_1        | EXECUTION
oss-app_1        | Aug 20, 2019 10:38:46 PM org.restlet.engine.log.LogFilter afterHandle
oss-app_1        | INFO: 2019-08-20	22:38:46	172.19.0.6	-	172.19.0.5	8182	GET	/analysis/tasks/status/project/grimoirelab	-	200	23	0	8	http://oss-app:8182	Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36	http://localhost:5601/
oss-app_1        | STOP
oss-app_1        | Aug 20, 2019 10:38:46 PM org.restlet.engine.log.LogFilter afterHandle
oss-app_1        | INFO: 2019-08-20	22:38:46	172.19.0.6	-	172.19.0.5	8182	GET	/analysis/tasks/status/project/elasticsearch	-	200	22	0	9	http://oss-app:8182	Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36	http://localhost:5601/
oss-app_1        | Aug 20, 2019 10:38:46 PM org.restlet.engine.log.LogFilter afterHandle
oss-app_1        | INFO: 2019-08-20	22:38:46	172.19.0.6	-	172.19.0.5	8182	GET	/analysis/tasks/project/elasticsearch	-	200	8040	0	3	http://oss-app:8182	Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36	http://localhost:5601/
oss-app_1        | Aug 20, 2019 10:38:46 PM org.restlet.engine.log.LogFilter afterHandle
oss-app_1        | INFO: 2019-08-20	22:38:46	172.19.0.6	-	172.19.0.5	8182	GET	/analysis/tasks/project/grimoirelab	-	200	8626	0	4	http://oss-app:8182	Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36	http://localhost:5601/
oss-app_1        | Get platform properties ...
oss-app_1        | Aug 20, 2019 10:38:50 PM org.restlet.engine.log.LogFilter afterHandle
oss-app_1        | INFO: 2019-08-20	22:38:50	172.19.0.6	-	172.19.0.5	8182	GET	/platform/properties	-	200	74	0	4	http://oss-app:8182	Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36	http://localhost:5601/
oss-app_1        | {"url":"https://github.com/orientechnologies/orientdb"}
oss-app_1        | front stripped: github.com/orientechnologies/orientdb
oss-app_1        | INFO  [importer.gitHub] (22:38:56): ---> processing repository orientechnologies/orientdb
oss-app_1        | ERROR [importer.gitHub] (22:38:56): API rate limit exceeded. Waiting to restart the importing...Server returned HTTP response code: 403 for URL: https://api.github.com/repos/orientechnologies/orientdb?access_token=747f5cf0b834cc90f8a776100ae2aed3af97b9fc
oss-app_1        | INFO  [importer.gitHub] (22:38:56): API rate limit exceeded. Waiting to restart the importing...

I'll restart the docker-compose and see
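A quick way to confirm that the token is just rate-limited, and when the quota resets, is to query the GitHub rate-limit endpoint directly (a sketch; replace <TOKEN> with the token configured in the platform):

curl -s -H "Authorization: token <TOKEN>" https://api.github.com/rate_limit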

@MarcioMateus (Contributor)

Hi @valeriocos, the Docker engine for Mac OS already comes with an interface for configuring these values, e.g.:

But CLI commands to configure these values may also exist on Linux; I don't know.

I think that the command you identified only defines limits for a specific container.
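For what it's worth, a per-container limit can also be applied to an already running container (a sketch; 4g is an arbitrary value, and --memory-swap is set alongside --memory so the update is not rejected when the new memory limit exceeds the current swap limit):

docker update --memory 4g --memory-swap 4g scava-deployment_oss-app_1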

@valeriocos (Member Author)

Thank you @MarcioMateus for the info.

I was able to get the repo https://github.com/chaoss/grimoirelab analyzed (I still have to check the data in Kibana). The steps I followed are below (I guess I'll add something to the scava-deployment README):

Now I'm repeating the analysis with https://github.com/elastic/elasticsearch to see if it works.

Yesterday the execution probably froze because I added a new analysis while the one for grimoirelab was still ongoing. So this morning I deleted the oss-db and oss-app containers and did a docker-compose up. I'm not sure it's the best solution, but it worked.
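For the record, the container removal step looks roughly like this (a sketch; container names as shown by docker ps earlier in this thread):

docker rm -f scava-deployment_oss-db_1 scava-deployment_oss-app_1
docker-compose -f docker-compose-build.yml up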

I would keep this issue open until I get 3-4 repos analyzed.

@valeriocos (Member Author)

The solution seems to work; it can be summarized as follows (a condensed shell version is sketched after the list):

  • Limit the number of services in docker-compose
  • Limit the number of metrics
  • Don't queue task analysis (once a task has finished, add a new one)
  • In case the task analysis stops, these steps seem to work for me:
    • docker-compose stop
    • docker-compose up
    • delete task analysis
    • docker-compose stop
    • docker-compose up
    • recreate and start the task analysis
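Condensed as shell commands (a sketch; the task deletion and recreation steps happen in the admin web UI at http://localhost:5601):

docker-compose -f docker-compose-build.yml stop
docker-compose -f docker-compose-build.yml up
# delete the stuck task analysis in the admin web UI
docker-compose -f docker-compose-build.yml stop
docker-compose -f docker-compose-build.yml up
# recreate and start the task analysis in the admin web UI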

@MarcioMateus (Contributor)

Hi @valeriocos. That is a good summary; let me just add some comments.

I think I never noticed problems with having multiple analysis tasks in a queue, but I accept that it may consume more resources.

When a task is stuck I perform similar steps; however, I usually don't delete the task. I just stop the execution of the task and then start it again, and after some seconds the worker starts analysing it.

@valeriocos (Member Author)

Thank you @MarcioMateus for the info. I'm going to submit a PR to update the scava-deployment README (taking into account your feedback too).

@creat89 (Contributor) commented Aug 22, 2019

I guess it would be interesting and helpful to detect which metrics use a lot of RAM and document them, in order to set a minimum amount of RAM required when all the metrics are to be used.
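As a first step in that direction, per-container memory usage can be sampled while an analysis is running (a sketch using docker stats):

docker stats --no-stream --format "table {{.Name}}\t{{.MemUsage}}\t{{.CPUPerc}}"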
