Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HPCC-32909 Retrieve index settings from ElasticSearch server #19377

Merged
merged 1 commit into from
Jan 14, 2025

Conversation

kenrowland
Copy link
Contributor

@kenrowland kenrowland commented Jan 3, 2025

Added code to query configured index on ElasticSearch server for dynamic mapping template match values for hpcc metric type suffixes.

Signed-Off-By: Kenneth Rowland [email protected]

Type of change:

  • This change is a bug fix (non-breaking change which fixes an issue).
  • This change is a new feature (non-breaking change which adds functionality).
  • This change improves the code (refactor or other change that does not change the functionality)
  • This change fixes warnings (the fix does not alter the functionality or the generated code)
  • This change is a breaking change (fix or feature that will cause existing behavior to change).
  • This change alters the query API (existing queries will have to be recompiled)

Checklist:

  • My code follows the code style of this project.
    • My code does not create any new warnings from compiler, build system, or lint.
  • The commit message is properly formatted and free of typos.
    • The commit message title makes sense in a changelog, by itself.
    • The commit is signed.
  • My change requires a change to the documentation.
    • I have updated the documentation accordingly, or...
    • I have created a JIRA ticket to update the documentation.
    • Any new interfaces or exported functions are appropriately commented.
  • I have read the CONTRIBUTORS document.
  • The change has been fully tested:
    • I have added tests to cover my changes.
    • All new and existing tests passed.
    • I have checked that this change does not introduce memory leaks.
    • I have used Valgrind or similar tools to check for potential issues.
  • I have given due consideration to all of the following potential concerns:
    • Scalability
    • Performance
    • Security
    • Thread-safety
    • Cloud-compatibility
    • Premature optimization
    • Existing deployed queries will not be broken
    • This change fixes the problem, not just the symptom
    • The target branch of this pull request is appropriate for such a change.
  • There are no similar instances of the same problem that should be addressed
    • I have addressed them here
    • I have raised JIRA issues to address them separately
  • This is a user interface / front-end modification
    • I have tested my changes in multiple modern browsers
    • The component(s) render as expected

Smoketest:

  • Send notifications about my Pull Request position in Smoketest queue.
  • Test my draft Pull Request.

Testing:

Copy link

github-actions bot commented Jan 3, 2025

Jira Issue: https://hpccsystems.atlassian.net//browse/HPCC-32909

Jirabot Action Result:
Assigning user: [email protected]
Workflow Transition To: Merge Pending
Updated PR

@kenrowland kenrowland requested a review from rpastrana January 3, 2025 15:45
In all cases, the configuration of the sink must match that of the dynamic mappings in the index.
The object names beginning with "hpcc_metric_" represent the mappings for the three affected hpcc metrics types.
The _match_ value for each defines the expected suffix for each HPCC metric value, based on type, when metrics
are reported and indexed. The object names above are the default values useed by the sink. The index can be
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"useed"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@@ -306,10 +325,11 @@ Add the following to the environment.xml configuration file (note that some valu
<metrics name="mymetricsconfig">
<sinks name="myelasticsink" type="elastic">
<settings period="30" ignoreZeroMetrics="1">
<host domain="<domainname>" port="<port>" protocol="http|https">
<host domain="<domainname>" port="<port>" protocol="http|https"
certificateFilePath="/home/krowland/elastic/cert/http_ca.crt">
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would avoid using your username from this example. I'd opt for something obvious such as "myusername"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced. Question: is there a location on the node where the cert file can be placed so that the sink does not have to specify the path? Could be part of configuring a cluster to use ElasticSearch.

countMetricSuffix: count
gaugeMetricSuffix: gauge
histogramMetricSuffix: histogram

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this new line prob should be removed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

@@ -1,5 +1,5 @@
/*##############################################################################
HPCC SYSTEMS software Copyright (C) 2024 HPCC Systems®.
HPCC SYSTEMS software Copyright (C) 2025 HPCC Systems®.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this was created last year, date shouldn't be updated

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Restored to 2024

@@ -16,21 +16,6 @@
#include "jsecrets.hpp"
#include "jencrypt.hpp"

//including cpp-httplib single header file REST client
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this change directly related to the issue addressed by this jira?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

httplib::Result res = pClient->Get(endpoint.c_str(), elasticHeaders);
if (res->status != 200)
{
WARNLOG("ElasticMetricSink: Unable to retrieve index mapping for index %s", indexName.str());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if httplib::Result provides a message as well, it should be reported as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

httplib result includes a response body (html), but that is likely too long to log. I added the return status code. If we find during testing that additional information from the body is needed, we can review and determine the best way to parse the body and add relevant data to the log.

auto indexConfig = data[indexName.str()];
if (indexConfig.is_null())
{
WARNLOG("ElasticMetricSink: Mapping section for Index '%s' does not exist", indexName.str());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how does one recover from this? Provide useful information to the target audience.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no recovery. The mapping section is required. "Unable to continue" can be added (or similar). This is where Ops would need to properly configure the index. Should a log message tell the target audience to refer to a specific document?

WARNLOG("ElasticMetricSink: Invalid gauge suffix pattern");
return false;
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

else? report unexpected mapping if encountered

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other mappings can exist. We only look for the ones of interest.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't see if this was addressed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above comment I forgot to release. It is not an error for other mappings to exist.

system/metrics/sinks/elastic/elasticSink.cpp Outdated Show resolved Hide resolved
@@ -187,37 +187,41 @@ sink will report metrics from all components to the same index. Therefore, the
index must be configured to receive metrics from all components.

#### Index Configuration
The index must be created in ElasticSearch before metrics can be reported to it. The name is passed
to the sink as a configuration setting. The index must be created with the following settings:
The sink requires the index to be created and configured fully in ElasticSearch in order for the sink to report
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The README updates are meant to be helpful, but I found the wording confusing. It might be worth seeking feedback from the doc guys.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, the information here is beyond the intent of the README. I think we need to consider some separate documentation for each sink (could be in same doc). I'd like to address that with Stu and Jim. But for now, I wanted to try and capture the information.

Copy link
Member

@rpastrana rpastrana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

follow up comments...

@@ -38,6 +37,10 @@ ElasticMetricSink::ElasticMetricSink(const char *name, const IPropertyTree *pSet
configurationValid = true;
PROGLOG("ElasticMetricSink: Loaded and configured");
}
else
{
PROGLOG("ElasticMetricSink: Unable to complete initialization, sink not collecting");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a warning rather than just progress log. Also, if we know why the problem occurred (either host or index config invalid/missing) we should report it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to WARNLOG. If init fails, there will be at least one log message prior to this one with the specific reason.

@@ -175,9 +177,16 @@ bool ElasticMetricSink::getDynamicMappingSuffixesFromIndex(const IPropertyTree *
std::string endpoint;
endpoint.append("/").append(indexName.str()).append("/_mapping");
httplib::Result res = pClient->Get(endpoint.c_str(), elasticHeaders);

if (res == nullptr)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

per online search:
Error Handling:
If the Result is null, it indicates that the request failed. You can use httplib::Error enum to get more details about the error.
and I found this sample error handling pattern:
if (res && res->status == 200)
{
std::cout << res->body << "\n";
}
else
{
auto err = res.error();
std::cout << httplib::to_string(err) << "\n";
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added code to capture the error, but there is not httplib::to_string function, so just the number is displayed.

@@ -291,6 +298,10 @@ void ElasticMetricSink::intializeElasticClient()
// Add cert path if needed
if (!certificateFilePath.isEmpty())
pClient->set_ca_cert_path(certificateFilePath.str());

pClient->set_connection_timeout(5);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

default timeout values are always difficult and usually don't cover most situations. A couple of suggestions:

  • make these configurable
  • connect and read/write timeout values should be independent, 5 sec connection timeout might be reasonable, but successful reads/writes might take longer than 5 secs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added configurable host parameters. Note that after the initial read, we will only be doing writes to index metrics.

{
std::string endpoint;
endpoint.append("/").append(indexName.str());
auto res = pClient->Get(endpoint.c_str(), elasticHeaders);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is pClient only used to extract index info? if we intent to use it for other functionality, the above timeout values might not be reasonable.

@@ -318,9 +329,15 @@ bool ElasticMetricSink::validateIndex()
endpoint.append("/").append(indexName.str());
auto res = pClient->Get(endpoint.c_str(), elasticHeaders);

if (res->status != 200)
if (res == nullptr)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see above suggestion on libhttp err handling

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additional handling added.

}


bool ElasticMetricSink::prepareToStartCollecting()
bool ElasticMetricSink::convertPatternToSuffix(const char *pattern, StringBuffer &suffix)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could this be static, or does it need to be a member? also, is it const?

WARNLOG("ElasticMetricSink: Invalid gauge suffix pattern");
return false;
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't see if this was addressed.

@kenrowland kenrowland requested a review from rpastrana January 8, 2025 19:40
Copy link
Member

@rpastrana rpastrana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kenrowland
Copy link
Contributor Author

@ghalliday please merge.

Copy link
Member

@ghalliday ghalliday left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any problems.

{
StringBuffer countSuffixMappingName;
if (!pIndexConfigTree->getProp("@countSuffixMappingName", countSuffixMappingName))
countSuffixMappingName.append("hpcc_metrics_count_suffix");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you are concerned about efficiency, then the following is better since it avoids cloning the strings:

const char * countSuffixMappingName = pIndexConfigTree->queryProp("@countSuffixMappingName");
if (!countSuffixMappingName)
        countSuffixMappingName = "hpcc_metrics_count_suffix";

However this is only called once, so efficiency is not significant. Not something to change.

auto const &mappingName = it.first;
auto kvPairs = it.second;
auto const &matchValue = kvPairs["match"];
if (mappingName == countSuffixMappingName.str())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is confusing - because the RHS is a const char *, so this would intuitively be a pointer comparison. I think it probably is a string comparison - because the LHS is a std:string - but that isn't at all clear because of the use of auto.
I had thought this would mean each comparison is converting the const char * to a str::string, but the standard library does optimize that case.

Use strsame/streq would be clearer.

@@ -18,6 +18,22 @@
#include "jptree.hpp"
#include "jstring.hpp"

//including cpp-httplib single header file REST client
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A better solution would be to move the definition of the ElasticMetricSink class into the c++ file. If this was included from many places, then each of those places would now be processing the entire header file.

@ghalliday
Copy link
Member

@kenrowland there is an (unrelated) build error which needs addressing - probably by disabling a warning - before this can be merged.

Added code to query configured index on ElasticSearch server for dynamic
mapping template match values for hpcc metric type suffixes.

Signed-Off-By: Kenneth Rowland [email protected]
@kenrowland
Copy link
Contributor Author

@ghalliday Build has been fixed. Needed to disable a warning in the ElasticSink as predicted.

@ghalliday
Copy link
Member

@kenrowland please do not squash and rebase at the same time - it makes it very hard to track the incremental changes you have made.

@ghalliday ghalliday merged commit c3adb18 into hpcc-systems:master Jan 14, 2025
49 checks passed
Copy link

Jirabot Action Result:
Added fix version: 9.10.0
Workflow Transition: 'Resolve issue'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants