Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor OpenSearch Relevance Functions out of Core #2019

Closed

Conversation

MitchellGale
Copy link
Contributor

@MitchellGale MitchellGale commented Aug 22, 2023

Description

Moving Relevance-based search functions out of core module and into opensearch module.
This only affects the Function Resolver.

Out of Scope:

Issues Resolved

Partially fixes #811

Internal review: Bit-Quill#338

Check List

  • New functionality includes testing.
    • All tests pass, including unit test, integration test and doctest
  • New functionality has been documented.
    • New functionality has javadoc added
    • New functionality has user manual doc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

* Created Relevance and OpenSeach Functions in opensearch module

Signed-off-by: Guian Gumpac <[email protected]>

* Moving Test in progress. Committing for rebase purposes

Signed-off-by: Guian Gumpac <[email protected]>

* Moved tests out from core

Signed-off-by: Guian Gumpac <[email protected]>

* Pulled out analyzers to opensearch module

Signed-off-by: Guian Gumpac <[email protected]>

* Revert "Pulled out analyzers to opensearch module"

This reverts commit a602e30.

* Fixed jacoco for opensearch module

Signed-off-by: Guian Gumpac <[email protected]>

* Refactor out OpenSearchFunction from OpenSearchFunctions; Move OpenSearchFunctions and RelevanceFunctionResolver

Signed-off-by: acarbonetto <[email protected]>

* Fixed tests in core. WIP

Signed-off-by: Guian Gumpac <[email protected]>

* Update to use OpenSearchDSL

Signed-off-by: acarbonetto <[email protected]>

* Fixed nested test coverage

Signed-off-by: Guian Gumpac <[email protected]>

* Update datasource metadata to check for initial cluster state

Signed-off-by: acarbonetto <[email protected]>

* Add score functions to analyzer test

Signed-off-by: acarbonetto <[email protected]>

* Re-add nested tests into core

Signed-off-by: acarbonetto <[email protected]>

* Added hashcode for DataSourceMetadata

Signed-off-by: Guian Gumpac <[email protected]>

* Added test for score without boost

Signed-off-by: Guian Gumpac <[email protected]>

* Re-add nested tests in Analyzer

Signed-off-by: acarbonetto <[email protected]>

* Clean up analyzer tests

Signed-off-by: acarbonetto <[email protected]>

* Cleaned up some code

Signed-off-by: Guian Gumpac <[email protected]>

* Fix jacoco and add unit tests for coverage

Signed-off-by: acarbonetto <[email protected]>

---------

Signed-off-by: Guian Gumpac <[email protected]>
Signed-off-by: acarbonetto <[email protected]>
Co-authored-by: acarbonetto <[email protected]>
@acarbonetto acarbonetto changed the title Refactor Relevance out of Core Refactor OpenSearch Relevance Functions out of Core Aug 23, 2023
@codecov
Copy link

codecov bot commented Aug 23, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.30%. Comparing base (ed2b683) to head (f008573).
Report is 271 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff            @@
##               main    #2019   +/-   ##
=========================================
  Coverage     97.30%   97.30%           
- Complexity     4623     4626    +3     
=========================================
  Files           407      409    +2     
  Lines         11935    11946   +11     
  Branches        828      830    +2     
=========================================
+ Hits          11613    11624   +11     
  Misses          315      315           
  Partials          7        7           
Flag Coverage Δ
sql-engine 97.30% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

assertEquals("match(test=\"test\")", function.toString());
}

// @Test
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove

lenient().when(table.createScanBuilder()).thenReturn(tableScanBuilder);
}

// @Test
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove this file... it provides no new tests

acarbonetto
acarbonetto previously approved these changes Aug 23, 2023
Signed-off-by: acarbonetto <[email protected]>
import org.opensearch.sql.storage.StorageEngine;
import org.opensearch.sql.storage.Table;

public class EmptyDataSourceService {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need a javadoc and a good description why it is here

@@ -46,7 +49,7 @@ public class BuiltinFunctionRepository {
private final Map<FunctionName, FunctionResolver> functionResolverMap;

/** The singleton instance. */
private static BuiltinFunctionRepository instance;
private static final Map<Integer, BuiltinFunctionRepository> instance = new HashMap<>();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extend the comment and probably rename to instances.

Set<DataSourceMetadata> dataSourceMetadataSet = dataSourceService.getDataSourceMetadata(true);
Set<Integer> dataSourceServiceHashSet =
dataSourceMetadataSet.stream()
.map(metadata -> metadata.hashCode())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it could be optimized to

Suggested change
.map(metadata -> metadata.hashCode())
.map(Object::hashCode)

.getDataSource(metadata.getName())
.getStorageEngine()
.getFunctions()
.forEach(function -> repository.register(function));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.forEach(function -> repository.register(function));
.forEach(repository::register);

Comment on lines +104 to +105
instance.put(metadata.hashCode(), repository);
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
instance.put(metadata.hashCode(), repository);
}
}
instance.put(metadata.hashCode(), repository);

new NamedArgumentExpression(
"test", new LiteralExpression(new ExprStringValue("test")))));
assertEquals(BOOLEAN, function.type());
assertThrows(UnsupportedOperationException.class, () -> function.valueOf(null));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check exception message

@@ -80,6 +80,9 @@ public OpenSearchDataSourceMetadataStorage(

@Override
public List<DataSourceMetadata> getDataSourceMetadata() {
if (!this.clusterService.getClusterApplierService().isInitialClusterStateSet()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment when this happens

public static final String INT_TYPE_NULL_VALUE_FIELD = "int_null_value";
public static final String INT_TYPE_MISSING_VALUE_FIELD = "int_missing_value";
public static final String DOUBLE_TYPE_NULL_VALUE_FIELD = "double_null_value";
public static final String DOUBLE_TYPE_MISSING_VALUE_FIELD = "double_missing_value";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are all these things used in the test?

public class OpenSearchDSL {
private OpenSearchDSL() {}

public static LiteralExpression literal(Byte value) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to copy these things from DSL: literal, ref and named


LogicalPlan plan =
project(
nested(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to copy the entire set. Keep project, nested and values

@Swiddis
Copy link
Collaborator

Swiddis commented Jan 7, 2025

Closing as stale -- reopen if needed

@Swiddis Swiddis closed this Jan 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEATURE] Make OpenSearch function registered dynamically
5 participants