Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Adding code for sending parameter "response_format" as request payload #2317

Closed

Conversation

FarrukhMasud
Copy link
Contributor

Related Issues/PRs

For newer models, openai accepts response_format, which can be text or json_object. This PR adds the parameter for this. Additionally, when developer set postprocessingoptions, in OpenAIPrompt class, post processing field is reduncdant and can be made optional. We cannot infer post processing from response format because json as response format is not supported by many models including GPT-4 and GPT-3.5-Turbo. All changes made in this PR as backward compatible and will not break any existing code.

What changes are proposed in this pull request?

In this PR, we are adding:

  1. Ability to add parameter for response_format in OpenAIPrompt class and OpenAIChatCompletion class.
  2. Added unit tests to validate this
  3. Added code to infer postprocessing from postprocessingoptions and hence making it optional.

How is this patch tested?

Unit tests and integration tests are added to validate this change

Does this PR change any dependencies?

  • No. You can skip this section.
  • Yes. Make sure the dependencies are resolved correctly, and list changes here.

Does this PR add a new feature? If so, have you added samples on website?

  • No. You can skip this section.
  • Yes. Make sure you have added samples following below steps.

@FarrukhMasud
Copy link
Contributor Author

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@codecov-commenter
Copy link

codecov-commenter commented Nov 21, 2024

Codecov Report

Attention: Patch coverage is 92.53731% with 5 lines in your changes missing coverage. Please review.

Project coverage is 84.50%. Comparing base (08aab6a) to head (14d460f).

Files with missing lines Patch % Lines
...zure/synapse/ml/services/openai/OpenAIPrompt.scala 85.71% 3 Missing ⚠️
...apse/ml/services/openai/OpenAIChatCompletion.scala 95.55% 2 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##           master    #2317   +/-   ##
=======================================
  Coverage   84.50%   84.50%           
=======================================
  Files         326      326           
  Lines       16755    16810   +55     
  Branches     1480     1498   +18     
=======================================
+ Hits        14158    14205   +47     
- Misses       2597     2605    +8     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.


🚨 Try these New Features:

@@ -55,11 +132,20 @@ class OpenAIChatCompletion(override val uid: String) extends OpenAIServicesBase(
override def responseDataType: DataType = ChatCompletionResponse.schema

private[this] def getStringEntity(messages: Seq[Row], optionalParams: Map[String, Any]): StringEntity = {
val mappedMessages: Seq[Map[String, String]] = messages.map { m =>
var mappedMessages: Seq[Map[String, String]] = messages.map { m =>
Copy link
Collaborator

@mhamilton723 mhamilton723 Nov 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: please no vars in code unless required. In scala any kind of mutability is highly discouraged. Here in this case you can return either the original or the original plus your addition in the if statement

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

made mappedMessages val again

Comment on lines 112 to 142
val responseFormat = new Param[String](
this, "responseFormat", "The response format from the OpenAI API.")

def getResponseFormat: String = $(responseFormat)

def setResponseFormat(value: String): this.type = {
if (value.isEmpty) {
this
} else {
val normalizedValue = value.toLowerCase match {
case "json" => "json_object"
case other => other
}

// Validate the normalized value using the OpenAIResponseFormat enum
if (!OpenAIResponseFormat.values
.map(_.asInstanceOf[OpenAIResponseFormat.ResponseFormat].name)
.contains(normalizedValue)) {
throw new IllegalArgumentException("Response format must be valid for OpenAI API. " +
"Currently supported formats are " + OpenAIResponseFormat.values
.map(_.asInstanceOf[OpenAIResponseFormat.ResponseFormat].name)
.mkString(", "))
}

set(responseFormat, normalizedValue)
}
}

def setResponseFormat(value: OpenAIResponseFormat.ResponseFormat): this.type = {
this.setResponseFormat(value.name)
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this code seems duplicated, can it be abstracted and shared?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am now using shared code.

assert(nonNullCount == 4)
}

test("Basic Usage - Gpt 4o with response format text") {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice!

Comment on lines 295 to 311
test("setResponseFormat should set the response format correctly for valid values") {
val prompt = new OpenAIPrompt()
prompt.setResponseFormat("text")
prompt.getResponseFormat should be ("text")

prompt.setResponseFormat("json")
prompt.getResponseFormat should be ("json_object")

prompt.setResponseFormat("json_object")
prompt.getResponseFormat should be ("json_object")

prompt.setResponseFormat("jSoN")
prompt.getResponseFormat should be ("json_object")

prompt.setResponseFormat("TEXT")
prompt.getResponseFormat should be ("text")
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these tests look duplicated from above, if the core parameter flexibility is shared between the two we can keep just 1 copy and still have same basic coverage

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

@mhamilton723
Copy link
Collaborator

Awesome! Only minor nits!

@FarrukhMasud
Copy link
Contributor Author

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

…ts can be reused, in this iteration, I am removing this possibility. Now each test create a new OpenAIPrompt
@FarrukhMasud
Copy link
Contributor Author

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@FarrukhMasud
Copy link
Contributor Author

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@FarrukhMasud
Copy link
Contributor Author

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@FarrukhMasud
Copy link
Contributor Author

/azp run

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@FarrukhMasud FarrukhMasud deleted the response-format-02 branch January 6, 2025 17:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants