SNOW-1794373: Support DataFrameWriter.insertInto/insert_into #2835
Conversation
full_table_name = (
    table_name if isinstance(table_name, str) else ".".join(table_name)
)
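The name-normalization step in the hunk above can be sketched standalone; this is a hypothetical helper (not the PR's actual function) showing how a table name given either as a string or as a list of identifier parts collapses to one dotted name:

```python
def full_table_name(table_name):
    """Return a dotted table name from either a string or identifier parts."""
    return table_name if isinstance(table_name, str) else ".".join(table_name)

# A string passes through unchanged; a list of parts is joined with dots.
print(full_table_name("db.sch.t"))          # "db.sch.t"
print(full_table_name(["db", "sch", "t"]))  # "db.sch.t"
```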
Reminder to add a TODO to build AST for this API: https://snowflakecomputing.atlassian.net/browse/SNOW-1489960
if target_table.schema != self._dataframe.schema:
    raise SnowparkClientException(
        f"Schema of the DataFrame: {self._dataframe.schema} does not match the schema of the table {full_table_name}: {target_table.schema}."
    )
I'm not sure if we should do this check on the client side. I think it is possible to append the data by type coercion even though the schemas are not an exact match.
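To illustrate the reviewer's point: an exact schema-equality check is stricter than what the server can accept, since the server may coerce compatible types on insert. The coercion table below is an assumption for illustration only, not Snowflake's actual rules:

```python
# Strict client-side check, as in the diff above: schemas must match exactly.
def exact_match(df_types, table_types):
    return df_types == table_types

# Looser notion: each DataFrame type is coercible to the table column's type.
# (Hypothetical coercion pairs, for illustration only.)
COERCIBLE = {("int", "int"), ("int", "float"), ("float", "float"), ("str", "str")}

def coercible(df_types, table_types):
    return len(df_types) == len(table_types) and all(
        (d, t) in COERCIBLE for d, t in zip(df_types, table_types)
    )

# An int column inserted into a float column fails the exact check
# but would plausibly succeed server-side via coercion.
print(exact_match(["int"], ["float"]))  # False
print(coercible(["int"], ["float"]))    # True
```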
    parse_table_name(table_name) if isinstance(table_name, str) else table_name
)

target_table = self._dataframe._session.table(qualified_table_name)
Does it work in PySpark even when the table does not exist? I think this would fail if the table doesn't exist.
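The failure mode the reviewer describes can be sketched with a plain dict standing in for the session's table catalog (hypothetical, not Snowpark's API): resolving the target table up front means a missing table raises instead of being silently created.

```python
# Stand-in "catalog" of existing tables mapped to their column names.
catalog = {"db.sch.existing": ["id", "name"]}

def resolve_table(name):
    """Look up an existing table; insert-into never creates the target."""
    if name not in catalog:
        raise KeyError(f"Table {name} does not exist; insert_into requires it.")
    return catalog[name]

print(resolve_table("db.sch.existing"))  # ['id', 'name']
```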
Which Jira issue is this PR addressing? Make sure that there is an accompanying issue to your PR.
Fixes SNOW-NNNNNNN
Fill out the following pre-review checklist:
Please describe how your code solves the related issue.
Implement DataFrameWriter.insert_into/insertInto, which appends the DataFrame's content to an existing table and requires a matching schema;
https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.DataFrameWriter.insertInto.html
Supported in both live and local testing modes.
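The insert-into semantics described above (append to an already existing table, with an overwrite option as in PySpark's `insertInto`) can be modeled without a session; tables here are plain lists of row tuples, a stand-in rather than the real Snowpark implementation:

```python
# In-memory stand-in for existing tables: name -> list of row tuples.
tables = {"t1": [(1, "a")]}

def insert_into(table_name, rows, overwrite=False):
    """Append rows to an existing table; replace its contents if overwrite."""
    if table_name not in tables:
        raise ValueError(f"table {table_name} does not exist")
    if overwrite:
        tables[table_name] = list(rows)
    else:
        tables[table_name].extend(rows)

insert_into("t1", [(2, "b")])
print(tables["t1"])  # [(1, 'a'), (2, 'b')]
```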