HDDS-11962. [Docs] Hive Integration #7596
Thanks @jojochuang for the patch. The Hive doc looks good.
If you would like to open separate PRs for the doc pages, but not for the index page: create a PR for each content page, and add the same index page in all of them. But do not add other content pages. This way any PR can be merged first and the rest only need the index removed, rather than requiring a specific merge order in the chain.
Thanks for adding this @jojochuang. Disclaimer: I didn't actually test the steps but I know you've done a lot of work on ozone integration. Mostly minor comments on formatting, I appreciate how concise the content is!
## Using the S3A Protocol
In addition to ofs, Hive can access Ozone via the S3 Gateway using the S3A file system. For more details, refer to the [S3 Protocol]({{< ref "interface/S3.md">}}) documentation.
Let's also link to the hadoop docs on s3a here, and maybe hive's specific docs on s3a integration if they exist.
Cloudera's user doc https://docs.cloudera.com/cdp-private-cloud-base/7.1.9/ozone-storing-data/topics/ozone-access-ozone-s3-using-s3a-filesystem.html is the most comprehensive and accurate one. However I'll refrain from linking a vendor's doc in Apache.
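For illustration, a minimal sketch of what S3A access from Hive could look like. It assumes the S3A connector is already configured to point at the Ozone S3 Gateway (endpoint and credentials); the table name, columns, and the `s3a://bucket1/...` path are hypothetical and not taken from the patch.

```sql
-- Hypothetical external table whose data lives in an Ozone bucket
-- exposed through the S3 Gateway. Assumes the S3A endpoint and
-- credentials are already set in the Hadoop/Hive configuration.
CREATE EXTERNAL TABLE s3a_demo (
  id   INT,
  name STRING
)
STORED AS PARQUET
LOCATION 's3a://bucket1/warehouse/s3a_demo';

-- Query it like any other table.
SELECT COUNT(*) FROM s3a_demo;
```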
e7ca1e2 to 1a6a96b
Change-Id: I4a90e652f4e6a8f9007252a014c2b4fb473b030f
```sql
CREATE DATABASE d1 MANAGEDLOCATION 'ofs://ozone1/vol1/bucket1/data';
```

Tables created in the database d1 will be stored under the specified path:
`ofs://ozone1/vol1/bucket1/data`
Managed tables will be stored in that path. `MANAGEDLOCATION` is for managed tables, and `LOCATION` is for external tables.
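A minimal sketch of that distinction, assuming a Hive version that supports `MANAGEDLOCATION` in `CREATE DATABASE`. The database name `d2` and the `external` path are hypothetical; only the `ofs://ozone1/vol1/bucket1/data` path comes from the doc.

```sql
-- LOCATION sets the default root for external tables in the database;
-- MANAGEDLOCATION sets the default root for managed tables.
-- The 'external' path below is a made-up example.
CREATE DATABASE d2
  LOCATION 'ofs://ozone1/vol1/bucket1/external'
  MANAGEDLOCATION 'ofs://ozone1/vol1/bucket1/data';

-- Managed table: expected under the MANAGEDLOCATION path.
CREATE TABLE d2.managed_t (id INT);

-- External table without its own LOCATION: expected under the LOCATION path.
CREATE EXTERNAL TABLE d2.external_t (id INT);
```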
Thanks @jojochuang. Minor comments, the rest LGTM.
* With external tables, the data is expected to be created and managed by another tool.
* Hive queries the data as-is.
* The metadata is stored under the external warehouse directory.
The data is stored in the external warehouse directory if no `LOCATION` is specified when creating an external table. If it is not needed, maybe we can drop the explanation of external tables itself, since that goes a bit into Hive's scope.
* The metadata is stored under the external warehouse directory.
* Note: Dropping an external table in Hive does not delete the associated data.

You can also have the metadata for the external tables stored in Ozone too by applying the following configuration in the `hive-site.xml` file:
Metadata is stored in HMS; only the data is stored in the storage layer. The config below sets the default path where the data for external tables is stored when the path isn't explicitly specified while creating the external table.
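To make the point concrete, a minimal sketch of the two cases. The table names, columns, and the explicit path are hypothetical; the setting under discussion is presumably `hive.metastore.warehouse.external.dir`, which is an assumption since the excerpt does not show the property itself.

```sql
-- No LOCATION clause: the data files are expected to land under the
-- default external warehouse directory (assumed to be configured to an
-- Ozone path in hive-site.xml).
CREATE EXTERNAL TABLE ext_default (id INT, name STRING);

-- Explicit LOCATION: the default directory is not used for this table.
-- The path is a made-up example.
CREATE EXTERNAL TABLE ext_explicit (id INT, name STRING)
  LOCATION 'ofs://ozone1/vol1/bucket1/ext_explicit';

-- In both cases the table metadata lives in HMS. By default, dropping an
-- external table removes only that metadata and leaves the data in place.
DROP TABLE ext_explicit;
```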
Thanks for the feedback! Please take a look at the updated doc again.
Change-Id: Id937eef2cf13bd41fdb1afd0d6f8b8377bf6a785
What changes were proposed in this pull request?
HDDS-11962. [Docs] Hive Integration
Please describe your PR in detail:
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-11947
How was this patch tested?
./hadoop-ozone/dev-support/checks/docs.sh passed.