
Data: Add partition stats writer and reader #11216

Open · wants to merge 2 commits into base: main
Conversation

@ajantha-bhat ajantha-bhat commented Sep 26, 2024

Introduce APIs to write the partition stats into files in the table's default format, using the Iceberg generic writers and readers.

PartitionStatisticsFile partitionStatisticsFile =
        PartitionStatsHandler.computeAndWriteStatsFile(testTable, "b1");

testTable.updatePartitionStatistics().setPartitionStatistics(partitionStatisticsFile).commit();


@Override
@SuppressWarnings("checkstyle:CyclomaticComplexity")
public boolean equals(Object other) {
ajantha-bhat (Member Author):
StructLikeMap previously handled this implicitly. But now that PartitionStatsRecord wraps PartitionStats for the writers, it needs to override equals and hashCode.
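A minimal sketch of the point, using hypothetical names (the real class wraps PartitionStats): a wrapper must delegate equals and hashCode to the wrapped object so that lookups which previously relied on StructLikeMap semantics still match.

```java
// Hypothetical sketch; the real wrapper holds a PartitionStats instance.
public class StatsWrapperSketch {
  private final Object wrapped;

  public StatsWrapperSketch(Object wrapped) {
    this.wrapped = wrapped;
  }

  @Override
  public boolean equals(Object other) {
    if (this == other) {
      return true;
    }
    if (!(other instanceof StatsWrapperSketch)) {
      return false;
    }
    // Delegate to the wrapped object so two wrappers around equal stats
    // compare equal, as StructLikeMap arranged implicitly before.
    return wrapped.equals(((StatsWrapperSketch) other).wrapped);
  }

  @Override
  public int hashCode() {
    return wrapped.hashCode();
  }
}
```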

StructLike coercedPartition =
PartitionUtil.coercePartition(partitionType, spec, file.partition());
StructLike key = keyTemplate.copyFor(coercedPartition);
Record key = coercedPartitionRecord(file, spec, partitionType);
ajantha-bhat (Member Author):

Need Record instead of PartitionData for the writers.

Cannot keep this conversion in the data module, as it just wraps the same PartitionStats object.


/** Wraps the {@link PartitionStats} as {@link Record}. Used by generic writers and readers. */
public class PartitionStatsRecord implements Record, StructLike {
private static final LoadingCache<StructType, Map<String, Integer>> NAME_MAP_CACHE =
ajantha-bhat (Member Author), Sep 27, 2024:

Class is similar to GenericRecord but for a specific partition stats schema.

Reviewer (Member):

I'm a little confused why we need a special class for this? GenericRecord should work, right? Also, Record already implements StructLike, so declaring both is unnecessary.

ajantha-bhat (Member Author):

> I'm a little confused why we need a special class for this? GenericRecord should work right?

I previously got a review comment from Anton that keeping Record in the public interface of the writers and readers is fragile. So a new class was introduced that is less fragile (coupled to the partition stats schema and just wrapping PartitionStats).

#10176 (comment)

Reviewer (Member):

I'm still not sure why we need a special class here. His comment was just to remove Record from the public interface, which has been done. I don't think creating a new special class (which is public) is necessary, since the records only exist within private handler code. @aokolnychyi was in Europe last I checked, but when he is back he can check it out.

ajantha-bhat (Member Author), Oct 30, 2024:

Also, GenericRecord can't wrap PartitionStats; it maintains its own data array.
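To illustrate the difference (hypothetical stand-in classes, not the actual Iceberg ones): a generic record copies values into its own array, while a wrapper reads through to the underlying object, which is why GenericRecord cannot wrap PartitionStats.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the actual Iceberg classes.
class CopyingRecord {
  private final Object[] data; // owns its own storage

  CopyingRecord(int size) {
    this.data = new Object[size];
  }

  void set(int pos, Object value) {
    data[pos] = value; // stores the value in its own array
  }

  Object get(int pos) {
    return data[pos];
  }
}

class WrappingRecord {
  private final List<Object> source; // stands in for the wrapped PartitionStats

  WrappingRecord(List<Object> source) {
    this.source = source;
  }

  Object get(int pos) {
    return source.get(pos); // reads through; no separate data array
  }
}
```

The wrapper reflects later changes to the underlying object, while the copying record would not.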

Schema schema,
PartitionSpec spec,
int formatVersion,
Map<String, String> properties) {
ajantha-bhat (Member Author):

There was no option to pass the table properties before.
Needed to pass a different file format for the parameterized test.

@ajantha-bhat ajantha-bhat added this to the Iceberg 1.7.0 milestone Sep 27, 2024
ajantha-bhat (Member Author):

@aokolnychyi: This PR is ready. But as we discussed previously, this PR wraps PartitionStats into a Record, as the writers cannot work with Iceberg internal objects yet.

I will explore adding the internal writers for Parquet and ORC, similar to #11108.
If we fail to have it ready by 1.7.0, I think it makes sense to merge this PR and introduce the optimized writer in the next version, deprecating this writer.

@ajantha-bhat ajantha-bhat marked this pull request as ready for review September 27, 2024 02:00
@ajantha-bhat ajantha-bhat mentioned this pull request Oct 16, 2024
11 tasks
ajantha-bhat (Member Author):

@RussellSpitzer: It would be good to have this in 1.7.0.
I have been waiting a month for a review.

aokolnychyi (Contributor):

I think we should try to use "internal" writers. @rdblue added "internal" readers recently.

Any guidance on how to add a writer, @rdblue? We can start with Avro for now. We will also need such readers/writers for Parquet.


ajantha-bhat commented Oct 24, 2024

@aokolnychyi, @rdblue:

I already tried a POC for internal writers on another branch:
c209bc9

The problems:
a) I am using PartitionData instead of Record for the partition value, but PartitionData's get() method wraps the byte array into a ByteBuffer, which is a problem for the internal writers: they expect byte[]. So I didn't feel like using a new class instead of PartitionData just for this.

b) Also, using PartitionData in StructLikeMap is not working correctly; some keys are missing from the map (looks like equals() logic). If I use Record, it is fine.

Maybe in the next version we can have an optimized writer and reader (without converters, using the internal readers and writers).
For the end user it makes no difference, as new readers can also read the old partition stats parquet file and old readers can read the new one. So, can we merge this?

}

PartitionStats that = (PartitionStats) other;
return Objects.equals(partition, that.partition)
Reviewer (Member):

StructLike doesn't have equals; I think you need to use a StructLike comparator here.

ajantha-bhat (Member Author), Oct 28, 2024:

We are storing a Record as the partition, from coercedPartitionRecord().
Since GenericRecord has equals implemented, calling Objects.equals works here.
Hence, I didn't add comparator logic.

But I agree that one needs to understand the implementation to see this; just by looking at this class, it looks like we need comparator logic. I can update it if necessary.

ajantha-bhat (Member Author):

I also tried adding the comparator logic today, passing the comparator of the partition type.

Since we convert the partition values for the writer in PartitionStatsHandler.statsToRecords(), the comparator expects an integer value for a date column but we have converted the values to LocalDate, so the comparison fails.

If I don't use the comparator, Record.equals() is called, which does an array compare and passes.

Reviewer (Member):

Sorry, but we can't assume the subtype here unless you want to assert and change the field type above. If we say something is a StructLike, we can't assume it behaves like a Record (even if, given the current code, we know what it will be). If you want it to be a Record, you need to cast it and assert earlier in the class.

ajantha-bhat (Member Author):

Agree. I have added the assertion (Preconditions) to make sure it is always of type Record. Also added a comment explaining why the type is kept as StructLike instead of Record when it is always a Record.

It is because, in the future, when we introduce internal parquet writers that work with StructLike instead of Record, we won't have to change method signatures and it will stay compatible.
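A sketch of that compromise, with stand-in interfaces (the real code uses Iceberg's StructLike and Record): the field keeps the broader type for forward compatibility, while a precondition fails fast on anything that is not a Record.

```java
// Stand-in interfaces; the real code uses org.apache.iceberg.StructLike
// and org.apache.iceberg.data.Record.
interface StructLikeIsh {}

class RecordIsh implements StructLikeIsh {}

class PartitionHolderSketch {
  // Kept as the broader interface so future internal writers that work
  // with StructLike won't force a signature change.
  private StructLikeIsh partition;

  void setPartition(StructLikeIsh value) {
    // Fail fast: today's writers require the Record subtype.
    if (!(value instanceof RecordIsh)) {
      throw new IllegalArgumentException(
          "Partition must be a Record until internal writers land");
    }
    this.partition = value;
  }
}
```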

RussellSpitzer (Member):

Moving out of 1.7.0 since we still have a bit of discussion here


ajantha-bhat commented Nov 19, 2024

@RussellSpitzer: I have added the assertion for the partition type as you suggested and replied to #11216 (comment). Do you have any more comments on this PR?

aokolnychyi (Contributor):

I had a conversation with @rdblue today about internal writers. Ryan should have a bit of time to help/guide.
I will check the current implementation today too.

@@ -205,6 +211,8 @@ public <T> T get(int pos, Class<T> javaClass) {
public <T> void set(int pos, T value) {
switch (pos) {
case 0:
Preconditions.checkArgument(
Reviewer (Contributor):

It feels a bit awkward to rely on Record for a nested field while the main object is simply StructLike.

ajantha-bhat (Member Author):

I can keep the member as Record instead of StructLike to avoid this. But since we plan to use internal writers in the future (which use StructLike), we would lose compatibility if we keep the member as Record instead of StructLike.

I don't think it is too awkward, as Record implements StructLike.

}

@VisibleForTesting
static Iterator<PartitionStatsRecord> statsToRecords(
Reviewer (Contributor):

We are doing a lot of logic here that wouldn't be needed with internal readers and writers. Let's at least estimate the amount of work to get the internal writer for Avro, to begin with. Any thoughts, @rdblue?

ajantha-bhat (Member Author):

> Let's at least estimate the amount of work to get the internal writer for Avro

This PR needs internal writers for Parquet and ORC as well, not just Avro.
Considering that 1.8.0 is planned for the end of this month and holidays are coming up next month, I don't want to miss the release train again (like 1.7.0).

We have been waiting for partition stats for a long time (almost a year), and this PR is implemented based on what is available in current Iceberg. I agree that having internal writers would be nice, but that can be added later; the PR is designed so that we can replace the current writers with internal writers without losing compatibility.

So I don't see a reason to block the development of this feature.
Merging this PR will complete the milestone for partition stats.

ajantha-bhat (Member Author):

And regarding the effort for internal writers: I tried a POC last time, just for Parquet.
ajantha-bhat@c209bc9

It introduced GenericStructParquetWriter, used by GenericStructFileWriterFactory.
Since it uses BaseParquetWriter, which expects LocalDate for the date type (and similarly for other types), my doubt was whether we should use BaseParquetWriter with a converter for the internal writers, or instead refactor ParquetValueWriters and ColumnWriter.

Also, if we use writers with converters, the StructLike comparator will fail, as it needs an int for the date type but the final value is a LocalDate.

Reviewer (Member):

Do we really need internal writers here? I understand it's an improvement, but I don't think it's a blocker for this PR. The effort has been ongoing for a long time now; I would be more in favor of moving forward soon and planning the internal writers improvement as a second step.

ajantha-bhat (Member Author):

@rdblue, @aokolnychyi, @RussellSpitzer: Can we please conclude on this?


jbonofre (Member):

@RussellSpitzer @aokolnychyi I'm reviewing the stale PRs, and this one has been open for months. Do we have a way to move forward? I can do a new review, but at the end of the day it won't help with the merge (as only committers can merge PRs).

deniskuzZ (Member):

Thanks @ajantha-bhat for your work on partition stats support in Iceberg! That could be reused in Hive as a building block for apache/hive#5498

@danielcweeks danielcweeks requested a review from rdblue December 4, 2024 16:39
jbonofre (Member):

@danielcweeks @RussellSpitzer @aokolnychyi would you have some time to take a look at this PR and my proposal (previous comment)?


regadas commented Dec 18, 2024

I just found this PR as I'm desperately looking for this functionality. Thanks @ajantha-bhat! Let's see if the review gets wrapped up soon 🤞

* @param branch A branch information to select the required snapshot.
* @return {@link PartitionStatisticsFile} for the given branch.
*/
public static PartitionStatisticsFile computeAndWriteStatsFile(Table table, String branch) {
deniskuzZ (Member), Jan 8, 2025:

Wouldn't it be better not to have fat methods with multiple responsibilities? What if we introduce a write method that takes the stats iterator as an argument?
We might not need to execute a complete stats rebuild for all the registered partitions, only for those changed in the current snapshot.
The snapshot summary already has a metric for the number of changed partitions; maybe we could extend it with the partition list and recompute stats only for them, then generate a new stats file based on the previous snapshot's stats with updates for the changed partitions.
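The incremental idea can be sketched as an overlay over the previous snapshot's stats (illustrative only; none of these names exist in Iceberg):

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of the suggested incremental flow: recompute stats
// only for partitions changed by the current snapshot, then overlay them
// on the previous snapshot's stats to form the new stats file contents.
class IncrementalStatsSketch {
  static Map<String, Long> merge(
      Map<String, Long> previousStats,       // partition path -> some metric
      Map<String, Long> recomputedChanged) { // only the changed partitions
    Map<String, Long> merged = new HashMap<>(previousStats);
    merged.putAll(recomputedChanged); // changed partitions replace old entries
    return merged;
  }
}
```

Removed partitions would additionally need a delete step; this sketch only covers updated and added partitions.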


deniskuzZ commented Jan 9, 2025

Hi @ajantha-bhat, could you please check the below:

It seems that date, time, and timestamp partition values are not properly serialized.
See Type.TypeID:

DATE(Integer.class),
TIME(Long.class),
TIMESTAMP(Long.class),

PartitionSpec.partitionToPath(PartitionStatsRecord.unwrap().partition()) throws an exception:

Caused by: java.lang.IllegalStateException: Not an instance of java.lang.Integer: 1999-12-31
	at org.apache.iceberg.data.GenericRecord.get(GenericRecord.java:123)
	at org.apache.iceberg.PartitionSpec.get(PartitionSpec.java:195)
	at org.apache.iceberg.PartitionSpec.partitionToPath(PartitionSpec.java:213)

I think instead of Record(1999-12-31) it should be Record{10956}.
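The expected value follows from the DATE representation: days since the Unix epoch, as an int. For example:

```java
import java.time.LocalDate;

public class EpochDayDemo {
  public static void main(String[] args) {
    // DATE partition values are stored as days since 1970-01-01,
    // so 1999-12-31 should round-trip as the integer 10956.
    long epochDay = LocalDate.of(1999, 12, 31).toEpochDay();
    System.out.println(epochDay); // 10956
  }
}
```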

Full code snippet:

Types.StructType partitionType = Partitioning.partitionType(table);
Schema schema = PartitionStatsHandler.schema(partitionType);

CloseableIterable<PartitionStatsRecord> partitionStatsRecords = PartitionStatsHandler.readPartitionStatsFile(
    schema, table.io().newInputFile(statsFile.path()));

try (Closeable toClose = partitionStatsRecords) {
  PartitionStats partitionStats = Iterables.tryFind(partitionStatsRecords, stats -> {
        PartitionSpec spec = table.specs().get(stats.unwrap().specId());
        return spec.partitionToPath(stats.unwrap().partition()).equals(partish.getPartition().getName());
      })
      .transform(PartitionStatsRecord::unwrap)
      .orNull();

  if (partitionStats != null) {
    Map<String, String> stats = ImmutableMap.of(
        TOTAL_DATA_FILES_PROP, String.valueOf(partitionStats.dataFileCount()),
        TOTAL_RECORDS_PROP, String.valueOf(partitionStats.dataRecordCount()),
        TOTAL_EQ_DELETES_PROP, String.valueOf(partitionStats.equalityDeleteRecordCount()),
        TOTAL_POS_DELETES_PROP, String.valueOf(partitionStats.positionDeleteRecordCount()),
        TOTAL_FILE_SIZE_PROP, String.valueOf(partitionStats.totalDataFileSizeInBytes())
    );
    return stats;
  }
}

ajantha-bhat (Member Author):

@deniskuzZ: Thanks for testing it out. We are working on internal Parquet/Avro/ORC readers and writers; partition stats will use them, so we won't need to go through these converters. I will retest all the data types once I use internal writers for partition stats.

PRs:
#11919
#11904

deniskuzZ (Member):

> @deniskuzZ: Thanks for testing it out. We are working on internal Parquet/Avro/ORC readers and writers; partition stats will use them, so we won't need to go through these converters. I will retest all the data types once I use internal writers for partition stats.
>
> PRs: #11919 #11904

JFYI, once I removed IdentityPartitionConverters.convertConstant from statsToRecords -> convertPartitionValues, the issue with the partitionToPath method is gone.
IDK, maybe you had some other idea of how to do partition filtering on the returned stats, but I used the following:

spec.partitionToPath(stats.unwrap().partition())
  .equals(partish.getPartition().getName())
