Bugfix: Grouped store operations by partition key #7

petero-dk · 2019-04-15T19:31:09Z

Batch operations will fail when they are on different partitions. When using custom filters this is more of an issue.

This bugfix will simply separate the batch operations into different batches based on the partition key.

dei79 · 2019-06-07T06:54:08Z

This is a wonderful extension but I'm worried about the performance with huge amount of data sets, meaning multiple 10K. Because I see a parameter which enables this operation, what do you think?

petero-dk · 2019-06-07T08:11:32Z

I am not sure what you mean by "parameter that enables this operation" either the caller of the library should do this manually or the library should do this. Any batch operation that spans multiple partitions will fail with an error.

I can see that the Linq groupby clause could be a pain point for performance for very large datasets, I could probably take a stab at changing that out for a manual grouping. Do you think the performance increase would be worth it compared to the readability of the code?

dei79 · 2019-06-07T14:04:51Z

You are right in the opinion either the caller or the library should do this. I would like to prefer to have two APIs one which is not doing this grouping automatically and which runs into an error but which is optimized for huge datasets and another one more as comfort function which is doing this stuff so the caller can decide. It could be also encoded in the storage operation flag or so.

petero-dk · 2019-06-07T14:46:33Z

I see your point now, let me think for a second

petero-dk · 2019-06-14T22:48:54Z

I will have an update on this early next week. I did a little testing and refactoring and found one error in my PR and then I got a factor 7 increase in speed on the HugeDemoTest with the latest code. I just need to clean it up

petero-dk · 2019-06-15T09:03:13Z

Please note that I have NOT removed the timer code yet. I just wanted to let you see the progress. The testing has shown that the LINQ groupby clause has zero effect on performance. However the foreach loop that actually makes the connections to tablestorage can be parallelized and that gives a huge performance boost

My test case with 20.000 rows (up from 2000)

Executing Demo Cases (WW Cloud) with Parallel connections

CoreHelpers.WindowsAzure.Storage.Table.Demo.DemoCases.UC10CreateHugeAmountOfDemoEntries
RunTime 00:00:05.67
RunTime 00:00:02.11

Executed Write Operations:
        mergeOrInserOperation: 200
        delete: 200
Executed Query Operations:
        Query: 20



Executing Demo Cases (WW Cloud) without Parallel connections

CoreHelpers.WindowsAzure.Storage.Table.Demo.DemoCases.UC10CreateHugeAmountOfDemoEntries
RunTime 00:01:03.29
RunTime 00:00:57.95

Executed Write Operations:
        mergeOrInserOperation: 200
        delete: 200
Executed Query Operations:
        Query: 20

…o lists)

Grouped store operations by partition key

d0767e0

Added parallel connections in StoreAsync

a4af89a

petero-dk added 2 commits June 15, 2019 11:52

missing space for auto merge

b8a4f1d

Small bug fixes and tiny performance and logical fixes (when to add t…

c5c9508

…o lists)

petero-dk mentioned this pull request Dec 24, 2019

Feature: Handle nullable datatypes from store, and enums #11

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bugfix: Grouped store operations by partition key #7

Bugfix: Grouped store operations by partition key #7

petero-dk commented Apr 15, 2019

dei79 commented Jun 7, 2019

petero-dk commented Jun 7, 2019

dei79 commented Jun 7, 2019

petero-dk commented Jun 7, 2019

petero-dk commented Jun 14, 2019

petero-dk commented Jun 15, 2019

Bugfix: Grouped store operations by partition key #7

Are you sure you want to change the base?

Bugfix: Grouped store operations by partition key #7

Conversation

petero-dk commented Apr 15, 2019

dei79 commented Jun 7, 2019

petero-dk commented Jun 7, 2019

dei79 commented Jun 7, 2019

petero-dk commented Jun 7, 2019

petero-dk commented Jun 14, 2019

petero-dk commented Jun 15, 2019