
Consider adding compression by default, or mention it in README #214

Closed
hnykda opened this issue Oct 14, 2024 · 1 comment

hnykda commented Oct 14, 2024

Hello,

I know there have been some discussions about related topics, but I would propose turning compression on by default, or at least educating users about it. It took me about three hours to find out that simply doing:

ALTER TABLE ltss SET (
  timescaledb.compress,
  timescaledb.compress_orderby = 'time DESC',
  timescaledb.compress_segmentby = 'entity_id'
);


SELECT add_compression_policy('ltss', INTERVAL '30 days');

SELECT
  *
FROM timescaledb_information.hypertables
WHERE hypertable_name = 'ltss';
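
-- To double-check that the segmentby/orderby settings took effect (a quick
-- sanity check; assuming the timescaledb_information.compression_settings
-- view is available in your TimescaleDB version):
SELECT *
FROM timescaledb_information.compression_settings
WHERE hypertable_name = 'ltss';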

-- compressing the existing chunks takes time proportional to the amount of data
-- (it took a couple of minutes for 30 GB of data)
-- it seems you don't need to run compress_chunk manually on old chunks,
-- as I had thought
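
-- If you do want the existing chunks compressed right away instead of waiting
-- for the policy to pick them up, something along these lines should work
-- (just a sketch, assuming the compress_chunk/show_chunks helpers accept these arguments):
SELECT compress_chunk(c, if_not_compressed => true)
FROM show_chunks('ltss', older_than => INTERVAL '30 days') AS c;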

-- this simple setup achieved an incredible compression of at least 98%, based on:

SELECT 
    chunk_schema,
    chunk_name,
    compression_status,
    pg_size_pretty(before_compression_total_bytes) as before_size,
    pg_size_pretty(after_compression_total_bytes) as after_size,
    CASE
        WHEN compression_status = 'Compressed' AND before_compression_total_bytes > 0 THEN
            ROUND(100 * (1 - after_compression_total_bytes::numeric / before_compression_total_bytes::numeric), 2)
        ELSE
            NULL
    END as compression_ratio_percent,
    before_compression_total_bytes,
    after_compression_total_bytes
FROM chunk_compression_stats('ltss')
ORDER BY chunk_name;

achieves a crazy compression ratio of something like 98% (is it even real? am I making some dumb error here?), bringing my 30 GB down to a fraction of that without any noticeable effects for my use case. It also doesn't significantly change the table in any way (e.g. no additional columns), nor does it create new tables, views, triggers...
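
For a quick sanity check of the overall numbers (rather than per chunk), something like the following should also work; a minimal sketch, assuming the hypertable_compression_stats function from TimescaleDB 2.x:

SELECT
    pg_size_pretty(before_compression_total_bytes) AS before_total,
    pg_size_pretty(after_compression_total_bytes)  AS after_total,
    ROUND(100 * (1 - after_compression_total_bytes::numeric
                     / before_compression_total_bytes::numeric), 2) AS saved_percent
FROM hypertable_compression_stats('ltss');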

I totally understand this is very use-case dependent, but I would argue that for many users something like this would be a better and more maintainable solution than no compression 🤷. I do not hold this view strongly though, and I can imagine that this tool is being used by pros who can do this anyway, and that for us noobs it's too dangerous.

I totally understand if you don't want to do that, but at least I am bringing this up in case there are more users like me.

freol35241 (Owner) commented

@hnykda Thanks for the detailed writeup. Yes, I am aware of the compression feature, and yes, you are perfectly correct that you can achieve very good compression ratios. This has been discussed at some length in #25.

The main downside of the compression feature is that it restricts which changes can be made to the table layout once it is compressed. Some changes require decompressing the entire table first... This makes it a bit tricky to handle. Also, obviously, there is a lack of time for working on this project.
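
For anyone curious, the manual workaround looks roughly like this (just a sketch, assuming the decompress_chunk / compress_chunk / show_chunks helpers; exact behaviour depends on the TimescaleDB version and the kind of schema change):

-- decompress everything first (slow, and needs the full uncompressed disk space again)
SELECT decompress_chunk(c, if_compressed => true) FROM show_chunks('ltss') AS c;

-- ... apply the schema change, e.g. ALTER TABLE ltss ... ;

-- then recompress
SELECT compress_chunk(c, if_not_compressed => true) FROM show_chunks('ltss') AS c;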

I will close this issue in favour of #25; feel free to continue the discussion there.
