
Serialize Lark.grammar (fixes issue #1472) #1506

Open · wants to merge 1 commit into master

Conversation

NasalDaemon

Fixes #1472


erezsh commented Jan 17, 2025

How much does this change add to the cache size? (in rough terms)

@NasalDaemon (Author)

> How much does this change add to the cache size? (in rough terms)

In my scenario, with a .lark grammar file of 4 KB:

cache without grammar: 63,994 bytes
cache with grammar: 115,970 bytes

an increase of ~52 KB.

As a quick and dirty performance test, I ran my script, which uses Lark to parse a file and produce a C++ file (via Jinja), multiple times under different regimes:

Without caching: 440 ms
With caching, but using Lark.grammar = load_grammar(...): 370 ms
With caching + serialized grammar: 320 ms

This shows a noticeable speed benefit even though the cache has become larger, despite the overhead of Jinja and the rest of my script.
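The trade-off measured above is the classic build-versus-deserialize pattern: rebuilding the grammar from the .lark source is slower than loading a previously serialized copy. A minimal stdlib sketch of that pattern, where the hypothetical `build_grammar` stands in for Lark's grammar analysis (it is not Lark's actual API):

```python
import os
import pickle
import tempfile
import time

def build_grammar(source: str) -> dict:
    # Hypothetical stand-in for Lark's grammar analysis: an expensive
    # build step whose result is picklable.
    time.sleep(0.05)  # simulate parsing/analyzing the .lark source
    return {"rules": source.split(), "version": 1}

def load_or_build(source: str, cache_path: str) -> dict:
    # Fast path: deserialize the previously built grammar from the cache.
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    # Slow path: rebuild from source, then populate the cache.
    grammar = build_grammar(source)
    with open(cache_path, "wb") as f:
        pickle.dump(grammar, f)
    return grammar

cache_file = os.path.join(tempfile.mkdtemp(), "grammar.cache")

t0 = time.perf_counter()
g1 = load_or_build("start: WORD", cache_file)  # cold run: builds and writes cache
cold = time.perf_counter() - t0

t0 = time.perf_counter()
g2 = load_or_build("start: WORD", cache_file)  # warm run: deserializes only
warm = time.perf_counter() - t0

assert g1 == g2
print(f"cold={cold:.3f}s warm={warm:.3f}s")
```

The warm run skips the expensive build entirely, which is the same reason the serialized-grammar cache wins in the timings above even though it does extra (de)serialization work.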


erezsh commented Jan 18, 2025

How does serializing the grammar make it run faster? It only adds operations, and doesn't remove any.

@NasalDaemon (Author)

> How does serializing the grammar make it run faster? It only adds operations, and doesn't remove any.

According to the above tests, deserialising the grammar from the cache is faster than reproducing it from the source .lark file each time.


erezsh commented Jan 19, 2025

I think you're mistaken.

@NasalDaemon (Author)

> I think you're mistaken.

If you would like to avoid the cost of de/serialising the grammar by default, I can make its serialisation opt-in instead, so that users must specify cacheWithGrammar=True.


erezsh commented Jan 21, 2025

I prefer cache_grammar=False.

Also, consider mentioning this option in the Reconstructor error message.
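The opt-in shape being discussed, where the grammar is only written into the cache when the flag is set so the default cache size is unchanged, can be sketched as follows. All names here (`CachedParser`, `save_cache`, `load_cache`) are illustrative, not Lark's actual internals:

```python
import io
import pickle
from typing import Optional

class CachedParser:
    def __init__(self, tables: dict, grammar: Optional[dict]):
        self.tables = tables
        self.grammar = grammar

    def save_cache(self, f, cache_grammar: bool = False) -> None:
        payload = {"tables": self.tables}
        if cache_grammar:
            payload["grammar"] = self.grammar  # opt-in: larger cache file
        pickle.dump(payload, f)

    @classmethod
    def load_cache(cls, f) -> "CachedParser":
        payload = pickle.load(f)
        # grammar is None when the cache was written without it; a caller
        # that needs it (e.g. a Reconstructor) must then reload the source.
        return cls(payload["tables"], payload.get("grammar"))

p = CachedParser({"lalr": []}, {"rules": ["start"]})

buf = io.BytesIO()
p.save_cache(buf, cache_grammar=True)
buf.seek(0)
assert CachedParser.load_cache(buf).grammar == {"rules": ["start"]}

buf = io.BytesIO()
p.save_cache(buf)  # default: grammar omitted, cache size unchanged
buf.seek(0)
assert CachedParser.load_cache(buf).grammar is None
```

Keeping the flag default-off matches the `cache_grammar=False` preference above: users who never touch the Reconstructor pay nothing extra.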

Successfully merging this pull request may close these issues.

cache option does not work with Reconstructor