Implement store/load methods for eprint resources on CanonicalStorage #19

erickpeirson · 2019-06-27T16:20:22Z

The arxiv.canonical.services.store.CanonicalStorage class has two private methods for loading/storing the state of an e-print from/to S3.

arxiv-canonical/arxiv/canonical/services/store.py

Lines 167 to 189 in 983f394

    def _store_eprint(self, eprint: EPrint) -> None:
        """
        Store a :class:`.EPrint`.

        If the :attr:`.EPrint.source_package` or :attr:`.EPrint.pdf` content
        has changed, those should also be stored.

        Should complain loudly if ``self.read_only`` is ``True``.
        """
        raise NotImplementedError('Implement me!')

    def _load_eprint(self, identifier: Identifier, version: int) \
            -> EPrint:
        """
        Load an :class:`.EPrint`.

        The content of the :attr:`.EPrint.source_package` and
        :attr:`.EPrint.pdf` should implement :class:`.Readable`. The ``read()``
        method should be a closure that, when called, retrieves the content of
        the corresponding resource from storage.
        """
        raise NotImplementedError('Implement me!')

The schema for eprint metadata stored on S3 can be found here.

_load_eprint(self, identifier: Identifier, version: int) -> EPrint:

This should load metadata (a JSON document) from S3 using the path/key described here. That metadata should be used to instantiate an arxiv.canonical.domain.EPrint object.

When instantiating the EPrint object, the source_package and pdf members should be objects with a read() -> bytes method. That method should lazily retrieve and read the contents of the corresponding resource on S3 (again, at the path/key described in the README).
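A minimal sketch of the lazy-loading idea, under loud assumptions: `DictBucket` is a hypothetical in-memory stand-in for S3, `EPrintSketch` is a reduced stand-in for the real `EPrint` domain class, and `make_key` uses an invented key layout (the real one is whatever the README defines). The point is only that loading fetches the metadata document up front and defers the content reads until `read()` is called.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


class LazyReadable:
    """Fetches content only when ``read()`` is called."""

    def __init__(self, loader: Callable[[], bytes]) -> None:
        self._loader = loader

    def read(self) -> bytes:
        return self._loader()


@dataclass
class EPrintSketch:
    """Hypothetical, reduced stand-in for arxiv.canonical.domain.EPrint."""

    identifier: str
    version: int
    title: str
    source_package: Optional[LazyReadable] = None
    pdf: Optional[LazyReadable] = None


class DictBucket:
    """Hypothetical in-memory stand-in for an S3 bucket; records each read."""

    def __init__(self, objects: Dict[str, bytes]) -> None:
        self.objects = dict(objects)
        self.reads: List[str] = []

    def get(self, key: str) -> bytes:
        self.reads.append(key)
        return self.objects[key]


def make_key(identifier: str, version: int, suffix: str) -> str:
    """Invented key layout, for illustration only."""
    return f"e-prints/{identifier}v{version}/{suffix}"


def load_eprint(bucket: DictBucket, identifier: str, version: int) -> EPrintSketch:
    # The metadata document is fetched eagerly.
    raw = bucket.get(make_key(identifier, version, "metadata.json"))
    meta = json.loads(raw)

    def reader(suffix: str) -> LazyReadable:
        key = make_key(identifier, version, suffix)
        # The closure captures ``key``; nothing is fetched until read().
        return LazyReadable(lambda: bucket.get(key))

    return EPrintSketch(
        identifier=identifier,
        version=version,
        title=meta["title"],
        source_package=reader("source.tar.gz"),
        pdf=reader("content.pdf"),
    )
```

In a real implementation the loader passed to `LazyReadable` would call the S3 client instead of `DictBucket.get`; the closure structure would stay the same.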

_store_eprint(self, eprint: EPrint) -> None:

This should do everything in reverse: serialize the EPrint's metadata to a JSON document and write it to S3 at the same path/key. Note that source_package and pdf may be None. If they are not None, then their contents should also be stored to S3 at the correct paths.
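The store direction could be sketched roughly as follows. Everything here is an assumption made for illustration: a plain `dict` stands in for the S3 bucket, the `prefix` argument and file names are invented rather than taken from the README, and `store_eprint` is a free function instead of the private method. It shows the three behaviours the issue asks for: refuse loudly when read-only, always write the metadata, and write content only when it is not `None`.

```python
import io
import json
from typing import Dict, Optional, Protocol


class Readable(Protocol):
    """Anything with a ``read() -> bytes`` method."""

    def read(self) -> bytes: ...


def store_eprint(
    objects: Dict[str, bytes],
    prefix: str,
    metadata: dict,
    source_package: Optional[Readable],
    pdf: Optional[Readable],
    read_only: bool = False,
) -> None:
    """Write metadata and any available content under ``prefix``."""
    if read_only:
        # "Complain loudly" rather than silently skipping the write.
        raise RuntimeError("Cannot store e-print: storage is read-only")

    # Metadata is always written, as a JSON document.
    objects[f"{prefix}/metadata.json"] = json.dumps(
        metadata, sort_keys=True
    ).encode("utf-8")

    # Content is optional; skip whichever member is None.
    if source_package is not None:
        objects[f"{prefix}/source.tar.gz"] = source_package.read()
    if pdf is not None:
        objects[f"{prefix}/content.pdf"] = pdf.read()
```

This sketch writes content unconditionally whenever it is present. The docstring asks for content to be stored only if it has changed, which would need some extra comparison (for example, against a stored checksum) that is not shown here.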


erickpeirson added the arxiv.canonical label (Work related to the arxiv.canonical package) Jun 27, 2019

erickpeirson mentioned this issue Oct 30, 2019

ARXIVNG-1495 Data architecture for the canonical record #25

Merged
