v5.2.0 #755

Merged
merged 37 commits into from
Jan 17, 2025
Commits
5c6f1c6
commands.manual:cmd_manual_convert - fix incorrect usage of `\b` to f…
MatteoCampinoti94 Jan 10, 2025
f5fb84c
commands - commit changes after every batch/file has been processed t…
MatteoCampinoti94 Jan 15, 2025
b4a1687
commands.info - add command to print general information about the da…
MatteoCampinoti94 Jan 15, 2025
e2e0d9d
commands.info:cmd_info - add initialization date (time of first log e…
MatteoCampinoti94 Jan 15, 2025
b54bb32
commands.init:cmd_init - clearer version numbers in log
MatteoCampinoti94 Jan 15, 2025
1508e46
commands.finalize - add group to perform final changes for delivery
MatteoCampinoti94 Jan 15, 2025
b281cd2
commands.finalize.doc_collections - add command to rearrange files in…
MatteoCampinoti94 Jan 15, 2025
9491ad1
changelog:5.2.0 - add new features
MatteoCampinoti94 Jan 15, 2025
95a7153
version - minor 5.1.0 > 5.2.0
MatteoCampinoti94 Jan 15, 2025
ca5a9a4
changelog:5.2.0 - update new features
MatteoCampinoti94 Jan 15, 2025
92695c0
readme - update
MatteoCampinoti94 Jan 15, 2025
4acd59e
tests.finalize:finalize_docs_collections - add test for "finalize doc…
MatteoCampinoti94 Jan 15, 2025
b418c95
commands.finalize.doc_collections - handle XSD templates of GML files
MatteoCampinoti94 Jan 15, 2025
c578fe7
commands.finalize.doc_collections:cmd_doc_collections - add UUID inde…
MatteoCampinoti94 Jan 16, 2025
fa30c96
poetry - update lock file
MatteoCampinoti94 Jan 16, 2025
60ff1d3
commands.identify:identify_original_file - remove duplicate select fo…
MatteoCampinoti94 Jan 16, 2025
ee49866
commands.identify:cmd_identify_original - add --ignore-lock option to…
MatteoCampinoti94 Jan 16, 2025
45642e9
readme - update
MatteoCampinoti94 Jan 16, 2025
52fb580
commands.identify:identify_original_file - add output when a file is …
MatteoCampinoti94 Jan 16, 2025
2a51b26
tests.reference_files.fileformats - update to version 4.1.7
MatteoCampinoti94 Jan 16, 2025
0c601ee
tests.avid - update data
MatteoCampinoti94 Jan 16, 2025
313a452
commands.identify:identify_original_file - add output when a file is …
MatteoCampinoti94 Jan 16, 2025
1fc3fce
commands.extract.extractors.msg:msg_attachment - catch ValueError exc…
MatteoCampinoti94 Jan 17, 2025
e51f8dd
commands.identify - improve operations in output log for new, updated…
MatteoCampinoti94 Jan 17, 2025
335580c
changelog:5.2.0 - add fixes
MatteoCampinoti94 Jan 17, 2025
84c5d3c
changelog:5.2.0 - add changes
MatteoCampinoti94 Jan 17, 2025
efd7bf7
poetry - use acacore 4.1.2
MatteoCampinoti94 Jan 17, 2025
6e3d2f6
tests.avid - upgrade to acacore 4.1.2
MatteoCampinoti94 Jan 17, 2025
e711ee7
commands ensure file object types in rollback commands
MatteoCampinoti94 Jan 17, 2025
8e4b3cd
commands.info - use Table.count methods
MatteoCampinoti94 Jan 17, 2025
42886ef
tests.edit:edit_original_action - better typing
MatteoCampinoti94 Jan 17, 2025
b2960ee
Revert "commands ensure file object types in rollback commands"
MatteoCampinoti94 Jan 17, 2025
ca8eefd
github.workflows.test - use poetry 2.0.1
MatteoCampinoti94 Jan 17, 2025
a57c960
github.workflows.test - use abatilo/actions-poetry@v4
MatteoCampinoti94 Jan 17, 2025
7ee09bd
commands - ensure file object types in rollback commands
MatteoCampinoti94 Jan 17, 2025
f7b6422
github.workflows.test - use Python 3.13.1
MatteoCampinoti94 Jan 17, 2025
f23b310
Revert "github.workflows.test - use Python 3.13.1"
MatteoCampinoti94 Jan 17, 2025
6 changes: 3 additions & 3 deletions .github/workflows/test.yml
@@ -12,7 +12,7 @@ on:

env:
PYTHON_VERSION: 3.11.9
- POETRY_VERSION: 1.8.3
+ POETRY_VERSION: 2.0.1

jobs:
linting:
@@ -23,7 +23,7 @@ jobs:
- uses: actions/setup-python@v4
with:
python-version: ${{ env.PYTHON_VERSION }}
- - uses: abatilo/actions-poetry@v2
+ - uses: abatilo/actions-poetry@v4
with:
poetry-version: ${{ env.POETRY_VERSION }}
- run: poetry install
@@ -106,7 +106,7 @@ jobs:
- uses: actions/setup-python@v4
with:
python-version: ${{ env.PYTHON_VERSION }}
- - uses: abatilo/actions-poetry@v2
+ - uses: abatilo/actions-poetry@v4
with:
poetry-version: ${{ env.POETRY_VERSION }}
- uses: actions/setup-go@v4
15 changes: 15 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,20 @@
# Changelog

## v5.2.0

### New Features

* `finalize doc-collections` command to rearrange files in Documents into docCollection directories.
* `info` command to print general information about the database.

### Changes

* Improve log output of `identify` commands

### Fixes

* Fix MSG extraction failing sometimes when attachments were incorrectly parsed as MSG files

## v5.1.0

### New Features
71 changes: 68 additions & 3 deletions README.md
@@ -34,11 +34,14 @@
* [manual](#digiarch-manual)
* [extract](#digiarch-manual-extract)
* [convert](#digiarch-manual-convert)
* [finalize](#digiarch-finalize)
* [doc-collections](#digiarch-finalize-doc-collections)
* [search](#digiarch-search)
* [original](#digiarch-search-original)
* [master](#digiarch-search-master)
* [access](#digiarch-search-access)
* [statutory](#digiarch-search-statutory)
* [info](#digiarch-info)
* [log](#digiarch-log)
* [upgrade](#digiarch-upgrade)
* [help](#digiarch-help)
@@ -61,7 +64,9 @@ Commands:
extract Unpack archives.
edit Edit the database.
manual Perform actions manually.
finalize Finalize for delivery.
search Search the database.
info Database information.
log Display the event log.
upgrade Upgrade the database.
help Show the help for a command.
@@ -137,6 +142,7 @@ Options:
[multiple]
--batch-size INTEGER RANGE Amount of files to identify at a time.
[default: 100; x>=1]
--ignore-lock Re-identify locked files.
--dry-run Show changes without committing them.
--help Show this message and exit.
```
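The `--ignore-lock` flag added to the identify options can be pictured as a filter over the files selected for identification. The following is a minimal sketch, assuming locked files are skipped by default; `FileRecord` and `files_to_identify` are hypothetical names used only for illustration and are not part of digiarch or acacore.

```python
from dataclasses import dataclass


@dataclass
class FileRecord:
    """Hypothetical stand-in for a file row in the database."""

    relative_path: str
    lock: bool = False


def files_to_identify(files: list[FileRecord], ignore_lock: bool = False) -> list[FileRecord]:
    """Return the files that should be (re-)identified.

    Locked files are skipped unless ignore_lock is set (assumed behaviour).
    """
    return [f for f in files if ignore_lock or not f.lock]


files = [FileRecord("a.txt"), FileRecord("b.msg", lock=True)]
default_selection = [f.relative_path for f in files_to_identify(files)]
forced_selection = [f.relative_path for f in files_to_identify(files, ignore_lock=True)]
```

With the flag unset only `a.txt` is selected; with it set, the locked `b.msg` is included as well.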
@@ -853,9 +859,10 @@ Usage: digiarch manual convert [OPTIONS] ORIGINAL {master|access|statutory}

Manually add converted files with ORIGINAL UUID as their parent.

- \b Depending on the TARGET, a different type of ORIGINAL file will be
- needed: * "master": original file parent * "access": master file parent *
- "statutory": master file parent
+ Depending on the TARGET, a different type of ORIGINAL file will be needed:
+ * "master": original file parent
+ * "access": master file parent
+ * "statutory": master file parent

The given FILEs must be located inside the MasterDocuments, AccessDocuments,
or Documents folder depending on the TARGET.
@@ -871,6 +878,53 @@ Options:
--help Show this message and exit.
```

### digiarch finalize

```
Usage: digiarch finalize [OPTIONS] COMMAND [ARGS]...

Perform the necessary operations to ready the AVID directory for delivery.

The changes should be performed in the following order:
* doc-collections
* doc-index (TBA)
* av-db (TBA)

Options:
--help Show this message and exit.

Commands:
doc-collections Create docCollections.
```

#### digiarch finalize doc-collections

```
Usage: digiarch finalize doc-collections [OPTIONS]

Rearrange files in Documents using docCollections.

If the process is interrupted, all changes are rolled back, but the newly
named files can be recovered using the --resume option when the command is
run next. The option should only ever be used if NO other changes have
occurred to the files or the database. The default behaviour is to remove any
leftover files and start the process anew.

To change the number of documents in each docCollection directory, use the
--docs-in-collection option.

To see the changes without committing them, use the --dry-run option.

Options:
--docs-in-collection INTEGER RANGE
The maximum number of documents to put in
each docCollection. [default: 10000; x>=1]
--resume / --no-resume Resume a previously interrupted
rearrangement.
--dry-run Show changes without committing them.
--help Show this message and exit.
```
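The effect of `--docs-in-collection` can be illustrated with a small planning function that only computes target paths, much like a dry run. This is a rough sketch under stated assumptions, not the command's actual implementation: the `docCollection<N>` numbering, the sort order, and the function name `plan_doc_collections` are all hypothetical.

```python
from pathlib import Path


def plan_doc_collections(doc_ids: list[int], docs_in_collection: int = 10000) -> dict[int, Path]:
    """Map each document ID to a target directory, filling collections in order.

    Nothing is moved on disk; the returned mapping is only a plan.
    The directory naming scheme is an assumption made for this sketch.
    """
    if docs_in_collection < 1:
        raise ValueError("docs_in_collection must be >= 1")

    plan: dict[int, Path] = {}
    for index, doc_id in enumerate(sorted(doc_ids)):
        # Collections are numbered from 1; each holds at most docs_in_collection documents.
        collection = index // docs_in_collection + 1
        plan[doc_id] = Path("Documents") / f"docCollection{collection}" / str(doc_id)
    return plan


plan = plan_doc_collections([1, 2, 3, 4, 5], docs_in_collection=2)
```

Five documents with a limit of two per collection end up spread over three collections, the last one holding a single document.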

### digiarch search

```
@@ -1012,6 +1066,17 @@ Options:
--help Show this message and exit.
```

### digiarch info

```
Usage: digiarch info [OPTIONS]

Display information about the database.

Options:
--help Show this message and exit.
```

### digiarch log

```
2 changes: 1 addition & 1 deletion digiarch/__version__.py
@@ -1 +1 @@
- __version__ = "5.1.0"
+ __version__ = "5.2.0"
4 changes: 4 additions & 0 deletions digiarch/cli.py
@@ -7,8 +7,10 @@
from .commands.completions import cmd_completions
from .commands.edit.edit import grp_edit
from .commands.extract.extract import cmd_extract
from .commands.finalize.finalize import grp_finalize
from .commands.help import cmd_help
from .commands.identify import grp_identify
from .commands.info import cmd_info
from .commands.init import cmd_init
from .commands.log import cmd_log
from .commands.manual import grp_manual
@@ -30,7 +32,9 @@ def app():
app.add_command(cmd_extract, cmd_extract.name)
app.add_command(grp_edit, grp_edit.name)
app.add_command(grp_manual, grp_manual.name)
app.add_command(grp_finalize, grp_finalize.name)
app.add_command(grp_search, grp_search.name)
app.add_command(cmd_info, cmd_info.name)
app.add_command(cmd_log, cmd_log.name)
app.add_command(cmd_upgrade, cmd_upgrade.name)
app.add_command(cmd_help, cmd_help.name)
12 changes: 12 additions & 0 deletions digiarch/commands/edit/action.py
@@ -223,6 +223,8 @@ def cmd_action_original_convert(
set_action(ctx, database, file, "convert", data, reason, dry_run, log_stdout)
if lock:
set_lock(ctx, database, file, reason, dry_run, log_stdout)
if not dry_run:
database.commit()

end_program(ctx, database, exception, dry_run, log_file, log_stdout)

@@ -271,6 +273,8 @@ def cmd_action_original_extract(
set_action(ctx, database, file, "extract", data, reason, dry_run, log_stdout)
if lock:
set_lock(ctx, database, file, reason, dry_run, log_stdout)
if not dry_run:
database.commit()

end_program(ctx, database, exception, dry_run, log_file, log_stdout)

@@ -327,6 +331,8 @@ def cmd_action_original_manual(
set_action(ctx, database, file, "manual", data, reason, dry_run, log_stdout)
if lock:
set_lock(ctx, database, file, reason, dry_run, log_stdout)
if not dry_run:
database.commit()

end_program(ctx, database, exception, dry_run, log_file, log_stdout)

@@ -392,6 +398,8 @@ def cmd_action_original_ignore(
set_action(ctx, database, file, "ignore", data, reason, dry_run, log_stdout)
if lock:
set_lock(ctx, database, file, reason, dry_run, log_stdout)
if not dry_run:
database.commit()

end_program(ctx, database, exception, dry_run, log_file, log_stdout)

@@ -467,6 +475,8 @@ def cmd_action_original_copy(
set_action(ctx, database, file, action, data, reason, dry_run, log_stdout)
if lock:
set_lock(ctx, database, file, reason, dry_run, log_stdout)
if not dry_run:
database.commit()

end_program(ctx, database, exception, dry_run, log_file, log_stdout)

@@ -520,6 +530,8 @@ def cmd_action_master_convert(
with ExceptionManager(BaseException) as exception:
for file in query_table(database.master_files, query, [("lower(relative_path)", "asc")]):
set_master_convert(ctx, database, file, data, action_type, reason, dry_run)
if not dry_run:
database.commit()

end_program(ctx, database, exception, dry_run, log_file, log_stdout)
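The `if not dry_run: database.commit()` lines added throughout this file follow one pattern: persist each file's change as soon as it is processed, and never commit during a dry run. Below is a minimal sketch of that pattern using the standard `sqlite3` module. The `files` table, the `set_actions` function, and the explicit rollback at the end of a dry run are assumptions made for illustration; the real commands go through acacore's `FilesDB` and `end_program`.

```python
import sqlite3


def set_actions(conn: sqlite3.Connection, uuids: list[str], action: str, dry_run: bool = False) -> int:
    """Update the action of each file, committing after every file unless dry_run."""
    changed = 0
    for uuid in uuids:
        conn.execute("UPDATE files SET action = ? WHERE uuid = ?", (action, uuid))
        changed += 1
        if not dry_run:
            # Commit per file so an interruption loses at most the current file.
            conn.commit()
    if dry_run:
        # Assumption for this sketch: a dry run discards its pending changes.
        conn.rollback()
    return changed


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (uuid TEXT PRIMARY KEY, action TEXT)")
conn.executemany("INSERT INTO files VALUES (?, ?)", [("a", None), ("b", None)])
conn.commit()

set_actions(conn, ["a", "b"], "convert", dry_run=True)
after_dry_run = conn.execute("SELECT action FROM files ORDER BY uuid").fetchall()

set_actions(conn, ["a", "b"], "convert")
after_commit = conn.execute("SELECT action FROM files ORDER BY uuid").fetchall()
```

The dry run leaves both rows untouched, while the normal run persists the new action for each file.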

4 changes: 4 additions & 0 deletions digiarch/commands/edit/common.py
@@ -51,6 +51,7 @@ def edit_file_value(
setattr(file, "lock", True)
table.update(file)
database.log.insert(event)
database.commit()
event.log(INFO, *loggers, show_args=["uuid", "data"], path=file.relative_path)


@@ -70,8 +71,11 @@ def _handler(_ctx: Context, _avid: AVID, database: FilesDB, event: Event, file:
table = database.statutory_files
else:
return
if not isinstance(file, table.model):
raise TypeError(f"{type(file)} is not {table.model.__name__}")
prev_value, next_value = event.data
setattr(file, property_name, prev_value)
table.update(file)
database.commit()

return _handler
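The `isinstance` check added to this rollback handler guards against restoring a value on a file object of the wrong type. A simplified sketch of the same guard follows. The classes here are hypothetical stand-ins, not acacore's models, and `rollback_rename` is an illustrative name.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class BaseFile:
    """Hypothetical stand-in for a generic file model."""

    relative_path: str


@dataclass
class OriginalFile(BaseFile):
    """Hypothetical stand-in for an original file model."""


@dataclass
class MasterFile(BaseFile):
    """Hypothetical stand-in for a master file model."""


def rollback_rename(file: BaseFile | None, previous_path: str) -> OriginalFile:
    """Restore the previous path, but only for files of the expected type."""
    if file is None:
        raise FileNotFoundError("No file to roll back")
    if not isinstance(file, OriginalFile):
        # Fail early instead of writing the wrong model to the wrong table.
        raise TypeError(f"{type(file).__name__} is not OriginalFile")
    file.relative_path = previous_path
    return file


restored = rollback_rename(OriginalFile("new.txt"), "old.txt")
```

Passing a `MasterFile` to this function raises `TypeError` before any attribute is modified.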
3 changes: 3 additions & 0 deletions digiarch/commands/edit/remove.py
@@ -141,6 +141,8 @@ def remove_files(
if reset_processed:
reset_parent_processed(database, file)

database.commit()


def rollback_remove_original(_ctx: Context, avid: AVID, database: FilesDB, event: Event, file: BaseFile | None):
old_file = OriginalFile.model_validate(event.data)
@@ -153,6 +155,7 @@ def rollback_remove_original(_ctx: Context, avid: AVID, database: FilesDB, event
raise FileNotFoundError(old_file.relative_path)

database.original_files.insert(old_file)
database.commit()


@rollback("remove", rollback_remove_original)
4 changes: 4 additions & 0 deletions digiarch/commands/edit/rename.py
@@ -6,6 +6,7 @@
from acacore.database import FilesDB
from acacore.models.event import Event
from acacore.models.file import BaseFile
from acacore.models.file import OriginalFile
from acacore.utils.click import end_program
from acacore.utils.click import param_callback_regex
from acacore.utils.click import start_program
@@ -31,6 +32,8 @@
def rollback_rename_original(_ctx: Context, avid: AVID, database: FilesDB, event: Event, file: BaseFile | None):
if not file:
raise FileNotFoundError(f"No file with UUID {event.file_uuid}")
if not isinstance(file, OriginalFile):
raise TypeError(f"{type(file)} is not OriginalFile")

file.root = avid.path
current_path: Path = file.relative_path
@@ -134,5 +137,6 @@ def cmd_rename_original(

event.log(INFO, log_stdout, show_args=["uuid", "data"])
database.log.insert(event)
database.commit()

end_program(ctx, database, exception, dry_run, log_file, log_stdout)
4 changes: 4 additions & 0 deletions digiarch/commands/extract/extract.py
@@ -124,6 +124,8 @@ def rollback_extract_remove_child(ctx: Context, avid: AVID, database: FilesDB, f
def rollback_extract(ctx: Context, avid: AVID, database: FilesDB, _event: Event, file: BaseFile | None):
if not file:
return
if not isinstance(file, OriginalFile):
raise TypeError(f"{type(file)} is not OriginalFile")

for child in database.original_files.select("parent = ?", [str(file.uuid)]).fetchall():
rollback_extract_remove_child(ctx, avid, database, child)
@@ -305,5 +307,7 @@ def cmd_extract(
archive_file.action_data.ignore = IgnoreAction(template="extracted-archive")

db.original_files.update(archive_file)
if not dry_run:
db.commit()

end_program(ctx, db, exception, dry_run, log_file, log_stdout)
2 changes: 1 addition & 1 deletion digiarch/commands/extract/extractors/extractor_msg.py
@@ -78,7 +78,7 @@ def msg_attachment(attachment: AttachmentBase) -> Message | bool | None:
attachment_msg = openMsg(attachment.data, delayAttachments=True)
else:
raise TypeError(f"Unsupported attachment data type {type(attachment.data)}")
- except (ExMsgBaseException, FileNotFoundError):
+ except (ExMsgBaseException, FileNotFoundError, ValueError):
return None

return attachment_msg if isinstance(attachment_msg, (Message, MessageSigned)) else False
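The change above widens the `except` clause so that an attachment which cannot be parsed as a message is treated as "not a message" instead of aborting extraction. A self-contained sketch of that behaviour follows. `ParseError` and `parse_message` are hypothetical stand-ins for the parsing library and do not reflect its real API.

```python
from __future__ import annotations


class ParseError(Exception):
    """Hypothetical stand-in for a parsing library's base exception."""


def parse_message(data: bytes) -> dict:
    """Hypothetical parser that fails in different ways on bad input."""
    if not data:
        raise ParseError("empty attachment")
    if not data.startswith(b"MSG"):
        # Some malformed inputs surface as a plain ValueError.
        raise ValueError("not a message file")
    return {"body": data[3:].decode()}


def try_parse_attachment(data: bytes) -> dict | None:
    """Return the parsed message, or None when the attachment is not a message."""
    try:
        return parse_message(data)
    except (ParseError, FileNotFoundError, ValueError):
        # Without ValueError in this tuple, the second failure mode would propagate.
        return None


parsed = try_parse_attachment(b"MSGhello")
not_a_message = try_parse_attachment(b"PDF-1.7")
empty = try_parse_attachment(b"")
```

Returning `None` for every known failure mode lets the caller fall back to handling the attachment as an ordinary file.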