Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issues when formatting large file #88

Open
scamille opened this issue Oct 27, 2023 · 2 comments · May be fixed by #94
Open

Issues when formatting large file #88

scamille opened this issue Oct 27, 2023 · 2 comments · May be fixed by #94

Comments

@scamille
Copy link

scamille commented Oct 27, 2023

I recently ran into a issue when formatting a large C++ file. The error message that always came up was error: unable to read file content from object database: coming from here

After investigating the issues further, I stumbled upon the python documentation for subprocess Popen.wait (https://docs.python.org/3/library/subprocess.html#subprocess.Popen.wait) which states that wait() will deadlock under certain circumstances when using pipes, which I assume happendes here.

Changing the implementation of the format_object function to the following resolved the issue for me:

# Run formatter on a git blob identified by its hash. Writes output to a new git
# blob, and returns the hash of the new blob.
def format_object(formatter, object_hash, file_path, verbose=False):
    get_content = subprocess.Popen(
            ['git', 'cat-file', '-p', object_hash],
            stdout=subprocess.PIPE
            )
    command = re.sub(file_path_placeholder, file_path, formatter)
    if verbose:
        info(command)
    format_content = subprocess.Popen(
            command,
            shell=True,
            stdin=get_content.stdout,
            stdout=subprocess.PIPE
            )
    write_object = subprocess.Popen(
            ['git', 'hash-object', '-w', '--stdin'],
            stdin=format_content.stdout,
            stdout=subprocess.PIPE
            )

    get_content.communicate()
    format_content.communicate()
    
    new_hash, err = write_object.communicate()

    if write_object.returncode != 0:
        raise Exception('unable to write formatted content to object database')

    return new_hash.decode('utf-8').rstrip()

I am not 100% sure if that solution is correct. It is mostly based on https://stackoverflow.com/a/34166541

@mgavin
Copy link

mgavin commented May 24, 2024

I don't know why the fix was reverted in this pull, but the code here solved my similar issue.

@hallettj
Copy link
Owner

I'm sorry, it wasn't reverted but it hasn't been merged either. That's because the fix causes tests to fail. I was working on a variation of the fix in #94 that does not break any existing functionality, but I ran into a dead end because I wasn't able to reproduce the large file problem reliably on my system.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants