Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cannot read and write to open streams on the same file #222

Open
IMSoP opened this issue Feb 25, 2020 · 10 comments
Open

Cannot read and write to open streams on the same file #222

IMSoP opened this issue Feb 25, 2020 · 10 comments
Labels

Comments

@IMSoP
Copy link

IMSoP commented Feb 25, 2020

Firstly, thanks to the creators and maintainers of this project, it's a really useful tool.

I had a scenario where I wanted to mock a streamed input (in the real application, it will open php://input), so tried to open the same file twice, once for writing (in the test) and once for reading (in the tested code).

Any data written before the first read shows up fine on the read handle, but subsequent writes are not reflected. Rewinding the read handle works as expected, i.e. the whole contents of the file are visible on subsequent reads.

I suspect this is related to #129; testing before and after the patch in #220 shows that previously even the first read came back empty.

Simple reproduction script:

require 'vendor/autoload.php';
$root = org\bovigo\vfs\vfsStream::setup();
$file = org\bovigo\vfs\vfsStream::newFile('test')->at($root);
$w = fopen($file->url(), 'w');
$r = fopen($file->url(), 'r');

fwrite($w, "hello\n");
echo fgets($r); # echoes "hello" as expected

fwrite($w, "hello again\n");
echo fgets($r); # does not echo anything :(

rewind($r);
echo fgets($r); # echoes "hello"
echo fgets($r); # echoes "hello again"

For comparison, if I change the fopen calls to reference a real file, the "hello again" gets echoed as I expected.

@IMSoP
Copy link
Author

IMSoP commented Feb 25, 2020

Possibly relevant is that calling feof($r) after the first read returns true, whereas with a real file it returns false until after an additional call to fgets:

$file = vfsStream::newFile('test')->at($root);
$w = fopen($file->url(), 'w');
$r = fopen($file->url(), 'r');
fwrite($w, "hello\n");
echo fgets($r);
# 'hello'
echo feof($r) ? "EOF\n" : "Not EOF\n"; 
# 'EOF'

vs

$w = fopen('/tmp/something', 'w');
$r = fopen('/tmp/something', 'r');
fwrite($w, "hello\n");
echo fgets($r);
# 'hello'
echo feof($r) ? "EOF\n" : "Not EOF\n"; 
# 'Not EOF'
echo fgets($r);
# empty
echo feof($r) ? "EOF\n" : "Not EOF\n"; 
# 'EOF'
# subsequent calls to fgets($r) will return empty, even if more data has been written

@jaapio
Copy link
Contributor

jaapio commented Feb 25, 2020

We are currently working on a fix in #220 this functionality has landed in our master branch. Since a large rename is currently in progress it might not be very convenient to use the version in master right now.

Sorry I missed something in your report. It looks like the implementation is not fully correct to mimic the real files.

@IMSoP
Copy link
Author

IMSoP commented Feb 25, 2020

I believe the test cases in #223 reproduce my issue and describe the behaviour that would match native files.

@bizurkur
Copy link
Contributor

👍 Those tests are great. Hopefully #221 can get finished soon, as it restructures a lot of things that would likely cause a lot of conflicts if more than a couple lines need to be changed.

@mikey179
Copy link
Member

mikey179 commented Feb 26, 2020

I just played around with that, thanks for providing the test. It seems like the fault isn't in vfsStream. vfsStreamWrapper::stream_read() always receives a value of 8192, independent of what is passed as second argument to fread(). If I check what's written into the file with the first file handle, the contents are there completely. It's just that vfsStream receives incorrect information on how much bytes it should read on the first fread().

To be more precise: even if fread() is called twice, vfsStream receives only one call to vfsStreamWrapper::stream_read(). You can see that when using fgets() instead - the last byte of contentA is chopped off before returned to contentBeforeWrite and becomes the first byte of contentAfterWrite. So I guess PHP does some buffering here, which it doesn't do with real files.

@mikey179
Copy link
Member

mikey179 commented Feb 26, 2020

There's a workaround, though: you can change the default value vfsStreamWrapper::stream_read() receives with stream_set_chunk_size(). If you set it to stream_set_chunk_size($fp2, strlen($contentA)) the second read will return exactly what is expected. However, now each call to vfsStreamWrapper::stream_read() with that resource receives the value set with stream_set_chunk_size(). That's... inconvenient, I guess.

@mikey179
Copy link
Member

Unfortunately, stream_set_read_buffer() can't be used. It always returns -1 if applied to a vfs resource, meaning PHP says it can't honor this request.

@IMSoP
Copy link
Author

IMSoP commented Feb 26, 2020

Ah, I think I see what's happening then.

fgets is doing the following under the hood:

  1. On the first call, ask the stream wrapper for up to 8192 bytes of data, and put it in a buffer
  2. Asks the stream if it's EOF
  3. Search the buffer for the first linebreak, and return to the user
  4. If no linebreak is found, and not EOF, go to step 1, appending to the buffer
  5. On subsequent calls, start at step 3, growing the buffer as necessary

This process is failing with vfsStream for two reasons:

  • SeekableFileContent::read always assumes that the offset has progressed by $offset bytes, even if not enough bytes were available in the file.
  • SeekableFileContent::eof returns true as soon as the pointer reaches the end of the stream, even though the stream could grow with subsequent writes.

So in my example, PHP asks for "up to 8192 bytes", and gets the full file content. The stream should return the contents of the file, and remain open with offset=size, eof=false until the next read.

IMSoP added a commit to IMSoP/vfsStream that referenced this issue Feb 26, 2020
…read

This doesn't correctly handle manual seeks, which should somehow reset
the EOF pointer.

There are also lots of failing tests which are asserting behaviour which
doesn't match the native file handler.
IMSoP added a commit to IMSoP/vfsStream that referenced this issue Feb 26, 2020
…read

This doesn't correctly handle manual seeks, which should somehow reset
the EOF pointer.

There are also lots of failing tests which are asserting behaviour which
doesn't match the native file handler.
IMSoP added a commit to IMSoP/vfsStream that referenced this issue Feb 26, 2020
The EOF flag on native file streams is only set after reading *past*
the end of the buffer, never on open or seek.

I believe all the failing tests here are asserting behaviour that
doesn't match the native file handler.
IMSoP added a commit to IMSoP/vfsStream that referenced this issue Feb 26, 2020
Testing with normal files show that fread() does not progress the
pointer beyond the end of the file, and a read *beyond* the file size is
required before feof() returns true.
@IMSoP
Copy link
Author

IMSoP commented Feb 26, 2020

Based on the above, I have raised a PR which tracks offset and EOF separately: https://github.com/bovigo/vfsStream/pull/224/files

The possibly non-obvious part is that fseek always resets feof to false, even if given an offset beyond the current file size. Since OpenedFile calls seek internally, I had to add a special case to stop that over-writing the EOF flag. That will go away if the seek implementation is moved onto OpenedFile as in #221

IMSoP added a commit to IMSoP/vfsStream that referenced this issue Feb 26, 2020
The EOF flag on native file streams is only set after reading *past*
the end of the buffer, never on open or seek.

I believe all the failing tests here are asserting behaviour that
doesn't match the native file handler.
IMSoP added a commit to IMSoP/vfsStream that referenced this issue Feb 26, 2020
The EOF flag on native file streams is only set after reading *past*
the end of the buffer, never on open or seek.

I believe all the failing tests here are asserting behaviour that
doesn't match the native file handler.
@mikey179
Copy link
Member

Great find! Although I'm a bit surprised that it took more than 12 years to find that my initial eof implementation is flawed. 😁

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants