Releases: tidyverse/vroom
vroom 1.5.2
- vroom() now supports inputs where the number of unnamed column types is less than the number of columns (#296)
- vroom() now outputs the correct column names even in the presence of skipped columns (#293, tidyverse/readr#1215)
- vroom_fwf(n_max=) now works as intended when the input is a connection.
- vroom() and vroom_write() now automatically detect the compression format regardless of the file extension for bzip2, xzip, gzip and zip files (#348)
- vroom() and vroom_write() now automatically support many more archive formats thanks to the archive package. These include new support for writing zip files, reading and writing 7zip, tar and ISO files.
- vroom(num_threads = 1) will now not spawn any threads. This can be used as a workaround on systems without full thread support.
- Threads are now automatically disabled on non-macOS systems compiling against clang's libc++. Most non-macOS systems use the more common gcc libstdc++, so this should not affect most users.
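As an illustration of the single-threaded workaround mentioned above, here is a minimal sketch. The temporary file and its contents are made up for the example, and show_col_types is the argument introduced in 1.5.0.

```r
library(vroom)

# Write a small CSV to a temporary file (hypothetical example data).
path <- tempfile(fileext = ".csv")
writeLines(c("x,y", "1,a", "2,b"), path)

# num_threads = 1 asks vroom to read without spawning worker threads.
df <- vroom(path, delim = ",", num_threads = 1, show_col_types = FALSE)
print(df)
```

Setting the VROOM_THREADS environment variable is another way to control the default thread count, if changing each call is inconvenient.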
vroom 1.5.1
vroom v1.5.0
Major improvements
- New vroom(show_col_types=) argument to more simply control when column types are shown.
- vroom(), vroom_fwf() and vroom_lines() now support multi-byte encodings such as UTF-16 and UTF-32 by converting these files to UTF-8 under the hood (#138)
- vroom() now supports skipping comments and blank lines within data, not just at the start of the file (#294, #302)
- vroom() now uses the tzdb package when parsing date-times (@DavisVaughan, #273)
- vroom() now emits a warning of class vroom_parse_issue if there are non-fatal parsing issues.
- vroom() now emits a warning of class vroom_mismatched_column_name if the user supplies a column type that does not match the name of a read column (#317).
- The vroom package now uses the MIT license, as part of systematic relicensing throughout the r-lib and tidyverse packages (#323)
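A short sketch of two of the items above: skipping comment and blank lines that appear in the middle of the data, and silencing the column type message with show_col_types. The file contents are invented for the example, and the exact behavior may vary between vroom versions.

```r
library(vroom)

# Hypothetical input: a comment line and a blank line sit between data rows.
path <- tempfile(fileext = ".csv")
writeLines(
  c("x,y", "1,a", "# note in the middle of the data", "", "2,b"),
  path
)

# comment = "#" drops the comment line; empty rows are skipped by default.
# show_col_types = FALSE suppresses the column specification message.
df <- vroom(path, delim = ",", comment = "#", show_col_types = FALSE)
print(df)
```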
Minor improvements and fixes
- vroom() now correctly reads double values with comma as decimal separator (@kent37 #313)
- vroom() now correctly skips lines with only one quote if the format doesn't use quoting (tidyverse/readr#991)
- vroom() and vroom_lines() now handle files with mixed windows and POSIX line endings (tidyverse/readr#1210)
- vroom() now outputs a tibble with the expected number of columns and types based on col_types and col_names even if the file is empty (#297).
- vroom() no longer mis-indexes files read from connections with windows line endings when the two line ending characters fall on separate sides of the read buffer (#331)
- vroom() no longer crashes if n_max = 0 and col_names is a character vector (#316)
- vroom() now preserves the spec attribute when vroom and readr are both loaded (#303)
- vroom() now allows specifying column names in col_types that have been repaired (#311)
- vroom() no longer inadvertently calls .name_repair functions twice (#310).
- vroom() is now more robust to quoting issues when tracking the CSV state (#301)
- vroom() now registers the S3 class with methods::setOldClass() (r-dbi/DBI#345)
- col_datetime() now supports '%s' format, which represents decimal seconds since the Unix epoch.
- col_numeric() now supports grouping_mark and decimal_mark that are unicode characters, such as U+00A0 which is commonly used as the grouping mark for numbers in France (tidyverse/readr#796).
- vroom_fwf() gains a skip_empty_rows argument to skip empty lines (tidyverse/readr#1211)
- vroom_fwf() now respects n_max, as intended (#334)
- vroom_lines() gains a na argument.
- vroom_write_lines() no longer escapes or quotes lines.
- vroom_write_lines() now works as intended (#291).
- vroom_write(path=) has been deprecated, in favor of file, to match readr.
- vroom_write_lines() now exposes the num_threads argument.
- problems() now prints the correct row number of parse errors (#326)
- problems() now throws a more informative error if called on a readr object (#308).
- problems() now de-duplicates identical problems (#318)
- Fix an inadvertent performance regression when reading values (#309)
- n_max argument is correctly respected in edge cases (#306)
- factors with implicit levels now work when fields are quoted, as intended (#330)
- Guessing double types no longer unconditionally ignores leading whitespace. Now whitespace is only ignored when trim_ws is set.
vroom 1.4.0
Major changes and new functions
- vroom now tracks indexing and parsing errors like readr. The first time an issue is encountered a warning will be signaled. A tibble of all found problems can be retrieved with vroom::problems(). (#247)
- Data with newlines within quoted fields will now automatically revert to using a single thread and be properly read (#282)
- NUL values in character data are now permitted, with a warning.
- New vroom_write_lines() function to write a character vector to a file (#291)
- vroom_write() gains an eol= parameter to specify the end of line character(s) to use. Use vroom_write(eol = "\r\n") to write a file with Windows style newlines (#263).
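The eol= parameter can be sketched roughly as follows. The data frame and output path are hypothetical, and the byte counting at the end is only there to make the line endings visible.

```r
library(vroom)

# Hypothetical data and output location for the example.
out <- tempfile(fileext = ".csv")
df <- data.frame(x = 1:2, y = c("a", "b"))

# Write the file with Windows style (CRLF) line endings.
vroom_write(df, out, delim = ",", eol = "\r\n")

# Inspect the raw bytes: every line feed should be preceded by a carriage return.
bytes <- readBin(out, what = "raw", n = file.size(out))
n_cr <- sum(bytes == as.raw(13))  # count of "\r"
n_lf <- sum(bytes == as.raw(10))  # count of "\n"
cat("CR:", n_cr, "LF:", n_lf, "\n")
```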
Minor improvements and fixes
- Datetime formats used when guessing now match those used when parsing (#240)
- Quotes are now only valid next to newlines or delimiters (#224)
- vroom() now signals an R error for invalid date and datetime formats, instead of crashing the session (#220).
- vroom(comment = ) now accepts multi-character comments (#286)
- vroom_lines() now works with empty files (#285)
- Vectors are now subset properly when given invalid subscripts (#283)
- vroom_write() now works when the delimiter is empty, e.g. delim = "" (#287).
- vroom_write() now works with all ALTREP vectors, including string vectors (#270)
- An internal call to new.env() now correctly uses the parent argument (#281)
vroom 1.3.2
vroom 1.3.1
vroom v1.3.0
- The Rcpp dependency has been removed in favor of cpp11.
- vroom() now handles cases when id is set and a column is skipped (#237)
- vroom() now supports column selections when there are some empty column names (#238)
- vroom() argument n_max now works properly for files with windows newlines and no final newline (#244)
- Subsetting vectors now works with View() in RStudio if there are no rows to subset (#253).
- Subsetting datetime columns now works with NA indices (#236).
vroom 1.2.1
- vroom() now writes the column names if given an input with no rows (#213)
- vroom() columns now support indexing with NA values (#201)
- vroom() no longer truncates the last value in a file if the file contains windows newlines but no final newline (#219).
- vroom() now works when the na argument is encoded in non ASCII or UTF-8 locales and the file encoding is not the same as the native encoding (#233).
- vroom_fwf() now verifies that the positions are valid, namely that the begin value is always less than the previous end (#217).
- vroom_lines() gains a locale argument so you can control the encoding of the file (#218)
- vroom_write() now supports the append argument with R connections (#232)
vroom 1.2.0
Breaking changes
vroom_altrep_opts() and the argument vroom(altrep_opts =) have been renamed to vroom_altrep() and altrep respectively. The prior names have been deprecated.
New Features
- vroom() now supports reading Big Integer values with the bit64 package. Use col_big_integer() or the "I" shortcut to read a column as big integers. (#198)
- cols() gains a .delim argument and vroom() now uses it as the delimiter if it is provided (#192)
- vroom() now supports reading from stdin() directly, interpreted as the C-level standard input (#106).
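A minimal sketch of reading a big integer column, assuming the bit64 package is installed alongside vroom. The file and its values are invented; the first id is 2^53 + 1, which a double cannot represent exactly.

```r
library(vroom)

# Hypothetical input with an id too large to store exactly as a double.
path <- tempfile(fileext = ".csv")
writeLines(c("id,name", "9007199254740993,a", "2,b"), path)

# col_big_integer() reads the column as a 64-bit integer (class integer64).
df <- vroom(
  path,
  delim = ",",
  col_types = cols(id = col_big_integer(), name = col_character())
)
print(class(df$id))
print(as.character(df$id))
```

The compact form col_types = "Ic" should be equivalent here, using the "I" shortcut named in the notes above.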
Minor improvements and fixes
- col_date now parses single digit month and day (@edzer, #123, #170)
- fwf_empty() now uses the skip parameter, as intended.
- vroom() can now read single line files without a terminal newline (#173).
- vroom() can now select the id column if provided (#110).
- vroom() now correctly copies string data for factor levels (#184)
- vroom() no longer crashes when files have trailing fields, windows newlines and the file is not newline or null terminated.
- vroom() now includes a spec object with the col_types class, as intended.
- vroom() now better handles floating point values with very large exponents (#164).
- vroom() now uses better heuristics to guess the delimiter and now throws an error if a delimiter cannot be guessed (#126, #141, #167).
- vroom() now has an improved error message when a file does not exist (#169).
- vroom() now outputs its messages on stdout() rather than stderr(), which avoids the text being red in RStudio and in the Windows GUI.
- vroom() no longer overflows when reading files with more than 2B entries (@wlattner, #183).
- vroom_fwf() is now more robust if not all lines are the expected length (#78)
- vroom_fwf() and fwf_empty() now support passing Inf to guess_max().
- vroom_str() now works with S4 objects.
- vroom_fwf() now handles files with dos newlines properly.
- vroom_write() now does not try to write anything when given empty inputs (#172).
- Dates, times, and datetimes now properly consider the locale when parsing.
- Added benchmarks with wide data for both numeric and character data (#87, @R3myG)
- The delimiter used for parsing is now shown in the message output (#95 @R3myG)
vroom 1.0.2
New Features
- The column created by id is now stored as a run length encoded Altrep vector, which uses less memory and is much faster for large inputs. (#111)
Minor improvements and fixes
- vroom_lines() now properly respects the n_max parameter (#142)
- vroom() and vroom_lines() now support reading files which do not end in newlines by using a file connection (#40).
- vroom_write() now works with the standard output connection stdout() (#106).
- vroom_write() no longer crashes non-deterministically when used on Altrep vectors.
- The integer parser now returns NA values for invalid inputs (#135)
- Fix additional UBSAN issue in the mio project reported by CRAN (#97)
- Fix indexing into connections with quoted fields (#119)
- Move example files for vroom() out of \dontshow{}.
- Fix missing columns and windows newlines (#114)