vroom v1.5.0

@jimhester released this 15 Jun 17:54

Major improvements

  • New vroom(show_col_types=) argument to more simply control when column types are shown.

  • vroom(), vroom_fwf() and vroom_lines() now support multi-byte encodings such as UTF-16 and UTF-32 by converting these files to UTF-8 under the hood (#138)

  • vroom() now supports skipping comments and blank lines within data, not just at the start of the file (#294, #302)

  • vroom() now uses the tzdb package when parsing date-times (@DavisVaughan, #273)

  • vroom() now emits a warning of class vroom_parse_issue if there are non-fatal parsing issues.

  • vroom() now emits a warning of class vroom_mismatched_column_name if the user supplies a column type whose name does not match any column read from the file (#317).

  • The vroom package now uses the MIT license, as part of systematic relicensing throughout the r-lib and tidyverse packages (#323)
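As a minimal sketch of three of the changes listed above (show_col_types, comments and blank lines within data, and the classed parse warning), the following uses small temporary files. The file contents and the variable names path, bad, df and caught are hypothetical, chosen only for illustration.

```r
library(vroom)

# Hypothetical sample file: a comment line and a blank line sit between data rows.
path <- tempfile(fileext = ".csv")
writeLines(c("x,y", "1,a", "# a note inside the data", "", "2,b"), path)

# show_col_types = FALSE suppresses the column specification message.
df <- vroom(path, delim = ",", comment = "#", show_col_types = FALSE)

# Hypothetical file with a value that cannot be parsed as an integer.
bad <- tempfile(fileext = ".csv")
writeLines(c("x,y", "1,a", "oops,b"), bad)

# Catch the classed warning. altrep = FALSE asks for eager parsing, so the
# issue should surface during the read rather than on first access.
caught <- tryCatch(
  {
    vroom(bad, delim = ",", col_types = "ic", altrep = FALSE, show_col_types = FALSE)
    NULL
  },
  vroom_parse_issue = function(w) w
)
```

Handling the warning by class rather than by message text means calling code does not depend on the exact wording of the message.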

Minor improvements and fixes

  • vroom() now correctly reads double values with a comma as the decimal separator (@kent37, #313)

  • vroom() now correctly skips lines with only one quote if the format doesn't use quoting (tidyverse/readr#991)

  • vroom() and vroom_lines() now handle files with mixed Windows and POSIX line endings (tidyverse/readr#1210)

  • vroom() now outputs a tibble with the expected number of columns and types based on col_types and col_names even if the file is empty (#297).

  • vroom() no longer mis-indexes files read from connections with Windows line endings when the two line-ending characters fall on separate sides of the read buffer (#331)

  • vroom() no longer crashes if n_max = 0 and col_names is a character vector (#316)

  • vroom() now preserves the spec attribute when vroom and readr are both loaded (#303)

  • vroom() now allows specifying column names in col_types that have been repaired (#311)

  • vroom() no longer inadvertently calls .name_repair functions twice (#310).

  • vroom() is now more robust to quoting issues when tracking the CSV state (#301)

  • vroom() now registers the S3 class with methods::setOldClass() (r-dbi/DBI#345)

  • col_datetime() now supports '%s' format, which represents decimal seconds since the Unix epoch.

  • col_numeric() now supports grouping_mark and decimal_mark values that are Unicode characters, such as U+00A0, which is commonly used as the grouping mark for numbers in France (tidyverse/readr#796).

  • vroom_fwf() gains a skip_empty_rows argument to skip empty lines (tidyverse/readr#1211)

  • vroom_fwf() now respects n_max, as intended (#334)

  • vroom_lines() gains a na argument.

  • vroom_write_lines() no longer escapes or quotes lines.

  • vroom_write_lines() now works as intended (#291).

  • vroom_write(path=) has been deprecated, in favor of file, to match readr.

  • vroom_write_lines() now exposes the num_threads argument.

  • problems() now prints the correct row number of parse errors (#326)

  • problems() now throws a more informative error if called on a readr object (#308).

  • problems() now de-duplicates identical problems (#318)

  • Fix an inadvertent performance regression when reading values (#309)

  • The n_max argument is now correctly respected in edge cases (#306)

  • Factors with implicit levels now work when fields are quoted, as intended (#330)

  • Guessing double types no longer unconditionally ignores leading whitespace. Now whitespace is only ignored when trim_ws is set.
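Two of the fixes above can be sketched briefly: reading doubles that use a comma as the decimal separator, and writing with the file argument of vroom_write() instead of the deprecated path. The sample data and the names path, prices and out are hypothetical.

```r
library(vroom)

# Hypothetical sample: semicolon-delimited values with a comma as the decimal mark.
path <- tempfile(fileext = ".csv")
writeLines(c("id;price", "1;3,14", "2;2,50"), path)

# The locale tells the parser that "," is the decimal mark.
prices <- vroom(
  path,
  delim = ";",
  locale = locale(decimal_mark = ","),
  col_types = cols(id = col_integer(), price = col_double()),
  show_col_types = FALSE
)

# Write the result back out using `file` rather than the deprecated `path`.
out <- tempfile(fileext = ".tsv")
vroom_write(prices, file = out)
```

Passing explicit col_types here avoids relying on type guessing, which keeps the example focused on the decimal mark handling.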