Different behavior of column remove for `separate_*()` and `separate()` #1587

Edgar-Zamora · 2025-01-24T00:28:58Z

Issue

While updating code to move from separate() to separate_*(), I noticed that the final position of the column being separated differs between the two. This is not a breaking change, but the behavior is unexpected and could cause issues when columns are selected by position.
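To make the positional concern concrete, here is a minimal sketch. The two-row data frame is made up for illustration, and the expected results in the comments follow from the column orders shown in the reprex in this issue. The same positional select() returns a different column depending on which function was used:

```r
library(tidyr)
library(dplyr)

# Illustrative data only; any two character columns would do
df <- tibble(x = c("a_b", "c_d"), x2 = c("k_l", "m_n"))

old <- df |>
  separate(x2, into = c("xx1", "xx2"), sep = "_", remove = FALSE)

new <- df |>
  separate_wider_delim(x2, names = c("xx1", "xx2"), delim = "_", cols_remove = FALSE)

# Selecting the second column by position picks a different column in each case
names(select(old, 2))  # "x2": the original column stays where it was
names(select(new, 2))  # "xx1": the original column was moved after the new ones
```

Selecting by name, for example select(new, x2), is unaffected by the difference.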

Proposed Change

I think the easiest non-breaking change would be to add a note to the documentation stating that the position of the column being separated is not preserved.

Example

In separate(), the column remains in the same position when remove = FALSE.
In separate_*(), using cols_remove = FALSE places the original column after the last of the newly created columns.

library(tidyverse) # attaches tidyr and tibble, so no separate library(tidyr) call is needed

df <- tibble(x = c('a_b', 'c_d', "e_f", 'g_h', 'i_j'),
             x2 = c('k_l', 'm_n', "o_p", 'q_r', 'x_y'))

# using separate
df |> 
  separate(x, into = c('xx1', 'xx2'), sep = '_', remove = FALSE)
#> # A tibble: 5 × 4
#>   x     xx1   xx2   x2   
#>   <chr> <chr> <chr> <chr>
#> 1 a_b   a     b     k_l  
#> 2 c_d   c     d     m_n  
#> 3 e_f   e     f     o_p  
#> 4 g_h   g     h     q_r  
#> 5 i_j   i     j     x_y


df |> 
  separate(x2, into = c('xx1', 'xx2'), sep = '_', remove = FALSE)
#> # A tibble: 5 × 4
#>   x     x2    xx1   xx2  
#>   <chr> <chr> <chr> <chr>
#> 1 a_b   k_l   k     l    
#> 2 c_d   m_n   m     n    
#> 3 e_f   o_p   o     p    
#> 4 g_h   q_r   q     r    
#> 5 i_j   x_y   x     y


# delim
df |> 
  separate_wider_delim(x2, names = c('xx1', 'xx2'), delim = '_', cols_remove = FALSE)
#> # A tibble: 5 × 4
#>   x     xx1   xx2   x2   
#>   <chr> <chr> <chr> <chr>
#> 1 a_b   k     l     k_l  
#> 2 c_d   m     n     m_n  
#> 3 e_f   o     p     o_p  
#> 4 g_h   q     r     q_r  
#> 5 i_j   x     y     x_y

# position
df |> 
  separate_wider_position(x2, widths = c(xx1 = 1, 1, xx2 = 1), cols_remove = FALSE)
#> # A tibble: 5 × 4
#>   x     xx1   xx2   x2   
#>   <chr> <chr> <chr> <chr>
#> 1 a_b   k     l     k_l  
#> 2 c_d   m     n     m_n  
#> 3 e_f   o     p     o_p  
#> 4 g_h   q     r     q_r  
#> 5 i_j   x     y     x_y

# regex
df |> 
  separate_wider_regex(x2, patterns = c(xx1 = ".", "_", xx2 = "."), cols_remove = FALSE)
#> # A tibble: 5 × 4
#>   x     xx1   xx2   x2   
#>   <chr> <chr> <chr> <chr>
#> 1 a_b   k     l     k_l  
#> 2 c_d   m     n     m_n  
#> 3 e_f   o     p     o_p  
#> 4 g_h   q     r     q_r  
#> 5 i_j   x     y     x_y

Created on 2025-01-23 with reprex v2.1.0
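Until the behavior is documented, one possible workaround is to move the original column back explicitly with dplyr::relocate(). This is only a sketch and not something the tidyr documentation prescribes. The data frame is illustrative, and it assumes you know the name of the first new column (xx1 here):

```r
library(tidyr)
library(dplyr)

# Illustrative data only
df <- tibble(x = c("a_b", "c_d"), x2 = c("k_l", "m_n"))

res <- df |>
  separate_wider_delim(x2, names = c("xx1", "xx2"), delim = "_", cols_remove = FALSE) |>
  # Put the original column back in front of the first new column,
  # which reproduces the layout that separate(remove = FALSE) gives
  relocate(x2, .before = xx1)

names(res)
#> [1] "x"   "x2"  "xx1" "xx2"
```

The same relocate() call should work after separate_wider_position() and separate_wider_regex(), since the reprex shows all three place the original column in the same spot.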
