TypeError: "delimiter" must be string, not unicode #36

Open
carlos-jenkins opened this issue Nov 17, 2014 · 5 comments

Comments

@carlos-jenkins

Python 2.7:

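from __future__ import unicode_literals
from StringIO import StringIO  # assumed: the Python 2 StringIO module
from unicodecsv import DictReader

# content and CSV_HEADER are not shown in the report.
# With unicode_literals, ';' and '"' are unicode objects, which triggers the TypeError.
reader = DictReader(
    StringIO(content),
    fieldnames=CSV_HEADER,
    delimiter=';',
    quotechar='"'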
)

@pymaldebaran

Same problem for me: the delimiter and quotechar keyword arguments of DictReader() and reader() have to be str, not unicode. This is kind of annoying when you want to work only with unicode...
Workaround:

delimiter=str(u';'), quotechar=str(u'"')

Here the u prefix is just for clarity; most of the time my 2.7 modules have from __future__ import unicode_literals to ensure Python 3 compatibility (and correct UTF-8 handling everywhere).
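
The str() workaround above can be applied to all dialect characters in one place. The following is a minimal sketch, not part of unicodecsv: the helper name native_dialect_kwargs and the DIALECT_CHAR_KEYS tuple are hypothetical, and the demo uses the standard csv module so it runs without unicodecsv installed. On Python 2, str() only works for ASCII characters; on Python 3 it leaves the values unchanged.

```python
import csv

# Hypothetical list of keyword arguments that must be native str objects.
DIALECT_CHAR_KEYS = ('delimiter', 'quotechar', 'escapechar', 'lineterminator')


def native_dialect_kwargs(**kwargs):
    """Return a copy of kwargs with dialect characters passed through str().

    On Python 2 this turns ASCII unicode objects into byte strings;
    on Python 3 the values are already str and come back unchanged.
    """
    converted = dict(kwargs)
    for key in DIALECT_CHAR_KEYS:
        if converted.get(key) is not None:
            converted[key] = str(converted[key])
    return converted


# The u prefix mimics what unicode_literals does on Python 2.
options = native_dialect_kwargs(delimiter=u';', quotechar=u'"')
rows = list(csv.reader(['a;"b;c"'], **options))
```

The same options dictionary could presumably be passed to unicodecsv's DictReader in place of the literal keyword arguments.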

@jruere

jruere commented Jun 11, 2015

Can't this be handled by the library? The encoding of the CSV is provided.

@jdunck
Owner

jdunck commented Jun 11, 2015

Yes, it can. I'll try to work this up soon.

@ryanhiebert
Collaborator

In addition to making sure it's a string, it also needs to make sure it's a one-byte string when it's done. The reader only supports a single-character str (bytes) for these. It's a reasonable limitation, but perhaps one that we should guard against when converting these characters.
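
One way such a guard might look is sketched below. This is an illustration under assumptions, not the library's actual fix: the function name single_byte_char and its parameters are hypothetical. It encodes the character with the CSV encoding, rejects anything that does not come out as exactly one byte, and returns the native str type for the running interpreter.

```python
import sys

PY2 = sys.version_info[0] == 2
text_type = type(u'')  # unicode on Python 2, str on Python 3


def single_byte_char(value, encoding='utf-8', name='delimiter'):
    """Coerce a dialect character to native str, requiring a one-byte encoding.

    Hypothetical helper: `name` is only used in the error message.
    """
    if isinstance(value, text_type):
        encoded = value.encode(encoding)
    else:
        encoded = bytes(value)
    if len(encoded) != 1:
        raise ValueError(
            '%s must encode to exactly one byte in %s, got %d bytes'
            % (name, encoding, len(encoded))
        )
    # Python 2's csv module wants bytes; Python 3's wants text.
    return encoded if PY2 else encoded.decode(encoding)
```

With this sketch, a section sign as delimiter is accepted for a latin-1 file (one byte) but rejected for a UTF-8 file (two bytes), which is the case the guard is meant to catch.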

@akaIDIOT

Hate to pull the old '+1', but I just ran into this while trying to make my lib work with both py2 and py3. Many things seem to be a lot easier than with the builtin csv module, but the delimiter (and lineterminator) arguments are required to be str. My current issue is with the writer, not the reader, but I'm assuming it is the same issue at its core.

As I was already using six, my current workaround looks like this:

from __future__ import unicode_literals
import csv
import six

if six.PY2:
    import unicodecsv as csv

def export(fname, …, delimiter='\t', …):
    if six.PY2:
        output = open(fname, 'wb')
        delimiter = delimiter.encode('utf-8')
    else:
        output = open(fname, 'w')
    …
    with output:
        writer = csv.DictWriter(output, …, delimiter=delimiter, …)
        …

Getting rid of the first bit would be awesome.
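
Until the library handles this, the version checks in the workaround above could be gathered into a small helper. This is only a sketch: open_csv_output and export_rows are hypothetical names, sys.version_info stands in for six.PY2, and the standard csv module stands in for unicodecsv so the example is self-contained. On Python 3 the file is opened with newline='' as the csv documentation recommends.

```python
import csv
import os
import sys
import tempfile

PY2 = sys.version_info[0] == 2


def open_csv_output(fname):
    """Open fname in the mode the csv writer expects on this interpreter."""
    if PY2:
        return open(fname, 'wb')
    return open(fname, 'w', newline='', encoding='utf-8')


def export_rows(fname, fieldnames, rows, delimiter=u'\t'):
    """Write rows (dicts) to fname; str() keeps the delimiter a native str."""
    with open_csv_output(fname) as output:
        writer = csv.DictWriter(output, fieldnames, delimiter=str(delimiter))
        writer.writeheader()
        writer.writerows(rows)


# Demo: write a tiny file into a temporary directory and read it back as bytes.
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.tsv')
export_rows(demo_path, ['name', 'value'], [{'name': 'a', 'value': '1'}])
with open(demo_path, 'rb') as handle:
    written = handle.read()
```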
