curl-and-python

Re: Uploading files with UTF-8 names

From: Andre Polykanine <andre_at_oire.org>
Date: Tue, 11 Jun 2013 02:45:33 +0300

Hello Dima,

I've made the simplest file:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import codecs

filename = u"d:\\playlists\\collection\\main\\Mànran - Oran na cloiche.mp3"
f = codecs.open(filename, 'rU', 'utf-8')
print f

And that *does* open a file with a unicode filename.
However, when I try to pass a unicode filename to CURL (either as a
local or as a remote filename), it says "Type error: value must
be string" (not unicode, presumably).
So is there perhaps a way to feed it URL-encoded data so that it would be
100% ASCII?
Thank you, and bear with me: this has been bugging me for a long time.
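
For the remote name only, one possible way to get a pure-ASCII string is to
percent-encode the UTF-8 bytes of the file name. The sketch below is an
illustration, not something confirmed in this thread: the helper name
ascii_remote_name is hypothetical, and it assumes the server decodes
percent-escapes as UTF-8. It does nothing for the local path, which still has
to be a byte string the operating system can resolve.

```python
# -*- coding: utf-8 -*-
try:
    from urllib.parse import quote  # Python 3
except ImportError:
    from urllib import quote  # Python 2


def ascii_remote_name(name):
    """Percent-encode the UTF-8 bytes of a unicode file name.

    Hypothetical helper: the result contains only ASCII characters,
    so it can be placed in a URL passed as a plain string.
    """
    return quote(name.encode("utf-8"), safe="")


remote = ascii_remote_name(u"M\xe0nran - Oran na cloiche.mp3")
print(remote)  # prints M%C3%A0nran%20-%20Oran%20na%20cloiche.mp3
```

The encoded name would then be appended to the upload URL as an ordinary
string; whether the file ends up with the intended name depends on the server.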

-- 
With best regards from Ukraine,
Andre
Skype: Francophile
My blog: http://oire.org/menelion (mostly in Russian)
Twitter: http://twitter.com/m_elensule
Facebook: http://facebook.com/menelion
------------ Original message ------------
From: Dima Tisnek <dimaqq_at_gmail.com>
To: curl with python
Date created: , 8:07:38 PM
Subject: Uploading files with UTF-8 names
Seems like your problem is with Python's handling of filename encoding
and/or file name encoding in the filesystem.
In short, nothing to do with pycurl (yet).
Unicode is hard when the file system allows invalid byte sequences; that's
properly solved only in Python 3:
http://www.python.org/dev/peps/pep-0383/
I don't know if that is your problem, or something else is.
This mailing list is not really the right place to ask.
To get to the pycurl issue (if there is one), you have to get your code to
a state where you can do:
open(unicode_filename, "rb")  # in Python, and
fopen(encoded_filename, "rb");  // in C
I bet when you get to that stage, your pycurl code will work too.
d.
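
A minimal sketch of the precondition described above: before a byte-string
path can be handed to C code, the unicode name has to be representable in the
chosen encoding without loss. The helper name encoded_path_for_c is
hypothetical, and using sys.getfilesystemencoding() as the default is an
assumption about which encoding the C side expects. The sketch only checks
that the name survives encoding; it does not check that the file exists.

```python
# -*- coding: utf-8 -*-
import sys


def encoded_path_for_c(unicode_path, encoding=None):
    """Return unicode_path as bytes, or None if `encoding` cannot hold it.

    Hypothetical helper: a None result means no byte string in this
    encoding names the file, so a C-level fopen() could not find it.
    """
    encoding = encoding or sys.getfilesystemencoding()
    try:
        encoded = unicode_path.encode(encoding)
    except UnicodeEncodeError:
        return None
    # Some codecs substitute characters instead of failing; reject those too.
    if encoded.decode(encoding) != unicode_path:
        return None
    return encoded


name = u"M\xe0nran - Oran na cloiche.mp3"
as_cp1251 = encoded_path_for_c(name, "cp1251")  # None: no such character in cp1251
as_utf8 = encoded_path_for_c(name, "utf-8")     # UTF-8 bytes of the name
```

This matches the errors quoted below: cp1251 has no mapping for u'\xe0', and
the UTF-8 bytes encode cleanly but are evidently not the bytes that os.stat()
resolves on that Windows system.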
On 10 June 2013 18:21, Andre Polykanine <andre_at_oire.org> wrote:
> Hello Zdenek,
>
> local_filename = filename.encode("utf8")
> Result:
>     logging.debug("Local filename is %s: %s" % (local_filename, os.stat(local_filename)))
> WindowsError: [Error 2] The system cannot find the file specified: 'D:\\playlists\\Collection\\Main\\M\xc3\xa0nran - Oran na Cloiche.mp3'
>
> local_filename = filename.encode("cp1251")
> UnicodeEncodeError: 'charmap' codec can't encode character u'\xe0' in position 30: character maps to <undefined>
>
>
> --
> With best regards from Ukraine,
> Andre
> Skype: Francophile
> My blog: http://oire.org/menelion (mostly in Russian)
> Twitter: http://twitter.com/m_elensule
> Facebook: http://facebook.com/menelion
>
> ------------ Original message ------------
> From: Zdenek Pavlas <zpavlas_at_redhat.com>
> To: curl with python
> Date created: , 2:50:45 PM
> Subject: Uploading files with UTF-8 names
>
>
>> os.stat() here doesn't work either, it says "System cannot find file
>> specified", because it turns unicode chars into \xe0, \xa9 etc. in the
>> path.
>
> just use the encoded string, not unicode.
>
>>> I'd try os.stat() for both encodings in turn, then send to pycurl
>                      ^^^^^^^^^^^^^^^^^^
>
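> for e in 'cp1251', 'utf8':
>     try:
>         enc = filename.encode(e)  # may raise UnicodeEncodeError
>         os.stat(enc)              # raises OSError if the path is not found
>         break
>     except (UnicodeError, OSError):
>         continue
> else:
>     raise ...
> print 'using encoding', e ...
> curl(enc)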
> _______________________________________________
> http://cool.haxx.se/cgi-bin/mailman/listinfo/curl-and-python
Received on 2013-06-11