Can't pass a unicode string as an argument (sys.argv)

I am passing the name of an existing file as a parameter to my Python 3 script, but it is not passed correctly when it contains characters outside the system’s codeset (cp1251 in my case).
Here is the code I used to test:

    #!/usr/bin/env python3
    import locale
    import os
    import sys

    # Report the interpreter location and the encodings Python is using.
    print('Python location: '             + sys.exec_prefix)
    print('sys.getdefaultencoding: '      + sys.getdefaultencoding())
    print('sys.getfilesystemencoding: '   + sys.getfilesystemencoding())
    print('locale.getpreferredencoding: ' + locale.getpreferredencoding())
    print('sys.stdin.encoding: '          + str(sys.stdin.encoding))
    print('sys.argv[1] type: '            + str(type(sys.argv[1])))

    # Check whether the filename received as the first argument exists.
    if os.path.exists(sys.argv[1]):
        print('File found.\n')
    else:
        print('File not found.\n')
    print('sys.argv[1] value: ' + sys.argv[1])

Initially I had results like this:

    Python location: c:\program files (x86)\python
    sys.getdefaultencoding: utf-8
    sys.getfilesystemencoding: mbcs
    locale.getpreferredencoding: cp1251
    sys.stdin.encoding: cp1251
    sys.argv[1] type: <class 'str'>
    File not found.

When I passed a filename containing only Latin characters (or Cyrillic ones, i.e. those covered by my cp1251), everything worked fine and the file could be located. But when it contained Latin characters with diacritics (â, è, etc.), they were converted to similar characters from basic Latin, and obviously a file with that name couldn’t be found.

Then I added `PYTHONIOENCODING=utf-8` to the system environment variables and confirmed that sys.stdin.encoding was now reported as utf-8. With these settings, my script started working correctly with Unicode input arguments, but only when I executed it from the command line! When I launched it from Komodo IDE (entering the argument in the Debugging Options window), the Unicode characters were lost. This problem seems to be specific to Komodo IDE, as it occurs neither when launching from the command line nor from another IDE (PyCharm). Should I change something in the settings?
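To illustrate why characters such as â and è cannot make it through cp1251, here is a minimal sketch. It assumes the loss happens wherever the argument gets squeezed into the system code page; the helper names `survives_codepage` and `lossy_view` are made up for this example. Note that Python's codec substitutes `?` for unencodable characters, whereas what I observed was substitution with similar-looking basic Latin letters, so this only approximates the effect.

```python
def survives_codepage(text, codepage="cp1251"):
    """Return True if every character of text can be encoded in codepage."""
    try:
        text.encode(codepage)
    except UnicodeEncodeError:
        return False
    return True


def lossy_view(text, codepage="cp1251"):
    """Return text after a lossy round trip through codepage."""
    # Characters missing from the code page are replaced with '?'.
    return text.encode(codepage, errors="replace").decode(codepage)


if __name__ == "__main__":
    # ascii() keeps the output printable on any console encoding.
    for name in ("report.txt", "отчёт.txt", "câfè.txt"):
        print(ascii(name), survives_codepage(name), ascii(lossy_view(name)))
```

Only the last name fails the check, which matches the pattern described above: plain Latin and Cyrillic names are fine, names with Latin diacritics are not.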

Hi, is the PYTHONIOENCODING environment variable set in Komodo? (Edit > Preferences > Environment)

yep, it is set to utf-8

Could you share the command line you use to invoke your script (the one that you said works), and then take a screenshot of, or describe, what you are entering in the fields of Komodo’s Debugging Options window?

sure, see attached screenshots.


Thanks for the additional information. I think this may be a bug, but I don’t have too much time to look into it right now. I’ve reproduced this in one form and logged a bug on our bug tracker: https://github.com/Komodo/KomodoEdit/issues/1352. Feel free to comment on it with corrections or more information.

Thanks!

Hi, just a follow-up: this is a limitation of Python 2.x’s subprocess support, which cannot handle Unicode in filenames on Windows very well. (Komodo uses Python 2.x internally, so we cannot take advantage of Python 3’s fix right now.) I’ve written more details in the GitHub ticket linked in my previous post if you’re curious. Sorry for the inconvenience.
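In the meantime, one possible workaround (not something Komodo provides, just a sketch) is to keep the argument pure ASCII on its way through the launcher and restore it inside the script. The example below uses percent-encoding from the standard library; `encode_arg` and `decode_arg` are hypothetical helper names, and it assumes you are able to type or generate the encoded form for the Debugging Options field.

```python
from urllib.parse import quote, unquote


def encode_arg(path):
    """Percent-encode path as UTF-8 so that the result is pure ASCII."""
    # safe="" also escapes '/', so every reserved character is encoded.
    return quote(path, safe="")


def decode_arg(arg):
    """Reverse encode_arg; arguments without escapes pass through unchanged."""
    return unquote(arg)


if __name__ == "__main__":
    original = "C:\\data\\câfè.txt"
    encoded = encode_arg(original)
    print(encoded)                      # ASCII-only form to hand to the launcher
    print(decode_arg(encoded) == original)
```

The script would then call `decode_arg(sys.argv[1])` before touching the filesystem. The obvious drawback is that a filename which legitimately contains a `%` escape sequence would be altered, so this only fits cases where you control how the argument is produced.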