
python - UnicodeDecodeError on import of a .pyd file

Question:

I've started to slowly dabble with the Python/C API and after much fiddling and finagling, I was able to build a spam.pyd file.

However, I must be missing something with this process and was hoping that someone could point me in the right direction. I thought that once spam.pyd was created, I could call it from Python via import spam. Is this true?

When I try this, I get the following trace:

Traceback (most recent call last):
  File "<pyshell#25>", line 1, in <module>
    import spam
UnicodeDecodeError: 'utf8' codec can't decode byte 0x89 in position 1: unexpected code byte

Any ideas as to what I am doing wrong? I am working with Python 3.1.2 on Windows XP. I compiled spam.c via the mingw32 compiler.

Thanks for reading this!

EDIT:

Well, it looks like the problem was that I had written the C code in an editor that saved the file with ANSI encoding. Strangely, if I retyped the code in Notepad, and saved the file with UTF8 encoding, I would get compile time errors complaining about invalid characters. When I used the built-in IDLE editor, everything worked fine. I was just following the example from the Python tutorial here.

Is this a usual problem to have?

Here is all the code that was used if it helps any:

#include <Python.h>

static PyObject *spam_system(PyObject *self, PyObject *args)
{
    const char *command;
    int sts;

    if (!PyArg_ParseTuple(args, "s", &command))
        return NULL;
    sts = system(command);
    return Py_BuildValue("i", sts);
}

static PyMethodDef SpamMethods[] = {
    {"system", spam_system, METH_VARARGS,
     "Execute a shell command."},
    {NULL, NULL, 0, NULL}
};

static struct PyModuleDef spammodule = {
    PyModuleDef_HEAD_INIT,
    "spam",
    NULL,
    -1,
    SpamMethods
};

PyMODINIT_FUNC
PyInit_spam(void)
{
    return PyModule_Create(&spammodule);
}

Answer:

You say: Well, it looks like the problem was that I had written the C code in an editor that saved the file with ANSI encoding.

This is exceedingly unlikely. There are no non-ASCII characters visible in your published C source. If there were any, you would have got an error message from the C compiler (except maybe if it was in a string constant; I've never tried that).

You say: Strangely, if I retyped the code in Notepad, and saved the file with UTF8 encoding, I would get compile time errors complaining about invalid characters.

Not strangely. Notepad prepends a UTF-8 BOM. This means your C compiler was being presented with a source file which started with 3 bytes of junk. Don't use Notepad. Use a proper text editor.
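To make the BOM point concrete, here is a minimal Python sketch of what those 3 bytes look like and how they could be stripped before handing the file to a compiler. The helper name strip_utf8_bom and the sample source line are made up for illustration; they are not from the question.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM, if one is present.

    Hypothetical helper, for illustration only.
    """
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Simulate what a BOM-writing editor saves: BOM bytes, then the real source.
source = b"#include <Python.h>\n"
with_bom = codecs.BOM_UTF8 + source

print(with_bom[:3])                        # the 3 BOM bytes: b'\xef\xbb\xbf'
print(strip_utf8_bom(with_bom) == source)  # True
```

When reading such a file from Python, opening it with encoding="utf-8-sig" should have the same effect, since that codec skips a leading BOM if there is one.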

The indications are that the problem is much more likely to be in your Python input. The default source-file encoding in Python 3 is UTF-8. Your file contains "byte 0x89", which is not a valid UTF-8 lead byte and which the Windows cp125X encodings map to U+2030 PER MILLE SIGN -- either you have this in a string constant or you've typed it by mistake for a % (PER CENT SIGN). However, it's difficult to guess how you got the traceback that you did. Getting into an interpreter (e.g. IDLE) and typing import spam should NOT give you that traceback.
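You can check the claim about byte 0x89 yourself from any Python 3 prompt. This is only a sketch of the decoding behaviour; it uses cp1252 as one representative of the cp125X family and does not reproduce the import failure from the question.

```python
import unicodedata

raw = b"\x89"

# 0x89 is a UTF-8 continuation byte, so it cannot start a character.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError as exc:
    utf8_ok = False
    print("UTF-8 decode failed:", exc.reason)

# The same byte is a valid character in Windows cp1252.
decoded = raw.decode("cp1252")
print(hex(ord(decoded)))          # 0x2030
print(unicodedata.name(decoded))  # PER MILLE SIGN
```

The exact wording of the error reason varies between Python versions (3.1 reported "unexpected code byte", as in the traceback above), so the sketch prints it rather than assuming a particular message.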
