python - How to correct a mis-encoded string?


I read MP3 metadata, and the ID3 tag comes back as Unicode, but the text is actually GBK-encoded. How can I correct this in Python?

  # assumed: EasyID3 is the class from the mutagen library
  from mutagen.easyid3 import EasyID3

  audio = EasyID3(name)
  title = audio["title"][0]
  print title
  print repr(title)

  µ±Äã¹Âµ¥Äã»áÏëÆðË
  u'\xb5\xb1\xc4\xe3\xb9\xc2\xb5\xa5\xc4\xe3\xbb\xe1\xcf\xeb\xc6\xf0\xcb\xad'

But in fact this should be GBK-encoded Chinese:

  当你孤单你会想起谁

It seems that the string was decoded to Unicode using the wrong encoding (Latin-1).
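To see how this kind of damage arises, the mis-decoding can be reproduced on purpose. The following is a minimal Python 3 sketch (the question itself uses Python 2); the variable names `original`, `raw`, and `misread` are made up for illustration and do not come from the question.

```python
# Reproduce the problem: GBK bytes read as if they were Latin-1.
original = "当你孤单你会想起谁"

raw = original.encode("gbk")      # the bytes actually stored in the tag
misread = raw.decode("latin-1")   # what a reader assuming Latin-1 produces

# Show the mis-decoded string with escapes, similar to repr() in the question.
print(ascii(misread))
```

Latin-1 assigns a character to every byte value from 0x00 to 0xFF, so this wrong decode never raises an error. It silently yields one accented or symbol character per byte.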

You need to encode it back to a byte string with that same wrong encoding, and then decode those bytes to Unicode with the right one.

  title = u'\xb5\xb1\xc4\xe3\xb9\xc2\xb5\xa5\xc4\xe3\xbb\xe1\xcf\xeb\xc6\xf0\xcb\xad'
  print title.encode('latin-1').decode('gbk')

  当你孤单你会想起谁
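If several tags need the same repair, the round trip could be wrapped in a small helper. This is only a sketch in Python 3 syntax, assuming the damage is always a Latin-1 decode of GBK bytes; `fix_mojibake` and its parameters are hypothetical names, not part of any library.

```python
def fix_mojibake(text, wrong="latin-1", right="gbk"):
    """Undo a wrong decode: re-encode with `wrong`, then decode with `right`."""
    try:
        return text.encode(wrong).decode(right)
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The text was not produced by this particular mis-decoding,
        # so return it unchanged instead of corrupting it further.
        return text


title = "\xb5\xb1\xc4\xe3\xb9\xc2\xb5\xa5\xc4\xe3\xbb\xe1\xcf\xeb\xc6\xf0\xcb\xad"
fixed = fix_mojibake(title)
print(fixed)
```

The fallback matters when some tags are already correct: a string that contains real Chinese characters cannot be encoded as Latin-1, so the helper returns it untouched.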
