Does anyone know what the original Chinese encoding is here, and how to decode it?
The string returned in the JSON:
æµç•…轮廓ã€ç²¾è‡´ç»†èŠ‚和自由律动,糅åˆå‘ˆçŽ°#TODS2023秋冬 男士系列。
#TodsTheItalianPortrait
创æ„总监:Walter Chiapponi
The actual content:
流畅轮廓、精致细节和自由律动,糅合呈现#TODS2023 秋冬 男士系列。
#TodsTheItalianPortrait
创意总监:Walter Chiapponi
I asked several AIs, and nearly all of them suggested decoding it like this:
const iconv = require('iconv-lite');
const garbledText = "æµç•…轮廓ã€ç²¾è‡´ç»†èŠ‚和自由律动,糅åˆå‘ˆçŽ°#TODS2023秋冬 男士系列。\n#TodsTheItalianPortrait\n\n创æ„总监:Walter Chiapponi";
const buf = Buffer.from(garbledText, 'binary');
// const decodedText = iconv.decode(buf, 'windows-1252');
// const decodedText = iconv.decode(buf, 'latin1');
// const decodedText = iconv.decode(buf, 'gbk');
const decodedText = iconv.decode(buf, 'utf-8');
console.log(decodedText);
But the actual output looks like this; only a small part of the content gets decoded:
流�"&轮�㬁精�!�� �`�R�!����9�`��R�&�����}�#TODS2023�9� � ��士系���
#TodsTheItalianPortrait
��:��欻�:�aWalter Chiapponi
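One plausible reason only part of the text survives: in Node.js, `'binary'` is an alias for `'latin1'`, and `Buffer.from(string, 'latin1')` keeps only the low byte of each character. Characters that windows-1252 places above U+00FF (for example `‰`, U+2030, which stands for byte 0x89) are therefore cut down to the wrong byte. The sketch below illustrates this with the three characters that make up the mojibake for `前`; the variable names are illustrative only.

```javascript
// '前' is the UTF-8 byte sequence e5 89 8d. Read as windows-1252, byte 0x89
// becomes '‰' (U+2030); 0x8d is unassigned and is assumed here to have come
// through as the control character U+008D.
const mojibake = '\u00e5\u2030\u008d';

// 'binary' (= 'latin1') keeps only the low byte of each UTF-16 code unit,
// so U+2030 turns into 0x30 instead of the intended 0x89.
const truncated = Buffer.from(mojibake, 'binary');
console.log(truncated); // <Buffer e5 30 8d>

// The byte sequence is no longer valid UTF-8, so decoding yields U+FFFD.
const decoded = truncated.toString('utf8');
console.log(decoded.includes('\ufffd')); // true
```

Characters whose mojibake consists only of code points up to U+00FF pass through this step unharmed, which would match the handful of characters that did decode correctly above.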
{
"BaseResponse": {
"Ret": 0,
"ErrMsg": ""
},
"AddMsgCount": 1,
"AddMsgList": [
{
"MsgId": "100930987469004064",
"FromUserName": "@c4dcd4010dc50e5ee03f32ae786701de",
"ToUserName": "filehelper",
"MsgType": 1,
"Content": "æµç•…轮廓ã€ç²¾è‡´ç»†èŠ‚和自由律动,糅åˆå‘ˆçŽ°#TODS2023秋冬 男士系列。<br/>#TodsTheItalianPortrait<br/><br/>创æ„总监:Walter Chiapponi",
"Status": 3,
"ImgStatus": 1,
"CreateTime": 1740397130,
"VoiceLength": 0,
"PlayLength": 0,
"FileName": "",
"FileSize": "",
"MediaId": "",
"Url": "",
"AppMsgType": 0,
"StatusNotifyCode": 0,
"StatusNotifyUserName": "",
"RecommendInfo": {
"UserName": "",
"NickName": "",
"QQNum": 0,
"Province": "",
"City": "",
"Content": "",
"Signature": "",
"Alias": "",
"Scene": 0,
"VerifyFlag": 0,
"AttrStatus": 0,
"Sex": 0,
"Ticket": "",
"OpCode": 0
},
"ForwardFlag": 0,
"AppInfo": {
"AppID": "",
"Type": 0
},
"HasProductId": 0,
"Ticket": "",
"ImgHeight": 0,
"ImgWidth": 0,
"SubMsgType": 0,
"NewMsgId": 100930987469004064,
"OriContent": "",
"EncryFileName": ""
}
],
"ModContactCount": 0,
"ModContactList": [],
"DelContactCount": 0,
"DelContactList": [],
"ModChatRoomMemberCount": 0,
"ModChatRoomMemberList": [],
"Profile": {
"BitFlag": 0,
"UserName": {
"Buff": ""
},
"NickName": {
"Buff": ""
},
"BindUin": 0,
"BindEmail": {
"Buff": ""
},
"BindMobile": {
"Buff": ""
},
"Status": 0,
"Sex": 0,
"PersonalCard": 0,
"Alias": "",
"HeadImgUpdateFlag": 0,
"HeadImgUrl": "",
"Signature": ""
},
"ContinueFlag": 0,
"SyncKey": {
"Count": 14,
"List": [
{
"Key": 1,
"Val": 940546031
},
{
"Key": 2,
"Val": 897439235
},
{
"Key": 3,
"Val": 940546023
},
{
"Key": 11,
"Val": 940546048
},
{
"Key": 19,
"Val": 44482
},
{
"Key": 23,
"Val": 1740396794
},
{
"Key": 24,
"Val": 1740397130
},
{
"Key": 25,
"Val": 897439235
},
{
"Key": 27,
"Val": 308443
},
{
"Key": 201,
"Val": 1740397130
},
{
"Key": 203,
"Val": 1740396590
},
{
"Key": 206,
"Val": 101
},
{
"Key": 1000,
"Val": 1740395520
},
{
"Key": 1001,
"Val": 1740395522
}
]
},
"SKey": "",
"SyncCheckKey": {
"Count": 14,
"List": [
{
"Key": 1,
"Val": 940546031
},
{
"Key": 2,
"Val": 897439235
},
{
"Key": 3,
"Val": 940546023
},
{
"Key": 11,
"Val": 940546048
},
{
"Key": 19,
"Val": 44482
},
{
"Key": 23,
"Val": 1740396794
},
{
"Key": 24,
"Val": 1740397130
},
{
"Key": 25,
"Val": 897439235
},
{
"Key": 27,
"Val": 308443
},
{
"Key": 201,
"Val": 1740397130
},
{
"Key": 203,
"Val": 1740396590
},
{
"Key": 206,
"Val": 101
},
{
"Key": 1000,
"Val": 1740395520
},
{
"Key": 1001,
"Val": 1740395522
}
]
}
}
windows-1252 test results:
> iconv.decode(iconv.encode('æµç•…轮廓ã€ç²¾è‡´ç»†èŠ‚和自由律动,糅å呈现#TODS2023秋冬 男士系列。#TodsTheItalianPortrait创æ„总监:Walter Chiapponi', 'windows-1252'), 'utf-8')
'�畅轮廓�精致细节和自由律动,糅�呈现#TODS2023秋冬 男士系列。#TodsTheItalianPortrait创�总监:Walter Chiapponi'
> iconv.decode(iconv.encode('流畅轮廓、精致细节和自由律动,糅合呈现#TODS2023秋冬 男士系列。#TodsTheItalianPortrait 创意总监:Walter Chiapponi', 'utf-8'), 'windows-1252')
'�畅轮廓�精致细节和自由律动,糅�呈现#TODS2023秋冬 男士系列。#TodsTheItalianPortrait 创�总监:Walter Chiapponi'
> iconv.decode(iconv.encode(iconv.decode(iconv.encode('流畅轮廓、精致细节和自由律动,糅合呈现#TODS2023秋冬 男士系列。#TodsTheItalianPortrait 创意总监:Walter Chiapponi', 'utf-8'), 'windows-1252'), 'windows-1252'), 'utf-8')
'浝畅轮廓〝精致细节和自由律动,糅坈呈现#TODS2023秋冬 男士系列。#TodsTheItalianPortrait 创愝总监:Walter Chiapponi'
Take the following text:
当å‰å¾®ä¿¡ç‰ˆæœ¬ä¸æ”¯æŒå±•ç¤ºè¯¥å†…容,请å‡çº§è‡³æœ€æ–°ç‰ˆæœ¬ã€‚
.
Encoded as windows-1252 and then decoded as utf-8, it becomes:
iconv.decode(iconv.encode('当å‰å¾®ä¿¡ç‰ˆæœ¬ä¸æ”¯æŒå±•ç¤ºè¯¥å†…容,请å‡çº§è‡³æœ€æ–°ç‰ˆæœ¬ã€‚', 'windows-1252'), 'utf-8')
.
当�?微信版本�?支�?展示该内容,请�?�级至最新版本。
.
A web search shows that the correct text is:
当前微信版本不支持展示该内容,请升级至最新版本。
.
Some of the characters cannot be decoded.
A closer look at “当” and “前” shows:
当 => [0xe5, 0xbd, 0x93] => å ½ “
前 => [0xe5, 0x89, 0x8d] => å ‰ (byte 8d has no assigned character here, which is what breaks the encoding).
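In windows-1252, the bytes 81, 8D, 8F, 90, and 9D are unassigned ( https://zh.wikipedia.org/wiki/Windows-1252 ).
Inspecting the raw network packets shows that the string contains some invisible characters, for example: \x8D \x81.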
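To see which characters are affected, one can check whether a character's UTF-8 encoding contains any of the five unassigned byte values. This is a minimal sketch using only Node.js built-ins; the helper name `charsHittingUnassignedBytes` is made up for illustration.

```javascript
// The five byte values that windows-1252 leaves unassigned.
const UNASSIGNED_CP1252 = new Set([0x81, 0x8d, 0x8f, 0x90, 0x9d]);

// Return the characters of `text` whose UTF-8 encoding contains at least
// one byte that windows-1252 cannot represent.
function charsHittingUnassignedBytes(text) {
  return [...text].filter((ch) =>
    [...Buffer.from(ch, 'utf8')].some((b) => UNASSIGNED_CP1252.has(b))
  );
}

console.log(charsHittingUnassignedBytes('当前'));       // [ '前' ]
console.log(charsHittingUnassignedBytes('流畅、合意')); // [ '流', '、', '合', '意' ]
```

The four characters flagged in the second call are the same ones that come out wrong in the windows-1252 round-trip test above.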
iconv.encode('', 'windows-1252')
Afterwards, the values at the corresponding positions have to be replaced with 81, 8D, 8F, 90, or 9D.
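The same idea can be sketched without patching bytes by hand: map every character back to its windows-1252 byte, and treat the control characters U+0081, U+008D, U+008F, U+0090, and U+009D as the five unassigned bytes. This assumes the invisible characters are still present in the string as those control characters, as the raw packets suggest; if they were stripped along the way, the bytes are gone and cannot be recovered. `repairMojibake` is a hypothetical helper, not part of iconv-lite.

```javascript
// Unicode code points for windows-1252 bytes 0x80..0x9f.
// null marks the five unassigned bytes (81, 8d, 8f, 90, 9d).
const CP1252_HIGH = [
  0x20ac, null, 0x201a, 0x0192, 0x201e, 0x2026, 0x2020, 0x2021,
  0x02c6, 0x2030, 0x0160, 0x2039, 0x0152, null, 0x017d, null,
  null, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014,
  0x02dc, 0x2122, 0x0161, 0x203a, 0x0153, null, 0x017e, 0x0178,
];

// Reverse table: code point -> byte. Assumption: an unassigned byte was
// passed through as the C1 control character with the same value.
const CP1252_REVERSE = new Map();
CP1252_HIGH.forEach((cp, i) => {
  const byte = 0x80 + i;
  CP1252_REVERSE.set(cp === null ? byte : cp, byte);
});

function repairMojibake(garbled) {
  const bytes = [];
  for (const ch of garbled) {
    const cp = ch.codePointAt(0);
    if (cp < 0x80 || (cp >= 0xa0 && cp <= 0xff)) {
      bytes.push(cp); // ASCII and Latin-1 range map to the same byte value
    } else if (CP1252_REVERSE.has(cp)) {
      bytes.push(CP1252_REVERSE.get(cp)); // windows-1252 specials and C1 controls
    } else {
      bytes.push(...Buffer.from(ch, 'utf8')); // text that is already correct
    }
  }
  return Buffer.from(bytes).toString('utf8');
}

// '当前' as mojibake: e5 bd 93 e5 89 8d read as windows-1252.
console.log(repairMojibake('\u00e5\u00bd\u201c\u00e5\u2030\u008d')); // 当前
```

Text that already contains genuine Latin-1 characters such as `é` would be damaged by this sketch, so it is only suitable for strings known to be mojibake of this kind.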
ntedshen 20 days ago
现(utf8)=e78eb0=现(latin1)
chenliang0571 OP @ntedshen That doesn't seem right?
> iconv.encode('现', 'utf-8')
<Buffer e7 8e b0>
> iconv.encode('现', 'latin1')
<Buffer e7 3f b0>
ntedshen 20 days ago
@chenliang0571
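https://cs.stanford.edu/people/miles/iso8859.html
3f is a question mark.
You don't really need to worry about that, though. All you need to know right now is that the encoding is wrong; no API is ever going to hand you a Latin character set and leave you to deal with the Chinese yourself...
Check whether the content-type is missing utf8.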
chenliang0571 OP @ntedshen
request: content-type: application/json;charset=UTF-8
response: content-type: text/plain
---
I found the cause: in windows-1252, the bytes 81, 8D, 8F, 90, and 9D are unassigned ( https://zh.wikipedia.org/wiki/Windows-1252 ). So when the Chinese below is round-tripped through windows-1252 and then decoded as utf-8 again, some of the characters come out wrong.
iconv.decode(iconv.encode(iconv.decode(iconv.encode('流畅轮廓、精致细节和自由律动,糅合呈现#TODS2023 秋冬 男士系列。#TodsTheItalianPortrait 创意总监:Walter Chiapponi', 'utf-8'), 'windows-1252'), 'windows-1252'), 'utf-8')
浝畅轮廓〝精致细节和自由律动,糅坈呈现#TODS2023 秋冬 男士系列。#TodsTheItalianPortrait 创愝总监:Walter Chiapponi
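Since the response arrives as `text/plain` with no charset, one possible way to sidestep the problem is to take the body as raw bytes and decode it as UTF-8 explicitly, instead of letting the HTTP client pick a default. This is a sketch under the assumption that the server really sends UTF-8 bytes and that the client's fallback decoding is what garbles them; `parseJsonBody` is a hypothetical helper and the sample payload is a stand-in for the real response.

```javascript
// Hypothetical helper: decode a raw HTTP response body as UTF-8, then parse it.
function parseJsonBody(rawBytes) {
  // fatal: true makes the decoder throw on invalid UTF-8 instead of
  // silently inserting U+FFFD.
  const text = new TextDecoder('utf-8', { fatal: true }).decode(rawBytes);
  return JSON.parse(text);
}

// Stand-in for the bytes received on the wire.
const rawBody = Buffer.from(JSON.stringify({ Content: '当前微信版本' }), 'utf8');

// Decoded as UTF-8, the text is intact.
console.log(parseJsonBody(rawBody).Content);

// Decoding the very same bytes as latin1 is what produces mojibake.
const misdecoded = JSON.parse(rawBody.toString('latin1')).Content;
console.log(misdecoded === '当前微信版本'); // false
```

How to obtain the raw bytes depends on the HTTP client in use, which the thread does not name.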