
Add guessableName to GB18030 encoding #300315

Open
frg2089 wants to merge 1 commit into microsoft:main from frg2089:patch-1

Conversation

@frg2089

@frg2089 frg2089 commented Mar 10, 2026

Currently, jschardet often misidentifies GB18030-encoded files as GB2312, because GB18030 is backward compatible with GB2312 and encodes the common characters with the same bytes. This causes significant issues: GB18030 covers the full Unicode range (like UTF-8), whereas GB2312 does not.
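To make the idea in the title concrete, here is a minimal sketch of how an optional `guessableName` field could decide which encodings are offered to a detector as candidates. Everything in it (the `EncodingInfo` shape, the `ENCODINGS` table, the `guessableNames` helper) is hypothetical and written for illustration; it is not copied from `encoding.ts` and may differ from the actual change.

```typescript
// Hypothetical shape: an encoding is only "guessable" if it carries a guessableName.
interface EncodingInfo {
  labelShort: string;
  guessableName?: string; // name the detector would report (assumption)
}

// Illustrative table, not the real list from encoding.ts.
const ENCODINGS: Record<string, EncodingInfo> = {
  utf8: { labelShort: 'UTF-8', guessableName: 'UTF-8' },
  gb2312: { labelShort: 'GB 2312', guessableName: 'GB2312' },
  // Without guessableName here, GB18030 could never be used as a candidate.
  gb18030: { labelShort: 'GB 18030', guessableName: 'GB18030' },
};

// Map user-selected candidate keys to detector names, dropping non-guessable ones.
function guessableNames(candidateKeys: string[]): string[] {
  const names: string[] = [];
  for (const key of candidateKeys) {
    const name = ENCODINGS[key]?.guessableName;
    if (name !== undefined) {
      names.push(name);
    }
  }
  return names;
}

console.log(guessableNames(['gb18030', 'not-an-encoding']));
```

Under this assumption, a user who lists `gb18030` as a candidate would have it passed through to detection instead of being silently dropped.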

Reproduction: If a text file containing characters such as 🌟𠀚〇𡌴鉏 is saved as GB18030, jschardet may detect it as GB2312. Reading the file with this incorrect encoding turns the unsupported characters into garbled text (e.g., �9�9�2�2〇�5�2鉏).
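The pattern in the garbled output can be explained by how GB18030 encodes characters outside the Basic Multilingual Plane: each one becomes four bytes, and the second and fourth bytes fall in the ASCII digit range. The sketch below computes those bytes for a supplementary code point using the linear mapping defined by GB18030; the function names `gb18030FourByte` and `toHex` are illustrative helpers, not part of jschardet or VS Code.

```typescript
// Encode a supplementary-plane code point (U+10000..U+10FFFF) as GB18030 four-byte form.
function gb18030FourByte(codePoint: number): number[] {
  if (codePoint < 0x10000 || codePoint > 0x10ffff) {
    throw new RangeError('only supplementary code points are handled in this sketch');
  }
  // Supplementary planes map linearly, starting at pointer 189000.
  let pointer = codePoint - 0x10000 + 189000;
  const b4 = 0x30 + (pointer % 10);   // 0x30..0x39, an ASCII digit
  pointer = Math.floor(pointer / 10);
  const b3 = 0x81 + (pointer % 126);  // 0x81..0xFE
  pointer = Math.floor(pointer / 126);
  const b2 = 0x30 + (pointer % 10);   // 0x30..0x39, an ASCII digit
  pointer = Math.floor(pointer / 10);
  const b1 = 0x81 + pointer;          // lead byte
  return [b1, b2, b3, b4];
}

// Format bytes as space-separated lowercase hex.
function toHex(bytes: number[]): string {
  return bytes.map((b) => b.toString(16).padStart(2, '0')).join(' ');
}

console.log(toHex(gb18030FourByte(0x1f31f))); // 🌟 → "94 39 b2 39"
console.log(toHex(gb18030FourByte(0x2001a))); // 𠀚 → "95 32 85 32"
```

A decoder limited to two-byte GB2312 sequences cannot pair these bytes, so the high bytes become replacement characters while `0x39` and `0x32` survive as the digits `9` and `2`, which matches the `�9�9�2�2` prefix above.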

Closes #248513

@vs-code-engineering

📬 CODENOTIFY

The following users are being notified based on files changed in this PR:

@bpasero

Matched files:

  • src/vs/workbench/services/textfile/common/encoding.ts



Development

Successfully merging this pull request may close these issues.

Can't Add Supported Encodings like GB18030 to Candidate Guess Encodings

2 participants