Code Character Encoding Standard
- Requirement: All code MUST be encoded in UTF-8.
Rationale
UTF-8 can encode every character in the Unicode standard. It is backward compatible with ASCII (any 7-bit ASCII file is also valid UTF-8), so it can represent text in any language along with a wide range of symbols. Requiring UTF-8 for all code facilitates interoperability, reduces the likelihood of encoding errors, and makes the codebase more inclusive and accessible.
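As a small illustration of the two points above (ASCII compatibility and encoding mismatches), the following sketch uses only Python's built-in codecs. The sample strings are made up for the example and are not taken from the project.

```python
# ASCII-only text produces identical bytes under ASCII and UTF-8,
# so existing ASCII files need no conversion.
ascii_text = "print('hello')"
same_bytes = ascii_text.encode("ascii") == ascii_text.encode("utf-8")

# Non-ASCII text written as UTF-8 but read with a different encoding
# (Latin-1 here, chosen only as an example) is silently corrupted.
raw = "café".encode("utf-8")        # b'caf\xc3\xa9'
mojibake = raw.decode("latin-1")    # the two UTF-8 bytes become two characters

print(same_bytes)  # True
print(mojibake)    # cafÃ©
```

The second case is the kind of mismatch that a single project-wide encoding avoids.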
Enforcement
This standard applies to all code and text files in the project, including but not limited to source code, configuration files, and documentation. Contributors MUST configure their text editors or IDEs to use UTF-8 encoding by default for all files associated with the project.
Failure to comply with this standard may result in code that is not portable or that causes runtime errors due to encoding mismatches. Such code will not be accepted into the project repository.
Verification
- Before committing changes, contributors MUST verify that all files are encoded in UTF-8.
- Reviewers SHOULD use tools or scripts to automatically check the encoding of files as part of the code review process.
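The checks above could be automated with a short script. The following is a minimal sketch, not a prescribed tool: the function names `is_valid_utf8` and `find_non_utf8_files` and the `skip_dirs` parameter are hypothetical, and it assumes every file under the given directory (apart from the skipped directories) is meant to be text. Binary assets such as images would be reported as non-UTF-8 and would need to be excluded separately.

```python
from pathlib import Path


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def find_non_utf8_files(root: str, skip_dirs=frozenset({".git"})) -> list[str]:
    """List files under root whose contents are not valid UTF-8.

    Assumption: everything outside skip_dirs is a text file.
    """
    root_path = Path(root)
    offenders = []
    for path in sorted(root_path.rglob("*")):
        # Skip anything inside an excluded directory such as .git.
        if any(part in skip_dirs for part in path.relative_to(root_path).parts):
            continue
        if path.is_file() and not is_valid_utf8(path.read_bytes()):
            offenders.append(str(path))
    return offenders
```

Calling `find_non_utf8_files(".")` from the project root returns the offending paths; an empty list means every scanned file decoded as UTF-8.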
Tools and Configuration
Most modern text editors and IDEs support UTF-8 encoding and allow users to set UTF-8 as the default encoding for new files. In addition, many version control systems can be configured to reject files that do not conform to this encoding standard.
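One way a version control system can reject non-conforming files is through a commit hook. The sketch below assumes the project uses Git and shows the core of a possible pre-commit hook. The function names are hypothetical, and as a simplification it reads the working-tree copy of each staged file rather than the staged content itself. It also assumes all staged files are text.

```python
import os
import subprocess
import sys


def staged_paths() -> list[str]:
    """Return paths added, copied, or modified in the Git index."""
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM", "-z"],
        check=True,
        capture_output=True,
    ).stdout
    # -z separates paths with NUL bytes, which is safe for unusual filenames.
    return [os.fsdecode(p) for p in out.split(b"\0") if p]


def non_utf8(paths: list[str]) -> list[str]:
    """Return the subset of paths whose contents are not valid UTF-8."""
    bad = []
    for path in paths:
        with open(path, "rb") as fh:
            try:
                fh.read().decode("utf-8")
            except UnicodeDecodeError:
                bad.append(path)
    return bad


def main() -> int:
    """Report offending files; a non-zero return value aborts the commit."""
    bad = non_utf8(staged_paths())
    for path in bad:
        print(f"rejected (not UTF-8): {path}", file=sys.stderr)
    return 1 if bad else 0
```

To try it as a hook, the script would be saved as an executable `.git/hooks/pre-commit` file with a Python shebang line and a final `sys.exit(main())` call. Git aborts the commit when the hook exits with a non-zero status.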