👋 Hello World in 2️⃣0️⃣ languages

Himanshu Jain
4 min read · Jun 23, 2024


This article might come as a surprise 🎉, and I bet you were expecting to see the code for “Hello World” in 20 different programming languages, but I have got something better for you 💯.

Let’s get started !!

Code to print “Hello World” in 20 different languages 🤯

Output:

❯ python3 hello-world.py
English: Hello World
Spanish: ¡Hola Mundo
French: Bonjour le Monde
German: Hallo Welt
Chinese (Simplified): 你好世界
Japanese: こんにちは世界
Russian: Здравствуй мир
Arabic: مرحبا بالعالم
Hindi: नमस्ते दुनिया
Portuguese: Olá Mundo
Italian: Ciao Mondo
Korean: 안녕하십시오 세계
Turkish: Merhaba Dünya
Greek: Γειαστή κόσμε
Dutch: Hallo Wereld
Swedish: Hej Världen
Thai: สวัสดีครับ ส้างส์
Hebrew: שלום עלאם
Vietnamese: Xin chào Thế giối
Finnish: Hei Maailma

The random-looking numbers that you see in the code against each language are Unicode code points.

P.S.: I had the easy option of writing plain English letters instead of their Unicode symbols.
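To give a feel for what those numbers look like, here is a minimal sketch in Python. It assumes each greeting is stored as `\u` escape sequences; the names `greetings` and `to_escapes` and the choice of four languages are purely illustrative, not the full script.

```python
# Illustrative sketch: greetings written as Unicode escape sequences.
# The dictionary name and the selection of languages are assumptions.
greetings = {
    "English": "\u0048\u0065\u006c\u006c\u006f \u0057\u006f\u0072\u006c\u0064",
    "German": "\u0048\u0061\u006c\u006c\u006f \u0057\u0065\u006c\u0074",
    "Chinese (Simplified)": "\u4f60\u597d\u4e16\u754c",
    "Japanese": "\u3053\u3093\u306b\u3061\u306f\u4e16\u754c",
}


def to_escapes(text):
    """Turn text into its Python escape notation, one escape per character."""
    parts = []
    for ch in text:
        cp = ord(ch)
        # \uXXXX covers code points up to U+FFFF; larger ones need \UXXXXXXXX.
        parts.append(f"\\u{cp:04x}" if cp <= 0xFFFF else f"\\U{cp:08x}")
    return "".join(parts)


# Python decodes the escapes when it reads the source, so printing shows real text.
for language, text in greetings.items():
    print(f"{language}: {text}")

# Going the other way shows the "numbers" behind a piece of text.
print(to_escapes("Hi"))
```

The helper `to_escapes` is handy if you want to produce the escaped form of any greeting yourself.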

Let's learn!!

What is Unicode ?? (explain as if I am a layman)

Unicode is like a giant catalogue that assigns a unique number to every character, symbol, and emoji that you see on your computer or phone.

Imagine each letter, number, and special character has its own special code so that computers everywhere can understand and show them correctly, no matter what language or script they come from.

Whether you’re typing in English, Chinese, Arabic, or any other language, Unicode ensures that your text looks right to everyone, no matter where they are in the world.
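If you want to peek into the catalogue yourself, Python's built-in `ord` and `chr` do the lookup in both directions. This is only a small sketch; the helper name `code_point_label` is made up for illustration.

```python
def code_point_label(ch):
    """Format a single character's catalogue number in the usual U+XXXX style."""
    return f"U+{ord(ch):04X}"


# Character -> number
for ch in ["A", "\u4f60", "\U0001F44B"]:
    print(ch, ord(ch), code_point_label(ch))

# Number -> character
print(chr(65))
print(chr(0x1F44B))
```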

Well now I am interested in the technical definition !!

Unicode is a standardized system for encoding characters used in writing systems worldwide. It aims to assign each character a unique numeric code called a “code point.” This allows computers to represent and manipulate text in different languages and scripts consistently, regardless of platform or software.

Key points about Unicode:

  1. Universal Character Set: Unicode provides a universal character set that includes characters from all major languages, scripts, and symbols used around the world. This includes Latin, Cyrillic, Greek, Arabic, Hebrew, Chinese, Japanese, Korean, and many others.
  2. Code Points: Each character in Unicode is assigned a unique code point, typically represented as a hexadecimal number. For example, the code point for the letter ‘A’ is U+0041.
  3. UTF-8 and UTF-16: Unicode defines several encoding schemes, with UTF-8 and UTF-16 being the most common. These schemes allow computers to store and transmit Unicode text efficiently. Both are variable-width encodings: UTF-8 uses 1 to 4 bytes per code point, and UTF-16 uses 2 or 4 bytes. UTF-32 is the fixed-width one, at 4 bytes per code point.
  4. Compatibility: Unicode aims to maintain compatibility with older character encoding standards (like ASCII, ISO 8859, etc.) by mapping their characters to Unicode code points.
  5. Support for Emoji and Symbols: Unicode also includes support for emojis, mathematical symbols, currency symbols, diacritical marks, and other special characters.
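A quick way to see the difference between a code point and its encoded bytes is to encode the same character in several ways and count the bytes. The sketch below uses only Python's built-in `str.encode`; the function name `encoded_sizes` is an illustrative choice.

```python
def encoded_sizes(text):
    """Return how many bytes the text needs in each encoding."""
    return {
        "utf-8": len(text.encode("utf-8")),
        # The "-le" variants skip the byte order mark, so only the text is counted.
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }


# One character each from ASCII, Latin-1, CJK, and the emoji range.
for ch in ["A", "\u00e9", "\u4f60", "\U0001F44B"]:
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))

# ASCII compatibility: plain ASCII text has identical bytes in UTF-8.
print("Hello World".encode("ascii") == "Hello World".encode("utf-8"))
```

The byte counts change from character to character in UTF-8 and UTF-16, while UTF-32 always needs four bytes per code point.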

How a symbol is assigned a Unicode number

Assigning a symbol or character a Unicode number involves a systematic process managed by the Unicode Consortium. Here’s an overview of how this assignment typically works:

  1. Unicode Consortium: The Unicode Consortium is responsible for maintaining and updating the Unicode Standard. It includes representatives from major software companies, academic institutions, and internationalization experts.
  2. Character Proposals: When a new symbol or character needs to be added to Unicode, someone (often a linguist, script expert, or interested party) submits a proposal to the Unicode Consortium. This proposal outlines the need for the character, provides examples of usage, and may include historical or cultural significance.
  3. Technical Committee Review: The Unicode Technical Committee (UTC) reviews the proposal. They evaluate factors such as the character’s frequency of use, compatibility with existing standards, and uniqueness compared to other characters.
  4. Assigning a Code Point: If the proposal is accepted, the UTC assigns a unique code point (a numerical value) to the character. Code points are typically represented in hexadecimal format.
  5. Unicode Database: The assigned code point is recorded in the Unicode Standard’s database maintained by the Unicode Consortium. This database is continuously updated to reflect new characters, modifications, and corrections.
  6. Encoding Forms: The Unicode Standard defines encoding forms like UTF-8, UTF-16, and UTF-32, which specify how code points are represented in binary format for storage and transmission in computer systems.
  7. Implementation: Software developers and vendors implement Unicode support in their applications, ensuring that users can input, display, and process text containing the newly assigned characters correctly.
  8. Public Availability: Once a new version of the Unicode Standard is finalized and published, the assigned code points become part of the official Unicode repertoire. This makes the characters universally accessible for use in various applications, operating systems, and digital communication.
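The end result of this process is visible from code. Python ships a copy of the Unicode Character Database in its standard `unicodedata` module, so you can look up the official name and category of an assigned character. This is a rough sketch; the helper `describe` is a made-up name, and the database version you see depends on your Python release.

```python
import unicodedata


def describe(ch):
    """Collect a few database fields for a single character."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        # The second argument is returned when a character has no name.
        "name": unicodedata.name(ch, "<unnamed>"),
        "category": unicodedata.category(ch),
    }


# Which version of the Unicode database this Python was built with.
print("Unicode database version:", unicodedata.unidata_version)

for ch in ["A", "\u20ac", "\u0928"]:
    print(describe(ch))

# Names work in the other direction too.
print(unicodedata.lookup("EURO SIGN"))
```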

Give me some references:

Where can you find the list of the latest Unicode characters?
Right here -> https://home.unicode.org
— — —
Other useful websites to search and learn about a Unicode character:
https://unicodeplus.com
https://www.amp-what.com

Well, THE END … Check out some of my other articles :) Drop a follow if you like the article.

Check out some of my other work online: https://www.instagram.com/oss.hub
