Historically, the SMS API was described as supporting 160 ASCII characters. This turned out to be an oversimplification in two ways:
- The size limit is more complicated than a flat 160 characters.
- ASCII is not the encoding actually used. Mobile networks send messages using the GSM 03.38 character encoding.
All carriers break messages into blocks called segments. A segment holds 160 characters, or 70 characters if the message contains any special characters or emoji. A message longer than 160 characters is split into more than one segment before it is sent. Certain characters in GSM 03.38 require an escape character, which means they take 2 characters to encode. These characters are: |, ^, {, }, €, [, ~, ] and \.
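The counting rule above can be sketched in Python. This is a minimal illustration rather than any provider's implementation: the function name `gsm_length` is hypothetical, and the basic alphabet is transcribed from the GSM 03.38 default table.

```python
# Basic GSM 03.38 default alphabet: each character costs one 7-bit unit.
GSM_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Characters listed above that need an escape: escape + character = two units.
GSM_ESCAPED = "|^{}€[~]\\"


def gsm_length(text):
    """Return the encoded length in 7-bit units, or None if the text
    contains a character that GSM 03.38 cannot represent."""
    total = 0
    for ch in text:
        if ch in GSM_BASIC:
            total += 1
        elif ch in GSM_ESCAPED:
            total += 2
        else:
            return None  # the message would have to fall back to UCS-2
    return total


print(gsm_length("hello"))  # 5
print(gsm_length("[ok]"))   # 6: each bracket counts twice
```

A message made of 160 plain letters fits in one segment, but replacing a single letter with `[` pushes the count to 161.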
UCS-2 Encoding
If you send your SMS message using UCS-2 encoding, the number of characters per message is reduced to 70. Some characters and symbols are not supported by the standard GSM character set, and including any of them causes your SMS message to be converted to UCS-2 automatically. The most common causes are a grave accent (back-tick) [`], accented letters outside the GSM set [ã], other accented or typographic characters that usually come along when text is copied from a Microsoft Word document into an SMS message, foreign language characters [ぁ], and emoji.
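To check whether a message still fits in a single 70-character UCS-2 SMS, the number of 16-bit units can be counted. The sketch below uses hypothetical helper names and assumes that characters outside the Basic Multilingual Plane, which includes most emoji, are sent as two 16-bit units, so they count twice toward the limit.

```python
def ucs2_units(text):
    """Count the 16-bit units the text occupies when encoded as UTF-16.

    Assumption: characters beyond the Basic Multilingual Plane
    (most emoji) are sent as a surrogate pair and count as two units.
    """
    return len(text.encode("utf-16-be")) // 2


def fits_single_ucs2_sms(text, limit=70):
    """True if the text fits in one UCS-2 message of `limit` units."""
    return ucs2_units(text) <= limit


print(ucs2_units("ã"))    # 1
print(ucs2_units("ぁ"))   # 1
print(ucs2_units("😀"))   # 2
```

Under this assumption a message of 69 letters plus one emoji needs 71 units and no longer fits in a single UCS-2 message.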
Multipart SMS Messages
When the message length exceeds 160 characters with 7-bit encoding (or 70 characters with UCS-2 encoding), the message is split into multiple separate SMS messages, each of which is sent to the handset separately.
So that the phone can concatenate the parts again, a special header, the User Data Header (UDH), is set on each part. It states which message the part belongs to and its position in the sequence. Because the UDH takes up space, each part of a multipart 7-bit message is shortened to 153 characters (67 characters for UCS-2).
Encoding | Regular SMS | Multipart SMS (per part) |
---|---|---|
7-bit | 160 characters | 153 characters |
Unicode (UCS-2) | 70 characters | 67 characters |
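The limits above translate into a simple segment calculation. The following is a rough sketch with hypothetical names. It takes the already encoded length (with escaped characters counted twice) and ignores edge cases such as a two-unit character landing exactly on a part boundary.

```python
import math

# (single-message limit, per-part limit once a UDH is present)
LIMITS = {
    "7-bit": (160, 153),
    "UCS-2": (70, 67),
}


def segment_count(length, encoding="7-bit"):
    """Return how many SMS parts a message of `length` units needs."""
    single, multipart = LIMITS[encoding]
    if length <= single:
        return 1  # fits in one regular SMS, no UDH needed
    return math.ceil(length / multipart)


print(segment_count(160))           # 1
print(segment_count(161))           # 2
print(segment_count(71, "UCS-2"))   # 2
```

Note that the jump from one part to two happens at 161 characters, but the jump from two parts to three happens at 307 (2 × 153 + 1), not at 321.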