FMForums.com

Check for Latin + Non-Latin mixed text

Hi, I've been looking for a way to check (via a calculation) whether a text string contains any non-latin characters.

Specifically, I'm interested in checking whether a text string contains only characters between Code 902 (hex 386) and Code 974 (hex 3CE) of the Unicode set (the Greek language), or others too.

The string is not supposed to be large (<100 chars) in length so recursion limits will not be an issue I think.

I only want to have a "switch" turned to (1) when mixed characters are typed (Greek + others).

Can anyone help with building a function for this purpose? (I'm really bad at this)

I am not sure the problem is well defined. You can easily check if the text contains ONLY certain characters, e.g. if

Exact ( text ; Filter ( text ; "αβγ...Ω" ) )

returns true, then text contains no other characters. However, if the test returns false, the "other characters" could be anything that isn't listed explicitly in the filterText parameter - in the above example, that would include digits, punctuation marks, currency symbols etc.
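For readers who want to see the logic outside FileMaker, here is a minimal Python sketch of the same Exact/Filter test. The helper names filter_text and contains_only are hypothetical, and the three-letter allowed list is only a stand-in for illustration, not the full list elided above.

```python
def filter_text(text: str, allowed: str) -> str:
    """Keep only the characters of text that appear in allowed (case-sensitive)."""
    return "".join(ch for ch in text if ch in allowed)


def contains_only(text: str, allowed: str) -> bool:
    """True when filtering removes nothing, i.e. every character is allowed."""
    return text == filter_text(text, allowed)


# Stand-in list for illustration only; a real check would list every allowed character.
ALLOWED = "αβγ"

print(contains_only("αβγ", ALLOWED))     # True
print(contains_only("αβγ123", ALLOWED))  # False: the digits are not in the list
```

As the reply notes, a False result only says that something outside the list is present; it does not say what kind of character it was.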

  • Author

Well the problem is that it may contain "abc", "αβγ" as well as "abcdαβγδ". I want to check whether it contains mixed Latin and Non-Latin characters or not (i.e. TRUE= case #3 only).

Hypothetically, I could use two expressions like yours side by side (one for Latin and one for non-Latin characters) and check both, but hard-coding the complete Unicode set sounds error-prone, doesn't it?

Therefore, I thought a recursive function which checks separately if each character falls in a specified code range would do the job.
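The per-character range check described here might look roughly like this in Python (used only to illustrate the logic, not as FileMaker syntax). The range 902 to 974 comes from the original question; treating only A-Z and a-z as Latin, and using a loop instead of recursion, are assumptions of this sketch. All function names are hypothetical.

```python
GREEK_FIRST = 0x386  # 902, lower bound given in the question
GREEK_LAST = 0x3CE   # 974, upper bound given in the question


def is_greek(ch: str) -> bool:
    """True if the character's code point falls in the Greek range from the question."""
    return GREEK_FIRST <= ord(ch) <= GREEK_LAST


def is_latin(ch: str) -> bool:
    """Assumption: 'Latin' means the unaccented letters A-Z and a-z only."""
    return "A" <= ch <= "Z" or "a" <= ch <= "z"


def is_mixed(text: str) -> bool:
    """True only when the text holds at least one Greek and one Latin letter."""
    has_greek = any(is_greek(ch) for ch in text)
    has_latin = any(is_latin(ch) for ch in text)
    return has_greek and has_latin


print(is_mixed("abc"))       # False
print(is_mixed("αβγ"))       # False
print(is_mixed("abcdαβγδ"))  # True
```

Characters outside both ranges (digits, punctuation) are simply ignored, so they never flip the result on their own.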

But then I can't write it myself :B :B :B

Such a custom function would face the same issue: in which category does "αβγ123" fall? In any case, I don't see a functional difference between defining characters by range or by listing them. Performance-wise, I'd guess the Filter() function will be faster.

  • Author

... in which category does "αβγ123" fall?

That would be TRUE (mixed characters),

whereas:

"abc" or "αβγ" is FALSE (not mixed)

I still don't get the logic here. Is "abc123" mixed? Perhaps you should explain the purpose of this exercise.

Why not apply the Filter function in a second field to catch the unacceptable text, and have it show a warning or something?

HTH

Lee

  • Author

I still don't get the logic here. Is "abc123" mixed? Perhaps you should explain the purpose of this exercise.

No, numbers would not matter. Hence your example would be mixed=FALSE

The function will be used when storing inventory IDs made up of initials and numbers (MP234, ΜΣ234a). Because Greek characters are in use, a potential mistyping of common letters in Latin (e.g. A, B, M, N, O etc.) could easily cause sorting/searching problems. By displaying a warning symbol, or even rejecting the entry during validation, I hoped to guard against this. Once again, numbers don't matter; what matters is whether the text contains both Greek and Latin characters.

mixed characters

only greek=FALSE

only latin=FALSE

greek and latin=TRUE

* numbers don't matter

numbers don't matter

If numbers don't matter, then why is "αβγ123" mixed, but "abc123" is not?

potential mistyping of common letters in latin(e.g. A,B,M,N,O etc) could easily cause sorting/searching problems.

Isn't it more likely for someone to mistype "ΑΒ" (Greek) instead of "AB" (Latin) rather than switching the character set in the middle of typing?

Can the Code() function be used to identify the high-ascii characters in the string? (Granted it may need a recursive custom function...) I assume here that you are looking for non-standard characters that are normally accessible only with the Option key on a Mac or the Alt key on a PC.

identify the high-ascii characters

My dear Vaughan, you're living in the past: in the age of Unicode, Greek characters have their own block - and higher-ascii no longer serves as the common vehicle for all "other" alphabets.

  • Author

1. If numbers don't matter, then why is "αβγ123" mixed, but "abc123" is not?

Oops, you got me there. Neither "αβγ123" nor "abc123" should be considered mixed (numbers don't matter).

2. Isn't it more likely for someone to mistype "ΑΒ" (Greek) instead of "AB" (Latin) rather than switching the character set in the middle of typing?

Data entry involves latin characters in other fields.

Can the Code() function be used to identify the high-ascii characters in the string? (Granted it may need a recursive custom function...) I assume here that you are looking for non-standard characters that are normally accessible only with the Option key on a Mac or the Alt key on a PC.

If you mean characters like ά, έ, ί, ή, ϊ, ΐ etc., on a Greek keyboard layout these are typed almost directly (e.g. ά = ;+α), so no Alt or Option key is involved.

Data entry involves latin characters in other fields

Yes, but I still think it's more likely for the user to forget to switch the keyboard when entering the field - thus producing a "not mixed" entry which is still wrong, rather than switch the keyboard in the middle of typing and producing a "mixed" entry.

Anyway, I believe a "mixed" entry is true when both of the following are true:

not IsEmpty ( Filter ( Lower ( text ) ; "αβγ...ω" ) )

and

not IsEmpty ( Filter ( Lower ( text ) ; "abc...z" ) )
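As a cross-check of this two-filter rule, here is a rough Python sketch run against the examples from this thread. The alphabet constants are assumptions written out for the sketch (they are not the elided lists in the expressions above), and is_mixed_by_filter is a hypothetical name.

```python
# Assumption: unaccented lowercase Greek letters only, including final sigma.
GREEK_LOWER = "αβγδεζηθικλμνξοπρσςτυφχψω"
# Assumption: 'Latin' means the 26 unaccented letters.
LATIN_LOWER = "abcdefghijklmnopqrstuvwxyz"


def is_mixed_by_filter(text: str) -> bool:
    """Lower-case the text, filter it against each alphabet, require both to be non-empty."""
    lowered = text.lower()
    greek = "".join(ch for ch in lowered if ch in GREEK_LOWER)
    latin = "".join(ch for ch in lowered if ch in LATIN_LOWER)
    return bool(greek) and bool(latin)


samples = [
    "abc",
    "αβγ",
    "abcdαβγδ",
    "αβγ123",
    "abc123",
    "MP234",             # Latin M and P
    "\u039c\u03a3234a",  # Greek capital mu and sigma, then 234 and a Latin a
]
for sample in samples:
    print(sample, is_mixed_by_filter(sample))
```

Only the third and last samples contain letters from both lists. Accented letters such as ά are not in GREEK_LOWER, so this sketch ignores them; the list would need extending if they can appear in the IDs.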

  • Author

At first glance, it looks like it will do the job. If I run into any problems I'll let you know. Thanks a lot.
