.NET Framework - Localization and the Comparer class

Asked By Tony Johansson
09-Feb-10 06:21 PM
Hi!

Below is a simple program that uses the Comparer class to compare two
strings named str1 and str2.
If I pass 0x040A as the first argument to the CultureInfo constructor, I get
the traditional sort order, according to the MSDN documentation quoted
at the bottom.
The WriteLine statement in the program prints 1, meaning
that str1 > str2.
Can somebody explain how this works, given that the comparison is not based on
the ASCII table?
I mean, if we used the normal ASCII table we would say that str1 < str2,
because the letter l is less than u.

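using System;
using System.Collections;
using System.Globalization;

public class Program
{
    public static void Main()
    {
        // Creates the strings to compare.
        String str1 = "llegar";
        String str2 = "lugar";

        // 0x040A is the LCID for Spanish (Spain) with the traditional sort order.
        Comparer myCompTrad = new Comparer(new CultureInfo(0x040A, false));
        Console.WriteLine("   Traditional Sort  : {0}",
            myCompTrad.Compare(str1, str2));
    }
}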

The Spanish (Spain) culture uses two culture identifiers, 0x0C0A using the
default international sort order, and 0x040A using the traditional sort
order. If the CultureInfo is constructed using the es-ES culture name, the
new CultureInfo uses the default international sort order. For the
traditional sort order, the object is constructed using the name
es-ES_tradnl.

//Tony
  Peter Duniho replied to Tony Johansson
09-Feb-10 07:13 PM
At the bottom of what?


What do you want to know?  If you want all the gory details of the
comparison, you need to just look at the implementation (which may or
may not involve diving into the unmanaged Windows API).

The basic answer is: duh, of course a culture-specific comparison must
not be based on the ASCII character values.  That's the whole point of a
culture-specific comparison, as ASCII is itself not a culturally-based
character encoding.

Instead, when you do a culture-specific comparison, it uses whatever
ordering rules exist for that specific culture.  Humans being the kind
of animal they are, these rules are not always logical.  Even when they
are logical, the logic does not necessarily follow the representation of
characters and words as found in a computer.

But, those rules _are_ what a human being expects when the computer is
asked to order the input, which is the whole reason for having
culture-specific support in various APIs, including .NET.


The 0x040A LCID is not even listed on the reference that I looked at
(http://msdn.microsoft.com/en-us/goglobal/bb896001.aspx).  But, we can
see on the documentation for the CultureInfo class that it is used to
indicate a "traditional" Spanish-specific sorting.

And for whatever reason (I do not speak Spanish, so I could not tell you
why), the word "llegar" is alphabetized after "lugar".  So that is what
the Compare() method tells you when you compare them.

If you want to know why in the "traditional" ordering, "llegar" comes
after "lugar", but in the "international" ordering, it comes before, you
need to ask someone who knows about Spanish culture.  It is not a
programming question.
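
The contrast between an ordinal comparison and the two Spanish sort orders
discussed in this thread can be sketched roughly as follows. This is only an
illustration, not the original poster's code: the class name SortOrderDemo is
made up, and the culture-sensitive results depend on the collation data
available on the machine, so only the ordinal result is stated as certain.

```csharp
using System;
using System.Globalization;

// Hypothetical demo class; the name is an assumption for this sketch.
public static class SortOrderDemo
{
    public static void Main()
    {
        string str1 = "llegar";
        string str2 = "lugar";

        // Ordinal comparison looks only at the UTF-16 code units.
        // 'l' (0x6C) is less than 'u' (0x75), so the sign is negative.
        Console.WriteLine("Ordinal       : {0}",
            Math.Sign(string.CompareOrdinal(str1, str2)));   // prints -1

        // Culture-sensitive comparisons use the collation rules of the culture.
        // The LCIDs are the two quoted from MSDN in the question.
        CultureInfo traditional = new CultureInfo(0x040A, false);
        CultureInfo international = new CultureInfo(0x0C0A, false);

        // The original post reports 1 for the traditional sort order.
        Console.WriteLine("Traditional   : {0}",
            string.Compare(str1, str2, false, traditional));

        // The international order treats "ll" as two separate letters.
        Console.WriteLine("International : {0}",
            string.Compare(str1, str2, false, international));
    }
}
```

If the process runs with culture data disabled (for example in invariant
globalization mode), constructing these cultures may fail or the two
culture-sensitive results may not differ.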

Pete
  Harlan Messinger replied to Peter Duniho
15-Feb-10 11:25 AM
The Spanish alphabet is, officially, a, b, c, ch, d, e, f, g, h, i, j,
k, l, ll, m, n, ñ, o, p, q, r, s, t, u, v, w, x, y, z. The digraph "ll"
which has its own pronunciation distinct from that of "l", has been
treated as a single letter, in the same way as the digraph "ch".

However, a 1994 international language reform passed during the Tenth
Congress of the Association of Spanish Language Academies decreed that
henceforth, for purposes of sorting, "ch" and "ll" should be treated as
two separate letters, so the official order would now be {llegar,
lugar}, despite the fact that "llegar" is still officially considered to
consist of five letters. Weird, but official, and perhaps enacted in
order to avoid the kinds of problems involved in international,
computerized data exchange, given that everybody, Spanish speakers
included, *types* "ch" and "ll" each as a sequence of two letters
instead of as a digraph.
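
To make the effect of the digraph rule concrete, here is a minimal sketch of a
comparer that follows the traditional alphabet listed above, where "ch" and
"ll" each count as a single letter. It is not how .NET or Windows implements
the traditional sort; TraditionalSpanishComparer and Demo are hypothetical
names, and the sketch assumes lowercase input without accented vowels.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of .NET: orders lowercase words by the
// traditional Spanish alphabet, in which "ch" and "ll" are single letters.
public sealed class TraditionalSpanishComparer : IComparer<string>
{
    private static readonly string[] Alphabet =
    {
        "a", "b", "c", "ch", "d", "e", "f", "g", "h", "i", "j", "k", "l", "ll",
        "m", "n", "ñ", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z"
    };

    // Splits a word into alphabet positions, matching digraphs greedily.
    private static List<int> ToRanks(string word)
    {
        var ranks = new List<int>();
        int i = 0;
        while (i < word.Length)
        {
            int length = 1;
            if (i + 1 < word.Length)
            {
                string pair = word.Substring(i, 2);
                if (pair == "ch" || pair == "ll") length = 2;
            }
            // Characters outside the alphabet get rank -1 and sort first.
            ranks.Add(Array.IndexOf(Alphabet, word.Substring(i, length)));
            i += length;
        }
        return ranks;
    }

    public int Compare(string x, string y)
    {
        List<int> a = ToRanks(x);
        List<int> b = ToRanks(y);
        for (int i = 0; i < Math.Min(a.Count, b.Count); i++)
        {
            if (a[i] != b[i]) return a[i].CompareTo(b[i]);
        }
        // All shared letters are equal: the shorter word sorts first.
        return a.Count.CompareTo(b.Count);
    }
}

public static class Demo
{
    public static void Main()
    {
        var comparer = new TraditionalSpanishComparer();
        // "llegar" starts with the letter "ll", which follows "l" in the
        // traditional alphabet, so it sorts after "lugar".
        Console.WriteLine(comparer.Compare("llegar", "lugar"));   // prints 1
    }
}
```

Dropping "ch" and "ll" from the Alphabet array and from the digraph check
gives the post-1994 behaviour, in which "llegar" sorts before "lugar".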