You can use sort to order lists numerically or lexically. To sort lexically you use the cmp operator within the curly braces:
and for sorting numerically, you use the <=> operator:
1. sort{$a cmp $b}("please", "sort", "this", "list");
What I think this means is that $a and $b refer to the two strings or numbers that are being compared when the algorithm is run. The letter "a" comes before the letter "b" in the alphabet and $a is put in front of $b in the code which means that words beginning with a letter that comes before other words in the alphabet should be sorted in front of those other words. Hopefully this all makes sense and is actually accurate.
1. sort{$a <=> $b}(13,2,8,1,3,1,5,21);
Perl 5.6 and earlier used the quicksort algorithm and perl 5.7 and onwards, uses the merge sort algorithm. On wikipedia (http://en.wikipedia.org/wiki/Merge_sort) there's a pretty good animation at the top of the page that shows how merge sort works.
Quicksort - perl 5.6 and earlier
I remember learning this at school and uni and thinking it's pretty cool but also wondering what it was used for. We had to write a program in Java that carried out a quicksort. I should have just submitted one line of perl code...
Quicksort is a divide and conquer algorithm and there are different variations of it based on which position you choose for the pivot (explanation very soon) but I couldn't find out which version was used for sort in perl. So for simplicity, and because it's how I was taught, I'm going to choose the pivot and the value in the middle of the list.
So we have a list of numbers (assuming we're using numbers and sorting them in ascending numerical order) and the value in the centre of the list is the pivot. All numbers lower than the pivot are put on the left of this pivot and all numbers higher are put on the right. This creates three different lists - numbers lower than the pivot, numbers higher than the pivot and the pivot itself. The pivot list is one element long and is considered sorted so now we need to sort the other two lists in the same way. For each list, a pivot is chosen and again, lower numbers go on the left and higher numbers go on the right. This now creates seven lists. This continues until you have lists containing only one element, putting these lists all together will give one sorted list. Here's an example (please feel free to skip the example if you already know how it works or if it doesn't interest you!):
(4,2,9,6,5,1,8,7,3)
The number in the centre of the list is the pivot.
Starting from the left, numbers lower than 5 are put into one list to the left and numbers higher than 5 are put into another list on the right. 4 is taken out first giving:
(4) (2,9,6,5,1,8,7,3)
Then 2 is taken out and put in the list on the left:
(4,2) (9,6,5,1,8,7,3)
Then 9:
(4,2) (6,5,1,8,7,3) (9)
This is repeated, excluding the pivot until we have three list with the pivot on its own:
(4,2,1,3) (5) (9,6,8,7)
Pivots are now chosen from any lists with more than one element. I think this time, as there is no real middle element, I will use the (n+1)/2th element.
(4,2,1,3) (5) (9,6,8,7)
The same is applied starting with the first list - all element lower than one go on the left and all numbers higher than one go on the right:
(1) (4,2,3) (5) (9,6,8,7)
The same is applied to the other list:
(1) (4,2,3) (5) (6,7) (8) (9)
Again, pivots are picked for the lists with more than one element:
(1) (4,2,3) (5) (6,7) (8) (9)
And the same sorting is applied giving:
(1) (2) (3,4) (5) (6) (7) (8) (9)
Then the final list that contains more than one element is sorted:
(1) (2) (3) (4) (5) (6) (7) (8) (9)
Put this all together and you have a sorted list:
(1,2,3,4,5,6,7,8,9)
Merge Sort - perl 5.7 onwards
So all of that wasn't entirely relevant for those of you that have upgraded your perl version since 2002 because from perl 5.6, the sort function has used the merge sort algorithm.
Merge sort is also a divide and conquer algorithm. It works by splitting up the list into individual elements. Each element is compared to its neighbour and (again assuming we're using numbers and sorting them in ascending numerical order) the smaller number is put on the left and the larger on the right. Then the next set of pairs are compared and so on until the end of the list. Then these sorted pair are compared to each other and ordered to make groups of four. Then these sorted groups are compared to the other groups and this is repeated until the whole list is in one big, sorted group. Hopefully this example will make all this clearer (again, feel free to skip this bit if you don't really care!):
Say we have a list of:
(4,2,6,5,1,8,7,3)
We then split it into sublists which contain one individual element. This means we have eight sublists of single elements:
(4) (2) (6) (5) (1) (8) (7) (3)
Then we compare the elements to their next door neighbour. First will be (4) and (2). 2 is smaller so goes in front of 4 and we have our first sorted pair of (2,4)
(2,4) (6) (5) (1) (8) (7) (3)
The smallest sublists are always compared first so we compare the next two single element sublists - (6) and (5) giving:
(2,4) (5,6) (1) (8) (7) (3)
This is continued until all the sublists contain two sorted elements:
(2,4) (5,6), (1,8) (7) (3)
(2,4) (5,6) (1,8) (3,7)
Then the pairs are compared to each other starting with (2,4) and (5,6) and always comparing the first element of each list. First 2 and 5 are compared, 2 is smaller so is taken out to make a new list:
(2,4) (5,6) (1,8) (3,7)
(2
Then the first elements of the target sublists are compared so 4 and 5, 4 is smaller so is taken out making:
(4) (5,6) (1,8) (3,7)
(2,4
Then there's only one target sublist left so elements are taken out one by one from the front:
(5,6) (1,8) (3,7)
(2,4,5
(6) (1,8) (3,7)
(2,4,5,6)
And finally:
(2,4,5,6) (1,8) (3,7)
Then the pairs (1,8) and (7,3) are compared in the same way with intermediate steps:
(2,4,5,6) (1,8) (3,7)
(1
(2,4,5,6) (8) (3,7)
(1,3
(2,4,5,6) (8) (7)
(1,3,7
(2,4,5,6) (8)
(1,3,7,8)
Giving us two sorted lists of:
(2,4,5,6) (1,3,7,8)
Finally these two sublists are compared in the same way to give one sorted list of
(1,2,3,4,5,6,7,8)
If that doesn't make sense, again, I highly recommend looking at the animation on the wikipedia page.
Why was quicksort ditched in favour of merge sort?
According to the perl docs, the quicksort algorithm used was unstable because it can become quadratic. What????
A stable sort preserves the order of list elements that compare equal so maybe the quicksort algorithm kept sorting list items that were exactly the same as each other, which made the run time and complexity higher.
The run time of both of these algorithms comes out as O(n log n) when averaged over all arrays of length n. However, because quicksort does not always preserve the order of equal elements, its run time can become O(n^2) - which is a quadratic because the n is squared. This behaviour doesn't happen with merge sort so quicksort was replaced.
The perl docs do say that for "some inputs" on "some platforms" the original quicksort was faster, but no other information is given - I'm not sure which inputs or which platforms, but I'm sure that these are a minority. Also as an extra note, I think that 99.99% of the time, any time or performance differences will not be large enough to be noticeable anyway.