Thursday 13 February 2014

Lists, Lists, Lists

List or Array?

I've written a post about arrays and kind of glossed over the array/list distinction. Some people got back to me about the fact that I hadn't really talked about lists at all so here I'm going to try to explore what a list is and how it differs from an array.

A list is just what it sounds like - a number of items separated by commas.
An array is assigned a list. It contains the list but it isn't the list itself. You can't put an index number straight after a bare list; you have to wrap the list in parentheses and subscript that (a list slice), or assign it to an array first.

So this is a list:

("apples", "bananas", "cherries", 45, 360)

We can assign it to an array:
my @array = ("apples", "bananas", "cherries", 45, 360);
And this is an array:

@array

In my post about arrays, I showed a way to get the number of elements in an array using scalar(@arr) and I also talked about using scalar on a list but I didn't explain it very well.

So if we have:
my @arr = ("this", "is", "a", "list");
The difference between
print scalar(@arr);
and
print scalar("this", "is", "a", "list");

is that the first one gives the scalar value of an array, which is the number of elements in the array (4), and the second one gives the value of the last element ("list"). Inside scalar() the commas no longer build a list: they act as the comma operator, which evaluates each item in turn and returns the last one. Two very similar statements, but giving you very different results. I think this is very strange behaviour.
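Here is a minimal sketch of the two cases side by side. The variable names $count and $last are just for illustration, and the "no warnings" line is my own addition so that Perl doesn't complain about the discarded items in the second call.

```perl
use strict;
use warnings;

my @arr = ("this", "is", "a", "list");

# An array in scalar context gives the number of elements.
my $count = scalar(@arr);
print "$count\n";    # 4

# Commas inside scalar() act as the comma operator, so only the
# last item survives. The earlier items are evaluated and thrown away.
my $last;
{
    no warnings 'void';    # hide "Useless use of a constant" for this demo
    $last = scalar("this", "is", "a", "list");
}
print "$last\n";     # list
```

If what you actually want is the number of items, assigning the list to an array first is the least surprising route.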


Note: Typing loads of quotes in a list definitely slows down my typing and results in a lot of pressing the wrong keys. To avoid that, instead of this:

my @arr = ("this", "is", "a", "list");

you can write this:

my @arr = qw(this is a list);

Much quicker. qw stands for "quote word". The only problem is that whitespace between the words is what separates the list items, so if you want to include a space in one of your list items, you're going to have to use quotes for that item.
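As a small sketch of how to handle an item with a space in it, qw and ordinary quotes can be mixed inside one pair of parentheses. The @fruit array here is made up for the example.

```perl
use strict;
use warnings;

# qw splits on whitespace, so "passion fruit" has to be quoted normally.
# Both pieces flatten into a single four-element list.
my @fruit = (qw(apples bananas cherries), "passion fruit");

print scalar(@fruit), "\n";    # 4
print "$fruit[3]\n";           # passion fruit
```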

You can do lots of things with lists and here are some of them.

The following only work on an array so you have to assign your list to an array first.

Pop

This removes the last element from the array and returns it, so you can assign it to a variable.

my @arr = (1,2,3,4,5);
my $val = pop(@arr);

$val has the value 5 and @arr is now (1,2,3,4)

Push

This one is kind of the opposite of pop. Instead of taking away from the end of the array, you add an element on to the end of it.

my @arr = (1,2,3,4,5);
push (@arr, "x");

@arr is now (1,2,3,4,5,"x")

Shift

This is pop's counterpart at the other end of the array. Instead of removing the last element of the array and returning it, it removes and returns the first element.

my @arr = (1,2,3,4,5);
my $val = shift(@arr);

$val has the value 1 and @arr is now (2,3,4,5)

Unshift

This is the opposite of shift, and it does what push does but at the other end. An element is added to the front of the array.

my @arr = (1,2,3,4,5);
unshift (@arr, "x");

@arr is now ("x",1,2,3,4,5)


I think this little table sums up everything above:

Function   Beginning or end of array   Add or remove element
Pop        End                         Remove
Push       End                         Add
Shift      Beginning                   Remove
Unshift    Beginning                   Add
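To see all four working on the same array, here is a short sketch that runs them one after another. The comments show what I'd expect the array to hold after each step.

```perl
use strict;
use warnings;

my @arr = (1, 2, 3, 4, 5);

my $last = pop(@arr);       # remove from the end: $last is 5, @arr is (1,2,3,4)
push(@arr, "x");            # add to the end: @arr is (1,2,3,4,"x")
my $first = shift(@arr);    # remove from the beginning: $first is 1, @arr is (2,3,4,"x")
unshift(@arr, "y");         # add to the beginning: @arr is ("y",2,3,4,"x")

print "$last $first\n";     # 5 1
print "@arr\n";             # y 2 3 4 x
```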


Sort

You can also sort your lists using the sort function:

Lexically: 

If sort is used by itself with no parameters, it will sort the list in standard string comparison order, which basically means alphabetical order. You can either use sort directly on a list as I have below or you can assign the list to an array and use sort on the array. Just type "sort" before the list or before the array.
my @arr = sort("hello", "my", "name", "is", "emma");
print "@arr\n";
When the above code is run, it will come out with:
emma hello is my name
This can also be written as
sort {$a cmp $b} ("hello", "my", "name", "is", "emma");
Note
If you have words beginning with capital letters, these will always be sorted in front of lower cased words, even if they come after the lower cased word alphabetically. This is because capital letters come before lower case letters in string comparison order.
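If you would rather ignore case, one common approach (not something covered above, so treat it as a sketch) is to lower-case both sides with lc inside the comparison. The word list is made up to show the difference.

```perl
use strict;
use warnings;

my @words = ("banana", "Cherry", "apple");

# Default sort: the capitalised word comes first.
my @default = sort @words;
print "@default\n";     # Cherry apple banana

# Compare lower-cased copies so case no longer affects the order.
# The original strings are what end up in the result.
my @caseless = sort { lc($a) cmp lc($b) } @words;
print "@caseless\n";    # apple banana Cherry
```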

To sort the words into backwards order, just swap $a and $b:
sort {$b cmp $a} ("hello", "my", "name", "is", "emma");

Numerically:
my @arr = sort { $a <=> $b } (7, 9, 4, 2, 8);
print "@arr\n";
To sort numerically, you use the <=> operator rather than cmp and you still use $a and $b to represent the two numbers being compared in the algorithm. When the above code is run, it will print:
2 4 7 8 9
Again, to sort backwards, you swap $a and $b:
my @arr = sort { $b <=> $a } (7, 9, 4, 2, 8);
print "@arr\n";
This will print:
9 8 7 4 2
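The reason <=> matters shows up as soon as the numbers have different lengths. This sketch sorts the same made-up numbers both ways. The plain sort compares them as strings, so "100" lands before "9".

```perl
use strict;
use warnings;

my @numbers = (10, 9, 100, 1);

# No comparison block: the numbers are compared as strings.
my @as_strings = sort @numbers;
print "@as_strings\n";    # 1 10 100 9

# With <=> the values are compared as numbers.
my @as_numbers = sort { $a <=> $b } @numbers;
print "@as_numbers\n";    # 1 9 10 100
```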
List mapping
 
The map function is used to transform lists element-wise. You can go through each element of a list and perform a function on it and a new list of the new values will be created.

To use map, you name the array that you want to put the result into (@new_numbers in the code below), then you type an equals sign, then the word "map" and then what you want to do to each element in curly brackets. $_ refers to each individual element in turn, kind of like x in a mathematical equation. The code below takes each element and multiplies it by two. After the curly brackets you type the list that you want to perform the operation on. You can either write the list out or give an array like I've done.
my @numbers = (1, 2, 3, 4, 5);
my @new_numbers = map { $_ * 2 } @numbers;
print "@new_numbers\n";
Which will print:
2 4 6 8 10
You can also apply map to text:

my @text = ("this", "is", "a", "list");
my @new_text = map { $_ . ":" } @text;
print "@new_text\n";
Which will print:
this: is: a: list:
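The block can also call a built-in function on each element. As a rough sketch, here are two more transformations using uc and length. The array names @shouted and @lengths are only for illustration, and the original @text is left unchanged.

```perl
use strict;
use warnings;

my @text = ("this", "is", "a", "list");

# Upper-case every word.
my @shouted = map { uc($_) } @text;
print "@shouted\n";    # THIS IS A LIST

# Replace every word with its length.
my @lengths = map { length($_) } @text;
print "@lengths\n";    # 4 2 1 4

# map builds a new list, so the original array is untouched.
print "@text\n";       # this is a list
```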

Grep

Grep is similar to map, although instead of applying a change to each element, it evaluates an expression for each element. If the result is true, the original value is put into a new list; if the result is false, the original value is filtered out.

Again you start with the array you want your new list to be put into (@multiples_of_two), then an equals sign and then the evaluation you want is put in curly braces. The evaluation must create an answer that is either true or false. Then you give the list, either written out as a list or as an array. The code below goes through the list of 1-10 and puts the multiples of two into a new list.

my @numbers = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
my @multiples_of_two = grep { $_ % 2 == 0 } @numbers;
print "@multiples_of_two\n";
This will print:
2 4 6 8 10
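The test in the curly braces doesn't have to be arithmetic. Here is a sketch that filters words with a regular expression, and also assigns the result of grep to a scalar, which gives the number of matches instead of the matches themselves. The word list is made up for the example.

```perl
use strict;
use warnings;

my @words = ("apples", "bananas", "cherries", "kiwi");

# Keep only the words that contain the letter "a".
my @with_a = grep { /a/ } @words;
print "@with_a\n";    # apples bananas

# Assigned to a scalar, grep returns how many elements matched.
my $count = grep { /a/ } @words;
print "$count\n";     # 2
```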

10 comments:

  1. I assume that a few typos have crept in here. The functions push(), pop(), shift() and unshift() work on arrays, not lists.

    The Perl documentation is very careful to distinguish between lists and arrays. For example, the argument to pop() (see http://perldoc.perl.org/functions/pop.html) is described as an array, whereas the argument to sort() (see http://perldoc.perl.org/functions/sort.html) is described as a list.

    See Mike Friedman's blog post at http://friedo.com/blog/2013/07/arrays-vs-lists-in-perl for a great explanation of the differences between arrays and lists.

    Replies
    1. I've updated now that I know you can use an array or list if the function works on a list, but you can only use an array if the function works on an array.

  2. Hi Emma,

    I am writing something along those lines for Brazilian people.
    I will write about array operations: pop, push, shift, unshift
    I think your explanation is great!

    perl++++++

    Replies
    1. If you want to see how it's going, here is the url:

      https://github.com/hernan604/Programando--Perl--Moderno

    2. Hi Hernan, I don't speak a word of Portuguese but it's looking really good, you've got a lot of stuff covered :-)

  3. You may find it easier to read some of that code if you add some whitespace in there. For instance, most people will find:

    map { $_ * 2 } @numbers

    easier to read than:

    map{$_*2}@numbers

    Obviously I believe that the common characterization of Perl as "line noise" is unwarranted, but there certainly is a fair amount of punctuation in the language, and it helps to visually separate the different elements, even when the language doesn't require it.

    Keep up the great work! I hope one day people will look at this blog for inspiration on how to teach Perl to new folks.

  4. You can use `map` to filter by returning '()' in place of element, for example `map { $_%2==0 ? $_ : () } @numbers`.

    Replies
    1. I'm not really sure what the question mark and the colon represent, can someone explain? Thanks

    2. Emma, that is the ternary operator. The way I think about it is as a streamlined if-then-else, with the added bonus that it returns the "then" or the "else." In Jakub's example, if divisible by 2 return it else return nothing.
      Thanks for your posts. I think they are very clear and helpful.

  5. Oddly enough, you CAN index elements of a list (as well as array) by simple subscripting, AS LONG AS you enclose the list in parens:

    my $name = qw(zero one two)[1];
    say $name;

    prints:
    one

    You can even slice! Try the following:

    my @a = ("one", "two", "three")[2,1]; say "@a";

    As you might expect, this prints:
    three two

    Thanks for the useful article.
