Tuesday 28 January 2014

"Strings"

A string is a sequence of zero or more characters; a string with zero characters is known as the empty string. As with numbers, the maximum length of a string is limited only by your computer's memory, which is probably far larger than any real string you're going to work with.

These are all examples of strings:

"Hello"
""
"This is a string"
"This 1 is a 2 string 3 including 4 some 5 numbers 8907"
"27195"
""
"مرحبا"

Perl has full support for Unicode (basically any letter, symbol or number in any language you can think of), but if your program's source code itself contains characters outside the ASCII range (in variable names, string literals or other bits of code), you'll need to use the pragma

use utf8;

I think source code isn't treated as Unicode by default for historical reasons: when Perl was written, there was only a need to use ASCII characters.
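To make the pragma a bit more concrete, here is a minimal sketch, assuming the source file is saved as UTF-8. The variable names are just for illustration, and the binmode call is an extra step (not covered above) so that the characters are encoded properly when printed.

```perl
use strict;
use warnings;
use utf8;                              # the source code below contains non-ASCII characters

binmode(STDOUT, ':encoding(UTF-8)');   # encode characters as UTF-8 on output

my $greeting = "مرحبا";
my $chars    = length $greeting;       # counts characters, not bytes

print "$greeting has $chars characters\n";
```

With the pragma in place, length counts the five Arabic letters as five characters rather than counting the bytes used to store them.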


Quotes

Strings are surrounded by either single or double quotes, or by delimiters that imply quotes (more on these later). These quotes and delimiters are not part of the string itself; they just indicate that whatever is inside should be treated as a string.


What is the difference between having single and double quotes?

Single quotes mean that whatever is inside the quotes is taken literally, so newline characters and other control characters are not interpreted as control characters; they are just printed out exactly as typed:

print 'Hello everyone\n';

will print:
Hello everyone\n
which is probably not what you want to see in most cases.

The only character that is interpreted is the backslash. This is to enable you to use a single quote in a string. For example:

'I\'m a Perl programmer'

The backslash stops the string from ending at the single quote that is being used as an apostrophe. You can use a single backslash in your string as long as it isn't immediately before the closing quote (a backslash followed by any other character except another backslash is kept as it is). If you do want a backslash at the end of a string, you need to put two in a row. I would say it is advisable to always write two backslashes in a row in case this kind of thing happens, as two backslashes will always print just one.

'This string contains one backslash \\'
This will print:
This string contains one backslash \
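The single-quote rules above can be tried out with a short sketch like this one. The variable names and the example strings are made up for illustration.

```perl
use strict;
use warnings;

my $literal    = 'Hello everyone\n';          # backslash and n stay as two characters
my $apostrophe = 'I\'m a Perl programmer';    # \' gives a single quote
my $trailing   = 'ends with a backslash \\';  # \\ gives one backslash
my $middle     = 'C:\temp';                   # a lone backslash mid-string is kept as-is

print "$literal\n$apostrophe\n$trailing\n$middle\n";
```

The print statement uses double quotes only so that each value ends up on its own line; the values themselves were all built with single quotes.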

Double quotes interpret special symbols so you can include all sorts of control characters as well as displaying characters through octal and hex representations. You can also embed other variables into the string.

my $name = "Emma\n";           # holds 'Emma' followed by a newline
my $string = "Hello $name";   # holds 'Hello Emma' followed by that newline
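As a rough sketch of the control characters and the octal and hex representations mentioned above (the variable names and values are only examples):

```perl
use strict;
use warnings;

my $name     = "Emma";
my $greeting = "Hello $name\n";   # $name is interpolated, \n becomes a newline
my $tabbed   = "a\tb";            # \t is a tab character
my $hex      = "\x41";            # hex 41 is the letter A
my $octal    = "\101";            # octal 101 is also the letter A

print $greeting;
print "$tabbed $hex $octal\n";
```

None of these escapes would be interpreted inside single quotes.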

Delimiters are useful if you want to use quotes inside your string. You use them by first typing 'q' for a single-quoted string or 'qq' for a double-quoted string. Then you type the character that you want to be the delimiter, write out the string with no quotes, and add the delimiter character again at the end.

I thought a delimiter could be any character, but I tried to use letters and numbers and they don't seem to work (as the first comment below points out, they do work if you put a space between the 'q' or 'qq' and the delimiter). Using these wouldn't be very advisable anyway because it doesn't look very clear. Any type of punctuation works, and it's usually best to pick a character that doesn't appear in your string, because otherwise you have to use a backslash to escape your delimiter character and all this can start to look a bit confusing.
For example:

'Hello World!'
is equal to all of these: 
q(Hello World!)
q{Hello World!}
q.Hello World!.

"Hello World!"
is equal to all of these: 
qq/Hello World!/
qq[Hello World!]
qq?Hello World!?

I told a lie: you don't always end the string with exactly the same character you started it with. If you notice, the bracket-delimited strings always start with a left bracket and end with the matching right one. Interestingly, you can use a close bracket at both ends but you can't use an open bracket at both ends. Maybe that's because when Perl sees an open bracket it expects the matching close bracket and gets very confused and upset if you don't give it one.

You can also use punctuation marks like $, @ and % as delimiters. This is also very inadvisable because these are commonly used to mark scalars, arrays and hashes, and you don't want to confuse yourself or the other poor people who have to look at your code.
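Here is a small sketch of why q and qq are handy when the string itself contains quotes. The strings and variable names are invented for the example.

```perl
use strict;
use warnings;

my $name   = "Emma";
my $single = q{It's a "quoted" word};      # like single quotes: no interpolation
my $double = qq{$name said "It's fine"};   # like double quotes: $name is interpolated
my $nested = q(outer (inner) outer);       # balanced brackets can nest without escaping
my $escape = q/either\/or/;                # the delimiter inside the string needs a backslash

print "$single\n$double\n$nested\n$escape\n";
```

Picking a bracket pair as the delimiter avoids most of the backslashes, which is probably why it is such a common choice.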

String Manipulations

There are many many many things you can do with a string in Perl, far too many to list in this post. Some of the main ones and ones that I think would be useful are as follows:

Concatenation

To concatenate (join together) two or more strings you use the . operator.

my $string1 = "apple";
my $string2 = "pear";
my $string3 = "peach";
my $all = $string1.$string2.$string3;
print $all; 
This will print:
applepearpeach

You don't have to just concatenate variable names, you can also add in new strings:

print $string1.", ".$string2." and ".$string3;

will print:
apple, pear and peach
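If you want to build a string up in stages, Perl also has the .= operator, which appends to an existing string. It isn't covered above, so treat this as a small sketch with example variable names:

```perl
use strict;
use warnings;

my $string1 = "apple";
my $string2 = "pear";
my $string3 = "peach";

my $list = $string1 . ", " . $string2 . " and " . $string3;

my $line = "Fruit: ";
$line .= $list;     # .= appends to the existing string
$line .= "\n";

print $line;
```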

Repetition

To repeat a string, you put the string in quotes, then x, then the number of times to repeat the string.

print 'Hello' x 5;

This will print:
HelloHelloHelloHelloHello
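A few more uses of the x operator, as a quick sketch (the variable names are just examples):

```perl
use strict;
use warnings;

my $cheer  = 'Hello' x 5;   # the example from above
my $rule   = '-' x 20;      # handy for drawing a separator line
my $spaced = 'ab ' x 3;     # any characters, including spaces, are repeated
my $none   = 'Hello' x 0;   # repeating zero times gives the empty string

print "$cheer\n$rule\n$spaced\n";
```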

Interpolation

Interpolation means that you can include a variable in a string and if you put that string in double quotes, the value of the variable will be printed. You must use double quotes for interpolation, because if you use single quotes, the string will be printed literally and you will just see the variable name.

my $name = "Emma";
print "The variable \$name has the value $name";

This will print:
The variable $name has the value Emma
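To see single and double quotes side by side, here is a minimal sketch. The ${fruit} form with braces isn't mentioned above; it is a way of showing Perl where the variable name ends when it is followed directly by more letters.

```perl
use strict;
use warnings;

my $name  = "Emma";
my $fruit = "apple";

my $literal = 'Hello $name';        # single quotes: stays as the text $name
my $interp  = "Hello $name";        # double quotes: becomes Hello Emma
my $plural  = "two ${fruit}s";      # braces mark where the variable name ends
my $escaped = "\$name is $name";    # \$ gives a literal dollar sign

print "$literal\n$interp\n$plural\n$escaped\n";
```

Without the braces, Perl would look for a variable called $fruits instead.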


And that's the end of my very long post, congratulations if you've made it this far!

5 comments:

  1. "I thought a delimiter could be any character, but I tried to use letters and numbers and they don't seem to work"

    Actually, they do work. You just need to insert a space between the 'q' (or 'qq') and your delimiter.

    $ perl -E'say q xhellox'
    hello

    1. Do not ever use this trick in production code :-/

      In fact, please forget that I even mentioned it.

    2. Don't worry, I won't be doing that :-)

  2. I am also learning perl; will stay tuned in. :>
