Arrays of Strings

Revision as of 12:44, 27 November 2007



Introduction

This section is all about arrays of Strings. This section assumes that you have already read the chapter on Strings and the chapter on Arrays. In addition to teaching you the basics of arrays with Strings, I'll talk about indexing errors (such as off-by-one errors) and how infuriating they can be when using arrays of Strings.

   


What Are Arrays of Strings?

As the introduction said, we are assuming that you know what an array is and what a String is. As a quick refresher, an array is a contiguous group of values of a single data type, and a String is effectively (if not actually) an array of characters. So, where is this going?

An array of Strings is your first look at the more general concept of multi-dimensional arrays. An array of Strings is a contiguous group of Strings, and each String is a contiguous group of characters. In that sense, an array of Strings is really a two-dimensional array: an array of arrays of characters. Even though you don't actually use Strings as arrays (for example, you get individual characters with charAt() instead of just an index), much of what you learn here will apply to real two-dimensional arrays. But that's for another course.
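To make the "array of arrays of characters" idea a bit more concrete, here is a minimal sketch in Java. The class name TwoLevels, the helper charIn, and the sample Strings are made up for this example; the only real library calls are charAt() and length().

```java
public class TwoLevels {
    // Returns the character at position col of the String stored in slot row.
    // (Hypothetical helper, just for this illustration.)
    static char charIn(String[] arr, int row, int col) {
        return arr[row].charAt(col);
    }

    public static void main(String[] args) {
        String[] arr = {"Winnipeg", "lolcats"};

        // The array index chooses the String; charAt() chooses the character.
        System.out.println(arr[0]);             // Winnipeg
        System.out.println(charIn(arr, 0, 0));  // W
        System.out.println(charIn(arr, 1, 3));  // c

        // Two different "lengths": one for the array, one for each String.
        System.out.println(arr.length);         // 2
        System.out.println(arr[1].length());    // 7
    }
}
```

Note that the array uses arr.length (no parentheses) while each String uses length() (with parentheses). Mixing up the two is an easy way to end up with an indexing error.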

As an aside, you could go crazy with this and make an array of arrays of Strings, which would be an array of arrays of arrays of characters, but you will probably never have to deal with arrays of more than two dimensions in any of your assignments in any CompSci course. Be glad -- multi-dimensional arrays can be a real pain.
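If you are curious what an array of arrays of Strings might look like, here is a small sketch. Everything in it (the class name ArrayOfArrays, the helper totalChars, and the contents of grid) is invented for illustration, and you do not need any of it for this section.

```java
public class ArrayOfArrays {
    // Adds up the lengths of every String in a two-dimensional String array.
    // (Hypothetical helper, just for this illustration.)
    static int totalChars(String[][] grid) {
        int total = 0;
        for (int i = 0; i < grid.length; i++) {
            for (int j = 0; j < grid[i].length; j++) {
                total += grid[i][j].length();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        String[][] grid = {
            {"String1", "String2"},
            {"Winnipeg", "lolcats"}
        };

        // Two indexes choose the String; charAt() then chooses the character.
        System.out.println(grid[1][0]);            // Winnipeg
        System.out.println(grid[1][0].charAt(2));  // n
        System.out.println(totalChars(grid));      // 29
    }
}
```

That is three levels of indexing to reach a single character, which is part of why these things can be a pain.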

How Do Arrays of Strings Work?

Even though a few of the concepts presented here aren't exactly the same as real multi-dimensional arrays, they are very, very similar.

We will begin with the first, big array "containing" each of these Strings. Here is where the definition of an array as a contiguous "block" of a certain data type falls apart. The actual Strings being stored are not in this array -- each of them is somewhere else in memory. Each slot of this big array just contains a number, and that number is the memory address of the String that is "contained" in that slot. In other words, this big array is an array of references to Strings. You will learn more about references in a later course, but for now, just know that your Strings are accessed via those references. You will still work with your arrays of Strings as though they are contiguous groups of Strings, but remember that they really are not.
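One way to see that the slots hold references rather than the Strings themselves is to make two slots refer to the same String. The following is only a sketch: the class name SharedReference and the helpers build and sameObject are made up for this example. It relies on the fact that == on Strings compares references, while equals() compares the characters.

```java
public class SharedReference {
    // Builds a small array where slots 0 and 1 share one String object
    // and slot 2 refers to a separate String with the same characters.
    // (Hypothetical helper, just for this illustration.)
    static String[] build() {
        String[] arr = new String[3];
        String city = new String("Winnipeg");
        arr[0] = city;
        arr[1] = city;                    // same reference as arr[0]
        arr[2] = new String("Winnipeg");  // a different object, same text
        return arr;
    }

    // True when two slots refer to the very same String object.
    static boolean sameObject(String[] arr, int i, int j) {
        return arr[i] == arr[j];
    }

    public static void main(String[] args) {
        // A brand-new array of Strings contains no Strings at all yet.
        String[] empty = new String[3];
        System.out.println(empty[0]);                // null

        String[] arr = build();
        System.out.println(sameObject(arr, 0, 1));   // true
        System.out.println(sameObject(arr, 0, 2));   // false
        System.out.println(arr[0].equals(arr[2]));   // true
    }
}
```

If the Strings really lived inside the array, slots 0 and 1 could not be "the same" String. They can be because each slot only stores where its String lives.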

And now, we will look at each String. Hilariously, this is where the now-shattered definition of arrays as contiguous groups of data puts itself back together. We don't have individual characters scattered throughout memory and an array of references to them making up the String -- we actually do have contiguous blocks of characters.
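As a rough illustration of a String being one block of characters, the sketch below copies a String into a real char array with toCharArray() and walks through it slot by slot. The class name ContiguousChars and the helper lettersOf are made up for this example. Keep in mind that toCharArray() hands you a copy, and that how a String stores its characters internally is a detail of the Java implementation.

```java
public class ContiguousChars {
    // Copies the characters of s into a char array, one slot per character.
    // (Hypothetical helper, just for this illustration.)
    static char[] lettersOf(String s) {
        return s.toCharArray();
    }

    public static void main(String[] args) {
        char[] letters = lettersOf("lolcats");

        // Unlike an array of Strings, each slot here holds the actual
        // character, not a reference to something stored elsewhere.
        for (int i = 0; i < letters.length; i++) {
            System.out.println("letters[" + i + "]: " + letters[i]);
        }
    }
}
```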

Now you see why real multi-dimensional arrays can be such a pain. If your mind has not been completely blown yet, you're probably a genius.

Now, I like pictures, since words are confusing and make my thinking-meats hurt. So, here I will show you, in the world's finest text-based art, the wrong-yet-practical way of thinking of arrays of Strings, and the correct-but-confusing-and-unnecessary way of thinking of arrays of Strings.

Here is how an array of Strings seems to be laid out in memory:

// Note that it seems like the Strings are actually IN the array.

arr[0]: | "String1" |
arr[1]: | "String2" |
arr[2]: | "String3" |
arr[3]: | "String4" |
arr[4]: | "String5" |
arr[5]: | "Winnipeg" |
arr[6]: | "lolcats" |
arr[7]: | "This wiki is awesome" |

This is how we use arrays of Strings, but note that as a picture of what is really in memory, this is WRONG.

Here is how an array of Strings is actually laid out in memory:

// The array actually contains memory addresses (look below)
// These memory addresses send you to each individual String

memory[5]: "This wiki is awesome"
...
memory[10]: "String4"
...
memory[15]: "String5"
...
memory[30]: arr[0]: 75
memory[31]: arr[1]: 246
memory[32]: arr[2]: 58
memory[33]: arr[3]: 10
memory[34]: arr[4]: 15
memory[35]: arr[5]: 1024
memory[36]: arr[6]: 1337
memory[37]: arr[7]: 5
...
memory[58]: "String3"
...
memory[75]: "String1"
...
memory[246]: "String2"
...
memory[1024]: "Winnipeg"
...
memory[1337]: "lolcats"
...

This is how an array of Strings actually looks in memory, but we do not use it this way, and it's really confusing and also makes my thinking-meats hurt. Note that I pulled the memory addresses out of nowhere; there would actually be more than five memory addresses between "This wiki is awesome" and "String4", since the former String is 20 characters long and needs a lot more than five bytes of memory.
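If you want to play with the second picture, here is a toy model of it in Java. This is only a pretend version of memory, not how Java really works inside: the class name MemoryPicture, the MEMORY map, the ARR array, and the helper lookUp are all invented for this sketch, and the addresses are the same made-up numbers used in the picture.

```java
import java.util.HashMap;
import java.util.Map;

public class MemoryPicture {
    // Pretend memory: address -> the String stored at that address.
    static final Map<Integer, String> MEMORY = new HashMap<>();

    // Pretend array of Strings: each slot holds an address, not a String.
    static final int[] ARR = {75, 246, 58, 10, 15, 1024, 1337, 5};

    static {
        MEMORY.put(5, "This wiki is awesome");
        MEMORY.put(10, "String4");
        MEMORY.put(15, "String5");
        MEMORY.put(58, "String3");
        MEMORY.put(75, "String1");
        MEMORY.put(246, "String2");
        MEMORY.put(1024, "Winnipeg");
        MEMORY.put(1337, "lolcats");
    }

    // Step 1: read the address in the slot. Step 2: follow it to the String.
    static String lookUp(int index) {
        int address = ARR[index];
        return MEMORY.get(address);
    }

    public static void main(String[] args) {
        for (int i = 0; i < ARR.length; i++) {
            System.out.println("arr[" + i + "] -> address " + ARR[i]
                    + " -> \"" + lookUp(i) + "\"");
        }
    }
}
```

When you write arr[5] in a real program, Java does both steps for you, which is why the first picture is good enough for everyday use.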