This book is designed to be accessible and practical, with an emphasis on useful, applicable information. To this end, each topic is introduced via one or more case studies, which help to motivate the need for the relevant ideas and tools. Practical examples are used to demonstrate the most important points, and minute detail is deliberately avoided. Separate reference chapters then provide a more structured and detailed description of a particular technology, which is more useful for finding specific information once the big picture has been obtained. These reference chapters are still not exhaustive, so pointers to further reading are also provided.
The main topics are organized into four core chapters, with supporting reference chapters, as described below.
Chapters 3 and 4 provide support in the form of reference material for HTML and Cascading Style Sheets.
Chapter 6 provides reference material for XML and the Document Type Definition language.
Chapter 8 provides reference material for SQL, including additional uses of SQL for creating and modifying relational databases.
Chapter 10 provides reference material for R and Chapter 11 provides reference material for regular expressions, a language for processing text data.
Chapter 12 provides a brief wrap-up of the main ideas in the book.
There is an overall progression through the book from writing simple computer code with straightforward computer languages to more complex tasks with more sophisticated languages. The core chapters also build on each other to some extent. For example, Chapter 9 assumes that the reader has a good understanding of data storage formats and is comfortable writing computer code. Furthermore, examples and case studies are carried over between different chapters in an attempt to illustrate how the different technologies need to be combined over the lifetime of a data set. There are also occasional “flashbacks” to a previous topic to make explicit connections between similar ideas that recur in different settings. In this way, the book is set up to be read in order from start to finish.
However, every effort has been made to ensure that individual chapters can be read on their own. Where necessary, figures are reproduced and descriptions are repeated so that it is not necessary to jump back and forth within the book in order to acquire a complete understanding of a particular section.
Much of the material in this book requires practice to be fully understood. The reader is encouraged to make use of the exercises on the book's web site.
Paul Murrell
This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 New Zealand License.