The Linux Enterprise Cluster:
Build a Highly Available Cluster with Commodity Hardware and Free Software
By: Karl Kopper
Reviewed by Noel Davis
The Linux Enterprise Cluster is well written and enjoyable to read. It is a how-to book on creating a Linux-based cluster, with real-world examples.
Karl Kopper converted a wholesale food distribution company's computer system to a Linux Enterprise Cluster. Kopper then used that experience to describe in detail the creation of each of the subcomponents of the cluster, how to create a cluster from the subcomponents, and finally how to maintain and monitor the cluster.
Kopper defines a cluster as a collection of interconnected computers used as a single unified computing resource.
Part One: Introduction to the Linux Server
The beginning chapters cover basic Linux topics including:
- Controlling services with init
- Run levels
- Using ipchains and iptables to control network traffic
- Compiling a Linux kernel
- Configuring LILO and GRUB
This material is explained clearly and concisely. While it did not cover anything that I had not seen before, it was a nice refresher on iptables. Anyone who is not familiar with the material will find it easy to follow: the coverage is complete without going into too much detail.
Part Two: High Availability Linux
Chapter four covers rsync, which is one of my favorite tools. It describes in detail how to use ssh and rsync together to synchronize data from one server to another. What I liked best about this chapter was the way it not only explained how to do the tasks, but also explained what was going on under the surface.
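To give a flavor of the kind of task the chapter covers, here is a minimal sketch (my own, not taken from the book) that builds an rsync-over-ssh command in Python. The function name, host name, and paths are hypothetical; the rsync options used (-a, --delete, -e, --dry-run) are standard ones.

```python
import shlex


def build_rsync_command(source_dir, remote_host, dest_dir, dry_run=False):
    """Return the argument list for pushing a directory to a remote host."""
    # -a preserves permissions, times, and symlinks; --delete removes files
    # on the destination that no longer exist on the source; -e ssh tells
    # rsync to use ssh as its transport.
    cmd = ["rsync", "-a", "--delete", "-e", "ssh"]
    if dry_run:
        # Report what would change without touching the destination.
        cmd.append("--dry-run")
    # A trailing slash on the source means "copy the directory's contents".
    cmd.append(source_dir.rstrip("/") + "/")
    cmd.append(f"{remote_host}:{dest_dir}")
    return cmd


if __name__ == "__main__":
    # Hypothetical host and path, printed rather than executed.
    argv = build_rsync_command("/var/www", "node2.example.com", "/var/www",
                               dry_run=True)
    print(shlex.join(argv))
```

The list can be handed to subprocess.run once the dry run shows the expected changes. Passing a list rather than a single string avoids shell quoting problems.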
Chapter five tells us how to use a tool called systemimager to create a copy of one of the cluster servers that will be a golden client. A golden client is a customized system that can be replicated to create other cluster managed servers. The chapter then describes how to use this golden client to create new machines that are clones of the original machine.
The sixth chapter talks about the Heartbeat system and how to use it to create a pair of high availability servers that will continue providing a network service even if one of the machines goes offline or is powered off.
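The basic idea can be sketched in a few lines of Python. This is only an illustration of a heartbeat timeout under my own assumptions, not the Heartbeat package's implementation; the class and attribute names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BackupNode:
    """Backup server that takes over when the primary goes silent."""
    deadtime: float               # seconds of silence before takeover
    last_heartbeat: float = 0.0   # time the last heartbeat was heard
    owns_service: bool = False    # True once this node runs the service

    def receive_heartbeat(self, now):
        # Record that the primary was alive at time `now`.
        self.last_heartbeat = now

    def check(self, now):
        # Take over if the primary has been silent longer than deadtime.
        if not self.owns_service and now - self.last_heartbeat > self.deadtime:
            self.owns_service = True
        return self.owns_service


if __name__ == "__main__":
    backup = BackupNode(deadtime=10.0)
    backup.receive_heartbeat(100.0)
    print(backup.check(105.0))  # primary heard 5 seconds ago
    print(backup.check(111.0))  # primary silent for 11 seconds
```

In this sketch the backup keeps the service once it has taken over. Whether the service should move back when the primary returns is a policy decision that the sketch leaves out.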
Chapter seven walks us through setting up a pair of machines to fail over using Heartbeat.
Chapter eight discusses multiple options for configuring Heartbeat, including load sharing and operator alerts.
The ninth chapter talks about STONITH (Shoot The Other Node In The Head) and using the plugin ipfail.
STONITH is a component of the Heartbeat software that is used to reset the power on the failed node of the Heartbeat pair. The chapter describes multiple devices that can be used for STONITH, including my favorite, the STONITH Meatware Device. Yes, you can have your very own living, breathing STONITH device for only the low, low cost of their salary (but the hardware versions may be less expensive and more reliable).
ipfail is a component that is used to detect a network failure. If the other node in the pair has not had a network failure, it will take over as the primary node.
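The decision described here can be sketched as a comparison of what each node can still reach on the network. The following is a simplified illustration under my own assumptions (each node tests a shared set of ping targets, and the node that reaches more of them wins); it is not ipfail's actual code, and the function name is hypothetical.

```python
def backup_should_take_over(primary_reachable, backup_reachable):
    """Decide failover from the ping targets each node can still reach.

    Both arguments are sets of target names. The backup takes over only
    when it can reach strictly more targets than the primary, so a failure
    that affects both nodes equally does not trigger a failover.
    """
    return len(backup_reachable) > len(primary_reachable)


if __name__ == "__main__":
    # The primary has lost its network connection; the backup has not.
    print(backup_should_take_over(set(), {"gateway"}))
    # Both nodes still see the gateway, so nothing should change.
    print(backup_should_take_over({"gateway"}, {"gateway"}))
```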
Part Three: Cluster Theory and Practice
Chapter ten is a short overview of the steps and decisions needed to build a Linux Enterprise Cluster. It describes an LVS-NAT and an LVS-DR cluster.
In chapter eleven the differences between LVS-NAT and LVS-DR are discussed, and a large amount of terminology is defined.
Chapter twelve goes into detail on creating an LVS-NAT cluster and walks us through the process using step-by-step instructions written as a recipe. I really liked the way Kopper did this. It provides a condensed version that could be referenced later when the actual installation is being performed. This section can also be used to create a checklist of all the steps needed to install the cluster.
Chapters thirteen and fourteen describe how an LVS-DR cluster works and how it balances the load between nodes in the cluster.
Chapter fifteen adds the Heartbeat package to an LVS-DR cluster. This allows any failure points in the cluster, such as the LVS-Director, to be placed on a pair of high availability servers.
The sixteenth chapter adds NFS (the Network File System) to the cluster, talks about how to handle file locking in a cluster, and how to configure NFS inside a cluster to maximize performance.
Part Four: Maintenance and Monitoring
This section starts in chapter seventeen and tells us about SNMP (the Simple Network Management Protocol) and how it can be used with the mon monitoring system to watch the cluster and notify administrators of any problems. Mon can use email, SMS messages, or a custom script to notify administrators. The chapter contains a recipe that describes how to set mon up for an example situation and then provides a more real-world example. One interesting part of this real-world example was using mon to cause a STONITH event on a failed node.
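The monitor-and-alert pattern that mon implements can be illustrated with a short Python sketch. This is my own simplification, not mon's configuration format or code; the function name, service names, and probes are hypothetical.

```python
def run_checks(checks, alert):
    """Run each probe and call alert(name) for every service that fails.

    checks maps a service name to a zero-argument probe that returns True
    when the service is healthy. A probe that raises counts as a failure.
    Returns the list of failed service names.
    """
    failed = []
    for name, probe in checks.items():
        try:
            healthy = probe()
        except Exception:
            healthy = False
        if not healthy:
            failed.append(name)
            alert(name)
    return failed


if __name__ == "__main__":
    # Stand-in probes; real ones might test a TCP port or query SNMP.
    checks = {"web": lambda: True, "nfs": lambda: False}
    run_checks(checks, lambda name: print(f"ALERT: {name} is down"))
```

The alert callback is where email, SMS, or a custom script would plug in.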
Chapter eighteen talks about the Ganglia software package. Ganglia is a system load and information visualization tool. A complete description of how to install and use Ganglia is included.
Chapter nineteen discusses various system administration tasks, focusing on how they are performed in a clustered environment. These tasks include:
- managing users
- rebooting nodes for maintenance
- turning off telnet to a node
- managing email
- creating a batch job scheduling system
The section concludes in chapter twenty, which is a high-level overview of the material covered earlier in the book.
This is one of the best computer books I have ever read. The writing is excellent, the subject matter wonderful, the level of detail just right for most people. If I had a rating system it would get all five stars.
Even if a Linux Enterprise cluster is not in your future there are still plenty of gems in the book such as setting up a pair of high availability servers using Heartbeat, creating a golden client using systemimager, or visualizing your system load using Ganglia. Anyone who administers Linux machines will find The Linux Enterprise Cluster useful and informative.
Table of Contents:
PART ONE: INTRODUCTION TO THE LINUX SERVER
Chapter 1: Starting the Services
Chapter 2: Handling the Packets
Chapter 3: Compiling the Kernel
PART TWO: HIGH AVAILABILITY LINUX
Chapter 4: Synchronizing Servers with Rsync and SSH
Chapter 5: Cloning Systems with SystemImager
Chapter 6: Heartbeat Introduction and Theory
Chapter 7: A Sample Heartbeat Configuration
Chapter 8: Heartbeat Resources and Maintenance
Chapter 9: Stonith and Ipfail
PART THREE: CLUSTER THEORY AND PRACTICE
Chapter 10: The Ideal Cluster
Chapter 11: The Linux Virtual Server Introduction and Theory
Chapter 12: The Linux Virtual Server Network Address Translation Cluster
Chapter 13: The Linux Virtual Server Direct Routing Cluster
Chapter 14: LVS and Netfilter
Chapter 15: The High Availability Cluster
Chapter 16: The Cluster File System
PART FOUR: MAINTENANCE AND MONITORING
Chapter 17: Simple Network Management Protocol and Mon
Chapter 18: Batch Job Scheduling with Ganglia
Chapter 19: Cluster Maintenance and Operation
Chapter 20: Architecture of the Linux Enterprise: A Pictography Glossary
Appendix A: Downloading Software
Appendix B: Introduction to VI
Appendix C: Tcpdump
Appendix D: Adding Network Interface Cards to Your System
Appendix E: Compiling Heartbeat from CVS
Appendix F: Compiling and Installing the Perl SNMP Package
Appendix G: Sample Mon Init Script for Red Hat
Appendix H: Kernel Options