So you want to self-host?
This article is intended as a high-level overview of the issues relating to self-hosting of Internet services. Rather than use the services of a web hosting provider, you may want to self-host some or all of the pertinent Internet services. Reasons to support this decision include, but are not limited to:
* Self-hosting may be more cost-effective, or its extra cost may compare favorably to the benefits gained.
* You may need services or applications that a hosting provider doesn’t support.
* Total control (and responsibility) over the self-hosted services.
* Self-hosting may be the only choice if the security requirements are stringent enough.
* Some people may want to self-host simply to gain experience.
* If desired, the resale of services may be more straightforward when self-hosting.
There are a number of issues that must be considered before the idea of self-hosting should be entertained.
Self-hosting implies that the services are self-supported. This isn’t much of a consideration for a personal website, but a significant issue for a business-related site and particularly so if services are resold. In general terms, the support overhead will increase with the number of users (particularly paying customers) and the number of software packages in active use, but with proper automation this effort can be offset.
Bandwidth is expensive. Except for well-funded businesses, really big pipes are probably not affordable. However, in many cases a business DSL or equivalent may suffice. While significant bandwidth requirements may make self-hosting prohibitive, a hybrid approach may be possible, where the bandwidth-intensive applications are commercially hosted. However, such an approach makes for an interesting business case.
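Whether a given line suffices can be estimated with some back-of-the-envelope arithmetic. The following is a minimal sketch, not a capacity-planning tool: the function names, the 80% usable fraction, and the 30-day month are assumptions made for illustration only.

```python
def pages_per_second(uplink_mbit_s, avg_page_kbytes, usable_fraction=0.8):
    """Rough number of page views per second an uplink can sustain.

    usable_fraction is an assumed allowance for protocol overhead and
    other traffic sharing the line; it is not a measured value.
    """
    usable_bits_per_s = uplink_mbit_s * 1_000_000 * usable_fraction
    bits_per_page = avg_page_kbytes * 1000 * 8
    return usable_bits_per_s / bits_per_page


def monthly_transfer_gb(uplink_mbit_s, utilization):
    """Data volume (in GB) moved in a 30-day month at a given average utilization."""
    seconds_per_month = 30 * 24 * 3600
    bytes_per_second = uplink_mbit_s * 1_000_000 / 8
    return bytes_per_second * utilization * seconds_per_month / 1e9


if __name__ == "__main__":
    # Hypothetical example: 1 Mbit/s uplink, pages averaging 100 kB.
    print(pages_per_second(1.0, 100))
    print(monthly_transfer_gb(1.0, 0.25))
```

As a sanity check, a 1 Mbit/s uplink that is saturated around the clock moves about 324 GB in a 30-day month. Real traffic is bursty, so the peak rate matters more than the monthly average.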
If the self-hosting includes DNS and email, redundancy is a must. This could imply a second site at a different physical location or backup services provided by a partner or ISP. For that matter, if the uptime requirements are stringent enough, redundant network connectivity should be strongly considered.
In addition to the support burden, it is mandatory to implement appropriate security measures, which in turn imply their own support burden. Good security is a process, rather than just a device or software package to be deployed. There should be an initial study of the particular security exposures, their potential remedies or mitigation, and an understanding of the damage that particular types of security breaches would incur. The design of the infrastructure should reflect this risk analysis. Once the system is operational, a certain amount of monitoring and maintenance will be necessary.
A full-featured infrastructure for self-hosting may consist of one or more network segments and a number of servers, routers, and firewalls. In addition to system and security administration, web design, and DNS and mail management, a fair level of network management skill may be an advantage.
In addition to the considerations listed above, in a business environment there will be any number of non-technical issues that must be addressed.
The typical services to be self-hosted are domain management, domain name service (DNS), email, and web services. In addition to these, there may be other non-standard services, like a public ftp server or chat services. In the following sections, general issues and ramifications of self-hosting specific services are discussed. This document also assumes that an operating system of the Unix family is used. It is certainly possible to self-host on a Windows platform, but it’s not the author’s preference.
Domain Management
Short of getting accredited as a registrar, the domain management can’t really be self-hosted. However, the domain management should always remain under the exclusive control of the individual or corporation that registered the domain. The author is convinced that this point cannot be emphasized enough.
Domain Name Service (DNS)
Many hosting providers allow customers only limited control over DNS management. If this is a problem, the DNS services for that domain can either be performed by a commercial DNS hosting provider or be self-hosted. Technically a domain requires but a single DNS (master/primary) server, but this is extremely poor practice. When self-hosting, one would typically run the master locally and use a slave/secondary that is either provided by an ISP or located at a second site. For individual users, a reciprocal arrangement with another individual may be possible. It is of paramount importance, however, that at least two DNS servers, in at least two different subnets and at least two different physical locations, are configured for the domain. It’s all about redundancy.
BIND, the DNS server software in almost universal use, has traditionally had a poor security record. The choices are limited to using different server software with a better security record (DJBDNS) or implementing security measures to detect and contain breaches. From a security perspective, DNS should run on a hardened and dedicated server.
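The subnet part of this rule is easy to check mechanically. The sketch below uses only the Python standard library and assumes IPv4 addresses and a /24 subnet size; both the function names and the prefix length are illustrative choices. It cannot tell whether two subnets are in different physical locations, which still has to be verified by hand.

```python
import ipaddress


def distinct_subnets(addresses, prefix_len=24):
    """Return the set of subnets (of the assumed prefix length) covering the addresses."""
    subnets = set()
    for address in addresses:
        # strict=False masks off the host bits instead of raising an error.
        subnets.add(ipaddress.ip_network(f"{address}/{prefix_len}", strict=False))
    return subnets


def has_subnet_redundancy(addresses, prefix_len=24):
    """True if the name server addresses span at least two different subnets."""
    return len(distinct_subnets(addresses, prefix_len)) >= 2


if __name__ == "__main__":
    # Documentation-range addresses used as placeholders for real name servers.
    print(has_subnet_redundancy(["192.0.2.1", "192.0.2.53"]))
    print(has_subnet_redundancy(["192.0.2.1", "198.51.100.1"]))
```

The first call reports no redundancy because both servers share one /24. The second passes because the addresses fall into different subnets.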
The initial installation of DNS will take a small amount of effort, but once DNS is operational it shouldn’t require a lot of maintenance.
Email Services
For many, email is a more critical service than a website. A limited amount of downtime can be tolerated for a website, but disruptions of email services are not acceptable at all.
A full-featured email infrastructure consists of a number of components. To begin with, the mail servers to be used and their order of precedence are configured in DNS. The server with the highest precedence delivers mail to local mailboxes. Backup servers can hold email while the primary mail server is unavailable and forward it once it is reachable again, attempt an alternate means of mail delivery (e.g. secondary mailboxes), or both.
A self-hosted mail server will run a mail transfer agent (MTA) of choice. It’s beyond the scope of this document to compare them, but the more common choices (in no particular order) are sendmail, postfix, qmail, and exim. They all have specific strengths and weaknesses; the deciding factors include how well a specific MTA interoperates with other administrative tools, the availability of spam filters and virus checkers, and the MTA’s security record. All of these MTAs primarily or exclusively support SMTP (Simple Mail Transfer Protocol). Other delivery and transfer protocols exist, but their discussion is also beyond the scope of this document.
For a recipient to actually retrieve email, a mail access service must be installed. The server software typically implements POP (Post Office Protocol) and, less frequently, IMAP (Internet Message Access Protocol). For a sufficiently large user base, authentication for sending and retrieving emails may be done via a database or LDAP, which adds more components.
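The order of precedence is expressed in DNS as a numeric preference on each MX record, and the record with the lowest number is tried first. The following sketch shows how a sending server would order the hosts and how to check whether a backup exists at all. The record type, function names, and host names are hypothetical, and the records are supplied by hand rather than looked up in DNS.

```python
from typing import NamedTuple


class MXRecord(NamedTuple):
    preference: int  # lower value means higher precedence
    host: str


def delivery_order(records):
    """Return the mail hosts in the order a sender would try them.

    Hosts with equal preference are sorted by name here only to keep the
    result deterministic; real senders typically pick among them at random.
    """
    return [r.host for r in sorted(records, key=lambda r: (r.preference, r.host))]


def has_backup_mx(records):
    """True if the zone lists at least two different mail hosts."""
    return len({r.host.lower().rstrip(".") for r in records}) >= 2


if __name__ == "__main__":
    zone = [
        MXRecord(20, "backup.example.net"),
        MXRecord(10, "mail.example.com"),
    ]
    print(delivery_order(zone))
    print(has_backup_mx(zone))
```

With these sample records, mail.example.com is tried first and backup.example.net only when the primary cannot be reached.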
The design and configuration of an email system can range from the straightforward to the complex, depending on the environment and objectives to meet. Having said that, most of the initial configuration effort will be spent on implementing virtual domains (if any), user authentication (particularly if a database or LDAP is used), and spam-proofing in general.
An extremely important design objective of any mail system is robustness. If you care to examine the DNS zones that web hosting providers set up for customers, you’ll find that in many cases there is no backup mail host at all.
Other than system maintenance to stay current with anti-spam and anti-virus measures, mail systems per se are relatively easy to maintain. However, a considerable amount of time can be spent following up on mail delivery problems and other complaints.
Web Services
For many, self-hosting is limited to the web hosting proper. Running a self-maintained web server is usually straightforward and amounts to setting up the preferred web server and whatever supporting modules or scripts are required. At the very least, the scripting languages Perl and PHP and a more or less extensive module library are likely candidates. For that matter, if a mainstream Linux distribution is used, the main effort may well consist of removing unwanted packages.
A self-hosted setup can get complex, however, if advanced functionality (like ASP, JSP, servlets, streaming media, and the like) is introduced. Many software packages require a database like MySQL or PostgreSQL.
Tools of the Trade
Self-hosting requires a wide range of skills. It is virtually a given that anybody who maintains such a system is comfortable manually editing the disparate configuration files; even so, the author is fond of the webmin package. With the very recent availability of virtualmin, an add-on module designed expressly for use by hosting providers, a standard self-host configuration is just a few web forms away.
In general, other tools that should be deployed include all kinds of monitoring software. If you self-host, you’ll probably want to automate log review and preserve historical data for disk, CPU, and network usage. You’ll also want to monitor the service status and possibly implement automatic restarts.
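Service status monitoring can start out very simple. The sketch below only tests whether a TCP port accepts connections; it is not a substitute for a real monitoring package. The service names, hosts, and ports are placeholders, and the restart itself is deliberately left out because it depends on the init system in use.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def failed_services(services, probe=port_is_open):
    """Return the names of services whose probe fails.

    services maps a name to a (host, port) tuple. The probe can be swapped
    out, e.g. for a protocol-aware check or for testing.
    """
    return [name for name, (host, port) in services.items() if not probe(host, port)]


if __name__ == "__main__":
    # Placeholder service list; adjust to the services actually hosted.
    services = {
        "dns": ("127.0.0.1", 53),
        "smtp": ("127.0.0.1", 25),
        "http": ("127.0.0.1", 80),
    }
    for name in failed_services(services):
        # A restart command or an alert would go here.
        print(f"{name} is not responding")
```

Run from cron every few minutes, a script like this gives a basic alarm. An open port does not prove that the service behind it is healthy, so protocol-level checks are a sensible next step.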
Since writing this article, a number of open-source packages like VHCS2 and ISPConfig have become available that simplify setting up a hosting server.
