24
In this section I cover the basic things you need to know about actually publishing a website on the internet.
I live in the United Kingdom and I use a UK based web hosting company. Specifically I use Heart Internet and I have done since I started this in 2015 and I recommend them (in fact they were recommended to me in the early days when I was looking at all this stuff for the first time).
I find Heart Internet to be very competent: they are able to do everything I need and they have no appreciable downtime. Their prices are fairly typical for the UK. The thing I liked was that the web space for the site was unlimited; it genuinely does seem to be unlimited. To give you some idea of what you need, my published website at the time of writing (Mar 2019) has the following statistics; these are for the whole website, not just the web template bit:
File type | Number | Total size
---|---|---
HTML files | 135 | 8.7 MB
CSS files | 100 | 1.5 MB
JavaScript &c. | 185 | 2.0 MB
Woff & fonts | 292 | 15.1 MB
Images | 1440 | 208.0 MB
Zip files | 83 | 15.1 MB
PDF files | 25 | 271.0 MB
Other | 438 | 3.1 MB
Folders | 767 | -
TOTAL | 3465 | 524 MB

Table 24.1 Practical Series website statistics
My whole website consists of 135 web pages (based on how many HTML files I have) and occupies 524 Mbytes on the host server.
A rough rule of thumb is thus:
A single web page uses about 4 MB of server space.
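The rule of thumb is just the total size divided by the number of HTML files. A minimal sketch of that arithmetic, using the figures from Table 24.1 (the function and variable names are mine, purely for illustration):

```python
# Figures taken from Table 24.1 (whole website, March 2019).
total_size_mb = 524   # total space used on the host server, in MB
html_pages = 135      # number of HTML files, taken as the page count


def average_page_size_mb(total_mb: float, pages: int) -> float:
    """Return the average server space used per web page, in MB."""
    return total_mb / pages


avg = average_page_size_mb(total_size_mb, html_pages)
print(f"{avg:.2f} MB per page")  # prints "3.88 MB per page"
```

Rounded up, that gives the 4 MB per page used in the estimates that follow.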
The Heart Internet packages are priced as follows (again Mar 2019):
I have the Business Pro account and with VAT it comes in at £180 per year.
I’m probably overpaying. The most basic package, with just 10 GB of web space, would hold 2560 of my web pages; in fact it would hold a website nearly 20 times the size of mine.
As for bandwidth, I have unlimited; the basic package allows 50 GB per month.
To give you some idea of what you should expect, my website has about 1600 page views per month and about 900 sessions per month.
My average page size is 4 MB, if there are 1600 page views in a month, my bandwidth is 1600 × 4 which is 6400 MB per month or 6.25 GB per month (divide MB by 1024 to get GB). So, again, the basic package would be adequate for my website.
I would be less comfortable with the basic package on bandwidth grounds. The 50 GB limit is well clear of the 6 GB I’m using, but I live in hope. The bandwidth is also something I don’t have control over; it just depends on how many people view the website.
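If you want to repeat these estimates with your own numbers, the sums can be sketched as below. It uses the same simplifications as the text: every page view is treated as a full 4 MB download, and 1 GB is taken as 1024 MB. The function names are illustrative only.

```python
MB_PER_GB = 1024  # conversion used in the text: divide MB by 1024 to get GB


def pages_that_fit(web_space_gb: float, page_size_mb: float) -> int:
    """How many average-sized pages fit in the given web space."""
    return int(web_space_gb * MB_PER_GB // page_size_mb)


def monthly_bandwidth_gb(page_views: int, page_size_mb: float) -> float:
    """Estimated monthly transfer, assuming each view downloads a full page."""
    return page_views * page_size_mb / MB_PER_GB


pages = pages_that_fit(10, 4)               # basic package: 10 GB of web space
site_multiple = 10 * MB_PER_GB / 524        # how many 524 MB websites would fit
bandwidth = monthly_bandwidth_gb(1600, 4)   # 1600 page views per month
headroom = 50 / bandwidth                   # against the 50 GB monthly limit

print(pages)                    # 2560
print(round(site_multiple, 2))  # 19.54
print(bandwidth)                # 6.25
print(headroom)                 # 8.0
```

On these assumptions the basic package has roughly eight times the bandwidth the site currently uses.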
If I were doing this again, I would start with the
account (you can always upgrade).

You might wonder about the 3 websites; by website they mean something with a particular domain name (
for example). I have three websites available to me, but I only use one: ; I could have two more with my package (Business Pro).

I only need one website, everything I do starts with the domain name:
. Where I have more than one publication, they are still under this domain name, just as different folders, currently there is:

All in the same domain.
There is one more consideration that is (or might be) important, that is the use of sub-domains.
A sub-domain is an additional level added to an existing domain name. An example being:
https://blog.practicalseries.com
In this example blog would be a sub-domain.
I use subdirectories to manage my website publications, I have:
I could just as easily have had sub-domains:
https://1001-webdevelopment.practicalseries.com/
https://1002-vcs.practicalseries.com/
https://1003-landrover.practicalseries.com/
Sub-domains are considered by search engines as entirely independent websites.
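To make the difference concrete, here is a small sketch that pulls a URL apart and reports which bit is the sub-domain and which is the subdirectory. The sub-domain URLs are the ones from the text; the subdirectory URL is an assumed form that simply mirrors those names, and the `describe` function is hypothetical.

```python
from urllib.parse import urlsplit


def describe(url: str, registered_domain: str = "practicalseries.com"):
    """Return (sub-domain, first subdirectory) for a URL; '' where absent."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Anything in front of the registered domain is the sub-domain.
    if host != registered_domain and host.endswith("." + registered_domain):
        subdomain = host[: -len(registered_domain) - 1]
    else:
        subdomain = ""
    # The first non-empty path segment is the top-level subdirectory.
    segments = [s for s in parts.path.split("/") if s]
    subdirectory = segments[0] if segments else ""
    return subdomain, subdirectory


print(describe("https://blog.practicalseries.com"))       # ('blog', '')
print(describe("https://1002-vcs.practicalseries.com/"))  # ('1002-vcs', '')
# Assumed subdirectory form, for comparison only:
print(describe("https://practicalseries.com/1002-vcs/"))  # ('', '1002-vcs')
```

The sub-domain lives in the host name, in front of the domain; the subdirectory lives in the path, after it.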
In terms of which is better, I don’t actually think it matters. Google and other search engines rank sub-domains and subdirectories in exactly the same way, they treat them equally. Search engines are generally smart enough to figure out what you are trying to do.
From my point of view I use subdirectories only because it means I can have common files across all my websites. This is harder to do with sub-domains.
The option to have sub-domains usually costs a bit more too. In the case of Heart Internet, sub-domains are not available with the basic package. This restriction does not apply to subdirectories. Web hosts don’t care about subdirectories, you can usually have as many as you like.
You will need a domain name.
You will also find that all the good ones are taken.
Domain names can usually be bought at the same time that you buy the web hosting package. Sometimes a domain name is extra; sometimes the web host will include it in the package price (this probably won’t be a .com domain though).
Each web host usually has a page where you can try different domain names to see if they have been taken, and most show a list of other options if they have. This is the Heart Internet domain name search page.
There are a million and one such sites on the internet (just Google “domain name search”). You are probably best using the one provided by your web host; some web hosts do not have access to all domain names. For example, Heart Internet cannot get .io domain names.
Let’s say you want the domain name: (of course you want that name, it’s brilliant).
You would type it into the domain search box (I’ve shown the one for Heart Internet). It gives the available options (Figure 24.2).
You now have the debate of which top level domain you want; top level domains are the .com, .gov, .co.uk &c. bits at the end.
I’m not sure that there is a right choice here. I went for .com; my thinking is that this is the default choice for any website. I notice that the .io top level domain is popular with software type websites. The .io domain is actually a country domain for the British Indian Ocean Territory; it has been hijacked by tech sites, IO being a common reference to input/output, as in BIOS (basic input output system).
Be careful what domain name you choose: write it all out as a single word and check for other meanings. There are a few (possibly apocryphal) examples; the Sydney Therapist website was called
which must have pissed-off Sydney the rapist. The Pen Island website was another.

If you’ve just looked up
and it has gone (it was still there when I last checked), I’m sorry to disappoint you, but I didn’t buy it; it’s a snip at twelve quid though.

Web host sites usually offer you the choice of host server type; these come in two varieties: Linux and Windows.
Despite what I say about Linux people, a Linux server is the right choice (if you are not offered a choice, you will almost certainly be using a Linux server).
As far as you and your website are concerned it won’t make any difference whatsoever. Linux servers tend to be the better choice; they support more web development type applications and are just better established.
If you have a choice, always go for a Linux server.
OK, I’m going to assume you’ve thought of a domain name and paid for someone to host it.
What you generally get is a username and password (possibly some other security questions) and a link to the website.
In the case of Heart Internet, I go to the home page (heartinternet.uk) and click login. It takes me to an administration page:
The admin page lets me check my account, order extra bits, change payment details &c. All the usual stuff you can do with accounts. It’s the bit at the bottom that is important, it opens the web hosting control panel (I click login to get there).
Most web hosting sites have something similar to this. I should say that the information shown here is not my actual information, I’m not that stupid.
Web hosting sites generally provide some editing and file management facilities. They are, however, very rudimentary, and I don’t recommend using them as a practical mechanism for website publishing and maintenance. What we will use is some form of file transfer protocol (FTP) client for transferring files between our machine and the web host’s servers (that is the stuff in the bottom right corner, highlighted).
First though let’s have a look at what the web host people have given us for our money, in my case I’m going to click the file manager icon (top left), this takes me to the basic folder structure provided by Heart Internet (in my case):
Now, this is the empty website created by the web host, before I put anything in it, way back in 2015. I’ve expanded all the folders to show you what is there.
This all looks odd and complicated and you probably haven’t seen it before. The good news is most of it can be ignored completely.
The .bash stuff at the top belongs to Bash, which is what the Linux people helpfully refer to as the login shell. The files are, I think, a series of start-up scripts that Bash runs when a user logs on. They are used to define things such as specific paths within the Linux environment.
Ignore the .bash files but DO NOT DELETE them; Linux expects to find them. They do nothing useful for us, and we will do nothing with them.
The cgi-bin folder.
CGI stands for Common Gateway Interface. It is a mechanism that lets the web server run scripts to generate web pages. It predates JavaScript and jQuery, and at the time it was the only way to run script operations for web pages.
It is now virtually obsolete. Again, ignore the folder. No one ever uses cgi-bin.
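Purely to show what the folder was for, here is a minimal sketch of the kind of program that used to live in cgi-bin. The web server runs the program, passes request details in environment variables (`QUERY_STRING` is one defined by the CGI standard) and sends whatever the program prints back to the browser. The function name and the page it produces are made up for illustration; nothing like this is needed for the website in this guide.

```python
import html
import os
import sys


def render_response(environ) -> str:
    """Build the text a CGI program writes to standard output."""
    # The web server supplies the request's query string in QUERY_STRING.
    query = environ.get("QUERY_STRING", "")
    body = (
        "<html><body><p>Query string: "
        + html.escape(query)  # escape it so it is safe to show in the page
        + "</p></body></html>"
    )
    # A CGI response is a header block, then a blank line, then the document.
    return "Content-Type: text/html\n\n" + body


if __name__ == "__main__":
    sys.stdout.write(render_response(os.environ))
```

The important point is that the script runs on the server each time the page is requested, unlike JavaScript, which runs in the visitor's browser.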
At the bottom there is a tmp (temporary folder), it can be used to store temporary files if you want. Generally, I advise you not to use this folder. If the host server is restarted, anything in it will be deleted.
That just leaves public_html and this is the only folder we need. That’s where we put the website.
The original public_html had an index.html file in it, this was made by the web host and it’s an absolute beauty. It looks like this:
Too right we’re going to replace it.
If I do this same exercise with the up to date website, you can see the current state of my website:
OK, you can see there is more stuff in there now, there are all the subdirectories for each individual publication: webdevelopment (this site), my version control system stuff, my Land Rover page, a current project called the PracticalSeries Automation Library and a common document area.
There are some other new files: a bing file, a google file, a PayPal file and a sitemap file. The Bing, Google and sitemap files are used to allow Bing and Google to scan (crawl) the site and track searches that it appears in.
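For a rough idea of what a sitemap file contains, the sketch below builds one in the common sitemaps.org XML format: a `urlset` element holding one `url` and `loc` entry per page. I am assuming that format here; the URL in the example is only a placeholder for whichever pages you want the search engines to find, and `build_sitemap` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls) -> str:
    """Return sitemap XML text listing each URL in a <url><loc> entry."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    xml_body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + xml_body


# Placeholder page list; a real sitemap would list every published page.
sitemap_text = build_sitemap(["https://practicalseries.com/"])
print(sitemap_text)
```

The resulting text would be saved as a file in public_html, alongside index.html.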
The .well-known folder contains a security certificate that is used to enable the site as a secure site (a https site rather than a http site). I explain this in § 24.3.
The index.html file that is in there is the landing page for my whole website, it looks like this:
This is what you get if you just type practicalseries.com into a web browser.
So that is the website from inside the web host control panel. That is pretty much all you will ever do with the web host site. For the real stuff we use a file transfer protocol client. That’s next.
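As a taster of what a file transfer client does behind the scenes, here is a minimal sketch using Python's standard `ftplib`. It pairs each file in a local copy of the site with the path it would occupy under public_html, then uploads them. The host name, user name and password are placeholders, and I am assuming the web host accepts FTP over TLS and that the remote folders already exist.

```python
from ftplib import FTP_TLS
from pathlib import Path


def plan_uploads(local_root, remote_root="public_html"):
    """Pair each local file with the remote path it should be uploaded to."""
    root = Path(local_root)
    plan = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root).as_posix()
            plan.append((path, f"{remote_root}/{relative}"))
    return plan


def upload(plan, host, user, password):
    """Upload every planned file over FTP with TLS.

    Assumes the remote folders in the plan already exist on the server.
    """
    with FTP_TLS(host) as ftp:
        ftp.login(user, password)
        ftp.prot_p()  # encrypt the data connection as well as the login
        for local_path, remote_path in plan:
            with open(local_path, "rb") as handle:
                ftp.storbinary(f"STOR {remote_path}", handle)
```

A call might look like `upload(plan_uploads("my-site"), "ftp.example.com", "username", "password")`, where every argument is a made-up placeholder. A dedicated client does the same job with far less effort.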