24

Publishing a website

24.4

Telling search engines about your site

By which I mean telling Google.

OK, you’ve built a beau­ti­ful web­site full of in­for­ma­tion that every­one needs and you’ve pub­lished it for the whole world to see.

So now what hap­pens?

Well, not much. Publishing your website puts it on the internet where people can see it, but it won’t show up in any Google searches. Not right away at least.

Google needs to find your site, look through it (crawl your website) and index what is in there for its search algorithms. Google is a bit nosey.

If you do noth­ing, Google will even­tu­ally get round to find­ing your site (par­tic­u­larly if some other site is link­ing to it), but this can take a while (weeks).

To speed things up a bit, you can tell Google that your site ex­ists (reg­is­ter your site with Google) and it will start crawl­ing it straight away.

24.4.1

The sitemap

However, before you do that you are going to need something called a sitemap. This is an XML file, usually called sitemap.xml, that lists the pages on your site; it lives in the public_html folder (along with index.html).

Most web hosts pro­vide a mech­a­nism for gen­er­at­ing a sitemap.xml file. This is what Heart In­ter­net do (it is on the web host con­sole page):

Figure 24.32   Heart Internet site map generator

It will gen­er­ate the sitemap for me and allow me to down­load it:

Figure 24.33   Heart Internet site map generator page

If I click the gen­er­ate but­ton it will au­to­mat­i­cally cre­ate the sitemap.xml file. It will then give me the op­tion to down­load it.

This is what mine looks like (it is possible to open it in a browser). This is just the start of it; the whole file runs to 9700 lines:

Figure 24.34   Practical Series website sitemap.xml file (extract)
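For readers who cannot see the figure, here is a minimal sketch of the general shape of a sitemap file, built from a list of page addresses. The function names and the example.com addresses are assumptions for illustration only; the file your web host generates will be much longer and may carry extra fields.

```javascript
// Escape the characters that are not allowed to appear raw in XML text.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build a minimal sitemap.xml string: one <url><loc> entry per page.
function buildSitemap(urls) {
  const entries = urls
    .map((url) => `  <url>\n    <loc>${escapeXml(url)}</loc>\n  </url>`)
    .join("\n");
  return (
    '<?xml version="1.0" encoding="UTF-8"?>\n' +
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n' +
    entries +
    "\n</urlset>\n"
  );
}

// Hypothetical pages, not the real site's contents.
const sitemap = buildSitemap([
  "https://www.example.com/",
  "https://www.example.com/about.html",
]);
console.log(sitemap);
```

Each page gets its own url entry, which is why a sitemap for a large site runs to thousands of lines.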

Once you have down­loaded your sitemap.xml file, it needs to be copied to the pub­lished web­site, in the pub­lic_html di­rec­tory. This is where mine lives:

Figure 24.35   Practical Series sitemap.xml file on the website

You can see it’s in both the of­fline and on­line di­rec­to­ries.

24.4.2

Registering a site with Google

The sitemap is one of two things that Google needs to register your site; the second is a verification file that proves to Google that the site is yours.

To do this, open Google Search Console. Note that you will need a Google account.

Google Search Con­sole will ask you to sign in, and will then take you to the wel­come page:

Figure 24.36   Google Search Console — starting

I used the URL prefix mechanism (it was easier). Enter the complete URL of your website; this will start with either https:// or http:// and may or may not have www in it. Make sure you get it right; it’s a bugger to change afterwards.
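To see why the exact form matters, here is a small sketch that lists the four prefixes one host name can take. With the URL prefix mechanism, each of these counts as a different address, so only the one your site really uses will do. The function name and example.com are illustrative assumptions.

```javascript
// List the four URL-prefix forms that a single host name can take.
function urlPrefixVariants(host) {
  const bare = host.replace(/^www\./, ""); // strip a leading www. if present
  const variants = [];
  for (const scheme of ["https", "http"]) {
    for (const prefix of ["www.", ""]) {
      variants.push(`${scheme}://${prefix}${bare}/`);
    }
  }
  return variants;
}

// Hypothetical host name.
const variants = urlPrefixVariants("example.com");
console.log(variants.join("\n"));
```

Open your published site in a browser and copy the address from the address bar; that tells you which of the four to enter.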

Click con­tinue and you will get a screen like this:

Figure 24.37   Google Search Console — verification

What Google want you to do is download the file (point 1) and copy it to the public_html directory of your website. The idea is that if you are able to modify the website, then it presumably belongs to you, and Google will be satisfied that it is yours.

Download the file and upload it to the public_html folder; this is mine:

Figure 24.38   Google Search Console — verification file in public_html directory

Once the file is in the on­line pub­lic_html di­rec­tory, click the ver­ify but­ton on the Google web page.

Google will respond with a verification successful message; this will also tell you not to remove the verification file from the website.

You will now have ac­cess to your Google Search Con­sole. Mine looks like this:

Figure 24.39   Google Search Console — overview page

If your site is new, you won’t have any per­for­mance or cov­er­age data.

The next thing to do is tell Google where your sitemap is. Click the sitemap link on the left hand side (if the left hand menu is not visible, click the hamburger button at the top):

Figure 24.40   Google Search Console — sitemap page

Enter sitemap.xml in the add new sitemap area and click sub­mit.
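A short file name is enough here because files in public_html sit at the root of the site. As a rough sketch, and assuming the name is taken relative to the site address you registered, the full address of a file can be worked out like this. The function name, example.com and the verification file name are hypothetical placeholders.

```javascript
// Work out the full address of a file that sits in the site's root folder.
function resolveOnSite(siteUrl, fileName) {
  return new URL(fileName, siteUrl).href;
}

// Hypothetical site address and file names.
const sitemapUrl = resolveOnSite("https://www.example.com/", "sitemap.xml");
const verifyUrl = resolveOnSite("https://www.example.com/", "google-verification-file.html");
console.log(sitemapUrl);
console.log(verifyUrl);
```

Pasting the resulting address into a browser is a quick way to confirm the file really is online before clicking submit.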

If Google can find the file, you will get a suc­cess popup:

Figure 24.41   Google Search Console — sitemap page

The submitted area will show the status of the sitemap and the date it was submitted.

That’s it; Google is now crawl­ing your site and in­dex­ing what it finds.

24.4.3

Google Analytics

Google Analytics is separate from the Google Search Console.

Google Search Console gives you broad data about how many pages have been crawled and indexed. It also provides information about how many times a user clicked through to your site and how many times your site showed up in search results (impressions).
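Clicks and impressions are most useful together: dividing one by the other gives the click-through rate, the fraction of search appearances that led to a visit. A minimal sketch of the arithmetic follows; the function name and the figures are made up for illustration.

```javascript
// Click-through rate: the fraction of impressions that resulted in a click.
function clickThroughRate(clicks, impressions) {
  if (impressions === 0) return 0; // nothing shown, so nothing to divide by
  return clicks / impressions;
}

// Hypothetical figures: 12 clicks from 480 impressions.
const ctr = clickThroughRate(12, 480);
console.log((ctr * 100).toFixed(1) + "%");
```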

Google An­a­lyt­ics on the other hand shows a great deal of in­for­ma­tion about who views your web­site, what coun­try they are in, what they searched for, &c. A whole load of stuff.

Google Analytics is a whole book by itself; if I’m honest I don’t exactly understand the details of what you can do with it. I tend to just use it for the obvious stuff: how many visits, where from, &c.

I’m not going to cover Google An­a­lyt­ics in any great de­tail, but I will show you how to set it up for your web­site.

Again you will need a Google ac­count to do this.

The Google Analytics website looks like this:

Figure 24.42   Google Analytics page

Click the start for free but­ton (top right, high­lighted) and this will then ask you to sign into your Google ac­count.

Do this and you will end up on the new ac­count page:

Figure 24.43   Google Analytics — new account

You need to give it an account name; your account can have multiple websites within it.

Next spec­ify a web­site name (this is just the vis­i­ble name it will use to iden­tify the web­site). Unimag­i­na­tively I used Prac­ti­cal Se­ries.

The next bit is asking for the URL of the website. Make sure the correct https:// or http:// is selected in the dropdown, then enter the rest of the URL in the box (it may or may not have www in it).

Fi­nally pick the most suit­able in­dus­try cat­e­gory, set your time zone and tick all the boxes.

With every­thing filled in, click the get track­ing id but­ton at the bot­tom. This will open an ac­cept the terms box:

Figure 24.44   Google Analytics page

Tick the boxes and click I ac­cept.

This gets you to the track­ing ID page:

Figure 24.45   Google Analytics — Tracking ID

Google Analytics requires a JavaScript entry to be placed on each page of your website. It must be in the <head> section. It is this script that sets a cookie in the browser of anyone who views your web page; the cookie is used to collect certain data about the user. Google give details of what they collect on their page about the data collected by Google Analytics.

The code fragment in the window, from the first <script> tag to the last </script> tag, needs to be copied and pasted into each page of your website.

Select everything in the window, copy it and paste it into the <head> section of each HTML page on your website; here is an example of mine:

Figure 24.46   Google Analytics — Script on a web page

You can see ex­actly the same code pasted in start­ing at line 127. That’s it, just copy and paste it onto each HTML web page.
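If your site has a lot of pages, pasting by hand gets tedious. The following is a minimal sketch of doing the same thing to a page held as a string: it places a snippet just before the closing head tag and leaves the page alone if the snippet is already there. The function name and the sample page are assumptions for illustration; reading and writing the actual files is left out.

```javascript
// Insert a snippet just before </head>, unless the page already contains it.
function insertIntoHead(html, snippet) {
  if (html.includes(snippet)) return html; // already pasted in, do nothing
  const closeHead = html.search(/<\/head>/i);
  if (closeHead === -1) throw new Error("No </head> tag found");
  return html.slice(0, closeHead) + snippet + "\n" + html.slice(closeHead);
}

// Hypothetical page and a placeholder standing in for the real script.
const page = "<html><head><title>Home</title></head><body></body></html>";
const tagged = insertIntoHead(page, "<!-- analytics snippet goes here -->");
console.log(tagged);
```

Running it twice on the same page gives the same result, so there is no risk of ending up with two copies of the script.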

24.4.4

Anonymising IP data

By and large, the data that Google Analytics collects is in an anonymous form (see Privacy and personal data). The exception is the IP address of the user, which is considered to be personally identifiable information.

By default Google Analytics collects the IP address of the website user. It is possible to stop Google Analytics storing the full address by making it anonymise the IP address data; for an IPv4 address this is done by setting the last group of digits (the final octet) to zero.
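To make the masking concrete, here is a minimal sketch of what anonymising an IPv4 address amounts to: the final group of digits is replaced with zero. The function name and the sample address are illustrative assumptions; this is not Google's code, only a picture of the effect.

```javascript
// Replace the final group of an IPv4 address with zero.
function maskIPv4(address) {
  const octets = address.split(".");
  const valid =
    octets.length === 4 &&
    octets.every((o) => /^\d{1,3}$/.test(o) && Number(o) <= 255);
  if (!valid) throw new Error(`Not an IPv4 address: ${address}`);
  octets[3] = "0";
  return octets.join(".");
}

// Sample address from a range reserved for documentation.
const masked = maskIPv4("203.0.113.57");
console.log(masked);
```

Every visitor whose address differs only in that last group ends up with the same masked address, which is what makes an individual harder to pick out.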

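To activate this facility (I’ve done it on my website), the anonymize_ip option needs to be added to the config line of the Google Analytics script:

google analytics tracking script
<!-- Global site tag (gtag.js) - Google Analytics -->
<script async src="https://www.googletagmanager.com/gtag/js?id=UA-136143601-1"></script>
<script>
  // Create the data layer queue if it does not already exist
  window.dataLayer = window.dataLayer || [];
  // gtag() pushes its arguments onto the queue for gtag.js to process
  function gtag(){dataLayer.push(arguments);}
  gtag('js', new Date());

  // Configure the tracking ID once, with IP anonymisation switched on
  gtag('config', 'UA-136143601-1', { 'anonymize_ip': true });
</script>
Code 24.1   Anonymise IP with Google Analytics script

The line to change is the gtag('config') call: the { 'anonymize_ip': true } argument is the addition. Keep a single config call for the tracking ID; a second, plain config call alongside it would send a hit without the anonymise setting.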

The number in there, UA-136143601-1, is the tracking ID for my site; yours will be different. It is the same number given at the top of the Tracking ID page (Figure 24.45). Make sure that you use the correct number.
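Because the tracking ID appears more than once in the script, a mistyped digit in one place is easy to miss. A minimal sketch of a consistency check follows: it pulls out every ID of the form UA-number-number and reports if they are not all the same. The function name and the IDs in the sample are made up for illustration.

```javascript
// Collect the distinct tracking IDs (UA-<number>-<number>) found in some text.
function trackingIdsIn(snippet) {
  const matches = snippet.match(/UA-\d+-\d+/g) || [];
  return [...new Set(matches)];
}

// Hypothetical snippet with a deliberate mismatch between the two IDs.
const snippet = [
  '<script async src="https://www.googletagmanager.com/gtag/js?id=UA-000000-1"></script>',
  "<script>gtag('config', 'UA-000001-1');</script>",
].join("\n");

const ids = trackingIdsIn(snippet);
if (ids.length > 1) {
  console.log("Mismatched tracking IDs:", ids.join(", "));
}
```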

24.4.5

Using Google Analytics

When you have put the Google Analytics script on all your published web pages, you will eventually start to receive data about your website (it takes a while, so give it a couple of weeks; most of the first hits will be you looking at your own site). The following are some of the things Google Analytics can show you.

Where your users are:

Figure 24.47   Google Analytics — Geo location

Users, ses­sions and page views:

Figure 24.48   Google Analytics — Users, sessions and page views

It can also show you what search query was en­tered into Google that led to your page:

Figure 24.49   Google Analytics — Search Queries

Google Analytics can be used for a lot of things; it is very configurable (and a bit complicated). I’m not that good at using it and there is a lot I don’t understand, but I hope I’ve given you a rough guide for configuring it and using it.


