Using Hugo and AWS to build a fast, static, easily managed and deployed website.

Most of my websites are built using WordPress on Linux in AWS, using EC2 for compute, S3 for storage and Aurora for the data layer. Take a look at sono.life as an example.

For this site, I wanted to build something that aligned with, and demonstrated, some of the key tenets of cloud technology – scalability, resiliency, availability, and security – and that was designed for the cloud, not simply in the cloud.

I chose technologies that were cloud native, as fast as possible, easily managed, version controlled, quickly deployed, and presented over TLS. I opted for Hugo, a super-fast static website generator that is managed from the command line. It’s used by organisations such as Let’s Encrypt to build fast, secure, reliable and scalable websites. The rest of my choices are listed below. Wherever possible, I’ve used the native AWS solution.

The whole site loads in less than half a second, and there are still improvements to be made. It may not be pretty, but it’s fast. Below is a walk-through with notes that should help you build your own Hugo site in AWS. The notes assume that you know your way around the command line, that you have an AWS account, and that you have a basic understanding of the services involved in the build. I think I’ve covered all the steps, but if you try to follow this and spot a missing step, let me know.

Notes on Build – Test – Deploy:

Hugo was installed via Homebrew to build the site. If you haven’t installed Homebrew yet, just do it. Fetch it by running:

/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
Then install Hugo:
brew install hugo

One of the things I love about Hugo is the ability to make rapid, on-the-fly changes to the site and see the result instantly, running the Hugo server locally.

hugo server -w -D

The option -D includes drafts in the output, whilst -w watches the filesystem for changes, so you don’t need to rebuild with every small change, or even refresh in the browser.

To create content, simply run

hugo new $postname.md
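
The path is relative to Hugo’s content directory, so creating a typical blog post looks something like this (the posts/ folder and filename here are just examples):

hugo new posts/my-first-post.md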

Then create and edit your content, QA with the local Hugo server, and build the site when you’re happy:

hugo -v

V for verbose, obvs.

You’ll need to install the AWS CLI, if you haven’t already.

brew install awscli

Check it worked:

aws --version

Then set it up with your AWS IAM credentials:

aws configure
AWS Access Key ID [None]: <your access key>
AWS Secret Access Key [None]: <your secret key>
Default region name [None]: <your region name>
Default output format [None]: ENTER

You don’t need to use R53 for DNS, but it doesn’t cost much and it will make your life a lot easier. Plus you can use funky features like routing policies and target health evaluation (though not when using Cloudfront distributions as a target).

Create your record set in R53. You’ll change the target to a Cloudfront distribution later on. Create the below json file with your config.

{
  "Comment": "CREATE/DELETE/UPSERT a record",
  "Changes": [{
    "Action": "CREATE",
    "ResourceRecordSet": {
      "Name": "a.example.com",
      "Type": "A",
      "TTL": 300,
      "ResourceRecords": [{ "Value": "4.4.4.4" }]
    }
  }]
}
And run:
aws route53 change-resource-record-sets --hosted-zone-id ZXXXXXXXXXX --change-batch file://sample.json

Create a bucket. Your bucket name needs to match the hostname of your site, unless you want to get really hacky.

aws s3 mb s3://my.website.com --region eu-west-1

If you’re using Cloudfront, you’ll need to set permissions to allow the Cloudfront service to pull from S3. Or, if you’re hosting straight from S3, ensure you allow the correct permissions. There are many variations on how to do this – the AWS-recommended way would be to set up an Origin Access Identity, but that won’t work if you’re using Hugo and need to use a custom origin for Cloudfront (see below). If you don’t particularly mind visitors being able to access S3 assets directly should they try, your S3 policy can be as below:

{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "PublicReadGetObject",
    "Effect": "Allow",
    "Principal": "*",
    "Action": ["s3:GetObject"],
    "Resource": ["arn:aws:s3:::example-bucket/*"]
  }]
}
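
To apply that policy and switch on static website hosting (you’ll want the latter anyway, because the Cloudfront custom origin further down points at the S3 website endpoint), something like the following should do it – I’m assuming you’ve saved the policy above as policy.json and that your site has a 404.html:

aws s3api put-bucket-policy --bucket my.website.com --policy file://policy.json

aws s3 website s3://my.website.com --index-document index.html --error-document 404.html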

Request your SSL certificate at this time too:

aws acm request-certificate --domain-name $YOUR_DOMAIN --subject-alternative-names "www.$YOUR_DOMAIN" 
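
Two things worth knowing here: a certificate that’s going to be attached to a Cloudfront distribution has to be issued in us-east-1, and you can ask ACM to validate via DNS rather than email, which is painless if you’re already using R53. Something like:

aws acm request-certificate --domain-name $YOUR_DOMAIN --subject-alternative-names "www.$YOUR_DOMAIN" --validation-method DNS --region us-east-1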

ACM will automatically renew your cert for you when it expires, so you can sleep easy at night without worrying about SSL certs expiring. That stuff you did last summer at bandcamp will still keep you awake though.

Note: regarding custom SSL client support, make sure to select SNI only. Supporting old steam-driven browsers on WinXP (via dedicated IP addresses) will cost you $600 a month, and I don’t think you want that.

The only way to serve an S3-hosted site over https on your own domain is to stick a Cloudfront distribution in front of it, and by doing this you get the added bonus of a super-fast CDN with over 150 edge locations worldwide.

Create your Cloudfront distribution with a json config file, or straight through the cli.

aws cloudfront create-distribution --distribution-config file://distconfig.json

Check out the AWS documentation for details on how to create your config file.

Apply your certificate to the CF distribution too, in order to serve traffic over https. You can choose to allow port 80 or redirect all requests to 443. Choose “custom” certificate to select your cert, otherwise Cloudfront will use the Amazon default one, and visitors will see a certificate mismatch when browsing to the site.

When configuring my Cloudfront distribution, I hit a few issues. First of all, it’s not possible to use the standard AWS S3 origin. You must use a custom origin, specifying the S3 website endpoint (which includes the bucket’s region, as below), in order for pretty URLs and CSS references in Hugo to work properly. I.e.

cv.tomgeraghty.co.uk.s3-website-eu-west-1.amazonaws.com 

instead of

cv.tomgeraghty.co.uk.s3.amazonaws.com

Also, make sure to specify the default root object in the CF distribution as index.html.
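
Putting those pieces together, a heavily trimmed-down distconfig.json looks something like the sketch below. Treat it as a starting point rather than gospel – the bucket, alias and certificate ARN are placeholders, and the CLI will complain about any required fields your version of the API expects that I’ve left out:

{
  "CallerReference": "hugo-site-build",
  "Comment": "Hugo static site",
  "Enabled": true,
  "DefaultRootObject": "index.html",
  "Aliases": { "Quantity": 1, "Items": ["my.website.com"] },
  "Origins": {
    "Quantity": 1,
    "Items": [{
      "Id": "s3-website-origin",
      "DomainName": "my.website.com.s3-website-eu-west-1.amazonaws.com",
      "CustomOriginConfig": {
        "HTTPPort": 80,
        "HTTPSPort": 443,
        "OriginProtocolPolicy": "http-only"
      }
    }]
  },
  "DefaultCacheBehavior": {
    "TargetOriginId": "s3-website-origin",
    "ViewerProtocolPolicy": "redirect-to-https",
    "ForwardedValues": { "QueryString": false, "Cookies": { "Forward": "none" } },
    "TrustedSigners": { "Enabled": false, "Quantity": 0 },
    "MinTTL": 0
  },
  "ViewerCertificate": {
    "ACMCertificateArn": "arn:aws:acm:us-east-1:123456789012:certificate/xxxxxxxx",
    "SSLSupportMethod": "sni-only",
    "MinimumProtocolVersion": "TLSv1.1_2016"
  }
}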

Now that your CF distribution is ready, anything in your S3 bucket will be cached to the CF CDN. Once the status of your distribution is “deployed”, it’s ready to go. It might take a little while at first setup, but don’t worry. Go and make a cup of tea.

Now, point your R53 record at either your S3 bucket or your Cloudfront disti. You can do this via the cli, but doing it via the console means you can check to see if your target appears in the list of alias targets. Simply select “A – IPv4 address” as the target type, and choose your alias target (CF or S3) in the drop down menu.
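
If you do want to do it from the cli anyway, an alias change batch looks something like this – Z2FDTNDATAQYW2 is the fixed hosted zone ID that all Cloudfront alias targets use, and the cloudfront.net domain name is a placeholder for your own distribution’s:

{
  "Comment": "Point the site at the Cloudfront distribution",
  "Changes": [{
    "Action": "UPSERT",
    "ResourceRecordSet": {
      "Name": "my.website.com",
      "Type": "A",
      "AliasTarget": {
        "HostedZoneId": "Z2FDTNDATAQYW2",
        "DNSName": "dxxxxxxxxxxxxxx.cloudfront.net",
        "EvaluateTargetHealth": false
      }
    }
  }]
}

aws route53 change-resource-record-sets --hosted-zone-id ZXXXXXXXXXX --change-batch file://alias.json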

Stick an index.html file in the root of your bucket, and carry out an end-to-end test by browsing to your site.

Build – Test – Deploy

Now that you have a functioning Hugo site running locally, plus S3, R53, TLS, and Cloudfront configured, you’re ready to stick it all up on the internet.

Git push if you’re using Git, and deploy the public content via whichever method you choose. In my case, to the S3 bucket created earlier:

aws s3 cp public s3://$bucketname --recursive

The recursive switch ensures the subfolders and content will be copied too.
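
An alternative worth a look is aws s3 sync, which only uploads files that have changed and (with --delete) removes files from the bucket that you’ve deleted locally:

aws s3 sync public s3://$bucketname --delete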

Crucially, because I’m hosting via Cloudfront, a new deploy means the old Cloudfront content will be out of date until it expires, so alongside every deploy, an invalidation is required to trigger a new fetch from the S3 origin:

aws cloudfront create-invalidation --distribution-id $cloudfrontID  --paths /\*

It’s not the cleanest way of doing it, but it’s surprisingly quick to refresh the CDN cache so it’s ok for now.

Time to choose a theme and modify the Hugo config file. This is where you define how your Hugo site works.

I used the “Hermit” theme:

git clone https://github.com/Track3/hermit.git themes/hermit

But you could choose any theme you like from https://themes.gohugo.io/

Modify the important elements of the config.toml file:

baseURL = "https://$your-website-url"
languageCode = "en-us"
defaultContentLanguage = "en"
title = "$your-site-title"
theme = "$your-theme"
googleAnalytics = "$your-GA-UA-code"
disqusShortname = "$your-disqus-shortname"

Get used to running a deploy:

hugo -v
aws s3 cp public s3://your-site-name --recursive
aws cloudfront create-invalidation --distribution-id XXXXXXXXXX  --paths /\*

Or, to save time, set up npm to handle your build and deploy. Install Node and npm if you haven’t already (I’m assuming you’re going to use Homebrew again):

brew install node

Then check node and npm are installed by checking the version:

npm -v

and

node -v

All good? Carry on then:

npm init

Create some handy scripts:

{
    "name": "hugobuild",
    "config": {
        "LASTVERSION": "0.1"
    },
    "version": "1.0.0",
    "description": "hugo build and deploy",
    "dependencies": {
        "dotenv": "^6.2.0"
    },
    "devDependencies": {},
    "scripts": {
        "testvariable": "echo $npm_package_config_LASTVERSION",
        "test": "echo 'I like you Clarence. Always have. Always will.'",
        "server": "hugo server -w -D -v",
        "build": "hugo -v",
        "deploy": "aws s3 cp public s3://your-site-name --recursive && aws cloudfront create-invalidation --distribution-id XXXXXXXXXX --paths '/*'"
    },
    "author": "Tom Geraghty",
    "license": "ISC"
}

Then, running:

npm run server

will launch a local server running at http://localhost:1313

Then:

npm run build

will build your site ready for deployment.

And:

npm run deploy

will upload content to S3 and tell Cloudfront to invalidate old content and fetch the new stuff.

Now you can start adding content, and making stuff. Or, if you’re like me and prefer to fiddle, you can begin to implement Circle CI and other tools.

Notes: some things you might not find in other Hugo documentation:

When configuring the SSL cert, just wait – be patient for it to load. Reload the page a few times, even. This gets me every time; the AWS Certificate Manager console can be very slow to update.

Take a look at custom error responses in your CF distribution so that error pages are cached for less time. You don’t want 404s being served for content that’s actually present.

Finally, some things I’m still working on:

Because the custom origin is the S3 website endpoint, Cloudfront fetches content from S3 over port 80, not 443, so this wouldn’t be suitable for secure applications – it’s not encrypted end to end. I’m trying to think of a way around this.

I’m implementing Circle CI, just for kicks really.

Finally, invalidations. As above, if you don’t invalidate your CF disti after deployment, old content will be served until the cache expires. But invalidations are inefficient and ultimately cost (slightly) more. The solution is to implement versioned object names, though I’m yet to find a solution for this that doesn’t destroy other Hugo functionality. If you know of a clean way of doing it, please tell me 🙂

 

Compliance in DevOps and public cloud.

As a DevOps engineer, you’ve achieved greatness. You’ve containerised everything, built your infrastructure and systems in the cloud and you’re deploying every day, with full test coverage and hardly any outages. You’re even starting to think you might really enjoy your job.

Then why are your compliance teams so upset?

Let’s take a step back. You know how to build secure applications, create back ups, control access to the data and document everything, and in general you’re pretty good at it. You’d do this stuff whether there were rules in place or not, right?

Not always. Back in the late 90’s, a bunch of guys in suits decided they could get rich by making up numbers in their accounts. Then in 2001 Enron filed for bankruptcy and the suits went to jail for fraud. That resulted in the Sarbanes-Oxley Act, legislation which forced publicly listed firms in the US to enforce controls to prevent fraud and enable effective audits.

Sarbanes-Oxley isn’t the only law that makes us techies do things certain ways though. Other compliance rules include HIPAA, ensuring that firms who handle clinical data do so properly; GDPR, which ensures adequate protection of EU citizens’ personal data; and PCI-DSS, which governs the use of payment card data in order to prevent fraud (and isn’t a law, but a common industry standard). Then there are countless other region and industry specific rules, regulations, accreditations and standards such as ISO 27001 and Cyber Essentials.

Aside from being good practice, the main reason you’d want to abide by these rules is to avoid losing your job and/or going to jail. It’s also worth recognising that demonstrating compliance can provide a competitive advantage over organisations that don’t comply, so it makes business sense too.

The trouble is, compliance is an old idea applied to new technology. HIPAA was enacted in 1996, Sarbanes-Oxley in 2002 and PCI DSS in 2004 (though it is frequently updated). In contrast, the AWS EC2 service only went out of beta in late 2008, and the cloud as we know it has been around for just a few years. Compliance rules are rarely written with cloud technology in mind, and compliance teams sometimes fail to keep up to date with these platforms or modern DevOps-style practices. This can make complying with those rules tricky, if not downright impossible at times. How do you tell an auditor exactly where your data resides, if the only thing you know is that it’s in Availability Zone A in region EU-West-1? (And don’t even mention to them that one customer’s Zone A isn’t the same as another’s.)

As any tech in a regulated industry will appreciate, compliance with these rules is checked by regular, painful and disruptive audits. Invariably, audits result in compliance levels looking something like a sine wave:

This is because when an audit is announced, the pressure is suddenly on to patch the systems, resolve vulnerabilities, update documents and check procedures. Once the audit is passed, everyone relaxes a bit, patching lags behind again, documentation falls out of date and the compliance state drifts away from 100%. This begs the question, if we only become non-compliant between audits, is the answer to have really, really frequent audits?

In a sense, yes. However, we can no longer accept that spreadsheet tick-box audits and infosec sign-off at the deployment approval stage actually work. Traditional change management and compliance practices deliberately slow us down, with the intention of reducing the risk of mistakes.

This runs counter to modern DevOps approaches. We move fast, making rapid changes and allowing teams to be autonomous in their decision making. Cloud technology confuses matters even further. For example, how can you easily define how many servers you have and what state they’re in, if your autoscaling groups are constantly killing old ones and creating new ones?

From a traditional compliance perspective, this sounds like a recipe for disaster. But we know that making smaller, more frequent changes will result in lower overall risk than large, periodic changes. What’s more, we take humans out of the process wherever possible, implementing continuous integration and using automated tests to ensure quality standards are met.

From a DevOps perspective, let’s consider compliance as three core pillars. The first pillar is achieving compliance. That’s the technical process of ensuring workloads and data are secure, and that everything is up to date, controlled and backed up. This bit’s relatively easy for competent techs like us.

The second pillar is about demonstrating that you’re compliant. How do you show someone else, without too much effort, that your data is secure and your backups actually work? This is a little more difficult, and far less fun.

The third pillar is maintaining compliance, and this is the real challenge. How do you ensure that, with rapid change, new technology and multiple teams involved, the system you built a year ago is still compliant now? This comes down to process and culture, and it’s the most difficult of the three pillars to achieve.

But it can be done. In DevOps and Agile culture, we shift left. We shorten feedback loops, decrease batch size, and improve quality through automated tests. This approach is now applied to security too, by embedding security tests into the development process and ensuring that it’s automated, codified, frictionless and fast. It’s not a great leap from there towards shifting compliance left too, codifying the compliance rules and embedding them within development and build cycles.

First we had Infrastructure as Code. Now we’re doing Compliance as Code. After all, what is a Standard Operating Procedure, if not a script for humans? If we can “code” humans to carry out a task in exactly the same way every time, we should be able to do it for machines.

Technologies such as AWS Config or InSpec allow us to constantly monitor our environment for divergence from a “compliant” state. For example, if a compliance rule deems that all data at rest must be encrypted, we can define that rule in the system and ensure we don’t diverge from it – if something, human or machine, creates some unencrypted storage, it will either be flagged for immediate attention or automatically encrypted.
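
To give a flavour of what that looks like in practice, AWS Config ships with managed rules you can turn on from the cli. The sketch below uses the managed ENCRYPTED_VOLUMES rule, which flags any attached EBS volume that isn’t encrypted – the rule name itself is just my own label:

aws configservice put-config-rule --config-rule '{
  "ConfigRuleName": "ebs-volumes-encrypted",
  "Source": { "Owner": "AWS", "SourceIdentifier": "ENCRYPTED_VOLUMES" }
}'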

One of the great benefits of this approach is that the “proof” of compliance is baked into your system itself. When asked by an auditor whether data is encrypted at rest, you can reassure them that it’s so by showing them your rule set. Since the rules are written as code, the documentation (the proof) is the control itself.

If you really want your compliance teams to love you (or at least quit hassling you), this automation approach can be extended to documentation. Since the entire environment can be described in code at any point in time, you can hand an auditor point-in-time documentation of exactly what exists in that environment now – or at any previous point in time, if it’s been recorded.

By involving your compliance teams in developing and designing your compliance tools and systems, asking what’s important to them, and building in features that help them to do their jobs, you can enable your compliance teams to serve themselves. In a well designed system, they will be able to find the answers to their own questions, and have confidence that high-trust control mechanisms are in place.

Compliance as Code means that your environment, code, documentation and processes are continuously audited, and that third, difficult pillar of maintaining compliance becomes far easier:

This is what continuous compliance looks like. Achieve this, and you’ll see what a happy compliance team looks like too.

 

Re:Develop.io History of DevOps talk

I recently gave a talk at Re:Develop.io about the history of DevOps, and this page is where you can find the slide deck and other relevant resources.

Video of the talk

Re:develop.io DevOps Evolution Slide deck

The Theory of constraints

Agile 2008 presentation by Patrick Debois

10+Deploys per day, Dev and Ops at Flickr – Youtube

10+deploys a day at Flickr – slide deck

DORA 

Nicole Forsgren research paper

2018 State of DevOps Report

My “Three Ways” slide deck

Safety culture

Amazon DevOps Book list

Deming’s 14 points of management

Lean management

The Toyota Way

The Agile Manifesto

Beyond the phoenix project (audio)

Netflix – simian army

My Continuous Lifecycle London talk about compliance in DevOps and the cloud


Going solo

This is the view from where I’m living now.

I’ll explain. A couple of months ago, my partner was offered a once-in-a-lifetime opportunity to live and work in Andalusia, doing digital marketing for an organisation that is part retreat centre, part permaculture farm, and part yoga teacher training school. With Brexit looming and faced with such a great opportunity to do something very different, we both rapidly left our jobs in the UK, packed up what stuff we didn’t get rid of or put into storage, and moved.

All of which means I’m now working with the organisations here (suryalila.com, danyadara.com and froglotusyogainternational.com) alongside developing my own consultancy business and working as CTO for ydentity.com (so new we still have lorem ipsum text). I will freely admit that coming off the salary drug is a tough task, but having the freedom to do my own thing, develop my skills and work the hours that I want is proving very satisfying so far.

For the moment, I’m getting involved in a really wide range of work: project management, tech consultancy, AWS engineering, digital marketing and analytics, security consultancy and more. The reason for getting stuck into such a wide range of tasks is so I can work out the most suitable area to focus on in the future, both in terms of what I’m good at and enjoy doing, and what I find there is most demand for in the marketplace.

If you would like to work with me or you’d like to discuss an opportunity for us to collaborate, drop me a line through LinkedIn, email me at tom@tomgeraghty.co.uk or pop in to see me in Andalusia, if you can get past the goats on the highway.

GDPR, and how I spent a month chasing my data.

[Image: Sherlock Holmes, as played by Basil Rathbone]

In May 2018, I received a letter from a local firm of solicitors, Roythornes, advertising a property investment event. I hadn’t heard of them and I was damn sure I hadn’t given them permission to write to me at home. They were wide of the mark, to say the least – I’m an unlikely potential property tycoon, unless we’re playing Monopoly. Even then, I’m a long shot.

It was a quiet week at work so given the recent implementation of GDPR and the fact that I really don’t like junk mail, I thought I’d give the new Data Subject Access Request (DSAR) process a whirl.

I couldn’t find a contact at Roythornes to send a DSAR to but, helpfully, GDPR places no restriction on the medium someone can use to make a request, so at around 9am that day I filled in their online contact form, despite my concerns that it would get picked up by a clueless admin assistant. I requested a copy of the data they hold on me, the source of that data and the evidence of my opt-in for direct mail. I also asked that they delete the data they hold on me and send no further marketing material.

At 1:42pm, I had an email from “Norma” of Roythornes (not joking, sorry Norma), asking for a copy of the letter and stating that she couldn’t find me in their database. So far, so good…

At 1:55pm, I received an automated recall email from Norma.

A few hours later, another email arrived, this time from the firm’s “compliance partner”, stating that they had acquired my personal data in a mailing list they purchased from Lloyd James Media, on the 1st of May 2018, and that my letter was sent out on the 21st May 2018. She stressed that the purchase of the list, and the sending of the letter itself was prior to the GDPR implementation date of 25th May 2018, and therefore legal.

Solicitors abiding by the letter of the law, not the spirit of the law? Imagine that.

Dancing around technicalities notwithstanding, Roythornes did confirm that my data had been deleted and I wouldn’t be hearing from them again. Phase 1 complete, but who exactly were Lloyd James Media, and how did my data fall into their hands?

For Phase 2 of my quest, a quick google told me that Lloyd James Media “is a multi-channel data agency focusing on the intelligent use of data to optimise customer acquisition and retention.” I can only assume this translates as “we make and sell mailing lists”. So, off went my DSAR email to their sales email address, because yet again, there was no contact information available for non-sales enquiries, let alone DSARs.

Andrew of the compliance team at Lloyd James Media only took 24 hours to get back to me. He confirmed that they sold my personal data to Roythornes for a postal campaign. They had acquired my data from another firm, “My Offers”, and the consent for postal marketing was obtained by them, apparently. Helpfully,  Andrew suggested I get in touch with the “compliance team” at My Offers. Evidently, this is a team of one, someone called Saydan, whose email address Andrew provided. I reminded Andrew to remove my data from their system and headed off to continue the hunt for the true source of my data, feeling like a geek version of Bear Grylls tracking an elusive squirrel. Phase 3 had begun.

My Offers are “Europe’s leading online direct marketing company with a database of 22.2million registered users.” I fired my third DSAR email off to Saydan later that day. One week later, I’d heard nothing. According to GDPR, there is no need for an immediate response, as long as the DSAR is executed within a month, but the silence was unnerving. Was Saydan trying to ghost me? I found their Facebook page and sent a message to whichever poor soul supports their social channels. For good measure, I also dredged LinkedIn for their employees, emailed Ivan, one of their directors, and in true annoying millennial style, tweeted at them. The only response was from their Facebook team, who reiterated that I should email the enigmatic Saydan, then also went quiet on me.

Over the next few weeks, because nothing seemed to be happening, I pinged their Facebook team a courtesy message once each week with a gentle reminder of the impending deadline for a response. Part of me was relishing the prospect of not getting a reply, and I began googling, “What to do if someone doesn’t respond to a DSAR.” I was way too invested in this.

Then, exactly one month to the day since the original request, an email from Ivan, the My Offers director arrived in my inbox. Ivan’s email was straight to the point and only had a few spelling mistakes. Attached was a password protected CSV file containing all the information I’d requested. The password was sent in a separate email. So far, so good (though, yet again, I had to remind him to remove my data from their systems).

The CSV file was interesting. And by interesting, I mean in the way that hearing the creak of a stair when you’re in bed, and there’s nobody else in the house is interesting. The data contained my full name, birth date, gender, home address, email address, phone number, data subject acquisition date and source (Facebook), as well as a list of businesses that my data had been shared with in the past year. The list totalled around 60, including Lloyd James Media, various PPI and no-win no-fee firms, and more. That explains all the marketing calls over the past year then.

This CSV file was the smoking gun. However, the trigger was evidently pulled by my own fair hand. At some point, possibly whilst drunk, bored at work, or both, I’d clicked on a campaign offering me a beard trimmer. I still don’t have a beard trimmer (I do have a beard), so presumably I didn’t pursue the purchase, but in getting only that far I inadvertently provided My Offers with access to my personal data, and consent for direct marketing. Sounding eerily familiar, I wondered if my voting choices in the last election were my own making.

So, just over a month after I sent my first DSAR to a local firm, what have I learned from this?

Firstly, GDPR actually works. Not only was the DSAR process easy to do, it was free (for me), and two out of three firms responded within 24 hours. Presumably GDPR is also helping to reduce unwanted junk mail; after all, Roythornes as good as admitted that they wouldn’t have posted the initial letter to me after the GDPR implementation date.

Secondly, once your data is out there, it gets around. It only takes one “online direct marketing company” to get hold of it, and your personal information will spread faster than nationalism in Europe.

Finally, don’t be dumb on facebook (like me). We know about Cambridge Analytica of course, but they’re not the only guys trying to harvest information and profit from it. Resist the lure of data-harvesting surveys and competitions, even when drunk.

The Three Ways

The Three Ways are among the underlying principles of what some people call DevOps (and what other people call “doing stuff right”). Read on for a description of each, which, when combined, will help you drive performance improvements, deliver higher quality services, and reduce operational costs.

1. Systems thinking.

Systems thinking involves taking into account the entire flow of a system. This means that when you’re establishing requirements or designing improvements to a structure, process, or function, you don’t focus on a single silo, department, or element. This principle is reflected in the “Toyota way” and in the excellent book “The Goal” by Eliyahu M. Goldratt and Jeff Cox. By utilising systems thinking, you should never pass a defect downstream, or increase the speed of a non-bottleneck function. In order to properly utilise this principle, you need to seek to achieve a profound understanding of the complete system.

It is also necessary to avoid 100% utilisation of any role in a process; in fact it’s important to bring utilisation below 80% or so in order to keep wait times acceptable.
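
The rule of thumb behind that 80% figure – queueing theory, as popularised in The Phoenix Project – is that wait time scales with the ratio of busy time to idle time. At 50% utilisation the ratio is 0.5/0.5 = 1; at 80% it’s 0.8/0.2 = 4; at 95% it’s 0.95/0.05 = 19. Squeezing out those last few percent of “efficiency” costs an order of magnitude in waiting.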

2. Amplification of feedback loops.

Any (good) process has feedback loops – loops that allow corrections to be made, improvements to be identified and implemented, and those improvements to be measured, checked and re-iterated. For example, in a busy restaurant kitchen, delivering meatballs and pasta, if the guy making the tomato sauce has added too much salt, it’ll be picked up by someone tasting the dish before it gets taken away by the waiter, but by then the dish is ruined. Maybe it should be picked up by the chef making the meatballs, before it’s added to the pasta? Maybe it should be picked up at hand-off between the two chefs? How about checking it before it even leaves the tomato sauce-guy’s station? By shortening the feedback loop, mistakes are found faster, rectified easier, and the impact on the whole system – and the product – is lower.

3. Continuous Improvement.

A culture of continual experimentation, improvement, taking risks and learning from failure will trump a culture of tradition and safety every time. It is only by mastering skills and taking ownership of mistakes that we can take those risks without incurring costly failures.

Repetition and practice is the key to mastery, and by considering every process as an evolutionary stage rather than a defined method, it is possible to continuously improve and adapt to even very dramatic change.

It is important to allocate time to improvement, which could come out of the 20% of “idle” time you’ve created if you’ve properly managed the utilisation of a role. Without allocating time to actually focus on improvement, inefficiencies and flaws will continue and amplify, costing far more than the “impact” of reducing that resource’s utilisation.

By utilising the three ways as above, by introducing faults into systems to increase resilience, and by fostering a culture that rewards risk taking while owning mistakes, you’ll drive higher quality outcomes, higher performance, lower costs and lower stress!

For my presentation on the Three Ways, click here. Feel free to use, adapt, and feed back to me 🙂

Streaming music services and the future of consuming music

I’m listening to Spotify while I write this. I’ve been a premium subscriber since early 2010, which means I’ve so far paid Spotify £390, of which around 70% has gone to rights holders. It took me a while to get used to the idea that I didn’t “own” the music I was listening to, but the benefits of being able to listen to anything I wanted, whenever I wanted, and the chance to discover new music made up for it. I now believe that as long as streaming services exist, I’ll never buy a CD again. I won’t bang on about how great it is, because you’re generally either into streaming or not, and that usually depends on how you listen to your music.

There’s a lot of bad press about streaming services and the supposedly bad deal that the content creators (artists) get from them. Atoms for Peace pulled their albums from Spotify and other streaming services, with band members Thom Yorke and Nigel Godrich criticising these companies for business models that they claimed were weighted against emerging artists. I disagree. Anyone who thinks they can create some music and make a living from it using streaming services alone is living in a dream world. The music business has changed, and for the better in my opinion. Gone are the days when a band could release a CD, sell hundreds of thousands or millions of copies and rake in the big bucks (and don’t forget the record labels and other third parties taking the lion’s share). Some people compare streaming to that old business model, and that’s where it looks like the artists are getting a worse deal, but it’s not a fair comparison.

Musician Zoë Keating earned $808 from 201,412 Spotify streams of tracks from two of her older releases in the first half of 2013, according to figures published by the cellist as a Google Doc. Spotify apparently pays 0.4 cents (around 0.3p) per stream to the artist. When artists sell music (such as a CD), they get a one-off cut of the selling price. When that music is being streamed, they get a (much smaller) payment for every play. Musician Sam Duckworth recently explained how 4,685 Spotify plays of his last solo album earned him £19.22, but the question is just as much about how much streams of the album might earn him over the next 10, 20, 30 years.

If you created an album yourself, and you had a choice between two customers – one who would buy the CD, giving you a £0.40 cut, and one who would stream it, providing you with £0.004 per stream – which customer would you choose? Part of this actually might depend on how good you think your music is, and how enduring its appeal will be. If it’s good enough, and all the songs on that album are good (all killer, no filler!), then it’s going to get played a lot, making streaming more lucrative over time; but if it’s poor, with only a couple of decent tracks, and maybe not as enduring as it could be (think Beatles vs One Direction), then a CD is going to be more lucrative, because after a year or so that CD is going to be collecting dust at the bottom of the shelf, never to be played again.
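
To put some rough numbers on that choice: at £0.004 per stream, the streaming customer overtakes the £0.40 CD cut after around 100 plays of the album’s tracks – which a fan of a genuinely good record will rack up within a year or two, but an album with only two decent tracks never will.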

I can’t easily find a way to show the number of plays per track in my Spotify library, apart from my Last.fm scrobble stats, which won’t be entirely accurate as they only record what I listen to in online mode, but I’ve pasted the top plays per artist below:

The Gaslight Anthem (621 plays)

Chuck Ragan (520 plays)

Frank Turner (516 plays)

Silversun Pickups (425 plays)

Biffy Clyro (305 plays)

Ben Howard (302 plays)

Sucioperro (241 plays)

Eddie Vedder (225 plays)

Blind Melon (173 plays)

Foo Fighters (166 plays)

Iron & Wine (141 plays)

Saosin (121 plays)

Benjamin Francis Leftwich (119 plays)

Cory Branan (116 plays)

Twin Atlantic (112 plays)

Kassidy (101 plays)

Funeral for a Friend (94 plays)

Molly Durnin (89 plays)

Crucially, of the 18 artists above, at least 4 or 5 are artists that I discovered on Spotify. The radio and “discover” tools on it are actually really good (90% of the time), and of those 4–5 discovered artists, I’ve seen two of them live in the past year or so. If we stop trying to think in pure instant-revenue terms, streaming services provide a great part of a business model that includes long-term small payments to artists and allows consumers to discover new music more easily.

Artists need to build themselves a business that incorporates records, songs, merchandise and/or tickets, and look for simple ways to maximise all those revenues.

Crucially, they also need to start developing premium products and services for their core fanbase – fans who have always been willing to buy more than a gig ticket every year and a record every other, but who were often left under-supplied by the old music business. Which is why, for artists, the real revolution caused by the web isn’t the emerging streaming market, but the boom in direct-to-fan and pre-order sites.

Frank Turner believes we may eventually move towards a model where all music is free, but artists are fairly compensated. Talking about piracy and torrenting, he says:  “I can kind of accept that people download music without paying for it, but when the same people complain about, say, merch prices or ticket prices, I get a little frustrated.” “I make the vast majority of my living from live, and also from merch. Record sales tick over.”

If you look at Frank Turner’s gig archive, you’ll see he’s performed at almost 1,500 live shows from 2004 to 2013. Most of the musicians I know do what they do because they love playing music, and particularly so in front of an audience. I personally believe that live music should be the core of any musician’s revenue stream, with physical music sales, streaming, merchandise, advertising, sponsorship, and other sales providing longer term revenue. Frank seems pretty hot on Spotify, and has released a live EP exclusive to the service.

I also believe the format of live shows will change too. I love small gigs in dark little venues such as the Rescue Rooms in Nottingham, but as artists become more popular and play larger venues, there is naturally some loss of fan interaction. With the use of mobile technology, social networks, and heavy-duty wifi (802.11ac, for example), large venues can begin to let artists interact with fans and provide a more immersive experience. Prior to or while the artist is on stage, content can be pushed to the mobile devices of those in the audience: what track is being played, for example, with links to download or stream it later, exclusive content such as video and photos, merchandise, future gig listings, and even the ability to interact with other fans in the venue or elsewhere.

The future is a healthier relationship between services like Spotify and musicians, where both can find more ways to make money by pointing fans towards tickets, merchandise, box-sets, memberships, crowdfunding campaigns such as Songkick’s Detour, and turning simple concerts into fuller experiences for fans.

How to write an SPF record

An SPF record is a DNS TXT record (alongside the likes of A records and MX records) that indicates to receiving mail servers whether an email has come from a server that is “allowed” to send email for that domain – i.e. it’s a check that should prevent spammers impersonating your domain. It does rely on the receiving server actually doing the check, which not all do, so it’s by no means foolproof, but it should help prevent mass email from your organisation to customers being flagged as potential spam.

 

Below is an example SPF record for capitalfmarena.com:

(this is in the public domain – you can look up an organisation’s SPF record by using online SPF checkers)

 

"v=spf1 ip4:93.174.143.18 mx a:service69.mimecast.com mx a:service70.mimecast.com a:capitalfmarena.com -all"

v=spf1 specifies the type (and version) of record this is: SPF.

ip4: – pass if the sender’s IP address matches the address(es) we send mail from.

mx – pass if the sender’s IP matches one of the domain’s MX records.

a:hostname – pass if the sender’s IP matches the A record of the named host (or of the domain itself, if no host is given).

The -all at the end indicates that all other senders fail the SPF test. (+all would mean anyone can send mail; ~all is a soft fail, which was useful while SPF was still being adopted, but shouldn’t really be used any longer other than when you’re transitioning between mail hosts or something.)

Mechanisms are tested in order and any match will pass the email. A non-match results in a neutral state, until it gets to the end of the string where the -all mechanism will fail it.
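
If you want to see what a domain is currently publishing, you can pull its TXT records from the command line (dig on Mac/Linux, nslookup on Windows):

dig +short TXT capitalfmarena.com

nslookup -type=TXT capitalfmarena.com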

 

IT & Web Infrastructure Masterclasses

Through March 2013, I’m running a set of IT and Web infrastructure masterclasses in Nottingham (in conjunction with PCM Projects), for people who don’t necessarily work in IT, but need to know (or would benefit from knowing) some of the basics.

The intended audience is small business owners or managers, where you may have to deal with IT contractors or staff and decide IT and web strategy, but you’re not comfortable that you know enough about it to make informed decisions. For example, there are an almost infinite number of ways to keep your business data accessible, secure, backed up, and away from prying eyes, but which way is best for you? How should you manage your website – should you pay someone else to design and host it, or bring it in-house? How should you handle email, on what sort of server? How should you plan for business growth? How do you protect your business from viruses, malware, spam, and hacking attempts?

These are the sort of questions that I will help you with – you don’t need any knowledge of IT or the web already, and because the groups are small – around 6 people – you’ll be able to ask questions and find out information specific to how your business operates.

You’ll then have enough knowledge to go to your suppliers or contractors, and ask the right questions, purchase the right services, at the right price.

There are four sessions, as below, and you can book yourself on them by visiting the eventbrite page for the events. Contact me for any further information.

 

Technically Speaking – 4 March

Topics to include: an overview of web/IT infrastructure and how it all fits together; an update on the current climate; domain names, analytics, and connections to social technology.

 

Email & Communication – 11 March

Topics to include: different service providers and set-ups (e.g., using hosted email, managing it in-house) and getting it all working for PCs and on mobile devices; good email practice, transferring data and keeping it secure.

 

Internet Security – 18 March

Topics to include: how to stay safe and keep trading; what are the threats – viruses, hack attacks, theft, loss of confidential or valuable data; keeping your business (and family) safe on the internet; and keeping your systems up to date and secure.

 

Data storage – 25 March

Topics to include: managing data storage and growth in your business; internal networks and cloud storage; back-ups; access controls, speed vs. reliability vs. cost.

Virtual Domain Controllers and time in a hyper-V environment

In a “normal” (read: physical) domain environment, all the domain member machines, such as servers and PCs, use the PDC (Primary Domain Controller) as the authoritative time source. This keeps all the machines in a domain synchronised to within a few milliseconds and avoids any problems due to time mismatch. (If you’ve ever tried to join a PC to a domain with a significantly different time setting, you’ll have seen how this can affect Active Directory operations.)

However, virtual machines are slightly different. VMs use their virtual host as the authoritative time server – it’s essential that the virtual host and the guests operate on the same time. Run the below command in a command prompt on a VM:

C:\>w32tm /query /source

And it should return:

VM IC Time Synchronization Provider

If you run the same command on the host itself, it’ll just return the name of one of the domain controllers in your network (probably, but not necessarily, the PDC).

Now, what if your domain controllers are virtual? They’ll be using their host machine’s time as the source, but the hosts themselves will be using the PDC as an authoritative time source – the problem is clear: they’re using each other as authoritative time sources and network time will slowly drift away from the correct time.

You may decide to disable integration services for the guest (the PDC), and configure an authoritative external time source, but if the PDC is rebooted or goes offline and comes back online with a different time than the host (such as after a restore), you’ll have problems. Granted, this should fix 90% of issues, but I wouldn’t recommend it as a solution.

[Image: Disable integration services in Hyper-V]

In an ideal world, you’d still have at least one physical PDC, which would use an external time source and serve time to all other machines in the network, but if your infrastructure is such that you only have virtual domain controllers, you’ll need to do something a little different. The best way to do this is to set your virtual hosts to use the same external (reliable) time source. This does of course require that your virtual hosts have access to the internet, but at least you should be able to add firewall rules that only allow access to a fixed range of NTP servers, which should pose no real security threat.

To do this, log on to your (windows) virtual host (in this case, I’m using Hyper-V server 2008 R2).

Run

C:\>w32tm /query /source

And it’ll return one of the domain controllers.

Use the command prompt to open regedit, and navigate to HKLM\SYSTEM\CurrentControlSet\Services\W32Time\Parameters.

It’ll probably look like this:


Change the “Type” entry to “NTP” and, if you like, change the NtpServer entry to something other than the default Windows time servers, although you can leave it as it is if you wish.

[Image: registry time settings]
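
If you’d rather script this than click around in regedit, the equivalent reg add commands look something like the below – the pool.ntp.org servers are just an example, and the “,0x9” suffix tells w32time to poll them as a client at the special interval:

reg add HKLM\SYSTEM\CurrentControlSet\Services\W32Time\Parameters /v Type /t REG_SZ /d NTP /f

reg add HKLM\SYSTEM\CurrentControlSet\Services\W32Time\Parameters /v NtpServer /t REG_SZ /d "0.pool.ntp.org,0x9 1.pool.ntp.org,0x9" /f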

Now that you’ve changed the registry entries, run:

net stop w32time & net start w32time

then

w32tm /query /source

And it should return the new internet time servers.

Run:

w32tm /resync /force

to force a resync of the machine’s clock.

Log on to the virtual machine running on this host, and check the time. Force a resync if you want – it won’t do any harm, and at least you’ll know it’s synced.

If you now run:

W32tm /monitor

on any machine, it will display the potential time servers in your network, and the time offset between them. If all is correct in your network, the offset should be pretty small (though it will never be zero)

domaincontroller1.domain.local *** PDC ***[ipaddress:123]:
    ICMP: 0ms delay
    NTP: +0.0000000s offset from domaincontroller2.domain.local
        RefID: 80.84.77.86.rev.sfr.net [86.77.84.80]
        Stratum: 2
domaincontroller2.domain.local[ipaddress:123]:
    ICMP: 0ms delay
    NTP: -0.0827715s offset from domaincontroller1.local
        RefID: 80.84.77.86.rev.sfr.net [86.77.84.80]
        Stratum: 2
Warning:
Reverse name resolution is best effort. It may not be
correct since RefID field in time packets differs across
NTP implementations and may not be using IP addresses.

 

If you find a domain member machine (whether it’s a server or simple client) which is not set to use the proper domain NTP server, run the below command:

w32tm /config /syncfromflags:DOMHIER /update

This command instructs the machine to search for and use the best time source in a domain hierarchy.