This is the third post of five in the BurrMill 101 crash course series. This is a walkthrough, by the end of which you will have secured your account, created a project, built and packaged all software, created a bootable OS image, and requested a quota for your computation: everything nearly ready to hit the button and start churning your data at scale.
Understand that this is a marathon. If you sprint at the beginning, you will vomit on your shoes by the end.
— Aisha S. Ahmad, The Chronicle of Higher Education, 2020-03-27
Pace yourself. You may feel that things escalate quickly. If you have questions, ask in the BurrMill Q&A forum. Try to understand everything that you do.
You will need a credit card, or better two. The catch is, once your bill lapses (suppose you forgot to register a renewed card after its expiration), your quota gets reset, and you’ll go through the requesting process again, this time writing a convincing explanation why you lapsed. Other payment methods are available, and vary by country (here in the US, I’ve seen an option to set up a direct debit from a bank account).
If you are a business, you can establish PO terms instead. Call Google for assistance if you want to go this way. I use a credit card and reimburse expenses, so I cannot give you any advice here.
Please follow the steps here exactly, and in the suggested order. The reason is, even this sequence leads to a “helpfully” created project for you, which we get rid of. I evaluated the sequence a few times to minimize the impact of this unasked for “help.”
Create and secure your account
1.1. Create a Google Account
If you already have one (example@gmail.com) and want to use it as a GCP account, skip to 1.2. You may create a separate account for BurrMill only, if you wish. Go to GMail, click your account picture at the top right, and select Add another account. There are no particular pros or cons to using a separate account.
If you do not have one, go to https://myaccount.google.com/ and click on Create a Google Account. On the next page, do not select Use my current e-mail address; make a new one at the GMail.com domain. The next screen will take you through phone verification (the phone is used once: you choose whether to keep the phone number on your account or forget it), and your new account is ready.
1.2. Add a recovery option
Always add a way to unlock your account in case you forget your password. Do not rely on your memory, however good you think it is. Register a phone or e-mail of a family member or a friend that you can send an unlock code to. Your GCP operations may get as expensive as you let them. You must be sure that you can connect and stop them.
1.3. Enable 2FA on your Google Account
Did I get enough of your attention? Let’s try again.
1.3. Do enable 2FA on your Google Account
No. Not tomorrow. Do it now. This is important. This is very important. I can repeat that over and over: this is the most important safety and security measure ever invented since the fully enclosed bread slicer. It is so important that you should not even peek at Part Two before you complete this setup.
You do not want to pay for someone’s mining cryptocurrency using your GCP resources. There are many verification options: your phone, a Google Authenticator app, a hardware cryptographic token. The phone is the simplest option: when you log on, your phone opens a page asking you if it was really you.
The idea of two-factor authentication (2FA) is very simple. To complete a successful login, there must be (1) some bit of information that you know, and (2) some physical device or token that you possess. In the case of 2FA, you (1) know your password and (2) possess your phone. An attacker obtaining one does not get access to your account, but immediately reveals that your 2FA is compromised.✼
✼ If your phone suddenly asks you to confirm a login attempt that you did not make, you know that the part you know is stolen. If you lose your phone, that is a sure indication that the part you used to possess you no longer possess. Neither causes the full compromise of your account. It takes seconds to revoke the lost part and remedy the attack.
Note that (2) is prone to the problem common to all our worldly possessions: one moment you have it, and the next you realize you do not. Think how many different second factors you tie to one device: it will disappear with all of them. Register a second device: your second phone (it can even be a landline; you can have a robot call you and read the code) or that of a family member you can quickly contact and ask to read you the code sent or called in to their phone. You can also print and store in a safe place 10 one-time 8-digit codes that you can use in place of the second factor if you lose your phone (each code is single-use). Google will not unlock your account for you if you lose your second factor device. Google Cloud Billing Support will flatly refuse to talk to you unless you send the request from the account owner form. Keep the backup, or register two or more ways to do 2FA.
If you think this is extra complexity, Ashla of course balances Bogan: your password may be short, nice, sweet and easy to remember (just never use the same password on two sites, ever), and its complexity is no longer important: as soon as you lose your phone, the attacker has only as much time to break in as it takes you to log in with a rescue code and deactivate the lost device. Absent 2FA, they have eternity. This is the main advantage of 2FA: you can (and must!) react quickly to the loss of one factor, and remedy the problem in a time too short for a break-in. And anyway, your lost phone is 999 times more likely to become some homeless person’s next hardware upgrade, until you disconnect the service, than to get into the hands of a genuine codebreaker.
You must upgrade your existing Google Account to 2FA security, the same way as a new one. Open the same https://myaccount.google.com/, click on Security on the left, and select 2-Step Verification in the main screen. Follow the instructions. And please print and save the rescue codes (and do not store them on your phone, of course). You can revoke them and print new ones any time you suspect they came into someone’s possession.
The simplest option is Google Prompt. When you enter your login and password, your phone simply pops up a box asking you if it was really you who initiated the login. You simply click on yes, and you are done. The Authenticator option is useful if your phone has no connection to the internet (for example, when you log in from a wired computer inside a building with little reception). Install it from the Play Store and follow the instructions to register. It shows a 6-digit code that changes every 30 seconds. It allows some slack, so even if your phone is off the network for a few hours and its own clock is fast or slow, the code will still work. By the way, it follows open standards, and works with many other online and offline sites.
Pay attention to the areas highlighted with two rectangles, red on top and yellow on bottom. The top rectangle lists the second factors that you lose with your phone. The bottom one marks those that you do not. Make sure you still have access to your account if you lose your phone. You can add as many phone numbers as needed. The owner of such a phone cannot log in without your password; rather, if you lose your phone, ask the phone owner to read you the code they receive when you attempt to log in, and then register your new phone, or print enough single-use codes while waiting for a replacement to arrive in the mail.
Not shown on this picture, down below on the same page, there is an option to enroll a hardware device, like a Yubikey. The device is convenient and very secure; even if you leave it plugged in, it requires the touch of a physical button to interact with the browser to log in. This means it’s off-limits to a remote attacker or malware: its cryptoprocessor is physically hardwired to act only if you touch the button. It’s as simple to use as Google Prompt (one touch, and you are logged in), only it requires no phone with an Internet connection and does not lose charge. It also falls into the “not part of your phone” category, and is good as either a primary or a backup second factor device. But do register a phone as the secondary, because if you lose the Yubikey...✼
✼ I did while working at Google, together with the access badge and all my keys. A stranger picked them up, found me by the name on the badge and sent me a LinkedIn message. By the time I responded, the bike of this exceptionally nice lady, who spent so much of her valuable time locating and trying to contact me, was already stolen, together with the bag and my keyring in it. Do always register two physically separate factors in the possess category!
Final note. When logging in from a browser that you never logged in from before, you’ll be presented with a checkbox asking if you trust this machine to skip 2FA next time. Feel free to do that on your home computer or your laptop. Again, if the computer gets lost or stolen, log in to https://myaccount.google.com/, and in the same Security section you’ll see all authorized computers. De-authorize the lost one. If in doubt which one it was, de-authorize all that you suspect could be the one. Or de-authorize all: you’ll simply be asked to log in next time on that computer; it’s not that you deactivate it forever. Google just “forgets” that you once decided to trust it. This is a mechanism independent of 2FA, by the way. If you discover there a phone you disposed of a year ago… better revoke its authorization, really. If you logged on from a public computer at a library and clicked on that “trust this computer” box out of habit, it’s also a good idea to revoke it from the same Account controls page.
It’s good security hygiene to periodically check on this very page which machines you are logged on from, or to use the Security Checkup, available on the same page.
Connect to GCP with your Google Account
Now your account is good to go for big computation and protected against gigaflop thieves. Time to log in to GCP with it, and establish services. Let’s recap:
- GCP provides services that are controlled via REST APIs.
- You can call these APIs directly from programs, or using curl, or via higher-level libraries.
- Most APIs have a Web UI, collected under the umbrella of the Cloud Console.
- Most APIs have a command-line UI, which is provided by the Cloud SDK.
Note the word most under items 3 and 4. There are a few exceptions. In particular, the setup of the services is available only through the Cloud Console. This is where you provide your consent to the services and agree with the GCP TOS. In fact, you’ll have to accept three different agreements, the second of which you’ll be bound to for no longer than a minute, as it’s superseded by the third one! Still, this is what lawyers have us do, and there is no other way to do that.
Reminder: all you do is free of charge unless I note otherwise.
2.1. Login to Cloud Console and accept TOS
Go to https://console.cloud.google.com/. Since this is your first login, you’ll not go further before accepting the TOS. It’s a legal document, and explains what you need to do to be kicked out from GCP with a lifetime ban. Read it, but as a short summary,
- Mine cryptocurrency.✼
✼ This activity would be very lucrative for Google. It’s an easy-to-plan load on the GPU, and provides good resource use. The problem with it is widespread account theft. Accounts are broken into and stolen by cryptominers who do not want to pay for the resources, and so often that this has become a problem: it just attracts too much crime. So Google’s decision to ban cryptomining helps protect other customers who did not read Part One, and are not as security-savvy as you.
- Intentionally compromise security or health of the GCP platform.✼
✼ If you found a security hole, do not sweat and close your browser and hope they won’t notice. They won’t: the guardian AI is trained to recognize known patterns of illegal activity. Instead, please open an issue in a private tracker (so it’s visible only to you and Google) and report it. I did many times, and I am still here with you in one piece. Only intentional breaking is prohibited.
The TOS is very sensible. Read it, or at least skim.
2.2. Start your free trial
As soon as you accept the GCP TOS, you are at the main screen of Cloud Console. Disable ad blockers and cookie eaters on this page and on all pages in the .cloud.google.com domain, as they are known to interfere with it, and then refresh the page. Look at the very top edge of it on the right, and you’ll see two buttons.
One of them starts your Cloud Shell machine (a small command-line Debian VM in a browser). Press it first, because provisioning of the machine may take up to 5 minutes,✼ and you can go about other business meanwhile. Then press the second button to activate your free trial. This gives you $300 in credit to spend in up to 365 days.
✼ If you do not see a white OS prompt on a black background in 5 minutes, something is not right, and waiting more makes no sense. DevShell is still in beta, and I’ve seen it not starting in specific versions of Firefox. Log in from Google Chrome and try again. Often this happens during the provisioning time only; maybe a connection just times out. Later runs should be fine: only the initial setup takes a long time.
You’ll be taken to the billing page, asking you (you guessed it!) to accept the TOS of the free trial. You do not have to read it: the trial account is limited, and you are not going to use it anyway, as you will convert it into a full billing account right away (you keep the $300 in any case). Just accept and go on.
On the next page, you’ll enter your credit card. I guess you do not need much guidance through that process.
2.3. End your free trial
That was a quick trial, wasn’t it?
Starting the trial takes you back to the main Console screen. Notice two things that you’ll be using often. The first is the Project Selector. Remember you may have many GCP projects? This is what sets the current project for the Console session (it does not affect other places that have a notion of a current or default project). This is the thing I circled yellow at the top bar. A project was created by GCP, and this project probably is not a good starting point for BurrMill:
- It has an autogenerated project ID of the form adjective-noun-[0-9]{6}, and you have to type it once in a while in console commands. If you can live with your BurrMill project called gregarious-tyrannosaurus-318066,✼ you are… let’s just agree, you are not like me. The human-readable project name “My First Project” can be changed, but the ID is both permanent and unique in GCP.
- It has a lot of services enabled, and we do not need them. The BurrMill initialization script takes care of that, but it takes longer. I prefer starting from a clean slate.
✼ Ok, ok, I invented the name. I just asked GCP to generate 3 random project names for me, and it came up with “arched-album-271409”, “omega-healer-271409” and “upbeat-bolt-271409.” Maybe the tyrannosaurus was not that bad…
On the left, there is a hamburger menu of different services, that goes way, way down. Note the oval on top that reads “pins go here.” Pins are set per account, and do not change if you select a different project. We’ll pin the services that are useful to us later; now just click on the Billing item.
Hit the Upgrade button, and now you have a full GCP account! Or… Wait. I think you’ll have to agree to a full billing account TOS. I do not remember if I even read it, but probably it makes sense.
While you are on the same page, click on that “Manage Account” link, and add a second credit card to it. The selection on the left menu is “Payment settings,” and it’s second from the bottom at the time of this writing. I already explained that a single failed payment will reset your quota, if not disable all paid services on the account. It has not happened to me, but I highly recommend registering a backup payment method.
Depending on your location, other payment methods are available, such as direct debit from a bank account. You’ll find the details on the same payment methods page.
Last thing to do is go to the home page by clicking on the “Home” item in the main menu and neatly arrange the services that you may want to use in the pin order. On the screenshot is the arrangement that I find useful, and if you have no better idea, you can do the same thing (I cut it at the services that you are not likely to use through console; I needed them for debugging BurrMill). Note that the pinned services may be rearranged by dragging and dropping.
Now check if your Cloud Shell is alive (it’s still in beta, and sometimes finicky when starting up). If not, sometimes this trick helps: click on the “+” icon above the terminal to open a new tab, and a command-line prompt appears immediately. If it does not, bad luck; try another browser. If it still refuses to appear, you may install the SDK locally. If it does… you may also want the SDK installed locally. The next chapter explains how to do it. All the commands are exactly the same: the identity which issues the API commands on your behalf is the same almighty account with which you started the setup in this section, and, as a project owner, you have nearly full access to everything there is.
If you leave Cloud Shell open, the session will expire in 12 hours, together with about ⅕ of your weekly quota, which is 50 or 60 hours. Log out of all Cloud Shell terminals when you do not need them.
Set up command-line access
There are two main options for controlling GCP with command-line tools: Cloud Shell in the browser, or a local install of the Cloud SDK. Although the local Cloud SDK is available for many platforms (Windows, Mac, Linux), only the Linux setup is good for BurrMill, since most of our scripts are written in Bash and depend on other Linux tools, not available directly on other platforms. You can, however, use WSL on Windows or Homebrew on Mac to run it.
You can use either Cloud Shell or a local shell, and have both of them activated. But you must set up at least one of them.
3.1. Set up Cloud Shell
If you don’t want to use Cloud Shell, skip to 3.2.
This is optional, but highly recommended. Cloud Shell, also known as DevShell, is a free small machine to perform administrative tasks in your project.
Every time you log on to the Cloud Shell, the system resurrects your home disk from storage. As long as you log on once every 120 days, the disk is kept indefinitely. You can copy your own setup files to it (like .vimrc, for example). All sessions running in browser windows are screens in tmux, so that you may even reload the browser and get reconnected to the live session.
There are buttons invoking a menu on the DevShell toolbar; there you can adjust font sizes and color theme, and download or upload files to the VM right from the browser: scp or sftp is not required.
Be careful editing files executed by bash on login and logout. GCP depends on some files sourced from them. Near the end of .bashrc there are 3 lines where the file bashrc.google is sourced. If you remove the sourcing line, the whole .bashrc file will be overwritten with a fresh copy on the next machine startup. Also, if you are customizing the shell prompt or, generally, anything at all, move this snippet from the end to the start of .bashrc: it overwrites many settings unconditionally, including PS1.
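After editing, it may be worth confirming that the sourcing line survived before you log out. The following is only a sketch: the function name sources_google_bashrc is made up for this illustration, and the check is a plain text search for an uncommented mention of bashrc.google, not a real parse of the shell script.

```shell
# Hypothetical helper: succeed if the given file still mentions bashrc.google
# on a line where no '#' precedes it. A text search only, not a shell parser.
sources_google_bashrc() {
  file=${1:-$HOME/.bashrc}
  grep -Eq '^[^#]*bashrc\.google' "$file" 2>/dev/null
}

if sources_google_bashrc "$HOME/.bashrc"; then
  echo ".bashrc still mentions bashrc.google"
else
  echo "No sourcing line found: .bashrc may be overwritten on next startup" >&2
fi
```

Run it in Cloud Shell after every edit of .bashrc; on a local machine the warning is expected and harmless.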
One trick I use is to set the default project ID, if one is not already passed by the Cloud Console. (We’ll cover why and when this happens much later, under tips and tricks; for now, DEVSHELL_PROJECT_ID is always passed to DevShell, so you add a harmless no-op.) But set it before sourcing Google’s file.
if [[ -f /google/devshell/bashrc.google ]]; then
  : ${DEVSHELL_PROJECT_ID:=my-wonderful-burrmill}  # You may add this to set the default.
  . /google/devshell/bashrc.google
fi
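In case the added line looks cryptic: the ${VAR:=default} expansion assigns the default only when the variable is unset or empty, and the leading colon is the shell’s do-nothing command, present only so that the expansion has somewhere to happen. Here is a small demonstration; the function name effective_project is invented for this example and works on a local copy, so your real DEVSHELL_PROJECT_ID is not touched.

```shell
# Hypothetical demo of the ': ${VAR:=default}' idiom used above.
# Prints the project ID that would be in effect for a given incoming value.
effective_project() {
  id=${1-}                              # value "passed by the Console", if any
  : "${id:=my-wonderful-burrmill}"      # assign the default only if unset or empty
  printf '%s\n' "$id"
}

effective_project                       # nothing passed: the default is used
effective_project passed-by-console     # a value was passed: it is kept
```

The first call prints my-wonderful-burrmill and the second prints passed-by-console, which is why the extra line is a harmless no-op whenever the Console does pass a project.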
Another trick is to set the terminal to use 256 colors. I always do. DevShell uses tmux, and its terminal type, screen, does support 256 colors in its screen-256color terminfo entry (and we colorize output in BurrMill scripts). The default type, screen, reports support for only 8 colors. This snippet may go anywhere you like. It has been in my various .bashrc files for years.
if [[ $(tput colors 2>/dev/null) -ge 8 ]]; then
  # Hazard a guess the terminal supports 256 colors; most emulators do.
  # tput is invoked only to check if ${TERM-}-256color is a valid termtype.
  TERM=${TERM-}-256color tput colors &>/dev/null &&
    export TERM=${TERM-}-256color
fi
It is against TOS to execute long-running automatic, unattended scripts in DevShell; it is a free helper tool intended for interactive administrative tasks only. BurrMill undoubtedly qualifies, since all our commands are interactive, even if you leave it unattended for a while. But running experiments most likely does not (occasionally connecting to the cluster login node running an experiment script in a tmux session to check up on it is totally ok, though).
3.2. Set up local Cloud SDK
If you are not setting up the SDK locally, you are done. You’ll need it on the machine from which you connect remotely, though. If you do set it up, make sure to perform the emergency preparedness step in 3.3 below.
Technically, Cloud SDK is a less secure option than Cloud Shell. The reason is that you always go through 2FA to access Cloud Shell. Contrary to that, once you authorize locally installed Cloud SDK (you do go through 2FA to authorize it), your super-duper-admin credential is stored on your machine, and you can do anything with the GCP projects you have access to.
This is not at all to say you should not. I have Cloud SDK on all my machines, including Debian WSL under Windows 10. Just be mindful that it provides the same level of security as a browser that you authorized to skip 2FA “next time you log on.” Also, the next subchapter will explain what to do if your computer is stolen or lost. We covered similar preparedness measures to cover browser access in this section; there is just a little bit more to add to your emergency checklist.
On major Linux distributions, it is uncommon to install Cloud SDK from Google’s distribution directly: it is available through the distro package installation channel. The installation instructions from Google cover it all. From this point on, I’m assuming that you see anything but Command gcloud not found when you type gcloud at the command prompt.
Authorizing an account involves a browser OAuth workflow. The command that you type is gcloud auth login, and in response either a browser opens, or, if your machine has no X or is headless, a URL to enter into the browser is printed. Here’s an example (your input is set in bold face):
$ gcloud auth login  # --no-launch-browser always runs this text-only workflow, even if a browser is available.
Go to the following link in your browser:

https://accounts.google.com/o/oauth2/auth?code_challenge=80qcbg8MXCHd9b7h9FO4oBt29yH2h_PCWddTN6wUoIE&prompt=select_account&code_challenge_method=S256&access_type=offline&redirect_uri=urn%3Aietf%3Awg%3Aoauth%3A2.0%3Aoob&response_type=code&client_id=32555940559.apps.googleusercontent.com&scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fuserinfo.email+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcloud-platform+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fappengine.admin+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcompute+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Faccounts.reauth

Enter verification code: 4/xgERUjylZihxNSIQ8t5a9uWQlkMMeXa1Ska0IJHQUjqp_eC-f02oQa0
You are now logged in as [example@gmail.com].
As you probably noticed from reading the long URL printed by the command (you of course read it carefully, didn’t you?), who is logging on is not set in it anywhere. The identity that has logged on is encoded in the response (the string starting with 4/ above). This form of the login command without an argument unconditionally updates the credentials of the invoking user. If a credential exists on the local machine already, and you pass the user’s email as its argument, it simply activates the existing one:
$ gcloud auth login example@gmail.com
WARNING: Re-using locally stored credentials for [example@gmail.com]. To fetch new credentials, re-run the command with the --force flag.
You are now logged in as [example@gmail.com].
The browser workflow is straightforward. You just copy the whole long URL to the browser, and accounts.google.com asks you which account you want to authorize.
If you are not logged in already, you may need to enter the password, and possibly the second factor. After that (and take notice of what is going on!) you explicitly allow the requesting program (which is already registered with Google Accounts as a client relying on Google Accounts for authentication: this is the client_id=32555940559.apps.googleusercontent.com part in this URL) to act on your behalf for a set of specific operations (this is the long scope=... part).
After you select the account (and possibly enter your password and the second factor), Accounts decodes the URL and presents it to you in a human-readable form.
I am taking the time to explain this process because authentication and permission grants are all around GCP, although mostly transparent to you. The application is already registered with Google Accounts, and the scope identifiers listed in the URL are shown as readable descriptions of services: https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcloud-platform is a scope that is described by the first bullet, View and manage your data across Google Cloud Platform services, and so on. The long URL itself does not identify who is requesting access; this is the business of the OAuth trust provider, Google Accounts in this case. Any website where you see a “Login with Google” button essentially performs this process, only you do not usually see the ugly encoded URL. I just think this is a good place to demystify the login process.
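If you would like to read the requested scopes without squinting at percent signs, the scope value can be unpacked by hand. This is a rough sketch: decode_scopes is a name made up here, and it only undoes the two escapes that occur in this particular string (%3A for a colon and %2F for a slash), with the plus sign acting as the separator between scopes. It is not a general URL decoder.

```shell
# The scope= value copied from the URL printed by gcloud auth login above.
scope='https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fuserinfo.email+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcloud-platform+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fappengine.admin+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcompute+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Faccounts.reauth'

# Hypothetical helper: one scope per line, with ':' and '/' restored.
# Handles only %3A and %2F, which is all this string contains.
decode_scopes() {
  printf '%s\n' "$1" | tr '+' '\n' | sed -e 's|%3A|:|g' -e 's|%2F|/|g'
}

decode_scopes "$scope"
```

Each printed line is one permission that the consent page translates into a human-readable bullet.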
After Google Accounts confirms that you are the you you claim to be, it encrypts the user name using the code_challenge=... part of the URL, and redirects you (the redirect_uri=... part) to a page where the encrypted result is displayed.✼ This string is not much more than your user ID that only the requesting program can decrypt (it has generated the secret matching half of the challenge). This is a somewhat simplified, but not incorrect, description of the process.
✼ When you authorize a website to “Log in with Google”, you are not shown the code, because the redirect leads back to the website, and the encrypted code is passed invisibly, but the process is essentially the same.
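For the curious, here is a slightly less simplified view of the challenge part. The code_challenge_method=S256 parameter in the URL refers to the PKCE extension of OAuth (RFC 7636): the client keeps a random secret, called the code verifier, and puts into the URL only its SHA-256 hash, base64url-encoded without padding. The sketch below reproduces that derivation. It assumes openssl and base64 are installed; the function name is made up, and the verifier is the sample value from the RFC’s appendix, not anything gcloud generated.

```shell
# Derive a PKCE code challenge from a code verifier, per RFC 7636, method S256:
# base64url( SHA-256( verifier ) ), with the '=' padding removed.
pkce_challenge() {
  printf '%s' "$1" |
    openssl dgst -sha256 -binary |
    base64 |
    tr '+/' '-_' | tr -d '='
}

# Sample verifier from the RFC's appendix; not a real secret.
verifier='dBjftJeZ4CVP-mB92K27uhbUJU1p1r_wW1gFWFOEjXk'
pkce_challenge "$verifier"
```

Only the program that generated the verifier can later prove it owns the matching half, which is the point the paragraph above makes in simpler words.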
Here you copy the code string (the circled button just copies this encrypted string), go back to the terminal and paste it at the Enter verification code: prompt. The client decrypts it, and extracts your email and a token that allows requesting specific codes to include with API requests to validate that you are you.
Here’s the rub: the token is stored on your machine, and allows full access to GCP on your behalf. In effect, it bypasses the 2FA authorization: any gcloud command you run just does its thing, and anyone who has access to your machine can do that, too. Thus you should guard access to it seriously. The token is stored somewhere under the ~/.config/gcloud directory, which denies access to anyone but you (mode 700), but anyone with either root access or physical access to the drive can extract and use it.
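It does not hurt to verify that the directory really is closed to everyone else, especially if you copied your home directory from another machine. A minimal sketch follows, assuming GNU stat as found on Linux (the BSD and macOS stat takes different options); is_private_dir is a name invented for this example.

```shell
# Hypothetical helper: succeed only if the directory grants no permissions
# to group or others, i.e. its octal mode ends in 00 (such as 700).
# Assumes GNU stat (Linux).
is_private_dir() {
  mode=$(stat -c '%a' "$1") || return 2
  case $mode in
    *00) return 0 ;;
    *)   return 1 ;;
  esac
}

dir=$HOME/.config/gcloud
if [ -d "$dir" ]; then
  if is_private_dir "$dir"; then
    echo "$dir is private"
  else
    echo "WARNING: $dir is accessible to other users; consider chmod 700" >&2
  fi
fi
```

This checks only the file mode; it does nothing about root or about someone holding the physical drive.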
If you do not plan to use gcloud and gsutil on some machine for a while, it’s a good idea to revoke the authorization. You must specifically name the account you are revoking grants from, but you can list all the authorized accounts with the same gcloud auth subcommand:
$ gcloud auth list
Credentialed Accounts
ACTIVE  ACCOUNT
        example@gmail.com
. . .
$ gcloud auth revoke example@gmail.com
Revoked credentials:
example@gmail.com
You should not perform an authentication as yourself in the Cloud Shell. It is already authorized as you when it starts, and the authorization ends when you close the browser or log out. The Cloud Shell disk, which persists between sessions, does not store your sensitive credential.
3.3. Prepare for an emergency
This has become an annoyingly repetitive theme, but you must know what to do in case you no longer have access to the computer that you authorized to act on your behalf. In such a case you must revoke the authorization from the same Google Account Security page at https://myaccount.google.com/security. Find the “Signing to other sites” box:
Click on it, and find “Google Cloud SDK.” Click on it to expand.
Then click on the Remove Access button. This revokes authorization on all computers that you have authorized with this Google account, the same way the gcloud auth revoke command does. While you are there, you’ll probably want to revoke access from other apps that have been authorized on the device you’ve lost.
You can find the same settings on an Android phone, via “Settings” then “Google Account”.
This does not affect the Cloud Shell authorization, as it uses a different mechanism, internal to GCP services.
Now you are all set to get the BurrMill distribution, initialize the project and request your computing resources. We’ll guide you through the necessary steps in the next section.