# Habitat for Windows Troubleshooting


## Summary

Greetings! Today, I’ll be reviewing some steps for troubleshooting Habitat on Windows.

## Habitat executable

If you’re running hab commands and they aren’t doing what you expect, or you aren’t sure what they’re actually doing, try enabling debug logging:

```powershell
$env:RUST_LOG="debug"; $env:RUST_BACKTRACE=1; hab sup run
```
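When you’re done debugging, you can clear those variables in the same session so the supervisor output returns to normal:

```powershell
# Remove the debug settings from the current PowerShell session
Remove-Item Env:\RUST_LOG, Env:\RUST_BACKTRACE -ErrorAction SilentlyContinue
```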


## PowerShell Troubleshooting

So you’ve written this awesome plan for your app, but it throws errors during the build. Now what? How do you troubleshoot those failures, since the Habitat Studio cleans up after a failed build? The quickest way is to set breakpoints inside the Studio. You can set breakpoints for any cmdlet or function by name to interrupt the build.

Let’s take a look at an example where our plan is unpacking our source but it’s failing in the install phase.

```powershell
# Start up the hab studio
hab studio enter

# Let's set the breakpoint
Set-PSBreakpoint -Command Invoke-Install

ID Script Line Command        Variable Action
-- ------ ---- -------        -------- ------
 0             Invoke-Install

# Now let's build
build

Entering debug mode. Use h or ? for help.

Hit Command breakpoint on 'Invoke-Install'

At C:\src\plan.ps1:18 char:24
+ function Invoke-Install{
+
>

# Now we can troubleshoot.
# I'm going to keep this high level, as there are too many variations to get specific here.
# - Look at your directory structure:
#   - Are the right folders there?
#   - Are the folders named as you expect them?
#   - Are the necessary files in the cache folder?
#   - Are they named correctly?
# - If you are running commands, try stepping through them manually and review the output.
# - With the breakpoint set, you can single-step through the commands by using the StepInto option.
# - Type h and press Enter to get more help for using the debugger.

# If you forget the breakpoint id, we can find it
Get-PSBreakpoint

ID Script Line Command        Variable Action
-- ------ ---- -------        -------- ------
 0             Invoke-Install

# Now let's get rid of the breakpoint and run our build all the way through
Remove-PSBreakpoint 0
```


## Logs and folders

Ok, cool. We’ve got our package running, now how do we see what it’s doing?

- In the Studio, it’s easy: we run Get-SupervisorLog and BOOM! there’s our data. If you aren’t in the Studio, this cmdlet isn’t available.
- If you launched hab in a cmd or PowerShell window using hab sup run, then your logs will stream in that window.
- If you are running hab via the Windows Service, then the supervisor logs default to $env:systemdrive\hab\svc\windows-service\logs.
- If you want to launch the supervisor via another method, you’ll need to redirect the output to a file. You’ll also need to implement your own log rotation process to keep from filling the drive with logs. If you are handling your own logging, use the --no-color option when starting the supervisor. This will prevent the text color formatting noise from being saved to your log file. If you are planning to ingest your logs into another tool, check out the --json-logging option as it may make the logs easier to process.

When you’re looking at the logs, I’m sure you’ve noticed the two letters in parentheses and wondered what they meant. Wonder no more. These letters are called Log Keys and signify the type of log message. Read more about them here.

```
hab-sup(MR): core/hab-sup (core/hab-sup/0.78.0/20190313123218)
hab-sup(MR): Supervisor Member-ID 9eba6f8580084b0b88f6bddb008e1b13
hab-sup(MR): Starting gossip-listener on 0.0.0.0:9638
hab-sup(MR): Starting ctl-gateway on 127.0.0.1:9632
hab-sup(MR): Starting http-gateway on 0.0.0.0:9631
```

We reviewed the folder structure in a previous post; however, let’s review:

```
# Fairly obvious, a local cache for storing objects
├── cache
# Opt in or out of reporting analytics to Chef
│   ├── analytics
# The hart files are cached here when they are downloaded
│   ├── artifacts
# All harts are checksummed and signed. This directory contains the public keys used to validate those signatures
│   ├── keys
# cURL's cacert.pem. Used for fetching the initial hab packages
│   └── ssl
# Contains the cli.toml. This file is created when running hab cli setup
├── etc
# Info for hab-launch. Things like its process ID.
├── launcher
# This is where all packages are extracted.
├── pkgs
# All of the core Habitat packages required
│   ├── core
# There may be multiple directories here, one for each Hab Origin you use/depend on.
│   └── ...
# Local studios are stored here. This folder shouldn't exist on production machines.
├── studios
# Contains all of the information about the supervisor rings
├── sup
# Each supervised service will have a folder under svc\.
└── svc
    └── <service-name>
# Output of the templatized config files from the \config\ directory in your hab package
        ├── config
# Stored data consumed by app, i.e. databases
        ├── data
# Gossiped configuration file that the Supervisor gets from peers in the ring
        ├── files
# The lifecycle hooks from the package
        ├── hooks
# Logs generated by the service
        ├── logs
# Static content
        ├── static
# Temp files
        └── var
```

## Variables

There are a ton of variables available in Hab, so I want to focus on the ones that are the most useful.

- $HAB_CACHE_SRC_PATH - This is the main cache folder that Hab uses for downloads, extraction, and compilation.
- $pkg_prefix - The absolute path for your package.
- $pkg_dirname - This directory is created when Hab extracts an archive. Set to ${pkg_name}-${pkg_version} by default. This is a subfolder of $HAB_CACHE_SRC_PATH.

In general, we download and extract to the $HAB_CACHE_SRC_PATH, do any customization needed, then copy to $pkg_prefix. Most of the time, we target $pkg_prefix\bin, as this allows us to expose that directory to other Habitat packages via $pkg_bin_dirs=@("bin"). If we specify $pkg_bin_dirs in our plan, Habitat will automatically add that directory to the PATH of any package that depends on our application.

You may see the variable $pkg_svc_path and consider copying files there but as a best practice, don’t copy binaries or executables there. Instead, configure your service to write data/logs/temp/configs to $pkg_svc_path.

#### Code

```powershell
$pkg_name="packer"
$pkg_origin="core"
$pkg_version="1.3.5"
$pkg_maintainer="The Habitat Maintainers <humans@habitat.sh>"
$pkg_license=@('MPL2')
$pkg_bin_dirs=@("bin")
$pkg_source="https://releases.hashicorp.com/packer/${pkg_version}/packer_${pkg_version}_windows_amd64.zip"
$pkg_shasum="57d30d5d305cf877532e93526c284438daef5db26d984d16ee85e38a7be7cfbb"

function Invoke-Install {
    Copy-Item "$HAB_CACHE_SRC_PATH/$pkg_name-$pkg_version/$pkg_name.exe" "$pkg_prefix/bin"
}
```

### Executable Binary

#### Background

Next, we’ll look at one that’s a single executable, in this case NuGet. It’s a single executable, so there’s no need for any unpacking. Since Habitat’s default action is to attempt an unpack, we’ll need to override the Invoke-Unpack callback with an empty function to keep it from erroring out.

Source Link

#### Code

```powershell
$pkg_name="nuget"
$pkg_origin="core"
$pkg_version="4.6.2"
$pkg_license=('Apache-2.0')
$pkg_upstream_url="https://dist.nuget.org/index.html"
$pkg_description="NuGet is the package manager for the Microsoft development platform including .NET."
$pkg_maintainer="The Habitat Maintainers <humans@habitat.sh>"
$pkg_source="https://dist.nuget.org/win-x86-commandline/v${pkg_version}/nuget.exe"
$pkg_shasum="2c562c1a18d720d4885546083ec8eaad6773a6b80befb02564088cc1e55b304e"
$pkg_bin_dirs=@("bin")

function Invoke-Unpack { }

function Invoke-Install {
    Copy-Item "$HAB_CACHE_SRC_PATH/nuget.exe" "$pkg_prefix/bin" -Force
}
```


### Git Repository

Our app team is diligent about storing their code in a git repo, so let’s go straight to the source! Since there’s no $pkg_source, Habitat won’t try to download or verify anything, so how do we get our files? We can use the Invoke-Unpack callback to clone the repo for us.

Source Link

#### Code

```powershell
$pkg_name="moonsweeper-py"
$pkg_origin="jmassardo"
$pkg_version="0.1.0"
$pkg_maintainer="James Massardo <james@dxrf.com>"
$pkg_license=@("Apache-2.0")
$pkg_deps=@('jmassardo/python')
$pkg_build_deps=@('core/git')
$pkg_description="Moonsweeper — A minesweeper clone, on a moon with aliens, in PyQt."
$pkg_upstream_url="https://github.com/mfitzp/15-minute-apps/tree/master/minesweeper"

function Invoke-Unpack{
    Write-Output "Attempting to clone repo"
    cd $HAB_CACHE_SRC_PATH
    git clone https://github.com/mfitzp/15-minute-apps.git
}

function Invoke-Install{
    Copy-Item -Path "$HAB_CACHE_SRC_PATH\15-minute-apps\minesweeper" -Destination "$pkg_prefix" -Recurse
}
```

NOTE: Please take note of the two different dependency types: $pkg_deps and $pkg_build_deps.

- $pkg_deps includes dependencies that are needed during runtime.
- $pkg_build_deps includes dependencies that are only needed while the package is being built.

#### Code

```powershell
$pkg_name="contosouniversity"
$pkg_origin="mwrock"
$pkg_version="0.1.0"
$pkg_maintainer="The Habitat Maintainers <humans@habitat.sh>"
$pkg_license=@("Apache-2.0")
$pkg_deps=@("core/dsc-core")
$pkg_build_deps=@("core/nuget")
$pkg_binds=@{"database"="username password port"}

function Invoke-Build {
    Copy-Item $PLAN_CONTEXT/../* $HAB_CACHE_SRC_PATH/$pkg_dirname -Recurse -Force
    nuget restore "$HAB_CACHE_SRC_PATH/$pkg_dirname/C#/$pkg_name/packages.config" -PackagesDirectory "$HAB_CACHE_SRC_PATH/$pkg_dirname/C#/packages" -Source "https://www.nuget.org/api/v2"
    nuget install MSBuild.Microsoft.VisualStudio.Web.targets -Version 14.0.0.3 -OutputDirectory $HAB_CACHE_SRC_PATH/$pkg_dirname/
    $env:VSToolsPath = "$HAB_CACHE_SRC_PATH/$pkg_dirname/MSBuild.Microsoft.VisualStudio.Web.targets.14.0.0.3/tools/VSToolsPath"
    ."$env:SystemRoot\Microsoft.NET\Framework64\v4.0.30319\MSBuild.exe" "$HAB_CACHE_SRC_PATH/$pkg_dirname/C#/$pkg_name/${pkg_name}.csproj" /t:Build /p:VisualStudioVersion=14.0
    if($LASTEXITCODE -ne 0) { Write-Error "dotnet build failed!" }
}

function Invoke-Install {
    ."$env:SystemRoot\Microsoft.NET\Framework64\v4.0.30319\MSBuild.exe" "$HAB_CACHE_SRC_PATH/$pkg_dirname/C#/$pkg_name/${pkg_name}.csproj" /t:WebPublish /p:WebPublishMethod=FileSystem /p:publishUrl=$pkg_prefix/www
}
```

### MSI Based Installers

#### Background

Most Windows admins are familiar with MSI-based installers. We really need the files inside the MSI, not the installation instructions, so let’s use lessmsi to extract the MSI file. Depending on the application, you may need to add some additional actions:

- Set up additional app requirements, such as registry keys, via the init hook.
- Remove those items via the post-stop hook.
- Consider using the reload and/or reconfigure hooks to update registry keys.

These actions will allow you to use gossiped data and toml files to update the running config of the app, e.g. redirecting a Win32 forms app to a different database server.

Source Link

#### Code

```powershell
$pkg_name="rust"
$pkg_origin="core"
$pkg_version="1.33.0"
$pkg_description="Safe, concurrent, practical language"
$pkg_upstream_url="https://www.rust-lang.org/"
$pkg_license=@("Apache-2.0", "MIT")
$pkg_maintainer="The Habitat Maintainers <humans@habitat.sh>"
$pkg_source="https://static.rust-lang.org/dist/rust-$pkg_version-x86_64-pc-windows-msvc.msi"
$pkg_shasum="cc27799843a146745d4054afa5de1f1f5ab19d539d8c522a909b3c8119e46f99"
$pkg_deps=@("core/visual-cpp-redist-2015", "core/visual-cpp-build-tools-2015")
$pkg_build_deps=@("core/lessmsi")
$pkg_bin_dirs=@("bin")
$pkg_lib_dirs=@("lib")

function Invoke-Unpack {
    mkdir "$HAB_CACHE_SRC_PATH/$pkg_dirname"
    Push-Location "$HAB_CACHE_SRC_PATH/$pkg_dirname"
    try {
        lessmsi x (Resolve-Path "$HAB_CACHE_SRC_PATH/$pkg_filename").Path
    } finally {
        Pop-Location
    }
}

function Invoke-Install {
    Copy-Item "$HAB_CACHE_SRC_PATH/$pkg_dirname/rust-$pkg_version-x86_64-pc-windows-msvc/SourceDir/Rust/*" "$pkg_prefix" -Recurse -Force
}

# This isn't always needed
function Invoke-Check() {
    (& "$HAB_CACHE_SRC_PATH/$pkg_dirname/Rust/bin/rustc.exe" --version).StartsWith("rustc $pkg_version")
}
```


### EXE Based Installers

#### Background

A lot of Windows utilities come packaged via EXE-based installers. Here, we’ll look at installing 7-Zip using its EXE installer. As you can see, we’re starting to develop a pattern: download, unpack, install.

#### Code


```powershell
$pkg_name="7zip"
$pkg_origin="core"
$pkg_version="16.04"
$pkg_license=@("LGPL-2.1", "unRAR restriction")
$pkg_upstream_url="http://www.7-zip.org/"
$pkg_description="7-Zip is a file archiver with a high compression ratio"
$pkg_maintainer="The Habitat Maintainers <humans@habitat.sh>"
$pkg_source="http://www.7-zip.org/a/7z$($pkg_version.Replace('.',''))-x64.exe"
$pkg_shasum="9bb4dc4fab2a2a45c15723c259dc2f7313c89a5ac55ab7c3f76bba26edc8bcaa"
$pkg_filename="7z$($pkg_version.Replace('.',''))-x64.exe"
$pkg_bin_dirs=@("bin")

function Invoke-Unpack {
    Start-Process "$HAB_CACHE_SRC_PATH/$pkg_filename" -Wait -ArgumentList "/S /D=`"$(Resolve-Path $HAB_CACHE_SRC_PATH)/$pkg_dirname`""
}

function Invoke-Install {
    Copy-Item * "$pkg_prefix/bin" -Recurse -Force
}
```

### Windows Role/Feature based installs

#### Background

Ooo… here’s a tricky one. Let’s say we need IIS to support our webapp.

Source Link

#### Code

```powershell
$pkg_name="iis-webserverrole"
$pkg_origin="core"
$pkg_version="0.1.0"
$pkg_maintainer="The Habitat Maintainers <humans@habitat.sh>"
$pkg_license=@("Apache-2.0")
$pkg_description="Installs Basic IIS Web Server features"
```

Hey, wait just a dang minute, where are the callbacks?!? Well… there aren’t any. In this case, we can’t actually fully package IIS because it’s a Windows component. So now what? We still need IIS for our app. We use a different path: the install hook. This hook is triggered when you run hab pkg install ..., and we’ll use it to install the features/roles we need. It’s not technically packaged in Hab, but it is “habitized” in the sense that we can track it as a dependency and trigger the install if it’s missing.

```powershell
# Install hook
function Test-Feature {
    Write-Host "Check if IIS-WebServerRole is enabled..."
    $(dism /online /get-featureinfo /featurename:IIS-WebServerRole) -contains "State : Enabled"
}

if (!(Test-Feature)) {
    Write-Host "Enabling IIS-WebServerRole..."
    dism /online /enable-feature /featurename:IIS-WebServerRole
    if (!(Test-Feature)) {
        Write-Host "IIS-WebServerRole was not enabled!"
        exit 1
    }
}
```


## Closing

I’d like to take credit for writing all these plans but, alas, I can’t. Fortunately, the awesome folks on the Habitat team publish these, along with roughly 600 other core plans, in their GitHub org: habitat-sh/core-plans.

If you run across another significant pattern that isn’t here, please let me know so I can update this page!

If you have any questions or feedback, please feel free to contact me: @jamesmassardo

# Habitat for Windows Basics


## Summary

Greetings! Today, I’ll be introducing some of the basics for Habitat, specifically focusing on Habitat on Windows. I won’t be covering everything, as the Habitat website has a tremendous amount of reference material. I will be calling out some things that are either pitfalls or differ between Linux and Windows.

## Habitat basics

Habitat uses a few objects to build an artifact: the plan file, default.toml, user config files, and lifecycle hooks.

The plan file is the primary object for creating the artifact. It tells Habitat where to get the source files, how to build the app, where to put the files, etc.

The lifecycle hooks control how the app behaves once it’s deployed. They are responsible for running the app, doing health checks, shutting down the app, reconfiguring the app, etc.

Habitat also supports updating the running config in a few ways: the default.toml which, as the name suggests, contains the default config items our plan needs; the user config stored in \hab\user\<service-name>\config\user.toml; and by using hab config apply .... We’ll dig deeper into configs later on.
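As a quick sketch of that last option: suppose our service exposes a port in its default.toml. Here `myapp.default` is a hypothetical service group, and the number is a version that must be larger than the last one applied:

```shell
# Write an override file, then gossip it to the service group
echo 'port = 8081' > update.toml
hab config apply myapp.default 1 update.toml
```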

To learn more, head over to the Habitat website or check out the Try Habitat module in Learn Chef Rally.

## Pseudo coding

A good way to start building a plan is to write pseudo code in the files before writing actual code. This will allow you to capture your logic and identify the proper paths and variables.

```powershell
# Pseudo Code

# Fetch
# Fetch source from repo

# Unpack
# Unpack zip file to a subfolder in the hab cache

# Install
# Copy the files from the cache to the bin location
```


I’m including a real plan file. We’ll go through these items deeper in the next section.

```powershell
# Real code
$pkg_version="3.7.3"
$pkg_source="https://www.python.org/ftp/python/${pkg_version}/python-${pkg_version}-embed-amd64.zip"
$pkg_filename="python-${pkg_version}-embed-amd64.zip"
$pkg_deps=@("core/7zip")
$pkg_bin_dirs=@("bin")

function Invoke-Unpack{
    cd $HAB_CACHE_SRC_PATH
    7z x -y $pkg_filename -o*
}

function Invoke-Install{
    Copy-Item -Path "$HAB_CACHE_SRC_PATH\python-3.7.3-embed-amd64\*" -Destination "$pkg_prefix\bin" -Recurse
}
```


## Deep dive

Let’s look at the main bits of a Habitat plan.

```
# Example directory structure of a Habitat plan
├── config
├── default.toml
├── hooks
│   ├── health-check
│   ├── init
│   ├── install
│   ├── post-stop
│   └── run
└── plan.sh (or plan.ps1)
```

Ok, cool, but what are those things?

- Config - This directory holds configuration file templates. Habitat uses Handlebars helpers to interpolate data. More info here.

```
# Original config file:
recv_buffer 128

# Templated config file:
recv_buffer {{cfg.recv_buffer}}
```

- default.toml - As I indicated above, this holds the default values used by the app. Let’s say we have a plan for a web server. Normally, we’d want it to run on port 80, but there may be times we need it running on another port. This templating allows us to change it at run time instead of building multiple packages for the same app.

```toml
[web]
port = 80
```
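Tying the two together: a hypothetical config template (the file name and setting are made up for illustration) would reference that value with the `cfg` helper:

```
# config/server.conf
port {{cfg.web.port}}
```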


### Plan.ps1

The plan file is the main set of instructions on how to properly package up the application. We’re going to use this demo plan as our example: jmassardo/chromium. Feel free to clone it and give it a shot. The core of a Habitat plan contains Variables, Callbacks, and Binds/Exports. I’ll outline some basics here; check out the links for more detailed info.
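If you want to follow along, a typical workflow looks something like this (the repo URL is my assumption of where the jmassardo/chromium demo plan lives):

```shell
git clone https://github.com/jmassardo/chromium.git
cd chromium
hab studio enter
# then, inside the Studio:
build
```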

#### Variables

Here are the main variables you’ll need:

```powershell
# Give the package a useful name, preferably the name of the app.
$pkg_name="chromium"

# This is your origin name.
$pkg_origin="jmassardo"

# This one is up for some debate. Some use the version of the app they're packaging,
# others version the Hab plan itself. I lean toward the former as Hab automatically
# appends a datestamp to the hart when you do a build.
$pkg_version="647819"

# The person or group maintaining the package
$pkg_maintainer="James Massardo <james@dxrf.com>"

# This needs to be a valid license type. I'm sure there are other references but this
# is the one I use: https://help.github.com/en/articles/licensing-a-repository
$pkg_license=@("Apache-2.0")

# The source URL of the app, in this case a zip archive on a CDN
$pkg_source="https://www.googleapis.com/download/storage/v1/b/chromium-browser-snapshots/o/Win%2F${pkg_version}%2Fchrome-win.zip?alt=media"

# By default, Habitat will auto-download a zip so let's give it a friendly name
$pkg_filename="chrome-win.zip"

$pkg_shasum="c54cb6d1192fc3c016dc365c9b3fda21cfffc41d44b63c653ac351246478f6aa"
```

NOTE: You can run commands (they vary by OS) to get the checksum, or you can do a build in Habitat. Obviously, it will fail the first time, but Studio will give you the computed checksum of the downloaded payload.

There are a ton of variables available in Hab, so I want to focus on the ones that are the most useful.

- $HAB_CACHE_SRC_PATH - This is the main cache folder that Hab uses for downloads, extraction, and compilation.
- $pkg_prefix - The absolute path for your package.
- $pkg_dirname - This directory is created when Hab extracts an archive. Set to ${pkg_name}-${pkg_version} by default. This is a subfolder of $HAB_CACHE_SRC_PATH.

In general, we download and extract to the $HAB_CACHE_SRC_PATH, do any customization needed, then copy to $pkg_prefix. Most of the time, we target $pkg_prefix\bin, as this allows us to expose that directory to other Habitat packages via $pkg_bin_dirs=@("bin"). If we specify $pkg_bin_dirs in our plan, Habitat will automatically add that directory to the PATH of any package that depends on our application.
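For the checksum note above: on Linux/macOS, `sha256sum` does the job, and on Windows, PowerShell's built-in equivalent is `(Get-FileHash -Algorithm SHA256 .\chrome-win.zip).Hash.ToLower()`. A quick sketch using a throwaway file in place of the real download:

```shell
# Substitute your actual downloaded archive for the demo file.
printf 'demo payload' > /tmp/chrome-win.zip
sha256sum /tmp/chrome-win.zip | cut -d' ' -f1
```

The resulting 64-character hex string is what goes in $pkg_shasum.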

#### Callbacks

Habitat has a number of callbacks that control the various phases of the package build. For the most part, we let Habitat run its default callbacks and only override the ones where we need to take additional action.

In our Chromium example, we’re going to rely on the default callbacks to handle the download, verification, and unpacking of our app. We’ll override the Invoke-Install callback so we can copy the files we need out of the cache to the package directory.

```powershell
function Invoke-Install{
    Copy-Item -Path "$HAB_CACHE_SRC_PATH\chromium-647819\chrome-win" -Destination "$pkg_prefix\bin" -Recurse
}
```


My next post will contain a set of example patterns for the majority of installer types on Windows.

#### Binds/Exports

When we talk about Habitat, the concept of contracts between applications is a prominent topic. This is the idea that each app knows what it provides and knows what it needs. Instead of prescribing a series of actions and checks to bring up a particular service (Orchestration), we start everything up and each app will wait until its dependent apps are running before moving on. This type of Choreography drastically reduces the amount of code (and effort) needed to get a service running.

Let’s look at a real example. In this case, we have a web server and a database server. In the orchestration model, we’d write code to bring the database up, validate that it’s listening, then we’d make sure the web server had the appropriate creds for the DB then bring it up. In the habitat model, we would set the web server to require bindings for the database and export the DB config on the database server.

```powershell
# Partial plan of web server
$pkg_name="my-web-app"
$pkg_origin="jmassardo"
$pkg_version="0.1.0"
$pkg_deps=@("core/iis-webserverrole")
$pkg_binds=@{"database"="username password port instance"}
```

```powershell
# Partial plan of database server
$pkg_name="my-db-server"
$pkg_origin="jmassardo"
$pkg_version="0.1.0"
$pkg_deps=@("core/sqlserver")
$pkg_exports=@{
    port="port"
    instance="instance"
}
```


Now when we deploy these apps, the web server will wait until its bindings are satisfied before it starts up.

```shell
hab svc load jmassardo/my-db-server
hab svc load jmassardo/my-web-app --bind database:my-db-server.default
```


If we watch the supervisor output, we’ll see that jmassardo/my-web-app will wait for the bindings before it fully starts up.

### Lifecycle hooks

As I’m sure you’ve gathered, Habitat has a ton of flexibility and the hooks are no exception. There are a number of them but we’re only going to look at a couple. Fortunately, their names are all very logical.

- init - Any actions that need to take place when the service is initially loaded. Possibly set env vars, create/validate needed users, etc.
- run - This one is obvious, but it does have a caveat. As you guessed, it’s responsible for running the actual executables. The caveat is that it needs to stay running the entire time; otherwise, Habitat will assume something happened to the app and will try to restart it. The easiest way is to do a Start-Process myapp.exe -Wait.
- health-check - The health-check hook allows you to monitor your app and take action if something’s wrong.
- post-stop - I call this the anti-init hook. This is where you’d undo anything that the init hook configured.
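As an example, a minimal run hook for a hypothetical myapp.exe might be just a single line ({{pkg.path}} is Habitat’s standard template helper for the package’s install path; the executable name is a placeholder):

```powershell
# hooks/run -- keep the process in the foreground so the
# Supervisor can track it; myapp.exe is a placeholder name.
Start-Process "{{pkg.path}}\bin\myapp.exe" -Wait
```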

### Hab folders

Let’s look at the structure of the \hab\ directory and make note of what each folder does.

```
# Fairly obvious, a local cache for storing objects
├── cache
# Opt in or out of reporting analytics to Chef
│   ├── analytics
# The hart files are cached here when they are downloaded
│   ├── artifacts
# All harts are checksummed and signed. This directory contains the public keys used to validate those signatures
│   ├── keys
# cURL's cacert.pem. Used for fetching the initial hab packages
│   └── ssl
# Contains the cli.toml. This file is created when running hab cli setup
├── etc
# Info for hab-launch. Things like its process ID.
├── launcher
# This is where all packages are extracted.
├── pkgs
# All of the core Habitat packages required
│   ├── core
# There may be multiple directories here, one for each Hab Origin you use/depend on.
│   └── ...
# Local studios are stored here. This folder shouldn't exist on production machines.
├── studios
# Contains all of the information about the supervisor rings
├── sup
# Each supervised service will have a folder under svc\.
└── svc
    └── <service-name>
# Output of the templatized config files from the \config\ directory in your hab package
        ├── config
# Stored data consumed by app, i.e. databases
        ├── data
# Gossiped configuration file that the Supervisor gets from peers in the ring
        ├── files
# The lifecycle hooks from the package
        ├── hooks
# Logs generated by the service
        ├── logs
# Static content
        ├── static
# Temp files
        └── var
```


## Closing

Congrats on making it through all of that info! Hopefully this post has clarified a few points and was helpful. If you’re like me, documentation is invaluable, but it’s nice to detail out some of the basics so the documentation actually makes sense!

If you have any questions or feedback, please feel free to contact me: @jamesmassardo

# Patch all the things!


## Summary

Greetings! I recently did a customer call/demo around using Chef to patch Windows systems, and I thought it would make a great post. However, I’m going to change one thing: we’re going to review patching as a fundamental process and cover more than Windows.

## Objectives

What are our objectives in patching, e.g. why do we do this seemingly futile task? What business value do we provide? There are really only a couple of things IT can do for a business: things that allow the business to move faster and things that reduce risk to the business. In the case of patching, we’re trying to reduce risk to the business by minimizing the possible attack surface. According to most security firms, roughly 80% of all data breaches occurring in the past few years would have been prevented if those businesses had an effective patching process.

Ok, cool, so our goal is to reduce risk; however, making low level code changes on basically everything in a relatively short time sounds pretty risky too, right? In the post DevOps world, we wouldn’t patch. When new updates are available, they are shipped through the pipeline like every other change. Ephemeral nodes are provisioned, integration tests run, new OS images prepped, new infrastructure provisioned, etc.

I can read your mind at this point: “We aren’t all start-ups; some of us still have a lot of legacy systems to manage.” And you’d be totally correct. Almost all enterprises (and a lot of smaller businesses) have legacy systems that are critical to the business and will be around for the foreseeable future. Having said that, this doesn’t mean that you can’t have a well-understood, mostly automated patching process.

## Basic tools

### Repositories

Below are some notes on each major platform. The main thing to remember is that the repository tool is the source for our patches. It’s also the main control point for what’s available in our environment. By default, we want to automatically synchronize with the vendor periodically. We then blacklist any patch or update that we can’t use (or busted patches that break stuff).

Essentially, we want to create one or more local repositories for all of the available packages for each platform/distribution we want to support. The total number required will vary depending on your network topology, number of supported platforms/distributions, etc. Patching can be extremely network intensive. If you have a large number of systems or multiple systems across one or more WAN links, plan accordingly and don’t DDoS yourself. I can’t emphasize this enough: if you don’t take load (network, OS, hypervisor, etc.) into account, you will cause severe, possibly even catastrophic problems for your business.

Now that the gloom and doom warnings are out of the way, let’s look at some OS specific info.

#### Linux

Each Linux distribution has its own repository system. For RHEL-based systems, we use Yum, and for Debian-based systems, we use Apt.

RHEL has some licensing implications for running local repositories. Talk to your Red Hat rep for more info.

#### MacOS

Until recently, macOS Server had a Software Update Services component. Some folks are using the new caching service but it’s not quite the same.

#### Windows

Windows Server has an Update Services role. WSUS can provide local source for the majority of Microsoft’s main products. WSUS also has a replica mode for supporting multiple physical locations or very large environments.

Windows users that are running newer versions of Server and Client can also take advantage of the Branch Cache features. This allows clients on a single subnet to share content and drastically reduce the WAN utilization without needing downstream servers.

### Configuration

I’m going to use Chef for my examples (full disclosure, I work at Chef), but these principles will work with any configuration management platform. The main thing to keep in mind is the CM tool doesn’t do the actual patching. This is a pretty common misconception. Yes, it’s entirely possible to use a CM tool to deliver the actual patch payload and oversee the execution, but why would you want to do all that extra work? This is about making our lives better so let’s use existing tools and functionality and use CM to control the operation.

### Orchestration

Orchestration means different things to different people. To me, orchestration is the process of managing activities in a controlled behavior. This is different from CM in that CM is about making a system or object act a certain way whereas orchestration is about doing multiple things in a certain order or taking action based on one or more inputs.

You will need an orchestration tool if you have any of the following needs (or similar needs):

- I need to reboot this system first, then that system for my app to work correctly.
- I want no more than 25% of my web servers to reboot at any given time.
- I want to make this DB server primary, then patch the secondary DB server.

If any of these sound familiar, you aren’t alone. These are common problems in the enterprise; however, there’s no common solution. With the magnitude of various applications, systems, and platforms out there, there’s no way to provide prescriptive guidance for this topic. A lot of folks use an outside system as a clearing house for nodes to log their state, then check that state with a cookbook on all the nodes to make decisions.

In regard to patching, if I have a group of nodes that has a specific boot order, I tend to stagger their patch windows so the first nodes patch and reboot, then the second group, and so on. I may also patch them in one window and leave them pending reboot, then reach out to them separately and reboot them in the required order.

## Process

As you can see, the names of the tools within a group may be different; however, they all function using similar patterns. These similarities are what allow us to build a standardized process regardless of platform.

- Repositories provide the content and content control
- Config. mgmt. tools control the configuration of the client (I know… Mr. Obvious…). This is how we assign maintenance windows, reboot behaviors, etc.
- Orchestration tools handle procedural tasks

We want this to be as automated as possible so everything will be set to run on a scheduled basis. The only time it requires manual intervention is when there’s a patch we need to exclude. In the examples below, we’ll configure the repositories to sync nightly and we’ll configure the clients to check on Sunday mornings at 2AM. I can hear what you are thinking again, you’re saying “We have a bunch of systems, we can’t reboot all of them at once!” And you’d be right. Even if you don’t have a large fleet, you still don’t want to patch all the things at once.

In reality, you want at least one test system for each application, service, or role in your environment. Preferably, you want complete replicas of production (although smaller scale) for testing. Patches should be shipped and tested just like any other code change. Most enterprises have some process similar to Dev -> QA -> Stage -> Prod for shipping code changes so patching should follow that same pattern. Remember, the earlier we discover a problem, the easier and cheaper it is to resolve.

## Technical Setup

Below are sample instructions for building the required components. These are not 100% production ready and only serve as examples of where to start. Each company has its own flavor of doing things, so it’s not possible to account for all the variations.

### Server Side

The first thing we need is to set up our repositories. We’ll set up a basic mirror for CentOS 7 and Ubuntu 16.04, then set up a Windows WSUS Server.

#### APT Repository

# metadata.rb

# attributes/default.rb

# recipes/default.rb
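
The APT stubs above were left empty; as a hedged sketch of one way to fill them (a recipe built around the apt-mirror package; the resource values, paths, and template are placeholders of my own, not from the original post), recipes/default.rb might look like:

```ruby
# recipes/default.rb - hypothetical sketch, not from the original post
package 'apt-mirror'
package 'apache2'

# /etc/apt/mirror.list drives what apt-mirror downloads; template it
# from an attribute such as node['apt']['mirror_list'] (placeholder name).
template '/etc/apt/mirror.list' do
  source 'mirror.list.erb'
end

# Serve the mirrored packages over HTTP.
link '/var/www/html/ubuntu' do
  to '/var/spool/apt-mirror/mirror/archive.ubuntu.com/ubuntu'
end

# Nightly sync, mirroring the pakrat cron suggestion on the yum side.
cron 'apt-mirror nightly sync' do
  minute '0'
  hour '1'
  command '/usr/bin/apt-mirror'
end
```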


#### Yum Repository

# metadata.rb
depends 'yum'

# attributes/default.rb
default['yum']['repos']['centos-base'] = 'http://mirror.centos.org/centos/7/os/x86_64'
default['yum']['combined'] = false

# recipes/default.rb
package 'createrepo'
package 'python-setuptools'
package 'httpd'

# https://github.com/ryanuber/pakrat
execute 'easy_install pakrat'

repos = ''
node['yum']['repos'].each do |name, baseurl|
  repos += "--name #{name} --baseurl #{baseurl} "
end

repos += '--combined ' if node['yum']['combined']

directory '/var/www/html/' do
  recursive true
end

#########################################################
#
# Convert to a cron resource to schedule nightly syncs
#
#########################################################
execute 'background pakrat repository sync' do
  cwd '/var/www/html/'
  command "pakrat #{repos} --repoversion $(date +%Y-%m-%d)"
  live_stream true
end
#########################################################

service 'httpd' do
  action [:start, :enable]
end

#### WSUS Server

# metadata.rb
depends 'wsus-server'

# attributes/default.rb
default['wsus_server']['synchronize']['timeout'] = 0
default['wsus_server']['subscription']['categories'] = ['Windows Server 2016']
default['wsus_server']['subscription']['classifications'] = ['Critical Updates', 'Security Updates']
default['wsus_server']['freeze']['name'] = 'My Server Group'

# recipes/default.rb
include_recipe 'wsus-server::default'
include_recipe 'wsus-server::freeze'

### Client Side

Now that we have repositories, let’s configure our clients to talk to them.

#### CentOS Client

# metadata.rb

# attributes/default.rb

# recipes/default.rb
cron 'Weekly patching maintenance window' do
  minute '0'
  hour '2'
  weekday '7'
  command 'yum upgrade -y'
  action :create
end

#### Ubuntu Client

# metadata.rb

# attributes/default.rb

# recipes/default.rb

#### Windows 2016 Client

# metadata.rb
depends 'wsus-client'

# attributes/default.rb
default['wsus_client']['wsus_server'] = 'http://wsus-server:8530/'
default['wsus_client']['update_group'] = 'My Server Group'
default['wsus_client']['automatic_update_behavior'] = :detect
default['wsus_client']['schedule_install_day'] = :sunday
default['wsus_client']['schedule_install_time'] = 2
default['wsus_client']['update']['handle_reboot'] = true

# recipes/default.rb
include_recipe 'wsus-client::configure'

## Validation

Now the real question: “Did everything get patched?” How do we answer this question? Apt and Yum have no concept of reporting, and WSUS reports can get unwieldy in a hurry. Enter InSpec. InSpec is an open source auditing framework that allows you to test for various compliance and configuration items, including patches. Patching baselines exist for both Linux and Windows.
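
To make that concrete, a minimal InSpec control for a yum-based client might look like the following sketch (the control name and metadata are illustrative; the published patching baselines are far more thorough):

```ruby
# Hypothetical InSpec control - fails while a yum client has pending updates.
control 'patching-01' do
  impact 0.7
  title 'No outstanding yum updates'
  # `yum -q check-update` exits 100 when updates are pending, 0 when clean.
  describe command('yum -q check-update') do
    its('exit_status') { should eq 0 }
  end
end
```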
InSpec can run remotely and collect data on target nodes, or you can have it report compliance data to Chef Automate for improved visibility and dashboards.

## Closing

Congratulations! If you are reading this, then you made it through a ton of information. Hopefully you found at least a couple of nuggets that will help you. If you have any questions or feedback, please feel free to contact me: @jamesmassardo

Thanks to @ncerny and @trevorghess for their input and code samples.

## Helpful tips

Here are some tidbits that may help you.

• WSUS memory limit event log error. You will almost certainly hit this at some point. This is a scaling config, so the more clients you have, the more memory WSUS will need.
• Use attributes to feed required data (maint. window, patch time, reboot behavior, etc.) to the node. This allows you to have one main patching cookbook that gets applied to everything. You can deliver attributes different ways:
• Environments
• Roles
• Role/wrapper cookbooks
• Remember we are using our CM system to manage the configs, not do the actual patching.

# Pipelining Chef with Jenkins

## Summary

Today, I’m going to detail how we are setting up our pipelining process for all things Chef for our business unit partners. This process uses a combination of Chef, Jenkins, GitHub, a custom Jenkins Shared Library, and some custom-developed utilities.

## Problem

There are a couple of problems that started us down this path:

• We run a single Chef server/Chef org containing all of our systems.
• This means that we needed a way to allow users to execute changes against the Chef server without allowing knife access.
• This also means that without automation of some sort, someone from our team would have to execute any change required, making us a serious bottleneck.
• Knife has no safety net.
• It’s very easy to make simple mistakes that can cause massive problems.
• Chef Server/Knife doesn’t have granular RBAC, so allowing knife would also give users the ability to change objects that may not belong to them. We’re all adults and professionals and wouldn’t do anything malicious to negatively affect another group; however, mistakes happen.
• Allowing multiple groups to make changes without a consistent process creates numerous problems by allowing variations. This makes performing upgrades and validation testing exponentially harder.
• Another problem is ensuring that global cookbooks and attributes are applied to all systems. OS-level cookbooks that set basic settings for the environment, and other things like audit cookbooks used by the internal security teams, must be applied to everything to ensure a consistent experience and process.

## Solution

I won’t lie to you, this is still very much a work in progress, but it’s a significant step forward compared to manual processes. Let’s look at the components used.

Component | Purpose
--- | ---
Chef | CM tool (I know… Obvious, right?)
Jenkins | Automation server
GitHub | Source code repository
Shared Library | This library is essentially a standard Jenkins pipeline. We elected to use a shared library to simplify the process for when we need to make changes. We change the library, and all the repositories automatically execute the change the next time they are run.
Utilities | These utilities handle some of the file merging and interaction with the Chef server, and provide some of the safety nets. This project also contains the global_envs folder.

For deeper detail about each of these components, view their websites or project READMEs.

### Deployment Processes

Before we get into the setup, I think it’s good to provide some details about our methodology for object deployments. There are two main types of deployments for us:

• Global Deployments - cookbooks, environment version pins, audit profiles, etc. that are deployed by our Internal IT team.
• We use the global_envs folder to manage cookbook version pins per environment. This allows us to systematically roll out cookbook changes across the fleet. The environment file is only a section of a standard file; it contains only the cookbook_versions, default_attributes, and override_attributes sections. There is a step in the Jenkinsfile for this project that handles updating the version pins for all the global changes.
• BU Deployments - cookbooks, environment version pins, and data bags that the BU can manage.
• Cookbooks
• Each cookbook has a dedicated Jenkinsfile within the root of the repository that references the promoteCookbook('env_name') method of the Shared Library.
• Data bags
• All data bags (including chef-vault bags) are stored in the BU/environment repository. There should only be one environment repository per BU/env. This process follows the same pattern as ChefDK. Create the data bag locally just as you would, except instead of using knife data bag from file, the data bag folder is added to the repository and committed to GitHub.
• Environment
• This is a standard Chef environment file. The automation utilities merge the matching environment file from the global_envs folder with the BU’s environments/name.json file. The pipeline then validates that all of the cookbooks and named versions exist on the Chef server, then it uploads the environment.
• Profiles
• Each profile has a dedicated Jenkinsfile within the root of the repository that references the uploadProfile() method of the Shared Library.

The astute reader will no doubt notice the distinct absence of information about run lists. This is a deliberate move. We decided that using the concept of role cookbooks was the best pattern for our company. When we onboard a new BU partner, we establish two role cookbooks.

• Environment cookbook - This cookbook is managed by our internal IT staff. It contains the required global cookbooks such as our audit profiles.
• BU Role cookbook - This cookbook is managed by the BU. They add additional cookbooks as needed to meet their needs. They use attributes and guards to control where the code is executed.

## Setup

These steps assume you have a working Chef Server and a working Jenkins Server. If you are new to Chef, I suggest heading over to Learn Chef Rally and checking out some of the modules (I would recommend Infrastructure Automation). If you are new to Jenkins, I would recommend checking out the official Jenkins documentation here: Installing Jenkins.

• Let’s set up the automation utilities first.
• Fork the utilities project
• Create a pipeline in Jenkins/BlueOcean
• Select GitHub (I’m sure this would work if the code was stored somewhere else; we just happen to use GitHub for our stuff.)
• Select the appropriate organization
• Select the repository (In our case, we use the Chef-Pipeline-Automation repo.)
• Ok cool! Now we have our utilities available in the /var/chef_automation/ folder.
• Next, let’s move on to the Shared Library.
• Fork the project
• Follow the steps outlined in the project’s README file.
• I will leave you with a couple of notes:
• Any changes to your fork of the library’s project will be immediately available on the Jenkins server, so be careful with any changes or enhancements that you make.
• This library is essentially a collection of pipelines, so if a new function is needed, start by creating a regular Jenkinsfile in a test repo and work out the steps first, then create a new verbNoun.groovy in the /vars/ folder.
• Now that we have the Jenkins stuff set up, let’s set up the chef-repo.
• Install ChefDK on the Jenkins server
• Create a chef_repo folder and place the knife.rb and the user.pem files in the Jenkins home directory:

$ cd /var/lib/jenkins
$ mkdir -p chef_repo/.chef
$ cp /path/to/my/knife.rb /var/lib/jenkins/chef_repo/.chef/knife.rb
$ cp /path/to/my/user.pem /var/lib/jenkins/chef_repo/.chef/user.pem
$ tree chef_repo
chef_repo
├── .chef
│   ├── knife.rb
│   └── user.pem
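
The post doesn’t show the knife.rb contents; a minimal sketch that fits this layout (the server URL, organization, and node name are placeholders, not values from the source) would be:

```ruby
# .chef/knife.rb - hypothetical minimal config; URL, org, and user are placeholders
current_dir = File.dirname(__FILE__)
node_name       'jenkins'
client_key      "#{current_dir}/user.pem"
chef_server_url 'https://chef-server.example.com/organizations/my_org'
cookbook_path   ["#{current_dir}/../cookbooks"]
```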


Thanks for sticking with me thus far. I’m happy to tell you that all of the groundwork is done; the next two steps are to set up the BU’s main repo and their first cookbook and profile repos.

• BU/Environment repository
• There needs to be one repository for each environment on the Chef Server.
• The repository follows this structure:
env_repo
├── environments
│   └── env_name.json
├── data_bags
│   ├── limited_users
│   │   ├── user1.json
│   │   ├── user2.json
│   │   └── user3.json
│   └── op_users
│       ├── op1.json
│       ├── op2.json
│       └── op3.json
└── Jenkinsfile

• The Jenkinsfile for the environments repository should contain the following:
// Jenkinsfile
@Library('chef_utilities') _

updateEnvironment('chef_env_name')
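
As context for what updateEnvironment does with the environment file, here is a plain-Ruby sketch of the merge described earlier; the helper name and precedence are illustrative, not the actual utility code:

```ruby
require 'json'

# Hypothetical sketch of the environment merge: the global file's
# cookbook_versions / default_attributes / override_attributes sections
# are layered onto the BU's environment file (global wins on conflicts
# in this sketch; the real utility may differ).
def merge_environment(bu_env, global_env)
  merged = bu_env.dup
  %w[cookbook_versions default_attributes override_attributes].each do |section|
    merged[section] = (bu_env[section] || {}).merge(global_env[section] || {})
  end
  merged
end

bu     = { 'name' => 'bu_env', 'cookbook_versions' => { 'bu_role' => '= 1.2.0' } }
global = { 'cookbook_versions' => { 'audit' => '= 3.0.1' } }
puts JSON.pretty_generate(merge_environment(bu, global))
```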

• Cookbook repositories
• Chef best practice says that each cookbook should be stored in its own repository, so that’s how we’ll do it.
• Create cookbooks as normal: chef generate cookbook cookbooks/my_new_cookbook
• Place a Jenkinsfile in the root of the repository with the contents below:
// Jenkinsfile
@Library('chef_utilities') _

promoteCookbook('chef_env_name')

• Once the Jenkinsfile is created, create a new pipeline in Jenkins for each cookbook. If the permissions are set correctly in Jenkins, this task can be delegated to the BU, further reducing the need for outside intervention.
• Last but certainly not least, InSpec profiles. Profiles follow the same standard process as cookbooks; the only difference is the function called in the Jenkinsfile.
// Jenkinsfile
@Library('chef_utilities') _