179 Commits

Author SHA1 Message Date
Ray Zane
835cd4f12e First pass at setting CPU and memory limits in Kubernetes 2020-01-13 13:00:21 -05:00
dandds
51f7afd5b0 Update NGINX config to use supplied domains.
I left the domains hard-coded for the redirects in our NGINX config,
which was breaking authentication for versions of the site that don't
use that domain. This updates the config to use the domains supplied via
environment variable.
2020-01-07 06:12:56 -05:00
dandds
60b12fca52 Config to specify session cookie domain.
This got lost somewhere along the way (almost certainly by me), so this
commit tries to make it explicit. The app needs to be able to configure
the session cookie domain name so that it is valid for both the main
site domain and the authentication subdomain. For instance, if the site
is running at uat.atat.code.mil and authentication happens at
auth-uat.atat.code.mil, SESSION_COOKIE_DOMAIN should be set to
atat.code.mil so that it's valid for both.

This adds the setting to the base INI file and a default for our K8s
clusters.
2020-01-06 14:07:53 -05:00
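
To illustrate the reasoning in 60b12fca52, here is a minimal sketch of why the parent domain works as SESSION_COOKIE_DOMAIN for both hosts. The helper `cookie_domain_covers` is hypothetical and only approximates browser domain matching; it is not taken from the application code.

```python
def cookie_domain_covers(cookie_domain: str, host: str) -> bool:
    """Return True if a cookie scoped to cookie_domain would be sent to host.

    Simplified domain matching: the host must equal the cookie domain or be
    a subdomain of it. Real browsers apply further rules (public suffixes).
    """
    cookie_domain = cookie_domain.lstrip(".").lower()
    host = host.lower()
    return host == cookie_domain or host.endswith("." + cookie_domain)


# Domain values taken from the commit message.
SESSION_COOKIE_DOMAIN = "atat.code.mil"
HOSTS = ("uat.atat.code.mil", "auth-uat.atat.code.mil")

# The parent domain is valid for both the site and the auth subdomain.
covers_all = all(cookie_domain_covers(SESSION_COOKIE_DOMAIN, h) for h in HOSTS)
```

Scoping the cookie to `uat.atat.code.mil` instead would not match `auth-uat.atat.code.mil`, since the auth host is a sibling rather than a subdomain.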
dandds
8ec23b54a8 WIP: k8s config for cloud-zero 2019-12-23 18:39:55 -05:00
dandds
9d282ee82a K8s cronjob for resetting the database on staging.
This K8s CronJob will run the script for resetting the database. It will
only be applied to the staging site.
2019-12-17 13:19:40 -05:00
dandds
1466a302b2 K8s YAML integer values need to be quoted. 2019-12-13 12:11:31 -05:00
dandds
ec638d6b01 Transition to using secrets in Key Vault.
This does the following:

- Removes references to the atst-override.ini file, now deprecated.
- Adds all non-secret data that was managed in the override file to the
  relevant K8s ConfigMaps.
- Adds additional documentation explaining our use of Key Vault for
  secrets management.
2019-12-10 10:14:54 -05:00
dandds
972cf14a66 K8s configuration for mounting application config.
This adds an additional volume mount for Flask application secrets.
These will be mounted into the ATST container so that their values can
be read in as config.
2019-12-10 10:14:53 -05:00
dandds
20c7e943c8 Compose REDIS_URI from component parts.
This updates the configuration handling for the Redis connection string.
The motivation is so that the Redis password can be managed separately
via Azure Key Vault and eventually be rotated independently of the rest
of the connection URI.

This also tweaks the method we use to build the DATABASE_URI and removes
some stale config from the CI config file.
2019-12-04 13:28:26 -05:00
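
A rough sketch of the idea in 20c7e943c8: the connection string is assembled from separately managed parts, so the password can change without touching the rest. The function name `build_redis_uri` and its parameters are assumptions for illustration, not the names used in the repository.

```python
from typing import Optional
from urllib.parse import quote


def build_redis_uri(host: str, password: Optional[str] = None, tls: bool = False) -> str:
    """Compose a Redis connection URI from its component parts.

    host     -- "hostname:port" (hypothetical format)
    password -- supplied separately, e.g. from a mounted secret
    tls      -- use the rediss:// scheme when True
    """
    scheme = "rediss" if tls else "redis"
    if password:
        # Percent-encode the password so characters such as "@" or "/"
        # cannot break the URI structure.
        return f"{scheme}://:{quote(password, safe='')}@{host}"
    return f"{scheme}://{host}"
```

Rotating the password then only requires supplying a new `password` value; the host and TLS settings stay in non-secret config.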
dandds
f4ffde89d0 Add more restrictions to K8s CRL CronJob.
The K8s CronJob that manages CRL syncing often leaves pods hanging
around for days at a time. This appears to happen when the download of a
particular CRL from DISA hangs for whatever reason. This updates the
configuration so that a running cronjob is automatically replaced by its
successor, rather than the two running concurrently. (The CRL CronJob
runs every hour, and if one has taken that long then it's hanging and
needs to be replaced.) Similarly, this updates the config to only retain
one successful CRL pod, rather than the default of three.
2019-12-03 11:37:29 -05:00
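
The two restrictions described in f4ffde89d0 correspond to the CronJob fields `concurrencyPolicy` and `successfulJobsHistoryLimit`. The sketch below expresses such a manifest as a Python dict that serializes to JSON, which kubectl also accepts. The resource name, image, command, and exact schedule expression are placeholders, not values from the repository.

```python
import json

# Hypothetical manifest: name, image, and command are placeholders.
crl_sync_cronjob = {
    "apiVersion": "batch/v1",  # clusters of that era would have used batch/v1beta1
    "kind": "CronJob",
    "metadata": {"name": "crl-sync"},
    "spec": {
        "schedule": "0 * * * *",  # assumed expression for "every hour"
        # A job that is still running is replaced by its successor
        # instead of the two running concurrently.
        "concurrencyPolicy": "Replace",
        # Retain one successful job rather than the default of three.
        "successfulJobsHistoryLimit": 1,
        "jobTemplate": {
            "spec": {
                "template": {
                    "spec": {
                        "restartPolicy": "OnFailure",
                        "containers": [
                            {
                                "name": "crl-sync",
                                "image": "registry.example.com/atat:latest",
                                "command": ["/bin/sh", "-c", "echo sync CRLs"],
                            }
                        ],
                    }
                }
            }
        },
    },
}

if __name__ == "__main__":
    print(json.dumps(crl_sync_cronjob, indent=2))
```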
tomdds
728bb5713f Fix flexVol serving of nginx certificates
FlexVol requires that you specify certificates as secrets in order to get both the certificate and private key in the appropriate format for nginx to consume. Additionally, FlexVol shouldn't interfere with other secrets mounted in its host directory.
2019-12-02 15:45:16 -05:00
tomdds
df6ab4a016 Fix some formatting problems in nginx configs 2019-12-02 15:45:16 -05:00
tomdds
5006945cfe Remove tls volumeMount 2019-12-02 15:45:16 -05:00
tomdds
33ce02d045 Better differentiate between master and staging vault config via overlay 2019-12-02 15:45:16 -05:00
tomdds
253ddaa49e Properly register key vault object types 2019-12-02 15:45:16 -05:00
tomdds
36406372e3 Remove unused secret volume for tls key and cert 2019-12-02 15:45:16 -05:00
tomdds
221e9ab26b Add a staging overlay for the key vault name
Currently we're just using the test vault, but in the future we want to be able to prescribe vault names for different environments via overlay.
2019-12-02 15:45:16 -05:00
tomdds
26bb2f4614 Use mounted all-in-one cert for nginx ssl
Mount the combined key and cert for nginx ssl using flexvol and point the necessary nginx config at it.
2019-12-02 15:45:16 -05:00
tomdds
9b8d5e3662 Document generation and updating of dhparams. 2019-12-02 15:45:16 -05:00
tomdds
1c4e00e914 Update Deploy Readme for FlexVol consumption
Explain via example how you can use FlexVol to mount secrets in our containers.
2019-12-02 15:45:16 -05:00
tomdds
9469d1ff1b Introduce TEMPLATE_ID variable for FlexVolume
FlexVolume requires you to specify the tenant id of the key vault instance, so this will need to be templated in for future environments.
2019-12-02 15:45:16 -05:00
tomdds
949ffa294d Use a single FlexVolume for nginx secrets
Just a name update for now, but we'll use the one flex volume to mount all the nginx related secrets going forward.
2019-12-02 15:45:16 -05:00
tomdds
6acc085a77 Use dhparam.pem from AZ Key Vault 2019-12-02 15:45:16 -05:00
dandds
a3aa3e6935 Config for NGINX SSL/TLS.
This adds additional SSL/TLS config to specify the acceptable TLS
version, cipher suites, session cache, etc. Values are currently based
on the Mozilla Foundation's recommendations for intermediate
compatibility:

https://wiki.mozilla.org/Security/Server_Side_TLS

We will manage NGINX configuration snippets as a K8s ConfigMap so that
they can be included in server blocks as-needed.
2019-12-02 15:45:16 -05:00
dandds
26c5b5ea7f Add JSON logging back for NGINX container.
This configures the NGINX container to log in JSON. It also updates the
K8s config so that we mount all of the key/value pairs available in the
atst-nginx ConfigMap as files in "/etc/nginx/conf.d" inside the
container. This simplifies the config a little.
2019-12-02 15:45:16 -05:00
dandds
d32536cf39 Fix ConfigMap to directory mapping.
Turns out you can't map multiple K8s resources over the same directory.
The K8s secret for the INI file and the ConfigMap for the uWSGI config
both map into /opt/atat/atst in the container. This caused errors when
the container tried to launch. Instead, we need to specify the full file
path for every file we're mapping into that directory to avoid
conflicts.
2019-11-27 09:57:58 -05:00
dandds
69bbb12a8e
Merge pull request #1209 from dod-ccpo/uwsgi-logging
Enable uwsgi logging again.
2019-11-27 09:38:42 -05:00
dandds
d5865c1ab3 Script for compiling K8s config. 2019-11-25 14:24:53 -05:00
tomdds
bc9e4fd142 Include new KeyVault env vars in both diff and apply sections of deploy readme 2019-11-25 11:52:15 -05:00
dandds
4d4c873c73 Enable uwsgi logging again.
Updates the K8s config to enable extended uWSGI JSON logging again. This
commit updates the name of the ConfigMap for the uWSGI config to avoid
confusion.
2019-11-25 11:38:29 -05:00
tomdds
f8e95ae104 Initial FlexVol Setup
This commit is the first part of consuming secrets from the Azure Key Vault. This will set up the required services to consume Azure's RBAC controls in the cluster, an identity to read the secrets, and the tool (FlexVol) to mount the secrets.
2019-11-25 11:19:55 -05:00
dandds
41bab4f594 Do not run test workflow for merges to main branches.
We should not run a redundant testing workflow on merges to master or
staging.

This also includes a quick fix to configure the FLASK_ENV for the main
site.
2019-11-22 12:56:17 -05:00
tomdds
4df68bab23 Add BLOB_STORAGE_URL config
Our content security policy in non-dev environments didn't allow uploading to Azure Blob Storage. This adds a configurable blob storage base URL to allow regions to specify which storage endpoint they expect the upload request to use.
2019-11-22 11:56:27 -05:00
dandds
08fc530223 Add config value for CDN origin.
This value is set as the Access-Control-Allow-Origin header value for
the application. When using Azure CDN, the CDN will consume this header
when it populates its cache and use it on subsequent requests.

It would be possible to make this the same as the Flask SERVER_NAME
value. We explicitly set SERVER_NAME for Celery worker processes because
they need that information to construct URLs outside of the request cycle
(Flask can infer the server name within a request cycle). I decided not
to rely on SERVER_NAME though because it has side effects:

- It determines what `url_for` uses as the host domain (which would be
  fine).
- It makes it so that the Flask app can only serve requests to that
  domain (probably fine, but it felt like too big a side effect).

Additionally, SERVER_NAME does not include the scheme. For all of these
reasons I opted to make CDN_ORIGIN a separate config value.
2019-11-21 16:43:22 -05:00
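
One point from 08fc530223 can be shown concretely: an Access-Control-Allow-Origin value needs a scheme, which a SERVER_NAME-style value lacks. The helpers below are hypothetical and the example values are illustrative; this is a sketch of the distinction, not the application's header handling.

```python
from urllib.parse import urlsplit


def is_valid_origin(value: str) -> bool:
    """Return True if value looks like a web origin: scheme plus host, no path."""
    parts = urlsplit(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc) and parts.path == ""


def cors_headers(cdn_origin: str) -> dict:
    """Build the CORS header from a separately configured CDN_ORIGIN value."""
    if not is_valid_origin(cdn_origin):
        raise ValueError(f"CDN_ORIGIN must include a scheme and host: {cdn_origin!r}")
    return {"Access-Control-Allow-Origin": cdn_origin}
```

A bare host such as `uat.atat.code.mil` fails this check, while `https://uat.atat.code.mil` passes, which is one reason a separate setting is simpler than reusing SERVER_NAME.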
dandds
c6187466a3 Configure staging with different FLASK_ENV, include sub-route for CDN_URL. 2019-11-21 16:43:22 -05:00
richard-dds
8e12c6bfbd Add CDN config for staging 2019-11-21 16:42:42 -05:00
richard-dds
e29163f65a Add CDN config for prod 2019-11-21 16:42:42 -05:00
dandds
280778ab5f Set SERVER_NAME correctly for staging Celery workers. 2019-11-19 13:36:47 -05:00
dandds
88171aaee7 Supply named default queue for Celery.
Supplying this will prevent queue clashes between various ATAT sites
sharing the same Redis instance.

Note that the Celery documentation is currently wrong about the name for
configuring this:

https://docs.celeryproject.org/en/latest/userguide/configuration.html#std:setting-task_default_queue

It specifies `CELERY_TASK_DEFAULT_QUEUE`, but
`CELERY_DEFAULT_QUEUE` is the value that Celery currently looks for.
This appears to be fixed in an upcoming release:

https://github.com/celery/celery/issues/5575

This is worth keeping an eye on, since the configuration key could
change in the future.
2019-11-14 15:48:14 -05:00
dandds
bf1badeff0
Merge pull request #1182 from dod-ccpo/lets-encrypt-manually
Configure K8s deployment for easy LetsEncrypt verification.
2019-11-14 12:46:25 -05:00
dandds
79eb691907 Configure K8s deployment for easy LetsEncrypt verification.
This is not the certificate setup we will use in production. I'd like to
merge this configuration as a reference point because this is the
easiest way to handle manual LetsEncrypt verification within the
cluster.

This allows NGINX to serve static files over HTTP from the
".well-known/acme-challenge" directory, which is necessary for certbot
validation of domain ownership.
2019-11-14 09:51:35 -05:00
dandds
387f957aa4 Add CircleCI config for staging deployment.
This generalizes the deploy step into a configurable CircleCI command.
The available parameters are:

- `namespace`: the K8s namespace to alter
- `tag`: the docker tag to apply to the image

The script for applying migrations to the K8s environment and the
corresponding K8s Job config have been generalized so that they can be
configured to run in the specified namespace.

The main workflow has been updated so that the appropriate deployment
will happen, depending on whether we are merging to staging or master.
In the future, we could look to add an additional workflow based around
Git tags for production.

Note that this also removes the creation of the `latest` tag from CD.
That tag is no longer hard-coded into our K8s config and so there's no
longer a need to update it in our container registry.
2019-11-13 09:56:36 -05:00
dandds
fd57036f74 Keep client CAs as a K8s ConfigMap.
The CAs used to verify clients are not secrets and can be committed to
the repository as K8s ConfigMaps. This updates the config to include
them.
2019-11-08 14:28:45 -05:00
dandds
630469744a Use kustomize and envsubst to generalize k8s config.
Adds a [kustomize](https://github.com/kubernetes-sigs/kustomize) overlay
for a new staging environment. Additionally, adds environment variables
in the place of certain pieces of information that need to be templated.

The K8s README ("deploy/README.md") has been updated to reflect the new
method for applying config.

This commit also removes the configuration for the AWS cluster and
references to AWS in the README.
2019-11-08 14:28:45 -05:00
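
The templating step in 630469744a is done with envsubst. As an illustration of the same idea in Python, `string.Template` performs `$VAR` and `${VAR}` substitution. This is a stand-in technique, not what the repository uses, and the variable name `CONTAINER_IMAGE` is an assumption.

```python
from string import Template


def substitute_env(text: str, env: dict) -> str:
    """Replace $VAR / ${VAR} placeholders in text with values from env.

    Unlike envsubst, which substitutes an empty string for unset variables,
    Template.substitute raises KeyError, so incomplete config fails early.
    Note that "$$" is an escape for a literal "$" here.
    """
    return Template(text).substitute(env)


# Hypothetical manifest fragment and value.
manifest = "image: ${CONTAINER_IMAGE}"
rendered = substitute_env(manifest, {"CONTAINER_IMAGE": "registry.example.com/atat:abc123"})
```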
dandds
efcb9681d3 Make Postgres SSL connection configurable.
This will allow us to force SSL connections to the database in
production by setting two values:

- PGSSLMODE should be set to "verify-full". This forces the client to
  verify the server against a known CA: https://www.postgresql.org/docs/10/libpq-ssl.html
- PGSSLROOTCERT should be set to the path of the public cert for the
  relevant CA.

When the database connection is made, these values are passed to the
adapter. For local development, PGSSLMODE is set to "prefer" and
PGSSLROOTCERT is left unset.

Kubernetes config has been added to maintain the root CAs for both Azure
and AWS as k8s ConfigMap objects. These are mounted into the containers
and referenced by PGSSLROOTCERT in the container environment.
2019-10-17 16:05:19 -04:00
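
A minimal sketch of how the two settings in efcb9681d3 might be turned into connection options. `sslmode` and `sslrootcert` are libpq connection parameters; the helper `pg_ssl_options` is hypothetical and the certificate path in the usage note is a placeholder.

```python
import os
from typing import Mapping


def pg_ssl_options(environ: Mapping[str, str] = os.environ) -> dict:
    """Build libpq SSL options from PGSSLMODE and PGSSLROOTCERT.

    Defaults to "prefer" with no root cert, matching the local development
    behaviour described in the commit message.
    """
    options = {"sslmode": environ.get("PGSSLMODE", "prefer")}
    root_cert = environ.get("PGSSLROOTCERT")
    if root_cert:
        # Only pass a root cert when one is configured.
        options["sslrootcert"] = root_cert
    return options
```

In production, `pg_ssl_options({"PGSSLMODE": "verify-full", "PGSSLROOTCERT": "/path/to/ca.pem"})` would yield both keys, which could then be passed to the database adapter as keyword arguments.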
dandds
4169dcb310 Fix CI/CD bug with PGSSLROOTCERT.
Because I pushed the environment variable changes to the cluster
already, psycopg2 was automatically trying to connect to the database
using the file specified in PGSSLROOTCERT. That ConfigMap was not
mounted into the migrations container, so I'm doing that here.
2019-10-17 14:59:41 -04:00
dandds
fc637e933d Specify Flask SERVER_NAME value for Celery worker.
The Celery worker cannot render URLs for the app without having a
SERVER_NAME value set. AT-AT's ability to send notifications when an
environment is ready is broken as a result.

This commit sets a null default value for SERVER_NAME in the default
config file. A setting must exist in the INI file in order to be
over-written by an environment variable, which is why we declare it as
null here. There is an additional kwarg, "allow_no_value", that must be
passed to ConfigParser to allow null values.

This also applies the correct domains as SERVER_NAME environment
variables in the Kubernetes ConfigMaps for the AWS and Azure Celery
workers.
2019-10-16 11:57:18 -04:00
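
The null-default mechanism described in fc637e933d can be sketched with the standard library. `allow_no_value` is a real ConfigParser argument; the `load_config` helper, the `[default]` section name, and the `DEBUG` setting are assumptions for illustration and may differ from the application's loader.

```python
from configparser import ConfigParser

# A bare key with no "=" is parsed as None when allow_no_value=True.
BASE_INI = """
[default]
DEBUG = false
SERVER_NAME
"""


def load_config(ini_text: str, environ: dict) -> dict:
    """Read settings from INI text, letting environment variables override them.

    Only keys declared in the INI file can be overridden, which is why
    SERVER_NAME must be declared there even though it has no default value.
    """
    parser = ConfigParser(allow_no_value=True, interpolation=None)
    parser.optionxform = str  # keep option names case-sensitive
    parser.read_string(ini_text)
    config = {key: parser.get("default", key) for key in parser.options("default")}
    for key in config:
        if key in environ:
            config[key] = environ[key]
    return config
```

Without `allow_no_value=True`, the bare `SERVER_NAME` line would be rejected as a parsing error rather than read as a null value.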
dandds
73a459ea28
Merge pull request #1113 from dod-ccpo/k8s-log-aggregation
K8s log aggregation
2019-10-14 13:15:41 -04:00
dandds
05c84877dd Add k8s config for adding Fluentd and piping logs to CloudWatch.
With this configuration, all Kubernetes logs within the ATAT cluster
will be sent to AWS CloudWatch.

Note that this requires applying an additional IAM policy to the worker
nodes' role.
2019-10-11 12:54:50 -04:00
dandds
bbd0ffe1a9 Kubernetes configuration to allow Azure Monitor to collect logs.
With this additional ClusterRole and ClusterRoleBinding, Azure Monitor
will receive the aggregate logs from our application containers.
2019-10-11 11:24:53 -04:00