2022-05-31
- CEIP-1827
- Diagram
2022-05-30
- options to extend aws session
- env var: SAML2AWS_SESSION_DURATION="14400"
- cli arg: --session-duration=
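A minimal sketch of the two options above. It assumes the duration is in seconds (so 14400 would be 4 hours) and that the IAM role's maximum session duration allows a session that long. The login line is left commented out so the snippet has no side effects.

```shell
# Option 1: environment variable, read by saml2aws on the next login.
export SAML2AWS_SESSION_DURATION="14400"   # assumed to be seconds

# Option 2: the equivalent CLI flag (commented out: needs saml2aws and an IdP).
# saml2aws login --session-duration=14400

# Sanity check of the unit conversion: seconds to whole hours.
echo "requested session: $((SAML2AWS_SESSION_DURATION / 3600))h"
```

If the role caps sessions at a shorter duration, the login is expected to fail or fall back, so the cap is worth checking first.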
2022-05-27
- CEIP-1608
- successfully separated service (LoadBalancer) as well as pod
- Q: what is the rationale for separation between module (core_terraform_kong) and ‘user’ (tio-terraformcentral-ce)?
- if user may need to override. Hence should not expose things like portal hostname in tio…
- Q: why is core-terraform_kong not simply a values.yaml? The current format is way more verbose, error-prone and confusing
- Dashboard
- Decision in favour of New Relic
- https://backstage.io/docs/getting-started/
2022-05-26
issue flagged by Thomas:
# in core-kong-orchestration
terraform init -backend-config=cluster/nonprod/backend.tfvars
terraform apply -backend-config=cluster/nonprod/backend.tfvars
CEIP-1608
are the portal ‘pending’ services redundant (can access without)?
how to setup domain to test?
- service.beta.kubernetes.io/aws-load-balancer-ssl-cert: arn:aws:acm:eu-west-1:595468393306:certificate/87fd3f4e-063d-438b-95ef-8904f6b2ae26
idea:
- how to introspect state from system?
- SPEKT8 is an app to visualise K8S but must be installed in cluster
- KubeView supports running in and out of cluster
- clone git repo
make build-server
make build-client
ln -s web/client/dist/ cmd/server/frontend
- how to visualise state?
terraform graph | dot -Tsvg > tf.svg
works but as you can see is incomprehensible
- how to prevent apply without expected params?
- how to test
Example of a test I want to be able to write
As a DevOps
Given that I have an AWS account
And EKS cluster
And Postgres credentials
Then I can deploy a Kong chart
And access a Kong Workspace named
And access Kong example service 'Swagger PetStore'
And access the DevPortal 'CoreEngineering'
And EKS cluster has no policy violations
And ....
clarity seems promising
however minimal feature fails at first hurdle:
hcl Unmarshal failed: At 13:25: Unknown token: 13:25 IDENT var.account_id
what about policy violations reported in k9s?
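One possible way to make the terraform graph output noted above easier to read is to drop variable and provider nodes before rendering. This is only a sketch: simplify_tf_graph is a hypothetical helper, the resource names in the sample are made up, and it assumes node names follow the "[root] var.…" pattern that terraform graph emitted at the time.

```shell
# Hypothetical helper: remove variable, provider and bookkeeping nodes
# (and any edge that touches them) from DOT text read on stdin.
simplify_tf_graph() {
  grep -v -e '\[root\] var\.' -e 'provider\[' -e 'meta\.count-boundary'
}

# Intended use (needs terraform and graphviz installed):
#   terraform graph | simplify_tf_graph | dot -Tsvg > tf.svg

# Self-contained demo on a hand-written sample in the assumed shape.
# aws_eks_cluster.main and helm_release.kong are illustrative names only.
sample='digraph {
  "[root] aws_eks_cluster.main" -> "[root] var.account_id"
  "[root] aws_eks_cluster.main" -> "[root] provider[\"registry.terraform.io/hashicorp/aws\"]"
  "[root] helm_release.kong" -> "[root] aws_eks_cluster.main"
}'
printf '%s\n' "$sample" | simplify_tf_graph
```

Only the resource-to-resource edge survives in the demo, which is the part of the graph that usually matters when reading it by eye.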
2022-05-25
- CEIP-1608
- avail. roles only control_plane, data_plane, traditional (on docker start script)
- failed to inject secrets into portal but not control_plane
- release sequence? core-terraform-kong first, needs major release
- CEIP-1824 created and started
- Jennie SRE next steps mtg
- help TM get set up
2022-05-24
- CEIP-1608 terraform kong dev portal separately
- increase AWS inbound rules (support ticket)
- epay java mtg
- jennie SRE mtg
2022-05-23
- CWS: resource use on Java app start
- dev duty
- discovered kong PRs don't need to pass one of the Jenkins (AN will rename)
- TIO hard way
- up to step 3
- CEIP-1608
- put dev portal on own pod
- Adrian N has machine issues
- liveness check
- ‘controller’: allows to embed SRE in pod to monitor and restart
2022-05-20
- CPU high on kong-prod
- Twistlock is container scanner
- TIO hard way
- CEIP-1608
- put dev portal on own pod
- control-plane defined in tio-terraformcontrol-ce/595468393306/kong-pseudo-module_v2/kong_cp
- create similar kong_portal and disable control plane
- disable portal in control plane
- Java workload on Cortex
2022-05-19
- Add new relic to journal
- check out tip from Ashish: zsh and P10k
- CEIP-1771 AP secrets
2022-05-18
Kong crisis
- https://elsevier.atlassian.net/wiki/spaces/SRE/pages/edit-v2/119600965328969
- NOTE:
aws --region eu-west-1 eks update-kubeconfig --name nonprod
- CEIP-1791
- local terraform edit and run (first time!)
- READ: https://github.com/elsevier-centraltechnology/tio-terraformcontrol-ce/blob/master/595468393306/common/bootstrap.tf#L7
2022-05-17
- meeting w NewRelic
- mtg w ePay
- set up call w Ash for CEIP-1771
- mtg on CEIP-1608, next steps:
- put dev portal on own pod
- mod url in require scripts and let Merrick decide risk appetite
- solve issue (CSS / JS etc / cookies)
2022-05-16
- book hols
- Catch up on slack
- reading confluence, team objectives and backstage
- requested access to EAEL-TIO-CE backstage-nonprod ('3b75c084-00fb-4da2-a2f2-3fb9f502bb76') via Woodhouse, TBC if this is right. If so add to new joiners.
- wrong! try CE-Account-Compliance-Dashboard
- own objectives
- explore platform-calculator
brew install --cask docker
brew install kubernetes-cli
- spring native java:
2022-05-13
- Town Hall slides from Tue 10
- SRE definition slide useful inc. ref to ISDP
- put together a personal knowledgebase using hugo
- troubleshooting session with Thomas, Ashish, Felipe on liveness
2022-05-12
compliance training
ethics training
payslip app
read blog
todos
- build rdp cluster
X complete terraform PR
update saml2aws wiki?
- Kong
- draft own OKRs
- find some Udemy for infra?
- can users make sense of our products?
- CWS already in production want to walk away
- uptime / outages
- AMI image causes ‘outage’? (degradation)
- roof: 50% customers post-migration suffer no outages inherent to platform
- moon: 80%
- maturity model? tied to Java Cloud Native
- enhance detail assessment; qualitative interactive approach
- engagement with product teams
- suggest visibility measures
- products and challenges: case study
- x events per quarter achieve
- how many and NPS
- c3: poll: why attend? doing …; my boss told me!
- ts to present something
- TIO hard way
1:1
- questions
- why no Cortex Ops on https://elsevier.atlassian.net/wiki/spaces/TIOCE/pages/119600888457283/Core+Platforms+Team+2022+Roadmap+Objectives+DRAFT?
- extract from Cortex Build?
- Qs follow calendar year?
2022-05-11
- saml2aws
- eval $(saml2aws script)
- Paint me orange 3/3
- high performing squads
- design thinking
- OKRs
2022-05-10
kubecon
- Monday, May 16 • 15:30 - 16:30 Secure your AWS Workloads as you Build Hosted by Snyk, Sysdig and Docker (Complimentary Registration Required)
- Tuesday, May 17 • 09:00 - 17:00 GitOpsCon Europe Hosted by CNCF, Track 1 (Additional Fee + Registration Required for In Person Attendees, Includes both Tracks)
- Tuesday, May 17 • 17:00 - 19:00: AWS Container Days @ KubeCon + CloudNativeCon Europe 2022 Hosted by AWS (Complimentary Registration Required)
- Tuesday, May 17 • 15:30 - 17:00 Run Postgres, the Kubernetes Way - Garden Reception Hosted by EnterpriseDB [VIRTUAL] (Complimentary Registration Required)
- Wednesday, May 18 • 09:00 Keynotes
- Wednesday, May 18 • 11:00 - 12:30 Intro to Kubernetes, GitOps, and Observability Hands-On Tutorial - Joaquin Rodriguez, Microsoft & Tiffany Wang, Weaveworks
- Wednesday, May 18 • 11:00 - 11:35 containerd: Project Update and Deep Dive - Derek McGowan, Apple & Phil Estes, Amazon
- Wednesday, May 18 • 11:55 - 12:30 Deep Dive into Minikube - Medya Ghazizadeh & Sharif Elgamal, Google
- Wednesday, May 18 • 14:30 - 15:05 Helm Project 2022: How You Can Benefit, How You Can Help - Scott Rigby, Weaveworks; Matt Butcher, Fermyon; Martin Hickey, IBM; Andrew Block, Red Hat
- Wednesday, May 18 • 14:30 - 15:05 No Docker, No YAML and a Polyglot Developer Experience on Top of Kubernetes - Thomas Vitale, Systematic & Mauricio Salatino, VMware
- Wednesday, May 18 • 15:25 - 16:00 Autoscaling Kubernetes Deployments: A (Mostly) Practical Guide - Natalie Serrino, New Relic (Pixie team)
- Wednesday, May 18 • 16:30 - 17:05 Open Policy Agent (OPA) Intro & Deep Dive - Anders Eknert, Styra & Will Beason, Google
- Thurs: keynotes
- Thursday, May 19 • 11:00 - 11:35: Sharing Knowledge: Writing Good Docs for Quick Approval - Jared Bhatti, Waymo
- Thursday, May 19 • 11:00 - 12:30 GitOps to Automate the Setup, Management and Extension a K8s Cluster - Kim Schlesinger, DigitalOcean
- Thursday, May 19 • 11:55 - 12:30 OpenTelemetry: The Vision, Reality, and How to Get Started - Dotan Horovits, Logz.io
- Thursday, May 19 • 17:25 - 18:00 Cloud Native Chaos Engineering with LitmusChaos - Karthik S, Umasankar Mukkara & Udit Gaurav, ChaosNative; Saiyam Pathak, Civo
- Friday, May 20 • 11:00 - 11:35 From Cloud Naive to Cloud Native – Avoiding Mistakes Everyone Does - Max Körbächer, Liquid Reply
- Friday, May 20 • 14:55 - 15:30 From Student to SRE That Loves CNCF in No Time - Jacob Valdemar Andreasen, Lunar
- Friday, May 20 • 16:55 - 17:30 Metrics as a First-Class Citizen in the E2E Testing Landscape - Matej Gera & Jéssica Lins, Red Hat
accessibility training
- A familiarise with leyden design system (react only?)
- A join slack?
- A get a belt?
unconscious bias training
- current score: ~85 (too few attributes)
- A short term: take assessment test
- long term: ?
platman 1.0 logging meeting
- Github repo: https://github.com/elsevier-centraltechnology/ceteam-platman-redesign
- hierarchy of derived loggers (reminds me of log4j)
- uses zerolog
- reconcilermanager -uses- logger produces reconciler
- reconciler -derives- specific logger
2022-05-09
- Anouar Chattouna - interview
- what has been your interaction with dev teams?
- how transfer products to production?
- Tech used (lists Github Actions)
- Jenkins?
- devops since 2016, prev. Java dev
started docker, k8s
now strengths: aws, k8s, terraform
stack = module + layers (security, iam etc)
infra = many layers (caller of module)
terragrunt = DRY wrapper around terraform
Q (Ashish) around short and long lived differences
Q How work out root modules (pin provider)
Terratest - at least one test
Using Github runners - pass token to isolate different providers
pre-commit hooks - how maintain across repos?
- install pre-commit locally and run b4 commit
- weekly job to sync hooks
test accounts
in case of compromise: use cloud trail for events, config w bucket
k8s
- big beast, which most comfortable with
- external DNS providers; no golang hence no components
- eks upgrade: new node, drain and replace
- no need to touch control plane - adv of moving to EKS
- monitor: datadog
monitoring
- over usage of resources
- app specific monitoring
- endpoint: reliant on dev team for def’n of normal
- how approve PR from dev?
- effectively no change
- external resources?
- eg RDS
- how control new feature releases
- HPA? but not used
- could use canary but don’t
- observe, strong reliance on gut
- how translate logs into actionable info.
- ‘web site is slow’ what to do?
2022-05-06
CEIP-1668: cluster creation
- PR for missing instructions
- additional group requests
CEIP-1668: Dev Portal
- steps
- create workspace
- create dev portal
- portal authentication: openid, with custom JSON
- cognito needs the client just created for the dev portal to accept url for from and redirect
- Enable tracing: https://elsevier.atlassian.net/wiki/spaces/SRE/blog/2022/04/04/119600955150521/Kong+Troubleshooting+certificate+too+long+error+response+on+APIs
RDP assessment
X ticket created and self-assigned: https://elsevier.atlassian.net/browse/CEIP-1751
X request calculator submission
2022-05-05
IdP cluster: https://elsevier.atlassian.net/browse/CEIP-1668
- https://elsevier.atlassian.net/wiki/spaces/SRE/pages/119600961389156/RT+Identity+-+Cortex+Detailed+Assessment
- work thru Onboarding doc (below)
NewRelic #apim-alerts difficulty seeing the wood for the trees
X Request access: https://elsevier.atlassian.net/wiki/spaces/TIOCE/pages/23544287666440/Kong+API+Gateway+-+Observability
- Cloud native Java
- example of onboarding, perhaps something like the Pet Store
- maturity assessment: how ready is your application for cloud native
- define best practice with quantifiable characteristics
- SRE enablement course/scope/definition
- second order definition: not binary but how far have gone
- New Relic??
- UX??
- Enabling self-service
- Adopt a design system?
- ?
2022-05-04
Onboarding Christian:
- fill this form and send back the YAML: https://calculator.cortex.elsevier.systems/
- fuller instructions: https://github.com/elsevier-centraltechnology/cortex-operations/blob/master/Onboarding.md
ePay assessment