2024-01-31: office day
- add GHA to ce-nonprod: https://github.com/elsevier-centraltechnology/tio-terraformcontrol-ce/pull/1212/
- appraisal
- K8s 1.26 support (Mars PPE still flagging issue after release)
- okr ideas
- integrate advisor & crtxctl, benefit: jump in visibility / usefulness of Advisor
- deployment pipeline pattern
- prepare (terraform), deploy (helm), verify (capability tests), report (advisor? operations website?)
- cortex bpm
2024-01-30
- CEIP-4469: KSI migration
- brainstorming about how to roll KSI out on Kong
- generalise GHA for Helm further
- terraform for GHA on ce-nonprod (for flowable)
2024-01-29
- CEIP-5073: registries check
- CEIP-4469: KSI migration
- tidy up GHA for newrelic
- move on to GHA for Kong, then return to clean up terraform (if GHA approach approved)
- GHA
- take as input the datestamp to replace, i.e. treat blue-green as infra only?
- Retro
2024-01-26
- CEIP-4469: KSI migration
- PoC GHA helm deployment
- role-to-assume: arn:aws:iam::595468393306:role/Core-Elsevier-Platform-Manager-Role-nonprod
Error: Could not assume role with OIDC: Not authorized to perform sts:AssumeRoleWithWebIdentity
Solution: https://github.com/elsevier-centraltechnology/tio-terraformcontrol-ce/blob/gha-helm/595468393306/oidc-github-actions/kong-image-build-iam.tf
Specifically:
"Principal" : { "Federated" : aws_iam_openid_connect_provider.github_actions.arn },
- GHA working but with a couple of apparently ignorable errors!
E0126 17:16:35.028853 348 memcache.go:287] couldn't get resource list for projectcalico.org/v3: Unauthorized
E0126 17:16:35.029442 348 memcache.go:287] couldn't get resource list for metrics.k8s.io/v1beta1: Unauthorized
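Note to self: rough sketch of the trust policy shape behind the AssumeRoleWithWebIdentity fix, written as Python so it can be dumped to JSON and compared. Assumptions, not taken from the notes above: the provider host token.actions.githubusercontent.com, the sts.amazonaws.com audience and the repo:ORG/REPO:* subject filter are the usual GitHub OIDC values; the function name, the provider ARN and example-org/example-repo are hypothetical placeholders. The actual fix lives in the linked kong-image-build-iam.tf.

```python
import json

# Assumed GitHub OIDC issuer host; check against the real provider resource.
GITHUB_OIDC_HOST = "token.actions.githubusercontent.com"


def github_oidc_trust_policy(provider_arn: str, repo: str) -> dict:
    """Build a trust policy letting GitHub Actions runs of `repo` assume a role via OIDC."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The Federated principal must be the IAM OIDC provider ARN.
                "Principal": {"Federated": provider_arn},
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {f"{GITHUB_OIDC_HOST}:aud": "sts.amazonaws.com"},
                    # Restrict to workflows from one repository (hypothetical name).
                    "StringLike": {f"{GITHUB_OIDC_HOST}:sub": f"repo:{repo}:*"},
                },
            }
        ],
    }


if __name__ == "__main__":
    policy = github_oidc_trust_policy(
        # Placeholder ARN, assumed to follow the standard oidc-provider format.
        "arn:aws:iam::595468393306:oidc-provider/" + GITHUB_OIDC_HOST,
        "example-org/example-repo",
    )
    print(json.dumps(policy, indent=2))
```

If the role's trust policy is missing the Federated principal or the subject filter does not match the calling repo, the "Not authorized to perform sts:AssumeRoleWithWebIdentity" error is the likely symptom.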
2024-01-25
- CEIP-4469: KSI migration
- successfully completed kong_db_yyyymmdd
- moved onto control plane
- 74 min terraform: run out of security groups?
- CEIP-5105: move to Artifactory tokens
- created token and proof of concept
- GHA helm deployment
- cortex-prod-admrole
- kong-nonprod-beta
- helm install
2024-01-24
- CEIP-4469: KSI migration
- completed the jenkins ksi removal including removing two man, seven year old container image
- start same approach on kong-infra
kubectl get job postgres-20221121-init -n infra -o json | tfk8s --strip -o postgres-20221121-init.tf
2024-01-23
- CEIP-4469: KSI migration
- Solved CSI terraform issue for newrelic:
- Consolidated newrelic-logging
- potential post-install check: if the partition is getting logs we are good; https://onenr.io/0bRmbmz4Zjy
- start on jenkins-infra
- discovered cannot inject volumes into init container of jenkins chart (directly, never mind via fulcrum)
- onion problem, start peeling…
- considered bringing the jcasc checkout into jenkins pipeline but don’t understand enough about it (why is default config disabled, hence cannot start without an initial checkout)
- ended up reading secret within init container
- TIL: faster to dev by using helm -f values then convert back to terraform
2024-01-22
CEIP-4469: KSI migration
- small addition to docs based on clarification to JA: https://github.com/elsevier-centraltechnology/cortex-documentation/pull/160
- conversation about terraform -> helm -> k8s manifest: https://global-elsevier.slack.com/archives/C030F90FM7U/p1705929732781939?thread_ts=1705923256.600249&cid=C030F90FM7U
- spoke to Felpe about GHA helm instead of terraform
- current approach is log not CICD, Chris has questioned
- core-kong-operations is a valid approach: small user base justifies experiment
Planning
- Advisor handed to KA and DK
- need to get unified advisor / crtxctl interface moving fwd
- Irfan: IA need to revisit standard, perhaps more insistent on using ECR to cache artifactory
- Claire: (not currently mandated)
- need to flag use of default security group (another advisor thing)
2024-01-19
- CEIP-4469: KSI migration
- kong new relic: stuck on terraform error:
Error: YAML parse error on nri-bundle/charts/newrelic-infrastructure/templates/kubelet/daemonset.yaml: error converting YAML to JSON: yaml: line 68: did not find expected '-' indicator
- KT with jonathan
2024-01-18
- CEIP-4469: KSI migration
- grafana: TIL:
kubectl get secretproviderclass backstage-dev-secret-provider -n backstage -o yaml | tfk8s --strip -o backstage-dev-secret-provider.tf
2024-01-17
- CEIP-4469: KSI migration
- a lot of time wasted on terraform debugging
LOATHE: terraform state allows working code on one platform (darwin_amd64) to be unsupported on another (darwin_arm64)
LOATHE: this is made worse by much of our terraform being ‘abandonware’ with no ownership or maintenance
LOATHE: terraform, at least as done at ELS, is fundamentally broken:
- code is deployed before being reviewed or committed.
- modules are arbitrarily sized and structured
- testing does not exist
2024-01-16
- CEIP-4469: KSI migration
- working on various backstage projects
2024-01-15
- CEIP-4469: KSI migration
- IAM change required or not? https://us-east-1.console.aws.amazon.com/iam/home?region=eu-west-1#/roles/details/ce-nonprod-devduty-slack?section=permissions
- think not after all, as confluence-token used in secret provider should in fact have been rotor-url
- error flagged by GHA
Error: User: arn:aws:sts::702267635140:assumed-role/prod-idc-actions-runner-role/aws-sdk-js-session-1705313971221 is not authorized to perform: ssm:GetParameter on resource: arn:aws:ssm:eu-west-1:702267635140:parameter/cortex/core-engineering/sonarqube/api_token because no identity-based policy allows the ssm:GetParameter action
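Note to self: the GHA error above says no identity-based policy allows ssm:GetParameter. If a policy change does turn out to be needed, a minimal sketch of the statement might look like this. The function name is hypothetical; the parameter ARN is copied from the error message. Whether the runner role should get this at all is still an open question.

```python
import json


def ssm_get_parameter_policy(parameter_arn: str) -> dict:
    """Build an identity-based policy allowing ssm:GetParameter on a single parameter."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "ssm:GetParameter",
                # Scope to the one parameter rather than a wildcard.
                "Resource": parameter_arn,
            }
        ],
    }


if __name__ == "__main__":
    arn = (
        "arn:aws:ssm:eu-west-1:702267635140:"
        "parameter/cortex/core-engineering/sonarqube/api_token"
    )
    print(json.dumps(ssm_get_parameter_policy(arn), indent=2))
```

If the parameter is a SecureString encrypted with a customer-managed KMS key, kms:Decrypt on that key would also be needed.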
2024-01-12: vacation
2024-01-10,11
- API for AWS kubent equivalent
- CEIP-4469
- expand report to include cronjobs
- start migrating ce nonprod jobs
- check slack devduty works tomorrow: https://github.com/elsevier-centraltechnology/ceteam-devduty/pull/14
- techdesk-automator: https://github.com/elsevier-centraltechnology/accountfactory-techdesk-automator/tree/645ba5f3bdc46f95fe89bbd297302e2480b7c053
2024-01-09
- CEIP-5002: fix
- CEIP-4469: KSI migration
- ce-nonprod has a newrelic-logging that uses KSI
- the IAC is at https://github.com/elsevier-centraltechnology/tio-terraformcontrol-ce/tree/master/116634825266/newrelic
- need to add secret provider class
- need to modify helm.tf to include extraVolumeMounts and extraVolumes for secret in newrelic-logging (Ref)
- remove KSI labels from helm.tf
- upgrade NR to 5.x first -> DONE
2024-01-08
- new machine setup
- CEIP-5002: exclude FlowSchema items from 1.26 migration report
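Note to self: sketch of the exclusion idea, assuming the migration report is a list of dicts with a "kind" key. That shape, the function name and the sample rows are all hypothetical; the real report format may differ.

```python
# Kinds to drop from the report (assumption: matching on the "kind" field is enough).
EXCLUDED_KINDS = frozenset({"FlowSchema"})


def exclude_kinds(items, excluded=EXCLUDED_KINDS):
    """Return report items whose kind is not in `excluded`, preserving order."""
    return [item for item in items if item.get("kind") not in excluded]


if __name__ == "__main__":
    # Hypothetical sample rows, for illustration only.
    report = [
        {"kind": "FlowSchema", "name": "example-flow", "namespace": ""},
        {"kind": "HorizontalPodAutoscaler", "name": "example-hpa", "namespace": "infra"},
    ]
    for row in exclude_kinds(report):
        print(row["kind"], row["name"])
```

Keeping the excluded kinds in one set means further exclusions can be added without touching the filter itself.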
2024-01-05
- CEIP-4979: crtxctl generic JSON renderer: apply to all existing commands
- new machine setup
- Camunda:
- attendees
- Graeme Wilkinson (commercial, Camunda)
- Anton von Weltzein (sales eng, Camunda)
- Laksmi Remani (sales eng, Camunda)
- Richard
- Sravan
- Rakesh
- Prasath
- Tim
- agenda: product and value prop.
- q: why no business users?
- open stds, faster time to market, retain existing tool choices
- modeler:
- organised into projects
- collaboration tool
- analogy to google docs
- automatic change history (again like google docs)
- process, decisions and forms all together
- Richard questions
- due date of task?
- conditions based on what?
- scoping of data?
- SaaS version must use Optimise API to extract (or capabilities built into UI)
- gRPC gateway
- connector sdk has some capability to inject data and obviously process data goes to elastic search
- camunda 7 optimise has an advantage here in being able to ingest customer data
2024-01-04
- run 1.26 migration report
- CEIP-4979: crtxctl generic JSON renderer: single implementation