• proposal for cap test to merge into inspector?
    • opportunity to revise arch. diag.
    • inspector -> cap test -> feature flag -> cap test vsn
  • get checkmarx working locally
  • fix SQ not reporting correctly in GHA

2024-10-31

  • more work on merging cap tests into inspector
  • engineering forum (karpenter release)
  • demo ibuild to ops team
  • sort out inspector_ui Dockerfile (copy not install python code)

2024-10-30

  • investigate alleged issues with check_resource_specs
  • extensible poetry build lifecycle (ibuild)
  • wash up Kong incident with Christian

2024-10-29

2024-10-28

  • KSI replacement for GHA
    • approved, applied and merged PR from Friday
  • cap test in inspector

2024-10-25

  • KSI replacement for GHA
  • some random Kong requests that sucked up time to return to Raffa
  • merged cap test into inspector repository
  • consideration of how to reuse lifecycle scripts across components
    • poetry multiproject plugin does ast rewriting, which seems extreme
    • try to solve with common ’library’ project

2024-10-24

  • nightmare GHA PR from Khush
    • large, copilot described and implemented
    • discussed refactor and need for separate workflows
      • need a picture
  • docs PR from > IP
  • CVE in inspector base image
    • nothing to be done?

Scan improvements

  • Should fail if jf is not installed

    JFrog CLI version: /bin/sh: jf: command not found
    Scan completed successfully for: inspector:0.8.4-dev
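
A possible guard for the missing-CLI case above, as a minimal sketch. `require_cli` is a hypothetical helper name; the only assumption about the JFrog CLI is that its binary is called `jf`, as in the output above. It uses `shutil.which` so the script fails before any scan is attempted instead of reporting success.

```python
import shutil
import sys


def require_cli(name: str) -> str:
    """Return the full path of `name`, or exit non-zero if it is not on PATH."""
    path = shutil.which(name)
    if path is None:
        # sys.exit with a string prints it to stderr and exits with status 1
        sys.exit(f"{name}: command not found, aborting scan")
    return path
```

Calling `require_cli("jf")` as the first step of the scan would turn the silent "command not found" into a hard failure.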
    
  • Put sys.exit calls inside method for better encapsulation

  • Use method composition for easy to read and test code

    import sys

    def scan():
        vsn = get_version()
        package()
        _verify_image()
        results = _scan()
        # exit non-zero so the caller (e.g. CI) sees the scan as failed
        if is_critical(classify(results)):
            sys.exit(1)
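
One way the composition idea could be made testable, as a sketch only. The result shape (dicts with a `severity` key), the severity scale, and the injected `run_scan` callable are all assumptions for illustration, not the real scanner output. The small pure functions can be unit tested directly, and the single `sys.exit` stays inside `scan`.

```python
import sys
from typing import Callable, Iterable

# Assumed severity scale, lowest to highest
SEVERITY_ORDER = ["low", "medium", "high", "critical"]


def classify(results: Iterable[dict]) -> str:
    """Return the highest severity found, or 'none' for an empty result set."""
    worst = -1
    for finding in results:
        worst = max(worst, SEVERITY_ORDER.index(finding["severity"]))
    return SEVERITY_ORDER[worst] if worst >= 0 else "none"


def is_critical(severity: str) -> bool:
    """True only for the top of the assumed severity scale."""
    return severity == "critical"


def scan(run_scan: Callable[[], Iterable[dict]]) -> None:
    """Compose the steps; the only sys.exit call lives here."""
    if is_critical(classify(run_scan())):
        sys.exit(1)
```

Passing the scan step in as a callable means a test can supply canned findings without building an image or calling the CLI.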
    
  • how to reuse script in both components?

2024-10-22

  • clean up secrets-store test
    • share single sa?
      • done and parameterised
      • also share secret provider class
    • run reconciler
    • test new cap test image in prod inspector against ce-nonprod-beta

2024-10-21

  • jira review
  • inspector test
    • fixing missing permissions: services, then statefulsets
    • also permission boundary to DescribeClusters that Khush is working on
  • BLOG, LOATHE: fine-grained access controls create a disproportionate amount of work and, worse, probably a false sense of security, since no one understands or audits them

2024-10-18

  • secrets-store cap test failure
    • tmp fix to run only on the cluster it was designed for
    • longer term, read NR relic from prod platform account
  • vpa cap test failure
        message: "Failed to delete all resource types, 1 remaining: admission webhook
        \"validate.kyverno.svc-fail\" denied the request: \n\nresource Parser/capability-testing/capability-testing-fluent-parser-multi-line-single-line
        was blocked due to the following policies \n\nvalidate-fluent-operator-parser-allowed:\n
        \ validate-fluent-operator-parser-allowed-requires-known-spec: '{\"regex\":{\"regex\":\"^(?\\u003cTIME\\u003e\\\\d+-\\\\d+-\\\\d+\n
        \   \\\\d+:\\\\d+:\\\\d+\\\\.\\\\d+)\\\\s+(?\\u003cLEVEL\\u003e\\\\S+) \\\\d+
        --- \\\\[\\\\s*(?\\u003cTHREAD\\u003e[^\\\\]]+main)\\\\]\n    (?\\u003cCONTEXT\\u003e\\\\S+)\\\\s+:
        (?\\u003cmessage\\u003e.*)$\",\"timeFormat\":\"%Y-%m-%d\n    %H:%M:%S.%L\",\"timeKey\":\"TIME\"}}
        is not among allowed list, allowed are: ''json''.'\n"
    
  • training
    • sdlc101
    • information security

2024-10-17

  • fix: lazy init NR so env vars not required for unrelated command use
  • rollout more inspector permissions: https://github.com/elsevier-centraltechnology/cortex-platform-definitions/pull/5129
  • broken cap tests in beta
    • Keyword ‘pod “test-secrets-store-volume” status in namespace “capability-testing” is READY’ failed after retrying for 2 minutes. The last error was: ‘‘False’==‘True’’ should be true.
    • Keyword ‘pod “test-secrets-store-environment” status in namespace “capability-testing” is READY’ failed after retrying for 2 minutes. The last error was: ‘‘False’==‘True’’ should be true.
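
The lazy-init fix from the first item of this day could look roughly like the sketch below. `NewRelicClient` is a hypothetical stand-in for whatever NR client the tool actually uses, and `NEW_RELIC_API_KEY` is an assumed variable name. The point is only that the environment is read on first use, not at import time, so unrelated commands run without the variable set.

```python
import os
from functools import lru_cache


class NewRelicClient:
    """Hypothetical stand-in for the real NR client."""

    def __init__(self, api_key: str) -> None:
        self.api_key = api_key


@lru_cache(maxsize=1)
def get_nr_client() -> NewRelicClient:
    """Build the client on first call; later calls return the same instance."""
    # Assumed env var name; a missing variable raises KeyError only here,
    # i.e. only for commands that actually need NR.
    return NewRelicClient(os.environ["NEW_RELIC_API_KEY"])
```

`lru_cache` does not cache exceptions, so a call that fails for a missing variable can succeed later once the variable is set.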

2024-10-16: Dev10

  • GHA still failed (no separate commit for release, lack of clarity on requirement)
  • Luis test successful, revert the failure flag
  • FG resignation convo :-(

2024-10-15 (short day)

  • modify cap test and inspector lambda to support testing a failure thru flags
  • expedite rollout and testing with Luis
  • 1h on SDLC maturity: eventually Irfan focused convo on traditional software (inspector et al.)
  • spent a couple of hours with Khush on GHA

2024-10-14

  • alpha rollout for calico perm change => bundle creation now fine

  • significant cap test failures

  • 0.8.2 rollout failure

    • tagged in github, but with wrong contents
    • not deployed to prod lambda
      • 0.8.1 displayed in swagger, 0.8.2-dev in openapi.yaml
      • 0.8.1 from /info too, 0.8.2-dev in pyproject.toml
  • cronjob for confluence items

    • can I run inspector cli from k8s cron or need endpoint?
    • pain: build separate image, set lots of env, need to do aws auth => endpoint will be easier
  • secrets-store

    Warning  FailedMount  47s  kubelet  MountVolume.SetUp failed for volume "secrets-store-inline" : rpc error: code = Unknown desc = failed to mount secrets store objects for pod capability-testing/test-secrets-store-environment, err: rpc error: code = Unknown desc = eu-west-1: Failed fetching secret core-elsevier-platform-test/test-cluster/test-secret: WebIdentityErr: failed to retrieve credentials
    caused by: InvalidIdentityToken: No OpenIDConnect provider found in your account for https://oidc.eks.eu-west-1.amazonaws.com/id/6EF491F1E59868EA532811F93EBFB5AA
      status code: 400, request id: 258e9341-58fb-47d5-8c8c-6c19d80501d4
    

2024-10-11

  • driving through the permission change affecting bundle creation
  • discovered Calico permission change was bigger than just the one (no surprise)
  • generate cap test cronjobs for all alpha clusters

2024-10-10

  • cleanup botched release
  • try to pull Khush back onto GHA as automation of the manual process, without success
  • OKR review
    • solo (20 mins)
    • with Felipe

2024-10-09

  • testing updated fluent test with Ashish
  • fixing tests
  • accidental release, turned out to be due to workflow
    on:
      push:
        tags: ["[0-9]+.[0-9]+.[0-9]+"]

    though unclear what pushed the tag
  • attempt to perform second release failed, possibly due to previous but also `poetry

2024-10-08

  • fixing tests
  • testing updated fluent test with Ashish
  • retro
  • run KSI report for Claire
    • fix: [fix: get_namespaced_resource](https://github.com/elsevier-centraltechnology/cortex-inspector/pull/218)

2024-10-07

  • SDLC training
    • SDLC 201
      • train, assess, improve
      • maturity assessment: x2 p.a. team exercise for 1h
      • videos 1,2 of 4 (+2 optional)
  • investigate lack of daily cap test results
    • missing env vars
  • CEIP-6463: update Inspector terraform for recently added env vars
    • sorted that eventually but discovered
    • strayed into diff between mocked and actual cap test responses in inspector
  • release new capability test image (0.4.0)
    • manually tagged release (TODO)

2024-10-01 - 04

  • predominantly Inspector
    • discuss and impl API change with Luis and Garrett
    • integrate Khush’s PRs with inspector decorator and base class changes
    • some tweaks to Cap Test to report as desired
    • prep for release of both