2024-09-30

  • inspector Q3 release

    • review all outstanding
    • replace pod with priorityclass cmds
  • capability tests not run?
    • 🦺 Capability test results
      • from 2024-09-30 00:00 to 2024-09-30 14:06
      • suites: 1 (⬆ 0), tests: 1 (⬆ 0), failures: 0 (⬆ 0)
      • Found 1 capability_test_results; Success: 1; report written to console

  • planning

    • ksi
      • start talking to partners to jolly them along
        • will you be done by end Oct?
      • IP says github runners will be helm fix so could fix partners for them in fixing the ce one

2024-09-26

  • inspector.check decorator
  • single initialisation of commands
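
  Sketch of the decorator plus single-initialisation idea above. Assumption: the names `check`, `_CHECKS` and `run_all` are hypothetical and not the actual inspector code. The idea is that each command registers itself once at import time, and a duplicate name fails loudly rather than silently re-initialising.

```python
from typing import Callable, Dict

# Hypothetical registry: one module-level dict, filled once at import time.
_CHECKS: Dict[str, Callable[[], bool]] = {}


def check(name: str):
    """Register the decorated function as a named check (assumed API)."""

    def decorator(func: Callable[[], bool]) -> Callable[[], bool]:
        if name in _CHECKS:
            # Refuse a second registration so each command initialises once.
            raise ValueError(f"duplicate check: {name}")
        _CHECKS[name] = func
        return func

    return decorator


@check("example")
def example_check() -> bool:
    # Placeholder body; a real check would query the cluster.
    return True


def run_all() -> Dict[str, bool]:
    """Run every registered check and collect results by name."""
    return {name: func() for name, func in _CHECKS.items()}
```

  Calling `run_all()` then walks the registry without any per-command setup code.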

2024-09-25

  • dry up inspector commands
    • find . -type f -name '*.py' -exec wc -l {} +
      • before (inc. tests): 13252
      • before: 9480
      • after: 8932
  • slow day due to awful cough
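
  Rough Python equivalent of the `find ... wc -l` count under "dry up inspector commands", plus the before/after arithmetic. Assumption: the function names are illustrative only; like `wc -l`, it counts newline characters.

```python
from pathlib import Path
from typing import Tuple


def count_py_lines(root: str) -> int:
    """Total newline count across all *.py files under root (like wc -l)."""
    return sum(
        p.read_bytes().count(b"\n")
        for p in Path(root).rglob("*.py")
        if p.is_file()
    )


def reduction(before: int, after: int) -> Tuple[int, float]:
    """Lines removed and the percentage reduction, rounded to 1 decimal."""
    saved = before - after
    return saved, round(100 * saved / before, 1)
```

  With the figures above, `reduction(9480, 8932)` gives 548 lines removed, about 5.8%.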

2024-09-24

2024-09-23

2024-09-20

2024-09-19

2024-09-16

  • resolve permissions for NR user key
  • release module 0.7.0 but not the images
  • command endpoint for inspector (to schedule KSI for Claire)
    • new permission problem (on lambda only)
      botocore.exceptions.ClientError: An error occurred (AccessDenied) when calling the AssumeRole operation: User: arn:aws:sts::781632261136:assumed-role/Cortex-Inspector/inspector-nonprod is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::058639152458:role/cws-nonprod-Cortex-Inspector-Agent
      [DEBUG]	2024-09-16T20:16:59.182Z	539e7352-62de-4799-a207-adeb5db8bb3f	send({'type': 'http.response.start', 'status': 500, 'headers': [(b'content-length', b'21'), (b'content-type', b'text/plain; charset=utf-8')]})
      
  • new command to check NR and report cap tests to Slack
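
  For the AssumeRole AccessDenied above: a cross-account assume generally needs two policy documents, one on each side. The sketch below only builds those documents. Assumptions: the caller role ARN is inferred from the assumed-role ARN in the error and takes no IAM path; the helper names are hypothetical; where either statement actually lives in terraform is not known here. A permissions boundary on the caller role could still block the call even with both in place.

```python
import json
from typing import Any, Dict

# Assumed: derived from the assumed-role ARN in the error, no IAM path.
CALLER_ROLE_ARN = "arn:aws:iam::781632261136:role/Cortex-Inspector"
# Taken verbatim from the error message.
TARGET_ROLE_ARN = "arn:aws:iam::058639152458:role/cws-nonprod-Cortex-Inspector-Agent"


def caller_identity_policy(target_arn: str) -> Dict[str, Any]:
    """Identity policy for the caller: allows it to request the target role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": "sts:AssumeRole", "Resource": target_arn}
        ],
    }


def target_trust_policy(caller_arn: str) -> Dict[str, Any]:
    """Trust policy for the target role: names the caller as a principal."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": caller_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(caller_identity_policy(TARGET_ROLE_ARN), indent=2))
    print(json.dumps(target_trust_policy(CALLER_ROLE_ARN), indent=2))
```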

2024-09-13

  • mostly buried in permissions, some of it working with Ashish on external-dns
  • solicit partner feedback on inspector

2024-09-12

  • conclude existing cap tests for daily run
  • add additional smoke tests
  • Claire requires KSI update for Monday
  • support 0.6.5 Inspector release
  • wrote training review for Felipe
  • chat with Ashish about cap test endpoint

2024-09-11

  • moved to scheduling the existing cap tests for daily run
  • agreed w Matteo to impl smoke tests rather than focus on API integration
  • Ashish identified permission boundary as potential cause of NR publishing issue

2024-09-10

  • dev duty
  • fighting with IAM to get newrelic working
    • Ashish highlighted permissions have to be opened at both ends
    • still nada

2024-09-09

  • EMCloud TPR1
    • Bill Reuschlein scoping TPR1 at empty cluster
    • Terry picked up on this
  • Retro
  • Irfan - start looking at 2025
    • looking at 202 clusters (~60ish already on cortex)
      • priority 1,2,3
      • priority 1: eol already or extended support
        • approached BU directors about 500% uplift in costs!
        • at the same time cannot do all the work, need experts BU side
        • goal of ‘abstracting away cognitive load’
        • cherry pick some people as ‘interface’ (train the trainer)
      • priority 2: eks but less than n-1 (1.28 falls into this at Nov)
      • priority 3: consider when replatforming, upgrading etc
    • generally accepted, ops mgrs planning, bu directors considering priorities
    • Q1 / H1 tackle priority 1s
    • Cortex XXX
      • logging being elaborated now
      • Inspector as currently envisaged done end ‘24
        • new req’ts in Q1 25?
      • Cortex Apps
        • cf SDLC and tagging projects
        • intends to get in front of DiAnna soon
      • develop GenAi bot
        • land grab
        • trademark and get thru TPR1
        • crossplane
          • unsure if terraform is big enough problem to use
          • arch wants CE to be capable of evaluating teams' desired use
          • IP prefers to offer ce composites in same way as terraform modules
          • could use it cortex side to drop platman
      • apps obstacles
        • accounts is easy solution for cost codes but coming to end of savings possible
        • consolidation becoming more pressing
        • q: how access S3 in existing acct and vpc
          • opportunity: get on the ELS fabric to benefit from the goodies
          • sdlc should deliver template apps
            • identify focus groups to set templates for (boot, fastapi etc)
          • ppt first doesn’t work historically
          • expand paved road conversation then drop in poc to demo

2024-09-06

  • smoke test
    • fixed new relic test locally
    • prepped cronjob
    • found arn:aws:secretsmanager:eu-west-1:183742092277:secret:service-account/new-relic-api-key-ANpqnl contains what I want
    • appears to be controlled from platform-infrastructure/prenode/iam.tf but not working

2024-09-05

  • daily cap test
    • test triggering cronjob
    • all appear to be running
    • catch up with Khush
    • seq diags

2024-09-04

  • Osmosis scheduling issue
    • retro observation? I spent a lot of time jumping to Slack notifications even when not actively contributing to them esp. last couple of days
  • cap test API
    • revert to body vars instead of having a split
  • daily cap test
    • add job name suffix
    • impl the feature flag test and read value from NR if enabled.
    • tested as job invoked from local machine
  • harassment training
    • seriously, pick your own harassment!
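
  For the "add job name suffix" item under daily cap test above, one possible shape. Assumptions: the suffix format and the `job_name` helper are hypothetical, not what was implemented; the 63-character cap follows the DNS label rule that Kubernetes recommends for Job names.

```python
from datetime import datetime, timezone

MAX_NAME_LEN = 63  # DNS label length limit


def job_name(base: str, now: datetime) -> str:
    """Append a UTC timestamp suffix so each manual run gets a unique name."""
    suffix = now.astimezone(timezone.utc).strftime("%Y%m%d-%H%M%S")
    # Trim the base so base + "-" + suffix fits within the limit.
    head = base[: MAX_NAME_LEN - len(suffix) - 1].rstrip("-")
    return f"{head}-{suffix}"
```

  A timestamp suffix keeps names sortable; a random suffix would work equally well if uniqueness is all that matters.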

2024-09-03

  • 1-2h wasted on ZScaler / NewRelic events API
  • couple of hours back and forth with Juan about LimitRanges
  • updates and discussion on cap test endpoint with Matteo - positive
  • daily cap test
    • minimal time - merged in revised API and about to start debugging why not behaving as expected

2024-09-02

  • planning
    • inspector check: anyone using default storage class (changing from gp2 to gp3)
    • felipe to lead ensuring 1-2-1 with 7 partners
    • will get karpenter 0.x into prod then move to 1.x (there is impact)
      • system component versioning
      • ashish mentioned felipe has asked for hard feature flag
        • left with felipe to report back on convo with Tim
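
  For the "anyone using default storage class" check in the planning notes above, a minimal sketch of the filtering step. Assumptions: input is a list of PVC objects as plain dicts in the Kubernetes API shape (`metadata.namespace`, `metadata.name`, `spec.storageClassName`); the function name is hypothetical; fetching the PVCs from the cluster is left out. A PVC counts as affected if it names the current default class (gp2) or names no class at all.

```python
from typing import Any, Dict, List


def pvcs_on_default_class(
    pvcs: List[Dict[str, Any]], default_class: str = "gp2"
) -> List[str]:
    """Return namespace/name for PVCs that rely on the default storage class."""
    hits = []
    for pvc in pvcs:
        storage_class = pvc.get("spec", {}).get("storageClassName")
        if storage_class is None or storage_class == default_class:
            meta = pvc.get("metadata", {})
            hits.append(f"{meta.get('namespace')}/{meta.get('name')}")
    return hits
```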