2025-03

  • Inspector UI API
    • Fix /reports with key and with platform class
      • find . -name "*.py" -exec wc -l {} + = 7316
      • find inspector_ui -name "*.py" -exec wc -l {} + = 3993
  • C3
    • Garrett discussion of revised Transit Gateway application
      • ‘async api pattern’
        • POST insert to a DynamoDB table
          • triggers API Gateway, which itself sets up a state machine
          • state machine triggers Step function pipes when states change
  • Planning: bring non prod and prod into regular Cortex mgmt

2025-02-28

  • Inspector UI API
    • working with Khush on Grafana
      • eg quotes around $v.‘report-creation-time’
      • drop /platforms/query in favour of * for all on /platforms

2025-02-27

  • Inspector UI API
    • fix the dependency of analyse bundle on tier
      • find . -name "*.py" -exec wc -l {} + = 8048
      • find inspector_ui -name "*.py" -exec wc -l {} + = 4473
    • fix Affected components empty
    • add Projection to exclude affected components
    • Remove all deprecations and /report-summaries

2025-02-26

  • 1-2-1 w Irfan

    • agreed w my decision to wait for team OKRs before going further on personal
      • need to better align with what may actually happen
    • concept of ‘pods’ as agent for change (aka enablement or swat teams)
      • offered to jump the fence to DBS to solve a CSC problem if the will is there
  • Inspector UI API

    • diagnosing filter issues on the ‘analysis’ dashboard
      • realised that analysis dash does permit selection of any combination of report-filters
      • Done this way we can never leverage a query; it will always be a scan, I guess. Given the requirement to apply arbitrary filters, there is no other solution. And as we have said repeatedly, there are only a small number of records in the table.
    • stats
      • find . -name "*.py" -exec wc -l {} + = 8122
      • find inspector_ui -name "*.py" -exec wc -l {} + = 4518
  • Cortex OKRs presented by Irfan

    • Map to deAnna slides
      • Paved roads => ephemeral clusters, SDLC maturity
      • “drilling into the wall from both sides”
    • Q1
      • Karpenter 1.x, Amazon Linux AL2023
    • Q2
      • Bakery
        • focus on Helios term ‘system’ (app) whether in K8s or not.
        • a common ELS baseline
      • Single touchpoint for partners all funnelled through Backstage
        • not priority to Irfan
      • comprehensive Go CI/CD reference, ‘set the standard before others set it for us’ TimSm
      • self-serve production grade platforms (on the back of TRDR)
        • can do alpha and beta, says Irfan???
        • Irfan downplaying, perhaps expecting to have push back
      • multi-tenancy
        • relies on getting finops on board
      • cortex apps
        • ‘a feature of the platform’
        • benefit existing and new teams that don’t want to be bothered with plumbing
        • Merrick: just want to host a container without requesting new cluster and all that jazz.
          • ’namespace as a service’
      • what value is Grafana?
      • TimSm: standards: docs, test coverage, consistency
      • TimSm: Project Mestor: GenAI land grab
      • SDLC: target level 3
      • AWS Lambda (not just any serverless): the next frontier for a Cortex like platform
        • biggest use of Lambda identifiable is Account Factory => not the demand, cortex apps is more interesting.
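
The scan-with-arbitrary-filters note under Inspector UI API above could be sketched roughly as follows. This is a minimal sketch, not the actual Inspector UI code: `build_scan_kwargs` is a hypothetical helper, the attribute names `tier` and `platform-class` are illustrative, and it assumes equality-only filters passed to boto3's `Table.scan`. Every attribute name is aliased through `ExpressionAttributeNames`, which is what makes hyphenated names and reserved words safe.

```python
from typing import Any


def build_scan_kwargs(filters: dict[str, Any]) -> dict[str, Any]:
    """Build keyword arguments for a DynamoDB Table.scan from equality filters.

    Hypothetical helper: every attribute name gets a '#' placeholder so that
    hyphenated names and DynamoDB reserved words cannot break the expression.
    """
    if not filters:
        return {}  # no filters selected: plain full-table scan

    names: dict[str, str] = {}
    values: dict[str, Any] = {}
    clauses: list[str] = []
    # Sort for a deterministic expression regardless of dict ordering.
    for i, (attr, value) in enumerate(sorted(filters.items())):
        names[f"#n{i}"] = attr
        values[f":v{i}"] = value
        clauses.append(f"#n{i} = :v{i}")

    return {
        "FilterExpression": " AND ".join(clauses),
        "ExpressionAttributeNames": names,
        "ExpressionAttributeValues": values,
    }


if __name__ == "__main__":
    # Illustrative filter names only; the real table attributes may differ.
    print(build_scan_kwargs({"tier": "1", "platform-class": "eks"}))
```

The result would be passed as `table.scan(**build_scan_kwargs(filters))`. A filter expression is applied after items are read, so the whole table is still scanned, which is tolerable only because the table is small.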

2025-02-25

  • Inspector UI API
    • Enabled checks missing? (just local problem)

    • X Scan for when no triumvirate

    • X remaining_days

    • X advisory_count

    • X get rid of uid (keep key instead)

    • Affected components empty

    • Date required issue

    • Remove all deprecations and /report-summaries

    • Projection to exclude affected components

2025-02-24

  • new cluster
    • argo

    • crossplane?

    • sox

    • central logging

    • Book modules (inc. CCX, which is SOX), EOPS (PHP?)

    • argo

      • biz svcs have a model (non DRY)
      • CWS later, migrate once able to get them on argo
      • biz svcs have monorepo approach to so-called microservices
    • Book module

      • AT application?
  • Inspector UI API
    • discovered:
      • cannot rename date params in API as it breaks compatibility (and anyway use of reserved word seems fine)
      • old records in dynamodb despite delete at 14 days
        • resolution: missing while in delete_old_records_from_table
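
The note above does not say what the missing `while` was looping over. One common way old records survive a cleanup job is that a DynamoDB Scan returns at most 1 MB per call and signals further pages with `LastEvaluatedKey`, so a single call sees only the first page. Assuming that was the cause here, a minimal sketch of the loop looks like this; `scan_all_items` is a hypothetical name and `scan` stands in for something shaped like boto3's `Table.scan`.

```python
from typing import Any, Callable, Iterator


def scan_all_items(
    scan: Callable[..., dict[str, Any]], **kwargs: Any
) -> Iterator[dict[str, Any]]:
    """Yield every item from a paginated DynamoDB Scan.

    `scan` is any callable with the shape of boto3's Table.scan: it accepts
    keyword arguments and returns a dict with 'Items' and, when more pages
    remain, 'LastEvaluatedKey'.
    """
    while True:
        page = scan(**kwargs)
        yield from page.get("Items", [])
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return  # no more pages
        # Resume the next call where this page stopped.
        kwargs["ExclusiveStartKey"] = last_key
```

With a real table this would be called as `scan_all_items(table.scan)`, and the delete logic would then run over every yielded item rather than only the first page.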

2025-02-21

  • Inspector UI API
    • renaming advice to advisory everywhere then fixing tests
      • find . -name "*.py" -exec wc -l {} + = 9207
      • find inspector_ui -name "*.py" -exec wc -l {} + = 4654

2025-02-19, 20

  • mostly on cluster creation for SD
  • Inspector UI API in the gaps

2025-02-18

  • Inspector UI API
    • no permission to save summary
      • botocore.exceptions.ClientError: An error occurred (AccessDeniedException) when calling the PutItem operation: User: arn:aws:sts::781632261136:assumed-role/Cortex-Inspector-UI/botocore-session-1739814867 is not authorized to perform: dynamodb:PutItem on resource: arn:aws:dynamodb:eu-west-1:781632261136:table/cortex-inspector-ui-summary because no identity-based policy allows the dynamodb:PutItem action
    • resolved couple of backwards incompatibilities around tier and * params
    • still have issues with analysis dashboard even after all fetches are 2xx

  • most of the day spent on cluster deletion and creation
    • LOATHE: shared responsibility that means partner left SG around despite explicit request (too much complexity)
    • LOATHE that platman re-creates platform on running delete in order to find outputs. Compounded by Karpenter sync being disabled and therefore timing out on EBS CSI create
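
For the AccessDeniedException on saving the summary noted above, the error says no identity-based policy allows `dynamodb:PutItem` on the summary table. A minimal sketch of the statement that would need to be attached to the role is below. The table ARN is taken from the error message; `put_item_policy` is a hypothetical helper, and how the policy actually gets attached (Terraform or otherwise) is not recorded here.

```python
import json


def put_item_policy(table_arn: str) -> dict:
    """Return a minimal identity-based policy document allowing PutItem on one table."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["dynamodb:PutItem"],
                "Resource": table_arn,
            }
        ],
    }


if __name__ == "__main__":
    # ARN copied from the AccessDeniedException message.
    arn = "arn:aws:dynamodb:eu-west-1:781632261136:table/cortex-inspector-ui-summary"
    print(json.dumps(put_item_policy(arn), indent=2))
```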

2025-02-17

  • Alerts follow up: a trend on NR?
  • SD want cluster deleted and recreated
  • some chat with Rob W about new DBS ‘super-cluster’
  • planning
    • SDLC
  • Inspector UI API
    • looks like cron jobs did not run?
    • investigation reveals loads of inspector service code to delete
    • clean up exception handling
      • find . -name "*.py" -exec wc -l {} + = 9361
      • find inspector_ui -name "*.py" -exec wc -l {} + = 4657

2025-02-14

  • Inspector UI API
    • complete status page updates
    • deploy to nonprod inc. Docker changes

2025-02-13

  • Inspector API: TODO
    • continue to test status page
      • next is figure out the bundle endpoint
      • is a 200 with no data breaking the page?
    • figure out exception handling strategy
    • test compatibility or decide to go for big bang
      • could mitigate by bringing grafana terraform into PR and testing on separate dash
  • tier 1 creation for SD
    • Liam fixing the logging
    • stuck on length constraint
    • will slip the AL2023 thru first
  • reaxys platforms
    • done (on top of 2 yesterday): gsk, novartis, ns, roche

2025-02-12

2025-02-11

  • Inspector API
    • complete /reports
      • find . -name "*.py" -exec wc -l {} + = 10316
      • find inspector_ui -name "*.py" -exec wc -l {} + = 5110

    • refactor /reports uploading bundle

      • extract_bundle_info relies on "reliability/check-request-specs.json" being present!!!
        • find . -name "*.py" -exec wc -l {} + = 10099
        • find inspector_ui -name "*.py" -exec wc -l {} + = 4970
    • refactor /reports by bundle id
      • find . -name "*.py" -exec wc -l {} + = 10103
      • find inspector_ui -name "*.py" -exec wc -l {} + = 4974

    • fix tests and introduce parameter objects

    • compare to main

      • find . -name "*.py" -exec wc -l {} + = 10605
      • find inspector_ui -name "*.py" -exec wc -l {} + = 5157
      • if replace main with new main get a total saving of ~20% on code ex tests
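
The recurring `find … -exec wc -l {} +` stat can also be taken from Python, which gives a single total even when `find` would split the file list across several `wc` invocations. A minimal sketch; `count_py_lines` is a hypothetical helper, not part of the repo.

```python
from pathlib import Path


def count_py_lines(root: str) -> int:
    """Total newline count across all *.py files under root (like summing `wc -l`)."""
    total = 0
    for path in Path(root).rglob("*.py"):
        if path.is_file():
            # Count newline bytes, which is what `wc -l` reports.
            total += path.read_bytes().count(b"\n")
    return total


if __name__ == "__main__":
    print(count_py_lines("."))
```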

2025-02-10

  • Inspector API
    • reworking get_reports (much duplication)
  • Retro
    • agree to remove the worthless pydoc, look for linter (black option?)

2025-02-08 (Sat)

  • Inspector UI API
    • ctd on POST /reports
      • simplify summary from report
      • derived properties in ReportSummary

2025-02-06

  • Inspector UI API
    • POST /reports/{platform_class}/{product}/{cluster}
    • refactoring, eliminating redundant code
  • S3 terraform options discussion

2025-02-06

  • Inspector UI API
    • /report-filters
    • find . -name "*.py" -exec wc -l {} + =
  • Science Direct investigation
    • NRQL: read from InfrastructureEvent where event.message like 'load-simulator%' where cluster … since .. until…
    • cloudwatch: "reporting-component":"Karpenter"

2025-02-05

2025-02-04

  • captest failures: 14 VPA but nothing else
  • Inspector UI API reimplemented /info and /platforms
    • find . -name "*.py" -exec wc -l {} + = 10972

2025-02-03

  • review captest failures: 30
  • c3
  • planning
  • review of Inspector UI API (minimal)

2025-01-31

  • review of Inspector UI API
    • find . -name "*.py" -exec wc -l {} + = 10605