September logbook

2025-09-30

not invited to go-live (Jen was pushing for Ops to be self-sufficient)
3730:
- figure out calling PfP lambdas either thru Postman, AWS toolkit or otherwise
- consider looking again at sam-run-local now that
go live went ok for Boots and Apotec
- need to defend against ‘undefined’ - ticket coming from Pete
standup
- Connor flags ‘notification step not working’ in context of UI change he was doing.
rubber duck
- Connor was setting up regression tests, took the opp to ask Matt about my access… Apparently he did int, not dev, and these are different URLs. Now resolved.
TODO look at regression tests
5824:
- declarations of DynamoDBDocumentClient
  - nhsNotifyLambda: https://github.com/NHSDigital/eps-prescription-status-update-api/blob/dbe7074dfd28a80d9e32c8960c749e0a44daee87/packages/nhsNotifyLambda/src/utils/dynamo.ts#L17
  - nhsNotifyUpdateCallback: https://github.com/NHSDigital/eps-prescription-status-update-api/blob/dbe7074dfd28a80d9e32c8960c749e0a44daee87/packages/nhsNotifyUpdateCallback/src/helpers.ts#L27
- common index.d.ts LastNotificationStateType definition does not include the 3 separate descrs
- Lifecycle of LastNotificationStateType
  - Upsert (PUT): nhsNotifyLambda dynamo.ts line 34: supplier status not set
  - Update: nhsNotifyUpdateCallback helpers.ts line 274: supplierStatus set to result of extractStatusesAndDescriptions
    - updateNotificationsTable
      - receives CallbackResponse (Notify data type)
      - extractStatusesAndDescriptions converts to object (not list?) of strings
      - then (inline) converts to a dict
      - then converts to dynamo update string.
    - instead:
      - enhance types:
        missing data from LastNotificationStateType,
        cast to CallbackResponse
        adapt to Last…
        retain [updated] builder method
        focused unit tests, refactoring to support testability
- discussion with Pete
  - bug: setting supplier status when no data received
  - omission: initialise supplierStatus
  - supplierStatus may be enough on its own?
  - write up
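
sketch for the write-up (shell, not the TypeScript in the repo; attribute names other than supplierStatus are made up for illustration): one possible shape for the guard is to build the SET clause only from values actually received, so a missing status never overwrites what is stored. This is an assumption about the fix, not the repo's code.

```sh
# build_set_expression KEY=VALUE...
# Prints a DynamoDB-style "SET #k = :k, ..." clause, skipping any pair whose
# value is empty or the literal string "undefined". Prints nothing if no
# pair survives, so the caller can skip the update entirely.
build_set_expression() (
  expr=""
  for pair in "$@"; do
    key="${pair%%=*}"
    value="${pair#*=}"
    case "$value" in
      ""|undefined) continue ;;
    esac
    if [ -n "$expr" ]; then expr="$expr, "; fi
    expr="$expr#$key = :$key"
  done
  if [ -n "$expr" ]; then printf 'SET %s\n' "$expr"; fi
)

# Example (messageStatus and channelStatus are hypothetical attribute names):
#   build_set_expression supplierStatus=delivered messageStatus=undefined channelStatus=
#   prints: SET #supplierStatus = :supplierStatus
```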

2025-09-29

boots, go live.
- 1st test: 4 notifications sent from notify (presume going thru the stages)
- 2nd test: 1
- return call from notify failed signature due to space at end of kid??? (ssm param set by Ant?)
- tags not showing on device (NPPTS issue)
  - 200 prescriptions created, 20? fetched by PfP API, seems likely ours was simply not in the list.
  - check code…
    - need to get spine url for .envrc
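
sketch re the kid with a trailing space: a quick check that could be run on any parameter value before it is used in a signature. Function names are my own; how the value is fetched (e.g. from SSM) is left out, and the check itself is plain grep/sed.

```sh
# has_edge_whitespace VALUE
# Succeeds (exit 0) when VALUE starts or ends with whitespace.
has_edge_whitespace() {
  printf '%s' "$1" | grep -Eq '^[[:space:]]|[[:space:]]$'
}

# trim_edges VALUE
# Prints VALUE with leading and trailing whitespace removed.
trim_edges() {
  printf '%s' "$1" | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//'
}

# Example (the kid value here is a placeholder, not a real one):
#   kid="example-kid "
#   if has_edge_whitespace "$kid"; then kid=$(trim_edges "$kid"); fi
```

note that command substitution only strips trailing newlines, so a trailing space survives a copy through $(...) and has to be checked for explicitly.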

3730: running PfP

found values for .envrc from aws console in dev account

export AWS_DEFAULT_PROFILE=prescription-dev
export stack_name=tstephen-nhs-1
#export TARGET_SPINE_SERVER=<NAME OF DEV TARGET SPINE SERVER>
# from https://nhsd-confluence.digital.nhs.uk/spaces/APIMC/pages/669194951/KOP-017+Setup+to+call+new+spine+interaction+from+AWS+lambda
#export TARGET_SPINE_SERVER=veit07.devspineservices.nhs.uk
# from aws console (dev acct)
export TARGET_SPINE_SERVER=msg.veit07.devspineservices.nhs.uk
#export TARGET_SERVICE_SEARCH_SERVER=<NAME OF DEV TARGET SERVICE SEARCH SERVER>
export TARGET_SERVICE_SEARCH_SERVER=nhsuk-apim-stag-uks.azure-api.net

export LOG_LEVEL=DEBUG
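
a small sanity check for the values above, as a sketch: list which of the expected variables are still unset or empty after the .envrc is loaded. The function name is mine; the variable names are the ones exported above.

```sh
# missing_env_vars NAME...
# Prints the names of variables that are unset or empty, one per line.
# Runs in a subshell so the loop variables do not leak into the caller.
missing_env_vars() (
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      printf '%s\n' "$name"
    fi
  done
)

# Example:
#   missing_env_vars AWS_DEFAULT_PROFILE stack_name TARGET_SPINE_SERVER \
#     TARGET_SERVICE_SEARCH_SERVER LOG_LEVEL
```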

make aws-configure: make sure to use the suggested sso-session name, as make aws-login (to renew the session) assumes it.
make sam-sync seems to work (sync is ok, did not invoke to test the result)
make sam-run-local runs out of file handles: Error: [Errno 24] inotify instance limit reached
- ```
cat /proc/sys/fs/inotify/max_user_watches
1048576
```
  (seems like a big number to me)
- apparently a function of the number of files sam has to watch; created .samignore to reduce the total number, but that wasn’t enough
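
worth noting: the error names the inotify instance limit, which on Linux is a separate per-user setting (max_user_instances) from the max_user_watches value checked above. A sketch to print all three inotify limits side by side (function name is mine; the directory argument exists only so it can be pointed somewhere else):

```sh
# inotify_limits [DIR]
# Prints the per-user inotify limits found under DIR (defaults to the Linux
# procfs location). A file that cannot be read is reported as "n/a".
inotify_limits() (
  dir="${1:-/proc/sys/fs/inotify}"
  for name in max_user_instances max_user_watches max_queued_events; do
    if [ -r "$dir/$name" ]; then
      printf '%s=%s\n' "$name" "$(cat "$dir/$name")"
    else
      printf '%s=n/a\n' "$name"
    fi
  done
)

# Example:
#   inotify_limits
```

if max_user_instances turns out to be the low one, it can be raised for the current boot with sudo sysctl fs.inotify.max_user_instances=<n>; whether that cures sam-run-local is untested.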

return to make sam-sync

samcli.commands.deploy.exceptions.DeployFailedError: Failed to create/update the stack: tstephen-nhs-1, An error occurred (ValidationError) when calling the CreateStack operation: Parameters: [StateMachineLogLevel] must have values

sam-sync does not read $STATE_MACHINE_LOG_LEVEL as deploy does; modified template to provide default (note non-standard options)

delete cloud formation stack: aws cloudformation delete-stack --stack-name tstephen-nhs-1
Found postman/README.md - TODO try it out
AWS toolkit missing from side [‘activity’] bar. Solved by Ctrl+Shift+P Developer: Reload Window

2025-09-26

notifications stand up
- Pete flagging either proxy or proxigen stories coming, will set up mtg w Matt/Ant next week on the latter
- Supplier onboarding page: https://nhsd-confluence.digital.nhs.uk/spaces/APIMC/pages/1111220722/Dispensing+System+Onboarding+Tracker
go live:
- clarified systems = ’ ’ as single ODS code being enabled for now
- clarified ODS codes to use
- clarified order of suppliers
rubber duck: prescription tracker login loop
- cve exceptions on validation
  - there are 2 because currently using Docker image plus lambda hoping to move to
  - https://github.com/NHSDigital/validation-service-fhir-r4
  - https://github.com/NHSDigital/eps-FHIR-validator-lambda
5087:
- packages/coordinator/README suggests npm run server; should be npm run start
  - however, npm run start-dev appears to need a package.json in packages (aka ..)
  - unclear what content should be
  - perhaps related, why does coordinator/package.json reference files from packages rather than .?
    - I did try ‘fixing’ start-dev to match others without success
run thru
- Jim needs to provide splunk query for notification

2025-09-25

5803: pair with Jim on change to
- rename secret as KID rather than name
- replace isProd within SAMTemplates with injected params
quite a long chat
chased yesterday’s PRs
5087:
- affects electronic-prescription-service-api
  - PR 3868
- Try the AWS tools to hit the ‘working’ endpoint .../eps-fhir-prescribing-api/content#post-/FHIR/R4/$prepare
- if good, try remote debugging?
- need to understand contents of .envrc? (See README)
go live regroup
- test user in prod. currently hostage to having to have dispenser onboard
- review Go Live, set up dev profile and verify can run aws sam similar commands

2025-09-24

AEA-5789:
- status:
  - tried setting up JWKS and got access token but 401 on PSU endpoints (did add the PR endpoints)
  - Postman: how to set up Auth? JWT page but what a faff
  - Regression tests: need PSU_CLIENT_ID and PSU_CLIENT_SECRET
  - VSCode AWS plugin invoke lambda
- AWS plugin did the trick with data payload extracted from postman
Sprint review
- Adam has checked out 2 supplies x 2 prescriptions in prod ready for tonight
Digi medicine (Fintan Grant)
- 15 Oct: Team mtg in Leeds
- Need to spend more (time and money) on getting adoption
- EPS a good example of breaking out of SPINE monolith to enable smaller more freq. releases
- Will Gallear: onboarding suppliers
Regression tests: need PSU_CLIENT_ID and PSU_CLIENT_SECRET
- Jim suggests: https://dos-internal.ptl.api.platform.nhs.uk/MyApplications/ApplicationDetails?appId=a453bc48-c5f3-409e-ba65-a2b34a490ff3
  - I cannot view
15 oct team mtg
- https://www.pitchup.com/campsites/England/North_East/North_Yorkshire/Selby/squires-cafe-bar/?arrive=2025-10-15&depart=2025-10-16
- https://www.squires-cafe.co.uk/menu/
AEA-3730: search api

2025-09-23

Kayal asked for evidence on PfP sandbox PR

cyclonedx-npm --output-format json --output-file sbom-node.json
grype sbom:sbom-node.json -o table

to install

npm install -g @cyclonedx/cyclonedx-npm
curl -sSfL https://raw.githubusercontent.com/anchore/grype/main/install.sh | sh -s -- -b /usr/local/bin

backlog
- 5732: OAS schema change
- 4876: API to post to sharepoint, permissions may be a challenge. Later on to power BI
  - depends on XXXX to extract from splunk, Ant has the key
  - splunk func. is email only
  - tech mtg of options, mon mtg approve, decision log
- 4215: canary instead of blue-green -> delta (Ant)
- 3789: compose list of things in SAM eligible for deletion and ask for approval
- 5389: add app name to existing report
- 3730: marked dupe. but perhaps confused prescrip and dispens addrs
- 3532: SAM template faffing
- 4110: don’t worry about diff defn of PfP
- 3771: acct resources => https://github.com/NHSDigital/electronic-prescription-service-account-resources

2025-09-22

Greg said he would resolve timesheet behind the scenes, not to worry that it won’t show on UI
- 15:30
  Hi @tim.stephenson just to let you know. I’ve had confirmation back from finance that they’ve adjusted your timesheet accordingly. https://airelogic.slack.com/archives/D09EHCZMBG8/p1758551433902379
AEA-5789
- cloud formation failed due to previous rollback (copilot explained)
  - Jim explained have to manually clean, GHA cleanup only applies to old PRs
- I was missing export of q from messaging
- Then hit illegal name (...fifo); make sam-validate or make sam-run-local would have flagged
- testing…
  - import postman collections (3!)
  - create [postman] environment
    - cpsu_api_key: wFW9WPe2ZVkEsgu9G0R5C6As0MDxb74p
    - api_key: PU1p1qtY6T48umPmjpnWEUjKp6CfUpti (Secret: Zdb73CGw57nApEnP)
      - Jim: XklkCNg5TUk45xSBdQUcl13J4BA7Z8EJ
    - kid: psu-dev-tstephen-nhs
      - PSU-DEV-JIM-WILD
    - psu-kid: ditto
    - private_key: generate per: https://digital.nhs.uk/developer/guides-and-documentation/security-and-authorisation/application-restricted-restful-apis-signed-jwt-authentication#step-2-generate-a-key-pair
    - host: internal-dev.api.service.nhs.uk
    - custom_stack_name: psu-pr-xxxx (but redundant?)
    - aws_pull_request_id: pr number
    - status_api_key: same as api key (also redundant)
    - NOTIFY_API_KEY: 7R4fNjWvO3hTG6wj3xZxZqKF448mRRen
    - NOTIFY_APP_NAME: EPS-NHS-NOTIFY
    - NOTIFY_APP_ID: bd492cde-b67e-487b-97fd-44b7414c8e95
AEA-0000-rename-sandbox
- Bence approved, then I had to merge main and sign and it all went to hell: SQ no longer finds the package

2025-09-19

new repo: eps-test-reports
- 2 GHAs to publish ~10 mins, large size of gh-pages branch
- All repos list: https://nhsd-confluence.digital.nhs.uk/spaces/APIMC/pages/889961113/EPS+github+repositories
‘notifications’ team is now ‘tango’
AEA-5789: alerts
- NHSNotifyPrescriptionsSQSQueue defined in SAMTemplates/messaging/main.yaml with redrive policy to NHSNotifyPrescriptionsDeadLetterQueue after 5 failures
timesheets
- Jira: logged Th 18 & Fr 19
- Airecentre: issue, contacted Greg, who’s away.
TODO: Splunk training: https://hscic365.sharepoint.com/sites/NMS/SitePages/Splunk-Training.aspx

2025-09-18

Daniel (Danny) Williams: lead arch for NHS app
- goal: successful, first time
Questions
- future research on finer granularity of notifications
- poll SQS every minute to reduce load by batching Notify calls
- sensitivity about anything that turns off notifications (apparently 40% off currently)
- 177k messages / day for 50% app take up and 17% pharmacy takeup
  - app can handle 1000s notifications / sec
  - vaccinations can queue many millions of notifications so can lead to significant delay
- two unrelated retries: invisible 5 mins plus retry back off
- future research on how old a notification can be whilst still being useful
- notify is a ‘silver’ service = 24x365
- notify problems: find when vaccination notifications being sent!
- Hayley Stokes: cross-team standup in pm for period of hypercare after go-live
- Kelvin Lee controls access to Notify’s ‘actual’ delivery status
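
back-of-envelope on the 177k / day figure against the one-minute polling idea, as a sketch. It assumes messages are spread evenly over 24 hours, which real traffic will not be (expect peaks in pharmacy opening hours), so treat the results as a floor on peak load.

```sh
# avg_rate COUNT_PER_DAY PERIOD_SECONDS
# Prints the average number of messages per period, to one decimal place,
# assuming an even spread over 86400 seconds.
avg_rate() {
  awk -v n="$1" -v p="$2" 'BEGIN { printf "%.1f\n", n * p / 86400 }'
}

avg_rate 177000 1    # per second: prints 2.0
avg_rate 177000 60   # per one-minute poll: prints 122.9
```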
Dev: security patching
- assigned Jiras to self, sprint and put In Dev
- Fixed SAM, PR passing: https://github.com/NHSDigital/prescriptionsforpatients/pull/2069
- stack name: pfp-pr-2069
- cache gpg passphrase: https://superuser.com/questions/624343/keep-gnupg-credentials-cached-for-entire-user-session
- Postman:
  - setup with Jim
    - download
    - create env
    - add vars to env: status_api_key, api_key, client_secret, client_id, host, aws_pull_request_id#
    - turns out only the last needed and is GH PR not AWS
    - did divert to create NHS ‘app’ (effectively a svc account for me, for pfp in dev) but unneeded

2025-09-15

AEA-5087
- attempt to find invoke url from logging
New starter guide
- detailed log:
- Time reporting 7.5h / day against epic
  - https://nhsd-jira.digital.nhs.uk/browse/AEA-5264: Onboarding
  - otherwise follow epic link on ticket

2025-09-15

Standup
https://nhsd-jira.digital.nhs.uk/browse/AEA-5087
- env change is done, now need to understand why sandbox is connecting to spine.
- reproduce in internal-dev-sandbox
- create branch, add debug, check in logs (Cloudwatch)
deployment history
- https://nhsdigital.github.io/electronic-prescription-service-api/ (broken pending Ant PR)
- https://nhsdigital.github.io/eps-prescription-status-update-api/
launch readiness review
AEA-5743: monitor for errors and correct retry in prod
AEA-5087 Ant said:
the sandbox api will then be available at prescribe-dispense-pr-${pull-request-id}-sandbox
I understood this to mean that I can swap the host in my curl command from https://internal-dev-sandbox.api.service.nhs.uk to https://prescribe-dispense-pr-3868-sandbox.api.service.nhs.uk [3868 is my PR number]. But apparently not. Where is my stack deployed?
Going thru the console I found: the [ECS](https://eu-west-2.console.aws.amazon.com/ecs/v2/clusters/prescribe-dispense-pr-3868-cluster/services/prescribe-dispense-pr-3868-fhirFacadeService41A82399-TTPehiMquq8Q/health?region=eu-west-2) and therefore the [ALB](https://eu-west-2.console.aws.amazon.com/ec2/home?region=eu-west-2#LoadBalancer:loadBalancerArn=arn:aws:elasticloadbalancing:eu-west-2:591291862413:loadbalancer/app/prescr-fhirF-Uy1LRYO4WPwt/eba4342f823aa5a5;tab=listeners)

2025-09-12

Ant: AWS access should be there, thus unblocking Jira/Confluence
- spend time getting setup today
Paul: catch up this pm
Rubber duck
- Jim & splunk

2025-09-11

retro on mural
- team names: notifications -> tango
- tech changes at impl ticket stage need to feedback into docs (from Pete)
- new starters:
  - Mike Grimwood, Lead Dev Mgr
no Jira/Confluence

2025-09-10

mtgs: standup, review
onboarding
- two accounts: NHS & VDI
- Jiras
  - 5743
  - 5922
  - 5729: query for users with 3 notifications, tabulate with time received and flag if violates cooldown
  - 5728
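
sketch for 5729, assuming the query results can be exported as "user,epoch_seconds" lines (that export format is hypothetical, as is the function name): sort per user by time and flag any consecutive pair closer together than the cooldown. The cooldown is a parameter; the 7200 s default is an assumption.

```sh
# flag_cooldown_violations [COOLDOWN_SECONDS]
# Reads "user,epoch_seconds" lines on stdin and prints one line per
# violation: user,previous_time,this_time,gap_seconds.
flag_cooldown_violations() (
  cooldown="${1:-7200}"
  sort -t, -k1,1 -k2,2n | awk -F, -v cd="$cooldown" '
    # Same user as the previous row and too soon after it: report the pair.
    $1 == prev_user && ($2 - prev_ts) < cd {
      print $1 "," prev_ts "," $2 "," ($2 - prev_ts)
    }
    # Remember this row for comparison with the next one.
    { prev_user = $1; prev_ts = $2 }
  '
)

# Example:
#   flag_cooldown_violations 7200 < notifications.csv
```

a gap exactly equal to the cooldown is treated as allowed; flip the comparison to <= if the rule is meant to be strict.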

2025-09-09

mtgs
chat w Jim on the 130% problem
- need PSU with more than one MedicationRequest
- https://github.com/wildjames/JimScripts

2025-09-08

45mins mtgs
- splunk issue identified cause of over-reported notifications
  - Jim, Ant and I to look at
Notifications refinement
- planning now on NHS portal: Prescription Tracking Notifications Research plan.xlsx
- query about 130% notifications
  - no variation on Sun / bank holiday
  - reprocessing some records? => issue with splunk query?

2025-09-05

‘main’ stand-up
- John Kitson: live incident
- Jonathan Welch
- Paul Hoskin
- James (Jim) Wild
- Connor Avery
- Jennifer Redman
- Matthew Popat
- Bencé Gadanyi
- Kayal Rathinaveeelu
- Thomas (Tom) Merrington
Greg Steel
- Aire centre
- submitted 3 days for w/c 1st Sept
- remember to submit part weeks before month end
Paul H
- will send invites
- projects
  - prescription tracker
  - eps assist me: chatbot enhancement
  - notifications
- parallel team ‘echo’ (John ???, Emma (BA))
- ceremonies
  - standup evs: M,W,F
  - standup full team: T, T
  - rubber duck
  - sprint review: fortnightly, seeking to be more prepared and structured
- team:
  - leads: Ant, Matt, Adam (new)

2025-09-04

planning meeting: prod go live (slated for 18/09)
- need prod patient to test with
  - testing shld only be a day but relies on Emma’s capacity
- splunk report def’ns stored in individual’s profile

2025-09-03 - day 1 at AireLogic

quick chat with Paul Hoskins, who will set up:
- contact w Jim (pat leave)
- intro to notifications project
Lyndsey wanted to register equipt, which led me to note:
- need antimalware, disk encryption and warning of dismissal if access systems from abroad.
- ClamAV:
```
sudo apt update && sudo apt upgrade
[ -f /var/run/reboot-required ] && echo "Restart required"
sudo apt install clamav clamav-daemon clamtk
clamtk &
```
  found a handy reference
- turned out I was lazy on install, no root encryption
  - option to do retrospectively: https://unix.stackexchange.com/questions/444931/is-there-a-way-to-encrypt-disk-without-formatting-it
  - but prob easier to reinstall
prescriptions repo
TODO
- dev containers
- signing git commits
- nixos? other dynamic environment?
- encrypted disk
  - clean root disk
  - clean / backup data disk
Questions
- sign commits as whom? should be airelogic account?
- dev container runs against remote environment or both hosted locally?
Dev containers
- Reference
Intro to notifications
- Go live in 2 weeks and then 1 week before Jim off on pat leave for 3 months
- Jo:
  - user researcher, onboard 1 year, just after Jim
  - also on: ? and eps: assist me
  - wider prescriptions and medications
    - prescription bar code could allow to take script to another pharmacy
    - ???
  - national rollout starting, 60% of pharmacies, boots is fully included in current 16%
  - limited press, unlikely to be discoverable
- Pete is the BA, who is responsible for the partner suppliers
- Jim
  - dedupe rules (eg 2 items on script or 2 scripts at diff pharmacies)
  - pharmacy sends PSU to trigger the seq diag.
  - msg goes to FIFO SQS q (some dedupe inherent)
    - id encodes 1 notification per person, per pharma, per 5 mins
  - processor lambda starts 1/min and works till q empty or ‘nearly’ (becos of 100 batch size)
  - cool off (in dynamo) prevents more frequent than 2-hourly notification
    - success can therefore mean not sent
  - silent (parallel) running waits 100ms which is approx. how long it takes to get notification then record answer
  - failures go to dead letter q
    - 2 weeks or a month TTL but no handling on there.
  - timeliness is key: business case rests on saving 75% of pharmacy time (answering phones)
  - notify provide call back service (to ‘us’ not patient) - stashed in same dynamo table
- Kayal (she) is tester
  - scenario 2: notifications are off, should
    - message, channel, supplier
- Joe Seaton, contact in Notify team
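
to check my understanding of Jim's dedupe and cool-off rules, a sketch in shell. The id format, function names and placeholder values are all assumptions (the real id is built in the lambda code and may look nothing like this); only the 5 minute window and the 2 hour cool-off come from the notes above.

```sh
# dedupe_id PERSON PHARMACY EPOCH_SECONDS [WINDOW_SECONDS]
# Prints an id that is identical for the same person and pharmacy within
# one fixed window (default 300 s = 5 minutes).
dedupe_id() {
  printf '%s-%s-%s\n' "$1" "$2" "$(( $3 / ${4:-300} ))"
}

# cooldown_elapsed LAST_SENT_EPOCH NOW_EPOCH [COOLDOWN_SECONDS]
# Succeeds when at least COOLDOWN_SECONDS (default 7200 = 2 hours) have
# passed since the last notification was sent.
cooldown_elapsed() {
  [ "$(( $2 - $1 ))" -ge "${3:-7200}" ]
}

# Example (PERSON and PHARMACY are placeholders):
#   dedupe_id PERSON PHARMACY 1700000000   # same id as for 1700000099
#   dedupe_id PERSON PHARMACY 1700000100   # next window, different id
#   cooldown_elapsed 1700000000 1700003600 || echo "still cooling off"
```

with fixed windows, two updates a few seconds apart can still land either side of a boundary and get different ids, which is presumably one reason the cool-off check exists as a second line of defence.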
shadowing
- 5 environments (env => AWS account)
  - dev: PRs deploy there
  - qa
  - ref
  - backup
  - prod
- test users (NHS number):
  - …3126
- ODS codes identify a pharmacy
- PFP: prescriptions for patients