2022-02-25
10:22 <jmm@cumin2002> END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti2029.codfw.wmnet to ganeti01.svc.codfw.wmnet [production]
10:22 <jmm@cumin2002> START - Cookbook sre.ganeti.addnode for new host ganeti2029.codfw.wmnet to ganeti01.svc.codfw.wmnet [production]
10:17 <vgutierrez> rolling upgrade to HAProxy 2.4.13 on HAProxy cache nodes - T290005 [production]
09:34 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2029.codfw.wmnet [production]
09:28 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host ganeti2029.codfw.wmnet [production]
02:43 <cstone> Donation Interface revision changed from a6a9b63e to 4638c0ec [production]
2022-02-24
23:35 <ryankemper> T302526 Deployed https://gerrit.wikimedia.org/r/765652 and ran puppet across wcqs* [production]
22:06 <mutante> static-bugzilla.wikimedia.org - kubernetes - deployed gerrit:765572 - first prod service behind a k8s ingress (T290966) [production]
22:05 <mutante> phabricator - disabled git repo - labs-tools-harvesting-data-refinery/repository/master/ [production]
21:50 <pt1979@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2086.codfw.wmnet with OS bullseye [production]
21:45 <brennen> end of UTC late backport & config window [production]
21:43 <dancy@deploy1002> Started scap: testing scap container image building [production]
21:43 <tzatziki> removing 1 file for legal compliance [production]
21:42 <pt1979@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2085.codfw.wmnet with OS bullseye [production]
21:41 <mutante> phabricator - disabled git repo "frig" - outdated fundraising stuff, checked with fr-tech, not needed T296022 [production]
21:40 <brennen@deploy1002> Synchronized php-1.38.0-wmf.23/includes: Backport: [[gerrit:765626|Revert "Revert "Revert "Show message fallback keys when using &uselang=qqx"""]] (duration: 00m 57s) [production]
21:39 <pt1979@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2086.codfw.wmnet with reason: host reimage [production]
21:36 <pt1979@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2086.codfw.wmnet with reason: host reimage [production]
21:34 <pt1979@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2085.codfw.wmnet with reason: host reimage [production]
21:30 <pt1979@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2085.codfw.wmnet with reason: host reimage [production]
21:29 <brennen@deploy1002> Synchronized wmf-config/CirrusSearch-production.php: Config: [[gerrit:765577|cirrus: Reduce write isolation to only cloudelastic (T295705)]] (duration: 00m 55s) [production]
21:27 <mutante> phabricator - disabling git repo rGEDS (Elasticdash) - only one commit from 2015 - T296022 [production]
21:19 <tzatziki> removing 1 file for legal compliance [production]
21:19 <pt1979@cumin2002> START - Cookbook sre.hosts.reimage for host elastic2086.codfw.wmnet with OS bullseye [production]
21:18 <pt1979@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2083.codfw.wmnet with OS bullseye [production]
21:13 <pt1979@cumin2002> START - Cookbook sre.hosts.reimage for host elastic2085.codfw.wmnet with OS bullseye [production]
21:11 <pt1979@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2084.codfw.wmnet with OS bullseye [production]
21:07 <pt1979@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2083.codfw.wmnet with reason: host reimage [production]
21:05 <tzatziki> removing 4 files for legal compliance [production]
21:04 <pt1979@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2083.codfw.wmnet with reason: host reimage [production]
21:02 <taavi@deploy1002> Finished deploy [horizon/deploy@9d02cd6]: (no justification provided) (duration: 03m 18s) [production]
21:01 <pt1979@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2084.codfw.wmnet with reason: host reimage [production]
20:59 <pt1979@cumin2002> START - Cookbook sre.hosts.reimage for host elastic2083.codfw.wmnet with OS bullseye [production]
20:58 <taavi@deploy1002> Started deploy [horizon/deploy@9d02cd6]: (no justification provided) [production]
20:58 <pt1979@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2084.codfw.wmnet with reason: host reimage [production]
20:51 <pt1979@cumin2002> START - Cookbook sre.hosts.reimage for host elastic2084.codfw.wmnet with OS bullseye [production]
20:14 <pt1979@cumin2002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2084.codfw.wmnet with OS bullseye [production]
20:10 <pt1979@cumin2002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2083.codfw.wmnet with OS bullseye [production]
20:04 <ryankemper> T302526 `ryankemper@cumin1001:~$ sudo -E cumin -b 3 'wcqs*' 'enable-puppet "query_service: Simply jvm arg handling - T302526"; sudo run-puppet-agent'` in tmux `wcqs` [production]
20:02 <ryankemper> T302526 Depooled `wcqs1001`, ran puppet agent, and restarted `wcqs-blazegraph`. Service came up healthy, proceeding to rest of wcqs fleet [production]
19:57 <ryankemper> T302526 `ryankemper@cumin1001:~$ sudo -E cumin -b 6 'wdqs*' 'enable-puppet "query_service: Simply jvm arg handling - T302526"; sudo run-puppet-agent'` in tmux `deploy_window` [production]
19:55 <ryankemper> T302526 Depooled canary `wdqs1003`, ran puppet agent, and restarted `wdqs-blazegraph`. Tests look good, proceeding to rest of wdqs fleet [production]
19:48 <ryankemper> T302526 (Forgot to merge patch first, take two) [production]
19:48 <ryankemper> T302526 Running puppet on wdqs canary: `ryankemper@wdqs1003:~$ sudo enable-puppet "query_service: Simply jvm arg handling - T302526" && sudo run-puppet-agent` [production]
19:46 <ryankemper> T302526 Disabling puppet across entire query service (wdqs & wcqs) fleet for merge of https://gerrit.wikimedia.org/r/c/operations/puppet/+/761080: `ryankemper@cumin1001:~$ sudo -E cumin 'w*qs*' 'disable-puppet "query_service: Simply jvm arg handling - T302526"'` [production]
19:06 <dduvall@deploy1002> rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.23 refs T300199 [production]
19:00 <pt1979@cumin2002> START - Cookbook sre.hosts.reimage for host elastic2084.codfw.wmnet with OS bullseye [production]
18:56 <pt1979@cumin2002> START - Cookbook sre.hosts.reimage for host elastic2083.codfw.wmnet with OS bullseye [production]
18:55 <pt1979@cumin2002> END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host elastic2085.mgmt.codfw.wmnet with reboot policy FORCED [production]
18:53 <pt1979@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2082.codfw.wmnet with OS bullseye [production]