2023-03-21
|
13:38 |
<nfraison@deploy2002> |
helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. |
[production] |
13:33 |
<elukey@cumin1001> |
START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts kafka-main1005.eqiad.wmnet |
[production] |
13:29 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts kafka-main1005.eqiad.wmnet |
[production] |
13:28 |
<nfraison@deploy2002> |
helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. |
[production] |
13:25 |
<nfraison@deploy2002> |
helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. |
[production] |
13:21 |
<nfraison@deploy2002> |
helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. |
[production] |
13:16 |
<elukey@cumin1001> |
START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts kafka-main1005.eqiad.wmnet |
[production] |
13:11 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on kafka-main1005.eqiad.wmnet with reason: Stop kafka, update idrac/bios/nic-firmware |
[production] |
13:11 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime for 3:00:00 on kafka-main1005.eqiad.wmnet with reason: Stop kafka, update idrac/bios/nic-firmware |
[production] |
13:05 |
<elukey> |
move kafka mirror maker instances to PKI migration settings (new truststores) - T319372 |
[production] |
11:20 |
<aikochou@deploy2002> |
helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' . |
[production] |
11:09 |
<joal> |
Unpause mediacounts_load airflow job with start_date set to 2023-03-21T10:00 |
[production] |
11:08 |
<joal> |
Kill mediacounts_load oozie job |
[production] |
11:07 |
<joal> |
Unpause mediawiki_history_denormalize airflow job |
[production] |
11:06 |
<joal> |
Kill mediawiki_denormalize oozie job |
[production] |
11:04 |
<joal@deploy2002> |
Finished deploy [airflow-dags/analytics@42e862b]: Regular analytics weekly train [airflow-dags/analytics@42e862b] (duration: 00m 11s) |
[production] |
11:04 |
<joal@deploy2002> |
Started deploy [airflow-dags/analytics@42e862b]: Regular analytics weekly train [airflow-dags/analytics@42e862b] |
[production] |
10:43 |
<nfraison@deploy2002> |
helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. |
[production] |
10:32 |
<nfraison@deploy2002> |
helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. |
[production] |
10:24 |
<joal@deploy2002> |
Finished deploy [analytics/refinery@0bb61e9] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@0bb61e9] (duration: 01m 30s) |
[production] |
10:22 |
<joal@deploy2002> |
Started deploy [analytics/refinery@0bb61e9] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@0bb61e9] |
[production] |
10:22 |
<joal@deploy2002> |
Finished deploy [analytics/refinery@0bb61e9] (thin): Regular analytics weekly train THIN [analytics/refinery@0bb61e9] (duration: 00m 09s) |
[production] |
10:22 |
<joal@deploy2002> |
Started deploy [analytics/refinery@0bb61e9] (thin): Regular analytics weekly train THIN [analytics/refinery@0bb61e9] |
[production] |
10:22 |
<joal@deploy2002> |
Finished deploy [analytics/refinery@0bb61e9]: Regular analytics weekly train [analytics/refinery@0bb61e9] (duration: 07m 48s) |
[production] |
10:14 |
<joal@deploy2002> |
Started deploy [analytics/refinery@0bb61e9]: Regular analytics weekly train [analytics/refinery@0bb61e9] |
[production] |
09:43 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.reimage for host kafka-main1005.eqiad.wmnet with OS bullseye |
[production] |
09:39 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on kafka-main1005.eqiad.wmnet with reason: Stop kafka, attempt to reimage |
[production] |
09:39 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime for 3:00:00 on kafka-main1005.eqiad.wmnet with reason: Stop kafka, attempt to reimage |
[production] |
09:25 |
<phedenskog@deploy2002> |
Finished deploy [performance/navtiming@d2b97ad]: (no justification provided) (duration: 00m 06s) |
[production] |
09:25 |
<phedenskog@deploy2002> |
Started deploy [performance/navtiming@d2b97ad]: (no justification provided) |
[production] |
09:06 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on cephosd[1001-1005].eqiad.wmnet with reason: Systemd units failing, puppet tries to bring them up periodically, spam on IRC |
[production] |
09:05 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on cephosd[1001-1005].eqiad.wmnet with reason: Systemd units failing, puppet tries to bring them up periodically, spam on IRC |
[production] |
08:31 |
<elukey> |
move purged daemons on cp nodes to a new CA bundle (to allow accepting kafka clients using PKI tls certs) - T319372 |
[production] |
06:50 |
<ayounsi@cumin1001> |
END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 13150 |
[production] |
06:49 |
<ayounsi@cumin1001> |
START - Cookbook sre.network.peering with action 'configure' for AS: 13150 |
[production] |
03:57 |
<mwpresync@deploy2002> |
Pruned MediaWiki: 1.40.0-wmf.26 (duration: 02m 18s) |
[production] |
03:55 |
<mwpresync@deploy2002> |
Finished scap: testwikis wikis to 1.41.0-wmf.1 refs T330207 (duration: 52m 38s) |
[production] |
03:02 |
<mwpresync@deploy2002> |
Started scap: testwikis wikis to 1.41.0-wmf.1 refs T330207 |
[production] |
2023-03-20
|
22:00 |
<samtar@deploy2002> |
Finished scap: Backport for [[gerrit:901275|Add languages to Minerva HTML (T331905)]] (duration: 09m 45s) |
[production] |
21:52 |
<samtar@deploy2002> |
jdlrobson and samtar: Backport for [[gerrit:901275|Add languages to Minerva HTML (T331905)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet |
[production] |
21:50 |
<samtar@deploy2002> |
Started scap: Backport for [[gerrit:901275|Add languages to Minerva HTML (T331905)]] |
[production] |
21:34 |
<TheresNoTime> |
`[samtar@mwmaint2002 ~]$ mwscript maintenance/namespaceDupes.php --wiki shwiki --fix` T332614 |
[production] |
21:25 |
<TheresNoTime> |
closing UTC late backport window, extended |
[production] |
21:22 |
<samtar@deploy2002> |
Finished scap: Backport for [[gerrit:901276|Rename project and project talk namespace for shwiki (T332614)]] (duration: 12m 22s) |
[production] |
21:11 |
<samtar@deploy2002> |
samtar and aleksandar: Backport for [[gerrit:901276|Rename project and project talk namespace for shwiki (T332614)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet |
[production] |
21:10 |
<samtar@deploy2002> |
Started scap: Backport for [[gerrit:901276|Rename project and project talk namespace for shwiki (T332614)]] |
[production] |
21:09 |
<ebernhardson@deploy2002> |
Finished deploy [airflow-dags/search@1302ca2]: ensure swift_upload delete_after is an integer (duration: 00m 13s) |
[production] |
21:09 |
<ebernhardson@deploy2002> |
Started deploy [airflow-dags/search@1302ca2]: ensure swift_upload delete_after is an integer |
[production] |
21:09 |
<samtar@deploy2002> |
Finished scap: Backport for [[gerrit:898845|Enable new Vector (2022) "Add topic" button at arwiki (T331313)]], [[gerrit:898846|Enable DiscussionTools usability improvements at arwiki (T329407)]] (duration: 08m 34s) |
[production] |
21:02 |
<samtar@deploy2002> |
matmarex and samtar: Backport for [[gerrit:898845|Enable new Vector (2022) "Add topic" button at arwiki (T331313)]], [[gerrit:898846|Enable DiscussionTools usability improvements at arwiki (T329407)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet |
[production] |