2021-03-10 §
00:02 <robh@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-backup1002.eqiad.wmnet with reason: REIMAGE [production]
00:00 <robh@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-backup1001.eqiad.wmnet with reason: REIMAGE [production]
2021-03-09 §
23:59 <robh@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on ms-backup1002.eqiad.wmnet with reason: REIMAGE [production]
23:58 <robh@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on ms-backup1001.eqiad.wmnet with reason: REIMAGE [production]
22:04 <mutante> phab1001 - manually running phab public task dump script after making changes to redirect stdout [production]
22:00 <razzi> rebalance kafka partitions for webrequest_upload partition 14 [analytics]
21:54 <marxarelli> restoring from db06 dump on db07 and db08 following `DROP VIEW IF EXISTS user` workaround (T276968) [releng]
20:53 <marxarelli> restore on db07 failed. appears to be a bug w/ mariadb/mysqldump 10.4 compat https://jira.mariadb.org/browse/MDEV-22127 (T276968) [releng]
20:53 <marxarelli> restore on db07 failed. appears to be a bug w/ mariadb/mysqldump 10.4 compat https://jira.mariadb.org/browse/MDEV-22127 [releng]
20:42 <elukey> reimaged an-worker1091 to buster [production]
20:42 <elukey> reimaged an-worker1091 to buster [analytics]
20:41 <bstorm> depooled labsdb1009 T276980 [production]
20:39 <marxarelli> doing `--skip-grant-tables` on deployment-db08 and creating a new root@127.0.0.1 user (T276968) [releng]
20:33 <Majavah> install mariadb on deployment-db08 T276968 [releng]
20:25 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1091.eqiad.wmnet with reason: REIMAGE [production]
20:25 <bstorm> downtimed labsdb1009 so it doesn't keep paging T276980 [production]
20:23 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1091.eqiad.wmnet with reason: REIMAGE [production]
20:09 <brennen> train status: 1.36.0-wmf.32 (T274938) on group0 at 20:06:32 UTC; logs initially quiet. [production]
20:06 <brennen@deploy1002> rebuilt and synchronized wikiversions files: group0 wikis to 1.36.0-wmf.34 [production]
19:59 <marxarelli> creating new instance deployment-db08 to use as new beta replica db (T276968) [releng]
19:56 <marxarelli> deleting deployment-db05 to free up quota for new replica (T276968) [releng]
19:50 <marxarelli> restoring database dump on deployment-db07 (T276968) [releng]
19:05 <brennen@deploy1002> Pruned MediaWiki: 1.36.0-wmf.31 (duration: 03m 34s) [production]
19:04 <pt1979@cumin2001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
18:59 <pt1979@cumin2001> START - Cookbook sre.dns.netbox [production]
18:54 <brennen@deploy1002> Finished scap: testwikis wikis to 1.36.0-wmf.34 (duration: 47m 25s) [production]
18:52 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1087.eqiad.wmnet with reason: REIMAGE [production]
18:49 <marxarelli> restarting db dump on db06 `mysqldump -h 127.0.0.1 --events --routines --triggers --all-databases -f --single-transaction` (T276968) [releng]
18:49 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1087.eqiad.wmnet with reason: REIMAGE [production]
18:47 <dcausse> re-pool wdqs1004 [production]
18:38 <Majavah> installing mariadb 10.4 via role::mariadb::beta to db07 T276968 [releng]
18:37 <mbsantos@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'mobileapps' for release 'production' . [production]
18:35 <mbsantos@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mobileapps' for release 'production' . [production]
18:34 <pt1979@cumin2001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
18:29 <pt1979@cumin2001> START - Cookbook sre.dns.netbox [production]
18:26 <elukey> reimage an-worker1087 to buster [production]
18:26 <elukey> reimage an-worker1087 to buster [analytics]
18:25 <marxarelli> "View 'labswiki.tag_summary' references invalid table(s) or column(s) or function(s) or definer/invoker of view lack rights to use them" when using LOCK TABLES during mysqldump on db06 (T276968) [releng]
18:21 <Majavah> create deployment-db07 as g2.cores8.ram16.disk160 Buster T276968 [releng]
18:20 <marxarelli> disabled puppet on deployment-db06 and started mysqldump (T276968) [releng]
18:16 <mbsantos@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'proton' for release 'production' . [production]
18:13 <mbsantos@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'proton' for release 'production' . [production]
18:12 <brennen@deploy1002> Started scap: testwikis wikis to 1.36.0-wmf.34 [production]
18:10 <mbsantos@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'mobileapps' for release 'production' . [production]
18:09 <Majavah> set deployment-db05 to read-only to avoid issues with T276968 [releng]
18:05 <mbsantos@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mobileapps' for release 'production' . [production]
18:04 <marxarelli> deleting shut down memc* deployment-prep instances to free up quota for replacement db instances (T276968) [releng]
18:03 <mbsantos@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'mobileapps' for release 'staging' . [production]
18:02 <marxarelli> deleting shut down memc* deployment-prep instances to free up quota for replacement db instances (T276968) [production]
18:02 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1085.eqiad.wmnet with reason: REIMAGE [production]