2024-06-25
07:02 <marostegui@cumin1002> dbctl commit (dc=all): 'Depool db2165 T368355', diff saved to https://phabricator.wikimedia.org/P65403 and previous config saved to /var/cache/conftool/dbconfig/20240625-070252-marostegui.json [production]
07:01 <marostegui@cumin1002> dbctl commit (dc=all): 'Promote db2161 to s8 primary T368355', diff saved to https://phabricator.wikimedia.org/P65402 and previous config saved to /var/cache/conftool/dbconfig/20240625-070127-marostegui.json [production]
07:01 <marostegui> Starting s8 codfw failover from db2165 to db2161 - T368355 [production]
07:00 <arnaudb@deploy1002> Finished scap: Backport for [[gerrit:1049386|Revert "dbconfig: temporary disable writes on es7"]] (duration: 07m 47s) [production]
06:55 <arnaudb@deploy1002> arnaudb: Continuing with sync [production]
06:55 <arnaudb@deploy1002> arnaudb: Backport for [[gerrit:1049386|Revert "dbconfig: temporary disable writes on es7"]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) [production]
06:54 <isaranto@deploy1002> helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' . [production]
06:52 <arnaudb@deploy1002> Started scap: Backport for [[gerrit:1049386|Revert "dbconfig: temporary disable writes on es7"]] [production]
06:48 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P65401 and previous config saved to /var/cache/conftool/dbconfig/20240625-064841-marostegui.json [production]
06:45 <arnaudb@deploy1002> Sync cancelled. [production]
06:45 <arnaudb@deploy1002> arnaudb: Backport for [[gerrit:1049386|Revert "dbconfig: temporary disable writes on es7"]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) [production]
06:42 <arnaudb@deploy1002> Started scap: Backport for [[gerrit:1049386|Revert "dbconfig: temporary disable writes on es7"]] [production]
06:40 <arnaudb@cumin1002> dbctl commit (dc=all): 'T368020', diff saved to https://phabricator.wikimedia.org/P65400 and previous config saved to /var/cache/conftool/dbconfig/20240625-064000-arnaudb.json [production]
06:39 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 33 hosts with reason: Primary switchover s8 T368355 [production]
06:39 <marostegui@cumin1002> dbctl commit (dc=all): 'Set db2161 with weight 0 T368355', diff saved to https://phabricator.wikimedia.org/P65399 and previous config saved to /var/cache/conftool/dbconfig/20240625-063908-root.json [production]
06:38 <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 1:00:00 on 33 hosts with reason: Primary switchover s8 T368355 [production]
06:34 <arnaudb@cumin1002> dbctl commit (dc=all): 'Promote es1039 to es7 primary T368020', diff saved to https://phabricator.wikimedia.org/P65398 and previous config saved to /var/cache/conftool/dbconfig/20240625-063453-arnaudb.json [production]
06:33 <arnaudb> Starting es7 eqiad failover from es1035 to es1039 - T368020 [production]
06:33 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1167 (T364069)', diff saved to https://phabricator.wikimedia.org/P65397 and previous config saved to /var/cache/conftool/dbconfig/20240625-063334-marostegui.json [production]
06:26 <arnaudb@cumin1002> dbctl commit (dc=all): 'Set es1039 with weight 0 T368020', diff saved to https://phabricator.wikimedia.org/P65396 and previous config saved to /var/cache/conftool/dbconfig/20240625-062640-arnaudb.json [production]
06:25 <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Primary switchover es7 T368020 [production]
06:25 <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Primary switchover es7 T368020 [production]
06:24 <arnaudb@deploy1002> Finished scap: Backport for [[gerrit:1047910|dbconfig: temporary disable writes on es7 (T368020)]] (duration: 18m 47s) [production]
06:19 <arnaudb@deploy1002> arnaudb: Continuing with sync [production]
06:17 <arnaudb@deploy1002> arnaudb: Backport for [[gerrit:1047910|dbconfig: temporary disable writes on es7 (T368020)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) [production]
06:11 <marostegui> Drop ipblocks from s7 T367632 [production]
06:05 <arnaudb@deploy1002> Started scap: Backport for [[gerrit:1047910|dbconfig: temporary disable writes on es7 (T368020)]] [production]
06:02 <marostegui> Drop ipblocks from s6 T367632 [production]
05:33 <marostegui@cumin1002> dbctl commit (dc=all): 'Depooling db1167 (T364069)', diff saved to https://phabricator.wikimedia.org/P65395 and previous config saved to /var/cache/conftool/dbconfig/20240625-053312-marostegui.json [production]
05:33 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance [production]
05:33 <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance [production]
05:32 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance [production]
05:32 <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance [production]
05:32 <marostegui@cumin1002> dbctl commit (dc=all): 'Depooling db2125 (T367856)', diff saved to https://phabricator.wikimedia.org/P65394 and previous config saved to /var/cache/conftool/dbconfig/20240625-053239-marostegui.json [production]
05:32 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2125.codfw.wmnet with reason: Maintenance [production]
05:32 <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2125.codfw.wmnet with reason: Maintenance [production]
04:01 <mwpresync@deploy1002> Pruned MediaWiki: 1.43.0-wmf.8 (duration: 00m 55s) [production]
03:55 <mwpresync@deploy1002> Finished scap: testwikis wikis to 1.43.0-wmf.11 refs T366956 (duration: 52m 19s) [production]
03:03 <mwpresync@deploy1002> Started scap: testwikis wikis to 1.43.0-wmf.11 refs T366956 [production]
01:48 <brett> Running authdns-update on dns1004 to pool eqsin - T365763 [production]
01:43 <brett@puppetmaster1001> conftool action : set/pooled=yes; selector: cluster=cache_text,dc=eqsin [production]
01:40 <brett> Removing downtime for cp[5017-5024] as nvme drives are installed and hosts back online - T365763 [production]
00:43 <sukhe> [correction of command] sudo pkill ffmpeg: mw1438, high CPU usage, ffmpeg processes [production]
00:43 <sukhe> sudo pkill mpeg: mw1438, high CPU usage, ffmpeg processes [production]
00:01 <brett@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 8 hosts with reason: T365763 [production]
00:01 <brett@cumin2002> START - Cookbook sre.hosts.downtime for 4:00:00 on 8 hosts with reason: T365763 [production]
2024-06-24
23:02 <brett> Running authdns-update on dns1004 to depool eqsin - T365763 [production]
23:00 <cwhite@cumin2002> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts logstash2003.codfw.wmnet [production]
23:00 <cwhite@cumin2002> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
23:00 <cwhite@cumin2002> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: logstash2003.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwhite@cumin2002" [production]