2024-06-25
07:06 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Long schema change |
[production] |
07:06 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Long schema change |
[production] |
07:03 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P65404 and previous config saved to /var/cache/conftool/dbconfig/20240625-070348-marostegui.json |
[production] |
07:02 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Depool db2165 T368355', diff saved to https://phabricator.wikimedia.org/P65403 and previous config saved to /var/cache/conftool/dbconfig/20240625-070252-marostegui.json |
[production] |
07:01 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Promote db2161 to s8 primary T368355', diff saved to https://phabricator.wikimedia.org/P65402 and previous config saved to /var/cache/conftool/dbconfig/20240625-070127-marostegui.json |
[production] |
07:01 |
<marostegui> |
Starting s8 codfw failover from db2165 to db2161 - T368355 |
[production] |
07:00 |
<arnaudb@deploy1002> |
Finished scap: Backport for [[gerrit:1049386|Revert "dbconfig: temporary disable writes on es7"]] (duration: 07m 47s) |
[production] |
06:55 |
<arnaudb@deploy1002> |
arnaudb: Continuing with sync |
[production] |
06:55 |
<arnaudb@deploy1002> |
arnaudb: Backport for [[gerrit:1049386|Revert "dbconfig: temporary disable writes on es7"]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) |
[production] |
06:54 |
<isaranto@deploy1002> |
helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' . |
[production] |
06:52 |
<arnaudb@deploy1002> |
Started scap: Backport for [[gerrit:1049386|Revert "dbconfig: temporary disable writes on es7"]] |
[production] |
06:48 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P65401 and previous config saved to /var/cache/conftool/dbconfig/20240625-064841-marostegui.json |
[production] |
06:45 |
<arnaudb@deploy1002> |
Sync cancelled. |
[production] |
06:45 |
<arnaudb@deploy1002> |
arnaudb: Backport for [[gerrit:1049386|Revert "dbconfig: temporary disable writes on es7"]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) |
[production] |
06:42 |
<arnaudb@deploy1002> |
Started scap: Backport for [[gerrit:1049386|Revert "dbconfig: temporary disable writes on es7"]] |
[production] |
06:40 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'T368020', diff saved to https://phabricator.wikimedia.org/P65400 and previous config saved to /var/cache/conftool/dbconfig/20240625-064000-arnaudb.json |
[production] |
06:39 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 33 hosts with reason: Primary switchover s8 T368355 |
[production] |
06:39 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Set db2161 with weight 0 T368355', diff saved to https://phabricator.wikimedia.org/P65399 and previous config saved to /var/cache/conftool/dbconfig/20240625-063908-root.json |
[production] |
06:38 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 1:00:00 on 33 hosts with reason: Primary switchover s8 T368355 |
[production] |
06:34 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Promote es1039 to es7 primary T368020', diff saved to https://phabricator.wikimedia.org/P65398 and previous config saved to /var/cache/conftool/dbconfig/20240625-063453-arnaudb.json |
[production] |
06:33 |
<arnaudb> |
Starting es7 eqiad failover from es1035 to es1039 - T368020 |
[production] |
06:33 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1167 (T364069)', diff saved to https://phabricator.wikimedia.org/P65397 and previous config saved to /var/cache/conftool/dbconfig/20240625-063334-marostegui.json |
[production] |
06:26 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Set es1039 with weight 0 T368020', diff saved to https://phabricator.wikimedia.org/P65396 and previous config saved to /var/cache/conftool/dbconfig/20240625-062640-arnaudb.json |
[production] |
06:25 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Primary switchover es7 T368020 |
[production] |
06:25 |
<arnaudb@cumin1002> |
START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Primary switchover es7 T368020 |
[production] |
06:24 |
<arnaudb@deploy1002> |
Finished scap: Backport for [[gerrit:1047910|dbconfig: temporary disable writes on es7 (T368020)]] (duration: 18m 47s) |
[production] |
06:19 |
<arnaudb@deploy1002> |
arnaudb: Continuing with sync |
[production] |
06:17 |
<arnaudb@deploy1002> |
arnaudb: Backport for [[gerrit:1047910|dbconfig: temporary disable writes on es7 (T368020)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) |
[production] |
06:11 |
<marostegui> |
Drop ipblocks from s7 T367632 |
[production] |
06:05 |
<arnaudb@deploy1002> |
Started scap: Backport for [[gerrit:1047910|dbconfig: temporary disable writes on es7 (T368020)]] |
[production] |
06:02 |
<marostegui> |
Drop ipblocks from s6 T367632 |
[production] |
05:33 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Depooling db1167 (T364069)', diff saved to https://phabricator.wikimedia.org/P65395 and previous config saved to /var/cache/conftool/dbconfig/20240625-053312-marostegui.json |
[production] |
05:33 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance |
[production] |
05:33 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance |
[production] |
05:32 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance |
[production] |
05:32 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance |
[production] |
05:32 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Depooling db2125 (T367856)', diff saved to https://phabricator.wikimedia.org/P65394 and previous config saved to /var/cache/conftool/dbconfig/20240625-053239-marostegui.json |
[production] |
05:32 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2125.codfw.wmnet with reason: Maintenance |
[production] |
05:32 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2125.codfw.wmnet with reason: Maintenance |
[production] |
04:01 |
<mwpresync@deploy1002> |
Pruned MediaWiki: 1.43.0-wmf.8 (duration: 00m 55s) |
[production] |
03:55 |
<mwpresync@deploy1002> |
Finished scap: testwikis wikis to 1.43.0-wmf.11 refs T366956 (duration: 52m 19s) |
[production] |
03:03 |
<mwpresync@deploy1002> |
Started scap: testwikis wikis to 1.43.0-wmf.11 refs T366956 |
[production] |
01:48 |
<brett> |
Running authdns-update on dns1004 to pool eqsin - T365763 |
[production] |
01:43 |
<brett@puppetmaster1001> |
conftool action : set/pooled=yes; selector: cluster=cache_text,dc=eqsin |
[production] |
01:40 |
<brett> |
Removing downtime for cp[5017-5024] as nvme drives are installed and hosts back online - T365763 |
[production] |
00:43 |
<sukhe> |
[correction of command] sudo pkill ffmpeg: mw1438, high CPU usage, ffmpeg processes |
[production] |
00:43 |
<sukhe> |
sudo pkill mpeg: mw1438, high CPU usage, ffmpeg processes |
[production] |
00:01 |
<brett@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 8 hosts with reason: T365763 |
[production] |
00:01 |
<brett@cumin2002> |
START - Cookbook sre.hosts.downtime for 4:00:00 on 8 hosts with reason: T365763 |
[production] |