2025-08-26
|
07:57 |
<dcausse@deploy1003> |
dcausse: Backport for [[gerrit:1182083|Revert "SECURITY: declare PoolCounter settings for cirrusbuilddoc"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. |
[production] |
07:52 |
<dcausse@deploy1003> |
Started scap sync-world: Backport for [[gerrit:1182083|Revert "SECURITY: declare PoolCounter settings for cirrusbuilddoc"]] |
[production] |
07:48 |
<dcausse@deploy1003> |
Finished scap sync-world: Backport for [[gerrit:1182023|SECURITY: declare PoolCounter settings for cirrusbuilddoc (T401220)]] (duration: 45m 38s) |
[production] |
07:42 |
<dcausse@deploy1003> |
dcausse: Continuing with sync |
[production] |
07:08 |
<dcausse@deploy1003> |
dcausse: Backport for [[gerrit:1182023|SECURITY: declare PoolCounter settings for cirrusbuilddoc (T401220)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. |
[production] |
07:02 |
<dcausse@deploy1003> |
Started scap sync-world: Backport for [[gerrit:1182023|SECURITY: declare PoolCounter settings for cirrusbuilddoc (T401220)]] |
[production] |
04:01 |
<mwpresync@deploy1003> |
Pruned MediaWiki: 1.45.0-wmf.13 (duration: 01m 11s) |
[production] |
02:17 |
<TimStarling> |
on db2202 creating copy of enwiki.recentchanges for performance analysis T400696 |
[production] |
01:51 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1249 (T391056)', diff saved to https://phabricator.wikimedia.org/P81751 and previous config saved to /var/cache/conftool/dbconfig/20250826-015141-ladsgroup.json |
[production] |
01:36 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P81750 and previous config saved to /var/cache/conftool/dbconfig/20250826-013633-ladsgroup.json |
[production] |
01:21 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P81749 and previous config saved to /var/cache/conftool/dbconfig/20250826-012125-ladsgroup.json |
[production] |
01:08 |
<ladsgroup@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db1244.eqiad.wmnet with reason: Maintenance |
[production] |
01:06 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1249 (T391056)', diff saved to https://phabricator.wikimedia.org/P81748 and previous config saved to /var/cache/conftool/dbconfig/20250826-010618-ladsgroup.json |
[production] |
00:59 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Depooling db1249 (T391056)', diff saved to https://phabricator.wikimedia.org/P81747 and previous config saved to /var/cache/conftool/dbconfig/20250826-005952-ladsgroup.json |
[production] |
00:59 |
<ladsgroup@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1249.eqiad.wmnet with reason: Maintenance |
[production] |
00:55 |
<ladsgroup@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1244.eqiad.wmnet with reason: Maintenance |
[production] |
00:50 |
<ladsgroup@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2207.codfw.wmnet with reason: Maintenance |
[production] |
00:49 |
<ladsgroup@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1222.eqiad.wmnet with reason: Maintenance |
[production] |
00:49 |
<ladsgroup@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2220.codfw.wmnet with reason: Maintenance |
[production] |
00:48 |
<ladsgroup@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1236.eqiad.wmnet with reason: Maintenance |
[production] |
00:47 |
<ladsgroup@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2229.codfw.wmnet with reason: Maintenance |
[production] |
00:47 |
<ladsgroup@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1201.eqiad.wmnet with reason: Maintenance |
[production] |
00:39 |
<ladsgroup@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2205.codfw.wmnet with reason: Maintenance |
[production] |
00:36 |
<ladsgroup@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1223.eqiad.wmnet with reason: Maintenance |
[production] |
00:35 |
<ladsgroup@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2213.codfw.wmnet with reason: Maintenance |
[production] |
00:34 |
<ladsgroup@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1210.eqiad.wmnet with reason: Maintenance |
[production] |
00:24 |
<ladsgroup@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1244.eqiad.wmnet with reason: Maintenance |
[production] |
00:16 |
<ladsgroup@cumin1002> |
END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1244.eqiad.wmnet |
[production] |
00:07 |
<brett> |
Run systemctl reset-failed on disappeared nrpe2nodexp-disk_space.timer units (T395446) |
[production] |
2025-08-25
|
23:59 |
<ladsgroup@cumin1002> |
END (PASS) - Cookbook sre.mysql.depool (exit_code=0) db1244 - Upgrading db1244.eqiad.wmnet |
[production] |
23:59 |
<ladsgroup@cumin1002> |
START - Cookbook sre.mysql.depool db1244 - Upgrading db1244.eqiad.wmnet |
[production] |
23:59 |
<ladsgroup@cumin1002> |
START - Cookbook sre.mysql.upgrade for db1244.eqiad.wmnet |
[production] |
23:48 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Depool db1244 T402871', diff saved to https://phabricator.wikimedia.org/P81746 and previous config saved to /var/cache/conftool/dbconfig/20250825-234856-ladsgroup.json |
[production] |
23:47 |
<ladsgroup@dns1004> |
END - running authdns-update |
[production] |
23:45 |
<ladsgroup@dns1004> |
START - running authdns-update |
[production] |
23:43 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Promote db1160 to s4 primary and set section read-write T402871', diff saved to https://phabricator.wikimedia.org/P81745 and previous config saved to /var/cache/conftool/dbconfig/20250825-234303-ladsgroup.json |
[production] |
23:39 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Set s4 eqiad as read-only for maintenance - T402871', diff saved to https://phabricator.wikimedia.org/P81744 and previous config saved to /var/cache/conftool/dbconfig/20250825-233934-ladsgroup.json |
[production] |
23:39 |
<Amir1> |
Starting s4 eqiad failover from db1244 to db1160 - T402871 |
[production] |
23:31 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Set db1160 with weight 0 T402871', diff saved to https://phabricator.wikimedia.org/P81743 and previous config saved to /var/cache/conftool/dbconfig/20250825-233128-ladsgroup.json |
[production] |
23:30 |
<ladsgroup@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 32 hosts with reason: Primary switchover s4 T402871 |
[production] |
23:23 |
<jhathaway@cumin1002> |
END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest1003.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL |
[production] |
23:21 |
<jhathaway@cumin1002> |
START - Cookbook sre.hosts.provision for host sretest1003.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL |
[production] |
23:17 |
<jhathaway@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on sretest1003.eqiad.wmnet with reason: sleep test |
[production] |
23:00 |
<maryum> |
Deploy security fix for T397396 |
[production] |
22:55 |
<maryum> |
Deploy security fix for T401220 |
[production] |
22:27 |
<maryum> |
Deployed security fix for T298690 |
[production] |
22:20 |
<ladsgroup@deploy1003> |
Finished scap sync-world: Backport for [[gerrit:1181786|Move update of category members count to a dedicated job (T365303)]] (duration: 12m 26s) |
[production] |
22:15 |
<ladsgroup@deploy1003> |
ladsgroup: Continuing with sync |
[production] |
22:14 |
<ladsgroup@deploy1003> |
ladsgroup: Backport for [[gerrit:1181786|Move update of category members count to a dedicated job (T365303)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. |
[production] |
22:08 |
<ladsgroup@deploy1003> |
Started scap sync-world: Backport for [[gerrit:1181786|Move update of category members count to a dedicated job (T365303)]] |
[production] |