|
2025-05-20
|
| 05:37 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db2236 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P76325 and previous config saved to /var/cache/conftool/dbconfig/20250520-053720-root.json |
[production] |
| 05:33 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db1183 (re)pooling @ 60%: Repooling', diff saved to https://phabricator.wikimedia.org/P76324 and previous config saved to /var/cache/conftool/dbconfig/20250520-053302-root.json |
[production] |
| 05:22 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db2236 (re)pooling @ 20%: Repooling', diff saved to https://phabricator.wikimedia.org/P76323 and previous config saved to /var/cache/conftool/dbconfig/20250520-052215-root.json |
[production] |
| 05:17 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db1183 (re)pooling @ 40%: Repooling', diff saved to https://phabricator.wikimedia.org/P76322 and previous config saved to /var/cache/conftool/dbconfig/20250520-051756-root.json |
[production] |
| 05:07 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db2236 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P76321 and previous config saved to /var/cache/conftool/dbconfig/20250520-050710-root.json |
[production] |
| 05:03 |
<marostegui> |
Install 10.11.13 on db2236 T394653 |
[production] |
| 05:03 |
<marostegui@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2236.codfw.wmnet with reason: Maintenance |
[production] |
| 05:02 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db1183 (re)pooling @ 20%: Repooling', diff saved to https://phabricator.wikimedia.org/P76320 and previous config saved to /var/cache/conftool/dbconfig/20250520-050250-root.json |
[production] |
| 05:00 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Depool db2236 T394653', diff saved to https://phabricator.wikimedia.org/P76319 and previous config saved to /var/cache/conftool/dbconfig/20250520-050017-marostegui.json |
[production] |
| 04:54 |
<marostegui@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet with reason: Maintenance |
[production] |
| 04:54 |
<marostegui@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb1018.eqiad.wmnet with reason: Maintenance |
[production] |
| 04:54 |
<marostegui@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb1019.eqiad.wmnet with reason: Maintenance |
[production] |
| 04:54 |
<marostegui@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb1015.eqiad.wmnet with reason: Maintenance |
[production] |
| 04:53 |
<marostegui@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb1014.eqiad.wmnet with reason: Maintenance |
[production] |
| 04:51 |
<marostegui> |
Stop mariadb on db1155, wiki replicas will show lag on: s2, s4, s6 and s7 T394624 |
[production] |
| 04:50 |
<marostegui@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1155.eqiad.wmnet with reason: Maintenance |
[production] |
| 04:47 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db1183 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P76318 and previous config saved to /var/cache/conftool/dbconfig/20250520-044744-root.json |
[production] |
| 04:01 |
<mwpresync@deploy1003> |
Pruned MediaWiki: 1.44.0-wmf.27 (duration: 01m 33s) |
[production] |
| 03:53 |
<mwpresync@deploy1003> |
Finished scap sync-world: testwikis to 1.45.0-wmf.2 refs T392172 (duration: 50m 50s) |
[production] |
| 03:02 |
<mwpresync@deploy1003> |
Started scap sync-world: testwikis to 1.45.0-wmf.2 refs T392172 |
[production] |
| 03:01 |
<andrew@cloudcumin1001> |
END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0) on deployment eqiad1 for service: project,cinder |
[admin] |
| 03:01 |
<andrew@cloudcumin1001> |
START - Cookbook wmcs.openstack.restart_openstack on deployment eqiad1 for service: project,cinder |
[admin] |
| 02:18 |
<andrew@cloudcumin1001> |
END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0) on deployment eqiad1 for service: project,cinder |
[admin] |
| 02:18 |
<andrew@cloudcumin1001> |
START - Cookbook wmcs.openstack.restart_openstack on deployment eqiad1 for service: project,cinder |
[admin] |
| 00:54 |
<rzl@deploy1003> |
Stopping before sync operations |
[production] |
| 00:54 |
<rzl@deploy1003> |
Started scap sync-world: 1147901 |
[production] |
| 00:52 |
<wmbot~bd808@tools-bastion-12> |
Restart to pick up new image |
[tools.gitlab-content] |
| 00:51 |
<wmbot~bd808@tools-bastion-12> |
Built new image from fadea6d5 |
[tools.gitlab-content] |
| 00:29 |
<wmbot~bd808@tools-bastion-12> |
Built and deployed image from aa4d2285. The webservice is using the new golang implementation now. |
[tools.gitlab-content] |
| 00:21 |
<andrew@cloudcumin1001> |
END (FAIL) - Cookbook wmcs.openstack.cloudvirt.drain (exit_code=99) on host 'cloudvirt1039.eqiad.wmnet' (T394727) |
[admin] |
| 00:11 |
<andrew@cloudcumin1001> |
START - Cookbook wmcs.openstack.cloudvirt.drain on host 'cloudvirt1039.eqiad.wmnet' (T394727) |
[admin] |
|
2025-05-19
|
| 23:01 |
<bking@cumin2002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cirrussearch1081.eqiad.wmnet with OS bullseye |
[production] |
| 22:58 |
<andrew@cloudcumin1001> |
END (FAIL) - Cookbook wmcs.openstack.cloudvirt.drain (exit_code=99) on host 'cloudvirt1039.eqiad.wmnet' (T394727) |
[admin] |
| 22:47 |
<andrew@cloudcumin1001> |
START - Cookbook wmcs.openstack.cloudvirt.drain on host 'cloudvirt1039.eqiad.wmnet' (T394727) |
[admin] |
| 22:45 |
<jhancock@cumin2002> |
START - Cookbook sre.hosts.reimage for host thanos-be2007.codfw.wmnet with OS bullseye |
[production] |
| 22:43 |
<andrew@cloudcumin1001> |
END (ERROR) - Cookbook wmcs.openstack.restart_openstack (exit_code=97) on deployment eqiad1 for service: project,nova |
[admin] |
| 22:43 |
<andrew@cloudcumin1001> |
START - Cookbook wmcs.openstack.restart_openstack on deployment eqiad1 for service: project,nova |
[admin] |
| 22:35 |
<andrew@cloudcumin1001> |
END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0) on deployment eqiad1 for service: project,nova |
[admin] |
| 22:30 |
<andrew@cumin1002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1072.eqiad.wmnet with OS bookworm |
[production] |
| 22:29 |
<andrew@cloudcumin1001> |
START - Cookbook wmcs.openstack.restart_openstack on deployment eqiad1 for service: project,nova |
[admin] |
| 22:29 |
<andrew@cloudcumin1001> |
END (PASS) - Cookbook wmcs.openstack.cloudvirt.lib.ensure_canary (exit_code=0) on eqiad1, with recreate False, for hosts list: ['cloudvirt1072'] |
[cloudvirt-canary] |
| 22:29 |
<bking@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cirrussearch1081.eqiad.wmnet with reason: host reimage |
[production] |
| 22:28 |
<andrew@cloudcumin1001> |
START - Cookbook wmcs.openstack.cloudvirt.lib.ensure_canary on eqiad1, with recreate False, for hosts list: ['cloudvirt1072'] |
[cloudvirt-canary] |
| 22:28 |
<bking@cumin2002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cirrussearch1080.eqiad.wmnet with OS bullseye |
[production] |
| 22:27 |
<andrew@cloudcumin1001> |
END (FAIL) - Cookbook wmcs.openstack.cloudvirt.lib.ensure_canary (exit_code=99) on eqiad1, with recreate False, for hosts list: ['cloudvirt1072'] |
[cloudvirt-canary] |
| 22:27 |
<andrew@cloudcumin1001> |
START - Cookbook wmcs.openstack.cloudvirt.lib.ensure_canary on eqiad1, with recreate False, for hosts list: ['cloudvirt1072'] |
[cloudvirt-canary] |
| 22:26 |
<andrew@cloudcumin1001> |
END (ERROR) - Cookbook wmcs.openstack.restart_openstack (exit_code=97) on deployment eqiad1 for service: project,nova |
[admin] |
| 22:26 |
<bking@cumin2002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on relforge[1003-1004,1008-1010].eqiad.wmnet with reason: decom in progress |
[production] |
| 22:26 |
<andrew@cloudcumin1001> |
START - Cookbook wmcs.openstack.restart_openstack on deployment eqiad1 for service: project,nova |
[admin] |
| 22:25 |
<bking@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on cirrussearch1081.eqiad.wmnet with reason: host reimage |
[production] |