2025-06-09
|
07:56 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P77215 and previous config saved to /var/cache/conftool/dbconfig/20250609-075619-marostegui.json |
[production] |
07:46 |
<taavi> |
delete redis pod stuck in Completed with no further explanation why |
[quarry] |
07:41 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2169 (T396130)', diff saved to https://phabricator.wikimedia.org/P77213 and previous config saved to /var/cache/conftool/dbconfig/20250609-074112-marostegui.json |
[production] |
07:34 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Depooling db2169 (T396130)', diff saved to https://phabricator.wikimedia.org/P77211 and previous config saved to /var/cache/conftool/dbconfig/20250609-073403-marostegui.json |
[production] |
07:33 |
<marostegui@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2169.codfw.wmnet with reason: Maintenance |
[production] |
07:28 |
<marostegui@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2158.codfw.wmnet with reason: Maintenance |
[production] |
07:23 |
<marostegui@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2151.codfw.wmnet with reason: Maintenance |
[production] |
07:23 |
<marostegui@cumin1002> |
START - Cookbook sre.mysql.pool db2244 gradually with 4 steps - Pool db2244.codfw.wmnet in after cloning |
[production] |
06:22 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db2243 gradually with 4 steps - Pool db2243.codfw.wmnet in after cloning |
[production] |
05:42 |
<marostegui> |
Add MariaDB 10.11.13 to the repo T395663 |
[production] |
05:37 |
<marostegui@cumin1002> |
START - Cookbook sre.mysql.pool db2243 gradually with 4 steps - Pool db2243.codfw.wmnet in after cloning |
[production] |
05:24 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Add db2244 to dbctl depooled T393989', diff saved to https://phabricator.wikimedia.org/P77205 and previous config saved to /var/cache/conftool/dbconfig/20250609-052451-marostegui.json |
[production] |
05:00 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.mysql.depool (exit_code=0) db2243 - Depool db2243.codfw.wmnet to then clone it to db2244.codfw.wmnet - marostegui@cumin1002 |
[production] |
05:00 |
<marostegui@cumin1002> |
START - Cookbook sre.mysql.depool db2243 - Depool db2243.codfw.wmnet to then clone it to db2244.codfw.wmnet - marostegui@cumin1002 |
[production] |
05:00 |
<marostegui@cumin1002> |
START - Cookbook sre.mysql.clone of db2243.codfw.wmnet onto db2244.codfw.wmnet |
[production] |
2025-06-08
|
18:14 |
<James_F> |
Zuul: [mediawiki/extensions/Echo] Remove EventLogging |
[releng] |
18:12 |
<James_F> |
Zuul: Fold extension-quibble-php81-or-later template into extension-quibble |
[releng] |
18:04 |
<James_F> |
Zuul: [mediawiki/extensions/SemanticVersion] Add basic CI |
[releng] |
15:40 |
<wmbot~lucaswerkmeister@tools-bastion-13> |
deployed 168c371259 (refactoring: labels store, should have no effect) |
[tools.wdactle] |
14:42 |
<wmbot~lucaswerkmeister@tools-bastion-13> |
deployed 7c1eaba398 (split guesses into words) |
[tools.wdactle] |
12:05 |
<wmbot~multichill@tools-bastion-12> |
Checked for T395205 YiFeiBot and SignBot already use BotPasswords |
[tools.yifeibot] |
12:04 |
<Ammar> |
Ran fixStuckGlobalRename.php for T396290 and T396291 |
[production] |
12:00 |
<wmbot~multichill@tools-bastion-12> |
Switched ErfgoedBot to BotPasswords for T395205, unable to test |
[tools.heritage] |
11:12 |
<wmbot~multichill@tools-bastion-12> |
Switched BotMultichillT to BotPasswords for T395205 |
[tools.multichill] |
10:55 |
<wmbot~multichill@tools-bastion-12> |
Switched to BotPasswords for T395205 |
[tools.noclaims] |
10:53 |
<wmbot~multichill@tools-bastion-12> |
Switch to BotPasswords for T395205 |
[tools.geograph] |
10:17 |
<wmbot~multichill@tools-bastion-12> |
Switched to BotPasswords |
[tools.noclaims] |
09:53 |
<ryankemper@cumin2002> |
END (PASS) - Cookbook sre.wdqs.data-reload (exit_code=0) reloading scholarly_articles on wdqs1023.eqiad.wmnet from DumpsSource.HDFS (hdfs:///wmf/data/discovery/wikidata/munged_n3_dump/wikidata/scholarly/20250526/ using stat1011.eqiad.wmnet) |
[production] |
2025-06-07
|
19:08 |
<andrew@cloudcumin1001> |
END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0) on deployment eqiad1 for all services |
[admin] |
18:55 |
<andrew@cloudcumin1001> |
START - Cookbook wmcs.openstack.restart_openstack on deployment eqiad1 for all services |
[admin] |
18:49 |
<andrew@cloudcumin1001> |
END (PASS) - Cookbook wmcs.openstack.restart_openstack (exit_code=0) on deployment eqiad1 for service: project,nova |
[admin] |
18:44 |
<andrew@cloudcumin1001> |
START - Cookbook wmcs.openstack.restart_openstack on deployment eqiad1 for service: project,nova |
[admin] |
16:49 |
<dcaro> |
extend the volume toolforge-prometheus-a to 20G |
[toolsbeta] |
11:43 |
<jhancock@cumin2002> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host thanos-be2007.codfw.wmnet with OS bullseye |
[production] |
11:07 |
<jhancock@cumin2002> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host thanos-be2006.codfw.wmnet with OS bullseye |
[production] |
08:12 |
<elukey> |
restart apache2 / php-fpm on phab1004 |
[production] |
04:18 |
<mutante> |
restarted apache on phab1004 |
[production] |
2025-06-06
|
21:49 |
<andrew@cloudcumin1001> |
END (PASS) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=0) for tools-k8s-worker-nfs-54, tools-k8s-worker-nfs-74 |
[tools] |
21:40 |
<andrewbogott> |
restarting tools-prometheus-9 and tools-prometheus-8, lots of tools metrics just went dark |
[tools] |
21:37 |
<andrew@cloudcumin1001> |
START - Cookbook wmcs.toolforge.k8s.reboot for tools-k8s-worker-nfs-54, tools-k8s-worker-nfs-74 |
[tools] |
21:33 |
<fceratto@deploy1003> |
helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' . |
[production] |
21:25 |
<fceratto@deploy1003> |
helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' . |
[production] |
21:19 |
<fceratto@deploy1003> |
helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' . |
[production] |
21:15 |
<fceratto@deploy1003> |
helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' . |
[production] |
21:02 |
<bking@cumin2002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10 days, 0:00:00 on relforge[1003-1004].eqiad.wmnet with reason: downtime before decom |
[production] |
20:59 |
<andrewbogott> |
restarting all designate services on all cloudcontrols in eqiad1 |
[admin] |
20:42 |
<jhancock@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-be2007.codfw.wmnet with reason: host reimage |
[production] |
20:40 |
<fceratto@deploy1003> |
helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' . |
[production] |
20:38 |
<jhancock@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-be2007.codfw.wmnet with reason: host reimage |
[production] |
20:35 |
<fceratto@deploy1003> |
helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' . |
[production] |