2025-05-14
§
|
13:20 |
<Lucas_WMDE> |
UTC afternoon backport+config window done |
[production] |
13:19 |
<lucaswerkmeister-wmde@deploy1003> |
Finished scap sync-world: Backport for [[gerrit:1139489|manage-dblist: Rename to manage-dblist.php (T392819)]] (duration: 12m 48s) |
[production] |
13:15 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db1258 (re)pooling @ 40%: Repooling', diff saved to https://phabricator.wikimedia.org/P76148 and previous config saved to /var/cache/conftool/dbconfig/20250514-131510-root.json |
[production] |
13:13 |
<aqu@deploy1003> |
helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply |
[production] |
13:13 |
<aqu@deploy1003> |
helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply |
[production] |
13:12 |
<lucaswerkmeister-wmde@deploy1003> |
lucaswerkmeister-wmde: Continuing with sync |
[production] |
13:11 |
<lucaswerkmeister-wmde@deploy1003> |
lucaswerkmeister-wmde: Backport for [[gerrit:1139489|manage-dblist: Rename to manage-dblist.php (T392819)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) |
[production] |
13:07 |
<godog> |
correction, restart grafana-server on grafana1002 |
[production] |
13:06 |
<lucaswerkmeister-wmde@deploy1003> |
Started scap sync-world: Backport for [[gerrit:1139489|manage-dblist: Rename to manage-dblist.php (T392819)]] |
[production] |
13:05 |
<godog> |
reboot grafana1002 - hard down |
[production] |
13:01 |
<stevemunene@cumin1002> |
END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host an-worker1068.eqiad.wmnet |
[production] |
13:00 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db1258 (re)pooling @ 30%: Repooling', diff saved to https://phabricator.wikimedia.org/P76146 and previous config saved to /var/cache/conftool/dbconfig/20250514-130004-root.json |
[production] |
12:55 |
<brouberol@deploy1003> |
helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. |
[production] |
12:54 |
<brouberol@deploy1003> |
helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. |
[production] |
12:44 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db1258 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P76145 and previous config saved to /var/cache/conftool/dbconfig/20250514-124458-root.json |
[production] |
12:29 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db1258 (re)pooling @ 20%: Repooling', diff saved to https://phabricator.wikimedia.org/P76144 and previous config saved to /var/cache/conftool/dbconfig/20250514-122952-root.json |
[production] |
12:28 |
<joal@deploy1003> |
Finished deploy [analytics/refinery@9d620d0] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@9d620d06] (duration: 00m 46s) |
[production] |
12:28 |
<joal@deploy1003> |
Started deploy [analytics/refinery@9d620d0] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@9d620d06] |
[production] |
12:27 |
<joal@deploy1003> |
Finished deploy [analytics/refinery@9d620d0] (thin): Analytics webrequest migration THIN [analytics/refinery@9d620d06] (duration: 01m 35s) |
[production] |
12:26 |
<joal@deploy1003> |
Started deploy [analytics/refinery@9d620d0] (thin): Analytics webrequest migration THIN [analytics/refinery@9d620d06] |
[production] |
12:25 |
<joal@deploy1003> |
Finished deploy [analytics/refinery@9d620d0]: Regular analytics weekly train [analytics/refinery@9d620d06] (duration: 02m 17s) |
[production] |
12:23 |
<joal@deploy1003> |
Started deploy [analytics/refinery@9d620d0]: Regular analytics weekly train [analytics/refinery@9d620d06] |
[production] |
12:14 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db1258 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P76143 and previous config saved to /var/cache/conftool/dbconfig/20250514-121446-root.json |
[production] |
11:47 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Remove db2243 from s8 (T351820)', diff saved to https://phabricator.wikimedia.org/P76142 and previous config saved to /var/cache/conftool/dbconfig/20250514-114724-ladsgroup.json |
[production] |
11:47 |
<moritzm> |
installing librabbitmq security updates |
[production] |
11:41 |
<ladsgroup@deploy1003> |
Finished scap sync-world: Backport for [[gerrit:1145844|Move production term store traffic to x3 (T351820)]] (duration: 20m 48s) |
[production] |
11:41 |
<stevemunene@cumin1002> |
START - Cookbook sre.hosts.reboot-single for host an-worker1068.eqiad.wmnet |
[production] |
11:38 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-test-eqiad |
[production] |
11:35 |
<ladsgroup@deploy1003> |
ladsgroup: Continuing with sync |
[production] |
11:27 |
<ladsgroup@deploy1003> |
ladsgroup: Backport for [[gerrit:1145844|Move production term store traffic to x3 (T351820)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) |
[production] |
11:21 |
<ladsgroup@deploy1003> |
Started scap sync-world: Backport for [[gerrit:1145844|Move production term store traffic to x3 (T351820)]] |
[production] |
11:17 |
<kcvelaga@deploy1003> |
Finished deploy [airflow-dags/analytics_product@22aa307]: T393561 (duration: 01m 10s) |
[production] |
11:17 |
<kcvelaga@deploy1003> |
Started deploy [airflow-dags/analytics_product@22aa307]: T393561 |
[production] |
11:15 |
<jmm@cumin2002> |
START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-test-eqiad |
[production] |
11:15 |
<mvolz@deploy1003> |
helmfile [eqiad] DONE helmfile.d/services/citoid: apply |
[production] |
11:14 |
<mvolz@deploy1003> |
helmfile [eqiad] START helmfile.d/services/citoid: apply |
[production] |
11:13 |
<mvolz@deploy1003> |
helmfile [codfw] DONE helmfile.d/services/citoid: apply |
[production] |
11:12 |
<mvolz@deploy1003> |
helmfile [codfw] START helmfile.d/services/citoid: apply |
[production] |
11:12 |
<btullis@cumin1002> |
END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling restart_daemons on A:cephosd |
[production] |
11:10 |
<mvolz@deploy1003> |
helmfile [staging] DONE helmfile.d/services/citoid: apply |
[production] |
11:10 |
<mvolz@deploy1003> |
helmfile [staging] START helmfile.d/services/citoid: apply |
[production] |
11:03 |
<dcausse@deploy1003> |
helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply |
[production] |
11:01 |
<dcausse@deploy1003> |
helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply |
[production] |
10:50 |
<dcausse@deploy1003> |
helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply |
[production] |
10:49 |
<dcausse@deploy1003> |
helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply |
[production] |
10:47 |
<btullis@cumin1002> |
START - Cookbook sre.ceph.roll-restart-reboot-server rolling restart_daemons on A:cephosd |
[production] |
10:44 |
<dcausse@deploy1003> |
helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply |
[production] |
10:43 |
<dcausse@deploy1003> |
helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply |
[production] |
10:41 |
<dcausse@deploy1003> |
helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply |
[production] |
10:41 |
<dcausse@deploy1003> |
helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply |
[production] |