|
2025-12-01
§
|
| 11:36 |
<elukey@cumin1003> |
START - Cookbook sre.hosts.reimage for host ml-serve1013.eqiad.wmnet with OS trixie |
[production] |
| 11:29 |
<btullis> |
restarting envoyproxy process on cephosd100[1-5] for T405808 |
[production] |
| 11:28 |
<elukey@cumin1003> |
END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ml-serve1013.eqiad.wmnet with OS trixie |
[production] |
| 11:25 |
<wmftkbot> |
Test Kitchen mw-user experiment (poll 46014) - adds: none; removes: growthexperiments-get-started-notification; fields: none - xLab/MPIC/TK tips at https://w.wiki/FwuD |
[analytics] |
| 11:09 |
<btullis@cumin1003> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1010.eqiad.wmnet |
[production] |
| 11:03 |
<btullis@cumin1003> |
START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1010.eqiad.wmnet |
[production] |
| 11:02 |
<elukey@cumin1003> |
START - Cookbook sre.hosts.reimage for host ml-serve1013.eqiad.wmnet with OS trixie |
[production] |
| 10:52 |
<elukey@cumin2002> |
END (PASS) - Cookbook sre.hosts.powercycle (exit_code=0) for host ml-serve1013 |
[production] |
| 10:51 |
<JavierMonton> |
Deployed refinery using scap, then deployed onto hdfs |
[production] |
| 10:49 |
<JavierMonton> |
Deployed refinery using scap, then deployed onto hdfs |
[analytics] |
| 10:47 |
<moritzm> |
upgrade Envoy on matomo1001 T405808 |
[production] |
| 10:47 |
<elukey@cumin2002> |
START - Cookbook sre.hosts.powercycle for host ml-serve1013 |
[production] |
| 10:46 |
<elukey@cumin1003> |
END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART |
[production] |
| 10:46 |
<elukey@cumin1003> |
START - Cookbook sre.hosts.provision for host ml-serve1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART |
[production] |
| 10:42 |
<elukey@cumin1003> |
END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ml-serve1013.eqiad.wmnet with OS trixie |
[production] |
| 10:40 |
<elukey@cumin1003> |
START - Cookbook sre.hosts.reimage for host ml-serve1013.eqiad.wmnet with OS trixie |
[production] |
| 10:39 |
<elukey@cumin1003> |
END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ml-serve1013.eqiad.wmnet with OS trixie |
[production] |
| 10:23 |
<javiermonton@deploy2002> |
Finished deploy [analytics/refinery@fa63f82]: Regular analytics train [analytics/refinery@fa63f82e] (duration: 00m 28s) |
[production] |
| 10:23 |
<javiermonton@deploy2002> |
Started deploy [analytics/refinery@fa63f82]: Regular analytics train [analytics/refinery@fa63f82e] |
[production] |
| 10:20 |
<a-pizzata@deploy2002> |
Finished deploy [analytics/refinery@fa63f82]: Regular analytics train [analytics/refinery@fa63f82e] (duration: 02m 54s) |
[production] |
| 10:17 |
<a-pizzata@deploy2002> |
Started deploy [analytics/refinery@fa63f82]: Regular analytics train [analytics/refinery@fa63f82e] |
[production] |
| 10:16 |
<a-pizzata@deploy2002> |
Finished deploy [analytics/refinery@fa63f82] (hadoop-test): Analytics train TEST [analytics/refinery@fa63f82e] (duration: 01m 08s) |
[production] |
| 10:15 |
<a-pizzata@deploy2002> |
Started deploy [analytics/refinery@fa63f82] (hadoop-test): Analytics train TEST [analytics/refinery@fa63f82e] |
[production] |
| 10:14 |
<elukey@cumin1003> |
START - Cookbook sre.hosts.reimage for host ml-serve1013.eqiad.wmnet with OS trixie |
[production] |
| 10:13 |
<brouberol@deploy2002> |
helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-main: apply |
[production] |
| 10:13 |
<brouberol@deploy2002> |
helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-main: apply |
[production] |
| 10:12 |
<brouberol@deploy2002> |
helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-main: apply |
[production] |
| 10:11 |
<brouberol@deploy2002> |
helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-main: apply |
[production] |
| 10:11 |
<ayounsi@cumin1003> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
| 10:11 |
<ayounsi@cumin1003> |
END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change ml-serve1013 vlan - ayounsi@cumin1003" |
[production] |
| 10:11 |
<ayounsi@cumin1003> |
START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change ml-serve1013 vlan - ayounsi@cumin1003" |
[production] |
| 10:04 |
<ayounsi@cumin1003> |
START - Cookbook sre.dns.netbox |
[production] |
| 10:00 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.o11y.roll-restart-reboot-logstash-collectors (exit_code=0) rolling restart_daemons on A:logstash-collector |
[production] |
| 09:53 |
<taavi@dns1004> |
END - running authdns-update |
[production] |
| 09:53 |
<jmm@cumin2002> |
START - Cookbook sre.o11y.roll-restart-reboot-logstash-collectors rolling restart_daemons on A:logstash-collector |
[production] |
| 09:52 |
<taavi@dns1004> |
START - running authdns-update |
[production] |
| 09:39 |
<moritzm> |
installing expat security updates |
[production] |
| 09:15 |
<brouberol@deploy2002> |
helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. |
[production] |
| 09:14 |
<brouberol@deploy2002> |
helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. |
[production] |
| 08:58 |
<ladsgroup@cumin1003> |
dbctl commit (dc=all): 'Depooling db2151 (T410589)', diff saved to https://phabricator.wikimedia.org/P86235 and previous config saved to /var/cache/conftool/dbconfig/20251201-085828-ladsgroup.json |
[production] |
| 08:58 |
<ladsgroup@cumin1003> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2151.codfw.wmnet with reason: Maintenance |
[production] |
| 08:50 |
<moritzm> |
upgrade Envoy on config-master* T405808 |
[production] |
| 08:47 |
<wm-bot2> |
Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/19816470332 (https://github.com/cluebotng/component-configs/commits/00278e339c41812ca8ecd179e1630abfb031117b) |
[tools.cluebotng-monitoring] |
| 08:43 |
<wm-bot2> |
Deployment failed: https://github.com/cluebotng/component-configs/actions/runs/19816470332 (https://github.com/cluebotng/component-configs/commits/00278e339c41812ca8ecd179e1630abfb031117b) |
[tools.cluebotng-monitoring] |
| 08:33 |
<mszwarc@deploy2002> |
Finished scap sync-world: Backport for [[gerrit:1212584|Fix mw-userlink class being added too broadly (T392775)]] (duration: 38m 35s) |
[production] |
| 08:29 |
<brouberol@deploy2002> |
helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. |
[production] |
| 08:29 |
<brouberol@deploy2002> |
helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. |
[production] |
| 08:19 |
<mszwarc@deploy2002> |
mszwarc: Continuing with sync |
[production] |
| 08:19 |
<brouberol@dns1004> |
END - running authdns-update |
[production] |
| 08:18 |
<mszwarc@deploy2002> |
mszwarc: Backport for [[gerrit:1212584|Fix mw-userlink class being added too broadly (T392775)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. |
[production] |