|
2025-11-26
§
|
| 12:33 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P85720 and previous config saved to /var/cache/conftool/dbconfig/20251126-123307-marostegui.json |
[production] |
| 12:31 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Remove vslow/dump from s1 T411088', diff saved to https://phabricator.wikimedia.org/P85719 and previous config saved to /var/cache/conftool/dbconfig/20251126-123131-marostegui.json |
[production] |
| 12:29 |
<btullis@deploy2002> |
helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'. |
[production] |
| 12:27 |
<cmooney@cumin1003> |
END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 29357 |
[production] |
| 12:27 |
<btullis@deploy2002> |
helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'. |
[production] |
| 12:27 |
<cmooney@cumin1003> |
START - Cookbook sre.network.peering with action 'configure' for AS: 29357 |
[production] |
| 12:27 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Remove vslow/dump from s3 T411088', diff saved to https://phabricator.wikimedia.org/P85717 and previous config saved to /var/cache/conftool/dbconfig/20251126-122703-marostegui.json |
[production] |
| 12:22 |
<mvolz@deploy2002> |
helmfile [eqiad] DONE helmfile.d/services/citoid: apply |
[production] |
| 12:21 |
<mvolz@deploy2002> |
helmfile [eqiad] START helmfile.d/services/citoid: apply |
[production] |
| 12:20 |
<mvolz@deploy2002> |
helmfile [codfw] DONE helmfile.d/services/citoid: apply |
[production] |
| 12:18 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P85716 and previous config saved to /var/cache/conftool/dbconfig/20251126-121759-marostegui.json |
[production] |
| 12:15 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.wdqs.restart-nginx-envoy (exit_code=0) rolling restart_daemons on A:wdqs-all |
[production] |
| 12:12 |
<mvolz@deploy2002> |
helmfile [codfw] START helmfile.d/services/citoid: apply |
[production] |
| 12:10 |
<root@cumin2002> |
DONE (FAIL) - Cookbook sre.puppet.renew-cert (exit_code=99) for backup2014.codfw.wmnet: Renew puppet certificate - root@cumin2002 |
[production] |
| 12:09 |
<mvolz@deploy2002> |
helmfile [staging] DONE helmfile.d/services/citoid: apply |
[production] |
| 12:09 |
<mvolz@deploy2002> |
helmfile [staging] START helmfile.d/services/citoid: apply |
[production] |
| 12:06 |
<claime> |
Starting kafka-main rebalance with 30MB/s throttle - T407185 |
[production] |
| 12:02 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db1227 (T410531)', diff saved to https://phabricator.wikimedia.org/P85713 and previous config saved to /var/cache/conftool/dbconfig/20251126-120252-marostegui.json |
[production] |
| 12:02 |
<jmm@cumin2002> |
START - Cookbook sre.wdqs.restart-nginx-envoy rolling restart_daemons on A:wdqs-all |
[production] |
| 12:01 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.wdqs.restart-nginx-envoy (exit_code=0) rolling restart_daemons on A:wcqs-public |
[production] |
| 11:59 |
<jmm@cumin2002> |
START - Cookbook sre.wdqs.restart-nginx-envoy rolling restart_daemons on A:wcqs-public |
[production] |
| 11:57 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Depooling db1227 (T410531)', diff saved to https://phabricator.wikimedia.org/P85712 and previous config saved to /var/cache/conftool/dbconfig/20251126-115739-marostegui.json |
[production] |
| 11:57 |
<marostegui@cumin1003> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1227.eqiad.wmnet with reason: Maintenance |
[production] |
| 11:57 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db1202 (T410531)', diff saved to https://phabricator.wikimedia.org/P85711 and previous config saved to /var/cache/conftool/dbconfig/20251126-115726-marostegui.json |
[production] |
| 11:54 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies (exit_code=0) rolling restart_daemons on A:thanos-fe-eqiad |
[production] |
| 11:52 |
<jmm@cumin2002> |
START - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies rolling restart_daemons on A:thanos-fe-eqiad |
[production] |
| 11:42 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P85710 and previous config saved to /var/cache/conftool/dbconfig/20251126-114218-marostegui.json |
[production] |
| 11:39 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies (exit_code=0) rolling restart_daemons on A:thanos-fe-codfw |
[production] |
| 11:37 |
<jmm@cumin2002> |
START - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies rolling restart_daemons on A:thanos-fe-codfw |
[production] |
| 11:33 |
<moritzm> |
installing libxslt security updates |
[production] |
| 11:27 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P85709 and previous config saved to /var/cache/conftool/dbconfig/20251126-112710-marostegui.json |
[production] |
| 11:26 |
<ryankemper@cumin2002> |
END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REBOOT (2 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot (apply updates) - ryankemper@cumin2002 - T410573 |
[production] |
| 11:24 |
<jynus@cumin2002> |
dbctl commit (dc=all): 'Depool db2166, perf issue', diff saved to https://phabricator.wikimedia.org/P85708 and previous config saved to /var/cache/conftool/dbconfig/20251126-112422-jynus.json |
[production] |
| 11:21 |
<btullis@deploy2002> |
helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/analytics-test: apply |
[production] |
| 11:21 |
<btullis@deploy2002> |
helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/analytics-test: apply |
[production] |
| 11:12 |
<jynus@cumin2002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on backup2014.codfw.wmnet with reason: upgrade and restart |
[production] |
| 11:12 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db1202 (T410531)', diff saved to https://phabricator.wikimedia.org/P85706 and previous config saved to /var/cache/conftool/dbconfig/20251126-111203-marostegui.json |
[production] |
| 11:10 |
<bwojtowicz@deploy2002> |
helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' . |
[production] |
| 11:09 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Depooling db1202 (T410531)', diff saved to https://phabricator.wikimedia.org/P85705 and previous config saved to /var/cache/conftool/dbconfig/20251126-110951-marostegui.json |
[production] |
| 11:09 |
<marostegui@cumin1003> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1202.eqiad.wmnet with reason: Maintenance |
[production] |
| 11:09 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db1194 (T410531)', diff saved to https://phabricator.wikimedia.org/P85704 and previous config saved to /var/cache/conftool/dbconfig/20251126-110928-marostegui.json |
[production] |
| 11:09 |
<bwojtowicz@deploy2002> |
helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' . |
[production] |
| 11:06 |
<bwojtowicz@deploy2002> |
helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' . |
[production] |
| 10:59 |
<fnegri@cloudcumin1001> |
END (FAIL) - Cookbook wmcs.openstack.tofu (exit_code=99) running tofu plan for main branch |
[admin] |
| 10:59 |
<fnegri@cloudcumin1001> |
START - Cookbook wmcs.openstack.tofu running tofu plan for main branch |
[admin] |
| 10:54 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P85702 and previous config saved to /var/cache/conftool/dbconfig/20251126-105420-marostegui.json |
[production] |
| 10:42 |
<hnowlan@deploy2002> |
helmfile [eqiad] DONE helmfile.d/services/thumbor: sync |
[production] |
| 10:42 |
<hnowlan@deploy2002> |
helmfile [eqiad] START helmfile.d/services/thumbor: sync |
[production] |
| 10:39 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P85701 and previous config saved to /var/cache/conftool/dbconfig/20251126-103913-marostegui.json |
[production] |
| 10:36 |
<brouberol@deploy2002> |
helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. |
[production] |