2023-01-20 |
10:49 |
<jmm@cumin2002> |
START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "new ping host - jmm@cumin2002" |
[production] |
10:32 |
<elukey> |
restart kubelet on ml-staging200* nodes (some fs-inotify-related issues with the istio-proxy of newly created containers) |
[production] |
10:27 |
<elukey@deploy1002> |
helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' . |
[production] |
10:13 |
<moritzm> |
installing emacs security updates on bullseye |
[production] |
10:13 |
<elukey@deploy1002> |
helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' . |
[production] |
10:12 |
<moritzm> |
imported jenkins 2.375-2 to thirdparty/ci T326531 |
[production] |
10:00 |
<jnuche@deploy1002> |
Installation of scap version "4.33.1" completed for 1 hosts |
[production] |
10:00 |
<jnuche@deploy1002> |
Installing scap version "4.33.1" for 1 hosts |
[production] |
08:59 |
<moritzm> |
installing ping2003 T273509 |
[production] |
08:10 |
<elukey> |
restart kubelet on kubernetes2007 - node reported issues with it, marked as "notready" by the control plane |
[production] |
07:58 |
<elukey> |
`apt-get clean` on doh4001 to free space (root partition almost filled) |
[production] |
01:55 |
<ejegg> |
payments-wiki upgraded from 3cf03933 to 3d882ac7 |
[production] |
01:12 |
<ejegg> |
payments-wiki upgraded from fcb9ab60 to 3cf03933 |
[production] |
2023-01-19 |
21:46 |
<jiji@cumin1001> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2039.codfw.wmnet with OS bullseye |
[production] |
21:42 |
<jdrewniak@deploy1002> |
Finished scap: Backport for [[gerrit:881677|Enable Page tools on viwiki and itwiki (T327348)]] (duration: 10m 38s) |
[production] |
21:33 |
<jdrewniak@deploy1002> |
jdlrobson and jdrewniak: Backport for [[gerrit:881677|Enable Page tools on viwiki and itwiki (T327348)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet |
[production] |
21:31 |
<jiji@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2039.codfw.wmnet with reason: host reimage |
[production] |
21:31 |
<jdrewniak@deploy1002> |
Started scap: Backport for [[gerrit:881677|Enable Page tools on viwiki and itwiki (T327348)]] |
[production] |
21:27 |
<jdrewniak@deploy1002> |
Finished scap: Backport for [[gerrit:881612|Fix grid blowout with limited width turned off (T327423)]] (duration: 08m 26s) |
[production] |
21:27 |
<jiji@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mc2039.codfw.wmnet with reason: host reimage |
[production] |
21:20 |
<cwhite@deploy1002> |
Finished deploy [releng/phatality@e0bb573]: (no justification provided) (duration: 00m 13s) |
[production] |
21:20 |
<cwhite@deploy1002> |
Started deploy [releng/phatality@e0bb573]: (no justification provided) |
[production] |
21:20 |
<jdrewniak@deploy1002> |
jdlrobson and jdrewniak: Backport for [[gerrit:881612|Fix grid blowout with limited width turned off (T327423)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet |
[production] |
21:18 |
<jdrewniak@deploy1002> |
Started scap: Backport for [[gerrit:881612|Fix grid blowout with limited width turned off (T327423)]] |
[production] |
21:11 |
<jiji@cumin1001> |
START - Cookbook sre.hosts.reimage for host mc2039.codfw.wmnet with OS bullseye |
[production] |
20:13 |
<zabe@deploy1002> |
Finished scap: fix k8s drift (duration: 08m 02s) |
[production] |
20:05 |
<zabe@deploy1002> |
Started scap: fix k8s drift |
[production] |
20:02 |
<zabe@deploy1002> |
Finished scap: Backport for [[gerrit:881706|Start reading from cuc_comment_id everywhere except wikidatawiki (T233004)]] (duration: 14m 01s) |
[production] |
19:49 |
<zabe@deploy1002> |
zabe: Backport for [[gerrit:881706|Start reading from cuc_comment_id everywhere except wikidatawiki (T233004)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet |
[production] |
19:48 |
<zabe@deploy1002> |
Started scap: Backport for [[gerrit:881706|Start reading from cuc_comment_id everywhere except wikidatawiki (T233004)]] |
[production] |
18:36 |
<zabe> |
re-start populateCucComment on wikidatawiki post-mwmaint-reboot in screen with --sleep 2, will take ~30 hours # T233004 |
[production] |
18:17 |
<bd808@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply |
[production] |
18:17 |
<bd808@deploy1002> |
helmfile [eqiad] START helmfile.d/services/developer-portal: apply |
[production] |
18:16 |
<bd808@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/developer-portal: apply |
[production] |
18:16 |
<bd808@deploy1002> |
helmfile [codfw] START helmfile.d/services/developer-portal: apply |
[production] |
18:13 |
<bd808@deploy1002> |
helmfile [staging] DONE helmfile.d/services/developer-portal: apply |
[production] |
18:12 |
<bd808@deploy1002> |
helmfile [staging] START helmfile.d/services/developer-portal: apply |
[production] |
18:08 |
<mbsantos@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply |
[production] |
18:08 |
<mbsantos@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mobileapps: apply |
[production] |
18:06 |
<mbsantos@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mobileapps: apply |
[production] |
18:05 |
<mbsantos@deploy1002> |
helmfile [codfw] START helmfile.d/services/mobileapps: apply |
[production] |
18:02 |
<mbsantos@deploy1002> |
helmfile [staging] DONE helmfile.d/services/mobileapps: apply |
[production] |
18:01 |
<mbsantos@deploy1002> |
helmfile [staging] START helmfile.d/services/mobileapps: apply |
[production] |
17:36 |
<Amir1> |
bash Krinkle> Vatican Interim Papacy Runbook, § 5.1: Notify Wikipedia about incoming traffic. |
[production] |
17:17 |
<jiji@cumin1001> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2038.codfw.wmnet with OS bullseye |
[production] |
17:13 |
<zabe@deploy1002> |
Finished scap: T233004 (duration: 18m 50s) |
[production] |
17:02 |
<jiji@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2038.codfw.wmnet with reason: host reimage |
[production] |
16:58 |
<jiji@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mc2038.codfw.wmnet with reason: host reimage |
[production] |
16:54 |
<zabe@deploy1002> |
Started scap: T233004 |
[production] |
16:54 |
<zabe@deploy1002> |
backport aborted: (duration: 15m 22s) |
[production] |