2022-02-09
|
12:10 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn |
[production] |
11:20 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depooling db1100 (T300775)', diff saved to https://phabricator.wikimedia.org/P20411 and previous config saved to /var/cache/conftool/dbconfig/20220209-112029-marostegui.json |
[production] |
11:20 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1100.eqiad.wmnet with reason: Maintenance |
[production] |
11:20 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 12:00:00 on db1100.eqiad.wmnet with reason: Maintenance |
[production] |
11:08 |
<mvernon@cumin2002> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ms-fe[2005-2008].codfw.wmnet |
[production] |
10:50 |
<mvernon@cumin2002> |
START - Cookbook sre.hosts.decommission for hosts ms-fe[2005-2008].codfw.wmnet |
[production] |
10:45 |
<akosiaris> |
T300568 upload prometheus-etherpad-exporter_0.5_amd64 to apt.wikimedia.org bullseye-wikimedia/main |
[production] |
10:35 |
<jayme@deploy1002> |
helmfile [staging] DONE helmfile.d/services/miscweb: sync on main |
[production] |
10:34 |
<jayme@deploy1002> |
helmfile [staging] START helmfile.d/services/miscweb: apply on main |
[production] |
10:34 |
<jayme@deploy1002> |
helmfile [staging] DONE helmfile.d/services/miscweb: sync on main |
[production] |
10:32 |
<jayme@deploy1002> |
helmfile [staging] START helmfile.d/services/miscweb: apply on main |
[production] |
10:25 |
<jelto@deploy1002> |
Finished deploy [restbase/deploy@0848b15] (dev-cluster): (no justification provided) (duration: 00m 22s) |
[production] |
10:25 |
<jelto@deploy1002> |
Started deploy [restbase/deploy@0848b15] (dev-cluster): (no justification provided) |
[production] |
10:20 |
<jelto> |
update scap to 4.3.1 on A:restbase-canary - T301307 |
[production] |
10:17 |
<jelto> |
update scap to 4.3.1 on A:mw-canary or A:parsoid-canary or A:mw-jobrunner-canary - T301307 |
[production] |
10:16 |
<ariel@deploy1002> |
Finished deploy [dumps/dumps@9993036]: fix up default api jobs entry for siteinfo v2 (duration: 00m 03s) |
[production] |
10:15 |
<ariel@deploy1002> |
Started deploy [dumps/dumps@9993036]: fix up default api jobs entry for siteinfo v2 |
[production] |
10:15 |
<mvernon@cumin2002> |
END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) for hosts ms-fe[2005-2008].codfw.wmnet |
[production] |
10:14 |
<volans> |
uploaded python3-wmflib_1.0.1 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia |
[production] |
10:11 |
<mvernon@cumin2002> |
START - Cookbook sre.hosts.decommission for hosts ms-fe[2005-2008].codfw.wmnet |
[production] |
10:03 |
<akosiaris> |
T300568 upload prometheus-etherpad-exporter_0.4_amd64 to apt.wikimedia.org bullseye-wikimedia/main |
[production] |
10:02 |
<Emperor> |
rolling restart of swift frontends T301251 |
[production] |
09:46 |
<jayme@deploy1002> |
helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'. |
[production] |
09:45 |
<jayme@deploy1002> |
helmfile [staging-eqiad] START helmfile.d/admin 'apply'. |
[production] |
09:45 |
<jayme@deploy1002> |
helmfile [staging-codfw] DONE helmfile.d/admin 'apply'. |
[production] |
09:45 |
<elukey> |
update my ssh key on all network devices (will commit only when the diff is my key only) |
[production] |
09:44 |
<jayme@deploy1002> |
helmfile [staging-codfw] START helmfile.d/admin 'apply'. |
[production] |
09:41 |
<ema> |
cp3050: stop and disable atskafka-webrequest.service T247497 |
[production] |
09:15 |
<ema> |
cp3050: ats-backend-restart to set the number of allowed Lua states back from 64 to 256 (default) T265625 |
[production] |
08:21 |
<dcausse> |
restarting blazegraph on wdqs1004 (jvm stuck for 5 hours) |
[production] |
07:55 |
<filippo@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2001.codfw.wmnet |
[production] |
07:42 |
<filippo@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host thanos-be2001.codfw.wmnet |
[production] |
07:35 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Remove logpager group from s1 eqiad T263127', diff saved to https://phabricator.wikimedia.org/P20410 and previous config saved to /var/cache/conftool/dbconfig/20220209-073528-marostegui.json |
[production] |
04:10 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance |
[production] |
04:10 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance |
[production] |
03:48 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance |
[production] |
03:48 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance |
[production] |
03:48 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1147 (T298554)', diff saved to https://phabricator.wikimedia.org/P20407 and previous config saved to /var/cache/conftool/dbconfig/20220209-034800-ladsgroup.json |
[production] |
03:32 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P20406 and previous config saved to /var/cache/conftool/dbconfig/20220209-033255-ladsgroup.json |
[production] |
03:17 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P20405 and previous config saved to /var/cache/conftool/dbconfig/20220209-031750-ladsgroup.json |
[production] |
03:02 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1147 (T298554)', diff saved to https://phabricator.wikimedia.org/P20404 and previous config saved to /var/cache/conftool/dbconfig/20220209-030245-ladsgroup.json |
[production] |
02:34 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db1147 (T298554)', diff saved to https://phabricator.wikimedia.org/P20403 and previous config saved to /var/cache/conftool/dbconfig/20220209-023446-ladsgroup.json |
[production] |
02:34 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance |
[production] |
02:34 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance |
[production] |
02:11 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 11 hosts with reason: Maintenance |
[production] |
02:11 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 12:00:00 on 11 hosts with reason: Maintenance |
[production] |
02:11 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance |
[production] |
02:11 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance |
[production] |