2020-11-17
07:15 <marostegui@cumin1001> dbctl commit (dc=all): 'es1015 (re)pooling @ 75%: Slowly pool es1015 after cloning es1033 T261717', diff saved to https://phabricator.wikimedia.org/P13282 and previous config saved to /var/cache/conftool/dbconfig/20201117-071529-root.json [production]
07:02 <marostegui@cumin1001> dbctl commit (dc=all): 'es1034 (re)pooling @ 20%: Slowly pool es1034 after being recloned T261717', diff saved to https://phabricator.wikimedia.org/P13281 and previous config saved to /var/cache/conftool/dbconfig/20201117-070220-root.json [production]
07:02 <marostegui@cumin1001> dbctl commit (dc=all): 'es1033 (re)pooling @ 20%: Slowly pool es1033 after being recloned T261717', diff saved to https://phabricator.wikimedia.org/P13280 and previous config saved to /var/cache/conftool/dbconfig/20201117-070209-root.json [production]
07:00 <marostegui@cumin1001> dbctl commit (dc=all): 'es1019 (re)pooling @ 50%: Slowly pool es1019 after cloning es1034 T261717', diff saved to https://phabricator.wikimedia.org/P13278 and previous config saved to /var/cache/conftool/dbconfig/20201117-070050-root.json [production]
07:00 <marostegui> Stop mysql on db1124 (s1 and s3); this will generate lag on enwiki and s3 on labsdb - T267090 [production]
07:00 <marostegui@cumin1001> dbctl commit (dc=all): 'es1015 (re)pooling @ 50%: Slowly pool es1015 after cloning es1033 T261717', diff saved to https://phabricator.wikimedia.org/P13277 and previous config saved to /var/cache/conftool/dbconfig/20201117-070025-root.json [production]
06:51 <marostegui> Upgrade db1077 and pc2010 to 10.4.17 [production]
06:47 <marostegui@cumin1001> dbctl commit (dc=all): 'es1034 (re)pooling @ 10%: Slowly pool es1034 after being recloned T261717', diff saved to https://phabricator.wikimedia.org/P13276 and previous config saved to /var/cache/conftool/dbconfig/20201117-064716-root.json [production]
06:47 <marostegui@cumin1001> dbctl commit (dc=all): 'es1033 (re)pooling @ 10%: Slowly pool es1033 after being recloned T261717', diff saved to https://phabricator.wikimedia.org/P13275 and previous config saved to /var/cache/conftool/dbconfig/20201117-064705-root.json [production]
06:45 <marostegui@cumin1001> dbctl commit (dc=all): 'es1019 (re)pooling @ 25%: Slowly pool es1019 after cloning es1034 T261717', diff saved to https://phabricator.wikimedia.org/P13274 and previous config saved to /var/cache/conftool/dbconfig/20201117-064546-root.json [production]
06:45 <marostegui@cumin1001> dbctl commit (dc=all): 'es1015 (re)pooling @ 25%: Slowly pool es1015 after cloning es1033 T261717', diff saved to https://phabricator.wikimedia.org/P13273 and previous config saved to /var/cache/conftool/dbconfig/20201117-064522-root.json [production]
06:39 <marostegui@cumin1001> dbctl commit (dc=all): 'Pool es1034 with minimum weight on es3 T261717', diff saved to https://phabricator.wikimedia.org/P13272 and previous config saved to /var/cache/conftool/dbconfig/20201117-063933-marostegui.json [production]
06:38 <marostegui@cumin1001> dbctl commit (dc=all): 'Pool es1033 with minimum weight on es2 T261717', diff saved to https://phabricator.wikimedia.org/P13271 and previous config saved to /var/cache/conftool/dbconfig/20201117-063805-marostegui.json [production]
06:30 <marostegui@cumin1001> dbctl commit (dc=all): 'es1019 (re)pooling @ 10%: Slowly pool es1019 after cloning es1034 T261717', diff saved to https://phabricator.wikimedia.org/P13270 and previous config saved to /var/cache/conftool/dbconfig/20201117-063043-root.json [production]
06:30 <marostegui@cumin1001> dbctl commit (dc=all): 'es1015 (re)pooling @ 10%: Slowly pool es1015 after cloning es1033 T261717', diff saved to https://phabricator.wikimedia.org/P13269 and previous config saved to /var/cache/conftool/dbconfig/20201117-063019-root.json [production]
02:37 <dwisehaupt> Shifted the portion of thank-you emails flowing through the frmx hosts to 60% of total volume [production]
01:59 <eileen_> civicrm revision is b6fe8bd791, config revision is 61e2000391 [production]
2020-11-16
23:28 <mutante> cumin1001 - sudo systemctl start cumin-check-aliases (to confirm switching cron to timer worked) T265138 [production]
22:22 <otto@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'production'. [production]
22:19 <otto@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'eventgate-analytics' for release 'production'. [production]
22:19 <otto@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'eventgate-analytics' for release 'canary'. [production]
22:17 <otto@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production'. [production]
22:09 <otto@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production'. [production]
22:06 <mutante> planet - fixed updates of uk.planet, which failed due to non-ASCII chars in a URL; since updates are systemd timers now, this affects the entire systemd state monitoring [production]
21:40 <rzl@cumin1001> conftool action : set/pooled=yes; selector: name=mw2250.codfw.wmnet [production]
21:40 <rzl@cumin1001> conftool action : set/weight=1; selector: name=mw2250.codfw.wmnet,cluster=videoscaler,service=canary [production]
21:38 <rzl@cumin1001> conftool action : set/pooled=yes; selector: name=mw2250.codfw.wmnet,cluster=jobrunner [production]
21:30 <mutante> peek2001 - mv /var/lib/peek/git to git.old ; run puppet ; let it fix git checkout [production]
21:07 <rzl> disable puppet on jobrunners T264991 [production]
20:40 <mutante> planet1002/planet2002 - delete entire crontab of user planet, drop update cronjobs after switching to systemd timers with gerrit:636105 (T265138) [production]
20:06 <pt1979@cumin2001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
20:06 <mutante> releases2002 - systemctl reset-failed; should clear the Icinga systemd alert after gerrit:641228 [production]
20:05 <dwisehaupt> disabling process-control jobs and moving to maintenance mode for maint window [production]
19:57 <pt1979@cumin2001> START - Cookbook sre.dns.netbox [production]
19:53 <ebernhardson@deploy1001> Finished deploy [wikimedia/discovery/analytics@4a953ca]: query_clicks_hourly: handle wmf.webrequest page_id change from int to bigint (duration: 02m 27s) [production]
19:51 <ebernhardson@deploy1001> Started deploy [wikimedia/discovery/analytics@4a953ca]: query_clicks_hourly: handle wmf.webrequest page_id change from int to bigint [production]
19:48 <effie> disable puppet on parsoid servers - T264991 [production]
19:01 <hnowlan@cumin1001> END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) [production]
18:59 <mutante> mw2255 - is pooled and puppet works on the next run, after it removed the php 7.2 config files [production]
18:56 <mutante> running puppet on mw2313 and mw2255 which were listed in puppetboard as failed puppet runs [production]
18:15 <rzl> disable puppet on 'A:mw-api and not A:mw-api-canary' T264991 [production]
18:05 <effie> disable puppet on all appservers [production]
17:48 <elukey> enable and run puppet on kafka-main2003 (it will start kafka services) - T267865 [production]
17:42 <dwisehaupt> frmon1001 upgraded to buster [production]
17:36 <volans> moved interfaces in Netbox from old to new switch - T267865 [production]
17:24 <vgutierrez> switching back from lvs2010 to lvs2007 - T267865 [production]
17:21 <vgutierrez> repooling cp2037 and cp2038 - T267865 [production]
16:46 <elukey@cumin1001> END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) [production]
16:40 <elukey@cumin1001> START - Cookbook sre.hosts.decommission [production]
16:16 <XioNoX> update c7 serial in row C VC config - T267865 [production]