2020-09-07
04:56 <marostegui> Compress InnoDB on s1 eqiad master - this will generate a few days of lag on s1 and labsdb for enwiki T254462 [production]
04:53 <marostegui> Deploy schema change on db1109 (eqiad wikidata master) - T256685 [production]
2020-09-06
19:45 <marostegui@cumin1001> dbctl commit (dc=all): 'Decrease db2127's weight a bit', diff saved to https://phabricator.wikimedia.org/P12496 and previous config saved to /var/cache/conftool/dbconfig/20200906-194512-marostegui.json [production]
08:20 <elukey> powercycle mw1360 (mgmt console available, network errors while running anything) [production]
08:04 <elukey@puppetmaster1001> conftool action : set/pooled=inactive; selector: name=mw1360.eqiad.wmnet [production]
08:01 <elukey> executed "sudo ipmitool -I lanplus -H mw1360.mgmt.eqiad.wmnet -U root mc reset cold" from cumin (mgmt not available for mw1360) [production]
2020-09-05
00:23 <foks> removing 2 files for legal compliance [production]
2020-09-04
22:15 <ryankemper> wdqs deploy complete, service is healthy [production]
21:54 <ryankemper> `sudo -E cumin -b 1 'A:wdqs-all and not A:wdqs-test' 'depool && sleep 60 && systemctl restart wdqs-categories && sleep 30 && pool'` [production]
21:52 <ryankemper> `sudo -E cumin -b 4 'A:wdqs-all' 'systemctl restart wdqs-updater'` [production]
21:49 <ryankemper@deploy1001> Finished deploy [wdqs/wdqs@c7e6b35]: 0.3.47 (duration: 12m 55s) [production]
21:37 <ryankemper> Tests on canary `wdqs1003` passing, beginning full wdqs deploy [production]
21:36 <ryankemper@deploy1001> Started deploy [wdqs/wdqs@c7e6b35]: 0.3.47 [production]
21:31 <ryankemper> `ryankemper@wdqs2002:~$ sudo systemctl restart wdqs-blazegraph` [production]
21:06 <mutante> apt1001 - removed all libnginx-mod* packages except libnginx-mod-http-echo ; sudo apt-get autoremove ; run puppet ; restarted nginx - apt.wikimedia.org switched to nginx-light (T261962) [production]
21:02 <mutante> apt1001 - remove all libnginx-mod* packages except libnginx-mod-http-echo [production]
20:59 <mutante> apt2001 - sudo apt-get autoremove [production]
20:51 <mutante> apt2001 - apt-get remove --purge libnginx* and run puppet to replace nginx-full with nginx-light (T261962) [production]
20:43 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
20:41 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
20:39 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
20:38 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
20:38 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
20:36 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
20:36 <cmjohnson@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
20:35 <cmjohnson@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
20:34 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
20:32 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
20:31 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
20:31 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
20:30 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
20:30 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
20:05 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
20:04 <cmjohnson@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
20:03 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
20:01 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
20:01 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
20:00 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
19:59 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
19:59 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
19:57 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
19:57 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime [production]
19:22 <mutante> Icinga - ACKing with sticky - alerts on test and dev hosts [production]
18:10 <milimetric@deploy1001> Finished deploy [analytics/aqs/deploy@95d6432]: AQS: new editors by country endpoint, low risk so trying on a Friday with SRE blessing (duration: 07m 35s) [production]
18:02 <milimetric@deploy1001> Started deploy [analytics/aqs/deploy@95d6432]: AQS: new editors by country endpoint, low risk so trying on a Friday with SRE blessing [production]
10:31 <elukey@cumin1001> END (PASS) - Cookbook sre.hadoop.roll-restart-workers (exit_code=0) [production]
10:29 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1087 for MCR schema change', diff saved to https://phabricator.wikimedia.org/P12492 and previous config saved to /var/cache/conftool/dbconfig/20200904-102955-marostegui.json [production]
10:28 <marostegui> Deploy MCR schema change on db1087 (sanitarium master), this will generate lag (probably a few days) on s8 labsdb hosts T238966 [production]
09:48 <marostegui> Restart prometheus-mysqld-exporter on db2125 [production]
09:11 <elukey@cumin1001> START - Cookbook sre.hadoop.roll-restart-workers [production]