2019-09-30
14:00 <jmm@cumin2001> START - Cookbook sre.hosts.downtime [production]
13:54 <moritzm> draining ganeti2005 for upcoming reboot (combined kernel/qemu security updates) [production]
13:49 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
13:49 <jmm@cumin2001> START - Cookbook sre.hosts.downtime [production]
12:53 <jbond@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
12:51 <jbond@cumin1001> START - Cookbook sre.hosts.downtime [production]
12:33 <kart_> Update cxserver to 2019-09-26-034732-production (T233834, T232674, T233085) [production]
12:29 <@> helmfile [EQIAD] Ran 'apply' command on namespace 'cxserver' for release 'production' . [production]
12:29 <jbond42> offline puppetmaster2002 to reimage https://gerrit.wikimedia.org/r/c/operations/puppet/+/539322 [production]
12:27 <@> helmfile [CODFW] Ran 'apply' command on namespace 'cxserver' for release 'production' . [production]
12:24 <@> helmfile [STAGING] Ran 'apply' command on namespace 'cxserver' for release 'staging' . [production]
12:00 <Urbanecm> EU SWAT done #2 [production]
12:00 <urbanecm@deploy1001> Synchronized wmf-config/throttle.php: SWAT: 3f4f242: New throttle rule for Czech wiki course (T234113) (duration: 00m 56s) [production]
11:57 <Urbanecm> Reopen EU SWAT to deploy throttle rule for October 02 (T234113) [production]
11:54 <raynor> EU SWAT finished [production]
11:54 <pmiazga@deploy1001> Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:538296|Enable alternate mobile link for it, nl, ko wikis. (T206497)]] (duration: 00m 57s) [production]
11:27 <kartik@deploy1001> Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:539517|Enable CX out of beta in Tagalog and Central Bikol WPs (T233006, T233007)]] (duration: 00m 59s) [production]
11:20 <hashar> Restarting Docker on integration-agent-puppet-docker-1001 # T234197 [production]
11:08 <hashar> Restarting Docker on CI agents to clear out some docker/iptables oddity # T234197 [production]
10:48 <hashar> CI outage is tracked in https://phabricator.wikimedia.org/T234197 [production]
10:42 <moritzm> draining ganeti2004 for upcoming reboot (combined kernel/qemu security updates) [production]
10:40 <hashar> CI down due to some DNS related failure on the hosts :-\ [production]
10:30 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
10:30 <jmm@cumin2001> START - Cookbook sre.hosts.downtime [production]
09:30 <moritzm> uploading ferm 2.4.1+wmf2+deb9u1 for stretch-wikimedia, fixes AAAA lookups (T153468) [production]
09:11 <moritzm> draining ganeti2002 for upcoming reboot (combined kernel/qemu security updates) [production]
09:10 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db2091:3314 for a schema change - T233625', diff saved to https://phabricator.wikimedia.org/P9217 and previous config saved to /var/cache/conftool/dbconfig/20190930-091043-marostegui.json [production]
08:00 <moritzm> installing e2fsprogs security updates on Stretch/Buster [production]
07:56 <marostegui> Stop dbstore1003:3311 for troubleshooting [production]
06:47 <moritzm> installing exim security updates on buster [production]
2019-09-28
16:28 <vgutierrez> restarting acme-chief on acmechief1001 [production]
2019-09-27
22:44 <mutante> phab2001 - apt-get autoremove - remove unused python and ruby packages [production]
22:36 <mutante> phab2001 - upgrade php7.2 packages to 7.2.22 (T230024) [production]
22:03 <mutante> webperf1001, webperf2001: restart envoyproxy to pick up new cert with the right subject alt. names [production]
18:22 <mutante> mwdebug1001, mwdebug1002 - deleted from /srv/mediawiki/: php-1.34.0-wmf.16, .17, .18, .19 and .20 (current is .24) - usage back to about 57% (T234063) [production]
18:17 <mutante> mwdebug1001, mwdebug1002 - apt-get clean saves about 3GB and gets usage down from 94% to 87% on / (T234063) [production]
16:01 <XioNoX> delete BGP to AS34305 on cr2-esams [production]
15:34 <elukey> update pcc facts to add new hosts [production]
15:02 <moritzm> installing usb.ids update from Buster 10.1 point release [production]
14:45 <moritzm> installing ncurses bugfix update from Buster 10.1 point release [production]
14:39 <moritzm> installing postgresql-common bugfix update from Buster 10.1 point release [production]
14:32 <effie> Disable puppet and reload apache on mw* for 539465 and 539488 - T229792 [production]
13:33 <marostegui> Set candidate masters in dbctl T234039 [production]
13:31 <jmm@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
13:29 <jmm@cumin1001> START - Cookbook sre.hosts.downtime [production]
13:16 <moritzm> reimaging auth1002 to buster [production]
13:09 <akosiaris> reboot ganeti2001 T233906 [production]
13:08 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) [production]
13:08 <marostegui@cumin1001> START - Cookbook sre.hosts.decommission [production]
13:03 <effie> Disable puppet on mwmaint1002 to test noc.wikimedia.org with PHP7 [production]