2019-09-30 §
10:48 <hashar> CI outage is tracked in https://phabricator.wikimedia.org/T234197 [production]
10:42 <moritzm> draining ganeti2004 for upcoming reboot (combined kernel/qemu security updates) [production]
10:40 <hashar> CI down due to some DNS related failure on the hosts :-\ [production]
10:30 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
10:30 <jmm@cumin2001> START - Cookbook sre.hosts.downtime [production]
10:21 <arturo> we installed ferm in every VM by mistake. Deleting it and forcing a puppet agent run to try to go back to a clean state. [admin]
09:38 <arturo> downtime toolschecker for 24h [admin]
09:33 <arturo> force update ferm cloud-wide (in all VMs) for T153468 [admin]
09:30 <moritzm> uploading ferm 2.4.1+wmf2+deb9u1 for stretch-wikimedia, fixes AAAA lookups (T153468) [production]
09:11 <moritzm> draining ganeti2002 for upcoming reboot (combined kernel/qemu security updates) [production]
09:10 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db2091:3314 for a schema change - T233625', diff saved to https://phabricator.wikimedia.org/P9217 and previous config saved to /var/cache/conftool/dbconfig/20190930-091043-marostegui.json [production]
08:00 <moritzm> installing e2fsprogs security updates on Stretch/Buster [production]
07:56 <marostegui> Stop dbstore1003:3311 for troubleshooting [production]
06:47 <moritzm> installing exim security updates on buster [production]
05:26 <elukey> re-run manually pageview-druid-hourly 29/09T22:00 [analytics]
2019-09-29 §
21:49 <Amir1> git pull for puppet (T219248) [codesearch]
12:14 <hauskatze> Added hi.wikisource to CVNBot8 [cvn]
12:10 <hauskatze> Added atj.wikipedia to CVNBot8 [cvn]
2019-09-28 §
18:49 <wikibugs> Updated channels.yaml to: cbff02ce5e3ed77807d1753a0add44b7b550a720 Add User-RhinosF1 project alerts for channel ##RhinosF1 [tools.wikibugs]
18:35 <hauskatze> gerrit: Ran gerrit gc labs/tools/stewardbots --show-progress --aggressive [releng]
16:28 <vgutierrez> restarting acme-chief on acmechief1001 [production]
14:58 <Zppix> sudo service ircecho restart due to bot pingout and not restarting self [git]
2019-09-27 §
22:44 <mutante> phab2001 - apt-get autoremove - remove unused python and ruby packages [production]
22:36 <mutante> phab2001 - upgrade php7.2 packages to 7.2.22 (T230024) [production]
22:24 <Zoranzoki21> Migration done, everything works. Maintenance done! - T233980 [tools.discordwiki]
22:23 <Zoranzoki21> DROP DATABASE s53972__wiki - done in 2,53 seconds - T233980 [tools.discordwiki]
22:22 <Zoranzoki21> MariaDB [(none)]> DROP DATABASE s53972__wiki; [tools.discordwiki]
22:22 <Zoranzoki21> DROP DATABASE s53972__wiki; started - T233980 [tools.discordwiki]
22:21 <Zoranzoki21> Update LocalSettings.php with new sql access parameters - T233980 [tools.discordwiki]
22:20 <Zoranzoki21> Import fully done - T233980 [tools.discordwiki]
22:18 <Zoranzoki21> Import done in 15 seconds [tools.discordwiki]
22:17 <Zoranzoki21> Import dump from s53972__wiki to s54159__wiki database - T233980 [tools.discordwiki]
22:12 <Zoranzoki21> Created dump for migration of database - T233980 [tools.discordwiki]
22:03 <mutante> webperf1001, webperf2001: restart envoyproxy to pick up new cert with the right subject alt. names [production]
21:01 <hashar> T233989 I have failed to run (must be run offline): /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -jar /var/lib/gerrit2/review_site/bin/gerrit.war reindex --list [releng]
20:01 <hashar> Marking integration-agent-docker-1015 offline due to cloudvirt1004 being wayyyyy too slow T223971 [releng]
19:47 <mutante> phabricator-10 - switching puppet config to puppetmaster.cloudinfra.wmflabs.org - cert error is gone, just has acme-setup issue [phabricator]
19:12 <mutante> re-enabling disabled puppet on phabricator-10. running puppet, fails with "certificate verify failed (self signed certificate in certificate chain)" [phabricator]
18:23 <brennen> Reloading Zuul to deploy https://gerrit.wikimedia.org/r/539447 [releng]
18:22 <mutante> mwdebug1001, mwdebug1002 - deleted from /srv/mediawiki/: php-1.34.0-wmf.16, .17, .18, .19 and .20 (current is .24) - usage back to about 57% (T234063) [production]
18:17 <mutante> mwdebug1001, mwdebug1002 - apt-get clean saves about 3GB and gets usage down from 94% to 87% on / (T234063) [production]
16:59 <bd808> Set "profile::rsyslog::kafka_shipper::kafka_brokers: []" in tools-elastic prefix puppet [tools]
16:01 <XioNoX> delete BGP to AS34305 on cr2-esams [production]
15:34 <elukey> update pcc facts to add new hosts [production]
15:02 <moritzm> installing usb.ids update from Buster 10.1 point release [production]
14:45 <moritzm> installing ncurses bugfix update from Buster 10.1 point release [production]
14:39 <moritzm> installing postgresql-common bugfix update from Buster 10.1 point release [production]
14:32 <effie> Disable puppet and reload apache on mw* for 539465 and 539488 - T229792 [production]
13:33 <marostegui> Set candidate masters in dbctl T234039 [production]
13:31 <jmm@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]