2019-09-30
|
10:48 |
<hashar> |
CI outage is tracked in https://phabricator.wikimedia.org/T234197 |
[production] |
10:42 |
<moritzm> |
draining ganeti2004 for upcoming reboot (combined kernel/qemu security updates) |
[production] |
10:40 |
<hashar> |
CI down due to some DNS related failure on the hosts :-\ |
[production] |
10:30 |
<jmm@cumin2001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
10:30 |
<jmm@cumin2001> |
START - Cookbook sre.hosts.downtime |
[production] |
10:21 |
<arturo> |
we installed ferm in every VM by mistake. Deleting it and forcing a puppet agent run to try to go back to a clean state. |
[admin] |
09:38 |
<arturo> |
downtime toolschecker for 24h |
[admin] |
09:33 |
<arturo> |
force update ferm cloud-wide (in all VMs) for T153468 |
[admin] |
09:30 |
<moritzm> |
uploading ferm 2.4.1+wmf2+deb9u1 for stretch-wikimedia, fixes AAAA lookups (T153468) |
[production] |
09:11 |
<moritzm> |
draining ganeti2002 for upcoming reboot (combined kernel/qemu security updates) |
[production] |
09:10 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db2091:3314 for a schema change - T233625', diff saved to https://phabricator.wikimedia.org/P9217 and previous config saved to /var/cache/conftool/dbconfig/20190930-091043-marostegui.json |
[production] |
08:00 |
<moritzm> |
installing e2fsprogs security updates on Stretch/Buster |
[production] |
07:56 |
<marostegui> |
Stop dbstore1003:3311 for troubleshooting |
[production] |
06:47 |
<moritzm> |
installing exim security updates on buster |
[production] |
05:26 |
<elukey> |
re-run manually pageview-druid-hourly 29/09T22:00 |
[analytics] |
2019-09-27
|
22:44 |
<mutante> |
phab2001 - apt-get autoremove - remove unused python and ruby packages |
[production] |
22:36 |
<mutante> |
phab2001 - upgrade php7.2 packages to 7.2.22 (T230024) |
[production] |
22:24 |
<Zoranzoki21> |
Migration done, everything works. Maintenance done! - T233980 |
[tools.discordwiki] |
22:23 |
<Zoranzoki21> |
DROP DATABASE s53972__wiki - done in 2.53 seconds - T233980 |
[tools.discordwiki] |
22:22 |
<Zoranzoki21> |
MariaDB [(none)]> DROP DATABASE s53972__wiki; |
[tools.discordwiki] |
22:22 |
<Zoranzoki21> |
DROP DATABASE s53972__wiki; started - T233980 |
[tools.discordwiki] |
22:21 |
<Zoranzoki21> |
Update LocalSettings.php with new sql access parameters - T233980 |
[tools.discordwiki] |
22:20 |
<Zoranzoki21> |
Import fully done - T233980 |
[tools.discordwiki] |
22:18 |
<Zoranzoki21> |
Import done in 15 seconds |
[tools.discordwiki] |
22:17 |
<Zoranzoki21> |
Import dump from s53972__wiki to s54159__wiki database - T233980 |
[tools.discordwiki] |
22:12 |
<Zoranzoki21> |
Created dump for migration of database - T233980 |
[tools.discordwiki] |
22:03 |
<mutante> |
webperf1001, webperf2001: restart envoyproxy to pick up new cert with the right subject alt. names |
[production] |
21:01 |
<hashar> |
T233989 I have failed to run (must be run offline): /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -jar /var/lib/gerrit2/review_site/bin/gerrit.war reindex --list |
[releng] |
20:01 |
<hashar> |
Marking integration-agent-docker-1015 offline due to cloudvirt1004 being wayyyyy too slow T223971 |
[releng] |
19:47 |
<mutante> |
phabricator-10 - switching puppet config to puppetmaster.cloudinfra.wmflabs.org - cert error is gone, just has acme-setup issue |
[phabricator] |
19:12 |
<mutante> |
re-enabling disabled puppet on phabricator-10. running puppet, fails with "certificate verify failed (self signed certificate in certificate chain)" |
[phabricator] |
18:23 |
<brennen> |
Reloading Zuul to deploy https://gerrit.wikimedia.org/r/539447 |
[releng] |
18:22 |
<mutante> |
mwdebug1001, mwdebug1002 - deleted from /srv/mediawiki/: php-1.34.0-wmf.16, .17, .18, .19 and .20 (current is .24) - usage back to about 57% (T234063) |
[production] |
18:17 |
<mutante> |
mwdebug1001, mwdebug1002 - apt-get clean saves about 3GB and gets usage down from 94% to 87% on / (T234063) |
[production] |
16:59 |
<bd808> |
Set "profile::rsyslog::kafka_shipper::kafka_brokers: []" in tools-elastic prefix puppet |
[tools] |
16:01 |
<XioNoX> |
delete BGP to AS34305 on cr2-esams |
[production] |
15:34 |
<elukey> |
update pcc facts to add new hosts |
[production] |
15:02 |
<moritzm> |
installing usb.ids update from Buster 10.1 point release |
[production] |
14:45 |
<moritzm> |
installing ncurses bugfix update from Buster 10.1 point release |
[production] |
14:39 |
<moritzm> |
installing postgresql-common bugfix update from Buster 10.1 point release |
[production] |
14:32 |
<effie> |
Disable puppet and reload apache on mw* for 539465 and 539488 - T229792 |
[production] |
13:33 |
<marostegui> |
Set candidate masters in dbctl T234039 |
[production] |
13:31 |
<jmm@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |