2015-06-08
|
13:03 |
<moritzm> |
added strongswan_5.3.0-1+wmf2 to jessie-wikimedia on carbon |
[production] |
11:42 |
<_joe_> |
purging squid cache on carbon |
[production] |
11:26 |
<moritzm> |
updated mc2* to 2:2.8.17-1+deb8u1 |
[production] |
10:55 |
<jynus> |
Synchronized wmf-config/db-eqiad.php: repool es1007 (duration: 01m 08s) |
[production] |
10:27 |
<akosiaris> |
disabled puppet on uranium, investigating ganglia problems |
[production] |
10:05 |
<akosiaris> |
ganglia gmetad problems |
[production] |
08:56 |
<hashar> |
rebooted trusty-1013 trusty-1015 ( https://phabricator.wikimedia.org/T101658 ) and repooled them in Jenkins |
[releng] |
08:48 |
<hashar> |
rebooting integration-slave-trusty-1012 (stalled, can't log in) |
[releng] |
05:25 |
<LocalisationUpdate> |
ResourceLoader cache refresh completed at Mon Jun 8 05:24:08 UTC 2015 (duration 24m 7s) |
[production] |
04:30 |
<legoktm> |
deploying https://gerrit.wikimedia.org/r/216520 |
[releng] |
02:26 |
<LocalisationUpdate> |
completed (1.26wmf8) at 2015-06-08 02:25:12+00:00 |
[production] |
02:21 |
<l10nupdate> |
Synchronized php-1.26wmf8/cache/l10n: (no message) (duration: 07m 07s) |
[production] |
00:40 |
<legoktm> |
deploying https://gerrit.wikimedia.org/r/216600 |
[releng] |
2015-06-07
|
23:27 |
<godog> |
reboot ms-be2008 sdg failed, xfs unhappy |
[production] |
20:43 |
<Krinkle> |
Rebooting integration-slave-trusty-1015 to see if it comes back so we can inspect logs (T101658) |
[releng] |
20:16 |
<Krinkle> |
Per Yuvi's advice, disabled "Shared project storage" (/data/project NFS mount) for the integration project. Mostly unused. Two existing directories were archived to /home/krinkle/integration-nfs-data-project/ |
[releng] |
17:51 |
<Krinkle> |
integration-slave-trusty-1012, trusty-1013 and 1015 unresponsive to pings or ssh. Other trusty slaves still reachable. |
[releng] |
07:03 |
<springle> |
Synchronized wmf-config/db-eqiad.php: repool db1073, warm up (duration: 01m 09s) |
[production] |
05:16 |
<andrewbogott> |
we did a whole lot of things to labstore1001 while morebots was away |
[production] |
05:14 |
<andrewbogott> |
service nfs-kernel-server restart on labstore1001 |
[production] |
02:26 |
<LocalisationUpdate> |
completed (1.26wmf8) at 2015-06-07 02:25:13+00:00 |
[production] |
02:21 |
<l10nupdate> |
Synchronized php-1.26wmf8/cache/l10n: (no message) (duration: 07m 09s) |
[production] |
2015-06-05
|
23:55 |
<bd808> |
added deployment-logstash2 host and told cluster to move all logstash data there |
[releng] |
22:42 |
<godog> |
powercycle graphite2001, no console no ssh |
[production] |
22:06 |
<andrewbogott> |
restarted apache on virt1000 |
[production] |
21:22 |
<bd808> |
restarted puppetmaster on deployment-salt ("Could not request certificate: Error 500 on SERVER: <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">") |
[releng] |
21:17 |
<hashar> |
Pooled in mediawiki-extensions-qunit, which runs qunit tests with karma with multiple extensions. https://gerrit.wikimedia.org/r/#/c/216132/ https://phabricator.wikimedia.org/T99877 |
[releng] |
20:49 |
<ori> |
Upgrading hhvm-fss on application servers to 1.1.7; expect brief 5xx spike. |
[production] |
20:14 |
<demon> |
Synchronized php-1.26wmf8: live hack (duration: 02m 32s) |
[production] |
20:10 |
<mutante> |
apt-get upgrade on terbium |
[production] |
19:52 |
<godog> |
bounce redis on rdb1001/rdb1003 to pick up new slave limits |
[production] |
19:51 |
<mutante> |
chown root:root / on terbium |
[production] |
19:50 |
<godog> |
bounce redis on rdb1002/rdb1004 to pick up new slave limits |
[production] |
19:45 |
<thcipriani> |
set use_dnsmasq: false on Hiera:Integration |
[releng] |
19:40 |
<hashar> |
refreshed Jenkins jobs mediawiki-extensions-hhvm and mediawiki-extensions-zend with https://gerrit.wikimedia.org/r/#/c/216100/3 (refactoring) |
[releng] |
19:29 |
<godog> |
bounce redis again on rdb1003 after increasing the slave limits more |
[production] |
19:17 |
<godog> |
bounce redis on rdb1003 after bumping slave limits |
[production] |
19:07 |
<godog> |
redis master logs show periodic 'cmd=sync scheduled to be closed ASAP for overcoming of output buffer limits.', indicating the slave fails to sync |
[production] |
18:56 |
<Krinkle> |
Reloading Zuul to deploy https://gerrit.wikimedia.org/r/216182 |
[releng] |
18:52 |
<Krinkle> |
Reloading Zuul to deploy https://gerrit.wikimedia.org/r/216159 |
[releng] |
18:40 |
<godog> |
spike in redis network traffic starting at ~15:00 UTC, correlates with ocg failures |
[production] |
18:01 |
<moritzm> |
restarted gerrit on ytterbium for java update |
[production] |
14:43 |
<jynus> |
short lag period on db1049, traffic automatically redirected to other slave and back to normal |
[production] |
14:07 |
<moritzm> |
added ubuntu-meta-1.325+wmf1 for trusty-wikimedia to apt.wikimedia.org (T100004) |
[production] |
14:07 |
<moritzm> |
added ubuntu-meta-1.267.1+wmf1 for precise-wikimedia to apt.wikimedia.org (T100004) |
[production] |