2014-08-17
|
22:58 |
<bd808> |
Attempting to reboot deployment-cache-bits01.eqiad.wmflabs via wikitech |
[releng] |
22:56 |
<bd808> |
deployment-cache-bits01.eqiad.wmflabs not allowing ssh access and wikitech console full of OOM killer messages |
[releng] |
21:08 |
<legoktm> |
running migrateAccount.php without --safe or --auto on terbium for bug 69291 |
[production] |
18:45 |
<hashar> |
Zuul upgraded |
[production] |
18:41 |
<hashar> |
Upgrading Zuul to the latest version (it is not a Friday after all) |
[production] |
09:22 |
<springle> |
ongoing schema change wikidatawiki & testwiki wb_entity_per_page.epp_redirect_target. osc_host.sh processes on terbium ok to kill in emergency |
[production] |
04:34 |
<ottomata> |
restarted udp2log on oxygen |
[production] |
03:05 |
<LocalisationUpdate> |
ResourceLoader cache refresh completed at Sun Aug 17 03:04:22 UTC 2014 (duration 4m 21s) |
[production] |
02:49 |
<springle> |
killed stuff on labsdb1003 using all disk for temp tables. investigating |
[production] |
02:24 |
<LocalisationUpdate> |
completed (1.24wmf17) at 2014-08-17 02:23:08+00:00 |
[production] |
02:14 |
<LocalisationUpdate> |
completed (1.24wmf16) at 2014-08-17 02:13:35+00:00 |
[production] |
2014-08-16
|
18:12 |
<bblack> |
(amssq33: and yes, removing from fe/be cache pools) |
[production] |
18:11 |
<bblack> |
powering off amssq33, it's clipping network traffic at peak times due to bad ethernet connection negotiated down to 100Mbps (see existing RT 7933 in esams queue) |
[production] |
18:02 |
<bblack> |
ms-be1006: syslog indicates it started generating repeated "BUG: soft lockup" 10 minutes before dying, in XFS kernel code again... |
[production] |
17:55 |
<bblack> |
rebooting ms-be1006, ping-dead in icinga for 23m, console was unresponsive |
[production] |
17:37 |
<bblack> |
restarted apache2 on palladium... looks like something went horribly wrong with its own puppet run, which somehow killed off the puppetmaster service? |
[production] |
03:07 |
<LocalisationUpdate> |
ResourceLoader cache refresh completed at Sat Aug 16 03:06:29 UTC 2014 (duration 6m 28s) |
[production] |
02:27 |
<LocalisationUpdate> |
completed (1.24wmf17) at 2014-08-16 02:26:02+00:00 |
[production] |
02:17 |
<LocalisationUpdate> |
completed (1.24wmf16) at 2014-08-16 02:16:00+00:00 |
[production] |
2014-08-15
|
21:57 |
<legoktm> |
set $wgVERPsecret in PrivateSettings.php |
[releng] |
21:42 |
<hashSpeleology> |
Beta cluster database updates are broken due to CentralNotice. Fix up is {{gerrit|154231}} |
[releng] |
20:59 |
<kaldari> |
Synchronized php-1.24wmf16/extensions/MobileFrontend/less: fixing iOS search bug (duration: 00m 05s) |
[production] |
20:57 |
<hashSpeleology> |
deployment-rsync01 : deleting /usr/local/apache/common-local content. Then ln -s /srv/common-local /usr/local/apache/common-local as set by beta::common which is not applied on that host for some reason. {{bug|69590}} |
[releng] |
20:55 |
<hashSpeleology> |
puppet administratively disabled on mediawiki02. Assuming some work is in progress on that host. Leaving it untouched |
[releng] |
20:54 |
<hashSpeleology> |
puppet is proceeding on mediawiki01 |
[releng] |
20:52 |
<hashSpeleology> |
attempting to unbreak mediawiki code update {{bug|69590}} by cherry picking {{gerrit|154329}} |
[releng] |
20:39 |
<hashSpeleology> |
in case it is not in SAL: MediaWiki is no longer synced to the app servers {{bug|69590}} |
[releng] |
20:20 |
<hashSpeleology> |
rebooting mediawiki01; /var refuses to clear out and is stuck at 100% usage |
[releng] |
20:16 |
<hashSpeleology> |
cleaning up /var/log on deployment-mediawiki02 |
[releng] |
20:14 |
<hashSpeleology> |
on deployment-mediawiki01 deleting /var/log/apache2/access.log.1 |
[releng] |
20:13 |
<hashSpeleology> |
on deployment-mediawiki01 deleting /var/log/apache2/debug.log.1 |
[releng] |
20:13 |
<hashSpeleology> |
bunch of instances have a full /var/log :-/ |
[releng] |
17:58 |
<aude> |
Synchronized wmf-config/Wikibase.php: Enable redirects on test.wikidata (duration: 00m 07s) |
[production] |
15:53 |
<aude> |
Synchronized php-1.24wmf17/extensions/Wikidata: Update test.wikidata (duration: 00m 07s) |
[production] |
15:50 |
<aude> |
Synchronized php-1.24wmf17/extensions/Wikidata: Fix database error and snak value display on test wikidata (duration: 00m 09s) |
[production] |
15:00 |
<ori> |
re-enabled puppet on mw1017 |
[production] |
13:33 |
<ori> |
disabling puppet on mw1017 to test rsyslog config |
[production] |
11:37 |
<ori> |
deployment-cache-bits01 unresponsive; console shows OOMs: https://dpaste.de/LDRi/raw . rebooting |
[releng] |
03:51 |
<LocalisationUpdate> |
ResourceLoader cache refresh completed at Fri Aug 15 03:50:23 UTC 2014 (duration 50m 22s) |
[production] |
03:20 |
<jeremyb> |
02:46:37 UTC <ebernhardson> !log beta /dev/vda1 full. moved /srv-old to /mnt/srv-old and freed up 2.1G |
[releng] |
03:04 |
<LocalisationUpdate> |
completed (1.24wmf17) at 2014-08-15 03:03:49+00:00 |
[production] |
02:34 |
<LocalisationUpdate> |
completed (1.24wmf16) at 2014-08-15 02:33:21+00:00 |
[production] |
00:24 |
<ori> |
Finished scap: SWAT: cherry picks for TMH and Echo (duration: 14m 38s) |
[production] |
00:09 |
<ori> |
Started scap: SWAT: cherry picks for TMH and Echo |
[production] |