2016-07-23
23:21 <YuviPanda> restart maintain-kubeusers on tools-k8s-master-01; it was stuck connecting to seaborgium, preventing new tool creation [tools]
20:23 <bd808> killed running job for sum_disc per T140909 [tools.asurabot]
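(For context: on Toolforge, stopping a tool's grid engine job by name would look roughly like the sketch below. The job name sum_disc comes from the entry above; the exact invocation is an assumption, not taken from this log.)
    become asurabot        # switch to the tool account (assumed standard Toolforge workflow)
    jstop sum_disc         # wrapper around qdel; stops the grid job named sum_disc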
20:06 <bd808> Cleanup of jobrunner01 logs via: sudo logrotate --force /etc/logrotate.d/mediawiki_jobrunner [releng]
20:03 <bd808> Deleted jobqueues in redis with no matching wikis: ptwikibooks, labswiki [releng]
19:20 <bd808> jobrunner01 spamming /var/log/mediawiki with attempts to process jobs for wiki=labswiki [releng]
19:01 <bd808> Killed pod grrrit-wm-230500525-ze741 [tools.lolrrit-wm]
15:38 <godog> stop swift in esams test cluster, lots of logging from there [production]
15:37 <godog> on lithium: sudo lvextend --size +10G -r /dev/mapper/lithium--vg-syslog [production]
05:20 <legoktm> deleted urlshortener instance [mediawiki-core-team]
04:58 <ori> Gerrit is back up after a service restart; it was unavailable between ~04:29 and 04:57 UTC [production]
04:56 <ori> Restarting Gerrit on ytterbium [production]
04:48 <ori> Users report Gerrit is down; on ytterbium java is occupying two cores at 100% [production]
03:48 <chasemp> gnt-instance reboot seaborgium.wikimedia.org [production]
02:26 <l10nupdate@tin> ResourceLoader cache refresh completed at Sat Jul 23 02:26:49 UTC 2016 (duration 5m 41s) [production]
02:21 <mwdeploy@tin> scap sync-l10n completed (1.28.0-wmf.11) (duration: 08m 24s) [production]
01:56 <YuviPanda> deploy kubernetes v1.3.3wmf1 [tools]
01:02 <tgr@tin> Synchronized php-1.28.0-wmf.11/extensions/CentralAuth/includes/CentralAuthPlugin.php: T141160 (duration: 00m 29s) [production]
01:01 <tgr@tin> Synchronized php-1.28.0-wmf.11/extensions/CentralAuth/includes/CentralAuthHooks.php: T141160 (duration: 00m 27s) [production]
01:00 <tgr@tin> Synchronized php-1.28.0-wmf.11/extensions/CentralAuth/includes/CentralAuthPrimaryAuthenticationProvider.php: T141160 (duration: 00m 28s) [production]
00:37 <tgr> doing an emergency deploy of https://gerrit.wikimedia.org/r/#/c/300679 for T141160; the bug causes dozens of new users per hour to be left unattached on loginwiki, which probably has weird consequences [production]
2016-07-22
22:19 <aaron@tin> Synchronized wmf-config/InitialiseSettings.php: Enable debug logging for DBTransaction (duration: 00m 38s) [production]
21:10 <ejegg> updated civicrm from 2f4805fa2d2a7c57881408be2b3a017d26d8f43e to d657255e1edebeccfc0a03bea70b78eb11375cf8 [production]
20:58 <ejegg> disabled Worldpay audit parser job [production]
20:26 <hashar> T141114 upgraded jenkins-debian-glue from v0.13.0 to v0.17.0 on integration-slave-jessie-1001 and integration-slave-jessie-1002 [releng]
19:07 <thcipriani> beta-cluster has successfully used a canary for mediawiki deployments [releng]
18:59 <ejegg> rolled back payments from 79d2b67067fd7e579372b63e0d619eccfa3b9143 to 79cb53998c41f72d0fa49130ed1f66dc112b478c [production]
18:54 <mutante> restart grrrit-wm [production]
17:30 <YuviPanda> repool tools-worker-1018 [tools]
16:53 <thcipriani> bumping scap to v.3.2.1 on deployment-tin to test canary deploys, again [releng]
16:46 <thcipriani> rolling back scap version to v.3.2.0 [releng]
16:37 <thcipriani> bumping scap to v.3.2.1 on deployment-tin to test canary deploys [releng]
16:05 <Jeff_Green> running authdns-update to correct a DKIM public key on wikipedia.org [production]
15:24 <anomie> Starting script to populate empty gu_auth_token [[phab:T140478]] [production]
15:16 <urandom> T140825: Restarting Cassandra to apply 8MB trickle_fsync (restbase1015-a.eqiad.wmnet) [production]
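(For context: trickle_fsync is a cassandra.yaml setting and only takes effect on restart, which is why a restart is logged here. A minimal, hypothetical sketch of checking the setting and restarting the instance follows; the config path and unit name for the restbase1015-a instance are assumptions, not taken from this log.)
    # Assumed config path and unit name for the "a" Cassandra instance on restbase1015.
    # Expecting lines like:  trickle_fsync: true / trickle_fsync_interval_in_kb: 8192  (8MB)
    grep -A1 'trickle_fsync' /etc/cassandra-a/cassandra.yaml
    sudo systemctl restart cassandra-a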
14:21 <gehel> rolling restart of logstash100[1-3] - T141063 [production]
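(For context: a "rolling restart" here means restarting one node at a time so the cluster keeps serving. A minimal, hypothetical sketch is below; the service unit name and the fixed pause are assumptions, not taken from this log.)
    # restart logstash1001-1003 one host at a time, letting each settle before the next
    for host in logstash1001 logstash1002 logstash1003; do
        ssh "$host" 'sudo systemctl restart logstash'   # unit name is an assumption
        sleep 60                                        # crude settle time; a real run would check service health
    done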
14:19 <urandom> T134016: Bootstrapping restbase2004-c.codfw.wmnet [production]
14:04 <chasemp> reboot tools-worker-1015 as it was stuck with a high iowait warning seconds ago; I cannot ssh in as root [tools]
13:02 <hashar> zuul rebased patch queue on tip of upstream branch and force pushed branch. c3d2810...4ddad4e HEAD -> patch-queue/debian/precise-wikimedia (forced update) [releng]
12:42 <jynus> applying new m5 db grants [production]
11:12 <jynus> reimage dbproxy1009 T140983 [production]
11:04 <jynus> applying new m2 db grants [production]
10:47 <jynus> reimage dbproxy1007 T140983 [production]
10:36 <jynus> applying new m1 db grants [production]
10:32 <hashar> Jenkins restarted and it pooled both integration-slave-jessie-1002 and integration-slave-trusty-1018 [releng]
10:27 <hashar> Restarting Jenkins entirely (deadlocked) [production]
10:23 <hashar> Jenkins has some random deadlock. Will probably reboot it [releng]
10:23 <hashar> Jenkins has some random deadlock. Will probably reboot it [production]
10:17 <hashar> Jenkins can't ssh to / add slaves integration-slave-jessie-1002 or integration-slave-trusty-1018, apparently due to some Jenkins deadlock in the ssh slave plugin :-/ Lame way to solve it: restart Jenkins [releng]
10:10 <hashar> rebooting integration-slave-jessie-1002 and integration-slave-trusty-1018; they hung somehow [releng]