2016-01-27
21:26 <ori@mira> Synchronized docroot and w: (no message) (duration: 02m 26s) [production]
21:19 <cscott> updated OCG to version 64050af0456a43344b32e3e93561a79207565eaf (should be no-op after yesterday's deploy) [releng]
20:24 <chasemp> master stop, truncate accounting log to accounting.01272016, master start [tools]
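A minimal sketch of what the 20:24 rotation above amounts to, assuming a Debian-style gridengine layout; the accounting path and the service name are assumptions, only the stop / truncate-to-accounting.01272016 / start sequence comes from the entry:

```python
# Sketch only: rotate the gridengine accounting log while the master is down.
# The accounting path and service name are assumed, not taken from the log.
import os
import subprocess

ACCT = "/var/lib/gridengine/default/common/accounting"  # assumed default path

subprocess.check_call(["service", "gridengine-master", "stop"])
os.rename(ACCT, ACCT + ".01272016")   # keep the old records alongside
open(ACCT, "a").close()               # recreate an empty accounting file
subprocess.check_call(["service", "gridengine-master", "start"])
```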
19:48 <YuviPanda> started nfs-exports daemon on labstore1001, had been dead for a few days [production]
19:34 <chasemp> started grid master [tools]
19:31 <mutante> stat1002 - redis.exceptions.ConnectionError: Error connecting to mira.codfw.wmnet:6379. timed out. [production]
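The stat1002 traceback above is a client timing out against redis on mira; a probe along these lines (assuming the redis-py client, which the quoted exception class suggests) reproduces the symptom without waiting on a puppet run:

```python
# Minimal connectivity probe for the timeout reported above; assumes redis-py
# is installed and that the host/port are the ones quoted in the log entry.
import redis

client = redis.StrictRedis(host="mira.codfw.wmnet", port=6379,
                           socket_connect_timeout=2, socket_timeout=2)
try:
    client.ping()
    print("redis on mira is reachable")
except redis.exceptions.ConnectionError as exc:
    print("connection failed: %s" % exc)
```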
19:31 <mutante> stat1002 - running puppet, was reported as last run about 4 hours ago but not deactivated [production]
19:23 <chasemp> stopped master [tools]
19:14 <dduvall@mira> rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.27.0-wmf.11 [production]
19:11 <YuviPanda> depooled tools-webgrid-1405 to prep for restart, lots of stuck processes [tools]
18:49 <jynus@mira> Synchronized wmf-config/db-eqiad.php: Repool pc1006 after cloning (duration: 02m 25s) [production]
18:48 <bd808> HHVM on mw1019 still dying on a regular basis with "Lost parent, LightProcess exiting" [production]
18:29 <valhallasw`cloud> job 2551539 is ifttt, which is also running as 2700629. Killing 2551539 . [tools]
18:26 <valhallasw`cloud> messages repeatedly reports "01/27/2016 18:26:17|worker|tools-grid-master|E|execd@tools-webgrid-generic-1405.tools.eqiad.wmflabs reports running job (2551539.1/master) in queue "webgrid-generic@tools-webgrid-generic-1405.tools.eqiad.wmflabs" that was not supposed to be there - killing". SSH'ing there to investigate [tools]
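The 18:29 kill above boils down to confirming that both submissions are still known to the master before removing the stale one; a rough sketch, assuming gridengine's qstat/qdel are on PATH and Python 3:

```python
# Sketch of the duplicate-job cleanup noted at 18:29; the job IDs are the ones
# quoted in the log, everything else is an assumption.
import subprocess

def job_exists(job_id):
    # qstat -j <id> exits non-zero if the master no longer knows the job
    return subprocess.call(["qstat", "-j", str(job_id)],
                           stdout=subprocess.DEVNULL,
                           stderr=subprocess.DEVNULL) == 0

old, new = 2551539, 2700629
if job_exists(old) and job_exists(new):
    subprocess.check_call(["qdel", str(old)])   # keep the newer submission
```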
18:24 <valhallasw`cloud> 'sleep' test job also seems to work without issues [tools]
18:23 <valhallasw`cloud> no errors in log file, qstat works [tools]
18:23 <chasemp> SGE master restarted after the jobs db dump and reload [tools]
18:22 <valhallasw`cloud> messages file reports 'Wed Jan 27 18:21:39 UTC 2016 db_load_sge_maint_pre_jobs_dump_01272016' [tools]
18:20 <chasemp> master db_load -f /root/sge_maint_pre_jobs_dump_01272016 sge_job [tools]
18:19 <valhallasw`cloud> dumped jobs database to /root/sge_maint_pre_jobs_dump_01272016, 4.6M [tools]
18:17 <valhallasw`cloud> SGE Configuration successfully saved to /root/sge_maint_01272016 directory. [tools]
18:14 <chasemp> grid master stopped [tools]
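The 18:14–18:23 entries above form one maintenance window: stop the master, save the configuration, dump and reload the BerkeleyDB-spooled jobs database, restart, then verify with qstat and a test job. A rough reconstruction follows; the spool directory and service name are assumptions, only the dump filename and the db_load invocation appear in the log:

```python
# Reconstruction of the maintenance window, not a verbatim record. SPOOL and
# the service name are assumed; DUMP and the db_load call come from the log.
import subprocess

DUMP = "/root/sge_maint_pre_jobs_dump_01272016"
SPOOL = "/var/spool/gridengine/spooldb"          # assumed BDB spool directory

subprocess.check_call(["service", "gridengine-master", "stop"])
subprocess.check_call(["db_dump", "-h", SPOOL, "-f", DUMP, "sge_job"])
subprocess.check_call(["db_load", "-h", SPOOL, "-f", DUMP, "sge_job"])
subprocess.check_call(["service", "gridengine-master", "start"])
subprocess.check_call(["qstat"])                 # sanity check, as logged at 18:23
```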
18:00 <csteipp> deploy patch for T103239 [production]
17:50 <csteipp> deploy patch for T97157 [production]
17:46 <jynus> migrating ruthenium parsoid-test database to m5-master [production]
17:27 <elukey> rebooting analytics105* hosts to upgrade their kernel [production]
17:16 <elukey> rebooting analytics1035.eqiad.wmnet for kernel upgrade [production]
16:22 <thcipriani@mira> Synchronized php-1.27.0-wmf.11/extensions/CentralAuth/includes/CentralAuthUtils.php: SWAT: Preserve certain keys when updating central session [[gerrit:266672]] (duration: 02m 28s) [production]
16:11 <thcipriani@mira> Synchronized php-1.27.0-wmf.11/extensions/CentralAuth/includes/session/CentralAuthSessionProvider.php: SWAT: Avoid forceHTTPS cookie flapping if core and CA are setting the same cookie [[gerrit:266671]] (duration: 02m 26s) [production]
16:03 <elukey> rebooting analytics 1043 -> 1050 for kernel upgrade. [production]
15:47 <elukey> rebooting analytics 1026, 1040 -> 1042 due to kernel upgrade. [production]
14:58 <jynus> cloning parsercache contents from pc1003 to pc1006 [production]
14:45 <elukey> rebooting analytics 1036 to 1039 for kernel upgrade [production]
14:35 <elukey> analytics1035 hasn't been rebooted yet because it is a Hadoop Journal Node (it will be rebooted last) [production]
14:04 <elukey> rebooting analytics 1032 to 1035 for kernel upgrades [production]
14:03 <jynus@mira> Synchronized wmf-config/db-eqiad.php: Depool pc1003 for cloning to pc1006 (duration: 02m 30s) [production]
13:59 <jynus> about to move the parsercache service to new hardware/OS/mariadb-only [production]
13:32 <elukey> rebooting analytics1030/1031 for kernel upgrade [production]
13:15 <akosiaris> rebooting fermium for kernel upgrades [production]
13:10 <elukey> rebooting analytics1029 for kernel upgrade [production]
12:29 <moritzm> rebooting analytics1028 for kernel update [production]
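The analytics reboots above (12:29 through 17:27) follow the usual rolling pattern: reboot a small batch, wait for each host to come back before moving on. A sketch of one such loop; the host range, SSH access, and timeouts are assumptions, only the one-batch-at-a-time pattern comes from the log:

```python
# Illustrative rolling-reboot loop; host list, SSH access and timeouts are
# assumptions, not taken from the log entries above.
import subprocess
import time

HOSTS = ["analytics10%d.eqiad.wmnet" % n for n in range(29, 36)]

def wait_for_ssh(host, timeout=900):
    deadline = time.time() + timeout
    while time.time() < deadline:
        if subprocess.call(["ssh", "-o", "ConnectTimeout=5", host, "true"]) == 0:
            return True
        time.sleep(15)
    return False

for host in HOSTS:
    subprocess.call(["ssh", host, "sudo", "reboot"])
    time.sleep(60)                      # give the box time to actually go down
    if not wait_for_ssh(host):
        raise RuntimeError("%s did not come back after reboot" % host)
```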
10:29 <hashar> triggered a bunch of browser tests; deployment-redis01 had been dead/faulty [releng]
10:25 <ema> restarting apache2 and hhvm on mw1119 [production]
10:08 <hashar> mass restarting redis-server process on deployment-redis01 (for https://phabricator.wikimedia.org/T124677 ) [releng]
10:07 <hashar> mass restarting redis-server process on deployment-redis01 [releng]
09:00 <hashar> beta: commenting out "latency-monitor-threshold 100" parameter from any /etc/redis/redis.conf we have ( https://phabricator.wikimedia.org/T124677 ). Puppet will not reapply it unless distribution is Jessie [releng]
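A sketch of the 09:00 workaround above (T124677), assuming the conf files live under /etc/redis/ as the entry says; the glob pattern and in-place rewrite are illustrative, not the exact command that was run:

```python
# Comment out latency-monitor-threshold in every redis conf found; the glob
# is an assumption, the parameter name is the one quoted in the log entry.
import glob
import re

for path in glob.glob("/etc/redis/*.conf"):
    with open(path) as f:
        lines = f.readlines()
    with open(path, "w") as f:
        for line in lines:
            if re.match(r"\s*latency-monitor-threshold\b", line):
                f.write("# " + line)
            else:
                f.write(line)
```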
03:19 <ebernhardson@mira> Synchronized wmf-config/CirrusSearch-production.php: Correct invalid cirrus shard configuration (duration: 02m 59s) [production]
02:55 <l10nupdate@tin> ResourceLoader cache refresh completed at Wed Jan 27 02:55:21 UTC 2016 (duration 7m 13s) [production]
02:48 <mwdeploy@tin> sync-l10n completed (1.27.0-wmf.11) (duration: 10m 25s) [production]
02:23 <mwdeploy@tin> sync-l10n completed (1.27.0-wmf.10) (duration: 09m 51s) [production]