2015-06-16
10:36 <akosiaris> rebooting ganeti200{1..6}.codfw.wmnet for kernel upgrades [production]
09:33 <jynus> Synchronized wmf-config/db-codfw.php: Depool es2005, es2006 and es2007 for maintenance (duration: 00m 14s) [production]
09:10 <YuviPanda> deleted huge puppet-master.log on labcontrol1001 [production]
08:05 <jynus> added m5-slave to dns servers [production]
07:52 <paravoid> restarting hhvm on mw1121 [production]
07:39 <jynus> Synchronized wmf-config/db-eqiad.php: Repool es1005 (duration: 00m 14s) [production]
06:24 <LocalisationUpdate> ResourceLoader cache refresh completed at Tue Jun 16 06:24:04 UTC 2015 (duration 24m 3s) [production]
06:18 <godog> restore ES replication throttling to 20mb/s [production]
06:13 <godog> restore ES replication throttling to 40mb/s [production]
06:08 <filippo> Synchronized wmf-config/PoolCounterSettings-common.php: unthrottle ES (duration: 00m 14s) [production]
05:56 <godog> bump ES replication throttling to 60mb/s [production]
05:50 <manybubbles> ok - we're yellow and recovering. ops can take this from here. We have a root cause and we have things I can complain about to the elastic folks I plan to meet with today anyway. I'm going to finish waking up now. [production]
05:49 <manybubbles> reenabling puppet agent on elasticsearch machines [production]
05:46 <manybubbles> I expect them to be red for another few minutes during the initial master recovery [production]
05:46 <manybubbles> started all elasticsearch nodes and now they are recovering. [production]
05:41 <godog> restart gmond on elastic1007 [production]
05:39 <filippo> Synchronized wmf-config/PoolCounterSettings-common.php: throttle ES (duration: 00m 13s) [production]
05:25 <manybubbles> shutting down all the elasticsearch on the elasticsearch nodes again - another full cluster restart should fix it like it did last time............... [production]
05:11 <godog> restart elasticsearch on elastic1031 [production]
03:06 <springle> Synchronized wmf-config/db-eqiad.php: depool db1073 (duration: 00m 12s) [production]
02:27 <LocalisationUpdate> completed (1.26wmf9) at 2015-06-16 02:27:51+00:00 [production]
02:24 <l10nupdate> Synchronized php-1.26wmf9/cache/l10n: (no message) (duration: 05m 52s) [production]
00:55 <tgr> running extensions/Gather/maintenance/updateCounts.php for gather wikis - https://phabricator.wikimedia.org/T101460 [production]
00:52 <springle> Synchronized wmf-config/db-eqiad.php: repool db1057, warm up (duration: 00m 13s) [production]
00:46 <godog> killed bacula-fd on graphite1001, shouldn't be running and consuming bandwidth (cc akosiaris) [production]
00:27 <godog> kill python stats on cp1052, filling /tmp [production]
2015-06-15
23:42 <ori> Cleaning up renamed jobqueue metrics on graphite{1,2}001 [production]
23:01 <godog> killed bacula-fd on graphite2001, shouldn't be running and consuming bandwidth (cc akosiaris) [production]
22:54 <hoo> Synchronized wmf-config/filebackend.php: Fix commons image inclusion after commons went https only (duration: 00m 14s) [production]
22:18 <godog> run disk stress-test on restbase1007 / restbase1009 [production]
22:06 <twentyafterfour> Synchronized hhvm-fatal-error.php: deploy: Guard header() call in error page (duration: 00m 15s) [production]
22:05 <twentyafterfour> Synchronized wmf-config/InitialiseSettings-labs.php: deploy: Never use wgServer/wgCanonicalServer values from production in labs (duration: 00m 12s) [production]
20:37 <yurik> Synchronized docroot/bits/WikipediaMobileFirefoxOS: Bumping FirefoxOS app to latest (duration: 00m 14s) [production]
20:30 <godog> bounce cassandra on restbase1003 [production]
20:18 <godog> start cassandra on restbase1008, bootstrapping [production]
20:04 <godog> sign restbase1008 key, run puppet [production]
20:00 <godog> powercycle restbase1007, investigate disk issue [production]
19:07 <ori> Synchronized php-1.26wmf9/includes/jobqueue: 0a32aa3be4: jobqueue: use more sensible metric key names (duration: 00m 13s) [production]
16:57 <thcipriani> Synchronized wmf-config/InitialiseSettings.php: SWAT: Grant cloudadmins the 'editallhiera' right [[gerrit:218115]] (duration: 00m 14s) [production]
16:49 <thcipriani> Synchronized php-1.26wmf9/extensions/OpenStackManager/OpenStackManagerHooks.php: SWAT: refer to user the right way (duration: 00m 13s) [production]
16:48 <godog> powercycle graphite1002, no ssh, unresponsive console [production]
16:19 <jynus> upgrading es1005 mysql service while depooled [production]
16:12 <thcipriani> Synchronized wmf-config/InitialiseSettings.php: SWAT: Grant cloudadmins the 'editallhiera' right [[gerrit:218115]] (duration: 00m 12s) [production]
16:10 <bblack> pybal restarts complete, all ok [production]
16:09 <thcipriani> Finished scap: SWAT: Openstack manager and language updates (duration: 21m 27s) [production]
15:47 <thcipriani> Started scap: SWAT: Openstack manager and language updates [production]
15:46 <bblack> starting pybal restart process for config changes ( https://gerrit.wikimedia.org/r/#/c/218285/ ), inactives first w/ manual verification of ok-ness [production]
15:11 <bblack> rebooting cp3041 (downtimed) [production]
15:00 <_joe_> ES is green [production]
14:38 <aude> Synchronized php-1.26wmf9/extensions/Wikidata: Fix property label constraints bug (duration: 00m 24s) [production]